Apple's research division, in collaboration with researchers from the University of California, Santa Barbara, has introduced MGIE, a multimodal artificial intelligence model for image editing. Users can edit an image simply by describing the desired change in natural language.
MGIE (MLLM-Guided Image Editing) can handle a range of image editing tasks, such as adding or removing objects. Given a command, the model first interprets the user's words, then "imagines" what the edited image should look like in order to carry out the instruction.
The paper describing MGIE provides several examples of its functionality. For instance, when editing a photo of a pizza with the request "make it healthier," the model added vegetable toppings. In another case, when presented with an overly dark picture of a cheetah in the desert and asked to "add contrast, simulating more light," the model brightened the image.
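The two-stage flow described above, where a terse user request is first expanded into a concrete edit instruction that then guides the actual image edit, can be sketched conceptually. All names and the expansion table below are illustrative stand-ins, not Apple's actual API or model behavior:

```python
# Conceptual sketch of MGIE's two-stage workflow (hypothetical code, not Apple's API):
# stage 1 expands a terse instruction, stage 2 applies it to the image.

def derive_expressive_instruction(instruction: str) -> str:
    """Stand-in for the MLLM step: rewrite a terse user request
    into concrete edit guidance (here, via a toy lookup table)."""
    expansions = {
        "make it healthier": "add vegetable toppings to the pizza",
        "add contrast, simulating more light": "brighten the image and raise contrast",
    }
    return expansions.get(instruction, instruction)

def edit_image(image_path: str, guidance: str) -> str:
    """Stand-in for the diffusion-based editor that renders the guided edit.
    Here it just reports what would be done to the image."""
    return f"edited({image_path}): {guidance}"

# Example mirroring the pizza case from the paper.
guidance = derive_expressive_instruction("make it healthier")
result = edit_image("pizza.jpg", guidance)
print(result)  # edited(pizza.jpg): add vegetable toppings to the pizza
```

The point of the intermediate step is that the expanded guidance is explicit enough for an image model to act on, whereas the original request ("make it healthier") is not.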
While image editing is supported by some AI generators, such as OpenAI's DALL-E 3, and Photoshop offers generative AI features through Adobe's Firefly model, Apple does not position itself as a major player in artificial intelligence the way Microsoft, Meta, or Google do. However, Apple's CEO, Tim Cook, has stated that new AI features will come to the company's devices this year. In December of last year, the company released MLX, an open framework for training AI models on Apple Silicon chips.