Apple introduces AI model that edits photos with text instructions

February 8, 2024  16:14

Apple's research division, in collaboration with researchers from the University of California, Santa Barbara, has introduced MGIE, a multimodal artificial intelligence model for image editing. Users can change an image simply by describing what they want in natural language.

MGIE (Multimodal Large Language Model-Guided Image Editing) can be used for various image editing tasks, such as adding or removing objects. When given a command, the model interprets the user's words and then "imagines" what an image edited to match those instructions would look like.

The paper describing MGIE provides several examples of what it can do. For instance, when asked to edit a photo of a pizza with the request to "make it healthier," the model added vegetable toppings. In another case, when given an overly dark picture of a cheetah in the desert and asked to "add contrast, simulating more light," the model brightened the image.

MGIE is available as a free download on GitHub, and users can try it out on the Hugging Face Spaces platform. Apple has not said what it plans to do with the model beyond the research project.

Some AI generators, such as OpenAI's DALL-E 3, already support image editing, and Photoshop offers generative AI features through the Adobe Firefly model. Unlike Microsoft, Meta, or Google, however, Apple has not positioned itself as a major player in artificial intelligence. Even so, Apple CEO Tim Cook has said that new AI features will come to the company's devices this year. In December 2023, the company released MLX, an open-source framework for training AI models on Apple silicon chips.

