Apple released its own neural network that works with text and images

December 26, 2023  20:17

Apple, with the support of researchers from Cornell University, released its own multimodal large language model, Ferret, to the public back in October.

Although the October GitHub release wasn't accompanied by major announcements, the project later attracted attention and contributions from industry experts.

Ferret works by analyzing a user-specified image region, identifying the objects within it, and outlining them. The recognized objects are treated as part of the query, and the system answers in text.

For example, a user can highlight an image of an animal and ask Ferret to recognize it, with the model providing information about the type of animal, allowing for additional context-related questions.
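The region-plus-question flow described above can be sketched as a minimal mock. Note that `Region`, `DETECTIONS`, and `answer_query` are illustrative names invented for this sketch; they do not reflect Ferret's actual API, and a lookup table stands in for the real vision-language model so the example stays self-contained:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Region:
    """A rectangular image region selected by the user (pixel coordinates)."""
    x0: int
    y0: int
    x1: int
    y1: int

# Stand-in for the model's grounding step: maps a selected region to the
# objects detected inside it. The real system runs a multimodal LLM here.
DETECTIONS = {
    Region(10, 10, 200, 200): ["cat"],
}

def answer_query(region: Region, question: str) -> str:
    """Resolve the referred region to objects, then answer about them in text."""
    objects = DETECTIONS.get(region, [])
    if not objects:
        return "No recognizable object in the selected region."
    return f"The selected region contains: {', '.join(objects)}."

# A user highlights an animal in the image and asks what it is.
print(answer_query(Region(10, 10, 200, 200), "What animal is this?"))
```

The key design point the mock illustrates is that the highlighted region itself becomes part of the prompt, so follow-up questions can keep referring to the same objects.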

The open-source Ferret model is described as a system capable of "referencing and reasoning about anything, anywhere, and in any detail," explained Zhe Gan, a researcher from Apple's AI division.

Industry experts emphasize the significance of Apple releasing the project in this open format, a display of openness from a traditionally closed company.

According to one interpretation, Apple took this step to compete with Microsoft and Google despite lacking comparable computational resources. As a result, Apple couldn't count on launching its own ChatGPT competitor and had to choose between partnering with a cloud hyperscaler or releasing the project in an open format, as Meta had done previously.
