
Apple researchers publish details about new MM1 model



Ahead of the announcement of iOS 18, which is expected to be packed with AI features, Apple researchers published a paper highlighting how they’re training a new large language model (LLM).

Called MM1, this LLM can integrate text and visual information as one. The paper was submitted last week and offers an interesting look at the importance of various architectural components and data choices. The researchers say they were able to “demonstrate that for large-scale multimodal pre-training using a careful mix of image-caption, interleaved image-text, and text-only data is crucial for achieving state-of-the-art (SOTA) few-shot results across multiple benchmarks, compared to other published pre-training results.”

In addition, they showed that “the image encoder together with image resolution and the image token count has a substantial impact, while the vision-language connector design is of comparatively negligible importance.”

According to the paper, MM1 is a family of multimodal models with up to 30B parameters, consisting of both dense models and mixture-of-experts (MoE) variants, that are SOTA in pre-training metrics and achieve competitive performance after supervised fine-tuning on a range of established multimodal benchmarks.

Apple’s AI features could include Google’s or OpenAI’s functions


Apple has teased its AI applications for almost a year now. In the past two earnings calls, the company’s CEO has said Apple has many features to announce. More interestingly, while Apple has been publishing papers and teasing upcoming AI features, Bloomberg’s Mark Gurman reported that Apple is also in talks to use Google Gemini with iOS 18.

Apple is apparently in talks with Google to license Gemini after having previously considered OpenAI’s ChatGPT.

While there’s no telling if Apple will partner with Google, the move isn’t necessarily surprising. Gemini already powers generative AI features on the Pixel 8 and the Galaxy S24. The latter certainly made an impression earlier this year. One of the Galaxy S24’s highlights comes from Google.

That said, there’s a lot to expect from Apple. BGR will make sure to let you know about all the company’s upcoming AI features.


