Gemini Nano to Gain Multimodal Capabilities; Coming to Pixel Later in the Year

Google’s I/O 2024 event was dominated by AI announcements. Alongside Veo, an AI video generator positioned against OpenAI’s Sora, and Gemini 1.5 Flash, Google revealed plans to bring multimodal capabilities to Gemini Nano. This lightweight on-device LLM will soon be able to process audio, images, and files alongside text inputs.

https://twitter.com/Google/status/1790448941063950520

Gemini Nano, introduced by Google in December 2023 alongside Gemini Ultra and Gemini Pro, is currently exclusive to the Google Pixel 8 series and the Samsung Galaxy S24. For now, the model supports only text inputs.

With multimodal capabilities, Gemini Nano will be able to draw contextual information from a variety of sources, such as images, audio, and spoken language. Google plans to roll out this feature to Pixel devices later in 2024.
