Gemini Nano to Gain Multimodal Capabilities; Coming to Pixel Later in the Year


Google’s I/O 2024 event was dominated by AI announcements. Alongside updates such as Veo, an AI video generator aimed at rivaling OpenAI’s Sora, and Gemini 1.5 Flash, Google revealed plans to bring multimodal capabilities to Gemini Nano. This lightweight on-device LLM will soon be able to process audio, images, and files in addition to text inputs.

Gemini Nano, introduced by Google last December alongside Gemini Ultra and Gemini Pro, is currently exclusive to the Google Pixel 8 series and Samsung Galaxy S24. For now, however, the model supports only text inputs.

With multimodal capabilities, Gemini Nano will be able to draw contextual information from a variety of sources, such as audio, images, and spoken language. Google plans to roll out this feature to Pixel devices later this year.

