Google’s AI Agent “Project Mariner” Can Complete Tasks for You in Chrome

Last month, we reported that Google was working on an AI agent as a browser extension capable of performing tasks within the web browser. In today’s Gemini 2.0 announcement, Google has officially revealed Project Mariner, an early prototype that could pave the way for the future of human-agent interaction.

Project Mariner, powered by the latest Gemini 2.0 model, can analyze what’s displayed on your browser screen and use that information to complete tasks on your behalf. It can interpret various web elements such as forms, text fields, code, images, and more. This web extension, driven by Project Mariner, is capable of typing, scrolling, and clicking within the active tab. However, for sensitive actions like making a purchase, it requires the user’s final confirmation.

Google notes that the early prototype is currently slow and may not always be accurate, but it is expected to improve quickly. In a demo presented by Google, Project Mariner was able to recall company names from a Google Sheet, browse the web, locate company websites, and extract contact details.

In the WebVoyager benchmark, which evaluates the agentic capabilities of models on real-world web tasks, Project Mariner scored 83.5%, the highest result to date. Google says it is working with trusted testers to enhance Project Mariner, but has not shared any information regarding its release date.

Regarding Project Astra, announced at Google I/O 2024, Google reports that it can now understand multiple languages and utilize tools like Google Search, Maps, and Lens to enhance the user experience.Furthermore, Project Astra has enhanced its memory capabilities, now able to retain up to 10 minutes of in-session memory for improved personalization. Google has also made significant strides in reducing latency.

While the release date for Project Astra is unclear, Google has stated that its features will be integrated into the Gemini app and other devices, such as glasses.

In addition, Google revealed that it is collaborating with game developers to explore how its AI agents perform in games like Clash of Clans and Hay Day. Powered by Gemini 2.0, these AI agents can view the screen and provide real-time suggestions. They can also leverage Google Search to offer gaming insights on the go.

Finally, Google unveiled Jules, an AI code agent built for developers that integrates smoothly into a GitHub workflow. Jules can identify issues, create a plan, and execute it with the developer’s oversight. For more information about Jules, click here.

Share this article
Shareable URL
Leave a Reply

Your email address will not be published. Required fields are marked *

Read next
0
Share