Gemini in Chrome adds 'Select from screen' tool while Gemini 3.5 Flash gains computer use capabilities

At a glance:

  • Gemini in Chrome introduces a 'Select from screen' tool in Chrome 149 for adding images to prompts.
  • Gemini 3.5 Flash now includes native computer use capabilities, replacing the standalone Gemini 2.5 Computer Use model.
  • Developers can build cross-platform agents with enhanced automation and enterprise safety controls.

What's new in Gemini for Chrome

Google has added a "Select from screen" tool to Gemini in Chrome. Available in Chrome version 149, the tool sits at the bottom of the '+' menu and lets users highlight part of their current tab, whether text or an image, to query Gemini about it. The selection is then added to the prompt box automatically as an image, which shortens the path from seeing something on a page to asking about it.

Users who do not see the tool right away may need to restart their browser. The rollout reflects Google's focus on making AI interaction within Chrome more intuitive, and it ties Gemini more closely into the browser for tasks that depend on visual context.

Gemini 3.5 Flash's computer use integration

The Gemini 3.5 Flash model now includes a built-in computer use tool, marking a significant advancement in AI-driven automation. This native integration replaces the standalone Gemini 2.5 Computer Use model, offering developers a more streamlined approach to building intelligent agents. The model can now "see, reason, and take action" across browser, mobile, and desktop environments, enabling more complex workflows.

Google highlights improvements in long-horizon and enterprise automation tasks, such as continuous software testing and knowledge work across professional applications. For instance, the model can analyze the Gemini app and categorize its features, demonstrating its ability to process and organize information autonomously. This capability is now accessible via the Gemini API, with a demo environment available through Browserbase for testing.
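The "see, reason, and take action" cycle can be pictured as a simple loop. The sketch below is illustrative only and does not use the Gemini API: `Action`, `run_agent`, and the three callables are hypothetical names, and the "model" is a hard-coded stand-in so the loop runs without network access. In a real agent, `propose_action` would presumably wrap a call to the model's computer use tool, and `capture_screen` would return a screenshot rather than a string.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One UI step proposed by the model (hypothetical shape, not an API type)."""
    kind: str          # e.g. "type", "click", or "done"
    target: str = ""   # e.g. a button label or field name
    text: str = ""     # text to type, if any


def run_agent(
    goal: str,
    capture_screen: Callable[[], str],
    propose_action: Callable[[str, str, List[Action]], Action],
    execute: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Observe the screen, ask for the next action, execute it, repeat."""
    history: List[Action] = []
    for _ in range(max_steps):
        observation = capture_screen()                       # see
        action = propose_action(goal, observation, history)  # reason
        if action.kind == "done":
            break
        execute(action)                                      # take action
        history.append(action)
    return history


# Stand-in environment and "model" so the loop is runnable as-is.
screen = {"query": "", "submitted": False}


def fake_capture() -> str:
    return f"query={screen['query']!r} submitted={screen['submitted']}"


def fake_model(goal: str, observation: str, history: List[Action]) -> Action:
    if not screen["query"]:
        return Action("type", target="search box", text=goal)
    if not screen["submitted"]:
        return Action("click", target="search button")
    return Action("done")


def fake_execute(action: Action) -> None:
    if action.kind == "type":
        screen["query"] = action.text
    elif action.kind == "click":
        screen["submitted"] = True


steps = run_agent("list Gemini app features", fake_capture, fake_model, fake_execute)
print([s.kind for s in steps])  # ['type', 'click']
```

The `max_steps` cap is a design choice for the sketch: long-horizon tasks need some bound so that an agent that never reports completion still stops.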

Developer tools and applications

Developers can leverage the updated Gemini 3.5 Flash to create custom agents tailored for cross-platform environments. The model's enhanced reasoning and action capabilities allow for more sophisticated automation, particularly in enterprise settings. Google provides a reference implementation and documentation through the Gemini API and Gemini Enterprise Agent Platform to facilitate adoption.

Grounding with Search and Maps further expands the model's utility, letting it combine real-time data with its understanding of what is on screen. This makes Gemini 3.5 Flash a versatile option for developers building agents that pair on-screen interaction with real-world information such as search results and map data. The availability of a demo environment also lowers the barrier for experimentation and prototyping.

Safety and enterprise controls

Enterprise users benefit from robust safety features in Gemini 3.5 Flash. Organizations can require explicit user confirmation for sensitive or irreversible actions, ensuring human oversight in critical processes. Additionally, the model automatically halts tasks if an indirect prompt injection is detected, mitigating risks of malicious input manipulation.

These safeguards reflect Google's emphasis on secure AI deployment in professional contexts. The combination of automation capabilities and safety measures makes the model suitable for high-stakes applications, such as software testing or data analysis. By integrating these controls natively, Google aims to address concerns around AI autonomy while maintaining functionality.
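One way to picture the two controls described above, user confirmation for sensitive actions and halting on a suspected indirect prompt injection, is a gate that every proposed action passes through before it runs. This is a minimal sketch under stated assumptions, not Google's implementation: `ProposedAction`, `gate`, `SENSITIVE_KINDS`, and the `injection_suspected` flag are all hypothetical, and the article does not say how the model actually reports a detected injection.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Set


class Verdict(Enum):
    EXECUTED = "executed"
    DECLINED = "declined"
    HALTED = "halted"


@dataclass
class ProposedAction:
    kind: str
    description: str
    # Hypothetical flag: assumes the model or a filter marks suspicious steps.
    injection_suspected: bool = False


# Assumed organization policy: which action kinds count as sensitive.
SENSITIVE_KINDS: Set[str] = {"purchase", "delete", "send_email"}


def gate(
    action: ProposedAction,
    confirm: Callable[[ProposedAction], bool],
    execute: Callable[[ProposedAction], None],
) -> Verdict:
    """Halt on suspected injection, ask before sensitive steps, else execute."""
    if action.injection_suspected:
        return Verdict.HALTED
    if action.kind in SENSITIVE_KINDS and not confirm(action):
        return Verdict.DECLINED
    execute(action)
    return Verdict.EXECUTED


executed: List[ProposedAction] = []


def deny(action: ProposedAction) -> bool:
    return False


def approve(action: ProposedAction) -> bool:
    return True


results = [
    gate(ProposedAction("click", "open settings"), deny, executed.append),
    gate(ProposedAction("delete", "remove test account"), deny, executed.append),
    gate(ProposedAction("delete", "remove test account"), approve, executed.append),
    gate(ProposedAction("click", "follow link", injection_suspected=True),
         approve, executed.append),
]
print([r.value for r in results])
# ['executed', 'declined', 'executed', 'halted']
```

In this sketch the injection check comes first, so a flagged step stops even when the user would have approved it; that ordering is an assumption chosen to match the article's description of an automatic halt.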

How to try and build with Gemini

Users can test the new capabilities through a demo environment hosted by Browserbase, providing a hands-on experience with the model's features. For developers, Google offers comprehensive resources via the Gemini API and Gemini Enterprise Agent Platform, including documentation and code examples. These tools are designed to accelerate the development of custom AI agents and automation workflows.

The availability of both testing and building resources signals Google's intent to foster a developer community around Gemini's evolving capabilities. Early adopters can explore the potential of computer use integration while contributing feedback for future iterations. This approach aligns with Google's strategy of iterative development and community-driven innovation in AI technologies.

Editorial note: SiliconFeed is an automated feed. Facts are checked against sources; copy is normalized and lightly edited for readers.

FAQ

What is the 'Select from screen' tool in Gemini for Chrome?
The 'Select from screen' tool, introduced in Chrome 149, lets users add images to their prompts by selecting content directly from their current tab. It is located at the bottom of the '+' menu. Users may need to restart their browser if the feature is not immediately visible.

How does Gemini 3.5 Flash's computer use capability work?
Gemini 3.5 Flash includes a native computer use tool that lets developers build agents capable of seeing, reasoning, and acting across browser, mobile, and desktop environments. It replaces the standalone Gemini 2.5 Computer Use model and improves performance on long-horizon tasks such as continuous software testing. In Google's example, the model analyzes the Gemini app and categorizes its features.

What safety measures are included for enterprise users?
Enterprise customers can require explicit user confirmation for sensitive or irreversible actions, and the model automatically halts a task if it detects an indirect prompt injection. These controls are part of Google's effort to balance automation with human oversight, particularly for high-stakes applications.

Prepared by the editorial stack from public data and external sources.