Google releases updated Gemini 2.5 Flash: -50% costs and enhanced AI performance

Article Highlights:
  • Google releases updated Gemini 2.5 Flash and Flash-Lite, with 50% fewer output tokens for Flash-Lite
  • 5-percentage-point improvement on SWE-Bench Verified (48.9% to 54%) and enhanced agentic performance
  • New "-latest" aliases for simplified access to newest models
  • Enhanced multimodal capabilities with better image understanding and audio transcription
  • 24% fewer output tokens for standard Gemini 2.5 Flash
  • Immediate availability on Google AI Studio and Vertex AI
  • 2-week email notice for updates and deprecations
  • Superior performance in complex multi-step and agentic applications

Introduction

Google has announced updated versions of Gemini 2.5 Flash and Flash-Lite, available on Google AI Studio and Vertex AI. The updates reduce output tokens by up to 50%, which lowers operating costs, and bring measurable performance improvements.

Key improvements in updated Gemini 2.5 Flash

The new Gemini 2.5 Flash introduces improvements in two fundamental areas based on developer feedback:

Enhanced agentic tool usage

Google has improved the model's ability to use complex tools, which shows up as better performance in multi-step and agentic applications. The improvement is quantifiable: a gain of about 5 percentage points on SWE-Bench Verified, from 48.9% for the previous version to 54%.
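
The two scores quoted above can be compared in two ways, and the distinction matters when reading the headline figure. The short Python sketch below uses only the 48.9% and 54% values from the text; the variable names are illustrative.

```python
# SWE-Bench Verified scores quoted in the article (percent of tasks resolved).
previous_score = 48.9
updated_score = 54.0

# Absolute change, expressed in percentage points.
point_gain = round(updated_score - previous_score, 1)

# Relative change, expressed as a percentage of the previous score.
relative_gain = round((updated_score - previous_score) / previous_score * 100, 1)

print(f"Absolute gain: {point_gain} percentage points")
print(f"Relative gain: {relative_gain}%")
```

The absolute gain is 5.1 percentage points, while the relative gain over the previous score is about 10.4%. The "5%" figure therefore refers to percentage points, not to a relative increase.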

Superior operational efficiency

With "thinking" mode enabled, the model produces higher-quality outputs while using 24% fewer output tokens, which lowers both latency and operating costs.

"The new Gemini 2.5 Flash model offers a remarkable blend of speed and intelligence. Our evaluation on internal benchmarks revealed a 15% leap in performance for long-horizon agentic tasks."

Yichao 'Peak' Ji, Co-Founder & Chief Scientist at Manus

Gemini 2.5 Flash-Lite: optimized performance

The Flash-Lite version was developed following three key themes that make it particularly effective for high-throughput applications:

  • Better instruction following: The model follows complex instructions and system prompts significantly more accurately
  • Reduced verbosity: Produces more concise responses, a key factor in the 50% reduction in output tokens
  • Enhanced multimodal capabilities: More accurate audio transcription, better image understanding, and superior translation quality

Simplified access with "-latest" aliases

Google introduces an alias system to simplify access to the latest models. Developers can now use:

  • gemini-flash-latest
  • gemini-flash-lite-latest

These aliases always point to the most recent versions, so code does not need to be updated for each release. Google commits to sending an email notice two weeks before it updates or deprecates the version behind an alias.
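
As a rough illustration of how the aliases might be used, the sketch below builds (but does not send) two requests: one with the alias and one with the dated model string mentioned in the FAQ. It assumes the public Gemini API REST endpoint and its `generateContent` method rather than Vertex AI. The helper name `build_request`, the prompt, and the `YOUR_API_KEY` placeholder are hypothetical.

```python
import json
import urllib.request

# Assumed base URL of the public Gemini API; Vertex AI uses a different endpoint.
BASE_URL = "https://generativelanguage.googleapis.com/v1beta/models"


def build_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build, without sending, a generateContent request for the given model name."""
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url=f"{BASE_URL}/{model}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


# Alias: the model behind this name can change when Google updates it.
alias_request = build_request(
    "gemini-flash-latest", "Summarize this changelog.", "YOUR_API_KEY"
)

# Dated model string: the request keeps targeting this specific preview release.
pinned_request = build_request(
    "gemini-2.5-flash-preview-09-2025", "Summarize this changelog.", "YOUR_API_KEY"
)

print(alias_request.full_url)
print(pinned_request.full_url)
```

With a real API key, either request could be passed to `urllib.request.urlopen`. The only difference between the two is the model name in the URL: the alias follows new releases automatically, while the dated string stays on one release.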

Cost and performance impact

The economic improvements are substantial: a 50% reduction in output tokens for Flash-Lite and 24% for standard Flash. These improvements make AI more accessible for large-scale applications while maintaining or improving output quality.
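
To see how a reduction in output tokens translates into spend, the sketch below applies the 24% and 50% figures to a made-up workload. The monthly token volume and the price per million tokens are placeholders chosen for illustration, not actual Gemini rates. The calculation also assumes that the per-token price is unchanged and that input-token costs are unaffected.

```python
def output_token_cost(output_tokens: float, price_per_million: float) -> float:
    """Cost of output tokens at a given price per one million tokens."""
    return output_tokens / 1_000_000 * price_per_million


# Hypothetical workload and price, used only for illustration.
baseline_output_tokens = 200_000_000  # output tokens per month before the update
price_per_million = 1.00              # placeholder price, NOT an actual Gemini rate

# Output-token reductions quoted in the article.
reductions = {"Flash": 0.24, "Flash-Lite": 0.50}

baseline_cost = output_token_cost(baseline_output_tokens, price_per_million)
for model, reduction in reductions.items():
    new_tokens = baseline_output_tokens * (1 - reduction)
    new_cost = output_token_cost(new_tokens, price_per_million)
    print(f"{model}: {baseline_cost:.2f} -> {new_cost:.2f} ({reduction:.0%} lower)")
```

Under these assumptions, the output-token part of the bill falls by the same percentage as the token count. The total bill falls by less, since input tokens are still charged as before.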

Conclusion

The Gemini 2.5 Flash update pairs lower output-token usage with stronger agentic and multimodal capabilities. Together, these changes make it cheaper to run AI in complex production scenarios.

FAQ

How much does it cost to use the new Gemini 2.5 Flash?

The update reduces output tokens by 24% for Flash and 50% for Flash-Lite compared to previous versions. Fewer output tokens mean lower costs for the same workload.

How can I access the new Gemini 2.5 Flash versions?

You can use the models through Google AI Studio and Vertex AI, with either the model string gemini-2.5-flash-preview-09-2025 or the alias gemini-flash-latest. For Flash-Lite, the alias is gemini-flash-lite-latest.

What are the main improvements in updated Gemini 2.5 Flash?

Improvements include better agentic tool usage, 24% fewer output tokens, and a gain of about 5 percentage points on the SWE-Bench Verified benchmark (48.9% to 54%).

Does the new Gemini Flash-Lite support multimodal applications?

Yes, Flash-Lite offers enhanced multimodal capabilities with more accurate audio transcription and better image understanding.

What does the "-latest" alias mean for Gemini models?

The "-latest" alias always points to the most recent model version, simplifying access without code modifications.

When do Gemini 2.5 Flash preview versions become stable?

These releases are not intended to become stable versions. Instead, they help Google gather feedback for future stable releases.

Evol Magazine
Tag: Google Gemini