Gemini user hits 5-hour usage cap after a single prompt, Google responds

At a glance:

  • A Google AI Pro subscriber hit the five-hour usage cap after a single failed video-generation prompt using the avatar feature.
  • Google's new compute-based limits factor in prompt complexity, making usage unpredictable for subscribers.
  • Google has acknowledged the complaint and is investigating, amid broader user dissatisfaction with tighter quotas.

The triggering incident

A Google AI Pro subscriber, Ashutosh Shrivastava, reported exhausting his five-hour usage allowance within minutes of attempting a single video-generation task. Shrivastava used Gemini's avatar-based video generation feature, which creates animated avatars from text prompts. According to his account on X, the prompt ran for approximately three to four minutes and then failed, yet it consumed 100% of his rate limit. He posted a video showing the quota being depleted. The incident points to a growing pain point with Google's recent shift to a compute-based allocation system, under which even unsuccessful tasks can incur significant costs.

How the new compute-based system works

Google recently replaced its previous fixed-prompt limit with a dynamic, compute-based credit system for Gemini. This new approach calculates usage quotas based on multiple factors: the inherent complexity of prompts, the specific features employed (such as video generation or advanced reasoning), and the overall length of conversations. Under the Google AI Pro plan, these limits refresh every five hours until users eventually reach a broader weekly quota. While intended to distribute computational resources more efficiently during high demand, the system has introduced uncertainty. Users now struggle to gauge how much a single task will consume, as tasks like video generation—which require substantial processing power—can burn through credits quickly, as evidenced by Shrivastava's experience.
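Google has not published the formula behind these limits, so the following Python sketch is only an illustration of how a compute-based meter of this general shape could behave. Every name and number in it (the `FEATURE_COST` table, the `estimate_cost` function, the `Quota` class, the credit values) is a hypothetical assumption, not Gemini's actual accounting. It models the three inputs the article names (feature, prompt size, conversation length), a five-hour refresh window, and a weekly ceiling; the weekly reset is left out for brevity.

```python
from dataclasses import dataclass

WINDOW_SECONDS = 5 * 60 * 60  # five-hour refresh window described in the article

# Hypothetical base costs in abstract "credits"; these are NOT Google's numbers.
FEATURE_COST = {"chat": 1.0, "reasoning": 5.0, "video": 100.0}


def estimate_cost(feature: str, prompt_tokens: int, context_tokens: int) -> float:
    """Toy cost model: base feature cost scaled by prompt and conversation length."""
    length_factor = 1.0 + (prompt_tokens + context_tokens) / 10_000
    return FEATURE_COST[feature] * length_factor


@dataclass
class Quota:
    """Tracks spending against a five-hour window and a weekly ceiling (assumed design)."""
    window_limit: float
    weekly_limit: float
    window_start: float = 0.0
    window_used: float = 0.0
    weekly_used: float = 0.0

    def charge(self, cost: float, now: float) -> bool:
        """Try to spend `cost` credits at time `now` (seconds); return True if allowed."""
        # Start a fresh window once five hours have elapsed.
        if now - self.window_start >= WINDOW_SECONDS:
            self.window_start = now
            self.window_used = 0.0
        over_window = self.window_used + cost > self.window_limit
        over_week = self.weekly_used + cost > self.weekly_limit
        if over_window or over_week:
            return False
        # Credits are deducted up front, whether or not the task later succeeds.
        self.window_used += cost
        self.weekly_used += cost
        return True


quota = Quota(window_limit=100.0, weekly_limit=500.0)
video_ok = quota.charge(estimate_cost("video", 0, 0), now=0.0)
chat_ok = quota.charge(estimate_cost("chat", 0, 0), now=60.0)
print(video_ok, chat_ok, quota.window_used)
```

With these invented values, one video request uses the whole window and the chat request a minute later is refused, which mirrors the kind of outcome Shrivastava described. Because the sketch deducts credits before the task runs, a failed generation costs the same as a successful one; whether Gemini meters this way is not stated in the source.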

User frustration and historical context

Complaints about Gemini's updated quota system are mounting, with many subscribers voicing concerns on platforms like the Gemini subreddit. Users argue that the new limits feel significantly more restrictive than the predictable, fixed-prompt model they replaced. The opacity of the compute-based system adds to the frustration, since usage is difficult to estimate in advance, and multiple posts criticize Google for tightening access without clear communication. Google has, by contrast, boosted usage quotas for users of Antigravity, its agentic coding tool, by as much as 9x compared with the period immediately after restrictions were first implemented. For most regular Google AI Pro subscribers, the broader caps appear unchanged, which feeds a perception of uneven treatment and leaves the general user base's complaints unresolved.

Google's response and the path forward

Google has responded directly to the high-profile complaint. Josh Woodward, the lead of the Gemini team, replied to Shrivastava's post on X with a succinct acknowledgment: "Yikes, let us take a look!" The reply indicates that the company is aware of the issue and is looking into it. The incident nonetheless highlights the need for Google to address systemic problems with the compute-based limits. To restore user confidence, Google could make the system more transparent, for example by providing real-time usage estimators or clearer guidelines, or it could loosen restrictions and increase overall quotas for paying subscribers. Premium AI tools are expected to offer reliability and accessibility, not leave users stranded after a single prompt. How Google resolves the complaint may set a precedent for how AI platforms balance resource management with user experience in an increasingly competitive landscape.

Broader implications for AI service reliability

The episode with Gemini reflects a wider challenge in the AI industry: how to monetize and scale powerful models without alienating users through unpredictable limitations. As AI services become more computationally intensive, providers like Google must navigate the tension between cost recovery and user satisfaction. For businesses and creators relying on tools like Gemini for video generation, sudden usage caps can disrupt workflows and erode trust. This incident may prompt Google to refine its metering approach, potentially offering tiered add-ons or more granular control over resource allocation. Monitoring user feedback and competitor strategies—such as those from OpenAI or Anthropic—will be crucial as the market evolves. Ultimately, transparency and flexibility could determine whether compute-based systems enhance or hinder adoption of premium AI offerings.

What to watch next

Key developments to monitor include Google's official findings from its investigation into the five-hour cap issue and any subsequent policy adjustments. Users should watch for updates on whether the company will introduce more predictable usage metrics or expand quotas for Google AI Pro subscribers. Additionally, the response from the broader community—especially on forums like Reddit—will signal if this is an isolated incident or part of a larger pattern of dissatisfaction. Competitors may seize on this as an opportunity to highlight their own usage models, so Google's handling of the situation could impact its standing in the AI platform race. Finally, any changes to the compute-based system will serve as a bellwether for how AI companies balance technical constraints with user expectations in the subscription economy.

FAQ

What caused the user to hit the five-hour usage cap so quickly?

The user, Ashutosh Shrivastava, attempted a video-generation prompt using Gemini's avatar feature. The prompt ran for three to four minutes but failed, consuming the entire five-hour allowance in that short time. This highlights how the new compute-based system can rapidly deplete limits based on task complexity, as video generation requires significant computational resources.

How does Google's new compute-based usage limit system work for Gemini?

Instead of a fixed number of prompts, Gemini now uses a credit-style system that accounts for prompt complexity, features used, and conversation length. Limits refresh every five hours until a broader weekly quota is reached. This dynamic approach aims to allocate resources efficiently during peak times but has led to unpredictability, as users cannot easily estimate how many credits a single task will cost.

Has Google responded to the complaints about the usage caps?

Yes. Josh Woodward, the lead of the Gemini team, responded to the complaint on X with "Yikes, let us take a look!" Google has acknowledged the issue and is investigating. However, users continue to express frustration on platforms like the Gemini subreddit, citing tighter limits compared to the previous system. The company may need to increase transparency or adjust quotas to address widespread dissatisfaction.
