Making AI Rate Limits User-Friendly: A Lesson in Resilient Design

The Frustration of Unfair Rate Limiting

Imagine a service fails because of an external outage, and you then find yourself locked out for the rest of the day even though you never used it successfully. That scenario recently hit users of the AI generation features in our devlog-ist/landing project. During a temporary outage of an underlying AI provider, users attempting to generate content were met with errors, and each failed attempt still consumed part of their daily rate limit quota.

The Problem: Premature Quota Consumption

Our initial implementation of the daily AI generation rate limit was designed for immediate feedback: RateLimiter::hit() was called before the actual AI generation process, so every accepted request counted against the quota up front. This gave users who had exceeded their quota a quick rejection, but it also penalized users during system failures. If the AI provider failed to respond, the attempt had already been counted, leading to messages like "You have exceeded your CV generation limit" even when no CV had been generated. The result was a poor user experience and unnecessary support requests.

Consider this simplified (and problematic) flow:

// In CvController.php
$key = 'generate-cv:' . auth()->id();

if (RateLimiter::tooManyAttempts($key, 10)) {
    return response()->json(['message' => 'Too many attempts'], 429);
}

// Quota consumed here, before any work is done!
// 86400 s gives a one-day window; hit() defaults to 60 s if omitted.
RateLimiter::hit($key, 86400);

try {
    $aiResponse = $this->aiService->generateCv();
    // ... process successful response ...
    return response()->json(['data' => $aiResponse], 200);
} catch (\Exception $e) {
    // AI service failed, but quota was already consumed
    return response()->json(['message' => 'AI generation failed'], 500);
}

The Solution: Post-Success Quota Consumption

To address this, we refactored the rate limiting logic across our main entry points (CvController, Filament CvGenerator page, LinkedinShareController). The core change was to move the RateLimiter::hit() call to after a successful AI generation. The tooManyAttempts check, however, remains at the beginning to ensure that genuinely blocked users still receive an immediate 429 response without incurring the cost of an AI service call.

Here's the corrected conceptual flow:

// In CvController.php (Revised)
$key = 'generate-cv:' . auth()->id();

if (RateLimiter::tooManyAttempts($key, 10)) {
    return response()->json(['message' => 'Too many attempts'], 429); // Fail fast for blocked users
}

try {
    $aiResponse = $this->aiService->generateCv();
    // Quota consumed ONLY on success.
    // 86400 s gives a one-day window; hit() defaults to 60 s if omitted.
    RateLimiter::hit($key, 86400);
    // ... process successful response ...
    return response()->json(['data' => $aiResponse], 200);
} catch (\Exception $e) {
    // AI service failed, and quota was NOT consumed
    return response()->json(['message' => 'AI generation failed'], 500);
}

This small but significant change ensures that users' daily quotas are only decremented when they receive a valid, successful AI-generated output. Failed attempts, whether due to system issues or other errors, no longer unfairly consume a user's allowance.
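
To make the ordering concrete outside of Laravel, here is a minimal, framework-free sketch of the same idea. It is not our production code: InMemoryLimiter, QuotaExceeded, and runWithQuota are hypothetical names invented for this illustration, and the in-memory counter ignores expiry entirely. The only point it demonstrates is that the counter is incremented after the operation returns, so an exception skips it.

```php
<?php

declare(strict_types=1);

/** Hypothetical exception raised when the caller is already over quota. */
final class QuotaExceeded extends RuntimeException
{
}

/**
 * Hypothetical in-memory stand-in for a rate limiter (illustration only).
 * It keeps one hit counter per key and never expires it.
 */
final class InMemoryLimiter
{
    /** @var array<string, int> */
    private array $hits = [];

    public function tooManyAttempts(string $key, int $maxAttempts): bool
    {
        return $this->attempts($key) >= $maxAttempts;
    }

    public function hit(string $key): int
    {
        return $this->hits[$key] = $this->attempts($key) + 1;
    }

    public function attempts(string $key): int
    {
        return $this->hits[$key] ?? 0;
    }
}

/**
 * Runs $operation and counts it against the quota only if it returns
 * without throwing. Throws QuotaExceeded without invoking the operation
 * when the limit has already been reached.
 */
function runWithQuota(InMemoryLimiter $limiter, string $key, int $max, callable $operation): mixed
{
    if ($limiter->tooManyAttempts($key, $max)) {
        throw new QuotaExceeded('Quota exceeded for ' . $key);
    }

    $result = $operation(); // an exception here propagates before hit() runs
    $limiter->hit($key);

    return $result;
}

$limiter = new InMemoryLimiter();
$key = 'generate-cv:42'; // 42 is an arbitrary user ID

try {
    // Simulated provider outage: the operation throws.
    runWithQuota($limiter, $key, 10, function () {
        throw new RuntimeException('AI provider unavailable');
    });
} catch (RuntimeException $e) {
    // The failure is reported, but no quota was consumed.
}

$cv = runWithQuota($limiter, $key, 10, fn () => 'generated CV');

echo $limiter->attempts($key), PHP_EOL; // prints 1: only the success counted
```

Wrapping the check, the call, and the hit in one helper is a design choice rather than a requirement, but it keeps the ordering in a single place when several entry points share the same rule.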

Empowering Support: The ai:clear-rate-limit Command

To swiftly assist users affected by the previous implementation, we also introduced a new Laravel Artisan command: php artisan ai:clear-rate-limit. This command allows administrators to unblock specific users by clearing their rate limits for CV generation, LinkedIn generation, or all AI-related limits. It accepts either a user ID or email address.

# Clear all AI generation limits for a user by ID
php artisan ai:clear-rate-limit 123 --type=all

# Clear only CV generation limits for a user by email
php artisan ai:clear-rate-limit user@example.com --type=cv

This command provides a crucial tool for incident response and user support, allowing for quick resolution of lockout issues without manual database intervention.
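
As a rough illustration of how a --type option could translate into limiter keys, here is a small standalone sketch. It is an assumption-laden example, not the command's actual source: the function name rateLimitKeysFor is hypothetical, the 'linkedin' type value and the 'generate-linkedin:' prefix are guesses made for the example, and only the 'generate-cv:' prefix comes from the controller code shown earlier.

```php
<?php

declare(strict_types=1);

/**
 * Maps a --type value to the rate limiter keys to clear for one user.
 * Hypothetical helper for illustration; not taken from the real command.
 *
 * @return list<string>
 */
function rateLimitKeysFor(int $userId, string $type): array
{
    $prefixes = [
        'cv' => 'generate-cv:',             // matches the controller example
        'linkedin' => 'generate-linkedin:', // assumed prefix, may differ
    ];

    if ($type === 'all') {
        $selected = array_values($prefixes);
    } elseif (isset($prefixes[$type])) {
        $selected = [$prefixes[$type]];
    } else {
        throw new InvalidArgumentException("Unknown limit type: {$type}");
    }

    // Append the user ID to each prefix to form the full limiter key.
    return array_map(
        fn (string $prefix): string => $prefix . $userId,
        $selected
    );
}

print_r(rateLimitKeysFor(123, 'all'));
```

Inside an Artisan command, each returned key could then be passed to Laravel's RateLimiter::clear($key), after resolving an email argument to a user ID with a model lookup.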

Key Takeaway

When implementing rate limiting, especially for paid or resource-intensive operations, carefully consider when the quota is consumed. Only decrement the allowance upon successful completion of the intended action, and keep the limit check up front so that blocked users are still rejected before any expensive work starts. This practice provides a fairer experience for users, builds trust, and reduces support overhead during inevitable system fluctuations.

GERARDO RUIZ

Author
