Graceful AI API Handling: Migrating to Gemini 2.5 Flash and Robust Rate Limit Management

On the landing project, which focuses on providing tools for users, a key feature is our CV generation service. This service relies on external AI models to process and generate content.

## The Problem

Recently, our CV generation service hit persistent failures after the `gemini-2.0-flash` AI model was deprecated. The older model began returning `429 RESOURCE_EXHAUSTED` errors even for minimal requests, and requests to its pinned versions soon started returning `404 NOT_FOUND`, indicating they were no longer supported for new users. The critical problem was how these failures surfaced: users attempting to generate CVs were met with generic `500 Internal Server Error` messages, and our super admins were deluged with fan-out emails for every single failed attempt, creating operational overhead and a poor user experience.

## The Solution

To address these challenges, we implemented a two-part solution: migrating to a supported AI model and hardening our API error handling.

First, we switched the default AI model to `gemini-2.5-flash`, the current flagship flash model. This involved:

- Updating our system's configuration to recognize and use `gemini-2.5-flash` with correct pricing.
- Deactivating `gemini-2.0-flash` across our system.
- Implementing tenant-level migrations to automatically switch existing tenants still pointing to the deprecated 2.0 model to 2.5, while preserving any custom AI model choices.
- Updating our application's list of available models and the global default.

Second, we enhanced the error handling in our CV generation controller to catch AI provider rate-limiting and overload exceptions specifically. Instead of allowing these to bubble up as `500 Internal Server Error`, we now intercept `PrismRateLimitedException` and `PrismProviderOverloadedException`.
This allows us to return a more informative `503 Service Unavailable` status code, coupled with a localized, user-friendly message, preventing confusing 500 errors and the associated admin alert spam.

```php
<?php

namespace App\Http\Controllers;

use App\Exceptions\PrismProviderOverloadedException;
use App\Exceptions\PrismRateLimitedException;
use App\Services\CvGeneratorService;
use Illuminate\Http\Request;
use Illuminate\Support\Facades\Lang;
use Symfony\Component\HttpFoundation\Response;

class CvController extends Controller
{
    protected $cvGeneratorService;

    public function __construct(CvGeneratorService $cvGeneratorService)
    {
        $this->cvGeneratorService = $cvGeneratorService;
    }

    public function generate(Request $request): Response
    {
        try {
            $cvContent = $this->cvGeneratorService->generateCv(
                $request->user(),
                $request->input('prompt')
            );

            return response()->json(['cv_content' => $cvContent]);
        } catch (PrismRateLimitedException $e) {
            // The AI provider rate-limited us: answer 503 with a localized message.
            report($e);

            return response()->json([
                'message' => Lang::get('portfolio.cv_provider_unavailable', ['reason' => 'rate_limit']),
            ], Response::HTTP_SERVICE_UNAVAILABLE);
        } catch (PrismProviderOverloadedException $e) {
            // The AI provider is overloaded: same 503, different reason.
            report($e);

            return response()->json([
                'message' => Lang::get('portfolio.cv_provider_unavailable', ['reason' => 'overloaded']),
            ], Response::HTTP_SERVICE_UNAVAILABLE);
        } catch (\Exception $e) {
            // Anything unexpected is still a genuine 500.
            report($e);

            return response()->json([
                'message' => Lang::get('portfolio.generic_error_message'),
            ], Response::HTTP_INTERNAL_SERVER_ERROR);
        }
    }
}
```

This PHP example illustrates how the `CvController` now handles `PrismRateLimitedException` and `PrismProviderOverloadedException` by returning a `503 Service Unavailable` response with a localized message, ensuring a better user experience and preventing floods of internal server errors.

## Results/Benefits

With these changes, our CV generation service is now more resilient and user-friendly.
Users are no longer presented with obscure 500 errors; instead, they receive a clear message when the AI service is temporarily unavailable. This significantly improves the user experience by providing transparency and managing expectations. Furthermore, the volume of erroneous administrative alerts has drastically decreased, allowing our operations team to focus on genuine issues. The migration ensures we leverage the latest, supported AI technology, maintaining service reliability and performance.

## Key Insight

Proactive management of external API dependencies and robust error handling are crucial for maintaining a reliable service. Don't let deprecated APIs or external service interruptions cascade into generic internal server errors for your users. Implement specific catch blocks for anticipated external service failures and provide clear, localized feedback.
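
To make the tenant-level migration described under The Solution more concrete, here is a minimal sketch of its core rule in plain PHP: tenants still on the deprecated model are moved to the new default, and tenants with any other model are left alone. The tenant array shape, the `ai_model` key, the `migrateTenantModels` function, and the `custom-model` identifier are all hypothetical names chosen for illustration. The real migration presumably runs against the database rather than an in-memory array.

```php
<?php

// Model identifiers taken from the post; everything else is illustrative.
const DEPRECATED_MODEL = 'gemini-2.0-flash';
const DEFAULT_MODEL = 'gemini-2.5-flash';

/**
 * Return a copy of the tenant list in which every tenant that still points
 * at the deprecated model is switched to the new default. Tenants on any
 * other model (a custom choice) are returned unchanged.
 *
 * The ['id' => int, 'ai_model' => string] shape is a hypothetical stand-in
 * for the project's real tenant records.
 */
function migrateTenantModels(array $tenants): array
{
    return array_map(function (array $tenant): array {
        if ($tenant['ai_model'] === DEPRECATED_MODEL) {
            $tenant['ai_model'] = DEFAULT_MODEL;
        }

        return $tenant;
    }, $tenants);
}

$tenants = [
    ['id' => 1, 'ai_model' => 'gemini-2.0-flash'], // deprecated: should move
    ['id' => 2, 'ai_model' => 'custom-model'],     // custom choice: should stay
    ['id' => 3, 'ai_model' => 'gemini-2.5-flash'], // already current
];

$migrated = migrateTenantModels($tenants);
```

Matching only on the exact deprecated identifier is what preserves custom model choices: nothing else is ever rewritten, so running the migration a second time changes nothing.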
