Mastering E2E Testing: Architecting a Secure Playwright Suite with Faked AI Services and CI/CD Integration

Introduction

For Breniapp, user onboarding is a critical first impression, featuring complex multi-step flows and integrations with AI services. Keeping it robust under continuous development demands comprehensive end-to-end (E2E) testing. Historically, our CI workflows had no E2E coverage for onboarding and ran only basic checks on select PRs, which left our most important user journey exposed to regressions.

This post details how we addressed this challenge by implementing a robust Playwright E2E test suite, complete with a securely sandboxed testing environment and integration into our GitHub Actions CI/CD pipeline. The core innovation lies in our ability to simulate complex user interactions while deterministically faking external AI service calls, all without exposing sensitive testing logic to production.

The Challenge: Onboarding E2E

Testing an onboarding flow presents unique complexities. It is inherently multi-step, requires various user inputs (text, selections, file uploads), and, in Breniapp's case, depends on dynamic responses from AI services. Running real AI services for every E2E test run is slow and expensive, and it introduces non-deterministic behavior that makes tests flaky and unreliable. We also needed a safe way to provision test tenants and interact with the application during testing without affecting production data or services.

Implementing Playwright Tests

Our solution began with Playwright, a powerful browser automation library, to build three core E2E specifications covering our main onboarding paths: predefined, manual definition, and scratch-based chat. We developed a generic "walker" utility that intelligently navigates through each onboarding step, reading the current step type from data-current-step-type attributes in the UI. This walker provides sensible defaults for inputs (e.g., filling text fields, selecting the first non-'other' option for multi-cards/multi-selects, and gracefully skipping optional file uploads or dropping a tiny PNG fixture when required).
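The walker's default-picking logic can be sketched as a pure function, separate from any browser calls. This is a minimal illustration rather than our actual implementation: the step type names, the `StepInfo` and `StepAction` shapes, the placeholder answer text, and the fixture path are all assumptions made for the example.

```typescript
// Hypothetical step types; the real values of data-current-step-type may differ.
type StepType = "text" | "multi-select" | "multi-card" | "file-upload";

// Hypothetical description of the step currently shown in the UI.
interface StepInfo {
  type: StepType;
  options?: string[]; // option labels for multi-card / multi-select steps
  required?: boolean; // whether a file upload is mandatory
}

// What the walker should do for a step, decided before touching the browser.
type StepAction =
  | { kind: "fill"; value: string }
  | { kind: "select"; option: string }
  | { kind: "upload"; fixture: string }
  | { kind: "skip" };

// Pick the first option whose label is not "other" (case-insensitive).
function pickDefaultOption(options: string[]): string {
  const choice = options.find((o) => o.trim().toLowerCase() !== "other");
  if (choice === undefined) {
    throw new Error("No non-'other' option available");
  }
  return choice;
}

// Map a step to a sensible default action.
function planStep(step: StepInfo): StepAction {
  switch (step.type) {
    case "text":
      // Assumed placeholder answer.
      return { kind: "fill", value: "E2E test answer" };
    case "multi-select":
    case "multi-card":
      return { kind: "select", option: pickDefaultOption(step.options ?? []) };
    case "file-upload":
      // Assumed fixture path; optional uploads are skipped.
      return step.required
        ? { kind: "upload", fixture: "fixtures/tiny.png" }
        : { kind: "skip" };
  }
}
```

In a Playwright spec, the step type could be read with something like `page.locator("[data-current-step-type]").first().getAttribute("data-current-step-type")` and the resulting action applied with the usual `fill`, `click`, or `setInputFiles` calls. Keeping the decision logic pure makes it easy to unit test without launching a browser.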

Building a Safe E2E Backdoor

To facilitate isolated and deterministic testing, we engineered a secure E2E backdoor. This mechanism allows our Playwright tests to interact with the application in a controlled environment, provision test data, and bypass components that are unsuitable for E2E testing (like real AI calls). The backdoor is double-gated for maximum security:

  1. Environment Check: The APP_ENV must be explicitly set to testing, e2e, or local.
  2. Explicit Flag: An E2E_TESTING_ENABLED=true environment variable must also be present.

This dual-check ensures that the testing routes and fake services are never accessible in production environments. We leveraged Laravel's middleware pattern to guard dedicated /testing/* routes and prevent any accidental exposure.

Faking External AI Services

A pivotal part of achieving deterministic E2E tests was faking our AI services (BrandBookAnalysisService, VisualDnaAnalysisService, OnboardingChatService). When the E2E testing environment is enabled, these services are bound to deterministic fakes. This means instead of making actual LLM calls, our application receives predefined, consistent responses during test runs. For direct AI text calls, such as those made via Prism::text() in our scratch chat path, we configured Prism::fake() within a dedicated E2EServiceProvider to intercept and provide faked data.

Here’s a simplified example of how such fakes are bound within a service provider:

<?php

namespace App\Providers;

use App\Services\AI\Fakes\FakeOnboardingChatService;
use App\Services\AI\OnboardingChatService;
use Illuminate\Support\ServiceProvider;

class E2EServiceProvider extends ServiceProvider
{
    public function register(): void
    {
        // Gate 1: the application environment must be an allowed non-production one.
        // Gate 2: the explicit opt-in flag must parse as a boolean true.
        // Note: once the config is cached, .env is no longer loaded, so env()
        // only sees real system-level environment variables.
        $e2eEnabled = $this->app->environment(['testing', 'e2e', 'local'])
            && filter_var(env('E2E_TESTING_ENABLED', false), FILTER_VALIDATE_BOOLEAN);

        if (! $e2eEnabled) {
            return;
        }

        // Bind the fake service for deterministic responses.
        $this->app->singleton(
            OnboardingChatService::class,
            FakeOnboardingChatService::class
        );

        // Direct Prism::text() calls are faked separately via Prism::fake(),
        // configured here with canned responses.
    }
}

This ensures that every interaction with our AI logic during E2E tests yields predictable outcomes, making our tests reliable and fast.

CI/CD with GitHub Actions

To fully integrate these improvements, we introduced a new tests.yml GitHub Actions workflow. This workflow runs on every pull request and on every push to main, executing PHPStan for static analysis, PHPUnit for unit/integration tests, and our new Playwright E2E suite, all in parallel. On failure, the Playwright HTML report is uploaded as an artifact, providing immediate visual feedback for debugging.

Results and Benefits

With these changes, Breniapp now has stable E2E coverage for its critical onboarding flows. Our Playwright tests consistently pass locally in about 2.5 minutes when run in parallel, providing rapid feedback to developers. The secure E2E backdoor and faked AI services keep these tests deterministic and efficient, greatly reducing flakiness. The new CI/CD workflow runs all critical quality checks, including the E2E tests, automatically, safeguarding against regressions and maintaining a high standard of code quality.

Future Enhancements

While this implementation significantly boosts our testing capabilities, future work could involve expanding the Playwright suite to cover more application areas, exploring visual regression testing, and integrating more sophisticated reporting tools for deeper insights into test failures. Additionally, refining our test data provisioning within the sandboxed environment could further streamline test setup.

GERARDO RUIZ

Author