Boosting Laravel Queue Performance: Eliminating N+1 Queries in Massive Uploads

Processing large datasets in background jobs can be a major performance bottleneck. When queue workers start struggling with memory or taking too long, it's often a sign of inefficient database interactions. This was precisely the challenge faced by the Reimpact/platform project, specifically within its 'massive uploads' feature.

The Symptoms

The Reimpact/platform project handles massive data uploads across several modules (Packaging, Textiles, Batteries, Tires, AEEAndCells). Recently, after optimizing Horizon queue workers by reducing their count from ten to three to alleviate Out-Of-Memory (OOM) issues, a new problem emerged: massive uploads became significantly slower. The Horizon dashboard showed jobs taking much longer than expected, particularly when processing chunks of 50 items or more.

The Investigation

The root cause was quickly identified as a classic N+1 query pattern. Each item within a processing chunk would trigger multiple individual database queries (between 2 to 6 per item). For a chunk of 50 items, this meant hundreds of redundant database calls, turning what should be efficient batch processing into a slow, sequential bottleneck. This not only added latency but also placed unnecessary load on the database.

For example, a SaveProducts or SaveRecipes job might look something like this before the fix:

foreach ($items as $item) {
    $product = Product::firstOrCreate(['sku' => $item->sku]); // DB Query 1
    $warehouse = Warehouse::where('name', $item->warehouseName)->first(); // DB Query 2
    // ... more individual queries per item
    $product->associateWarehouse($warehouse);
}

The Culprit

Seven specific Save* job files across five modules were identified as having this N+1 pattern. These jobs were responsible for saving entities like products, recipes, materials, brands, warehouses, and providers. Each iteration in their processing loops involved distinct database lookups for related resources, rather than loading all necessary resources efficiently upfront.

Furthermore, in some cases, job dispatches (e.g., AssociatePriorityProductBatch) were also being performed N times within a loop, adding to the overhead.

The Fix

To resolve this, a preloadResources() pattern was introduced. This approach involves bulk-loading all necessary related entities (like products, materials, warehouses, brands, providers) into in-memory, keyed maps before the main processing loop begins. This transforms N individual database queries into a single whereIn() query, followed by O(1) in-memory map lookups during item processing.

Here’s a conceptual example of the preloadResources() pattern:

class SaveProductsJob implements ShouldQueue
{
    // ... job properties

    protected array $preloadedProducts;
    protected array $preloadedWarehouses;

    public function handle(): void
    {
        $this->preloadResources();

        foreach ($this->itemsInChunk as $itemData) {
            // O(1) in-memory lookup instead of DB query
            $product = $this->preloadedProducts[$itemData->sku] ?? null;
            if (!$product) {
                $product = Product::firstOrCreate(['sku' => $itemData->sku]);
                $this->preloadedProducts[$itemData->sku] = $product; // Cache newly created
            }
            
            $warehouse = $this->preloadedWarehouses[$itemData->warehouseName] ?? null;
            if (!$warehouse) {
                $warehouse = Warehouse::firstOrCreate(['name' => $itemData->warehouseName]);
                $this->preloadedWarehouses[$itemData->warehouseName] = $warehouse; // Cache newly created
            }

            // ... process item with preloaded resources
        }

        // Batch associate jobs after processing the chunk
        AssociatePriorityProductBatch::dispatch($productIds);
    }

    protected function preloadResources(): void
    {
        $skus = collect($this->itemsInChunk)->pluck('sku')->unique()->toArray();
        $warehouseNames = collect($this->itemsInChunk)->pluck('warehouseName')->unique()->toArray();

        $this->preloadedProducts = Product::whereIn('sku', $skus)
                                        ->get()
                                        ->keyBy('sku')
                                        ->toArray();

        $this->preloadedWarehouses = Warehouse::whereIn('name', $warehouseNames)
                                           ->get()
                                           ->keyBy('name')
                                           ->toArray();
    }
}

Key design decisions included:

  • Map Update on Create: If an entity (e.g., brand, warehouse) is created because it wasn't found in the preloaded map, it's immediately added back into the map for subsequent lookups within the same chunk.
  • Race Condition Fallback: Existing try/catch patterns for concurrent creation conflicts were preserved, with the result also being cached in the map after a database re-query.
  • Memory Bounded: The whereIn() clauses ensure that only entities relevant to the current chunk's SKUs or names are loaded, preventing excessive memory usage.
  • Batch Dispatches: Where applicable, job dispatches like AssociatePriorityProductBatch were batched to occur once per chunk instead of once per item.

This optimization resulted in significant improvements:

Job Before (50 items) After Reduction
Packaging SaveRecipes ~300 queries ~54 queries ~82%
Packaging SaveProducts ~250 queries + 50 dispatches ~53 queries + 1 dispatch ~79%
Textiles SaveRecipes ~100 queries ~52 queries ~48%
Other SaveProducts ~150 queries ~53 queries ~65%

The Lesson

When dealing with high-volume background jobs, especially those processing chunks of data, aggressively optimizing database interactions is paramount. The N+1 query pattern is a silent killer of performance. By adopting a preloadResources() strategy that leverages bulk database queries and in-memory maps, you can dramatically reduce database load and accelerate processing times. Always look for opportunities to shift from N individual lookups to 1 bulk load followed by N fast in-memory lookups within your long-running processes.

Boosting Laravel Queue Performance: Eliminating N+1 Queries in Massive Uploads
GERARDO RUIZ

GERARDO RUIZ

Author

Share: