Boosting Laravel Queue Performance: Eliminating N+1 Queries in Massive Uploads
Processing large datasets in background jobs can be a major performance bottleneck. When queue workers start struggling with memory or taking too long, it's often a sign of inefficient database interactions. This was precisely the challenge faced by the Reimpact/platform project, specifically within its 'massive uploads' feature.
The Symptoms
The Reimpact/platform project handles massive data uploads across several modules (Packaging, Textiles, Batteries, Tires, AEEAndCells). Recently, after optimizing Horizon queue workers by reducing their count from ten to three to alleviate Out-Of-Memory (OOM) issues, a new problem emerged: massive uploads became significantly slower. The Horizon dashboard showed jobs taking much longer than expected, particularly when processing chunks of 50 items or more.
The Investigation
The root cause was quickly identified as a classic N+1 query pattern. Each item within a processing chunk would trigger multiple individual database queries (between 2 to 6 per item). For a chunk of 50 items, this meant hundreds of redundant database calls, turning what should be efficient batch processing into a slow, sequential bottleneck. This not only added latency but also placed unnecessary load on the database.
For example, a SaveProducts or SaveRecipes job might look something like this before the fix:
foreach ($items as $item) {
$product = Product::firstOrCreate(['sku' => $item->sku]); // DB Query 1
$warehouse = Warehouse::where('name', $item->warehouseName)->first(); // DB Query 2
// ... more individual queries per item
$product->associateWarehouse($warehouse);
}
The Culprit
Seven specific Save* job files across five modules were identified as having this N+1 pattern. These jobs were responsible for saving entities like products, recipes, materials, brands, warehouses, and providers. Each iteration in their processing loops involved distinct database lookups for related resources, rather than loading all necessary resources efficiently upfront.
Furthermore, in some cases, job dispatches (e.g., AssociatePriorityProductBatch) were also being performed N times within a loop, adding to the overhead.
The Fix
To resolve this, a preloadResources() pattern was introduced. This approach involves bulk-loading all necessary related entities (like products, materials, warehouses, brands, providers) into in-memory, keyed maps before the main processing loop begins. This transforms N individual database queries into a single whereIn() query, followed by O(1) in-memory map lookups during item processing.
Here’s a conceptual example of the preloadResources() pattern:
class SaveProductsJob implements ShouldQueue
{
// ... job properties
protected array $preloadedProducts;
protected array $preloadedWarehouses;
public function handle(): void
{
$this->preloadResources();
foreach ($this->itemsInChunk as $itemData) {
// O(1) in-memory lookup instead of DB query
$product = $this->preloadedProducts[$itemData->sku] ?? null;
if (!$product) {
$product = Product::firstOrCreate(['sku' => $itemData->sku]);
$this->preloadedProducts[$itemData->sku] = $product; // Cache newly created
}
$warehouse = $this->preloadedWarehouses[$itemData->warehouseName] ?? null;
if (!$warehouse) {
$warehouse = Warehouse::firstOrCreate(['name' => $itemData->warehouseName]);
$this->preloadedWarehouses[$itemData->warehouseName] = $warehouse; // Cache newly created
}
// ... process item with preloaded resources
}
// Batch associate jobs after processing the chunk
AssociatePriorityProductBatch::dispatch($productIds);
}
protected function preloadResources(): void
{
$skus = collect($this->itemsInChunk)->pluck('sku')->unique()->toArray();
$warehouseNames = collect($this->itemsInChunk)->pluck('warehouseName')->unique()->toArray();
$this->preloadedProducts = Product::whereIn('sku', $skus)
->get()
->keyBy('sku')
->toArray();
$this->preloadedWarehouses = Warehouse::whereIn('name', $warehouseNames)
->get()
->keyBy('name')
->toArray();
}
}
Key design decisions included:
- Map Update on Create: If an entity (e.g., brand, warehouse) is created because it wasn't found in the preloaded map, it's immediately added back into the map for subsequent lookups within the same chunk.
- Race Condition Fallback: Existing
try/catchpatterns for concurrent creation conflicts were preserved, with the result also being cached in the map after a database re-query. - Memory Bounded: The
whereIn()clauses ensure that only entities relevant to the current chunk's SKUs or names are loaded, preventing excessive memory usage. - Batch Dispatches: Where applicable, job dispatches like
AssociatePriorityProductBatchwere batched to occur once per chunk instead of once per item.
This optimization resulted in significant improvements:
| Job | Before (50 items) | After | Reduction |
|---|---|---|---|
| Packaging SaveRecipes | ~300 queries | ~54 queries | ~82% |
| Packaging SaveProducts | ~250 queries + 50 dispatches | ~53 queries + 1 dispatch | ~79% |
| Textiles SaveRecipes | ~100 queries | ~52 queries | ~48% |
| Other SaveProducts | ~150 queries | ~53 queries | ~65% |
The Lesson
When dealing with high-volume background jobs, especially those processing chunks of data, aggressively optimizing database interactions is paramount. The N+1 query pattern is a silent killer of performance. By adopting a preloadResources() strategy that leverages bulk database queries and in-memory maps, you can dramatically reduce database load and accelerate processing times. Always look for opportunities to shift from N individual lookups to 1 bulk load followed by N fast in-memory lookups within your long-running processes.