When managing a robust CRM like CiviCRM, you will eventually encounter tasks that push the limits of standard web server configurations. Whether you are importing 50,000 contacts, merging thousands of duplicate records, or running a major database upgrade, these long-running jobs require a deep understanding of how CiviCRM communicates with your server.

If you have ever stared at a progress bar that seems to hang or received a '504 Gateway Timeout' error in the middle of a critical data migration, this guide is for you. We will break down the mechanics of the CiviCRM import process, the behavior of the Queue system, and how to tune your server environment to ensure your data stays intact.

The Lifecycle of a CiviCRM Contact Import

To troubleshoot issues with long-running jobs, you first need to understand what is happening under the hood when you click 'Import.' Most standard import jobs in CiviCRM (such as Contacts, Contributions, or Activities) operate using a specific interaction between your browser and the server.

The Heavy Lifting: The POST Request

When you initiate an import, your browser sends a substantial POST request to the server. This request contains the configuration of your import and, crucially, triggers the actual processing logic. Unlike some modern web applications that immediately offload tasks to a background worker, the standard CiviCRM import typically begins its work within the context of this initial request.

Progress Monitoring: AJAX Polling

While the POST request is busy processing your CSV file, CiviCRM doesn't want to leave you in the dark. To provide a progress bar, your browser begins sending GET requests to a status endpoint, typically looking like civicrm/ajax/status?id=somelongid.

This somelongid is a unique identifier generated when the job starts. The GET requests are purely for UI feedback; they do not perform the actual import work. Instead, they check a temporary data store (like a cache or a specific database table) to see how many rows have been processed and then update the progress bar on your screen.
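The split between the working request and the status request can be sketched in a few lines of Python. This is a toy simulation, not CiviCRM code: the names status_store, run_import, get_status, and poll_until_done are hypothetical, a thread stands in for the long-running POST request, and a dictionary stands in for the temporary data store.

```python
import threading
import time

# Hypothetical stand-in for the temporary store that the status endpoint reads.
status_store = {}
store_lock = threading.Lock()


def run_import(job_id, rows):
    """Plays the role of the long-running POST request: it does the real work."""
    for done, _row in enumerate(rows, start=1):
        time.sleep(0.001)  # pretend to write one contact
        with store_lock:
            status_store[job_id] = {"processed": done, "total": len(rows)}


def get_status(job_id):
    """Plays the role of the GET status request: read-only, no import work."""
    with store_lock:
        return dict(status_store.get(job_id, {"processed": 0, "total": 0}))


def poll_until_done(job_id, total, interval=0.005):
    """Browser side: poll the status and record each progress percentage."""
    seen = []
    while True:
        status = get_status(job_id)
        seen.append(100 * status["processed"] // total)
        if status["processed"] >= total:
            return seen
        time.sleep(interval)


rows = list(range(50))
worker = threading.Thread(target=run_import, args=("somelongid", rows))
worker.start()
progress = poll_until_done("somelongid", len(rows))
worker.join()
print(progress[-1])  # 100
```

Note that get_status never advances the job. If run_import dies halfway, the poller simply keeps reading the same stale number, which is what a stuck progress bar looks like from the browser.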

Server Architecture and Timeouts

A common misconception is that the AJAX status requests keep the import alive. In reality, the success of your import is often tied to how your web server handles the primary POST request. If that request is cut short, the import may fail or stop prematurely.

Apache with mod_php

In a traditional Apache environment using mod_php, the relationship is direct because PHP runs inside the Apache process. If the script exceeds the max_execution_time defined in your php.ini, or if the browser connection is severed (and ignore_user_abort is off, which is the default), PHP will typically abort the script. In this scenario, the import stops wherever it had reached, leaving you with a partial import.

Nginx and PHP-FPM

Modern stacks using Nginx and PHP-FPM behave differently. Nginx acts as a proxy; it sends the request to PHP-FPM and waits for a response. If the import takes longer than Nginx's configured fastcgi_read_timeout (or proxy_read_timeout, when Nginx proxies to another HTTP server instead), Nginx will return a '504 Gateway Timeout' to your browser. Both directives default to 60 seconds.

However, the PHP-FPM process may continue to run in the background. Because PHP-FPM is a separate process manager, the worker will often finish the job regardless of whether Nginx is still listening, provided it stays within PHP's own limits (max_execution_time, or request_terminate_timeout in the PHP-FPM pool configuration). This is why you might see a timeout error, but find all your contacts successfully imported ten minutes later.
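The "timeout error, but the import finished anyway" behaviour can be reproduced with a small Python simulation. This is only an analogy under stated assumptions: gateway and php_fpm_worker are hypothetical names, a thread pool stands in for PHP-FPM, and the timings are shrunk to fractions of a second.

```python
import concurrent.futures
import time

imported_rows = []  # stands in for rows written to the database


def php_fpm_worker(row_count, seconds_per_row):
    """Stand-in for the PHP-FPM worker that performs the import."""
    for row in range(row_count):
        time.sleep(seconds_per_row)
        imported_rows.append(row)
    return "200 OK"


def gateway(executor, read_timeout, row_count, seconds_per_row):
    """Stand-in for Nginx: wait for the backend for at most read_timeout seconds."""
    future = executor.submit(php_fpm_worker, row_count, seconds_per_row)
    try:
        return future.result(timeout=read_timeout), future
    except concurrent.futures.TimeoutError:
        # The gateway gives up waiting, but nothing here stops the worker.
        return "504 Gateway Timeout", future


with concurrent.futures.ThreadPoolExecutor(max_workers=1) as executor:
    browser_sees, backend = gateway(
        executor, read_timeout=0.05, row_count=30, seconds_per_row=0.02
    )
    rows_at_timeout = len(imported_rows)
    backend.result()  # wait for the worker so the final state can be inspected

print(browser_sees)        # 504 Gateway Timeout
print(len(imported_rows))  # 30
```

The browser-facing result and the actual state of the data diverge: the caller saw a failure, yet every row was written. Before re-running an import that "timed out", it is worth checking what actually landed in the database.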

The CiviCRM Queue System

While standard imports use the POST/Polling method, other parts of CiviCRM use the more advanced Queue system. This includes tasks like dedupe merging, bulk mailing, and database upgrades. The Queue system is designed to handle massive jobs by breaking them into small, manageable 'tasks' or 'items.'

Browser-Based Queues

When you run a queue in the browser, CiviCRM processes one 'chunk' of data at a time. After each chunk is finished, the browser sends a new request to start the next one.

  • Pros: It avoids hitting server timeouts because each individual request is short.
  • Cons: You must keep your browser window open. If you close the tab, the chain of requests breaks, and the job stops. Furthermore, refreshing the page or opening the same process in a second tab can lead to race conditions and data corruption.
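The request chain described above can be sketched as follows. This is a simplified Python model, not CiviCRM's queue runner: make_queue, handle_request, and browser_runner are hypothetical names, and the chunk size is arbitrary.

```python
from collections import deque


def make_queue(total_items, chunk_size):
    """Split a large job into small chunks (sizes here are illustrative)."""
    items = list(range(total_items))
    return deque(
        items[i:i + chunk_size] for i in range(0, total_items, chunk_size)
    )


def handle_request(queue, processed):
    """One short server request: process one chunk, report if more remain."""
    if queue:
        processed.extend(queue.popleft())
    return len(queue) > 0


def browser_runner(queue, processed, tab_closes_after=None):
    """The browser's loop: each finished request triggers the next one."""
    requests_sent = 0
    more = len(queue) > 0
    while more:
        if tab_closes_after is not None and requests_sent >= tab_closes_after:
            break  # tab closed: nobody is left to send the next request
        more = handle_request(queue, processed)
        requests_sent += 1
    return requests_sent


# Tab stays open: every chunk is processed.
queue, done = make_queue(1000, 100), []
browser_runner(queue, done)

# Tab is closed after three requests: the remaining chunks just sit there.
queue2, done2 = make_queue(1000, 100), []
browser_runner(queue2, done2, tab_closes_after=3)

print(len(done), len(done2), len(queue2))  # 1000 300 7
```

No single call to handle_request runs long enough to hit a timeout, but the job as a whole depends entirely on the loop in browser_runner staying alive.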

CLI-Based Queues

For the most reliable performance, CiviCRM queues can be processed via the Command Line Interface (CLI) using tools like cv or drush.

# Example: trigger CiviCRM's scheduled jobs from the command line with cv
cv api Job.execute

When run via the CLI, the process is not bound by web server timeouts or browser stability. This is the gold standard for large-scale data operations.

Best Practices for Handling Large Imports

If you are consistently running into timeout issues, consider these strategies to stabilize your CiviCRM environment:

  1. Increase Specific Timeouts: Instead of increasing the timeout for your entire site, target the specific paths used by imports in your Nginx or Apache configuration. For Nginx, you might increase fastcgi_read_timeout only for the location that receives the long-running import POST request. The civicrm/ajax/ status polls are short requests and do not need a longer timeout.
  2. Chunk Your Data: If you have 100,000 rows to import, do not import them all at once. Break your CSV files into batches of 5,000 to 10,000 rows. This ensures that even if a failure occurs, the blast radius is limited to a single batch.
  3. Check PHP Memory Limits: Long-running jobs often consume significant memory. Ensure your memory_limit in php.ini is sufficient (at least 512M, or 1G for large sites).
  4. Monitor the Logs: Always keep an eye on your CiviCRM error log (found in ConfigAndLog) and your server's error logs. They will tell you if a process was killed by the OS (OOM Killer) or timed out by the server software.
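For the chunking strategy in the list above, a short Python helper is one way to split a large export into import-sized batches. This is a minimal sketch: split_csv and _to_csv are hypothetical helper names, the column names are made up, and each batch repeats the header row so it can be imported on its own.

```python
import csv
import io


def _to_csv(header, rows):
    """Serialize one batch back to CSV text, header first."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return out.getvalue()


def split_csv(source, batch_size):
    """Yield CSV text batches of at most batch_size data rows each."""
    reader = csv.reader(source)
    header = next(reader)
    batch = []
    for row in reader:
        batch.append(row)
        if len(batch) == batch_size:
            yield _to_csv(header, batch)
            batch = []
    if batch:  # final, possibly smaller, batch
        yield _to_csv(header, batch)


# Twelve sample rows split into batches of five: 5 + 5 + 2.
sample = io.StringIO(
    "first_name,last_name,email\n"
    + "".join(f"A{i},B{i},a{i}@example.org\n" for i in range(12))
)
batches = list(split_csv(sample, batch_size=5))
print(len(batches))  # 3
```

With a real file, pass an open file handle instead of the StringIO object (opened with newline="" as the csv module recommends) and write each batch to its own numbered file.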

Frequently Asked Questions

What happens if I close my browser during a contact import?

If your server is running Nginx and PHP-FPM, the import will likely keep running until it finishes or hits a PHP limit such as max_execution_time. If you are using Apache with mod_php, or running a browser-based Queue task (like a dedupe merge), the process will likely stop, and you will need to resume it manually or clean up partial data.

Why does my progress bar get stuck at 0% or 100%?

This usually means the status polling (GET request) is no longer receiving updates from the request doing the actual work. If the POST request crashed due to a database error or memory exhaustion, the status will never update. Check your browser's 'Network' tab in Developer Tools to see whether the status requests, or the original POST request, are returning a 500 error.

Can I run multiple imports simultaneously?

While technically possible, it is highly discouraged. Multiple imports can lead to database deadlocks, especially if they are interacting with the same tables (like civicrm_contact or civicrm_address). It is always safer to run large jobs sequentially.

Wrapping Up

Understanding the mechanics of CiviCRM's long-running jobs is the key to maintaining a healthy, performant CRM. By distinguishing between standard POST-based imports and the Queue system, you can better diagnose where a failure is occurring.

For most users, the combination of optimized server timeouts and smaller import batches provides the best balance of speed and reliability. However, as your database grows, embracing CLI-based tools will become an essential part of your CiviCRM toolkit.