Thursday 9 December 2010

Stability issues with AsyncRemoveCompletedJobs

CRM 4.0 UR 3 brought in a useful feature, the ability to configure the CRM Asynchronous Service to automatically delete records from completed asynchronous operations, and hence keep the size of the asyncoperationbase SQL table down to a reasonable size. This behaviour is configured by the registry values AsyncRemoveCompletedJobs and AsyncRemoveCompletedWorkflows

However, I recently met an issue with this behaviour, where the CRM Asynchronous Service appears to get in a state where all it is doing is deleting completed jobs, to the exclusion of all other activity. This can leave the CRM Asynchronous Service to have effectively hung (not responding to service control requests, nor polling for new jobs to process) and not to process any new jobs for a considerable period of time (in one environment, this could be several hours).

The main symptoms are:

  • No jobs being processed for a considerable period of time
  • The Crm Asynchronous Service not responding to service control requests (i.e. you cannot stop it through the Services console, so you have to kill the process)
  • No values reported for most performance counters (e.g. 'Total Operations Outstanding', 'Threads in use')
  • If you do restart the service, you see a burst of activity (including performance counters) whilst outstanding jobs are processed, then it reverts to the same behaviour as above
  • If you look at the SQL requests submitted by the Crm Asynchronous Service (I use the SQL dynamic management views sys.dm_exec_requests and sys.dm_exec_sessions) you see just one DELETE request and no other SQL activity

At the moment, the only workaround I have is to remove the registry values, and to use a scheduled SQL job to periodically clear out the asyncoperationbase table. Here is an example of such a script.