Monday, 28 February 2011

Possible SQL Gotcha - use of 'Not In' with NULLs and the customer attribute

I was recently putting together a bit of SQL to illustrate the use of a NOT IN clause for a forum answer, and got some unexpected results. The query was a relatively simple example; find all accounts with no associated opportunities. So, I tried this:

SELECT name FROM FilteredAccount
WHERE accountid NOT IN (SELECT accountid FROM FilteredOpportunity)

Nice, simple query, but it returned no data (and it should have done). However, the following works fine:

SELECT name FROM FilteredAccount
WHERE accountid NOT IN (SELECT customerid FROM FilteredOpportunity)

The only difference is the use of customerid instead of accountid in the subquery. If I'd have expected the first query to work instead of the second query, as customerid is a generated field (it's generated within the Opportunity via by the SQL function COALESCE(accountid, contactid)).

This all seems weird, but it comes down to what happens with nulls. An opportunity will be associated with one of an account, or a contact. So, the subquery 'SELECT accountid FROM FilteredOpportunity' could return a null (if you have an opportunity against a contact), but 'SELECT customerid FROM FilteredOpportunity' will always return non-null values. Don't ask me why, but the presence of nulls in the subquery cause the NOT IN query to misbehave.

One way to confirm this is with another variation on the query above, which also works:

SELECT name FROM FilteredAccount
WHERE accountid NOT IN (SELECT accountid FROM FilteredOpportunity WHERE accountid is NOT NULL)

This query explicitly excludes nulls from the results on the subquery, and so it works fine.

The main lesson I took from this is to always test for nulls in the subquery when using NOT IN; another lesson is to pay close attention when using attributes that represent the composite Customer data type in CRM

For reference, the reason why I was doing this is because this is a classic example of a query that cannot be done through FetchXML, and hence cannot be written with an Advanced Find in CRM. If the primary entity is an account, contact or lead then you have a manual workaround in CRM, for example:
  • Create a marketing list, and populate it with all accounts
  • Use Advanced Find to remove from the list all accounts that have an opportunity
  • This will then leave you with a marketing list that contains all accounts without an opportunity