In our previous blog post Different data types in case expression, we have explored the behavior of CASE expression in case of different data types in “THEN” and “ELSE” part. Now in this post “Mutable logical condition in CASE expression”, we are going to unfold the behavior of CASE expression when we have a non deterministic function in logical expression (WHEN part).
Mutable logical condition in CASE expression
Have a look on the below query:
--Query 1 SELECT CASE ABS(CHECKSUM(NEWID())) % 3 WHEN 0 THEN 'Case 0' WHEN 1 THEN 'Case 1' WHEN 2 THEN 'Case 2' END
In above query, logical expression “ABS(CHECKSUM(NEWID())) % 3” of CASE expression, has a non deterministic function NEWID() wrapped inside a CHECKSUM function to generate some random number on fly. Going further the CHECKSUM value has also wrapped inside a ABS function which insures a positive number. Finally there is a Mod operator with value 3 which narrows down the output result to 0 or 1 or 2 only. To catch this output, inside the WHEN part of CASE expression, we have three switches 0, 1 and 2 respectively to return some demo value.
If you execute the above code repeatedly for 6 – 7 times (may vary a bit), you may get unexpected “NULL” value in the output. As we have all possible conditions in WHEN part to catch the output of logical condition, why we are getting this unexpected NULL value? Its just because of the nature of execution of CASE expression. Above query (Query 1) is equivalent to the below query (Query 2):
--Query 2 SELECT CASE WHEN ABS(CHECKSUM(NEWID())) % 3 = 0 THEN 'Case 0' WHEN ABS(CHECKSUM(NEWID())) % 3 = 1 THEN 'Case 1' WHEN ABS(CHECKSUM(NEWID())) % 3 = 2 THEN 'Case 2' END
Logical condition inside CASE expression executes for each WHEN part. Even we have used the logical expression “ABS(CHECKSUM(NEWID())) % 3” once in Query 1, it will execute for each WHEN part (3 times in our case).
Why NULL?
When first WHEN clause (WHEN 0) calls the logical condition, it might have got 1 or 2, when second WHEN clause (WHEN 1) calls, it might have got 0 or 2. And finally when third WHEN clause (WHEN 2) calls, it might have got 0 or 1 and all WHEN parts fails to catch. CASE expression always has an ELSE part even we don’t put it explicitly there and here we are getting the NULL from this ELSE part.
Switch Case | Result received from logical condition | Action |
WHEN 0 | Got 1 or 2 | By passed |
WHEN 1 | Got 0 or 2 | By passed |
WHEN 2 | Got 0 or 1 | By passed |
ELSE | Always there even not written explicitly | Caught and returned NULL |
Handle mutable logical condition in CASE expression
To handle such kind of situations, simply use derived table to fix the output of logical condition first, then from there, we can create switch cases to catch the output as below:
--Query 3 SELECT CASE WHEN DT.Col = 0 THEN 'Case 0' WHEN DT.Col = 1 THEN 'Case 1' WHEN DT.Col = 2THEN 'Case 2' END FROM (SELECT ABS(CHECKSUM(NEWID())) % 3 AS Col) DT
In above query (Query 3), we have fixed the output in a derived table first and then written switch cases from there. In case of Query 3 it will never output the unexpected NULL value.
I hope you have enjoyed this post and thanks for your reading. I would also like to ask you to share your method in comment section to handle this scenario.
Do post your comments and suggestions, rate this post and most important, please keep sharing.
A very informative article indeed! Just out of curiosity, do you have a particular reason to have used a derived table in this case to work around the unexpected NULL values? I would think a declaring a variable and then using it instead of a derived table would work just fine.
Something like this:
DECLARE @I TINYINT = ABS(CHECKSUM(NEWID())) % 3;
SELECT
CASE ABS(CHECKSUM(NEWID())) % 3
WHEN 0 THEN ‘Case 0’
WHEN 1 THEN ‘Case 1’
WHEN 2 THEN ‘Case 2’
END
Good article nontheless.
Hi Ananthram,
Of course you can use a variable in such scenarios where you have to work with one column and some trivial queries. But in case you have a complex query with multiple columns where data is coming from multiple tables, derived tables, views, functions etc by joining them and you have this variable CASE condition in between somewhere. It might not be possible to deal with variables. And in that case, we can use derived tables as above.
Thanks for your question and suggestion too.
If we put the ELSE part which one is more performance oriented?
Hi Arun,
I am little bit confuse about your question. What i could guess at my best, it seems that you have a concern about the performance of hidden ELSE part and explicitly mentioned ELSE part in CASE expression.
As best of my knowledge and as I have already mentioned in my post, there is always an ELSE part either you mention it or not. So it should not have any performance impact on the query.
Hello,
Nice and informative article. I would use IIF() instead of CASE expression.
Doron
An interesting and informative article! Thanks!! I must say though that you use “mutable” in a novel fashion. Usually in CS, mutability is a property of an object that denotes that it can be modified (or not) after creation. The discussion often concerns functional languages. In F# for example, all objects are immutable by default, though they can be marked mutable — often needed for inter-operation with other languages. The usage in this article does not concern an object in that sense of course. That is,
ABS(CHECKSUM(NEWID())) % 3
does not create an object per se. “non-deterministic” or even just “variable” might have been my choice.