In this post, we are going to generate serial number for each consecutive set of numbers in a given column. We have a table tbl_Donation with columns EmpId and DonationYear. For each consecutive set of years of each employee, we need to generate serial numbers. If employee “X” donated in 2006, 2007 and 2008 years consecutively and then in 2010 and 2011, we have to assign 1 for 2005, 2 for 2006, 3 for 2007 followed by 1 for 2010 and 2 for 2011. We need to generate incremental series for each consecutive set of years but as and when a gap occurs between donation years, series has to be restarted with 1.
PARTITION BY clause in a ROW_NUMBER() window function could be able to generate the required output, if we had same number along with donation years for each consecutive set of years for all employees. Initially, we don’t have any such kind of value in the table but we can generate same number for each set of consecutive years for all employees, then this column can be used inside a PARTITION BY clause to generate the required serial number series.
Lets have a look on the column details of tbl_Donation table, before we proceed:
- EmpId – Id of Employee who makes the donation
- DonationYear – Year in which employee makes the donation
We can use below script to create a dummy table with sample data.
SET NOCOUNT ON; IF(OBJECT_ID('tempdb..#tbl_Donation') IS NOT NULL) DROP TABLE #tbl_Donation CREATE TABLE #tbl_Donation ( EmpId INT NOT NULL, DonationYear INT NOT NULL ) --Insert dummy data in table INSERT INTO #tbl_Donation (EmpId, DonationYear) SELECT DT_Emp.EmpId, DT_Years.DonationYear FROM ( SELECT 1 AS EmpId UNION ALL SELECT 2 UNION ALL SELECT 3 )DT_Emp CROSS JOIN (SELECT 2006 AS DonationYear UNION ALL SELECT 2007 UNION ALL SELECT 2008 UNION ALL SELECT 2011 UNION ALL SELECT 2015 UNION ALL SELECT 2016)DT_Years
In above script, we have three dummy employee ids (1, 2 and 3) with few dummy donation year values. We have cross joined the employee and donation year values to generate donation years for all employees. Table tbl_Donation should look like below now:
SELECT * FROM #tbl_Donation
Required output from above table is as below;
Get the required output in set based approach:
We can achieve the required output using a while loop or recursive CTE (it is also a kind of loop) or even using a cursor. As we know that, SQL Server is optimized for set based operations, we are going to achieve this task in a set based approach. I always try to follow set based approach rather than a RBAR (Row by agonizing row i.e. Loop) approach.
Set based approach uses joins and sub-queries rather than loops and recursions. We are using below set based query to extract the required output in a set based approach:
;WITH CTE AS ( SELECT EmpId, DonationYear, DonationYearPrev, CASE WHEN DonationYear - DonationYearPrev <> 1 THEN DonationYear ELSE DonationYear - DonationYearPrev END AS DiffAndPrev FROM ( SELECT EmpId, DonationYear, LAG(DonationYear, 1, 0) OVER(ORDER BY EmpId, DonationYear) AS DonationYearPrev FROM #tbl_Donation )DT ) SELECT EmpId, CTE1.DonationYear, ROW_NUMBER() OVER(PARTITION BY EmpId, OA1.ValueToPartition ORDER BY OA1.ValueToPartition) AS Seq FROM CTE AS CTE1 OUTER APPLY ( SELECT TOP 1 DiffAndPrev AS ValueToPartition FROM CTE AS CTE2 WHERE CTE2.EmpId = CTE1.EmpId AND CTE2.EmpId = CTE1.EmpId AND CTE2.DonationYear <= CTE1.DonationYear ORDER BY DiffAndPrev DESC )OA1
Discuss the trick
Lets discuss the process step by step:
- Extract the previous year value with each row based on employee id as “DonationYearPrev”.
SELECT EmpId, DonationYear, LAG(DonationYear, 1, 0) OVER(ORDER BY EmpId, DonationYear) AS DonationYearPrev FROM #tbl_Donation
- If difference of current year and previous year is 1 (which means, the current row year is the next year of previous row year), put the DiffAndPrev column value as 1 else put the current row’s donation year value in the DiffAndPrev column in upper table inside CTE.
CASE WHEN DonationYear - DonationYearPrev <> 1 THEN DonationYear ELSE DonationYear - DonationYearPrev END AS DiffAndPrev
- In outer apply, we get the value of column ValueToPartition from column DiffAndPrev based on employee id where donation year is less than or equal to current row donation year ordered by DiffAndPrev in descending way. As we have taken either 1 or previous year value in step 2, we would get previous year value (which is not 1) from DiffAndPrev column for all 1’s of column DiffAndPrev.
SELECT TOP 1 DiffAndPrev AS ValueToPartition FROM CTE AS CTE2 WHERE CTE2.EmpId = CTE1.EmpId AND CTE2.EmpId = CTE1.EmpId AND CTE2.DonationYear <= CTE1.DonationYear ORDER BY DiffAndPrev DESC
To explain this point, I have manipulated the final query (used below CTE) as below
SELECT EmpId, CTE1.DonationYear, DiffAndPrev, OA1.ValueToPartition FROM CTE AS CTE1 OUTER APPLY ( SELECT TOP 1 DiffAndPrev AS ValueToPartition FROM CTE AS CTE2 WHERE CTE2.EmpId = CTE1.EmpId AND CTE2.EmpId = CTE1.EmpId AND CTE2.DonationYear <= CTE1.DonationYear ORDER BY DiffAndPrev DESC )OA1
Have a look on the output of the manipulated query:
- Finally, we use column ValueToPartition in PARTITION BY clause with ROW_NUMBER() window function to generate the required serial numbers.
ROW_NUMBER() OVER(PARTITION BY EmpId, OA1.PrevYear ORDER BY OA1.PrevYear) AS Seq
Hope, you have enjoyed this reading, please post your comments and queries (if any) below the post. Don’t forget to rate and share this blog post.
Another idee, same result
SELECT
EmpID , DonationYear,
Row_Number()OVER(Partition By EmpID ,Diff ORDER BY EmpID, DonationYear,Diff) AS RN
FROM
(
select *
,ROW_NUMBER()OVER(PARTITION BY EmpID ORDER bY EmpID, DonationYear) as RN
,Diff = DonationYear – ROW_NUMBER()OVER(PARTITION BY EmpID ORDER bY EmpID, DonationYear)
from #tbl_Donation
)A
ORDER BY EmpID, DonationYear
Brilliant trick! Thanks for sharing SAbin….