In this post, we are going to learn a technique to combine all the values of a row (based on a particular condition) with a separator, in a column along with other columns. What i meant by this let me explain it in more detail. Have a look on below question as raised by a colleague;
I have a Students table and a Subjects table. One student has been assigned to multiple Subjects in a transaction table. I want to extract the name of all the Subjects for which the student is assigned in a comma separated way along with other details of the student. If student is not assigned yet, i want to show a “Not Assigned Yet” value for those records.
To achieve this task, i have created a demo script and sharing the same with all the details. Below you can find the table structures and demo data.
We are creating three tables as Students, Subjects and StudentSubjectMappings for our demo.
--Creating Students Table CREATE TABLE dbo.Students ( StudentID INT, StudentName NVARCHAR(100), Class VARCHAR(10), ) GO --Creating Subjects Table CREATE TABLE dbo.Subjects ( SubjectID INT, SubjectName NVARCHAR(100), ) GO --Creating Student and subject mapping in a transaction table named StudentSubjectMappings CREATE TABLE dbo.StudentSubjectMappings ( StudentID INT, SubjectID INT ) GO
Insert some demo values in all these created tables as below;
INSERT INTO dbo.Students(StudentID, StudentName, Class) VALUES(1, 'Student 1', 'X'), (2, 'Student 2', 'XI'), (3, 'Student 3', 'XII'), (4, 'Student 4', 'X'), (5, 'Student 5', 'XI'), (6, 'Student 6', 'XII') GO INSERT INTO dbo.Subjects(SubjectID, SubjectName) VALUES(1, 'Physics'), (2, 'Chemistry'), (3, 'Mathematics'), (4, 'English'), (5, 'Social Science'), (6, 'Hindi') GO INSERT INTO dbo.StudentSubjectMappings(StudentID, SubjectID) VALUES(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (2, 1), (2, 2), (2, 3), (2, 4), (3, 5), (3, 6) GO
Now have a look on the tables data.
SELECT * FROM dbo.Students
SELECT * FROM dbo.Subjects
SELECT * FROM dbo.StudentSubjectMappings
Required Output:
Now to get the final output as required, use this query;
SELECT StudentID, StudentName, Class, ISNULL(STUFF((SELECT ', ' + SubjectName FROM dbo.StudentSubjectMappings SSM INNER JOIN dbo.Subjects SUB ON SUB.SubjectID = SSM.SubjectID WHERE SSM.StudentID = STD.StudentID FOR XML PATH('') ), 1, 1, ''), 'Not Assigned Yet') AS SubjectList FROM dbo.Students STD
Let’s discuss this trick:
To achieve this task, i have used “FOR XML” clause with “PATH” mode inside a sub-query. FOR XML clause is used to retrieve the result of a SQL query as XML data and “PATH” mode for the “FOR XML” clause provides a way to mix elements and attributes.
In above query, i have used this trick. I have used a blank element name for PATH mode inside PATH(”). It will cause to create a XML data with blank element name. Also nothing is specified as a column name for expression ‘, ‘ + SubjectName which is an expression for a column in our subquery. This blank column name will cause a blank child element name for the created XML. So in our XML data, we don’t have any element name and it’s just a string created from all the rows.
I have prefixed all the row values with a comma and then removed it from start which is easy to find. To remove it, i have used a STUFF function to do so. Adding the same comma at end and then removing it from end will need to find the length of the string to remove it from end. Now for each row in upper query, the sub query will execute and find all the rows in “dbo.Subject” table and then it will combine all the values in one row as described above.
##Below is an addition from comment section to highlight this important point in main post##
HTML character handling in FOR XML PATH
Pointed by : [Jeff Moden] & Answered by : [John Imel]
Thanks to @Jeff Moden for highlighting this point and @John Imel for answering this nicely in comment section of this post. It’s really appreciable and helpful. Hats off to both of you!
From comment section i am here adding a provision to handle html characters in above code. I am leaving the above code untouched, hence in case you have to use the output of above query in an html page where html encoded values will be decoded back, it will work fine. But in case you have to display it as it is somewhere, it will encode the html characters like;
& sign to @amp; and < sign to < and > sign to > etc.
To avoid such kind of encoding use this code;
SELECT StudentID, StudentName, Class, ISNULL(STUFF((SELECT ', ' + COALESCE(SubjectName, '') FROM dbo.StudentSubjectMappings SSM INNER JOIN dbo.Subjects SUB ON SUB.SubjectID = SSM.SubjectID WHERE SSM.StudentID = STD.StudentID FOR XML PATH(''), TYPE).value('.[1]', 'NVARCHAR(MAX)') , 1, 1, ''), 'Not Assigned Yet') AS SubjectList FROM dbo.Students STD
Here, TYPE directive forces the data type of this FOR XML query as xml data type on which we can apply xml type methods. In this query, value method has been applied with ‘.[1]’ as XQuery expression and NVARCHAR(MAX) as SQLType.
##Addition ends here##
A Question For You:
Are you interested to answer a simple question asked below?
Q : What will be the output of the below query:
SELECT LEN('ABC ')
Length of ABC string followed by 3 blank spaces using LEN function in SQL Server.
It may be a silly question for you but trust me, it can help some others to explore the way LEN function works.
I request you to try it without executing it in SQL and put your answer in comment section honestly. You can also put your explanation to help others.
Thanks for your support and patience.
Thank You So Much It’s Very HelpFull……
Sorry sir but your query was not working in my case the case was same as you shown???
so plz tell what can I do the query was no fire and it not showing any error?
Thank you for publishing this example, it helped me a lot!
ORDER CHEAP TADACIP ON LINE WITHOUT PRESCRIPTION!
* TOP PHARMACY LIST!
* MEDICATION WITHOUT A PRESCRIPTION!
* YOU CAN FIND: THE BEST QUALITY PILLS!
* WE GUARANTEE THAT ONCE YOU HAVE PURCHASED A PRODUCT FROM US YOU WILL GET THAT PRODUCT
WE THANK YOU FOR VISITING APPROVED ONLINE PHARMACY © 2013.
[b]Tags:[/b][b]Tadacip Over The Counter, Tadacip Online No Prescription, Where To Buy Tadacip, Tadacip Without Prescriptions, Can You Buy Tadacip Online, Where To Buy Tadacip[/b]
// Create coloumn with seprated column value.from row value
SELECT DISTINCT p.[Account ID],p.[Contact ID],p.[Account Name],p.[Email],p.[Contact Role],
STUFF((SELECT distinct ‘,’ + p1.Role
FROM finalsheet p1
WHERE p.[Contact ID]= p1.[Contact ID]
FOR XML PATH(”), TYPE
).value(‘.’, ‘NVARCHAR(MAX)’)
,1,1,”) RoleS
FROM finalsheet p
–group by p.[Account ID],p.[Contact ID],p.[Account Name],p.[Email],p.[Contact Role]
And here is another way to do it if you are trying avoid XML. Though xml is more flexible, efficient and dynamic.
;With TT as (
select ST.StudentID, CLASS, s.SubjectID, SubjectName from dbo.Subjects s join dbo.StudentSubjectMappings gs on s.SubjectID = gs.SubjectID
RIGHT JOIN STUDENTS ST ON GS.STUDENTID = ST.STUDENTID)
, TTT AS
(select *
from ( select StudentID,CLASS, SubjectID, SubjectName from TT) src
pivot
( MAX(SubjectName)for SubjectID in ([1], [2], [3],[4], [5], [6])) piv)
, TTTT AS
(SELECT STUDENTID,CLASS, ISNULL([1], ”)+’,’+ISNULL([2],”)+’,’+ISNULL([3],”)+’,’+ISNULL([4],”)+’,’+ISNULL([5],”)+’,’+ISNULL([6], ”)[SUBJECT]
FROM TTT )
,TTTTT AS(SELECT STUDENTID,CLASS, REPLACE([SUBJECT], ‘,,’,”) [SUBJECT] FROM TTTT)
SELECT StudentID ,Class, CASE
WHEN LEFT([SUBJECT], 1)= ‘,’ THEN ‘Not Assigned Yet’
ELSE [SUBJECT] END AS [SUBJECTLIST] FROM TTTTT ORDER BY StudentID
p.s. Keep up the good work, Gopal.
Gopal,
No… not “geniuses”. As John said, we’ve both been bit by these problems in the past. It’s just a bit of experience. Wink
Spot on, John.
There’s another impact to adding the “de-entitization” to the XML concatenation by using TYPE… it causes the query to take approximately 30% longer.
Here are the run results I get on a million row test…
–===== Test the XML concatenation WITH de-entitization.
(50000 row(s) affected)
SQL Server Execution Times:
CPU time = 6692 ms, elapsed time = 7509 ms.
–===== Test the XML concatenation WITHOUT de-entitization.
(50000 row(s) affected)
SQL Server Execution Times:
CPU time = 5148 ms, elapsed time = 5446 ms.
Don’t take my word for it though. Do your own testing. Here’s the test code I used. Remember not to take the first set of results because data will be cached in real life. Try adding some indexes and see what happens.
DROP TABLE #TestTable
GO
–===== Create a million row test table.
— SomeInt wil contain values from 1 to 50,000.
— SomeLetters2 will contain values from “AA” thru “ZZ”
— This only takes 2 seconds on my laptop. It uses a
— “pseudeo-cursor” contained in every SELECT (thay all
— loop behind the scenes and is amplified in quantity
— by the CROSS JOIN.
SELECT TOP 1000000
RowNum = IDENTITY(INT,1,1)
,SomeInt = ABS(CHECKSUM(NEWID()))%50000+1
,SomeLetters2 = CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
INTO #TestTable
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
;
GO
———————————————————————
PRINT ‘–===== Test the XML concatenation WITH de-entitization.’
SET STATISTICS TIME ON
;
SELECT SomeInt
,CSVList =
STUFF(
(
SELECT ‘, ‘ + SomeLetters2
FROM #TestTable tt2
WHERE tt2.SomeInt = tt1.SomeInt
ORDER BY SomeLetters2
FOR XML PATH(”),TYPE
).value(‘.[1]’, ‘VARCHAR(max)’
), 1, 1, ”)
FROM #TestTable tt1
GROUP BY tt1.SomeInt
;
SET STATISTICS TIME OFF
;
GO
———————————————————————
PRINT ‘–===== Test the XML concatenation WITHOUT de-entitization.’
SET STATISTICS TIME ON
;
SELECT SomeInt
,CSVList =
STUFF(
(
SELECT ‘, ‘ + SomeLetters2
FROM #TestTable tt2
WHERE tt2.SomeInt = tt1.SomeInt
ORDER BY SomeLetters2
FOR XML PATH(”)
)
, 1, 1, ”)
FROM #TestTable tt1
GROUP BY tt1.SomeInt
;
SET STATISTICS TIME OFF
;
GO
Hi @Jeff Moden, thanks for highlighting this point missed completely by me and @John Imel, thanks for answering this nicely. It’s really appreciable and helpful.
Both of you have been credited for your nice and helping effort in main post with an addition flag tag. You really deserve this credit.
Hats off to both of you genius! Please keep posting your comments and suggestions.
here you go JeffSmile
1) 6
2) 3 even though its nvarchar len still returns number of characters not number of bytes used less blanks
3) 12 as its nvarchar and thus uses 2 bytes per character ( this one got me a few times in the past) so it returns 12 ie 6 character x 2bytes per character = 12
the & sign would be converted to & ie for xml path converts special characters to html entities as its an xml specific function.
here is the fix to it
SELECT StudentID, StudentName, Class,
ISNULL(STUFF((SELECT ‘, ‘ + SubjectName
FROM dbo.StudentSubjectMappings SSM
INNER JOIN dbo.Subjects SUB ON SUB.SubjectID = SSM.SubjectID
WHERE SSM.StudentID = STD.StudentID
FOR XML PATH(”), TYPE ).value(‘.[1]’, ‘nvarchar(max)’
), 1, 1, ”), ‘Not Assigned Yet’) AS SubjectList
FROM dbo.Students STD
Impact to query or better the results is dependent on your usage, if this was for display on an html page then no impact as the & would still display as & but if its not then you got html entities in your lists. where this has gotten me into issues before is when you need to then compare the csv string to something and “Algebra & Trig” does not equal “Algebra & Trig”
Nice questions JeffSmile
The answer to the question is “3” because LEN does not consider trailing blanks.
Let’s continue in that same vein. What do the following return. As you say, answer without running the code.
1. SELECT DATALENGTH(‘ABC ‘);
2. SELECT LEN(N’ABC ‘);
3. SELECT DATALENGTH(N’ABC ‘);
Next question. If we change “Mathematics” to “Algebra & Trig” the “Subjects” table in your concatenation example, what fault will occur, how can it be fixed, and what is the impact to the query?
If no one answers by Wednesday, I’ll provide the answers.