System: SQL Server from 2005 and up
Visualising data
In this blog post, I'll present one of my favorites methods, to easily visualize data in a smart way. I'm going to explain the method, using these 2 tables:
Items:
Id Name ColorId
----------- -------------------------------------------------- -----------
1 Cranberries 1
2 Plants 2
3 Shrimps 1
4 Ocean 3
5 Sky 3
6 Roses 1
And
Colors:
Id Color
----------- --------------------------------------------------
1 Red
2 Green
3 Blue
The most used way to show, the color for each item would be to join the tables like:
SELECT Items.Id, Items.Name, Color FROM Items
INNER JOIN Colors ON Items.ColorId = Colors.Id
Which gives:
Name Color
-------------------------------------------------- -----------------
Cranberries Red
Plants Green
Shrimps Red
Ocean Blue
Sky Blue
Roses Red
It is readable, and if we sort by color it might be more readable. It might be most readable, if we could group the items by color. It requires a bit more SQL; but the result is better:
SELECT
STUFF(
(SELECT ', ' + Name
FROM Items
where Items.ColorId = Colors.Id
FOR XML PATH (''))
, 1, 1, '') AS Items,
Color
FROM Colors;
Gives:
Items Color
------------------------------------ ------------------------------------------
Cranberries, Shrimps, Roses Red
Plants Green
Ocean, Sky Blue
Explanation
The magic happens in the inner SQL. FOR XML PATH is used as a smart way for concatenating string columns to one string column. By adding the ',' to Name, we are getting a comma separated list, where the first char is ','. For more about FOR XML PATH see: http://technet.microsoft.com/en-us/library/ms189885.aspx
STUFF are used to remove the first char(in this case ','), from the comma separated lists. Stuff is a sql command for inserting strings into strings, but is can also remove chars. In this case we are overwriting first char with a empty string. For more about STUFF see: http://technet.microsoft.com/en-us/library/ms188043.aspx
STUFF are used to remove the first char(in this case ','), from the comma separated lists. Stuff is a sql command for inserting strings into strings, but is can also remove chars. In this case we are overwriting first char with a empty string. For more about STUFF see: http://technet.microsoft.com/en-us/library/ms188043.aspx
Why STUFF? Substring is simpler and everybody knows
ReplyDeleteThat is a very good question indeed. The answer is:
DeleteLooking at substrings parameters:
SUBSTRING ( expression ,start , length )
You will notice, that substring requires a length. Providing the length of the expression in this case is quite difficult. The problem is, we are not storing the string from the inner select, so can't use LEN(). You could extends the snippet, to store the string from the inner select, and then use SUBSTRING combined with LEN(stored string), but then you are increasing complexity, and losing the compactness of the snippet.
In this case I just use 999999999999 for length
ReplyDeleteIt is also a way to do it :-)
DeleteI'm not sure it is pretty solution in my book, because there is a t-sql function(STUFF), which can do the work without i have to provide a magic number. But it all comes down to taste
According to the query plan and statistics, the cost is more or less the same.
So _performance_ wise, I can't find any evidence for using STUFF over SUBSTRING, in this case.