SQLTeam.com | Weblogs | Forums

How to read records with a variable number of rows into columns


#1

Hi, I have very limited SQL experience but need to urgently need some assistance to achieve the following:

I have data in a table as shown in the example below. Each record has a variable number of attributes with unique contents. Each separate Attribute for a record is stored in a new row.

Record | Attribute | Contents
123 | Attr1 | Cont1
123 | Attr2 | Cont2
123 | Attr3 | Cont3
124 | Attr1 | Cont4
124 | Attr2 | Cont5
125 | Attr1 | Cont6
126 | Attr4 | Cont7
126 | Attr5 | Cont8

I would like to read the data into a new table, but converting the Attributes to columns, effectively creating 1 row for each Record. Example below.

Record | Attr1 | Attr2 | Attr3 | Attr4 | Attr5
123 | Cont1 | Cont2 | Cont3 | NULL | NULL
124 | Cont4 | Cont5 | NULL | NULL | NULL
125 | NULL | NULL | NULL | Cont7 | NULL
126 | NULL | NULL | NULL | NULL | Cont8

Any assistance with the syntax would be greatly appreciated!

Gordon.


#2

I strongly advise you not to create a table to store the Attributes in Columns - unless perhaps it is just to cache the data for performance and the original data will be retained, and maintained, in the original "vertical" table.

You can "convert" your vertical data into Columns using PIVOT


-- Pivot
SELECT	*
FROM
(
        SELECT	[pivot_col]=[Attribute], [Record], [Contents]
	FROM	YourDataTable
) AS T 
PIVOT 
(
	MIN(Contents)
	FOR pivot_col IN
	(
		[Attr1], [Attr2], [Attr3], [Attr4], [Attr5]
	)
) AS P

#3

Results:

Record      Attr1 Attr2 Attr3 Attr4 Attr5 
----------- ----- ----- ----- ----- ----- 
123         Cont1 Cont2 Cont3 NULL  NULL
124         Cont4 Cont5 NULL  NULL  NULL
125         Cont6 NULL  NULL  NULL  NULL
126         NULL  NULL  NULL  Cont7 Cont8
    
(4 row(s) affected)

#4

Thank you so much - worked like a charm!


#5

Another approach (I personally find the PIVOT/UNPIVOT syntax less readable)

--Setup Test Data
DECLARE	@Chuff TABLE
(
	Record INT NOT NULL,
	Attribute VARCHAR(10) NOT NULL,
	Contents VARCHAR(10) NOT NULL
)
INSERT INTO @Chuff (Record,Attribute,Contents)
VALUES	(123,'Attr1','Cont1'),
		(123,'Attr2','Cont2'),
		(123,'Attr3','Cont3'),
		(124,'Attr1','Cont4'),
		(124,'Attr2','Cont5'),
		(125,'Attr1','Cont6'),
		(126,'Attr4','Cont7'),
		(126,'Attr5','Cont8');
		
--Run Query
SELECT	C.Record,
		Attr1 = MAX(CASE WHEN C.Attribute = 'Attr1' THEN C.Contents END),
		Attr2 = MAX(CASE WHEN C.Attribute = 'Attr2' THEN C.Contents END),
		Attr3 = MAX(CASE WHEN C.Attribute = 'Attr3' THEN C.Contents END),
		Attr4 = MAX(CASE WHEN C.Attribute = 'Attr4' THEN C.Contents END),
		Attr5 = MAX(CASE WHEN C.Attribute = 'Attr5' THEN C.Contents END)
FROM	@Chuff AS C
GROUP	BY C.Record;

#6

Works just as well and does read easier - thank you. Are there potential performance differences with the 2 types of syntax when running against 10,000+ records each with a varying number of attributes per record with form a possibility of 1000 attributes?


#7

I expect so! You'd need to try it to see which performs best. I too find the PIVOT syntax hard to read. My instinct is that SQL should be able to better optimise the PIVOT syntax as you are, in effect, giving it all the information it needs for MS DEVs to build the most optimal method :smile: but of course you might find that a series of CASE statements works better.

I presume you can only have one "Attr1" for a given row in [YourDataTable]? If so then another way to consider would be to LEFT JOIN the Child Table multiple times, once for each attribute. That would avoid any sort of Aggregating process and it might turn out to be faster. I also think it might be? easier to set up Indexes for that solution which improve performance

--Setup Test Data
DECLARE	@MasterTable TABLE
(
	Record INT NOT NULL,
	Col2 VARCHAR(10) NOT NULL,
	Col3 VARCHAR(10) NOT NULL
)

DECLARE	@ChildTable TABLE
(
	Record INT NOT NULL,
	Attribute VARCHAR(10) NOT NULL,
	Contents VARCHAR(10) NOT NULL
)

INSERT INTO @MasterTable (Record,Col2,Col3)
VALUES	(123,'AAA','XXX'),
	(124,'BBB','YYY'),
	(125,'CCC','ZZZ'),
	(126,'DDD','WWW')

INSERT INTO @ChildTable (Record,Attribute,Contents)
VALUES	(123,'Attr1','Cont1'),
		(123,'Attr2','Cont2'),
		(123,'Attr3','Cont3'),
		(124,'Attr1','Cont4'),
		(124,'Attr2','Cont5'),
		(125,'Attr1','Cont6'),
		(126,'Attr4','Cont7'),
		(126,'Attr5','Cont8');
--Run Query
SELECT	M.Record,
	M.Col2,
	M.Col3,
	Attr1 = Attr1.Contents,
	Attr2 = Attr2.Contents,
	Attr3 = Attr3.Contents,
	Attr4 = Attr4.Contents,
	Attr5 = Attr5.Contents
FROM	@MasterTable AS M
	LEFT OUTER JOIN @ChildTable AS Attr1
		 ON Attr1.Record = M.Record
		AND Attr1.Attribute = 'Attr1'
	LEFT OUTER JOIN @ChildTable AS Attr2
		 ON Attr2.Record = M.Record
		AND Attr2.Attribute = 'Attr2'
	LEFT OUTER JOIN @ChildTable AS Attr3
		 ON Attr3.Record = M.Record
		AND Attr3.Attribute = 'Attr3'
	LEFT OUTER JOIN @ChildTable AS Attr4
		 ON Attr4.Record = M.Record
		AND Attr4.Attribute = 'Attr4'
	LEFT OUTER JOIN @ChildTable AS Attr5
		 ON Attr5.Record = M.Record
		AND Attr5.Attribute = 'Attr5'
ORDER BY M.Record