SQLTeam.com | Weblogs | Forums

JOIN to Child Table multiple times to flatten the data


#1

I have PARENT and CHILD tables. The Child table contains ID of parent, and a Type code (e.g. "Application" and "Permit"). There can be multiple Applications and Permits, but I'm only interested in the first one or two. The Child Table contains a Sequence column, but the rows are not necessarily numbered contiguously 1, 2, 3 ... - e.g. they might be numbered 2, 5, 17, ...

In reality (but not in my example) there are some child row Type codes for which there can only be one row (3rd party APP, I suppose it is written this way to be Flexible, but it means I have to join the Child Table multiple times to get a handful of values that could be individual columns in the PARENT table).

So I'm looking for efficient / creative ways to "flatten" this Child Data

Where there is only one Child Record (for a given type) and provided that I am confident that the sequence number is always = 1 (I do have cases where that IS the case) then I can just JOIN to get that.But its a dozen JOINs to the same table ... is there a better way?

Where there are multiple Child Records then I have very little confidence that the first two Sequence Numbers will be 1 and 2 (in practice there is also a "Ceased flag", and those rows will be ignored, which is what causes the actual Sequence Numbers to be non-contiguous, so for these I've been trying a PIVOT instead. That may be horrifically inefficient though?

Other methods / suggestions?

data:


CREATE TABLE #PARENT
(
	P_ID		int NOT NULL
	, P_Name	varchar(20) NOT NULL
	, PRIMARY KEY
	(
		P_ID
	)
)
CREATE TABLE #CHILD
(
	C_P_ID		int NOT NULL
	, C_Type	varchar(5) NOT NULL
	, C_Sequence	int NOT NULL
	, C_Name	varchar(20) NOT NULL
	, PRIMARY KEY
	(
		C_P_ID
		, C_Type
		, C_Sequence
	)
)

INSERT INTO #PARENT
VALUES(1, 'Bill')
,(2, 'Wendy')

INSERT INTO #CHILD
VALUES(1, 'A', 1, 'BillApplication01')
,(1, 'A', 3, 'BillApplication02')
,(1, 'P', 1, 'BillPermit01')
,(1, 'P', 4, 'BillPermit02')
,(2, 'A', 1, 'WendyApplication01')
,(2, 'A', 4, 'WendyApplication02')

Simple query to just get first Application and Permit for each PARENT

SELECT	[P]=P.P_Name, [A1]=A1.C_Name, [P1]=P1.C_Name
FROM	#PARENT AS P
	LEFT OUTER JOIN #CHILD AS A1
		 ON A1.C_P_ID = P.P_ID
		AND A1.C_Type = 'A'
		AND A1.C_Sequence = 1
	LEFT OUTER JOIN #CHILD AS P1
		 ON P1.C_P_ID = P.P_ID
		AND P1.C_Type = 'P'
		AND P1.C_Sequence = 1

Add a JOIN to get the Second Permit (i.e. where the Sequence number might NOT be 2)

-- SELECT add:
	, [P2Seq]=P2.C_Sequence, [P2]=P2.C_Name

	LEFT OUTER JOIN
	(
		-- Get 2nd child (for Type="P")
		SELECT	C_P_ID
			, C_Sequence
			, C_Name
			, [C_ItemNo] = ROW_NUMBER()
				OVER
				(
					PARTITION BY C_P_ID
					ORDER BY C_Sequence
				)
		FROM	#CHILD
		WHERE	C_Type = 'P'
	) AS P2
		 ON P2.C_P_ID = P.P_ID
		AND P2.C_ItemNo = 2

Full query using a PIVOT to get First and Second of Application and Child records

SELECT	[P]=P.P_Name, [A1]=A1.C_Name, [P1]=P1.C_Name
	, [P2Seq]=P2.C_Sequence, [P2]=P2.C_Name
	, [A1], [A2], [P1], [P2]
FROM	#PARENT AS P
	LEFT OUTER JOIN #CHILD AS A1
		 ON A1.C_P_ID = P.P_ID
		AND A1.C_Type = 'A'
		AND A1.C_Sequence = 1
	LEFT OUTER JOIN #CHILD AS P1
		 ON P1.C_P_ID = P.P_ID
		AND P1.C_Type = 'P'
		AND P1.C_Sequence = 1
	LEFT OUTER JOIN
	(
		-- Get 2nd child (for Type="P")
		SELECT	C_P_ID
			, C_Sequence
			, C_Name
			, [C_ItemNo] = ROW_NUMBER()
				OVER
				(
					PARTITION BY C_P_ID
					ORDER BY C_Sequence
				)
		FROM	#CHILD
		WHERE	C_Type = 'P'
	) AS P2
		 ON P2.C_P_ID = P.P_ID
		AND P2.C_ItemNo = 2
	-- The PIVOT JOIN:
	LEFT OUTER JOIN 
	(
		SELECT	C_P_ID
			, [A1], [A2], [P1], [P2]
		FROM
		(
			SELECT	C_P_ID
				, [C_Type_ItemNo] = C_Type + CONVERT(varchar(20), C_ItemNo)
				, C_Name
			FROM
			(
				SELECT
					C_P_ID, C_Type, C_Sequence, C_Name
					, [C_ItemNo] = ROW_NUMBER()
						OVER
						(
							PARTITION BY C_P_ID,  C_Type
							ORDER BY C_Sequence
						)
				FROM	#CHILD
				WHERE	C_Type IN('A', 'P')
			) AS T
		) AS C
		PIVOT
		(
			MIN(C_Name)
			FOR	C_Type_ItemNo IN ([A1], [A2], [P1], [P2])
		) AS T_Pivot
	) AS T_Pivot
		 ON T_Pivot.C_P_ID = P_ID

Cleanup:

GO
DROP TABLE #PARENT
GO
DROP TABLE #CHILD
GO

results:

P     A1                 P1           P2Seq P2           A1                 A2                 P1           P2
----- ------------------ ------------ ----- ------------ ------------------ ------------------ ------------ ------------
Bill  BillApplication01  BillPermit01 4     BillPermit02 BillApplication01  BillApplication02  BillPermit01 BillPermit02
Wendy WendyApplication01 NULL         NULL  NULL         WendyApplication01 WendyApplication02 NULL         NULL

#2

I'm not going to tell you anything that you don't already know and I make no assertion about the efficiency...
If the Sequence number is not guaranteed to be 1-based and sequential, you can certainly impose that condition using the ROW_NUMBER() option, perhaps in a WITH construct or a subquery.
If the Type defines certain values that are guaranteed to need only one join while others would require potentially multiple joins, you break the query into two parts and UNION ALL them together. Something along the lines of:[code]select ...
from Parent join Child c1 on...
where Type in ('A', 'B', 'C')

union all

select ...
from Parent join Child c1 on...
join child c2 on ...
join child c3 on ...
...
where Type NOT in ('A', 'B', 'C')[/code]Just some stray thoughts. Hope they are in some way useful.


#3

I am not going to tell you anything you don't already know either; I was looking at your sample data and the final result set (which I assume is what you want to get) and using "manual" pivoting.

;WITH cte AS
(	SELECT *,
		ROW_NUMBER() OVER (PARTITION BY C_P_ID, C_TYPE ORDER BY C_SEQUENCE) AS RN
	FROM #CHILD
)
SELECT
	p.P_Name,
	MAX(CASE WHEN c.C_Type = 'A' AND RN=1 THEN c.C_Name END) AS A1,
	MAX(CASE WHEN c.C_Type = 'P' AND RN=1 THEN c.C_Name END) AS P1,
	MAX(CASE WHEN c.C_Type = 'P' AND RN=2 THEN c.C_Sequence END) AS P2Seq,
	MAX(CASE WHEN c.C_Type = 'P' AND RN=2 THEN c.C_Name END) AS P2,
	MAX(CASE WHEN c.C_Type = 'A' AND RN=1 THEN c.C_Name END) AS A1,
	MAX(CASE WHEN c.C_Type = 'A' AND RN=2 THEN c.C_Name END) AS A2,
	MAX(CASE WHEN c.C_Type = 'P' AND RN=1 THEN c.C_Name END) AS P1,
	MAX(CASE WHEN c.C_Type = 'P' AND RN=2 THEN c.C_Name END) AS P2
FROM
	#PARENT p
	INNER JOIN cte c ON
		c.C_P_ID = p.P_ID
GROUP BY
	p.P_Name;

#4

Ah yes, good idea, and much better than having a dozen JOINs, to the same table, in my main query, thanks.

I wish someone would!!

In fact you (both :slight_smile: ) just did:

Off the top of my head I think I much prefer that to using PIVOT. The Child Table has a lot of stuff, and in the main I only want a small sub-set of it, so a Manual Pivot may be much "tighter"

At the very least it will be worth me doing a Performance Comparison.

Thanks Chaps.