A better way to join two tables using multiple criterias

I have below two tables which i want to combine using if table 1 ref1 column or table 1 ref2 column matches with table 2 ref1 column. Currently i'm using 2 select statements with union all. I'm interested to know any other alternatives.
--tb1
CREATE TABLE [TEST].[dbo].[tb1]
(Ref1 FLOAT, Ref2 FLOAT, Value FLOAT)

INSERT INTO [TEST].[dbo].[tb1] VALUES
(1001,null,2000),
(1003,1005,5000),
(2004,2003,4000),
(3001,null,4000)

--tb2
CREATE TABLE [TEST].[dbo].[tb2]
(Ref1 FLOAT, NewValue FLOAT)

INSERT INTO [TEST].[dbo].[tb2] VALUES
(1001,1),
(1005,1),
(3001,1)
--my code
SELECT tb1.[Ref1]
,tb1.[Ref2]
,tb1.[Value]
,tb2.Ref1
,tb2.NewValue
FROM [TEST].[dbo].[tb1] tb1
FULL OUTER join [TEST].[dbo].[tb2] tb2
ON tb1.Ref1 = tb2.Ref1
WHERE (tb1.Ref1 = tb2.Ref1 or
tb1.Ref2 = tb2.Ref1)
UNION all
SELECT tb1.[Ref1]
,tb1.[Ref2]
,tb1.[Value]
,tb2.Ref1
,tb2.NewValue
FROM [TEST].[dbo].[tb1] tb1
FULL OUTER join [TEST].[dbo].[tb2] tb2
ON tb1.Ref2 = tb2.Ref1
WHERE (tb1.Ref1 = tb2.Ref1 or
tb1.Ref2 = tb2.Ref1)

Result i'm getting is ;
Ref1 Ref2 Value Ref1 NewValue
1001 NULL 2000 1001 1
3001 NULL 4000 3001 1
1003 1005 5000 1005 1

You have hit upon my personal pet peeve so forgive me if I come off as exasperated...
[Soapbox]You are performing an OUTER JOIN which can and usually does return Null values on the joining columns and then REQUIRING in the Where clause that those same columns have non-Null values. Stop doing that immediately, This is mathematically equivalent to performing an INNER JOIN which is going t be more performant.[/Soapbox]
Moving on...
You know the data better than I but let me point out that the the UNION ALL could return the same row twice if both tb1.Ref1 and tb1.Ref2 are equal to tb2.Ref1. If you are convinced that such a thing could never happen then the UNION ALL will work. My personal experience is that impossible data combinations are eventually going to occur but, like I said, you know the data better than I.
Moving on...
A possible alternative to consider is to just perform the single SELECT but use this join logic:INNER JOIN test.dbo.tb2 ON tb2.Ref1 in (tb1.Ref1, tb1.Ref2)This may or may not perform better. You'll nee to test to see if this out performs the two SELECTs, using an INNER JOIN and combining them with the UNION ALL.
HTH

1 Like

My experience too ... "It depends" ... but if the number of columns in the SELECT is small how about a pair of OUTER JOINS and just COALESCE the data from each of them?

Can't remember actually doing this (as an alternative to UNION ALL) so would be interest in whether you think it worthy of consideration? :slight_smile:

SELECT tb1.[Ref1]
    ,tb1.[Ref2]
    ,tb1.[Value]
    , [tb2.Ref1] = COALESCE(TB2A.Ref1, TB2B.Ref1)
    , [tb2.NewValue] = CASE WHEN TB2A.Ref1 IS NOT NULL THEN TB2A.NewValue ELSE TB2B
FROM [TEST].[dbo].[tb1] AS tb1
    LEFT OUTER join [TEST].[dbo].[tb2] AS TB2A
        ON TB2A.Ref1 = tb1.Ref1
    LEFT OUTER join [TEST].[dbo].[tb2] tb2 TB2B
        ON TB2B.Ref1 = tb1.Ref2

That isn''t going to get anything from [tb2] which has no matching key in [tb1] - dunno if the purpose of the FULL JOIN was to achieve that?