Select Statement doesn't Tally

Hi,

I have a table generated as;

INSERT INTO dbo.CPRDLkupMedical (Snomed,SnomedDescription,EmisCode,ReadCode,Term) SELECT DISTINCT Snomed,SnomedDescription, case WHEN left(Emiscode,7) = 'EMISATT' THEN 'EMISATT' WHEN left(Emiscode,5)= 'PCSDT' THEN 'PCSDT' ELSE Emiscode END AS Emiscode, CASE WHEN left(Readcode,5)= 'PCSDT' THEN 'PCSDT' WHEN left(Readcode,7)= 'EMISATT' THEN 'EMISATT' ELSE Readcode END AS Readcode, Term FROM dbo.CPRDALLMedicalCodes with (tablock)

Then, I have another Select statement as;

SELECT     EMISJun15Btch1.dbo.Referral.ReferralGuidDigest, dbo.EmisPatient.PatientID, dbo.EmisConsultation.ConsultationID, dbo.EmisStaff.StaffID, 
                      EMISJun15Btch1.dbo.Referral.EffectiveDateTime AS ReferralDate, EMISJun15Btch1.dbo.Referral.Status AS ReferralStatus, 
                      EMISJun15Btch1.dbo.Referral.Mode AS ReferralMode, EMISJun15Btch1.dbo.Referral.ServiceType, EMISJun15Btch1.dbo.Referral.Urgency, 
                      EMISJun15Btch1.dbo.Referral.Direction, EMISJun15Btch1.dbo.Referral.Transport, EMISJun15Btch1.dbo.Referral.EndedDate, 
                      EMISJun15Btch1.dbo.Referral.ReceivedDate, dbo.CPRDLkupMedical.MedicalCodeID, EMISJun15Btch1.dbo.Referral.Term, 
                      EMISJun15Btch1.dbo.Referral.SnomedCTConceptId, EMISJun15Btch1.dbo.Referral.SnomedCTDescriptionId, EMISJun15Btch1.dbo.Referral.EmisCode, 
                      EMISJun15Btch1.dbo.Referral.ReadCode
FROM         EMISJun15Btch1.dbo.Referral WITH (tablock) INNER JOIN
                      dbo.EmisPatient ON EMISJun15Btch1.dbo.Referral.PatientGuidDigest = dbo.EmisPatient.PatientGuidDigest LEFT OUTER JOIN
                      dbo.EmisConsultation ON EMISJun15Btch1.dbo.Referral.ConsultationGuidDigest = dbo.EmisConsultation.ConsultationGuidDigest INNER JOIN
                      dbo.EmisStaff ON EMISJun15Btch1.dbo.Referral.AuthorisingUserRoleGuidDigest = dbo.EmisStaff.StaffGuidDigest LEFT OUTER JOIN
                      dbo.CPRDLkupMedical ON (EMISJun15Btch1.dbo.Referral.SnomedCTConceptId = dbo.CPRDLkupMedical.Snomed OR
                      EMISJun15Btch1.dbo.Referral.SnomedCTConceptId IS NULL AND dbo.CPRDLkupMedical.Snomed IS NULL) AND 
                      (EMISJun15Btch1.dbo.Referral.SnomedCTDescriptionId = dbo.CPRDLkupMedical.SnomedDescription OR
                      EMISJun15Btch1.dbo.Referral.SnomedCTDescriptionId IS NULL AND dbo.CPRDLkupMedical.SnomedDescription IS NULL) AND 
                      ISNULL(EMISJun15Btch1.dbo.Referral.Term, N'(novalue)') = ISNULL(dbo.CPRDLkupMedical.Term, N'(novalue)') AND 
                      ISNULL(CASE WHEN LEFT(EMISJun15Btch1.dbo.Referral.ReadCode, 5) = 'PCSDT' THEN 'PCSDT' WHEN LEFT(EMISJun15Btch1.dbo.Referral.Readcode, 7) 
                      = 'EMISATT' THEN 'EMISATT' ELSE EMISJun15Btch1.dbo.Referral.ReadCode END, N'(novalue)') COLLATE SQL_Latin1_General_CP1_CS_AS = ISNULL(dbo.CPRDLkupMedical.Readcode,
                       N'(novalue)') COLLATE SQL_Latin1_General_CP1_CS_AS AND ISNULL(CASE WHEN LEFT(EMISJun15Btch1.dbo.Referral.Emiscode, 5) 
                      = 'PCSDT' THEN 'PCSDT' WHEN LEFT(EMISJun15Btch1.dbo.Referral.Emiscode, 7) = 'EMISATT' THEN 'EMISATT' ELSE EMISJun15Btch1.dbo.Referral.EmisCode END, 
                      N'(novalue)') COLLATE SQL_Latin1_General_CP1_CS_AS = ISNULL(dbo.CPRDLkupMedical.Emiscode, N'(novalue)') COLLATE SQL_Latin1_General_CP1_CS_AS

Now, EmisJun15batch1.dbo.Referral has 11,330,536 records

Running the 2nd Select statement gives me 11,350,426 records.

My question is - why am I receiving extra fields in the select statement and not as the same number of records as dbo.Referral in EmisJun15Batch1 DB.

Thanks

Because one of the joined tables
dbo.EmisPatient
dbo.EmisConsultation
dbo.EmisStaff
dbo.CPRDLkupMedical
is matching more than one record

Actually, seeing that CPRDLkupMedical is your "populated table" it might be that:

You have an OUTER JOIN to CPRDLkupMedical so you may be matching records NOT in that table

Change OUTER join to INNER and see if you get the right number of rows.

... or add a WHERE clause:

WHERE dbo.CPRDLkupMedical.Snomed IS NULL

to see the rows which did NOT match anything in CPRDLkupMedical. the COUNT of that resultset might be the number that makes up the difference

As A TEST;

I did the following;

SELECT     EMISJun15Btch1.dbo.Referral.ReferralGuidDigest, dbo.EmisPatient.PatientID, dbo.EmisConsultation.ConsultationID
FROM         EMISJun15Btch1.dbo.Referral INNER JOIN
                      dbo.EmisPatient ON EMISJun15Btch1.dbo.Referral.PatientGuidDigest = dbo.EmisPatient.PatientGuidDigest LEFT OUTER JOIN
                      dbo.EmisConsultation ON EMISJun15Btch1.dbo.Referral.ConsultationGuidDigest = dbo.EmisConsultation.ConsultationGuidDigest

Results:
11,350,426 records.

When INNER JOIN with patient table the records were accurate at 11,330,536 records

So I can confirm that the problem is at the Consultation table. My question is ConsultationGuidDigest is the primary key at the dbo.EmisConsultation.ConsultationGuidDigest. Therefore, no duplicates ConsultationGuidDigest.

Secondly, I am using a LEFT OUTER JOIN, therefore, I am expecting the records to be the same as the ones in EMISJun15Btch1.dbo.Referral, Correct?

Any help received will indeed be highly appreciated.

Thank you

Provided that the relationship between both Referral and EmisPatient, and Referral and EmisConsultation is 1-to-1 (albeit that EmisConsultation is "optional") then yes.

If you have a SINGLE column with a Unique ID in each table (if not then it can be done, but its "harder") then:

SELECT COUNT(*), COUNT(DISTINCT MyUniqueReferralID)
FROM EMISJun15Btch1.dbo.Referral

Both Counts should be the same (otherwise MyUniqueReferralIDis not unique, Natch!)

SELECT COUNT(*), COUNT(DISTINCT MyUniqueReferralID), COUNT(DISTINCT MyUniquePatientID) FROM EMISJun15Btch1.dbo.Referral INNER JOIN dbo.EmisPatient ON EMISJun15Btch1.dbo.Referral.PatientGuidDigest = dbo.EmisPatient.PatientGuidDigest

Again, all 3 counts should be the same. If the MyUniqueReferralID count is LESS then before:

Not all Referral records have a matching EmisPatient
AND multiple EmisPatient records match some Referral records

If MyUniqueReferralID count is the same as before, but the overall Count has gone up, then:

All Referral records have a matching EmisPatient
AND multiple EmisPatient records match some Referral records

Then you can add more INNER JOINS / OUTER JOIN to see where/when the counts change

Note that a JOIN of Tables A, B and C might give you the right overall rowcount, but you might have some missing rows from B and some double-matched rows from C such that, magically, the overall count is correct - hence it is worthwhile testing each JOIN separately (or by using a COUNT(DISTINCT MyUniqueChildTableID) for each table.