LEFT JOIN only returns partial data - Synonyms

Hi SQLTeam

I have a LOT of experience with T-SQL, but here I have met a issue that I SIMPLY can't understand

On SQL2014 using SSMS 2016, I have 2 views (linked from another SQL Server), one containing the users work hours per day then other the users registered hours per day. Users might not have registred hours on each day where they have work hours. I therefore join the 2 tables using a LEFT OUTER JOIN.

When I add a date to the where clause then SOME (4th of July is OK 5th is not - not specific to these 2 dates though) of the data that exists on the right table disappear. If I test it by moving the data from the 2 views to TEMP tables then I get the expected result.

Here is the different sql statements

Driving me MAD, anyone with a good idea ?
THANK YOU

-- INNER JOIN RETURNS CORRECT RESULT
select TimeSheetHours.*
, Hours_Worked.RegisteredHours
, Hours_Worked.OverTimeAdjustment
, Hours_Worked.Hours_Worked
from
vw_TimeSheetHours TimeSheetHours
INNER JOIN
vw_HoursWorked Hours_Worked
ON TimeSheetHours.EMPLOYEENUMBER = Hours_Worked.EMPLOYEENUMBER AND TimeSheetHours.TheDate = Hours_Worked.ENTRYDATE
where TimeSheetHours.EMPLOYEENUMBER = '11252'
and TimeSheetHours.TheDate between '4-jul-2016' and '5-jul-2016'
order by TimeSheetHours.TheDate

-- LEFT OUTER JOIN RETURNS THE CORRECT RESULT WHEN NO DATE IN WHERE WHERE CLAUSE
select TimeSheetHours.*
, Hours_Worked.RegisteredHours
, Hours_Worked.OverTimeAdjustment
, Hours_Worked.Hours_Worked
from
vw_TimeSheetHours TimeSheetHours
LEFT OUTER JOIN
vw_HoursWorked Hours_Worked
ON TimeSheetHours.EMPLOYEENUMBER = Hours_Worked.EMPLOYEENUMBER AND TimeSheetHours.TheDate = Hours_Worked.ENTRYDATE
where TimeSheetHours.EMPLOYEENUMBER = '11252'
order by TimeSheetHours.TheDate

-- LEFT OUTER JOIN LOOSES THE LEFT SIDE WHEN A DATE IS ADDED TO THE WHERE CLAUSE
select TimeSheetHours.*
, Hours_Worked.RegisteredHours
, Hours_Worked.OverTimeAdjustment
, Hours_Worked.Hours_Worked
from
vw_TimeSheetHours TimeSheetHours
LEFT OUTER JOIN
vw_HoursWorked Hours_Worked
ON TimeSheetHours.EMPLOYEENUMBER = Hours_Worked.EMPLOYEENUMBER AND TimeSheetHours.TheDate = Hours_Worked.ENTRYDATE
where TimeSheetHours.EMPLOYEENUMBER = '11252'
and TimeSheetHours.TheDAte between '4-jul-2016' and '5-jul-2016'
order by TimeSheetHours.TheDate

--select * into #vw_TimeSheetHours from vw_TimeSheetHours where EMPLOYEENUMBER = '11252'
--select * into #vw_HoursWorked from vw_HoursWorked where EMPLOYEENUMBER = '11252'

-- SAME LOGIC AS BEFORE, BUT NOW THE RECORDS FROM THE 2 VIEWS ARE PUT INTO TEMP TABLES, NOW THE EXPECTED RESULT IS ACHIEVED
select TimeSheetHours.*
, Hours_Worked.RegisteredHours
, Hours_Worked.OverTimeAdjustment
, Hours_Worked.Hours_Worked
from
#vw_TimeSheetHours TimeSheetHours
LEFT OUTER JOIN
#vw_HoursWorked Hours_Worked
ON TimeSheetHours.EMPLOYEENUMBER = Hours_Worked.EMPLOYEENUMBER AND TimeSheetHours.TheDate = Hours_Worked.ENTRYDATE
where TimeSheetHours.EMPLOYEENUMBER = '11252'
and TimeSheetHours.TheDAte between '4-jul-2016' and '5-jul-2016'
order by TimeSheetHours.TheDate

Any chance one or more table schemas have changed since the views were created and the views do not have WITH SCHEMABINDING
?

Can you include the Hours_Worked.ENTRYDATE in your query to validate how that is matching? In order for this to match they must have the exact same date without times.

When you use BETWEEN you are only getting dates for '5-jul-2016 00:00:00.000'. If there are any time components to the dates you will not get those values returned.

Review the code for the views to see how those values are actually defined - I am guessing that you will find that the underlying date columns have time components and your BETWEEN statement is removing those items.

You can also try using:

WHERE TimeSheetHours.TheDate >= '4-jul-2016'
AND TimeSheetHours.TheDate < '6-jul-2016'

If that returns the expected values then you know it has something to do with the time portions of the date columns.

Hi Jeff

Thank you for responding

I have tried to explicitly cast the datetime to date, but that does not make a difference. I have validated that none of the date fields has any time values.

Note that first generating the temp tables results in the correct returned values.

I'll try your suggestions, for now I have "solved" the problem by using INNER JOIN and populating the right view with the values required. Dataset is quite small (< 100.000) rows.

Have a nice weekend

Michael

Hi gbritton

thank you, that was my thought as well, have tried to recreate the views.

I have a feeling that it is related to the linked server and the synonyms as the query run correctly on the server to where my query is linked.

/M

I would inspect the data VERY carefully.

SELECT TOP 1000 'HW>>', Hours_Worked.*, 'TSH>>', TimeSheetHours.*
FROM vw_HoursWorked AS Hours_Worked 
    LEFT OUTER JOIN vw_TimeSheetHours as TimeSheetHours
        ON TimeSheetHours.EMPLOYEENUMBER = Hours_Worked.EMPLOYEENUMBER
        AND TimeSheetHours.TheDate = Hours_Worked.ENTRYDATE 
WHERE Hours_Worked.EMPLOYEENUMBER = '11252'
      AND Hours_Worked.ENTRYDATE  BETWEEN  '4-jul-2016' and '5-jul-2016'

Change Employee Number and/or Entry Date Range to be sure to include the suspect data.

I note that you are providing EMPLOYEENUMBER as a text string. Perhaps the column is defined as an INT, but if not there is scope for some rogue character to have crept in there ... a leading space for example. If that is the case the relevant data will be missing from the resulset altogether (it won't match the EMPLOYEENUMBER = '11252' criteria)

What data do you get for for Hours_Worked that has no corresponding value in TimeSheetHours? Those are the rows where you need to figure out why the JOIN has failed.

Personally I would never use a date constant in for format '4-jul-2016', I can't imagine that one is ambiguous, but SQL parsing of date strings is open to interpretation - "jul" isn't going to work for certain languages, for example. Stick to "20160704' instead, that 8-digit string format will always be parsed unambiguously by SQL

Did you try changing the where clause to what I recommended? Have you tried including the ENTRYDATE values for those cases where it works?

Is it possible that ENTRYDATE is a calculated date value coming from your linked server?

I am thinking that this has to do with when the ENTRYDATE column is actually calculated and how SQL Server is sending that request over the linked server. It is possible that it is utilizing the underlying value from the linked server for the outer join - but using the converted value for the inner join.

If that is the case - and the ENTRYDATE column on the source system is not the same data type - then comparing the values prior to conversion to the matching data type in SQL Server would cause the situation you are seeing.

Hi Jeff

sorry for the late reply,

Whatever I did solved it. Upgrading to SQL 2016 did though, so I guess it was a bug.

Thank you for looking into it.

/M