Join and inner join

jollyegeorge · April 29, 2016, 7:19pm

I thought INNER JOIN and JOIN give me the same results,

I am finding lot of differences when I use both, can I know why ?

FROM CV3Order ORD (NOLOCK)
JOIN CV3ActiveVisitList AVL (NOLOCK)
ON ORD.ClientGUID = AVL.ClientGUID
AND ORD.ChartGUID = AVL.ChartGUID
JOIN CV3MedicationExtension MEX
ON ORD.GUID = MEX.GUID
AND ORD.ClientGUID = MEX.ClientGUID
JOIN CV3OrderCatalogMasterItem OCMI (NOLOCK)
ON OCMI.GUID = ORD.OrderCatalogMasterItemGUID
AND OCMI.CodedTimeGroupCode = 'RX_General'
AND OCMI.Active = 1 AND OCMI.ExpiryDate IS NULL

JamesK · April 29, 2016, 7:25pm

JOIN when used by itself (with no LEFT, RIGHT etc. prepended to it) implicitly assumes INNER JOIN, so you would get the same results whether you included the keyword INNER or not. You may be getting different results because of the NOLOCK hints. NOLOCK returns dirty data. Using NOLOCK is strongly discouraged.

jollyegeorge · April 29, 2016, 7:32pm

Well .. that explains it. . .. I removed the lock and all good . .. Thank you so much

Kristen · April 30, 2016, 10:35am

There are some really insidious side effects of NOLOCK, quite apart from the obvious inclusion of dirty-reads. NOLOCK seems to be used as a crutch by developers unaware of the side-issues and has traditionally been used to solve performance and contention issues. There was no other way way-back-when in SQL 7 and SQL 2000 ... but nowadays there are excellent alternatives (such as Read Committed Snapshot) and I very strongly recommend that you avoid NOLOCK. The only exception we make is to allow it on queries used only by developers to investigate Production systems where we do not want our query to intrude on the actual user's work. We do not allow NOLOCK to be used on code in Production systems.

But ... we have plenty of 3rd party APPs that use NOLOCK ... I know because I get email alerts several times a week - and those alerts ONLY trigger when code containing a NOLOCK has been aborted by SQL server, and that occurrence is much more rare than SQL returning a row twice, or failing to return a row at all, which (from my user's perspective) could cause them to take a business-critical decision incorrectly (and for data that is in the database, perfectly formed, but just excluded, or duplicated, in the query that the APP ran.

jollyegeorge · May 1, 2016, 4:02pm

Thanks a lot for the information Most of the people here I have seen using NOLOCK .. and this is something I should share.

Kristen · May 2, 2016, 8:04am

Probably for historical reasons. I see lots of places where it is still used in the belief that it is the correct solution.

There are two issues. The first is dirty data reads; your app reads some data which is then rolled back. In practice this pretty much never happens (well, not in the APPs I've ever been involved in!) so Developers consider it a non-issue, and that's probably fair enough.

The second, much more serious issue, is when your APP is pulling some data from the server. Lets say that it is traversing an index and is about to read Index Page 1234 and just before then another user has inserted a new row into that index page. However, the index page is already full, so SQL has to split that index page.

If you are using NOLOCK one of two things can happen:

The index page is split, your query reads the first "half page" but your query is running faster than the index page splitter function, so you read the next index page before the newly split half has been written back to the disk. So now you are missing some rows (i.e. rows that were present in the table a few moments ago, and will be back in the table in a few moments time)
You read the index page and then the other process splits the page. In this example your process is running slower, and the other process splits the page and writes the "second half" to a new index page, and THEN your process reads that page. You get that second half-page of index entries a second time.

There is also a third outcome where SQL determines that there is a conflict between the two processes and terminates your process. To my mind this is the best outcome because the user will not see some-missing or some-duplicated data ... but ... that depends on what your APP does when this error is triggered - its a very unusual edge-condition, so your APP might proceed and do something goofy, rather than just aborting. If the user reports an error you'll never be able to reproduce it ...

You'll also never be able to reproduce errors 1 & 2 easily, as the timing is critical to cause it to happen, but the fact that I get email alerts of Error #3 about once a week (there are 20 people in the office, its not a big organisation ...) suggests to me that my users are seeing Outcome #1 & #2 on a fairly frequent basis

Kristen · May 2, 2016, 8:06am

P.S> You may be able to turn on Read Committed Snapshot for your database and not require any code changes. It needs testing, because there are some, rare, situations where code change is needed, and it will impact on usage of TEMPDB which might hurt your, but it works fine for us and many others. (No code changes = you will still have to remove all the NOLOCK from the code ...)