Query Performance

In my DB, there is a master table 'Users' which has 1,126,215 records.
As we are getting some query performance issues in the SSRS reports,
the SSRS report queries that use the 'Users' table are under the scanner.

The DBA of the team has suggested two changes in the code:

a) Add 'WITH (NOLOCK)' in the table joins involving the table 'Users'

Example:

SELECT
	  SUBSTRING(U.activityCode, 0, 13) [Trail Code],
	  CONVERT(char(10), U.activityEndDate, 126) [End Date],
	  St.Studycode [Study]
FROM
	  dbo.Studies St WITH (NOLOCK)
	  INNER JOIN dbo.Users U WITH (NOLOCK)
	        ON St.UserId = U.UserId

b) Add MAXDOP for the query

Example:

SELECT
	  SUBSTRING(U.activityCode, 0, 13) [Trail Code],
	  CONVERT(char(10), U.activityEndDate, 126) [End Date],
	  St.Studycode [Study]
FROM
	  dbo.Studies St WITH (NOLOCK)
	  INNER JOIN dbo.Users U WITH (NOLOCK)
	        ON St.UserId = U.UserId
ORDER BY U.UserCode DESC
OPTION (MAXDOP 1)

I want to know whether these suggested methods are a good way to improve the performance.

Thanks for your time.

Generally, NOLOCK is a bad idea, at least if you want consistent results. NOLOCK is a synonym for READUNCOMMITTED, which means exactly that: read anything, committed or not. You might read rows that will be updated or deleted where those operations have not been committed. Usually you don't want that.
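To make the equivalence concrete, a sketch (the table name is taken from the question): the per-table hint behaves the same as setting the READ UNCOMMITTED isolation level for the session:

```sql
-- Per-table hint: a dirty read of this one table
SELECT COUNT(*) FROM dbo.Users WITH (NOLOCK);

-- Session-level equivalent: every subsequent read is a dirty read
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT COUNT(*) FROM dbo.Users;
```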

Looking at the join, the first question to ask is this: Is UserId indexed in both tables? If not, there's your problem, in all likelihood. If you run your query in SSMS and ask for the Execution Plan, you can see if indexes are used. What you don't want to see is nested loops on top of table or clustered index scans.
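One quick way to check, as a sketch (assuming the join column is named UserId in both tables), is to query the catalog views:

```sql
-- List any indexes in either table that have a UserId key column
SELECT OBJECT_NAME(i.object_id) AS table_name,
       i.name AS index_name,
       i.type_desc
FROM sys.indexes i
JOIN sys.index_columns ic
      ON ic.object_id = i.object_id AND ic.index_id = i.index_id
JOIN sys.columns c
      ON c.object_id = ic.object_id AND c.column_id = ic.column_id
WHERE i.object_id IN (OBJECT_ID('dbo.Users'), OBJECT_ID('dbo.Studies'))
  AND c.name = 'UserId';
```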

If they are indexed, the next thing to ask is when the statistics were last updated. Out of date statistics can cause a bad plan even if the indexing is correct.
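You can check that without guessing; a sketch:

```sql
-- When were the statistics on each table last updated?
SELECT OBJECT_NAME(s.object_id) AS table_name,
       s.name AS stats_name,
       STATS_DATE(s.object_id, s.stats_id) AS last_updated
FROM sys.stats s
WHERE s.object_id IN (OBJECT_ID('dbo.Users'), OBJECT_ID('dbo.Studies'));

-- If they are stale, refresh them (FULLSCAN is thorough but slower)
UPDATE STATISTICS dbo.Users WITH FULLSCAN;
```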

If you're still stuck, the next thing to look at is who is hogging those tables (one or the other or both). That is, when your query is running, what else is running? Adam Machanic's sp_whoisactive can help here. You can find out who is blocking whom and how. Resolve that and your queries may run better.
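If you can't install sp_whoisactive, the built-in DMVs give a rough equivalent; a sketch:

```sql
-- Requests currently blocked, and who is blocking them
SELECT r.session_id,
       r.blocking_session_id,
       r.wait_type,
       r.wait_time,
       t.text AS query_text
FROM sys.dm_exec_requests r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) t
WHERE r.blocking_session_id <> 0;
```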

"WITH (NOLOCK)" is just a kludge. If your DBA suggested that as the first thing to look at, your DBA needs more knowledge and/or experience.

Likewise, rather than setting MAXDOP for this one query, the 'cost threshold for parallelism' setting on the instance itself should be checked first. That value definitely needs to be raised if it's still at the default value. The MS default on installation (5) is ridiculously low.
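For reference, a sketch of checking and changing it (50 is only an illustrative value, not a recommendation; pick one based on your workload):

```sql
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- Show the current value (the installation default is 5)
EXEC sp_configure 'cost threshold for parallelism';

-- Raise it; 50 here is just an example
EXEC sp_configure 'cost threshold for parallelism', 50;
RECONFIGURE;
```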

The final critical part will be indexing. Presumably the Users table is already clustered on UserId. If it's not, that needs to be checked carefully.

Studies should also be clustered first on UserId, particularly if the clustering key was just "defaulted" to an identity column instead. That's a terrible idea, since there's no such thing as a "default" clustering key except by laziness.
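To see what the clustered index actually is before changing anything, a sketch:

```sql
-- Key columns of the clustered index on Studies (type = 1 means clustered)
SELECT i.name AS index_name,
       c.name AS key_column,
       ic.key_ordinal
FROM sys.indexes i
JOIN sys.index_columns ic
      ON ic.object_id = i.object_id AND ic.index_id = i.index_id
JOIN sys.columns c
      ON c.object_id = ic.object_id AND c.column_id = ic.column_id
WHERE i.object_id = OBJECT_ID('dbo.Studies')
  AND i.type = 1
ORDER BY ic.key_ordinal;
```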

I don't see any WHERE clause on the query. If all rows are read, then an ORDER BY U.UserCode DESC, or any sort on column(s) other than UserId, would add significant overhead for 1M+ rows.
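If the sort turns out to be the expensive step, one option (hypothetical index name and column list; verify against the actual execution plan before creating it) is a covering nonclustered index that delivers the rows already in the requested order:

```sql
-- Covers the query's columns and matches the ORDER BY direction
CREATE NONCLUSTERED INDEX IX_Users_UserCode
    ON dbo.Users (UserCode DESC)
    INCLUDE (UserId, activityCode, activityEndDate);
```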

In addition to what @gbritton said about NOLOCK my advice on it is:

Using NOLOCK you may get some rows included twice, and some rows not included at all. I am referring to rows that have been in the table for ages and have not been changed recently, not specifically to rows that are "in the process of being written / might be rolled back". If you are OK that some rows are duplicated, and some rows are missing, from your results then it may be OK to use NOLOCK. (Note that NOLOCK will also, on less frequent occasions, cause the query to be aborted with an error because of "data movement". That's obviously a PITA, but it might be a minor concern, as it will be rare.)

The data missing / data duplicated side effect of using NOLOCK is virtually impossible to reproduce because it is caused by very subtle timing issues. So perhaps the worst side effect is that users report "problems" and IT can never reproduce them ...

If that is not acceptable then most definitely do not use NOLOCK.

For me "tuning the query" is usually the right answer: reviewing which indexes are being used, whether they are the expected indexes or, indeed, whether new indexes need to be added, and so on.

Plus, as @ScottPletcher said, making sure that the Clustered Index is on the most appropriate column(s) for the table (the PKey might be something completely different, but frequently the Clustered Index is the PKey "by default" and without any thought)

From my experience, people who recommend NOLOCK think it's a magic "go faster" button without any understanding of the kinds of negative effects we've been talking about here. OTOH, if the table contains static data, then sure, why not? I can't imagine a static Users table or Studies table though. They might not be modified on a microsecond basis but there will still be lots of change.

Depends on the environment really.

Still, if NOLOCK did make the OP's query go faster, that tells me that there is some massive updating happening to one or both tables at the same time and the OP's query is waiting for locks to be released. I have a hard time believing that that is happening all the time, though. And even if it were happening, unless the OP is OK with dirty data, NOLOCK would be exactly the wrong solution.