Partion part of ROW_NUMBER() is TSQL is extremely slow

A query for adding a row_number to a table of 140 Million records never finishes in VAL server and it takes few minutes on PROD server. Same dataset size
SELECT EO.ORDER_ID, EO.INSTANTIATED_TIME,
ROW_NUMBER() OVER (PARTITION BY EO.PAT_ID, EO.ENCOUNTER_ID, EO.PROC_ID, EO.TITLE, EO.[DESCRIPTION], EO.DISPLAY_NAME,
EO.ORDER_STATUS_C, EO.ORDER_STATUS, EO.ENC_TYPE_C
ORDER BY ISNULL(EO.INSTANTIATED_TIME, '9999-12-31 23:59:59'), EO.ORDER_INST DESC, EO.ORDER_ID DESC) AS RN
INTO #ALLCANCELEDORDERS
FROM dbo.ENCORDERS AS EO
WHERE EO.ORDER_STATUS_C = 4

The dataset may be the same size, but are the servers configured the same way? Same number of cores, same CPUs, same amount of RAM, same disk storage? Same average workload? Any difference in those factors could cause it to run longer.

And yeah, 140M rows can take a while to process, especially since you're dumping the results into a temp table. That will require enough space in tempdb, and if it's not there, then tempdb files will need to grow. Is tempdb configured the same on each server? (same file size, same free space, same growth increment, same kind of storage)

Also, the statistics and indexing on one server could be different, leading to a different execution plan. Are the execution plans the same on each system?

I don't think this would help, but since you are filtering on EO.ORDER_STATUS_C=4 in your WHERE clause, you don't need to PARTITION BY or ORDER BY that column, as it will never vary. It would only add processing overhead.

1 Like

Many thanks for your reply. I am the developer. The DBA is telling me they are almost the same. As to table structure and indexing, they are the same. 10 minutes in PROD versus 10 hours in VAL. The tempdb in VAL much larger than prod. The load is the same. I suspect something in the setup of VAL is different from PROD. Partion uses sorting and I do not know how to go after DBA that this is sever problem, not query or table problem

"Almost the same" really doesn't help. Any one of a number of differences could create significant performance discrepancies.

Another consideration: are VAL and PROD running the same Edition of SQL Server? If one is Enterprise and the other is Standard, or Express, that could affect the performance. Also, is there a difference in version? Which version, if they are the same?

SELECT...INTO may not run in parallel, it depends on SQL Server version, and some database settings:

You would need to get the query plans on each server and compare them, in order to determine if this is impacting.

Sincerely appreciate your help. The two servers have same version SSMS 19.0.1 on my end. The two estimated exec query plan is almost the same. Sort in prod is 78% and in VAL 79%. I need to save the result into a table. Since DBA says they are the same, then they both may or may not use parallels.

The problem is in VAL, the query runs forever. After 10 hrs , I kill it. Do not know what setup diff may be there

SSMS version <> SQL Server version. Run this query on both machines:

SELECT @@VERSION

Unless both are exactly the same (you may need to copy and paste to another window to compare), then they're not identically set up.

I would think that the main discrepancy is tempdb config, or there's a difference in storage. Different drive types, different number of tempdb files, different growth increment.

Also, percent of batch values are not really comparable. You need to look at execution times and query cost for each query operator. That would mean hovering over each operator and getting costs from each pop-over window, or opening the Properties window and then selecting each query operator and comparing.

SSMS 19 does have a query plan comparison tool, that might work better for you.

I'm not a dba but maybe you can check this:

You can split the tempdb in many files. Maybe it's 1 file instead of multiple files.

What is the waitstats/Querystore telling you?