Query Optimization

Villanuev · December 1, 2015, 3:10am

Hi,

I have a query that perform very slow maybe because it pulled out a large number of records. I come a a solution to breakdown the data into separate temporary table. I create a temp table for finished good history and another one for shipment history. I run the finished history by monthly then the shipment is year to date.
When i perform the query (see below) it takes time. my question. does will affect the performance of server if i will continue to run this query until completely obtain the records. This will create a slow down of server performance?
The records of finished good is 65000 the shipment is 900000 records. for 12 minutes of running i generate only 14612 records out of 65000. thank you.

select
x.esn_no
,x.prodid
,audit_date
,po_createddate
,fgdate as po_endeddate
,v.ship_date
,x.AGE_ESNReceipt_toPOCreated
,x.AGE_POCreated_toPOFG
,x.AGE_ESNreceipts_toFG
,datediff(dd,x.fgdate, v.ship_date ) as 'AGE_FG_toShipment'
,datediff(dd,x.audit_date, v.ship_date ) as 'AGE_ESNreceipt_toShipment'
from #tempjs_details_scans x
outer apply
( --this portion stored history transaction
select top 1 esn_no , ship_date from #tempjs_shipped_scans m where m.ship_date >=x.fgdate
and m.esn_no = x.esn_no COLLATE SQL_Latin1_General_CP850_CI_AS order by m.ship_date asc
) v
where op_desc = 'Finished Good'
order by esn_no, prodid

sz1 · December 1, 2015, 10:56am

How much faster without the Order By? do you need all columns.

ScottPletcher · December 1, 2015, 9:20pm

Create the temp tables with a clustered index:
Cluster #tempjs_details_scans on ( esn_no, prodid )
Cluster #tempjs_shipped_scans on ( esn_no, ship_date )

If you're doing SELECT ... INTO to build the tables, you'll need to do it twice; once a "dummy" row to create the table structure with the clustered index, then one to actually load the table. If you need help with that, let me know.

Villanuev · December 2, 2015, 1:56am

I tried to removed the Order by and rerun the query using 1 month data and its faster as compare before. it took 7 minutes for 65500 records.

Villanuev · December 2, 2015, 2:06am

Thanks Scott,

This is what i did with my query adding the clustered index and at the same time removed the order by in subquery which is a little bit faster as compare bfore. i'm using 1 month records which is about 65000 it took 6 minutes and 50 seconds.

My question : if ever i will removed the order by in subquery is there an effect on data accuracy specially capturing the nearest shipdate.

Why it is when tried to use Ctl -K it always appear insert hyperlink. i want the codes readable.

if object_id('tempdb..#tempjs_shipped_scans') is not null drop table #tempjs_shipped_scans;
CREATE Clustered Index Ix_esnshipdate on #tempjs_shipped_scans(esn_no,ship_date)
select m.esn_no,
md.model_code,
m.ship_date
into #tempjs_shipped_scans
from esn_dev_pi.dbo.master_esn m with(nolock)
inner join esn_dev_pi.dbo.model md with (nolock)
on m.model_id = md.model_id
where m.ship_date >= '2015-1-1' and m.ship_date < '2015-12-1'
order by m.esn_no;

if object_id('tempdb..#tempjs_details_scans') is not null drop table #tempjs_details_scans;
CREATE Clustered Index Ix_esnDate on #tempjs_details_scans(esn_no,prodid)
select *
into #tempjs_details_scans
from
(
select
*
from #tempjs_warehouse_scans t1
outer apply
( --pull production history
select top 1 t2.audit_date as fgdate, t2.prodid, t2.op_desc, t2.esn_no , t2.recoveredserial, t2.po_createddate
from #tempjs_production_scans t2 where t2.audit_date >= t1.audit_date and t2.tlcs_no=t1.tlcs_no order by t2.audit_date asc
)x
) prod
Where prod.op_desc ='Finished Good'

--Main Query
select
x.esn_no
,x.prodid
,audit_date
,po_createddate
,fgdate as po_endeddate
,v.ship_date
,x.AGE_ESNReceipt_toPOCreated
,x.AGE_POCreated_toPOFG
,x.AGE_ESNreceipts_toFG
,datediff(dd,x.fgdate, v.ship_date ) as 'AGE_FG_toShipment'
,datediff(dd,x.audit_date, v.ship_date ) as 'AGE_ESNreceipt_toShipment'
from #tempjs_details_scans x
outer apply
( --this portion stored history transaction
select top 1 esn_no , ship_date from #tempjs_shipped_scans m where m.ship_date >=x.fgdate
and m.esn_no = x.esn_no COLLATE SQL_Latin1_General_CP850_CI_AS-- order by m.ship_date asc
) v
where op_desc = 'Finished Good'
order by esn_no, prodid

Villanuev · December 2, 2015, 8:27am

Just removed the order by in subquery and it perform faster. Just pulled out 4 months data which contains 200Thousand plus records and it takes 9 minutes as compare before 1000 records for 5 minutes.

sz1 · December 2, 2015, 9:54am

The Order By wont make any odds to the data, (unless you are working with a DW perhaps with surrogates and timestamps where you want to align the data) its only there for specific ordering but takes longer as the query has to work harder to order result set. There will be other ways to increase speed too such as indexing and looking at alternative to TOP clause. Also, use full schemas when referring to tables...

ScottPletcher · December 2, 2015, 3:19pm

The code you posted won't work -- you can' create the temp table, then use "select ... into" for that table. Try this instead:

if object_id('tempdb..#tempjs_shipped_scans') is not null drop table #tempjs_shipped_scans;

select m.esn_no,
md.model_code,
m.ship_date
into #tempjs_shipped_scans
from esn_dev_pi.dbo.master_esn m with(nolock)
inner join esn_dev_pi.dbo.model md with (nolock)
on m.model_id = md.model_id
where 1 = 0 --no actual rows, just create matching table structure
order by m.esn_no;

CREATE Clustered Index Ix_esnshipdate on #tempjs_shipped_scans(esn_no,ship_date);

INSERT INTO #tempjs_shipped_scans
select m.esn_no,
md.model_code,
m.ship_date
from esn_dev_pi.dbo.master_esn m with(nolock)
inner join esn_dev_pi.dbo.model md with (nolock)
on m.model_id = md.model_id
where m.ship_date >= '20150101' and m.ship_date < '20151201'
order by m.esn_no, m.ship_date;

if object_id('tempdb..#tempjs_details_scans') is not null drop table #tempjs_details_scans;
select *
into #tempjs_details_scans
from
(
select
*
from #tempjs_warehouse_scans t1
outer apply
( --pull production history
select top (1) t2.audit_date as fgdate, t2.prodid, t2.op_desc, t2.esn_no , t2.recoveredserial, t2.po_createddate
from #tempjs_production_scans t2 where t2.audit_date >= t1.audit_date and t2.tlcs_no=t1.tlcs_no order by t2.audit_date asc
)x
) prod
Where 1 = 0 --no actual rows, just create matching table structure

CREATE Clustered Index Ix_esnDate on #tempjs_details_scans(esn_no,prodid)

insert into #tempjs_details_scans
select *
from
(
select
*
from #tempjs_warehouse_scans t1
outer apply
( --pull production history
select top (1) t2.audit_date as fgdate, t2.prodid, t2.op_desc, t2.esn_no , t2.recoveredserial, t2.po_createddate
from #tempjs_production_scans t2 where t2.audit_date >= t1.audit_date and t2.tlcs_no=t1.tlcs_no order by t2.audit_date asc
)x
) prod
Where prod.op_desc ='Finished Good'

--Main Query
select
x.esn_no
,x.prodid
,audit_date
,po_createddate
,fgdate as po_endeddate
,v.ship_date
,x.AGE_ESNReceipt_toPOCreated
,x.AGE_POCreated_toPOFG
,x.AGE_ESNreceipts_toFG
,datediff(dd,x.fgdate, v.ship_date ) as 'AGE_FG_toShipment'
,datediff(dd,x.audit_date, v.ship_date ) as 'AGE_ESNreceipt_toShipment'
from #tempjs_details_scans x
outer apply
( --this portion stored history transaction
select top 1 esn_no , ship_date from #tempjs_shipped_scans m where m.ship_date >=x.fgdate
and m.esn_no = x.esn_no COLLATE SQL_Latin1_General_CP850_CI_AS-- order by m.ship_date asc
) v
where op_desc = 'Finished Good'
order by esn_no, prodid

Villanuev · December 4, 2015, 3:29am

Thank you very much Scott. this is faster as compare before.
by the way, i have found a new scenario for my query where some of the data is not capture properly. just give you a sample data. I tried to modify the query by it takes time to process. any idea to optmize this query. thank you.

I will check if the t1.esnnum to t2.csnum or t1.esnnum to t2.esnum.

below is my smaple query

    --New query to get the nearest date

select *
from #t1 t1
outer apply
(
select top 1 t2.auditdate, t2.csnum, t2.esnnum, t2.recoveredesn, prodid
from #t2 t2
where t2.auditdate >=t1.auditdate and t2.csnum =t1.esnnum or t2.esnnum= t1.esnnum
order by t2.auditdate asc )x

ScottPletcher · December 7, 2015, 4:39pm

I have no idea what "#t1" and "#t2" are. But, in general, cluster #t2 by the lookup columns:
create clustered index t2_clus on #t2 ( esnnum, csnum, auditdate )

Query Optimization

if object_id('tempdb..#tempjs_shipped_scans') is not null drop table #tempjs_shipped_scans;

select m.esn_no, md.model_code, m.ship_date into #tempjs_shipped_scans from esn_dev_pi.dbo.master_esn m with(nolock) inner join esn_dev_pi.dbo.model md with (nolock) on m.model_id = md.model_id where 1 = 0 --no actual rows, just create matching table structure order by m.esn_no;

CREATE Clustered Index Ix_esnshipdate on #tempjs_shipped_scans(esn_no,ship_date);

CREATE Clustered Index Ix_esnDate on #tempjs_details_scans(esn_no,prodid)

select m.esn_no,
md.model_code,
m.ship_date
into #tempjs_shipped_scans
from esn_dev_pi.dbo.master_esn m with(nolock)
inner join esn_dev_pi.dbo.model md with (nolock)
on m.model_id = md.model_id
where 1 = 0 --no actual rows, just create matching table structure
order by m.esn_no;