Next date after a return date

phils · December 6, 2017, 2:31pm

Hello

Our company is in the garment hire business so we have items on an order that go out and then come back into our warehouse. When they come back their barcode is scanned at various stations within the warehouse. I want to flag items that are overdue (have not came back when they should have) and other statuses so my logic is to look at when the item was meant to be returned (OrderItems.ReturnDate) and find the next scan date we have on the system BarcodeScanHistory.DateTimeScan i can then do a datediff to workout what the status should be Overdue, long Overdue etc. My problem is performance. Each barcode can have up to 50 barcodescanhistory records so currently i'm doing a join on barcode and with DateTimeScan > returndate and taking the min date. doing this for 70k is quite slow. Below is a script to build the tables. the last query is the only way i can think of doing this so was looking for a better solution. The tables have no index or constraints currently but will have once i have a more elegant solution

--table for holding all the scans of a barcode over its lifetime
CREATE TABLE [dbo].[BarcodeScanHistory](
[DateTimeScan] [datetime] NOT NULL,
[Barcode] nvarchar NOT NULL
) ON [PRIMARY]

GO
INSERT INTO [dbo].[BarcodeScanHistory]VALUES('2015-04-10','21431791')
INSERT INTO [dbo].[BarcodeScanHistory]VALUES('2015-04-11','21431791')
INSERT INTO [dbo].[BarcodeScanHistory]VALUES('2015-04-12','21431791')

INSERT INTO [dbo].[BarcodeScanHistory]VALUES('2016-04-10','01756395')
INSERT INTO [dbo].[BarcodeScanHistory]VALUES('2016-04-11','01756395')
INSERT INTO [dbo].[BarcodeScanHistory]VALUES('2016-04-12','01756395')

INSERT INTO [dbo].[BarcodeScanHistory]VALUES('2017-04-01','00297271')
INSERT INTO [dbo].[BarcodeScanHistory]VALUES('2017-04-06','00297271')
INSERT INTO [dbo].[BarcodeScanHistory]VALUES('2017-04-09','00297271')

--table for holding items that go out and come back on orders
CREATE TABLE [dbo].[OrderItems](
[ItemNumber] NVARCHAR NOT NULL,
[ReturnDate] [DATE] NULL,
[Barcode] NVARCHAR NULL
) ON [PRIMARY]

INSERT INTO [dbo].[OrderItems]VALUES('T3444083','2015-04-10','21431791')
INSERT INTO [dbo].[OrderItems]VALUES('T3443206','2016-04-11','01756395')
INSERT INTO [dbo].[OrderItems]VALUES('T3444641','2017-04-09','00297271')

--return all the barcodes and the next date the system saw the items
SELECT
oi.*,
(SELECT CAST(MIN(gp.DateTimeScan) AS DATE) FROM [dbo].[BarcodeScanHistory] gp WHERE gp.Barcode = oi.Barcode AND gp.DateTimeScan > oi.[ReturnDate]) [NextScanDate]
FROM
[dbo].[OrderItems] oi

ScottPletcher · December 6, 2017, 2:43pm

Which specific version of SQL Server are you on? There are big differences here among SQL 2000, 2005, 2008 and 2012+.

phils · December 6, 2017, 2:54pm

sorry 2012

ScottPletcher · December 6, 2017, 4:08pm

Actually the query looks fine. To gain performance, cluster the [BarcodeScanHistory] table on:
( Barcode, DateTimeScan ).

Kristen · December 7, 2017, 6:55am

You should specify a size for VARCHAR / NVARCHAR columns (and anywhere you use it in your code), the default is not reliable - in some circumstances SQL will use a size=1 as the default ...

Personally I would be wanting to tackle the Indexes as part of developing the query, not afterwards.

A covering index on Barcode, DateTimeScan (whether a clustered index as Scott recommended, if you can, or a regular index if not) will perform like night & day compared to "no indexes".

You should also avoid having tables with no clustered index as you wind up with a Heap which usually hampers maintenance of the table. There are cases for a Heap table, but if they don't apply (and they are rare IME) then consider both Primary Key and Clustered Index (which don't have to be the same) as part of the initial table creation.

Presumably this SELECT * is just a shorthand for this post/example? You should not use SELECT * in queries, only select the columns that the APP/Report is going to use. If the APP uses all the columns name them all. Otherwise someone will add a, say, NOTES, column in future, users will stuff it full of data, and although that column won;t be used by any existing, legacy, APP/Report it will still be retrieved from the Database AND transmitted to the APP, and that bandwidth use will be a killer - I've seen it happen where it then took months to change every single SELECT * in the code because that only was crippling performance ...

harishgg1 · December 12, 2017, 6:52am

Hi

I did it like this
Please correct me if am wrong
Thanks

Data Script

Summary

USE [AdventureWorks2012] 

go 

DROP TABLE [harish_barcodescanhistory]; 

DROP TABLE [harish_orderitems]; 

--table for holding all the scans of a barcode over its lifetime  
CREATE TABLE [dbo].[harish_barcodescanhistory] 
  ( 
     [datetimescan] [DATETIME] NOT NULL, 
     [barcode]      NVARCHAR(500) NOT NULL 
  ) 
ON [PRIMARY] 

go 

INSERT INTO [dbo].[harish_barcodescanhistory] 
VALUES      ('2015-04-10', 
             '21431791') 

INSERT INTO [dbo].[harish_barcodescanhistory] 
VALUES      ('2015-04-11', 
             '21431791') 

INSERT INTO [dbo].[harish_barcodescanhistory] 
VALUES      ('2015-04-12', 
             '21431791') 

INSERT INTO [dbo].[harish_barcodescanhistory] 
VALUES      ('2016-04-10', 
             '01756395') 

INSERT INTO [dbo].[harish_barcodescanhistory] 
VALUES      ('2016-04-11', 
             '01756395') 

INSERT INTO [dbo].[harish_barcodescanhistory] 
VALUES      ('2016-04-12', 
             '01756395') 

INSERT INTO [dbo].[harish_barcodescanhistory] 
VALUES      ('2017-04-01', 
             '00297271') 

INSERT INTO [dbo].[harish_barcodescanhistory] 
VALUES      ('2017-04-06', 
             '00297271') 

INSERT INTO [dbo].[harish_barcodescanhistory] 
VALUES      ('2017-04-09', 
             '00297271') 

--table for holding items that go out and come back on orders  
CREATE TABLE [dbo].[harish_orderitems] 
  ( 
     [itemnumber] NVARCHAR(500) NOT NULL, 
     [returndate] [DATE] NULL, 
     [barcode]    NVARCHAR(500) NULL 
  ) 
ON [PRIMARY] 

INSERT INTO [dbo].[harish_orderitems] 
VALUES      ('T3444083', 
             '2015-04-10', 
             '21431791') 

INSERT INTO [dbo].[harish_orderitems] 
VALUES      ('T3443206', 
             '2016-04-11', 
             '01756395') 

INSERT INTO [dbo].[harish_orderitems] 
VALUES      ('T3444641', 
             '2017-04-09', 
             '00297271')

Query Script

Summary

USE [AdventureWorks2012] 

go 

-- My Query  
; 
WITH barcodescan_cte 
     AS (SELECT barcode, 
                datetimescan 
         FROM   [harish_barcodescanhistory]), 
     orderitems_cte 
     AS (SELECT barcode, 
                returndate 
         FROM   [harish_orderitems]) 
SELECT a.*, 
       b.* 
FROM   barcodescan_cte a 
       JOIN orderitems_cte b 
         ON a.barcode = b.barcode 
            AND a.datetimescan > b.returndate