SQLTeam.com | Weblogs | Forums

Select most recent previous receipt date and nearest date

Hi,

I have a query that pull the most recent previous receipt date but I have some records that the receipt_date is ahead from the log_date which I need also to include. I'm using SQL2008R2

Please see below sample data and DDL.

DECLARE	@Sample TABLE
	(
		id nvarchar(50),
		itemno nvarchar(50),
		receipt_dt datetime
	)

-- Populate sample data
INSERT	@Sample VALUES	('T000001','123456789012344','2017-04-11 18:11:45.000')
INSERT	@Sample VALUES	('T000002','123456789012345','2017-05-04 17:00:00.000')
INSERT	@Sample VALUES	('T000002','123456789012345','2017-07-20 23:12:00.000')
INSERT	@Sample VALUES	('T000004','123456789012355','2017-04-28 15:12:00.000')
INSERT	@Sample VALUES	('T000004','123456789012355','2017-05-28 20:12:00.000')



DECLARE	@details TABLE
	(
		itemno nvarchar(50),
		log_dt datetime
	)

-- Populate sample data

INSERT	@details VALUES	('123456789012344','2017-04-21 15:11:45.000')
INSERT	@details VALUES	('123456789012345','2017-04-27 01:45:17.000')
INSERT	@details VALUES	('123456789012345','2017-07-29 12:45:07.000')
INSERT	@details VALUES	('123456789012345','2017-07-30 14:50:07.000')
INSERT	@details VALUES	('123456789012355','2017-04-29 11:57:07.000')
INSERT	@details VALUES	('123456789012355','2017-05-30 12:45:07.000')



select	t2.id, t2.receipt_dt,  t1.log_dt, t1.itemno
from	@details t1
		left join @Sample t2
		on t1.itemno = t2.itemno
		and t1.log_dt >= t2.receipt_dt


id-------receipt_dt----------------log_dt-------------------itemno
============================================================================
T000001--2017-04-11 18:11:45.000---2017-04-21 15:11:45.000--123456789012344--single entry

T000002--2017-05-04 17:00:00.000---2017-04-27 01:45:17.000--123456789012345---receipt date is ahead over log_dt need also to display but receipt date is empty


T000002--2017-07-20 23:12:00.000---2017-07-29 12:45:07.000--123456789012345--multiple records 
T000002--2017-07-20 23:12:00.000---2017-07-30 14:50:07.000--123456789012345


T000004--2017-04-28 15:12:00.000---2017-04-29 11:57:07.000--123456789012355--single entry


T000004--2017-05-28 20:12:00.000---2017-05-30 12:45:07.000--123456789012355--single entry

Your requirements aren't 100% clear... If I understand correctly, you want the latest receipt that cam prior to a given log_dt and any receipt that cam after that date... If that's the case, the following should do the trick.
I don't have access to a any instances of 2008R2 to test, but I'm pretty sure the syntax should be valid...

SELECT 
    s2.id,
    s2.receipt_dt,
    d.log_dt,
    d.itemno
FROM 
    @details d
    JOIN (
        SELECT 
            s.itemno, 
            first_receipt = MAX(s.receipt_dt)
        FROM
            @details d
            JOIN @Sample s
                ON d.itemno = s.itemno
                AND d.log_dt >= s.receipt_dt
        GROUP BY 
            s.itemno
        ) AS fd
        ON d.itemno = fd.itemno
    JOIN @Sample s2
        ON fd.itemno = s2.itemno
        AND fd.first_receipt <= s2.receipt_dt;

I usually use a ROW_NUMBER OVER sub-query for that type of situation, with a suitable ORDER BY and then constrain on the Row Number value = 1 to just get the first matching row

There are several way of doing it, using MAX rather than ROW_NUMBER() avoids the sort operation that the ROW_NUMBER() would cause (in the absence of a POC index).

That said, getting rid of the derived table and using a CROSS APPLY does produce a much better execution plan...

    SELECT 
        s2.id,
        s2.receipt_dt,
        d.log_dt,
        d.itemno
    FROM 
        @details d
        CROSS APPLY (
                    SELECT 
                        MAX(s.receipt_dt)
                    FROM 
                        @Sample s
                    WHERE 
                        d.itemno = s.itemno
                        AND d.log_dt >= s.receipt_dt
                    ) frd (FistReceiptDate)
        JOIN @Sample s2
            ON d.itemno = s2.itemno
            AND frd.FistReceiptDate <= s2.receipt_dt;

The thing I was alluding too (but didn't say :slight_smile: ) was where, rather than just a single MAX value, you want "Several columns from the row that has the MAX value"

I haven't understood the O/P requirements here very well - I looked at it before the additional data was added, and I see more data has been added since, which might make it clearer, but I haven't had time to re-review it, so apologies if I have got the wrong end of the stick, but it looked like it might need the "Several columns from the row that has the MAX value" type of solution.

I'm not sure what that means or how it would apply to this particular situation. Then again, I wasn't aware that the OP had edited their post (I'm still new to this forum and don't know where to look to see if edits were made). So, I don't know what was or wasn't there when you 1st looked at it.

In any case, I simply took what I could discern for the OP's comments and tweaked the logic until it matched the expected output.

If we were talking about a normal "Top N of Group" scenario, I'd be with you 100%.

Hope I'm making sense... :slight_smile:

Hi Jason,

Thanks for the reply. Just run the script with cross apply and the result I obtain was the id='T000002' has 2 records while id='T000004' has 3 records which it should be the id='T000002' has 3 records while id='T000004' has 2 records.

sample expected result:

id-------receipt_dt----------------log_dt-------------------itemno
============================================================================
T000001--2017-04-11 18:11:45.000---2017-04-21 15:11:45.000--123456789012344--single entry

T000002--2017-05-04 17:00:00.000---2017-04-27 01:45:17.000--123456789012345---receipt date is ahead over log_dt need also to display but receipt date is empty


T000002--2017-07-20 23:12:00.000---2017-07-29 12:45:07.000--123456789012345--multiple records 
T000002--2017-07-20 23:12:00.000---2017-07-30 14:50:07.000--123456789012345