Percentage of transactions with a certain status within 2 hours after having another status

Hi All,

Can you help me with this query please .

I have a Customer table and an Orders table.

I need to query for the percentage of re-orders that were 'bad' within 2 hours of having a previous 'bad' order (from total orders in a particular period)

In other words...(for example)

From Monday to Sunday (31st Aug 2015 to 6th Sept 2015):
Customers placed orders,
Some were good (these have a status 1 in Orders table),
Some were bad (these have a status 2 in Orders table),
From those that were bad, some would have re-ordered within 2 hours and then had a good order.
Others would have had another bad order.

Relevant fields:

OrderID
Orderdate
Status (1 = good, 2 = bad)
CustID

So I need the percentage of orders/transactions that were bad again within 2 hours of a bad order (as a percentage of total orders in the period).

Let me know if I have missed any fields or if you have any questions.

Many thanks :smile:

This perhaps? Its not a percentage, but if it gives you the correct numbers then I figure you can sort the percentage bit out!!

-- Select Bad Orders within Date Range (where there is a next order within 2 hours which is good)
SELECT	[BadThenGood] = SUM(CASE WHEN ONext.status = 1 THEN 0 ELSE 1 END),
	[BadThenBad] = SUM(CASE WHEN ONext.status = 2 THEN 1 ELSE 0 END),
	[BadThenNothing] = SUM(CASE WHEN ONext.status IS NULL THEN 1 ELSE 0 END),
	[UnexpectedStatus] = SUM(CASE WHEN ONext.status IN (1, 2) THEN 0 ELSE 1 END)
FROM	Orders AS O
	OUTER APPLY
	(
		-- Next Order (within 2 hours)
		SELECT TOP 1 ONext.status 
		FROM	Order AS ONext
		WHERE	    ONext.CustID = O.CustID
			AND ONext.OrderID <> O.OrderID	-- Don't match same order!
			AND ONext.OrderDate >= O.OrderDate
			AND ONext.OrderDate < DATEADD(Hour, 2, O.OrderDate)
		ORDER BY ONext.OrderDate ASC
	) AS ONext
WHERE	    O.OrderDate >= '20150831'
	AND O.OrderDate <  '20150907'
	AND O.status = 2	-- Bad
1 Like

Adding an order by clause to Kristen's query.

WHERE	    ONext.CustID = O.CustID
		AND ONext.OrderID <> O.OrderID	-- Don't match same order!
		AND ONext.OrderDate >= O.OrderDate
		AND ONext.OrderDate < DATEADD(Hour, 2, O.OrderDate)
ORDER BY
		ONext.OrderDate ASC
1 Like

Oops! That's rather vital, isn't it?!!

Why's the colour coding gone for a Fruit Loop at the bottom there? I'm sure I don't need to spend brain-power worrying about such things but ... WHY? !!!

1 Like

Hi Kristen,

Thank you for that, I'm just trying to adapt that for my use.

Struggling a little with the 'C.CustID' bit. The 'C' really.

Also, I'm getting a Incorrect syntax near the keyword 'WHERE' error when I try to run it.

Add the alias ONext - see the first line in the code fragment below:

    	) AS ONext
 WHERE	    O.CustID = ONext.CustID
	AND O.OrderDate >= '20150831'
	AND O.OrderDate <  '20150907'
	AND O.status = 2
1 Like

For performance reasons, you want to use > rather than <> in the OrderId comparison. That is:
AND ONext.OrderID > O.OrderID -- Don't match same order!
rather than:
AND ONext.OrderID <> O.OrderID -- Don't match same order!

1 Like

Hi James,

Thanks for your help with this, I think we are really close to getting this to work but I am getting an error.

What could be causing this: Invalid column name 'CustID'

It is in line 18 (WHERE O.UserID = ONext.UserID) after the alias ONext.

Many thanks

You don't really need that with the APPLY clause, so change it to

    	) AS ONext
 WHERE	    O.OrderDate >= '20150831'
	AND O.OrderDate <  '20150907'
	AND O.status = 2
1 Like

I was actually thinking that but wanted the experts to say so before removing it :smile:
Thank you for that.

It's running now so just waiting for it to finish. :+1:

Any idea why when I add up the three numbers it's different to when I query

SELECT COUNT(o.OrderID)
FROM Orders o
WHERE O.OrderDate >= '20150831'
	AND O.OrderDate <  '20150907'
AND o.status = 2

Check if order id's are unique. If they are, the two counts in the following query should be the same.

SELECT 
	COUNT(o.OrderID),
	COUNT(DISTINCT o.OrderId)
FROM Orders o
WHERE O.OrderDate >= '20150831'
	AND O.OrderDate <  '20150907'
AND o.status = 2;

Also, check if there are any other statuses other than 1 and 2.

SELECT 
	o.[Status],
	COUNT(*)
FROM Orders o
WHERE O.OrderDate >= '20150831'
	AND O.OrderDate <  '20150907'
GROUP BY
	o.[Status];

That would include "GoodThenGood" ?

Sorry about that. Just to clarify: I started off including the Customer table (aliased as C) and then decided it wasn't necessary .... but left that JOINing bit behind :frowning:

Bonus points for posters that include CREATE TABLE and INSERT sample data DDL with their questions ... 'coz then I would have built a working example rather than doing it by Guess Work :smile:

1 Like

I've tidied up the fopas in my original post with Eagle-Eyed JamesK's fixes :slight_smile: ...

... and added an [UnexpectedStatus] count, in case it helps

Ah! makes sense.

I will try to include the CREATE TABLE and INSERT next time :smile:

@Kristen, @JamesK As I couldn't quite get the numbers to match for some reason, I started doing this a different way and got my friend to help me, see below and let me know your thoughts (if anything has been missed in the logic or if you can fine tune it) :smile: :

-----True Bad Orders------
DECLARE  @BadOrdersTemp as table (OrderID int,  OrderDate DATETIME, CustID int)
DECLARE  @BadOrders as table (OrderID int,  OrderDate DATETIME, CustID int, SecondOrderGoodID int null, SecondOrderBadID int null)
DECLARE  @SecondOrderGood as table (OrderID int, CustID int)
DECLARE  @SecondOrderBad as table (OrderID int, CustID int)
 
INSERT @BadOrdersTemp(OrderID, Orderdate, CustID)
SELECT o.OrderID, o.Orderdate, o.CustID
FROM [Order] o WITH (NOLOCK)
WHERE O.OrderDate >= '20150831'
AND O.OrderDate <  '20150907'
AND o.status =2


INSERT @BadOrders(OrderID,  OrderDate, CustID)
	
SELECT MIN(OrderID),  MIN(OrderDate), CustID
	FROM @BadOrdersTemp
	GROUP BY CustID


INSERT @SecondOrderGood (OrderID, CustID)
SELECT o.OrderID, O.CustID
	FROM @BadOrders B
	INNER JOIN [Order] O WITH (NOLOCK )
	ON O.CustID = B.CustID
	WHERE O.status = 1
	AND O.OrderDate > B.OrderDate
	AND O.OrderDate < DATEADD(mi, 120, B.OrderDate)


INSERT @SecondOrderBad (OrderID, CustID)
SELECT o.OrderID, O.CustID
	FROM @BadOrders B
	INNER JOIN [Order] O ON O.CustID = B.CustID
	WHERE O.status = 2
	AND O.OrderDate > B.OrderDate
	AND O.OrderDate < DATEADD(mi, 120, B.OrderDate)


UPDATE B
	SET B.SecondOrderGoodID = G.OrderID
FROM  @BadOrders B
	INNER JOIN @SecondOrderGood G
		ON B.CustID = G.CustID

UPDATE B
	SET B.SecondOrderBadID = G.OrderID
FROM  @BadOrders B
	INNER JOIN @SecondOrderBad G
		ON B.CustID = G.CustID

DECLARE @TotalOrders INT

SELECT @TotalOrders=  COUNT(o.OrderID)
FROM [Order] o
WHERE O.OrderDate >= '20150831'
	AND O.OrderDate <  '20150907'

Declare @TrueBad INT
SELECT @TrueBad = COUNT (OrderID)
from @BadOrders
WHERE (SecondOrderBadID IS NULL AND SecondOrderGoodID IS NULL) 
	OR (SecondOrderBadID IS NOT NULL AND SecondOrderGoodID IS NULL)

--select * from @BadOrders
--select @TrueBad
--select @TotalOrders

Declare @calc float
select @calc= round((@TrueBad/(@TotalOrders * 1.0)) * 100,2)

select @calc as 'True Bad Orders'

Don't use NOLOCK ... NOT EVER! Very risky. There are of course cases when it is useful, but an end user report is not one of them.

This will find a bad Order where there was a Good Order within the next 120 minutes. My understand was that you wanted the "next order good". I think this logic will hide things "Good, Bad, Good, Bad" which to my mind is two good orders, both followed by a bad order, but for your logic above will count under : Bad followed by Good, Good followed by Bad, Good / Good and Bad / Bad.

I can't spot what the UPDATEs are trying to achieve, but I think they are ambiguous.