I need to query for the percentage of re-orders that were 'bad' within 2 hours of having a previous 'bad' order (from total orders in a particular period)
In other words...(for example)
From Monday to Sunday (31st Aug 2015 to 6th Sept 2015):
Customers placed orders,
Some were good (these have a status 1 in Orders table),
Some were bad (these have a status 2 in Orders table),
From those that were bad, some would have re-ordered within 2 hours and then had a good order.
Others would have had another bad order.
Relevant fields:
OrderID
Orderdate
Status (1 = good, 2 = bad)
CustID
So I need the percentage of orders/transactions that were bad again within 2 hours of a bad order (as a percentage of total orders in the period).
Let me know if I have missed any fields or if you have any questions.
This perhaps? Its not a percentage, but if it gives you the correct numbers then I figure you can sort the percentage bit out!!
-- Select Bad Orders within Date Range (where there is a next order within 2 hours which is good)
SELECT [BadThenGood] = SUM(CASE WHEN ONext.status = 1 THEN 0 ELSE 1 END),
[BadThenBad] = SUM(CASE WHEN ONext.status = 2 THEN 1 ELSE 0 END),
[BadThenNothing] = SUM(CASE WHEN ONext.status IS NULL THEN 1 ELSE 0 END),
[UnexpectedStatus] = SUM(CASE WHEN ONext.status IN (1, 2) THEN 0 ELSE 1 END)
FROM Orders AS O
OUTER APPLY
(
-- Next Order (within 2 hours)
SELECT TOP 1 ONext.status
FROM Order AS ONext
WHERE ONext.CustID = O.CustID
AND ONext.OrderID <> O.OrderID -- Don't match same order!
AND ONext.OrderDate >= O.OrderDate
AND ONext.OrderDate < DATEADD(Hour, 2, O.OrderDate)
ORDER BY ONext.OrderDate ASC
) AS ONext
WHERE O.OrderDate >= '20150831'
AND O.OrderDate < '20150907'
AND O.status = 2 -- Bad
WHERE ONext.CustID = O.CustID
AND ONext.OrderID <> O.OrderID -- Don't match same order!
AND ONext.OrderDate >= O.OrderDate
AND ONext.OrderDate < DATEADD(Hour, 2, O.OrderDate)
ORDER BY
ONext.OrderDate ASC
Why's the colour coding gone for a Fruit Loop at the bottom there? I'm sure I don't need to spend brain-power worrying about such things but ... WHY? !!!
For performance reasons, you want to use > rather than <> in the OrderId comparison. That is:
AND ONext.OrderID > O.OrderID -- Don't match same order!
rather than:
AND ONext.OrderID <> O.OrderID -- Don't match same order!
Sorry about that. Just to clarify: I started off including the Customer table (aliased as C) and then decided it wasn't necessary .... but left that JOINing bit behind
Bonus points for posters that include CREATE TABLE and INSERT sample data DDL with their questions ... 'coz then I would have built a working example rather than doing it by Guess Work
@Kristen, @JamesK As I couldn't quite get the numbers to match for some reason, I started doing this a different way and got my friend to help me, see below and let me know your thoughts (if anything has been missed in the logic or if you can fine tune it) :
-----True Bad Orders------
DECLARE @BadOrdersTemp as table (OrderID int, OrderDate DATETIME, CustID int)
DECLARE @BadOrders as table (OrderID int, OrderDate DATETIME, CustID int, SecondOrderGoodID int null, SecondOrderBadID int null)
DECLARE @SecondOrderGood as table (OrderID int, CustID int)
DECLARE @SecondOrderBad as table (OrderID int, CustID int)
INSERT @BadOrdersTemp(OrderID, Orderdate, CustID)
SELECT o.OrderID, o.Orderdate, o.CustID
FROM [Order] o WITH (NOLOCK)
WHERE O.OrderDate >= '20150831'
AND O.OrderDate < '20150907'
AND o.status =2
INSERT @BadOrders(OrderID, OrderDate, CustID)
SELECT MIN(OrderID), MIN(OrderDate), CustID
FROM @BadOrdersTemp
GROUP BY CustID
INSERT @SecondOrderGood (OrderID, CustID)
SELECT o.OrderID, O.CustID
FROM @BadOrders B
INNER JOIN [Order] O WITH (NOLOCK )
ON O.CustID = B.CustID
WHERE O.status = 1
AND O.OrderDate > B.OrderDate
AND O.OrderDate < DATEADD(mi, 120, B.OrderDate)
INSERT @SecondOrderBad (OrderID, CustID)
SELECT o.OrderID, O.CustID
FROM @BadOrders B
INNER JOIN [Order] O ON O.CustID = B.CustID
WHERE O.status = 2
AND O.OrderDate > B.OrderDate
AND O.OrderDate < DATEADD(mi, 120, B.OrderDate)
UPDATE B
SET B.SecondOrderGoodID = G.OrderID
FROM @BadOrders B
INNER JOIN @SecondOrderGood G
ON B.CustID = G.CustID
UPDATE B
SET B.SecondOrderBadID = G.OrderID
FROM @BadOrders B
INNER JOIN @SecondOrderBad G
ON B.CustID = G.CustID
DECLARE @TotalOrders INT
SELECT @TotalOrders= COUNT(o.OrderID)
FROM [Order] o
WHERE O.OrderDate >= '20150831'
AND O.OrderDate < '20150907'
Declare @TrueBad INT
SELECT @TrueBad = COUNT (OrderID)
from @BadOrders
WHERE (SecondOrderBadID IS NULL AND SecondOrderGoodID IS NULL)
OR (SecondOrderBadID IS NOT NULL AND SecondOrderGoodID IS NULL)
--select * from @BadOrders
--select @TrueBad
--select @TotalOrders
Declare @calc float
select @calc= round((@TrueBad/(@TotalOrders * 1.0)) * 100,2)
select @calc as 'True Bad Orders'
Don't use NOLOCK ... NOT EVER! Very risky. There are of course cases when it is useful, but an end user report is not one of them.
This will find a bad Order where there was a Good Order within the next 120 minutes. My understand was that you wanted the "next order good". I think this logic will hide things "Good, Bad, Good, Bad" which to my mind is two good orders, both followed by a bad order, but for your logic above will count under : Bad followed by Good, Good followed by Bad, Good / Good and Bad / Bad.
I can't spot what the UPDATEs are trying to achieve, but I think they are ambiguous.