SQLTeam.com | Weblogs | Forums

Why my script missed these two cases?


I created a script to return all the loans with disposition code 'PS' happened immediately after disposition code ('PA','PM','NP','CI','CR','CL'). Disposition code ('PA','PM','NP','CI','CR','CL') has to happen between '2015-02-15' and '2015-07-15'. And if a loan has PA/PM/NP/CI/CR/CL and PS on the same date, we need to capture this one, too.

Why my script can't catch Loan 11613577 and 11613579, with Disposition_Date on 2015-07-15? Even though I changed filter condition to a.Disposition_Dt between '2015-02-14' and '2015-07-16'

Why my script can't catch Loan 14364944? I know because "PM" and "PS" happen in the same date. How can I catch this case?

I know my script is too complex, and missed some cases. Anyone can help me modify it in a easier way? Thank you.

Sample Data

Customer_Nbr_Txt Loan_Nbr Disposition_Dt Disposition_Code_Txt
007849 11613577 2015-07-21 PS
007849 11613577 2015-07-15 NP
007849 11613579 2015-07-21 PS
007849 11613579 2015-07-15 NP
008120 14364944 2015-05-27 PM
008120 14364944 2015-05-27 PS

with cte1
ROW_NUMBER()over(partition by a.loan_nbr
order by a.Disposition_Dt desc)rn
FROM [DW_ODS].[dbo].[WMS_LoanDispositionData] a
ROW_NUMBER()over(partition by a.loan_nbr
order by a.Disposition_Dt desc)rn
FROM [DW_ODS].[dbo].[WMS_LoanDispositionData] a
where a.Disposition_Dt between '2015-02-15' and '2015-07-15'

select cte1.Customer_Nbr_Txt,b.[Customer Name - Legal],cte1.Loan_Nbr,b.[Supplier Name],b.[Collateral Code],cte2.Disposition_Dt as [Disposition Code 1 Date (Inspection Date)] ,cte2.Disposition_Code_Txt as [Disposition Code 1],cte1.Disposition_Dt as [Disposition Code 2 ("PS" code) Date]
from cte1
join cte2 on cte1.Loan_Nbr=cte2.Loan_Nbr
where (cte1.Disposition_Code_Txt='PS'and cte1.rn=1)
and (cte2.Disposition_Code_Txt in ('PA','PM','NP','CI','CR','CL') and cte2.rn=2)


I did not understand all of the logic you are trying to implement, but my suspicion is that the ORDER BY clause in the ROW_NUMBER function is not sufficient to assign a row number in the case of Loan 14364944 in a deterministic fashion. Try changing the row_number functions to the following:

	(	PARTITION BY a.loan_nbr 
			a.Disposition_Dt DESC,
			 -- added the next line
			CASE WHEN Disposition_Code_Txt = 'PS' THEN 1 ELSE 0 END 
	) rn


To debug it, see what will return cte1 and cte2 before you put them in join


Because your Row_Number need more information to be like you want to be. I mean :

ROW_NUMBER()over(partition by a.loan_nbr order by a.Disposition_Dt desc

is not enough . You should add another column like an ID to be deterministic. Like this:

ROW_NUMBER()over(partition by a.loan_nbr order by a.Disposition_Dt desc,ID desc

or use the suggestion from James.

By adding the ID column , I was able to return the record

Customer_Nbr_Txt    Loan_Nbr    Disposition Code 1 Date (Inspection Date)    Disposition Code 1    Disposition Code 2 ("PS" code) Date
008120    14364944    2015-05-27    PM    2015-05-27

Because in cte2 you have a Where clause. This way , your rn never reach value 2 (with your example) . And the final Where will be never met ... and cte2.rn=2)
To solve this point you should eliminate the WHERE clause from cte2 or extend the time frame to : where a.Disposition_Dt between '2015-02-15' and '2015-07-21'


Thank you, but the problem is in my data source, in the same date sometimes the 'PS' ID is more than "PM" ID, sometimes less. What should I deal with? Thank you


You saw the suggestion from JamesK ? You can try it.