Narrowing a many to many query

scottdg · January 27, 2017, 1:43pm

I wasn't really sure how to title this post so here is my best description of what I am trying to accomplish. I have a claims database and a database of stores. I created a view that combines the data from three different tables (claims) using UNION. That view contains about 1.3M rows. I need to then join that to another table to see if any of those claims were submitted by someone who is also a retailer. Since a retailer can have multiple records because he/she is owner of multiple stores that 1.3M records quickly multiplies. In addition I need to count the number of claims for each retailer and filter on that count > 9. Below is what I have so far which is counting the correct number of claims but if the retailer has one claim and three stores it is returning the correct count but for three rows.

WITH CTE 
as (
	select retail_number, Name, ss_number, amount, count(ss_number) over (partition by ss_number, retail_number) as count, date 
	from my_view v inner join owner o on v.ss_number = o.ss_no 
	group by ss_number, date, name, retail_number, amount
	)
Select date, ss_number, Name, retail_number, amount, count
from CTE
Where count > 9  
order by ss_number, date, amount

this of course returns something resembling the following if someone has 2 claims (forgetting the > 9 for a moment to illustrate) and two stores:

1/1/2016     1234     John Smith     654321     1000     2
8/1/2016     1234     John Smith     654321      500     2
1/1/2016     1234     John Smith     987654     1000     2
8/1/2016     1234     John Smith     987654      500     2

The desired result would be:

1/1/2016     1234     John Smith     654321, 987654     1000     2
8/1/2016     1234     John Smith     654321, 987654      500     2

this way the number of rows returned matches the count. I will likely need to filter this in a report by ss_number and date but I need to get my data source down to a reasonable number of rows because right now the way I have it written it is returning many millions. There should only be around 1.3M.

Thanks in advance for your suggestions.
Scott

scottdg · January 27, 2017, 4:20pm

I resolved this issue by creating view of the unique ss_no of the retailers with their retail_numbers concatenated together using the stuff() function and FOR XML PATH. I then created the CTE from both views joined on ss_no/ss_number and filtered on the count.