Query to get count making twice inner join to same table count is showing in billions

cplusplus1 · October 14, 2016, 1:23pm

I am trying to run the below query

w_provider is joined twice it only has 75k rows and the w_encounters has 57 million rows, but when ran this query to get the counts it is showing in billions.
What is the proper way to get the count.

select count(*)
from w_encounters encounter
inner join w_provider billing_provider
on encounter.billing_prov_npi = billing_provider.npi
inner join w_provider treating_provider
on encounter.treating_prov_npi = treating_provider.npi

Thank you very much for the helpful info.

Kristen · October 14, 2016, 1:34pm

w_encounters JOINS to w_provider on npi

For EVERY row returned from that JOIN a JOIN is made to w_provider again.

So if for npi #1 there is 1 row in w_provider you will get just one row back.

But if there are 2 rows in w_provider you will get 4 rows back

Perhaps you need to also include some sort of "type" column in the joins to w_provider which filters to only provide the Billing and Treatment providers? However, if there are still 2 Billing rows and 2 Treatment rows that will still give you 4 resulting rows.

cplusplus1 · October 14, 2016, 1:44pm

But why the count is coming in billions?

in encounters i have treating provider_npi and billing_provider_npi columns.

based on its respective values making inner join.

encounters has 57 million rows and providers has just 75k rows.

Is there a better way to query that what i am trying to do to get the count?

Thank you very much for the helpful info.

Kristen · October 14, 2016, 2:01pm

Because every row in w_provider, for a given npi, is being joined to every row (again) in w_provider (for the same npi). Have a Google for Cartesian join. For a master table [w_encounters ] containing 57,000 rows to get a billion rows in the output you need an average of about 4 rows in the child table [w_provider] - i.e. its not a lot

Whats the difference between rows in the w_provider which represent Billing Providers, and rows that represent Treating Providers?