I am trying to run the below query
w_provider is joined twice it only has 75k rows and the w_encounters has 57 million rows, but when ran this query to get the counts it is showing in billions.
What is the proper way to get the count.
select count(*)
from w_encounters encounter
inner join w_provider billing_provider
on encounter.billing_prov_npi = billing_provider.npi
inner join w_provider treating_provider
on encounter.treating_prov_npi = treating_provider.npi
Thank you very much for the helpful info.
w_encounters JOINS to w_provider on npi
For EVERY row returned from that JOIN a JOIN is made to w_provider again.
So if for npi #1 there is 1 row in w_provider you will get just one row back.
But if there are 2 rows in w_provider you will get 4 rows back
Perhaps you need to also include some sort of "type" column in the joins to w_provider which filters to only provide the Billing and Treatment providers? However, if there are still 2 Billing rows and 2 Treatment rows that will still give you 4 resulting rows.
But why the count is coming in billions?
in encounters i have treating provider_npi and billing_provider_npi columns.
based on its respective values making inner join.
encounters has 57 million rows and providers has just 75k rows.
Is there a better way to query that what i am trying to do to get the count?
Thank you very much for the helpful info.
Because every row in w_provider, for a given npi, is being joined to every row (again) in w_provider (for the same npi). Have a Google for Cartesian join. For a master table [w_encounters ] containing 57,000 rows to get a billion rows in the output you need an average of about 4 rows in the child table [w_provider] - i.e. its not a lot
Whats the difference between rows in the w_provider which represent Billing Providers, and rows that represent Treating Providers?