I have a database that represents persons in different statuses. I need to summarize entities by status code and include percent counts. I can do this with
select
coalesce(status_code,'Total'),
count(person_id) as count,
format(
(count(person_id) * 2.0 / sum(count(person_id)) over ()), 'P2'
) as MyPercentage
from
mytable
where
some conditions
group by rollup (status_code);
This gives me the result I want:
status_code count MyPercentage
CA 7 2.80%
I 201 80.40%
O 34 13.60%
PF 1 0.40%
R 7 2.80%
Total 250 100.00%
My question is about the math. I don't understand why it is count(person_id) * 2.0. Without multiplying by 2, the total s 50% and all the percents are half. I'm just not tracking on the math. Explain it like I am stupid?
I think you're double-counting the total by SUMming the total count.
Try this instead:
select
coalesce(status_code,'Total'),
count(person_id) as count,
format(
(count(person_id) * 1.0 / count(person_id) over ()), 'P2'
) as MyPercentage
from
mytable
where
...
group by rollup (status_code);
'person_id' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Putting person_id in the rollup gets me
CA 1 0.40%
CA 1 0.40%
CA 1 0.40%
CA 1 0.40%
CA 1 0.40%
CA 1 0.40%
CA 1 0.40%
CA 7 2.80%
I 1 0.40%
I 1 0.40%
etc
But that maybe starts to make sense: If I am somehow double counting the denominator by using sum of the count, then double counting the numerator by multiplying by 2 makes it come out right.
Edit:
This works
(count(person_id) / sum(count(person_id)) over (partition by status_code))