I am experimenting with this code that I got from the net,
I am trying to get better at SQL so would like to understand why this does not do what it seems it should.
I am running it against a over 10million row table which I noticed had several duplicate rows .
select * from boilerdata where ReadingDate in (
select ReadingDate from boilerdata
group by ReadingDate having count(*)=6
I believe This should return any duplicate Datetimes, where specifically there are 6 duplicates for a particular DateTime Grouped together.
BTW the =6 is just an example I have tried >n or =n with several numbers but all give me similar unexpected results.
The Table consists of 2 rows added every minute 24/7 so there are always 2 of any datetime anyway,
(I don't think this is good so I am thinking of just changing that to one row per
minute ). But I digress.
So I could go on about all the results I have had but mainly it works better if I use an = rather than a >
So with this example I got several groups of 6 + several groups of 4 + about 30 groups of 2 at the end.
I can't explain this at all ! I only asked for 6 anyway but got the 4's as well? that's hard enough to explain but why the 30 or so groups of 2 ? there are millions of groups of 2 in the table!
I would appreciate any help , hopefully I might learn something.