SQLTeam.com | Weblogs | Forums

Retrieve most/least frequent numbers


#1

Hi there,

I have a table (numbers) and in this table I have fields as follows week number, num1, num2, num3, num4, num5 and num6. Those num1 - num6 would be among 1 to 54. I would like to know if i can get the most/least frequent numbers in this table. To extend this question if I can add draw date to this table, is it possible to get most/least frequent numbers between selected dates?

Thanks in advance and Best Regards.


#2

Cenk,

Can you please provide sample table and data as follows?

create table #numbers(num1 int, num2 int, num3 int, num4 int, num5 int , num6 int)

insert into #numbers
select 2, 3,4,7,3,6

#3

Sounds like someones trying to pick their lotto numbers... :wink:

The answer is yes (to both questions)...

--=======================================================================
-- some test data...IF OBJECT_ID('tempdb..#WinningNumbers', 'U') IS NOT NULL 
DROP TABLE #WinningNumbers;

CREATE TABLE #WinningNumbers (
	DrawDate DATE NOT NULL,
	Num_1 INT NOT NULL,
	Num_2 INT NOT NULL,
	Num_3 INT NOT NULL,
	Num_4 INT NOT NULL,
	Num_5 INT NOT NULL,
	Num_6 INT NOT NULL
	);

WITH 
	cte_n1 (n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (n)), 
	cte_n2 (n) AS (SELECT 1 FROM cte_n1 a CROSS JOIN cte_n1 b),
	cte_n3 (n) AS (SELECT 1 FROM cte_n2 a CROSS JOIN cte_n2 b),
	cte_Tally (n) AS (
		SELECT TOP 1000 
			ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
		FROM
			cte_n3 a CROSS JOIN cte_n3 b
		),
	cte_GetNumbers AS (
		SELECT 
			t.n,
			Position = DENSE_RANK() OVER (PARTITION BY t.n ORDER BY n.Number),
			n.Number
		FROM
			cte_Tally t
			CROSS APPLY ( VALUES -- generate 10 numbers to increase the likelyhood of getting 6 unique numbers.
								(ABS(CHECKSUM(NEWID()) % 45) + 1),
								(ABS(CHECKSUM(NEWID()) % 45) + 1),
								(ABS(CHECKSUM(NEWID()) % 45) + 1),
								(ABS(CHECKSUM(NEWID()) % 45) + 1),
								(ABS(CHECKSUM(NEWID()) % 45) + 1),
								(ABS(CHECKSUM(NEWID()) % 45) + 1),
								(ABS(CHECKSUM(NEWID()) % 45) + 1),
								(ABS(CHECKSUM(NEWID()) % 45) + 1),
								(ABS(CHECKSUM(NEWID()) % 45) + 1),
								(ABS(CHECKSUM(NEWID()) % 45) + 1)
							) n (Number)
		)
INSERT #WinningNumbers (DrawDate, Num_1, Num_2, Num_3, Num_4, Num_5, Num_6)
SELECT 
	DrawDate = DATEADD(wk, gn.n, '2000-01-03'),
	Num_1 = MAX(CASE WHEN gn.Position = 1 THEN gn.Number END),
	Num_2 = MAX(CASE WHEN gn.Position = 2 THEN gn.Number END),
	Num_3 = MAX(CASE WHEN gn.Position = 3 THEN gn.Number END),
	Num_4 = MAX(CASE WHEN gn.Position = 4 THEN gn.Number END),
	Num_5 = MAX(CASE WHEN gn.Position = 5 THEN gn.Number END),
	Num_6 = MAX(CASE WHEN gn.Position = 6 THEN gn.Number END)
FROM
	cte_GetNumbers gn
GROUP BY 
	gn.n;

-- SELECT * FROM #WinningNumbers wn;
--=======================================================================
-- the solution...

SELECT 
	n.Number,
	NumOfOccurances = COUNT(1)
FROM
	#WinningNumbers wn
	CROSS APPLY ( VALUES (wn.Num_1), (wn.Num_2), (wn.Num_3), (wn.Num_4), (wn.Num_5), (wn.Num_6) ) n (Number)
--WHERE 
--	wn.DrawDate >= '2015-01-01'
--	AND wn.DrawDate < '2016-01-01'
GROUP BY 
	n.Number
ORDER BY
	COUNT(1) DESC;

So, does this mean you'll cut me in is this helps you pick a winner? :wink:


#4

Can I limit the number of results? Lets say I would like to get only top 6 results. Can I limit query with a parameter?

edit: TOP, LIMIT solves my issue :slight_smile:


#5

@Jason_A_Long how can I get the gaps based on draws of a specific number? Lets say number 5 occured 50 times based on 250 draws. But I would like to know theoccurance gaps of number 5. (I have the draw dates data also)


#6

Is this what you want?

SELECT 
	wn.DrawDate,
	Has5 = CASE WHEN 5 IN (wn.Num_1, wn.Num_2, wn.Num_3, wn.Num_4, wn.Num_5, wn.Num_6) THEN 'Y' END
FROM 
	#WinningNumbers wn
ORDER BY 
	wn.DrawDate;

#7

Thank you @Jason_A_Long, appreciate your work. Can we improve this query? I mean can we count the nulls, may we I can get a pattern from it. For example, for the number 5 the first appearance is week 3 (2 nulls and third week it was drawn) then 37 nulls after 3rd week and was drawn at 38th week etc etc. Am I clear?


#8

Is this what you want?

SELECT 
	x.DrawDate,
	x.Has5,
	DrawlsBetween = x.Has5 - LAG(x.Has5, 1, 0) OVER (ORDER BY x.DrawDate)
FROM 
	(
	SELECT 
		wn.DrawDate,
		Has5 = CASE WHEN 5 IN (wn.Num_1, wn.Num_2, wn.Num_3, wn.Num_4, wn.Num_5, wn.Num_6) THEN ROW_NUMBER() OVER (ORDER BY wn.DrawDate) END
	FROM 
		#WinningNumbers wn
	) x
WHERE 
	x.Has5 IS NOT NULL;

#9

Hi @Jason_A_Long,

It think there is a problem about DrawIsBetween. If DrawIsBetween indicates the nulls then it should be Has5 - 1.


#10

You can alter the expression however you'd like.


#11

if I change to this LAG(x.Has5, 1, 1) , it only corrects the first DrawIsBetween. How can I change the query to make it correct for all DrawIsBetween @Jason_A_Long


#12

Change what query? You've only shown an incomplete fragment of a LAG function. My reply from yesterday tells you how to count the number of drawls between the occurrence of a five. The LAG function isn't going to do anything to help you as it doesn't count anything it only allows you to compare the current row to another single row.


#13

please check this image query result @Jason_A_Long, all of the DrwIsBetween should be -1
eg:
first row DrawIsBetween should be 2
second row DrawIsBetween should be 37
third row DrawIsBetween should be 21
fourth row DrawIsBetween should be 0..etc etc


#14

You should be able to simply subtract 1 from the DrawlsBetween calculation...

DrawlsBetween = x.Has5 - LAG(x.Has5, 1, 0) OVER (ORDER BY x.DrawDate) - 1


#15

thank you @Jason_A_Long


#16

No problem, glad to help :slight_smile:


#17

I came up with something @Jason_A_Long, can I determine the most common intervals between the numbers within a draw?


#18

All things are possible the trick is figuring out #1 how to do it, #2 how to do it in the most efficient manner possible.
The original question has been answered and it time to close this thread.


#19

sure I will ask a new question about this, hope you can help me.