Retrieve most/least frequent numbers

Hi there,

I have a table (numbers) and in this table I have fields as follows week number, num1, num2, num3, num4, num5 and num6. Those num1 - num6 would be among 1 to 54. I would like to know if i can get the most/least frequent numbers in this table. To extend this question if I can add draw date to this table, is it possible to get most/least frequent numbers between selected dates?

Thanks in advance and Best Regards.

Cenk,

Can you please provide sample table and data as follows?

create table #numbers(num1 int, num2 int, num3 int, num4 int, num5 int , num6 int)

insert into #numbers
select 2, 3,4,7,3,6

Sounds like someones trying to pick their lotto numbers... :wink:

The answer is yes (to both questions)...

--=======================================================================
-- some test data...IF OBJECT_ID('tempdb..#WinningNumbers', 'U') IS NOT NULL 
DROP TABLE #WinningNumbers;

CREATE TABLE #WinningNumbers (
	DrawDate DATE NOT NULL,
	Num_1 INT NOT NULL,
	Num_2 INT NOT NULL,
	Num_3 INT NOT NULL,
	Num_4 INT NOT NULL,
	Num_5 INT NOT NULL,
	Num_6 INT NOT NULL
	);

WITH 
	cte_n1 (n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (n)), 
	cte_n2 (n) AS (SELECT 1 FROM cte_n1 a CROSS JOIN cte_n1 b),
	cte_n3 (n) AS (SELECT 1 FROM cte_n2 a CROSS JOIN cte_n2 b),
	cte_Tally (n) AS (
		SELECT TOP 1000 
			ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
		FROM
			cte_n3 a CROSS JOIN cte_n3 b
		),
	cte_GetNumbers AS (
		SELECT 
			t.n,
			Position = DENSE_RANK() OVER (PARTITION BY t.n ORDER BY n.Number),
			n.Number
		FROM
			cte_Tally t
			CROSS APPLY ( VALUES -- generate 10 numbers to increase the likelyhood of getting 6 unique numbers.
								(ABS(CHECKSUM(NEWID()) % 45) + 1),
								(ABS(CHECKSUM(NEWID()) % 45) + 1),
								(ABS(CHECKSUM(NEWID()) % 45) + 1),
								(ABS(CHECKSUM(NEWID()) % 45) + 1),
								(ABS(CHECKSUM(NEWID()) % 45) + 1),
								(ABS(CHECKSUM(NEWID()) % 45) + 1),
								(ABS(CHECKSUM(NEWID()) % 45) + 1),
								(ABS(CHECKSUM(NEWID()) % 45) + 1),
								(ABS(CHECKSUM(NEWID()) % 45) + 1),
								(ABS(CHECKSUM(NEWID()) % 45) + 1)
							) n (Number)
		)
INSERT #WinningNumbers (DrawDate, Num_1, Num_2, Num_3, Num_4, Num_5, Num_6)
SELECT 
	DrawDate = DATEADD(wk, gn.n, '2000-01-03'),
	Num_1 = MAX(CASE WHEN gn.Position = 1 THEN gn.Number END),
	Num_2 = MAX(CASE WHEN gn.Position = 2 THEN gn.Number END),
	Num_3 = MAX(CASE WHEN gn.Position = 3 THEN gn.Number END),
	Num_4 = MAX(CASE WHEN gn.Position = 4 THEN gn.Number END),
	Num_5 = MAX(CASE WHEN gn.Position = 5 THEN gn.Number END),
	Num_6 = MAX(CASE WHEN gn.Position = 6 THEN gn.Number END)
FROM
	cte_GetNumbers gn
GROUP BY 
	gn.n;

-- SELECT * FROM #WinningNumbers wn;
--=======================================================================
-- the solution...

SELECT 
	n.Number,
	NumOfOccurances = COUNT(1)
FROM
	#WinningNumbers wn
	CROSS APPLY ( VALUES (wn.Num_1), (wn.Num_2), (wn.Num_3), (wn.Num_4), (wn.Num_5), (wn.Num_6) ) n (Number)
--WHERE 
--	wn.DrawDate >= '2015-01-01'
--	AND wn.DrawDate < '2016-01-01'
GROUP BY 
	n.Number
ORDER BY
	COUNT(1) DESC;

So, does this mean you'll cut me in is this helps you pick a winner? :wink:

1 Like

Can I limit the number of results? Lets say I would like to get only top 6 results. Can I limit query with a parameter?

edit: TOP, LIMIT solves my issue :slight_smile:

@Jason_A_Long how can I get the gaps based on draws of a specific number? Lets say number 5 occured 50 times based on 250 draws. But I would like to know theoccurance gaps of number 5. (I have the draw dates data also)

Is this what you want?

SELECT 
	wn.DrawDate,
	Has5 = CASE WHEN 5 IN (wn.Num_1, wn.Num_2, wn.Num_3, wn.Num_4, wn.Num_5, wn.Num_6) THEN 'Y' END
FROM 
	#WinningNumbers wn
ORDER BY 
	wn.DrawDate;

Thank you @Jason_A_Long, appreciate your work. Can we improve this query? I mean can we count the nulls, may we I can get a pattern from it. For example, for the number 5 the first appearance is week 3 (2 nulls and third week it was drawn) then 37 nulls after 3rd week and was drawn at 38th week etc etc. Am I clear?

Is this what you want?

SELECT 
	x.DrawDate,
	x.Has5,
	DrawlsBetween = x.Has5 - LAG(x.Has5, 1, 0) OVER (ORDER BY x.DrawDate)
FROM 
	(
	SELECT 
		wn.DrawDate,
		Has5 = CASE WHEN 5 IN (wn.Num_1, wn.Num_2, wn.Num_3, wn.Num_4, wn.Num_5, wn.Num_6) THEN ROW_NUMBER() OVER (ORDER BY wn.DrawDate) END
	FROM 
		#WinningNumbers wn
	) x
WHERE 
	x.Has5 IS NOT NULL;

Hi @Jason_A_Long,

It think there is a problem about DrawIsBetween. If DrawIsBetween indicates the nulls then it should be Has5 - 1.

You can alter the expression however you'd like.

if I change to this LAG(x.Has5, 1, 1) , it only corrects the first DrawIsBetween. How can I change the query to make it correct for all DrawIsBetween @Jason_A_Long

Change what query? You've only shown an incomplete fragment of a LAG function. My reply from yesterday tells you how to count the number of drawls between the occurrence of a five. The LAG function isn't going to do anything to help you as it doesn't count anything it only allows you to compare the current row to another single row.

please check this image query result @Jason_A_Long, all of the DrwIsBetween should be -1
eg:
first row DrawIsBetween should be 2
second row DrawIsBetween should be 37
third row DrawIsBetween should be 21
fourth row DrawIsBetween should be 0..etc etc

You should be able to simply subtract 1 from the DrawlsBetween calculation...

DrawlsBetween = x.Has5 - LAG(x.Has5, 1, 0) OVER (ORDER BY x.DrawDate) - 1

thank you @Jason_A_Long

No problem, glad to help :slight_smile:

I came up with something @Jason_A_Long, can I determine the most common intervals between the numbers within a draw?

All things are possible the trick is figuring out #1 how to do it, #2 how to do it in the most efficient manner possible.
The original question has been answered and it time to close this thread.

sure I will ask a new question about this, hope you can help me.