Return only those rows where the count of a column is less than 5

I have a table with several columns, one of which is a date. I need to return every row for which there are not at least 5 of the same date.

For example:

search_date	award_year_token
20150622	2014
20150622	2014
20150622	2014
20150622	2014
20150622	2014
20150622	2014
20150624	2014
20150701	2014
20150701	2014
20150701	2014
20150701	2014
20150710	2014
20150710	2014
20150710	2014
20150710	2014
20150803	2014
20150803	2014
20150803	2014

There are other columns in this table I have left out (name, ID, etc.)

So in the above, you'll notice there are 6 rows which have a search_date of 20150622. I will not need any of those rows, because the number of rows with that date are >5. You'll notice there are only 4 rows which have a search_date of 20150701. That's an example of the rows I need returned from my query.

select search_date, award_year_token
from yourtable t1
join 
(
    select search_date
    from yourtable t2
    group by search_date
    having count(*) < 5
) q
 on t1.search_date = q.search_date

Thanks! You're definitely on to something there, but it's not returning everything. For example, those with 20150701 didn't return.

Works for me:

declare @ table (search_date int,	award_year_token int)
insert into @(search_date, award_year_token) values
--search_date	award_year_token
(20150622, 2014),
(20150622, 2014),
(20150622, 2014),
(20150622, 2014),
(20150622, 2014),
(20150622, 2014),
(20150624, 2014),
(20150701, 2014),
(20150701, 2014),
(20150701, 2014),
(20150701, 2014),
(20150710, 2014),
(20150710, 2014),
(20150710, 2014),
(20150710, 2014),
(20150803, 2014),
(20150803, 2014),
(20150803, 2014)

select t1.search_date, award_year_token
from @ t1
join 
(
    select search_date
    from @ t2
    group by search_date
    having count(*) < 5
) q
 on t1.search_date = q.search_date

Returns:

search_date	award_year_token
20150624	2014
20150701	2014
20150701	2014
20150701	2014
20150701	2014
20150710	2014
20150710	2014
20150710	2014
20150710	2014
20150803	2014
20150803	2014
20150803	2014

This is what I get:

search_date	award_year_token
20150602	2014
20150610	2014
20150513	2014
20150513	2014
20150513	2014
20150513	2014
20150624	2014

Then either:

  1. You changed my query in some way (please post what you actually ran)

or

  1. Your data is different.

Please confirm that the columns search_date and award_year_token are integers, as I assumed

I ran:

select t1.search_date, award_year_token
from manual.dbo.ClearingHouse_Student_History t1
join 
(
    select search_date
    from manual.dbo.ClearingHouse_Student_History t2
    group by search_date
    having count(*) < 5
) q
 on t1.search_date = q.search_date

And, yikes - they're in as varchar(8)! (I didn't set these up :frowning:)

ok, then there's probably something different about the way 20150701 is stored. Watch out for leading spaces, imbedded tabs, etc. These will cause the join to fail.

As a test, run the subquery on its own to see if 20150701 appears.

It does not.

prove it! post your table as an INSERT INTO script.

IN SSMS, right-click on the database, select Generate Scripts. Check off Tables, choose your table, then under Advanced options select Data Only.

Post the results here and then we'll be on the same page.

When I right-click I go Tasks > Generate Scripts > Choose my table > and then choose "script to clipboard" and the only option is Finish -- there's no "advanced" and this is what it gives me....

USE [Manual]
GO
/****** Object:  Table [dbo].[ClearingHouse_Student_History]    Script Date: 08/06/2015 11:36:19 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[ClearingHouse_Student_History](
	[student_ssn] [varchar](9) NULL,
	[first_name] [varchar](15) NULL,
	[middle_init] [char](1) NULL,
	[last_name] [varchar](16) NULL,
	[dob] [varchar](8) NULL,
	[search_date] [varchar](8) NULL,
	[award_year_token] [varchar](4) NOT NULL
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO

Where you chose Script to Clipboard, there should be an Advanced button at the top of the Scripting Options screen like this:

Next to Save to File you should see the button (applies to all save methods)

My guess is I'm on a different version?

Yes, quite different. Which version are you running?

btw you can download SSMS 2015. It's currently in tech preview but relatively stable. It will work with SQL Server 2005 and up

A bit long in the tooth!. You're missing SP 4 SQL Server 2008 Service Pack 4

and a security update on top of that Security Update for SQL Server 2008 Service Pack 4

but it's hard to imagine that something as simple as we're doing would be broken.

Give the SSMS 2015 preview a try. Then you should be able to generate the scripts

In the Script Options (one of the first dialog windows in the wizard) there is a "Script data" option which is FALSE by default. Including that, and scripting just the one table, should give you a Create Table script together with something to insert the data into it - which you can then hand-edit to slim down to the actual data you want and/or to obfuscate anything confidential.

Run code below and check the results:

select search_date
from manual.dbo.ClearingHouse_Student_History t2
where search_date = '20150701'

I ended up figuring out the problem. Your query was running just as it should. But then I realized that award_year_token 2014 and 2015 could both have the same search_date. So I had to narrow it down to search only for those with less than 5 of the same search_date with award_year_token=2014.

select t1.search_date, award_year_token
from manual.dbo.ClearingHouse_Student_History t1
join 
(
    select search_date
    from manual.dbo.ClearingHouse_Student_History t2
    where award_year_token=2014
    group by search_date
    having count(*) < 5
) q
 on t1.search_date = q.search_date
 where award_year_token=2014