
How can I calculate the MIN using specific criteria?

Hello

I want to do:

SELECT Col1
FROM Table
HAVING MIN(*) > 1
GROUP BY Col1

But I want the MIN to be calculated only based on the rows where Col2="A".

Is the below the right way?

SELECT Col1
FROM Table
HAVING MIN(*) > 1 WHERE Col2="A"
GROUP BY Col1

Thanks

-- Col3 is a placeholder: the question never names the column MIN applies to.
SELECT a.Col1
FROM ( SELECT * FROM [Table] WHERE Col2 = 'A' ) a
GROUP BY a.Col1
HAVING MIN(a.Col3) > 1
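
The same filter can also be written without a derived table by putting the condition inside the aggregate. The following is only a sketch: the table #Demo, its sample rows, and the numeric column Col3 are hypothetical (the question never says which column MIN should apply to), and DROP TABLE IF EXISTS assumes SQL Server 2016 or later.

```sql
-- Hypothetical sample table; Col3 is an assumed numeric column.
DROP TABLE IF EXISTS #Demo;
CREATE TABLE #Demo (Col1 varchar(10), Col2 varchar(10), Col3 int);

INSERT INTO #Demo VALUES
  ('X', 'A', 2), ('X', 'A', 5), ('X', 'B', 0),
  ('Y', 'A', 1), ('Y', 'B', 9);

-- CASE yields NULL for rows where Col2 <> 'A', and MIN ignores NULLs,
-- so the minimum is computed over the 'A' rows only.
SELECT Col1
FROM #Demo
GROUP BY Col1
HAVING MIN(CASE WHEN Col2 = 'A' THEN Col3 END) > 1;
```

With these rows only X comes back: its 'A' rows have a minimum of 2, while Y's single 'A' row is 1. The clause order matters too: GROUP BY has to come before HAVING.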

Thanks, I cannot make it work.

I have this data:

ID DATE Y/N
A 01/01/2020 Y
A 02/01/2020 Y
A 03/01/2020 N
B 04/01/2020 Y
B 05/01/2020 Y

How can I return the smallest date for A where Col3='Y'?

Thanks

drop table if exists #SampleData  -- SQL Server 2016+; avoids an error on the first run

create table #SampleData 
(
ID varchar(10) , 	[DATE] date , 	[Y/N] varchar(10)
)

set dateformat  dmy 
insert into #SampleData values 
 ('A',	'01/01/2020',	'Y')
,('A',	'02/01/2020',	'Y')
,('A',	'03/01/2020',	'N')
,('B',	'04/01/2020',	'Y')
,('B',	'05/01/2020',	'Y')

select * from #SampleData

select 
    min([DATE]) 
from 
   #SampleData 
where 
   ID = 'A' 
     and 
   [Y/N] = 'Y'


Oh so simple but I was carried over by having to use HAVING. Thanks!

Any idea how to return the following?

A, 01/01/2020, Y
B, 04/01/2020, Y

I.e. the value of the first column, then the min date of the second column for the records corresponding to the value of the first column, and then the Y/N value corresponding to the min date calculated.

Thanks

Do you want the row with the minimum date - or just the minimum date?

  With myData
    As (
Select *
     , rn = row_number() over(Partition By ID Order By [DATE])
  From #SampleData
 Where [Y/N] = 'Y'  --only select the rows with 'Y'
       )
Select *
  From myData
 Where rn = 1;

If you only have those 3 columns - and you don't need any additional columns:

Select ID
     , [DATE] = min([DATE])
     , [Y/N]
  From #SampleData
 Where [Y/N] = 'Y'
 Group By
       [ID]
     , [Y/N]

What is the purpose for this - what are you trying to accomplish?

I didn't know you could specify [DATE] = min([DATE]) in the SELECT section; I was specifying it in the WHERE or HAVING section and it did not work.

Can you really specify [DATE] = '2020-01-01' in the SELECT section and have it work as a WHERE clause?

I think the first is what I am looking for because I only need to return the rows where [DATE]=MIN([DATE])

No - that code is not 'working' as a WHERE clause.

[DATE] = min([DATE])

Returns a column 'named' [DATE] with the minimum value of the [DATE] column from the source table based on the grouping. It might be better written as:

MinDate = min([DATE])

Or

min([DATE]) As MinDate

To clarify - you want all columns from the source table with the minimum date for each ID? So there are columns you want returned that you did not include in the sample data? If so - then use row_number() to identify the rows...

Note: if the purpose of this is to delete duplicates then change the order in the window to DESC and select all rn > 1.
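
A rough sketch of that duplicate-removal idea, assuming "duplicates" means rows sharing the same ID and that the latest row per ID is the one to keep. The table #Dupes and its rows are made up for illustration, and DROP TABLE IF EXISTS assumes SQL Server 2016 or later.

```sql
-- Hypothetical table with two rows for ID 'A' and one for ID 'B'.
DROP TABLE IF EXISTS #Dupes;
CREATE TABLE #Dupes (ID varchar(10), [DATE] date, [Y/N] varchar(10));

INSERT INTO #Dupes VALUES
  ('A', '20200101', 'Y'),
  ('A', '20200102', 'Y'),
  ('B', '20200104', 'Y');

-- DESC order makes the latest row per ID rn = 1; every other row gets rn > 1.
WITH ranked AS (
    SELECT ID
         , rn = ROW_NUMBER() OVER (PARTITION BY ID ORDER BY [DATE] DESC)
    FROM #Dupes
)
DELETE FROM ranked      -- deleting through the CTE removes rows from #Dupes
WHERE rn > 1;

SELECT * FROM #Dupes;   -- leaves A / 2020-01-02 and B / 2020-01-04
```

Running the CTE with a SELECT instead of the DELETE first is a cheap way to check which rows would be removed.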

Basically, I want the rows from the original table to be filtered.
The filter is where [Date]=MIN([Date])

Is that possible?

My attempt was:
SELECT *
FROM TABLE
HAVING/WHERE [Date]=MIN([Date])

But it does not seem to work.

Any idea?

Yes - as I stated before you want to use ROW_NUMBER() with an appropriate partition and order by.

I see, thanks.

Can I use SELECT * FROM (SELECT *)?
I would like not to have to set myData into a variable.

To complete the question, can I ask how to achieve the same but for rn = max(rn) ? I.e. the row with the latest date at the same time?

To get the latest row for each partition - change the ORDER BY to a DESC order. That will identify the latest row as rn = 1, the next latest as rn = 2, etc...

Then - you can just select for rn = 1 to get the latest row for each partition.

I don't understand your other question...there is no setting of variables needed. The construct is a common-table expression that creates a 'table' named 'myData' that is then queried in the final portion of the query. This allows us to generate the row number - then filter using the results of the row number calculation.

Personally - I will use a CTE instead of a derived table where there is no need to correlate the derived table with values from the current row. When I do need to correlate the derived table I then use an OUTER/CROSS APPLY on the derived table/query.
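
For reference, here is how the derived-table form asked about above might look, combined with the DESC ordering to pick the latest 'Y' row per ID. This is a sketch only: #RowDemo is a hypothetical copy of the sample data with the dates written as unambiguous YYYYMMDD literals (assuming the day/month/year reading set earlier with dateformat dmy), and DROP TABLE IF EXISTS assumes SQL Server 2016 or later.

```sql
-- Hypothetical copy of the thread's sample data.
DROP TABLE IF EXISTS #RowDemo;
CREATE TABLE #RowDemo (ID varchar(10), [DATE] date, [Y/N] varchar(10));

INSERT INTO #RowDemo VALUES
  ('A', '20200101', 'Y'), ('A', '20200102', 'Y'), ('A', '20200103', 'N'),
  ('B', '20200104', 'Y'), ('B', '20200105', 'Y');

-- Same logic as the CTE, written as a derived table (subquery in FROM).
-- The derived table must have an alias (d here).
SELECT d.ID, d.[DATE], d.[Y/N]
FROM (
    SELECT *
         , rn = ROW_NUMBER() OVER (PARTITION BY ID ORDER BY [DATE] DESC)
    FROM #RowDemo
    WHERE [Y/N] = 'Y'
) AS d
WHERE d.rn = 1;   -- A / 2020-01-02 and B / 2020-01-05
```

The CTE and the derived table should behave the same here; the choice is mostly about readability.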

Let's say I have the table below.

I want to:
SELECT *
FROM TABLE
WHERE (the min Active date for each ID) > 02/01/2020

A 01/01/2020 Active
A 02/01/2020 Inactive
A 03/01/2020 Active
A 04/01/2020 Inactive
B 03/01/2020 Active
B 04/01/2020 Inactive
B 05/01/2020 Active
B 06/01/2020 Inactive

Any idea?

I still don't get how to calculate MIN/MAX after applying filters. In DAX, this can be done very neatly, like: CALCULATE(MIN(DATE),FILTER(TABLE,ID=EARLIER(ID) && STATUS=ACTIVE))

drop table if exists #SampleData  -- SQL Server 2016+; avoids an error on the first run

create table #SampleData
(
id varchar(1) , [date] date , status varchar(100)
)

insert into #SampleData values
('A','01/01/2020','Active')
, ('A','02/01/2020','Inactive')
, ('A','03/01/2020','Active')
, ('A','04/01/2020','Inactive')
, ('B','03/01/2020','Active')
, ('B','04/01/2020','Inactive')
, ('B','05/01/2020','Active')
, ('B','06/01/2020','Inactive')

select * from #SampleData

select 
   * 
from 
   #SampleData 
where 
  status = 'Active'
      and 
  [date] > '02/01/2020' 


Thanks but the min Active Date of A, is 01/01/2020, so it should not be returned, because I want to specify the filter: "min Active Date > 02/01/2020".

Looks like you are totally new to SQL


Why?

In your results, you show: A, 2020-03-01, Active
I want this filtered out because there is a A, 2020-01-01, Active, which is an active date less than 02-01-2020.

then this should be

min Active Date < 02/01/2020
not
min Active Date > 02/01/2020
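
One possible way to express the filter as originally described (keep an ID's rows only when the earliest Active date for that ID is after the cutoff) is a windowed MIN over a CASE expression. Treat this as a sketch under assumptions: #StatusDemo is a hypothetical copy of the sample data, the dates are read as month/day/year to match the 2020-03-01 shown in the results quoted above, and DROP TABLE IF EXISTS assumes SQL Server 2016 or later.

```sql
-- Hypothetical copy of the sample data, with unambiguous YYYYMMDD literals.
DROP TABLE IF EXISTS #StatusDemo;
CREATE TABLE #StatusDemo (id varchar(1), [date] date, status varchar(100));

INSERT INTO #StatusDemo VALUES
  ('A', '20200101', 'Active'), ('A', '20200201', 'Inactive'),
  ('A', '20200301', 'Active'), ('A', '20200401', 'Inactive'),
  ('B', '20200301', 'Active'), ('B', '20200401', 'Inactive'),
  ('B', '20200501', 'Active'), ('B', '20200601', 'Inactive');

-- minActive = earliest Active date per id, repeated on every row of that id.
-- CASE returns NULL for non-Active rows, and MIN ignores NULLs.
WITH withMin AS (
    SELECT *
         , minActive = MIN(CASE WHEN status = 'Active' THEN [date] END)
                       OVER (PARTITION BY id)
    FROM #StatusDemo
)
SELECT id, [date], status
FROM withMin
WHERE minActive > '20200201';
```

With this data, A's earliest Active date is 2020-01-01, so all of A's rows are filtered out, and the four B rows are returned because B's earliest Active date is 2020-03-01. This is roughly the T-SQL counterpart of the CALCULATE/FILTER pattern mentioned earlier: the CASE plays the role of the status filter and PARTITION BY plays the role of matching on ID.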