TSQL Error: Column is invalid in the select list

SELECT [Servername]
,[DBName]
,[Filepath]
,[FileSize]
,[SpaceUsed]
,MAX([AsOfDate])

FROM [MyTable]

Where DBName not in ('Master', 'msdb', 'model', 'tempdb')
Group by [Servername],[DBName]

Gives error: Column FileSize' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

I know this is a common error but not sure how to resolve this one. I need to select the MAX / latest date for each ServerName + DBName

SELECT 
 [Servername]
,[DBName]
,[Filepath]
,[FileSize]
,[SpaceUsed]
,[AsOfDate]

FROM (
    SELECT *, ROW_NUMBER() OVER(PARTITION BY DBName ORDER BY AsOfDate DESC) AS row_num
    FROM [MyTable]
    WHERE DBName not in ('master', 'model', 'msdb', 'tempdb')
) AS MT_latest

WHERE row_num = 1
2 Likes

Thank you ScottPletcher! You pointed me in the right direction.

I tweaked your script a bit to get 1 row per Server+DB+FilePath. Then I exclude LDF files

SELECT
[Servername]
,[DBName]
,[Filepath]
,[FileSize]
,[SpaceUsed]
,[AsOfDate]

FROM (
SELECT *, ROW_NUMBER() OVER(PARTITION BY ServerName,DBName, FilePath ORDER BY AsOfDate DESC) AS row_num
FROM [dbo].[MyTable]
WHERE DBName not in ('master', 'model', 'msdb', 'tempdb') AND Filepath NOT LIKE ('%LDF%')
) AS MT_latest

WHERE row_num = 1

Great, glad it helped.

Btw, in case you were wondering, I changed the order of the system dbs in the list to put them in alpha order. For a list of literals, I always put them in alpha order, makes it easier to scan the list later for a specific value if I need to.

2 Likes

Thanks ScottPletcher. I would like to better understand what the
ROW_NUMBER() OVER(PARTITION BY ServerName,DBName, FilePath actaully does. Why does it work over what I originally coded? Thanks.

The ROW_NUMBER() assigns a sequential number to each row based on a PARTITIONing of the data (optional) and an ORDER BY (required).

Say there were five rows for the same ServerName, DBName and FilePath. The one with the largest AsOfDate, say '20210427', would be row 1. The next highest date, say '20210416' would be 2, '20210413' would be 3, etc..

Then the "WHERE row_num = 1" restricts the output to only the last one for each unique combination of the PARTITION BY columns, in this case ServerName, DBName and FilePath.

Say we left off the PARTITION BY. Then we'd get the row with the latest AsOfDate as #1, no matter which ServerName, DBName or FilePath it was for. And there would be only one #1. The PARTITION BY is what causes the number to reset to 1 for each unique combination of PARTITION BY columns.

1 Like

An alternative query:

SELECT t2.[Servername],
       t2.[DBName],
       t2.[Filepath],
       t2.[FileSize],
       t2.[SpaceUsed],
       t2.[AsOfDate]
  FROM (SELECT DISTINCT 
               [Servername], 
               [DBName]
          FROM [MyTable]
         WHERE DBName not in ('master', 'model', 'msdb', 'tempdb')) t1
CROSS APPLY(SELECT TOP(1) *
              FROM [MyTable] t2
             WHERE t2.[Servername] = t1.[Servername]
               AND t2.[DBName] = t1.[DBName]
             ORDER BY t2.[AsOfDate] DESC) t2
2 Likes