
SUM of column with distinct applied to a different column

Hi all,

I'm trying to get the total size of all unique filenames.

I have a table with Filename and Size columns. Filename is not unique; however, rows with the same filename refer to the same file, so I only want to count that file's size once.

So I want the total file size across all unique filenames. I do not want the sum of distinct sizes, because different files can have the same size.

SELECT DISTINCT Filename, SUM([Size]) --fails

SELECT COUNT(DISTINCT Filename), SUM([Size]) --returns correct filename count, but the total sum, not distinct sum

SELECT SUM(DISTINCT [Size]) --returns sum of distinct sizes, not sum of distinct filenames.

I'm sure I could do it with a more elaborate compound query, but I'm trying to be efficient, as this will be called often from a dashboard.

Any guidance appreciated!

SELECT COUNT(*) AS Unique_File_Count, SUM(Size) AS Total_File_Size
FROM (
    SELECT Filename, MAX(Size) AS Size
    FROM dbo.table_name
    GROUP BY Filename
) AS derived
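
To see how the three sums differ, here is a small sketch against made-up data. The table variable @Files and its rows are hypothetical and only there for illustration; the expected values in the comments follow from those four rows.

```sql
-- Hypothetical sample data; the table variable and values are illustrative only.
DECLARE @Files TABLE (Filename varchar(260) NOT NULL, [Size] bigint NOT NULL);

INSERT INTO @Files (Filename, [Size]) VALUES
    ('a.pdf', 100),  -- same file attached in two locations
    ('a.pdf', 100),
    ('b.pdf', 100),  -- different file that happens to have the same size
    ('c.pdf', 250);

SELECT
    (SELECT SUM([Size]) FROM @Files)          AS Sum_All_Rows,       -- 550: a.pdf counted twice
    (SELECT SUM(DISTINCT [Size]) FROM @Files) AS Sum_Distinct_Sizes, -- 350: b.pdf is lost
    (SELECT SUM([Size])
     FROM (SELECT Filename, MAX([Size]) AS [Size]
           FROM @Files
           GROUP BY Filename) AS derived)     AS Sum_Per_Filename;   -- 450: one size per filename
```

Only the last column counts each filename once while still keeping files that share a size.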

The size of a file can change over time. Is there a date associated with the duplicate filenames?

No, there is no date. The 'files' are keys to an Amazon S3 bucket. If a user attaches the same file in multiple locations, I keep only a single copy of it to reduce S3 storage redundancy, and I only count the file once against the total space allowed.
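
If the same filename always carries the same size, as described here, MAX(Size) in the GROUP BY query is just a way to pick that one value, and a DISTINCT over both columns should give the same total. Since this runs often from a dashboard, a covering index may also help. The following is a sketch under those assumptions: dbo.table_name is the placeholder from the earlier reply, the index name IX_table_name_Filename is made up, and whether the index pays off depends on the table's size and its existing indexes.

```sql
-- Assumption: every row with a given Filename has the same Size.
-- If that ever stops being true, a filename would be counted once per distinct size.
SELECT COUNT(*) AS Unique_File_Count, SUM([Size]) AS Total_File_Size
FROM (
    SELECT DISTINCT Filename, [Size]
    FROM dbo.table_name
) AS derived;

-- Hypothetical covering index so the query can be answered from the index alone.
CREATE NONCLUSTERED INDEX IX_table_name_Filename
    ON dbo.table_name (Filename)
    INCLUDE ([Size]);
```

Comparing the execution plans of this and the GROUP BY version on the real table would show whether either one is actually cheaper.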