Select top 20 records per fund basis

hi

i want to select the top 20 records by largest value in a column from each fund in a table. whats the best way to do that.

You can use Row_number() over (partition by fund order by column desc), then select where the Row_number <= 20. If you provide DDL and sample data, we could provide some code for you

ya thought i might need that all right.

i have done this much so far

SELECT *,

       (ROW_NUMBER() OVER (PARTITION BY Fundcode
        ORDER BY Proceeds DESC)) AS RowNumber

FROM [dbo].[WorkingDataloadFile5]
where RowNumber<=20

where i get stuck with these is to tie in the rownumber they way i have it here is wrong as it just errors out. i need another select i think but not sire. also i want to enter all the columns i get back into [WorkingDataloadFile6] table has the same exact same columns

select * from 
( SELECT *,
       (ROW_NUMBER() OVER (PARTITION BY Fundcode
        ORDER BY Proceeds DESC)) AS RowNumber
FROM [dbo].[WorkingDataloadFile5]) v
where RowNumber<= 20

actually got it working way trying to make it complicated for some reason

SELECT *
FROM (SELECT ROW_NUMBER() OVER (PARTITION BY Fundcode ORDER BY Proceeds DESC) AS Row,*
FROM [dbo].[WorkingDataloadFile5] ) us
WHERE us.Row <=20

it is just a simple insert after ?

thanks for reply ha. i just posted again as you did. sorry for wasting you time but thanks so much for the help

like this is not a great way to insert the data into a new table is it?

INSERT into [dbo].[WorkingDataloadFile6]
([AccountPeriodEnd]
,[SecurityName]
,[FundCode]
,[Bought]
,[Sold]
, [PurchaseCost]
,[Proceeds]
,[CategoryTypeCodeList]
,[CategoryCodeList]

       )

SELECT *
FROM (SELECT ROW_NUMBER() OVER (PARTITION BY Fundcode ORDER BY Proceeds DESC) AS Row,
[AccountPeriodEnd]
,[SecurityName]
,[FundCode]
,[Bought]
,[Sold]
, [PurchaseCost]
,[Proceeds]
,[CategoryTypeCodeList]
,[CategoryCodeList]
FROM [dbo].[WorkingDataloadFile5] ) us
WHERE us.Row <=20

plus it does error out saying

Msg 121, Level 15, State 1, Line 5
The select list for the INSERT statement contains more items than the insert list. The number of SELECT values must match the number of INSERT columns.

is that because of the row

its the fundcode that the causing the error.

but still getting Msg 8152, Level 16, State 13, Line 1
String or binary data would be truncated.

both tables are copies of each other

be careful with select *. Should list out the column names

INSERT into [dbo].[WorkingDataloadFile6]
([AccountPeriodEnd]
,[SecurityName]
,[FundCode]
,[Bought]
,[Sold]
, [PurchaseCost]
,[Proceeds]
,[CategoryTypeCodeList]
,[CategoryCodeList]

       )
SELECT [AccountPeriodEnd]
,[SecurityName]
,[FundCode]
,[Bought]
,[Sold]
, [PurchaseCost]
,[Proceeds]
,[CategoryTypeCodeList]
,[CategoryCodeList]
FROM (SELECT ROW_NUMBER() OVER (PARTITION BY Fundcode ORDER BY Proceeds DESC) AS Row,
[AccountPeriodEnd]
,[SecurityName]
,[FundCode]
,[Bought]
,[Sold]
, [PurchaseCost]
,[Proceeds]
,[CategoryTypeCodeList]
,[CategoryCodeList]
FROM [dbo].[WorkingDataloadFile5] ) us
WHERE us.Row <=20

ok thanks. will make sure not to use select * didnt want to anyway. only need to insert into a few columns

the select * was the reason it was failing. Select * included the row_number column which isn't in the table.

great thanks again for the help.

hi

is there any reason why this logic would only pick 19 out of 20 if the number was posttive.

SELECT [AccountPeriodEnd]
,[Fundcode]
,[MasterIdentifier]
,[SecurityName]
,[Bought]
,[Sold]
, [PurchaseCost]
,[Proceeds]
,[CategoryTypeCodeList]
,[CategoryCodeList]
FROM (SELECT ROW_NUMBER() OVER (PARTITION BY Fundcode ORDER BY Proceeds ASC) AS Row,
[AccountPeriodEnd]
,[Fundcode]
,[MasterIdentifier]
,[SecurityName]
,[Bought]
,[Sold]
, [PurchaseCost]
,[Proceeds]
,[CategoryTypeCodeList]
,[CategoryCodeList]
FROM [WorkingDataloadFile5] ) us
WHERE us.Row <=20 and MasterIdentifier like'%Sell%'

for 1 fundcode it only picks 19 even though there is 27. only think i can see is the 20th figure is positive

anyone any ideas on how i can get the logic to pick negative and positive values if they have both

The problem is in your where clause...

The row_number identifies the rows based on the partition and order. In your where clause - you then filter by the row number AND the MasterIdentifier. In the first 20 rows you have 19 with a MasterIdentifier with the word Sell - and one that does not include that word.

If the top 20 rows in a fund had 10 with Sell and 10 without Sell you would only get 10 rows.

If you only want to include those items - then move that portion of the where clause into the inner query so they are filtered out before the row number is assigned. If not - then you need to determine what you really want as the results and adjust the row number function.

hi all 20 have the word sell so thats not the issues.

where in the inner clause can i move the where MasterIdentifier like'%Sell%' as anywhere i put it doesnt seem to work?

thanks for help in advance

The inner query is defining the row based on the partition FundCode. The row number is then based on the order of the column Proceeds.

The way this code is written - it is set to return the first 20 rows for each FundCode based on an descending order of Proceeds, so the largest/highest value of Proceeds will have row number 1 - and the smallest/lowest value of Proceeds will have the greatest row number.

Now - if the inner query returns 30 rows with a mixture of rows that have Sell and not Sell values where Proceeds for a non Sell item is within the first 20 descending values of Proceeds - that row will have a row number <= 20.

To validate - remove the where clause completely and include the row number column in the outer query. Then, you can see which rows are included and what the row numbers are assigned.

If you only want to include those items with *Sell - then move that criteria to the inner query:

FROM [WorkingDataloadFile5]
WHERE MasterIdentifier like'%Sell%' ) us
WHERE us.Row <=20

This limits the query to only those items...but I am not sure that is what you really want. If you want the top 20 rows for items with Sell and the top 20 items without Sell - then change how the row number is calculated.

FROM (SELECT ROW_NUMBER() OVER (PARTITION BY Fundcode, iif(MasterIdentifierId Like '%Sell%', 0, 1) ORDER BY Proceeds ASC) AS Row,

And remove the MasterIdentifierID from the where clause.

Hi

Thanks very much for the help. The first option worked for me . the second one didnt seem to just to let you know.

Not sure what didn't seem to work means - can you elaborate? Just note - the first option eliminates those rows where MasterIndentifierID does not contain the characters Sell.