Fastest way to retrieve the result

sqllover · July 15, 2015, 2:18am

This is my table structure,

Version : sqlserver 2008R2

CREATE TABLE [Positions](
            [load_id]           [int]                  NOT NULL, 
            [acct_cd]         [varchar](20)   NOT NULL,
            [acct_num]      [varchar](255)             NULL,
            [sec_id]            [varchar](50)   NOT NULL,
            [long_sht_cd]   [varchar](3)     NOT NULL,
            [sedol]              [varchar](15)   NULL,
            [isin]                 [varchar](15)   NULL,
            [cusip]              [varchar](9)     NULL,
            [sec_type]       [varchar](8)     NULL,
            [sec_name]     [varchar](100) NULL,
            [currency_cd] [varchar](3)     NULL,
            [total_holding] [decimal](18, 4) NULL,
            [mkt_price]      [float]               NULL,
            [datetime_stamp] [datetime]   NULL,
CONSTRAINT [pk_Positions] PRIMARY KEY CLUSTERED (       
[load_id] ASC, 
            [acct_cd] ASC, 
            [sec_id] ASC,
            [long_sht_cd] ASC )
)

This table holds account positions data that are appended to multiple times a day
There are currently some 24 million rows in the table. Every time we append additional positions we add approximately 32,000 entries to this table, and all 32,000 entries will have the same load_id. The load_id is incremented by one each time we load a batch of 32,000 entries (i.e. the first 32K entries have load_id=1, the next 32K has load_id=2, etc...). The datetime_stamp field shows the time at which the entries were loaded and is the same for all 32K entries in a single load

what is the efficient way to retrieve the first set of positions for the current day given the above table definition?

I can understand that it's hard to give the query without the sample data. but at this moment i couldn't get the data. Any sample query based on the table structure please

jon_mr · July 15, 2015, 4:53am

Assuming you had an index on datetime_stamp and the clustered index you defined above, I would think you would get decent performance with datetime_stamp and load_I'd in your WHERE clause. The query plan will tell you for sure.

If you expect this table to continue to grow you may want to consider partitioning the table.

sqllover · July 15, 2015, 10:45am

hi jon,

thanks for the reply. could you please suggest sample query for the requirement.

JamesK · July 15, 2015, 12:46pm

    SELECT TOP (1) WITH TIES *
    FROM Positions
    WHERE [datetime_stamp] >= CAST(GETDATE() AS DATE)
    ORDER BY [datetime_stamp] ;

ScottPletcher · July 15, 2015, 4:22pm

If you typically lookup by datetime, make datetime_stamp the first column in the clustered index (it does not have to be an actual PK, duplicate clustering key values are OK, at least initially).

sqllover · July 15, 2015, 4:24pm

Thank you jon ,James and scott for your time on this.