Space used by indexes

puneet122016 · December 28, 2016, 12:59am

I have a table with around 50 million of rows. The table does not have any indexes. There is identity column with BigInt datatype.

The daily updation of record is done based on identity column and the weekly bulk insert/delete is done based of other column of date data type

I am planning to create Clustered index of identity column.

My query:

Will the creation of clustered index will improve daily updation of record(s) or should I create non-clustered index?. How it will affect the bulk insert/delete operation.
Does the creation of clustered index (on identity column as above) takes more space than non-clustered index or vice-versa.

stephen_hendricks · December 28, 2016, 9:52pm

Indexes are useful for finding existing data, such as in a SELECT, UPDATE or DELETE statement.The statement has to reference the column(s) in the index. Clustered indexes are especially good for finding a range of values.Personally, I want every table to have a clustered key. Without one, your data is "randomly" placed in storage (disk). The implication for updating a range of values is that, potentially, you'll need a separate disk read for every record of data. That's not good. With a clustered index, once you have found one record, the "next" one in the range is in the same location so you cut down on disk activity; always a good idea.
Long answer, short - a clustered index should improve your update performance.
Data inserts, however, can be slowed by an index since there is some overhead in maintaining the index as well as the data. How much you'll need to test. Usually the overhead on the inserts is outweighed by the improvements in the selects, updates and deletes. In a worst case scenario, you can drop the index, insert your data and re-create the index and that can take less time than simply inserting into an indexed table. Will this be true for your environment, only testing will tell. Depending on when and how you insert the data (off hours?), you may not care if the data load takes slightly longer to complete.

jason_clark · December 29, 2016, 5:18am

Yes, clustered index will take more space. In this, a new table will be created for indexing which will occupy more space. But using this performance improved. In the case of non-clustered index it captures, less space then clustered index. In non-clustered index this just hold the pointer of the data.

stephen_hendricks · December 29, 2016, 9:45pm

The amount of space that an index takes up is more a function of the width of the index column(s) rather than the "clusteredness" (Is that a word? Well it is now!) A thinner index will allow more elements to fit on a page so the depth of the index b-tree will be shallower. The actual b-tree structure is essentially the same between the two types of indexes; the main difference being that the leaf of a clustered index is the actual data while the leaf of the non-clustered index is either the clustered key, if one exists, or a pointer to the data in the heap.[quote="jason_clark, post:3, topic:8930"]
a new table will be created for indexing
[/quote]I hope that I'm misunderstanding what's being said here. A clustered key doesn't create a duplicate copy of the data; it holds the data at the leaf level instead of in a heap. If there is a clustered and non-clustered index, the non-clustered index eventually points to the data in the clustered key leaves; not to a separate copy of the data.