Order of Index creation

djj55 · August 27, 2015, 7:10pm

Is it better to create the primary key before other indexes, or should the other indexes be created before the primary key?

Also, can you disable the primary key?

Just curious.

ScottPletcher · August 27, 2015, 7:13pm

You should create the clustered index first,then non-clustered indexes. Btw, when dropping indexes, you should drop nonclustered indexes first, and the clustered index last.

You can disable a clustered index. But then you can't reference the table until it is reenabled.

djj55 · August 27, 2015, 7:23pm

Thanks @ScottPletcher
Good to know about the disable of a clustered index.

Kristen · August 28, 2015, 7:29am

Don't think Primary Key matters, its the Clustered Index that is important

If you create Non-clustered indexes first they will get a "record offset" value (because there is no Clustered Index), so SQL can retrieve the row.

When you then create the Clustered Index SQL will replace the "record offset" with the Clustered Index Key fields, so every entry in every non-clustered index will have to be modified

Hence better to create Clustered Index first.

I'd go so far as to say you might want to drop all Non-clustered Indexes before changing a clustered index, and then recreate then afterwards, but maybe SQL's in-situ reshuffle would be similar outcome, performance-wise

If I am bulk loading a table I create the clustered index first, bulk load the data PRE-sorted into clustered index key order (and provide the HINT to BCP etc. that the data is presorted)

I have quite a few Primary Keys which use non-clustered indexes - maybe your question was related to creating ( or creating a NEW) Primary keys which are CLUSTERED, in which case its still all-about-the-clustered-index but you do have the option to add a Prirmay Key as NON-clustered, if you already have a perfectly good Clustered Index in place, in which case any existing Non-Clustered indexes will be unaffected by adding / changing the Primary Key.

djj55 · August 28, 2015, 10:58am

I know that a clustered index does not have to be the primary, however we usually us the default of primary = clustered.

ScottPletcher · August 28, 2015, 2:01pm

And probably a "default" of that being an identity as well. You must realize that is terrible for overall performance. The biggest myth in SQL dbs is that tables "should" be clustered by identity by default. 100% wrong.

The clustered index is the critical factor in performance, and it should be carefully chosen for every table, never just assumed to be identity. Yes, there are cases where an identity is indeed the best clustering key, but it's less than 50% of tables, usually much less. I've tuned tens of thousands of tables, and the biggest part is usually replacing the dopey, default clustered index on identity with something much, much better.

Kristen · August 31, 2015, 8:06am

Some examples might help folk stumbling over this thread?

I'm thinking Date, for range-based queries, might be a good example?

Balanced off again whether there is a Foreign Key to consider?

What about Order / OrderDetail tables? Lets say that Order PKey is OrderID is IDENTITY, and OrderDetail has OrderID, ItemNo. Does Order table have Clustered index on OrderID or OrderDate?

(I haven't really thought it through, just posing the question )

I'm currently struggling with a Clustered Index that has keys A, B, C. A never changes, 99% of queries are covered by A, B & C ... so far so good! ... but B or C can be part of an update along the lines of

SET B = B - 2

which updates 20,000 rows and runs in a tight RBAR loop. There are at least a dozen nonclustered indexes, which all need updating each time, so in this case the best Clustered Index which covers the most queries is problematic because the keys are not "almost never changing" (I think "almost never changing" is also one of the considerations for a Clustered Index)

ScottPletcher · August 31, 2015, 2:53pm

Order table is clustered on OrderId. This is one of those true "master" tables where identity is the best clustered index, like Customer table. OrderDetail is clustered on ( OrderId, ItemNo ) or ( OrderId, EntryDateTime), depending on what the app/business needs.

I'm currently struggling with a Clustered Index that has keys A, B, C. A never changes, 99% of queries are covered by A, B & C ... so far so good! ... but B or C can be part of an update along the lines of

Interesting. I'd want to see more details on this. Can't imagine why you'd need to adjust those values later if they are searched on. Certainly updating the clustering key is not good. But some number of very limited updates could be OK.

Kristen · September 1, 2015, 8:34am

Its in this thread, which you have kindly replied to. I'm busy doing some tests before replying further. I keep uncovering more-and-more disastrous performance and not entirely sure which way to jump :
http://forums.sqlteam.com/t/autonomy-imanage-sql-performance/2946

ScottPletcher · September 1, 2015, 4:35pm

Disastrous performance is almost always due to not having the best clustered index on the tables. The majority of time that is because those clustered indexes are on identity column -- presumed by some to be a "default" clustering key -- rather than on the column(s) that are actually best for that specific table.