In-place update to clustered index

Only ON, for us, so that Readers don't block Writers, but that only applies to daytime users arriving at the DB via the APP. Any overnight batch operations should be lock-safe, and if occasionally a couple of users decide to work at 2am 🙂 then, even then, the chance of them triggering a deadlock is pretty close to zero.

Not sure if the DB will set about adding/removing all the Snapshot Bits on records/pages/etc. if I toggle it ON/OFF, though ...
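For reference, flipping the setting is a single ALTER DATABASE; a sketch, with StagingDb as a placeholder name (the switch needs brief exclusive use of the database, hence the termination clause):

```sql
-- Placeholder database name. WITH ROLLBACK IMMEDIATE kicks out open
-- transactions so the ALTER doesn't sit waiting for exclusive access.
ALTER DATABASE StagingDb SET READ_COMMITTED_SNAPSHOT ON
    WITH ROLLBACK IMMEDIATE;

-- Confirm the current setting
SELECT name, is_read_committed_snapshot_on
FROM   sys.databases
WHERE  name = N'StagingDb';
```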

By the by, READ_COMMITTED_SNAPSHOT would be OFF, for us, on any Staging Database.

Trouble is, again for us, we set a ChangedDate column on any row that we insert into the Staging Table (and then only update the Production DB with rows that have changed "since last time"). So while making an UPDATE with no change-checking WHERE clause on my Staging table might spare me maintaining that clause, and might not physically update the disks, but ... if I cannot then figure out which rows actually changed I wouldn't have a slick means of updating my Production tables from Staging.

Unless you can suggest something please?
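For the record, the shape of the change-checking UPDATE we keep referring to is roughly this; every table and column name here is hypothetical:

```sql
-- ChangedDate is stamped only on rows whose values actually differ,
-- because unchanged rows never pass the WHERE clause.
UPDATE s
SET    s.ColA        = src.ColA,
       s.ColB        = src.ColB,
       s.ChangedDate = GETDATE()
FROM   dbo.StagingTable AS s
JOIN   dbo.IncomingData AS src
  ON   src.Id = s.Id
WHERE  s.ColA <> src.ColA     -- note: plain <> never matches when either side
   OR  s.ColB <> src.ColB;    -- is NULL; see the NULL-safe variant further down
```

The Production load then only has to pick up rows whose ChangedDate is later than the last run.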

Oh I do trust Paul, as far as he goes with it. Following your suggestion, I did SELECT * from the log. As expected, two rows are added for each update (begin/end tran) without my added WHERE clause. However, with my added WHERE clause, no additional rows are written to the log when the data does not change.
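(For anyone who wants to repeat the experiment: the usual way to peek at the log is the undocumented sys.fn_dblog function. A sketch:)

```sql
-- sys.fn_dblog is undocumented/unsupported, but handy for this kind of look.
-- The two NULLs mean "from the start of the available log to the end".
SELECT [Current LSN], Operation, Context, [Transaction ID], AllocUnitName
FROM   sys.fn_dblog(NULL, NULL);
```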

Either way, the pages are not marked as updated (which is really, really good to see), but new log rows are written when the WHERE clause is omitted. Interesting.

Paul didn't address that. I found out why. See the next post.

Some more info. Paul's article links to "Non-updating updates".

In there we have:

The runtime cost for updating a row is roughly equal to
a) locating the row in the heap or B-Tree
b) locking the row
c) changing the updated column values
d) logging the change.

So, the reason the "enhanced" (if I dare call it that!) update with change checking writes no new log entries is that the execution stops at step a) above. Since the row is never found, it is not locked, changed, or logged.
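That's easy to demonstrate. A minimal repro sketch, assuming a hypothetical scratch table and a scratch database in SIMPLE recovery:

```sql
CHECKPOINT;   -- in SIMPLE recovery this truncates the reusable log, so
              -- fn_dblog output afterwards is mostly our own activity

-- No change check: every row is located and locked, and the statement's
-- transaction is logged (the begin/end pair mentioned above).
UPDATE dbo.StagingTable SET ColA = ColA;

-- Change check that no row can pass (ColA <> ColA is never true): rows are
-- scanned but none qualify, so nothing is locked, changed, or logged.
UPDATE dbo.StagingTable SET ColA = ColA WHERE ColA <> ColA;

SELECT Operation, Context
FROM   sys.fn_dblog(NULL, NULL);
```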

SQL also has to parse and execute your additional WHERE clauses. Again, for me, it's too complex for too little return with way too much potential for serious errors. But if you like it, do it.

But you'd get multiple-orders-of-magnitude better performance gains from tuning indexes, particularly getting the best clustered index on every table of any real size, say 512+ pages. (Yes, I don't agree with those who say to totally ignore a table of "only" 1000 pages, and even the originator of that figure admits that 1000 was just a number off the top of his head!)
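If you want a starting point, something along these lines lists user tables of that size that are still heaps, i.e. have no clustered index at all; the 512-page threshold is just the figure above:

```sql
-- Heaps (index_id = 0) of 512+ pages: candidates for a clustered index.
SELECT SCHEMA_NAME(t.schema_id) AS schema_name,
       t.name                   AS table_name,
       SUM(au.total_pages)      AS total_pages
FROM   sys.tables           AS t
JOIN   sys.partitions       AS p  ON p.object_id     = t.object_id
                                 AND p.index_id      = 0        -- heap
JOIN   sys.allocation_units AS au ON au.container_id = p.hobt_id
GROUP  BY t.schema_id, t.name
HAVING SUM(au.total_pages) >= 512
ORDER  BY total_pages DESC;
```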

You're futzing about with microseconds and a handful of lesser-overhead I/Os when you could be eliminating tens of thousands of I/Os.

The tables in question have already been optimized. The best indexes have been found, implemented, and are regularly maintained. SQL only has to parse the enhanced update query once per startup (procs, not pass-through queries), and I have never resisted giving a compiler more work to do in order to achieve the results I want. At worst, we're talking an inconsequential addition to compile time. Frankly my dear, ...

For me and my colleagues, the little extra (a few minutes per update at most) to implement and maintain the additional WHERE clause is nothing at all. Using enhanced editor tools such as Alt+select (column selection) makes it a very simple operation, whether one column is being changed or a thousand.
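One wrinkle worth guarding against when generating the clause column by column: plain <> never qualifies a row where either side is NULL. A NULL-safe variant, using the well-known INTERSECT trick (same hypothetical names as before):

```sql
-- INTERSECT treats NULL as equal to NULL, so unchanged NULLs don't qualify,
-- while a change from NULL to a value (or back) still does.
UPDATE s
SET    s.ColA        = src.ColA,
       s.ColB        = src.ColB,
       s.ChangedDate = GETDATE()
FROM   dbo.StagingTable AS s
JOIN   dbo.IncomingData AS src
  ON   src.Id = s.Id
WHERE  NOT EXISTS (SELECT s.ColA,   s.ColB
                   INTERSECT
                   SELECT src.ColA, src.ColB);
```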

OTOH knowing that fewer locks will be taken and fewer log rows written does matter to us, very much.