Purge Job Hangs Forever

jbates99 · August 28, 2017, 1:15pm

This is SQL Server 2014 SP 2
The table being purged has a Columnstore Index.
The purge criteria is a smalldatetime column based on the # of days to keep, which is passed as a parameter.

declare @curDate smalldatetime;
declare @RowsDeleted int;
declare @TotalRowsDeleted int;
declare @LoopCtr int;

Set @curDate = getDate()

Set @RowsDeleted = 1;
Set @TotalRowsDeleted = 0;
Set @LoopCtr = 0;

SET NOCOUNT ON;

WHILE @RowsDeleted> 0 -- loop until all qualifying rows have been deleted
BEGIN
Set @LoopCtr = @LoopCtr + 1;
BEGIN TRAN;
DELETE TOP (12000) from LogHistory.ServiceLog
where Datediff(day, CreatedDate, @curDate) > @DaysToKeep;
SET @RowsDeleted = @@ROWCOUNT;
SET @TotalRowsDeleted = @RowsDeleted + @TotalRowsDeleted;

	COMMIT TRAN;
 END

The Problem: Recenlty, this job has not been able to complete, even after running for days. It doesn't delete any rows. even though millions of rows qualify for deleting. I have updated state on the CreatedDate column.

What could be the problem?

Thanks for any tips.

stepson · August 28, 2017, 1:53pm

There is an article about "careful batching" - you can inspire on it
http://michaeljswart.com/2014/09/take-care-when-scripting-batches/
where Datediff(day, CreatedDate, @curDate) > @DaysToKeep; this could be change in a more "sargable" argument : where CreatedDate < DATEADD(DAY,-@DaysToKeep,@curDate) (keep in mind that datediff and dateadd are different functions and results are different)
TOP (12000) - try to use a lower / upper value ( I will use something like 1000 as start)

James · August 28, 2017, 2:08pm

Columnstore is going to be a challenge because of how data is stored. Our experience in performance was two to four times worse when modifying or deleting data compared to modifying rows in a rowstore table. All the data has to be read into memory and processed (since it is stored in compressed blob columns) and this is very memory/cpu intensive.

It got to the point where we decided that it was faster and more efficient to make changes in a rowstore table and then re-build a columnstore table that is read-only. It depends on your data of course if you can do that.

It also helps if you can ensure the data is ordered in columnstore tables by having a clustered key on the rowstore table before you build it as columnstore. That can help.

Columnstore tables are fantastic for read performance (we had improvements that were hard to believe) but don't expect it to perform if modifications are made to the data.

jbates99 · August 29, 2017, 2:11am

Thanks stepson and James. I AM performing commits in batches of 12,000 rows. it's weird. I used Columnstore Indexing to save space and the space benefits are amazing. James, I will look into making this a rowstore table. This data is very very rarely read. It's historical so I don't need to rebuild as columnstore. Thanks for your advice guys.