SQL Batches Update

Hi Guys,

I am trying to update 1. 7 millions records out of 16 million records in production table database.

I am not sure what to do. I tried to do it in batches which work but supposing the update fails, how do I tracked the Update batches-ID and where it fails occur in the batches for row back transaction. See my code below:

declare @counter int
declare @numOfRecords int
declare @batchsize int

set @numOfRecords =
(SELECT COUNT(*) AS NumberOfRecords

FROM dbo.tbl_Table with(nolock))
set @counter = 0
set @batchsize = 25000

set rowcount @batchsize
while @counter < (@numOfRecords/@batchsize) +1
begin
set @counter = @counter + 1
Update dbo.tbl_Table
set [field] = 'OAU',
[SecondField] = Hello'
where [field] <> 'OAU'
and [SecondField] <> 'Hello' ;
end
set rowcount 0

The code needs to work based on the clustering key of the table. What is(are) the clustering key column(s)?

Hi Scott,
Thanks for your help.
See my sample table below.

USE [AdventureWorksDW2012]
GO
--Change 1000000 to the number of your preference for your needs
SELECT TOP 1700000
c1.[BusinessEntityID], --- primarykey
c1.[FirstName], -- Update First name to Help
c2.[LastName], --Update Last Name - help2
c1.[Title]

FROM [Person].[Person] c1
CROSS JOIN [Person].[Person] c2

Basically, the table is 16 million records but only want to update 1.7m in production environment. How do that in batches and also able track batch ID in transation if it fails. The primary key is the Business Indenty Key.

Thanks

SET ANSI_NULLS ON; 
SET ANSI_PADDING ON;
SET ANSI_WARNINGS ON;
SET ARITHABORT ON;
SET CONCAT_NULL_YIELDS_NULL ON;
SET NOCOUNT OFF;
SET NUMERIC_ROUNDABORT OFF;
SET QUOTED_IDENTIFIER ON;
SET XACT_ABORT ON;

DECLARE @batch_size_of_update int;
DECLARE @BusinessEntityID int;
DECLARE @rows_updated int;
DECLARE @total_rows_updated int;
DECLARE @total_rows_to_update int; 

SET @batch_size_of_update = 25000;  
SET @total_rows_to_update = 1700000;

SET @BusinessEntityID = 0
SET @rows_updated = 0;
SET @total_rows_updated = 0;

PRINT 'Start time = ' + CONVERT(varchar(30), GETDATE(), 0) + '.';

WHILE 1 = 1
BEGIN
    UPDATE p
    SET [FirstName] = 'Help',
        [LastName] = 'Help2' 
    FROM Person.Person p
    INNER JOIN (
        SELECT TOP (@batch_size_of_update) BusinessEntityID
        FROM Person.Person
        WHERE BusinessEntityID > @BusinessEntityID
        ORDER BY BusinessEntityID
    ) AS p_to_del ON p_to_del.BusinessEntityID = p.BusinessEntityID
    WHERE p.BusinessEntityID > @BusinessEntityID
    SET @rows_updated = @@ROWCOUNT;
    IF @rows_updated = 0
        BREAK;
    SET @total_rows_updated = @total_rows_updated + @rows_updated;
    PRINT 'Rows updated so far = ' + CAST(@total_rows_updated AS varchar(10)) + '.';
    IF @total_rows_updated >= @total_rows_to_update
        BREAK;
    IF @batch_size_of_update > 10000 AND @batch_size_of_update <= 30000
        WAITFOR DELAY '00:00:00.100';
    ELSE
        WAITFOR DELAY '00:00:00.250';
END --WHILE

CHECKPOINT; 

PRINT '';
PRINT 'Total rows updated = ' + CAST(@total_rows_updated AS varchar(10));
PRINT 'End time = ' + CONVERT(varchar(30), GETDATE(), 0) + '.';

Hi Scott,

You are a genius.
Much apprciated. Where are you based?

Thanks.

Hi Scott,

Sorry to bother you again. I am struggling to understand this statement - IF @batch_size_of_update > 10000 AND @batch_size_of_update <= 30000

What does this mean. Basically, I only want to update only 1.4M from 16M records in that table.

Thanks

You can drop that part, the whole IF and ELSE, if you want.

It's just testing the size of batch to see how long to delay before deleting the next batch. It's solely to give the SQL instance a little "breather" between deletes, just in case the server is busy. So of course it's optional. The delay is only 1/10 of a second, but that's still fairly long for a modern computer(!).