Hi Guys,
I am trying to update 1. 7 millions records out of 16 million records in production table database.
I am not sure what to do. I tried to do it in batches which work but supposing the update fails, how do I tracked the Update batches-ID and where it fails occur in the batches for row back transaction. See my code below:
declare @counter int
declare @numOfRecords int
declare @batchsize int
set @numOfRecords =
(SELECT COUNT(*) AS NumberOfRecords
FROM dbo.tbl_Table with(nolock))
set @counter = 0
set @batchsize = 25000
set rowcount @batchsize
while @counter < (@numOfRecords/@batchsize) +1
begin
set @counter = @counter + 1
Update dbo.tbl_Table
set [field] = 'OAU',
[SecondField] = Hello'
where [field] <> 'OAU'
and [SecondField] <> 'Hello' ;
end
set rowcount 0
The code needs to work based on the clustering key of the table. What is(are) the clustering key column(s)?
Hi Scott,
Thanks for your help.
See my sample table below.
USE [AdventureWorksDW2012]
GO
--Change 1000000 to the number of your preference for your needs
SELECT TOP 1700000
c1.[BusinessEntityID], --- primarykey
c1.[FirstName], -- Update First name to Help
c2.[LastName], --Update Last Name - help2
c1.[Title]
FROM [Person].[Person] c1
CROSS JOIN [Person].[Person] c2
Basically, the table is 16 million records but only want to update 1.7m in production environment. How do that in batches and also able track batch ID in transation if it fails. The primary key is the Business Indenty Key.
Thanks
SET ANSI_NULLS ON;
SET ANSI_PADDING ON;
SET ANSI_WARNINGS ON;
SET ARITHABORT ON;
SET CONCAT_NULL_YIELDS_NULL ON;
SET NOCOUNT OFF;
SET NUMERIC_ROUNDABORT OFF;
SET QUOTED_IDENTIFIER ON;
SET XACT_ABORT ON;
DECLARE @batch_size_of_update int;
DECLARE @BusinessEntityID int;
DECLARE @rows_updated int;
DECLARE @total_rows_updated int;
DECLARE @total_rows_to_update int;
SET @batch_size_of_update = 25000;
SET @total_rows_to_update = 1700000;
SET @BusinessEntityID = 0
SET @rows_updated = 0;
SET @total_rows_updated = 0;
PRINT 'Start time = ' + CONVERT(varchar(30), GETDATE(), 0) + '.';
WHILE 1 = 1
BEGIN
UPDATE p
SET [FirstName] = 'Help',
[LastName] = 'Help2'
FROM Person.Person p
INNER JOIN (
SELECT TOP (@batch_size_of_update) BusinessEntityID
FROM Person.Person
WHERE BusinessEntityID > @BusinessEntityID
ORDER BY BusinessEntityID
) AS p_to_del ON p_to_del.BusinessEntityID = p.BusinessEntityID
WHERE p.BusinessEntityID > @BusinessEntityID
SET @rows_updated = @@ROWCOUNT;
IF @rows_updated = 0
BREAK;
SET @total_rows_updated = @total_rows_updated + @rows_updated;
PRINT 'Rows updated so far = ' + CAST(@total_rows_updated AS varchar(10)) + '.';
IF @total_rows_updated >= @total_rows_to_update
BREAK;
IF @batch_size_of_update > 10000 AND @batch_size_of_update <= 30000
WAITFOR DELAY '00:00:00.100';
ELSE
WAITFOR DELAY '00:00:00.250';
END --WHILE
CHECKPOINT;
PRINT '';
PRINT 'Total rows updated = ' + CAST(@total_rows_updated AS varchar(10));
PRINT 'End time = ' + CONVERT(varchar(30), GETDATE(), 0) + '.';
Hi Scott,
You are a genius.
Much apprciated. Where are you based?
Thanks.
Hi Scott,
Sorry to bother you again. I am struggling to understand this statement - IF @batch_size_of_update > 10000 AND @batch_size_of_update <= 30000
What does this mean. Basically, I only want to update only 1.4M from 16M records in that table.
Thanks
You can drop that part, the whole IF and ELSE, if you want.
It's just testing the size of batch to see how long to delay before deleting the next batch. It's solely to give the SQL instance a little "breather" between deletes, just in case the server is busy. So of course it's optional. The delay is only 1/10 of a second, but that's still fairly long for a modern computer(!).