SQLTeam.com | Weblogs | Forums

Why update statement take too much time?

I working on SQL server 2012 I face issue update statement take too much time
and it is very slow

my query is

 update t set t.Revision_ID = r.rec_id 
 FROM extractreports.dbo.TempROHS t
 INNER join Z2urlSystem.zsrc.Local_URL l (NOLOCK) ON isnull(t.ZURLSource1,'N/A')  = l.local_url
 INNER join Z2urlSystem.zsrc.Online_URL o (NOLOCK) ON ISNULL(t.OnlineSource1,'N/A') = o.url 
 INNER join Z2urlSystem.zsrc.Revision r (NOLOCK) ON l.rec_id = r.local_id AND o.Rec_ID = r.online_id
 WHERE  t.[Status] IS NULL

and execution plan as below :

sample data for test :

 CREATE TABLE [dbo].[TempROHS](
 [ID] [int] IDENTITY(1,1) NOT NULL,
 [ZURLSource1] [nvarchar](3000) NULL,
 [OnlineSource1] [nvarchar](3000) NULL,
 [Revision_ID] [bigint] NULL,
 [status] [nvarchar](300) NULL
 ) ON [PRIMARY]
    
 GO
 /****** Object:  Table [zsrc].[Local_URL]    Script Date: 10/13/2021 8:59:21 AM ******/
 SET ANSI_NULLS ON
 GO
 SET QUOTED_IDENTIFIER ON
 GO
 CREATE TABLE [dbo].[Local_URL](
 [rec_id] [bigint] IDENTITY(1,1) NOT FOR REPLICATION NOT NULL,
 [local_url] [nvarchar](2048) NOT NULL,
    
 ) ON [PRIMARY]
    
 GO
 /****** Object:  Table [zsrc].[Online_URL]    Script Date: 10/13/2021 8:59:21 AM ******/
 SET ANSI_NULLS ON
 GO
 SET QUOTED_IDENTIFIER ON
 GO
 CREATE TABLE [dbo].[Online_URL](
 [Rec_ID] [bigint] IDENTITY(1,1) NOT FOR REPLICATION NOT NULL,
 [URL] [nvarchar](500) NOT NULL,
    
  CONSTRAINT [PK_Rec_ID_OnlineURL] PRIMARY KEY CLUSTERED 
 (
 [Rec_ID] ASC
 )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
 ) ON [PRIMARY]
    
 GO
 /****** Object:  Table [zsrc].[Revision]    Script Date: 10/13/2021 8:59:21 AM ******/
 SET ANSI_NULLS ON
 GO
 SET QUOTED_IDENTIFIER ON
 GO
 CREATE TABLE [dbo].[Revision](
 [rec_id] [bigint] IDENTITY(1,1) NOT FOR REPLICATION NOT NULL,
 [online_id] [bigint] NOT NULL,
 [local_id] [bigint] NOT NULL,
  CONSTRAINT [PK_Rec_ID] PRIMARY KEY CLUSTERED 
 (
 [rec_id] ASC
 )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
 ) ON [PRIMARY]
    
 GO
 SET IDENTITY_INSERT [dbo].[TempROHS] ON 
    
 INSERT [dbo].[TempROHS] ([ID], [ZURLSource1], [OnlineSource1], [Revision_ID], [status]) VALUES (1, N'https://source.z2data.com/2017/3/30/11/41/56/874/706167429/Compliance.pdf', N'http://2t70un3m1d9z1kztamkdrd38.wpengine.netdna-cdn.com/wp-content/uploads/2016/03/5-wireTouch.pdf', NULL, NULL)
 INSERT [dbo].[TempROHS] ([ID], [ZURLSource1], [OnlineSource1], [Revision_ID], [status]) VALUES (2, N'https://source.z2data.com/2017/11/19/11/26/36/82/1315668739/rohsCoC.pdf',N'http://2t70un3m1d9z1kztamkdrd38.wpengine.netdna-cdn.com/wp-content/uploads/2016/02/accessories-hardware-1.pdf', NULL, NULL)
 INSERT [dbo].[TempROHS] ([ID], [ZURLSource1], [OnlineSource1], [Revision_ID], [status]) VALUES (3, N'https://source.z2data.com/2018/10/24/1/8/18/621/542663177/Statement_of_Compliance_PN_YACT20MJ43JNV00100_181024_160527.pdf',N'http://ac-dc.power.com/design-support/product-documents/data-sheets/linkzero-lp-datasheet', NULL, NULL)
 INSERT [dbo].[TempROHS] ([ID], [ZURLSource1], [OnlineSource1], [Revision_ID], [status]) VALUES (4, N'https://source.z2data.com/2017/6/30/8/50/44/270/1593045455/RoHS.ECC36DTAH.pdf', N'http://ark.intel.com/products/78945/Intel-Pentium-Processor-3558U-2M-Cache-1_70-GHz', NULL, NULL)
 INSERT [dbo].[TempROHS] ([ID], [ZURLSource1], [OnlineSource1], [Revision_ID], [status]) VALUES (5, N'https://source.z2data.com/2017/8/20/10/11/49/132/1114175001/t1116p.pdf', N'http://assmann.us/specs/A-DS25-HOOD-WP.pdf', NULL, NULL)
 SET IDENTITY_INSERT [dbo].[TempROHS] OFF
 SET IDENTITY_INSERT [dbo].[Local_URL] ON 
    
 INSERT [dbo].[Local_URL] ([rec_id], [local_url]) VALUES (2, N'https://source.z2data.com/2017/3/30/11/41/56/874/706167429/Compliance.pdf')
 INSERT [dbo].[Local_URL] ([rec_id], [local_url]) VALUES (3, N'https://source.z2data.com/2017/11/19/11/26/36/82/1315668739/rohsCoC.pdf')
 INSERT [dbo].[Local_URL] ([rec_id], [local_url]) VALUES (4, N'https://source.z2data.com/2018/10/24/1/8/18/621/542663177/Statement_of_Compliance_PN_YACT20MJ43JNV00100_181024_160527.pdf')
 INSERT [dbo].[Local_URL] ([rec_id], [local_url]) VALUES (5, N'https://source.z2data.com/2017/6/30/8/50/44/270/1593045455/RoHS.ECC36DTAH.pdf')
 INSERT [dbo].[Local_URL] ([rec_id], [local_url]) VALUES (6, N'https://source.z2data.com/2017/8/20/10/11/49/132/1114175001/t1116p.pdf')
 SET IDENTITY_INSERT [dbo].[Local_URL] OFF
 SET IDENTITY_INSERT [dbo].[Online_URL] ON 
    
 INSERT [dbo].[Online_URL] ([Rec_ID], [URL]) VALUES (2, N'http://2t70un3m1d9z1kztamkdrd38.wpengine.netdna-cdn.com/wp-content/uploads/2016/03/5-wireTouch.pdf')
 INSERT [dbo].[Online_URL] ([Rec_ID], [URL]) VALUES (3, N'http://2t70un3m1d9z1kztamkdrd38.wpengine.netdna-cdn.com/wp-content/uploads/2016/02/accessories-hardware-1.pdf')
 INSERT [dbo].[Online_URL] ([Rec_ID], [URL]) VALUES (4, N'http://ac-dc.power.com/design-support/product-documents/data-sheets/linkzero-lp-datasheet')
 INSERT [dbo].[Online_URL] ([Rec_ID], [URL]) VALUES (5, N'http://ark.intel.com/products/78945/Intel-Pentium-Processor-3558U-2M-Cache-1_70-GHz')
 INSERT [dbo].[Online_URL] ([Rec_ID], [URL]) VALUES (6, N'http://assmann.us/specs/A-DS25-HOOD-WP.pdf')
 SET IDENTITY_INSERT [dbo].[Online_URL] OFF
 SET IDENTITY_INSERT [dbo].[Revision] ON 
    
 INSERT [dbo].[Revision] ([rec_id], [online_id], [local_id]) VALUES (1, 2, 2)
 INSERT [dbo].[Revision] ([rec_id], [online_id], [local_id]) VALUES (2, 3, 3)
 INSERT [dbo].[Revision] ([rec_id], [online_id], [local_id]) VALUES (3, 4, 4)
 INSERT [dbo].[Revision] ([rec_id], [online_id], [local_id]) VALUES (4, 5, 5)
 INSERT [dbo].[Revision] ([rec_id], [online_id], [local_id]) VALUES (5, 6, 6)
 SET IDENTITY_INSERT [dbo].[Revision] OFF
 /****** Object:  Index [Key2]    Script Date: 10/13/2021 8:59:24 AM ******/
 ALTER TABLE [dbo].[Local_URL] ADD  CONSTRAINT [Key2] PRIMARY KEY NONCLUSTERED 
 (
 [rec_id] ASC
 )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
 GO
    
 update t set t.Revision_ID = r.rec_id 
  FROM dbo.TempROHS t
  INNER join dbo.Local_URL l (NOLOCK) ON t.ZURLSource1  = l.local_url
  INNER join dbo.Online_URL o (NOLOCK) ON t.OnlineSource1 = o.url 
  INNER join dbo.Revision r (NOLOCK) ON l.rec_id = r.local_id AND o.Rec_ID = r.online_id
  WHERE  t.[Status] IS NULL

The sample data, schema and UPDATE statement, while graciously appreciated, do not generate the same query plan or query statistics as the one you posted online. It also doesn't use the same UPDATE statement you're using in the original query, and the sample is missing several indexes that are listed in the plan. This will make it more difficult to determine the issue.

Another concern is "too much time" and "very slow", that's not particularly informative. Do you have actual numbers/durations on how long the query runs? You can also help by setting SET STATISTICS IO, TIME ON before you run the UPDATE statement and pasting those results.

The things that pop out as points to address are:

  1. ISNULL() in a JOIN condition is going to force a scan on a table or index, however...

  2. Since your JOINs are happening on URL columns that are declared NVARCHAR(3000), an index isn't likely to help (or even be created anyway)...

  3. And NVARCHAR(3000) is inconsistent with the URL columns you're JOINing to in the other tables (NVARCHAR(2048) and NVARCHAR(500)...

  4. Since you already have one defined as NVARCHAR(500), that should be your max size for all URL columns.

  5. The Local_URL table in Plan Explorer is showing a Table Scan and has the highest overall cost within the plan. This is due to the aforementioned ISNULL in the JOIN condition.

I don't have any particular suggestions on how to improve your situation, assuming you cannot change the table schema/design?

thank you for reply
can you write these statement below by joining using id
without join by url text
are there are any way to do that

 update t set t.Revision_ID = r.rec_id 
 FROM extractreports.dbo.TempROHS t
 INNER join Z2urlSystem.zsrc.Local_URL l (NOLOCK) ON t.ZURLSource1 = l.local_url
 INNER join Z2urlSystem.zsrc.Online_URL o (NOLOCK) ON t.OnlineSource1 = o.url 
 INNER join Z2urlSystem.zsrc.Revision r (NOLOCK) ON l.rec_id = r.local_id AND o.Rec_ID = r.online_id
 WHERE  t.[Status] IS NULL

Not with the current table schema. That's why I mentioned whether you can change it or not. If you can't then there's nothing else I can think of. Adding or modifying some indexes may help a little bit, but again it depends on what you mean by "very slow".

i remove is null as above it solve issue
support i need to update 5 miilion
the statement above will take too much time because i join by url
so i need to join by id but i don't know how
are there are any way convert text to id to be fast when update
I mean very slow because update statement take 6 hour
to update 68 k

I'm struggling to understand your reply:

i remove is null as above it solve issue

That suggests you don't need any more help. Is the problem (slow performance) actually solved by removing the ISNULL?

support i need to update 5 miilion
...
I mean very slow because update statement take 6 hour to update 68 k

Do you need to update 68 thousand rows, or 5 million? This is very confusing.

the statement above will take too much time because i join by url
so i need to join by id but i don't know how
are there are any way convert text to id to be fast when update

The way that TempROHS table is designed, no, you can't JOIN by ID. That's why I'm asking if you can change the schema or not. I apologize if I wasn't clear about that. If the table schemas CANNOT be modified, then I don't have any more guidance to give you, you will have to accept the current poor performance.

If you only need to run this UPDATE once, I'd say just run it. If you have to update this daily, and the performance isn't acceptable, then you need to redesign these tables. (Frankly, throw out the current table designs and use new ones)

URLs do not allow unicode chars (you have to use a % to escape non-standard chars in a URL). So change the URLs to varchar (rather than nvarchar) and choose a shorter, more reasonable length.

You really should encode the URLs everywhere, that is, use a number/id to represent the URL so you don't have to repeat URL strings in many tables (gak!).

But, encoded or not, you also need to change the clustering on the tables to match the joins/lookups. So, cluster the tables as follows:
Local_URL: ( local_url ) or ( local_url_id ) --if URL is encoded
Online_URL: ( online_url ) or ( online_url_id ) --if URL is encoded
Revision: ( local_id, online_id ) --rather than identity!!

It is a stupid and extremely damaging myth that tables should (almost) always be clustered on identity. That's just 100% wrong.