INSERT into table from view hangs when using a cursor

I'm doing inserts from an enormous view (53 tables) into a table with no foreign keys.

If I do my inserts / lookup with a straight join, it takes a day to run. This is a data-load process, run only one time.

To speed things up a bit I am using a cursor (OH NO!)

At any rate, at the heart of the cursor loop is the following:

FETCH FROM #tmptable

WHILE @@FETCH_STATUS = 0
BEGIN
    SELECT ...search variables

    INSERT INTO ResultsTable (8 fieldlist...)
    SELECT DISTINCT 8 fieldlist...
    FROM myView
    WHERE 6 where clauses...

    FETCH NEXT FROM ...
END

This runs fine for a while and then seems to freeze. It's not truly frozen, but iterations that had been executing fine suddenly start taking many minutes each.

I am displaying the search criteria so I can tell what it's looking for when it hangs. If I run the insert above standalone with the same WHERE clause values, it takes 6 seconds. But inside the cursor it just stalls.

There seems to be plenty of space in tempdb.

I have run an execution plan and client statistics. Both are clean as a whistle with no recommendations.

So any ideas would be appreciated.

Oh and PS - I have tried the query without DISTINCT. No joy.

Thanks

I can't conceive of how a cursor would ever speed up something like this.

Odds are you need to ignore any existing view(s) and just build the specific view(s) you need for this task.


Not that it matters, but all 53 tables are needed. Also, I have a correct join version running as well, instead of the cursor.

I now believe the problem is in the FETCH NEXT. I can see the data in the results table, but not the next select for the new criteria, so it's stalling on the FETCH NEXT, which reads from a #temp table. I also believe our sandbox server is too small.

This appears to be a matter of 'batching' the inserts using a cursor...

I would use SSIS instead. SSIS would let me use the view and control the batch and commit sizes, to manage the transaction log and improve the overall load process. Once the query is parsed, it just becomes a matter of spooling the output to SSIS, and SSIS then spooling the data into the destination table.


If batches are needed, try processing ranges of clustering keys on the main/large tables. Again, I can't imagine a cursor could ever speed this up.
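
For example, something along these lines (just a sketch: MainTable, its clustering key MainID, the batch size, and col1 ... col8 are all placeholders, and it assumes the view exposes that key):

DECLARE @BatchStart int = 1, @BatchSize int = 100000, @MaxID int;

SELECT @MaxID = MAX(MainID) FROM MainTable;

WHILE @BatchStart <= @MaxID
BEGIN
    -- One set-based INSERT per clustering-key range, no row-by-row cursor.
    INSERT INTO ResultsTable (col1, col2, col3, col4, col5, col6, col7, col8)
    SELECT DISTINCT col1, col2, col3, col4, col5, col6, col7, col8
    FROM myView
    WHERE MainID BETWEEN @BatchStart AND @BatchStart + @BatchSize - 1;

    SET @BatchStart += @BatchSize;
END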

For a lot of logging activity, be sure to pre-allocate enough log space rather than having the log dynamically expand, which is excruciatingly slow.
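
For example (a sketch; the database name, logical log file name, and size are placeholders for your environment):

-- Grow the log to its expected working size in one step up front,
-- instead of letting autogrow extend it in small chunks mid-load.
ALTER DATABASE MyDatabase
MODIFY FILE (NAME = MyDatabase_log, SIZE = 50GB);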


Wouldn't have thought of that. The lookup on the view is easier to manage with the cursor.

So the bottom line is that I moved back to table joins - no more cursor. I have also been able to avoid the bulk load altogether by pushing the work into the other procedures that add/remove data rows. So the whole thing is much more manageable now.

But thanks everyone for the help.

Great idea, thanks!

There may be smarter ways, but I'm an old dog ...

Given the DISTINCT, I would be inclined to try SELECT DISTINCT ... INTO #TEMP and add an IDENTITY column.

I would then

INSERT INTO ResultsTable (8 fieldlist...)
SELECT 8 fieldlist...
FROM #TEMP
WHERE ID BETWEEN @Offset + 1 AND @Offset + 10000  -- the +1 keeps consecutive batches from overlapping at the boundary

and then LOOP around that, incrementing @Offset by 10000 (or whatever batch size makes sense), as in the sketch below.
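
Roughly like this, as a sketch (col1 ... col8 stand in for the real 8-column field list; assumes SQL Server 2008+ for the inline DECLARE initializers):

-- Materialize the DISTINCT result once.
SELECT DISTINCT col1, col2, col3, col4, col5, col6, col7, col8
INTO #TEMP
FROM myView;

-- Add the sequential key afterwards, so the IDENTITY value
-- is not part of the DISTINCT comparison.
ALTER TABLE #TEMP ADD ID int IDENTITY(1, 1) NOT NULL;

DECLARE @Offset int = 0, @BatchSize int = 10000, @Rows int = 1;

WHILE @Rows > 0
BEGIN
    INSERT INTO ResultsTable (col1, col2, col3, col4, col5, col6, col7, col8)
    SELECT col1, col2, col3, col4, col5, col6, col7, col8
    FROM #TEMP
    WHERE ID > @Offset AND ID <= @Offset + @BatchSize;

    SET @Rows = @@ROWCOUNT;
    SET @Offset += @BatchSize;
END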

However, it bothers me that, for huge numbers of rows, creating them all INTO a #TEMP table may itself be a mammoth task, and so: no gain??

If ResultsTable is empty to start with (or perhaps even if it isn't), maybe it even makes sense to output the VIEW query to a BCP file (using NATIVE data mode), pre-sorted by the clustered index, and then import it from the BCP file into ResultsTable (using the clustered-index ORDER hint).
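
From the command prompt, that might look like this (a sketch; the server, database, file name, and sort column col1 are placeholders, and -T assumes a trusted connection):

rem Export the view's rows in native format, pre-sorted by the clustered key.
bcp "SELECT col1, col2, col3, col4, col5, col6, col7, col8 FROM MyDb.dbo.myView ORDER BY col1" queryout load.dat -n -S MyServer -T

rem Import with the ORDER hint, telling the engine the file already matches the clustered index.
bcp MyDb.dbo.ResultsTable in load.dat -n -h "ORDER(col1 ASC)" -S MyServer -T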

Other thought:

DROP all secondary indexes (i.e. NOT the clustered index) before the INSERT and then recreate them afterwards (bonus: the recreated indexes will have no fragmentation and fresh stats :) ).
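
For example (the index and column names are hypothetical):

-- Drop the secondary (nonclustered) indexes before the big INSERT...
DROP INDEX IX_ResultsTable_col2 ON ResultsTable;

-- ...then recreate them once the load is done:
-- zero fragmentation and fresh statistics.
CREATE NONCLUSTERED INDEX IX_ResultsTable_col2 ON ResultsTable (col2);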