Hi Guys
Good day to you all and I hope you are enjoying the weekend.
Please, I need your help on this topic. I have over 300 million records in Azure blob storage that I want to load into an Azure SQL database. We don't have Azure Data Factory just yet.
I downloaded the CSV file from blob storage to a local folder, then used SSIS to load the data from there into a staging table, which takes over 10 minutes.
I then used T-SQL to load the data from the staging table into the production table after transformation. There is no index on either table (staging or destination). However, the insert takes forever, about an hour to load the data into the prod table. I have tried batches of 10,000 rows, but still no better performance.
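For context, the batched load is roughly of this shape (a minimal sketch; table, column, and index names here are made up, and it assumes the staging table has a sequential `ID` column to key the batches on):

```sql
-- Hypothetical batched INSERT...SELECT from staging to production,
-- walking the staging table in fixed-size ID ranges.
DECLARE @BatchSize bigint = 10000;
DECLARE @MinID bigint = 0, @MaxID bigint;

SELECT @MaxID = MAX(ID) FROM dbo.StagingTable;

WHILE @MinID < @MaxID
BEGIN
    INSERT INTO dbo.ProdTable (ID, Col1, Col2)
    SELECT ID, Col1, Col2
    FROM dbo.StagingTable
    WHERE ID > @MinID AND ID <= @MinID + @BatchSize;

    SET @MinID = @MinID + @BatchSize;
END;
```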
Can anyone advise on the best strategy to load this amount of data from blob to the staging and destination tables with better performance? I think over an hour is too much.
When you export data from SQL Server, there is an option for "native format". This format matches how SQL Server stores the data internally, so it's much faster to load, since the data is already in the proper binary format.
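For example, after exporting with the `bcp` utility's native flag (`-n`), the file can be reloaded like this (a sketch only; the path and table name are hypothetical):

```sql
-- Hypothetical load of a native-format file produced by bcp -n.
-- TABLOCK can enable minimal logging when the target is an empty heap.
BULK INSERT dbo.StagingTable
FROM 'C:\data\export.dat'
WITH (
    DATAFILETYPE = 'native',
    TABLOCK,
    BATCHSIZE = 100000
);
```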
IF the existing table is clustered AND the new table is clustered the same way -- which presumably is true here -- then create the clustered index before loading the table. Wait until after the table is loaded to create / recreate any nonclustered index(es).
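In T-SQL terms, the ordering above looks like this (a sketch; table, column, and index names are hypothetical):

```sql
-- 1. Create the clustered index BEFORE the load,
--    so rows go straight into their final order.
CREATE CLUSTERED INDEX CIX_ProdTable_ID
    ON dbo.ProdTable (ID);

-- 2. ... run the load into dbo.ProdTable here ...

-- 3. Create any nonclustered indexes AFTER the load,
--    which is cheaper than maintaining them row by row.
CREATE NONCLUSTERED INDEX IX_ProdTable_Col1
    ON dbo.ProdTable (Col1);
```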