SQLTeam.com | Weblogs | Forums

Loading 300 millions records from staging to Prod

Greeting to you all. I need help on how to load records of 300 million from staging table to production table. Currently, I am using Select into or Insert into production table. It takes me about 2 hours to load in Azure SQL database. I am think this is too much. I don't have any indexes in any of the tables. I am using SSMS and not SSIS.

Could you please help me how to load this table as fast as possible.

IF OBJECT_ID('tempdb..#DataToLoad') IS NOT NULL
DROP TABLE #DataToLoad
CREATE Table #DataToLoad
(
DataId int identity primary key,
OrganisationName nvarchar(50),
Diagnosis_code nvarchar(50)
)

Declare @Id int
Set @Id = 1

While @Id <= 300000000
Begin
Insert Into #DataToLoad values ('OrganisationName - ' + CAST(@Id as nvarchar(10)),
'DiagnosisCode - ' + CAST(@Id as nvarchar(10)) + ' ICD_10')
Print @Id
Set @Id = @Id + 1
End

Above is my test data and any help will be appreciated.

Thanking you in advance.

R

there are 3rd party software designed SPECIFICALLY for this purpose
( open source or paid )

IN BUILT sql server stuff ..which you can cleverly tweak

or chunks at a time ..

A joke here ( quantum computer .. rent it ... within nano second ) .. :wink:

you can google search ..

i dont know ( of the top of my head ) .. please excuse me

The Primary Key designation makes a Clustered Index that you must observe.

Since you've apparently done it at least twice (you did say "I am using Select into or Insert into production table"), how many more times will you need to do this and how often?

Also, are the source and target tables on the same instance? If not, where are they? For example, is the source table in an on-premise instance and the target table is in the cloud?

You should also encode the OrganisationName and Diagnosis_code, that is, use numbers to represent the names so that you don't have to store the same text over and over. Then compress those two newly-created lookup tables.

I guess there's an off chance that the DataId is the Id for the OrgName. Doesn't seem likely here, but, if so, just encode the Diagnosis_code into another table.

I follow a rule most people don't like to wait on: design a table before loading it.

It really isn't clear what you are trying to accomplish - or where the problem exists. Normally, ICD codes will be available in a lookup table and defined by CMS.

There shouldn't ever be a difference in diagnosis codes across organizations.

And the code you provided isn't looking to a staging table to load to a final table. I think we need more information on what it is you are trying to accomplish here.