I dont think pre-creating the index is helpful. You could try this:
Set database recovery model to "simple"
Backup the database
Create new table without clustered index
BCP out all data from old table based on a query, rather than straight table extract, using a query that orders by the column you intend to make the primary key
BCP in the data into the new table
Create clustered index
Drop old table Rename new table
Set database recovery model back to Full, if that is what you use
Backup database
BCP out all data from old table based on a query, rather than straight table extract, using a query that orders by the column you intend to make the primary key
BCP in the data into the new table
Create clustered index (We had an issue last time creating an index on this table with 46 million rows..wouldnt we see the same isseu if we create the index after loading the data)?
Also, is there any facility in SQL server which can move data from 1 table to another as fast as bcp or bulk_insert?(or can any of these 2 be used for that)
And in BCP, can I change the batch of 1000 to something bigger ..like 10k etc?
1000 rows successfully bulk-copied to host-file. Total received: 366000 1000 rows successfully bulk-copied to host-file. Total received: 367000
1000 rows successfully bulk-copied to host-file. Total received: 368000
1000 rows successfully bulk-copied to host-file. Total received: 369000
1000 rows successfully bulk-copied to host-file. Total received: 370000
1000 rows successfully bulk-copied to host-file. Total received: 371000
You can use Import/Export wizard. Right click on the database name in SSMS object explorer select Tasks -> Import (or export). It uses SSIS internally, and is quite fast - comparable to BCP. (That is based on my casual observations; I have not measured the performance to see if it is faster/slower).
I'm working on some other suggestions but I first want to state that there is absolutely positively no reason to set the database to the SIMPLE recovery model. It breaks the log chain and is totally unnecessary for bulk logging. You only need to set the recovery model to the BULK LOGGED recovery model with the understanding that if a failure occurs during the time that you're in the BULK LOGGED mode, you will not be able to recover to the point of failure (same for SIMPLE). You can get close if you make sure that you do a log file backup before switching to the BULK LOGGED recovery model.
The BCP method is fine but then you have to do a final true up to catch any rows that may have been Inserted, Updated, or Deleted. Not fun and could take hours on top of it all unless you write a temporary "it changed" trigger to capture row identifiers during to a table during the operations involved.
And, no... you can't change the batch size on BCP out even with the batch parameter. It will be ignored. I'm not sure why they did that but that's the way it is.
There are also some other caveats that folks might not be considering. If you create a table without the clustered index, BCP into it, and then create the clustered index, the table will cause the database to grow by the size of the table (if there's not already enough space) because the HEAP will be preserved until the Clustered Index is built and then the HEAP will be dropped leaving you with at least 150GB of unused space in the database and that will be on top of the fact that you're going to drop the original table leaving at least another 150GB of free space. While that's not necessarily a bad thing, it will seriously add to restore times unless you have instant file initialization turned on. Free space in the database won't, however, affect the duration or size of your backups because the actual free space isn't stored to the backup. It's just reallocated during the restore.
And, make sure that you DO use the BULK LOGGED recovery model so that you don't blow out your log file and the related tape backups.
I'll be back. In the meantime, I need to see the full CREATE TABLE statement along with the CREATE INDEX statements of every NCI and I need to know if any FKs are pointing at this table. I also need to know if you're running the STANDARD EDITION or the ENTERPRISE EDITION as well as the names and order of the columns that you want to use for the new CI.
Last but not least, I need to know the overall size of the database, how much free space in the database you currently have, how much free space you have on the disk, the size of the 5 largest tables in the database, and how many tables you have. Depending on the limits of those things, this could be super simple or it could be painful.