SQLTeam.com | Weblogs | Forums

CSV load


#1

HI,

I am having a fairly large issue in trying to load from a csv file. The normal load process isn't suitable as the format of the csv file may change. The change is that the columns may not always exist. (Please advise if I have missed something in the normal load solution).
So to get around this problem I reverted to using a script. The scripting is fine, but that leads to two further issues.

  1. There is no 64 bit method of reading from the csv (Please correct me if I am wrong). The solution must work in the design tool as well as on the destination.
  2. Reading the data causes out of memory exception. This occurs whilst testing through SSDT, but also on the destination server (although had to switch to 32 bit to get around the second issue)

The file isn't particularly large 1Gb approx., a few million rows.

Does anyone have a solution that would allow me to read a csv file that loads into a staging table (format of staging subtly different to the source column)?


#2

is there a way to identify the format of the file (i.e. config file or headers)? If so, I created a stage table dynamically based on the format of the file. Then run bulk insert statement into state table. File needs to be local to the server though and use unc for paths


#3

if csv file changes columns another approach is to use powershell. based on the number of columns you can create a new staging table dynamically that matches the columns of the csv file and then ingest file content into newly created staging table. etc...

Here is one way

look at things like


#4

Another option is to create 1 wide column in a staging table and parse and use split_string or Jeff Moden's string spitter to parse the columns. If the CSV has column headings you could compare the headings to the destination table's columns and add if any new columns.