Hi experts,
I have a very large CSV file (100 MB) that I'm trying to import into a table, using the Import Wizard. The file is comma-delimited.
I have removed all of the double quotes that are visible, but there are still some that I only found by viewing the data in Notepad++. These stray " characters are causing the import to fail with "truncation errors".
Can I simply use Notepad++ to remove the double quotes and then save the file as .csv? Will it still be a comma-delimited file?
Yes, a real SSIS package might be better, but I'm not able to use SSIS here.
Thanks for any tips.
You could try that, but double quotes are usually used where the delimiter appears within the text. For example, you could have data like this:
1,2,3,"Col1,Col2",5
The quotes are there for a reason. There is an option in the Import Wizard to set the text qualifier; change it to ".
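If it helps to see why, here's a quick Python sketch (using the sample row above) showing what the text qualifier does during parsing, versus what happens if you just strip the quotes and split on commas:

```python
import csv
import io

row = '1,2,3,"Col1,Col2",5'

# With " honored as the text qualifier, the quoted field stays intact.
parsed = next(csv.reader(io.StringIO(row), delimiter=",", quotechar='"'))
print(parsed)  # ['1', '2', '3', 'Col1,Col2', '5'] -> 5 fields

# Stripping the quotes and splitting naively breaks the quoted field
# in two and shifts every field after it -- which is how you end up
# with too many columns and data landing where it doesn't fit.
print(row.replace('"', "").split(","))
# ['1', '2', '3', 'Col1', 'Col2', '5'] -> 6 fields
```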
As @mike01 pointed out, you cannot simply remove double quotes with a text editor if you have commas as part of the data.
When I run into this problem, I usually try these (in order):
- Request the data provider to change the column delimiter to something that does not exist in the data. In my case, pipe works out well.
- Use SSIS. It has (or had) its own problems where it would get confused by embedded commas; I think those have been fixed in recent versions. Since you are not able to use SSIS, this is a moot point.
- Use Excel to open the file and save it with a pipe (or other) delimiter. See here. Since this is a manual process, it wouldn't work for automated daily imports.
- Use a regex parser. See here for an example. I haven't had to resort to this, but I did investigate it a while ago, and it seemed like a viable option (see the sketch after this list).
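For the regex option, this is roughly the kind of pattern I had looked at, as a Python sketch. It assumes well-formed data with no escaped (doubled) quotes and no embedded newlines; a proper CSV parser handles those cases, so treat this as a fallback:

```python
import re

# Split only on commas that are followed by an even number of quote
# characters, i.e. commas that sit outside any quoted field.
SPLIT_ON_UNQUOTED_COMMAS = re.compile(r',(?=(?:[^"]*"[^"]*")*[^"]*$)')

def parse_line(line: str) -> list[str]:
    fields = SPLIT_ON_UNQUOTED_COMMAS.split(line)
    # Strip the surrounding text qualifiers afterwards.
    return [f.strip('"') for f in fields]

print(parse_line('1,2,3,"Col1,Col2",5'))
# ['1', '2', '3', 'Col1,Col2', '5']
```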
I think truncation errors are usually caused by destination table columns that are not wide enough, not by double quotes. Am I correct?
Now that I think about it, that is more likely than not. When using the Import/Export Wizard, there is an option to specify a text qualifier, and in a lot of cases that takes care of the embedded commas. Assuming that was done, then it is probably the column width not being sufficient.
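If you want to confirm that before re-running the wizard, here's a rough Python sketch that reports the longest value in each column, so you can compare against the destination column widths. The file name and encoding are assumptions; adjust them to match your file:

```python
import csv

# Scan the file once and track the maximum field length per column.
with open("data.csv", newline="", encoding="utf-8") as f:
    reader = csv.reader(f)
    header = next(reader)
    widths = [0] * len(header)
    for row in reader:
        for i, field in enumerate(row[:len(header)]):
            widths[i] = max(widths[i], len(field))

for name, width in zip(header, widths):
    print(f"{name}: max length {width}")
```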
If you are able to use the Import/Export Wizard, then you should be able to use SSDT. You don't have to have SSIS installed on a server to use the development environment to build and run packages.
If this is a process that needs to be automated, then you really need to follow up on some tool that will allow for that automation. SSIS is one tool; BCP is another, but BCP can also have issues parsing CSV files that are not well formatted.
If the expectation is to have someone manually load the data, then I see no reason why you could not use SSDT to build your own packages and simplify your daily/weekly/monthly processes.
Thanks, everyone. After much finagling, I was able to import the file into a table.
What I did:
- Used Notepad++ to save the .csv file as Unicode text
- In the Import Wizard, the default delimiter then became 'Tab delimited'
- Increased the length of 2 columns
Then the Wizard imported ALL 584,000 rows successfully.
This was a one-time effort; otherwise, yes, I would have developed an SSIS package.
Whew!
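For anyone who finds this thread later and wants to script that same conversion instead of doing it by hand in Notepad++, here's a minimal Python sketch of the equivalent steps: read the comma-delimited file with " as the text qualifier, then write it back out tab-delimited in UTF-16 (which is what the wizard picked up as Unicode / tab-delimited). The file names and encodings are assumptions; adjust them to match your data:

```python
import csv

# Read with the comma delimiter and " text qualifier, so embedded
# commas are handled correctly, then write tab-delimited UTF-16.
with open("input.csv", newline="", encoding="utf-8") as src, \
     open("output.txt", "w", newline="", encoding="utf-16") as dst:
    reader = csv.reader(src, delimiter=",", quotechar='"')
    writer = csv.writer(dst, delimiter="\t")
    writer.writerows(reader)
```

Note that any tab characters inside the data would still need a text qualifier on import, so this is only as safe as the data is clean.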