SQLTeam.com | Weblogs | Forums

Double quote issue in the source file in .CSV file


#1

Hi Guys,

Here is the one Data from my source is erroring out.
"E:60 Pic: 20 years and 11 seconds" - The Travis Loy Story"",

The source file is in .csv format and In SSIS Flat file source I am using TEXT QUALIFIER = ". He is removing first " in the source file but the second one. If I remove second double quote (")
from the file, it is working. Any advice how can I fix this issue?


#2

The data is corrupt. I'd push the problem back up to the source. as it is, how can you tell what it should be:

"E:60 Pic: 20 years and 11 seconds - The Travis Loy Story",

Is that what it should be? but then, in the general case, how would you know? If the data is:

"Beef ribs boudin spare ribs turkey." Swine tongue t-bone"",

should that be:

"Beef ribs boudin spare ribs turkey. Swine tongue t-bone",

or

"Beef ribs boudin spare ribs turkey.", "Swine tongue t-bone ",

Basically, you need to have a firm contract with the source data provider and hold 'em to it. Otherwise you'll be fixing files manually forever


#3

exactly. very arbitrary ==> you will run into another issue later on. another option is eliminate all double quotes because based on this http://www.espn.com/video/clip?id=13889431 there is no quotation mark


#4

Of course, if another row had an embedded comma, you'd need to use text qualifiers

another option is to use different qualifiers. the bar

|

for example


#5

I rarely see a CSV that is fully implemented, and I'm not sure that SQL's import tools honour that specification anyway?

This is my understanding of how a "complex" CSV file should be written

CSV Vaule    Col1 Col2   Col3
A,B,C         A   B      C
A,"B,B,B",C   A   B,B,B  C
A,"B,"",B",C  A   B,",B  C

It is OK to have every column quoted, if that is easier to produce, the quoting doesn't have to be constrained to only values that need it :slight_smile:

But, either way, CSV can't handle line breaks (well ... I can't see any reason why it should NOT - the Quote-Wrapping can handle that, but I've never come across any CSV import that didn't consider end-of-line to also be end-of-row