SQLTeam.com | Weblogs | Forums

Unable to load flat file: unicode and non-unicode

sql2008r2

#1

Hello!

I'm developing a SSIS package to load a flat file into a table, but a warning appears on varchar fields:

Column "xyz" cannot convert between unicode and non-unicode strings data types.

The database is SQL_Latin1_General_CP1_CI_AI and the file was UTF-8, so that message makes sense.

Instead of adding filters, derived columns or whatever in the SSIS project, I could change the program that creates the flat file, so it is ANSI iso-8859-1 or Latin1 now, but the error message has not gone. I deleted the project and started a new one without success.

What am I missing?

Thanks a lot...


#2

The Code Page needs to be '1252 (ANSI - Latin 1) for SSIS. If the code page for the file is not setup that way - then you need to add a derived columns transformation to convert to ANSI.


#3

Thanks for the quick reply.

I've checked that it says that CodePage is 1252 in the Propierties for the flat file conection manager.

I've also checked that the CodePage has the same value for all Input Columns in the Advanced Editor for OLE DB Destination.

The DefaultCodePage is also 1252 for the database. It doesn't matter if AlwaysUseDefaultCodePage is true or false. BTW, those columns in the table are of nvarchar type.

This is the reason why I'm confused.

Any other hint?


#4

UPDATE:

I changed the nvarchar to varchar datatype in the destination table and this error disappeared.

Then, when the flat file has UTF-8 encoding, why I couldn't load it to nvarchar fields as this datatype is for UNICODE texts? I guess that there was another config in the middle of the flow that was bothering. Which one?

Thanks again...


#5

I was going to suggest making the change in the table - but I see you already did that. As to why the UTF file wouldn't load - I would guess it has to do with the code page that was selected not matching the code page for the table/columns.


#6

It is supposed that NVARCHAR type is for Unicode, but as the database was still defined as Latin1, I'd expect a datatype bypass and store them in UTF-8 like in the source flat file.

In other words, how is stored an accented word like "América" in a NVARCHAR field if the source file is Latin1? It is automatically mapped to UTF? It seems it's not...