SQLTeam.com | Weblogs | Forums

Data as XML


I am trying to write a process to parse out multiple values in a field to build another table and get this into some semblance of normal form. The problem I am having is that it appears this particular column contains free form data and there is all sorts of stuff entered. As I try to convert to xml to aid with my process, I get an error saying "Illegal qualified name character', but I cannot for the life if me find it in the data. I am taking into account the normal stuff, gt, lt, amp and so on, and also checked for ellipsis and em dashes and a ton others, but I still can't figure out what is causing this.

I did some research and saw to try CMDATA, but it doesn't seem to be helping either
CAST('<![cdata[' = my column data here + ']]>' as xml)

My questions are, am I using cdata wrong, is there a way to find these seemingly invisible characters, or should I just work on another way? The idea was to parse out the column by char(13) + char(10) into multiple records so each "line" within the notes entered can have their own record that can tie back to the main record.


Is that the complete text of the error message? What I have usually seen is a more helpful error message such as "line 17, character 2, Illegal qualified name character".


Yes, it points to a line, its the cast line I had in my message, but it's impossible to tell what the character is as I have no idea what record has it.


Without looking at your code/data it is hard to tell. One thing you should do is to use CDATA instead of cdata. SQL XML is case-sensitive (even if your SQL Server has case-insensitive collation). Compare these two:

SELECT CAST('<![cdata[' + 'abcd' + ']]>' as xml) -- generates error
SELECT CAST('<![CDATA[' + 'abcd' + ']]>' as xml) -- no error

If the error is in fact in the data stored in the column, and if you can narrow it down to a specific row, you can cast the data to VARBINARY to examine the invisible characters if any that are causing the problem. However, unless you have really strange data, usually that should not happen.