Hello. There are two main issues here:
-
You are mixing
NVARCHAR
data withVARCHAR
literals and variables.VARCHAR
is an 8-bit datatype that can only hold characters found on the code page associated with the collation being used. For string literals and variables, the collation being used is the database's default collation. Outside of UTF-8, which is the 8-bit Unicode encoding (and is only an option starting with SQL Server 2019), single byte code pages (the majority of them) can only contain, at most, 256 characters.NVARCHAR
is Unicode (which contains a potential 1,114,112 characters ; but only about 138k of them have been mapped to actual characters).
What all of that means is: when you store Unicode values in non-UTF8VARCHAR
variables, literals, or even columns, then you lose the original Unicode character IF that character does not exist on the code page being used by the collation of thatVARCHAR
destination. That is why yourascii(' ')
test returned 63. It's not that the original character and?
both share the same numeric value (as that is not possible), but the original character got converted into a "?", which is the default replacement character if a character can't be found in the destination code page.
Since you are dealing with Unicode /NVARCHAR
data, you need to do 3 things:- Use
NVARCHAR
for variables, notVARCHAR
- Prefix string literals with an upper-case "N"
- Use the
UNICODE()
function instead of theASCII()
function to get the numeric value of the first character in a string.
The following example should help:
SELECT NCHAR(2345), ASCII(NCHAR(2345)), 'ऩ', N'ऩ'; -- ऩ 63 ? ऩ SELECT ASCII('ऩ'), ASCII(N'ऩ'), UNICODE('ऩ'), UNICODE(N'ऩ'); -- 63 63 63 2345
- Use
-
When needing to deal with a special code point / character, use a binary collation (i.e. one ending in
_BIN2
). You use this in aCOLLATE
clause added to any predicate where you want to override what the collation would have been (or prevent an error if there was no initial collation). For example:DECLARE @Test NVARCHAR(500) = N'aaa bbb' + NCHAR(0xFFFE) + N'ccc'; SELECT @Test, REPLACE(@Test, NCHAR(0xFFFE), N'!'), REPLACE(@Test COLLATE Latin1_General_100_BIN2, NCHAR(0xFFFE), N'!'); -- aaa bbbccc aaa bbbccc aaa bbb!ccc
For more info on working with collations (especially in SQL Server), please visit:
Collations Info