Help return the right count from nvarchar with mixed values

Hommer · November 7, 2016, 4:04pm

Hi, my dear experts,

The objective is to get a count of test scores for any given school that is above or below certain threshold..
I have created this sample data for my issue. The challenge is on how to handle the data in scores column which is a nvarchar string.

USE [tempdb]
GO
create table TestScores
(Stu_id int primary key,
Scores nvarchar null, --should be nvarchar (20)
School nvarchar null
)
go

insert into TestScores
Select 1, '133', 'ABC' union
Select 2, '135', 'ABC' union
Select 3, '131', 'ABC' union
Select 4, '.', 'ABC' union
Select 5, '--', 'ABC' union
Select 6, '%', 'ABC' union
Select 7, 'a word', 'ABC' union
Select 8, '45', 'ABC' union
Select 9, '199', 'BBC' union
Select 10, '100', 'ABB'

Select * from TestScores

--Select count (scores), School from TestScores group by School

Select count (scores), School from TestScores
Where
--scores < 133
--Msg 245, Level 16, State 1, Line 1
--Conversion failed when converting the nvarchar value '.' to data type int.
--replace (scores, '.','') < 133
replace (replace (replace (replace (scores, '.','0'), '--','0'), '%','0'), 'a word','0') > 133
group by School

My last Select kind of 'worked', but I don't like it. Not just because it is ugly, but also it will break in the future if other un-account for text get into the field.

What is the right and elegant way to handle this? Controlling how data get in there is not an option.

Thanks!

ScottPletcher · November 7, 2016, 4:56pm

You can take advantage of the fact that a CASE statement is always processed sequentially in SQL Server, which allows you to ignore nonnumeric values before doing a numeric comparison:

DECLARE @comparison_score tinyint
SET @comparison_score = 133

SELECT COUNT(ts.scores) AS ScoreCount, ts.School
FROM TestScores ts
WHERE
    1 = CASE
        WHEN Scores LIKE '%[^0-9]%' THEN 0
        WHEN Scores < @comparison_score THEN 1
        ELSE 0 END
GROUP BY ts.School

Hommer · November 7, 2016, 5:13pm

Wow, Regular expression, that is cool!

By the way, is there a way to modify the Where Clause so that those schools that do not meet the comparison also included in the returned set with a count of 0?

Kristen · November 7, 2016, 5:21pm

Sadly SQL only provides a cut-down RegEx when using LIKE ... you have to do some ugly workarounds if you want a full-blown RgegEx

ScottPletcher · November 7, 2016, 5:56pm

SELECT SUM(CASE WHEN 1 = CASE WHEN Scores LIKE '%[^0-9]%' THEN 0 
                WHEN Scores < @comparison_score THEN 1 
                ELSE 0 END
           THEN 1 ELSE 0 END) AS ScoreCount, ts.School
FROM TestScores ts
GROUP BY ts.School