Help return the right count from nvarchar with mixed values

Hi, my dear experts,

The objective is to get a count of test scores for any given school that is above or below certain threshold..
I have created this sample data for my issue. The challenge is on how to handle the data in scores column which is a nvarchar string.

USE [tempdb]
GO
create table TestScores
(Stu_id int primary key,
Scores nvarchar null, --should be nvarchar (20)
School nvarchar null
)
go

insert into TestScores
Select 1, '133', 'ABC' union
Select 2, '135', 'ABC' union
Select 3, '131', 'ABC' union
Select 4, '.', 'ABC' union
Select 5, '--', 'ABC' union
Select 6, '%', 'ABC' union
Select 7, 'a word', 'ABC' union
Select 8, '45', 'ABC' union
Select 9, '199', 'BBC' union
Select 10, '100', 'ABB'

Select * from TestScores

--Select count (scores), School from TestScores group by School

Select count (scores), School from TestScores
Where
--scores < 133
--Msg 245, Level 16, State 1, Line 1
--Conversion failed when converting the nvarchar value '.' to data type int.
--replace (scores, '.','') < 133
replace (replace (replace (replace (scores, '.','0'), '--','0'), '%','0'), 'a word','0') > 133
group by School

My last Select kind of 'worked', but I don't like it. Not just because it is ugly, but also it will break in the future if other un-account for text get into the field.

What is the right and elegant way to handle this? Controlling how data get in there is not an option.

Thanks!

You can take advantage of the fact that a CASE statement is always processed sequentially in SQL Server, which allows you to ignore nonnumeric values before doing a numeric comparison:

DECLARE @comparison_score tinyint
SET @comparison_score = 133

SELECT COUNT(ts.scores) AS ScoreCount, ts.School
FROM TestScores ts
WHERE
    1 = CASE
        WHEN Scores LIKE '%[^0-9]%' THEN 0
        WHEN Scores < @comparison_score THEN 1
        ELSE 0 END
GROUP BY ts.School
1 Like

Wow, Regular expression, that is cool!

By the way, is there a way to modify the Where Clause so that those schools that do not meet the comparison also included in the returned set with a count of 0?

Sadly SQL only provides a cut-down RegEx when using LIKE ... you have to do some ugly workarounds if you want a full-blown RgegEx

1 Like
SELECT SUM(CASE WHEN 1 = CASE WHEN Scores LIKE '%[^0-9]%' THEN 0 
                WHEN Scores < @comparison_score THEN 1 
                ELSE 0 END
           THEN 1 ELSE 0 END) AS ScoreCount, ts.School
FROM TestScores ts
GROUP BY ts.School
1 Like