SQLTeam.com | Weblogs | Forums

Data Quality checking of strings, delimited by [_]

sql2008

#1

Hello all!

I would like some help creating a data quality check query, I have a tag which is made up of five elements separated by [] so one check would be:
WHERE first_source_tag NOT LIKE '%[
]%[]%[]%[_]%'

However, I then want to check the different elements so I need to know how to tell SQL that I want to check say the fifth element to check that it is not a date in the future and not a date in the wrong format, does anyone know if and how this would be possible?

Thanks in advance!


#2

My initial reaction is that your design is wrong and should be corrected. If you have five different attributes of an entity, then there should be five attributes in your relation; five columns in your table. While it would be possible to parse the string using the delimiter and string functions (like charindex and substring), you'd be better off splitting the data at the application layer and letting SQL enforce data integrity for each element.


#3

You can use CROSS APPLYs to assign variable names to intermediate values to make the process of validation easier and "cleaner" to read. You didn't provide many details, but here's the general outline:

SELECT ...
FROM table_name
CROSS APPLY (
    SELECT RIGHT(first_source_tag, CHARINDEX('_', REVERSE(first_source_tag)) - 1) AS tag5
) AS assign_alias_names
CROSS APPLY (
    SELECT <check_tag5_for_valid_date>
) AS assign_alias_names2
...

#4

Thanks Stephen, I would love that but I work in an international company and I don't have database administration rights as that sits in another country. :frowning:


#5

Thanks Scott, I will have a look at this soon! :slight_smile: