I would like some help creating a data quality check query, I have a tag which is made up of five elements separated by [] so one check would be:
WHERE first_source_tag NOT LIKE '%[]%[]%[]%[_]%'
However, I then want to check the different elements so I need to know how to tell SQL that I want to check say the fifth element to check that it is not a date in the future and not a date in the wrong format, does anyone know if and how this would be possible?
My initial reaction is that your design is wrong and should be corrected. If you have five different attributes of an entity, then there should be five attributes in your relation; five columns in your table. While it would be possible to parse the string using the delimiter and string functions (like charindex and substring), you'd be better off splitting the data at the application layer and letting SQL enforce data integrity for each element.
You can use CROSS APPLYs to assign variable names to intermediate values to make the process of validation easier and "cleaner" to read. You didn't provide many details, but here's the general outline:
SELECT ...
FROM table_name
CROSS APPLY (
SELECT RIGHT(first_source_tag, CHARINDEX('_', REVERSE(first_source_tag)) - 1) AS tag5
) AS assign_alias_names
CROSS APPLY (
SELECT <check_tag5_for_valid_date>
) AS assign_alias_names2
...
Thanks Stephen, I would love that but I work in an international company and I don't have database administration rights as that sits in another country.
use sqlteam
go
create table #tags(first_source_tag nvarchar(150))
insert into #tags
select '2019-01-01[_]2019-01[_]I_am_not_a_date[_]2020-01-01[_]wack_data_design'
SELECT Split.a.value('.', 'VARCHAR(100)') AS Data ,
case
when ISDATE(Split.a.value('.', 'VARCHAR(100)')) = 1 then
case
when cast(Split.a.value('.', 'VARCHAR(100)') as date) > getdate() then 1
else 0
end
else 0
end as isFutureDate
FROM
(
SELECT CAST ('<M>' + REPLACE(first_source_tag, '[_]', '</M><M>') + '</M>' AS XML) AS Data
FROM #tags
) AS A CROSS APPLY Data.nodes ('/M') AS Split(a);
drop table #tags