SQLTeam.com | Weblogs | Forums

Extract pattern from a string?



I am running SQL-Server 2012 and am with the following difficulty...

There is this query that returns me a string like:


I need to select the codes that can be matched with the regular expression [1-9][BCDEFHJKLNPRSGT][1-9] each one in a data row like this:


There is a way of doing it in a simple way?

PS: The string may be none, one or more sub-strings like number-letter-number.



I would use a Splitter Function (Google: "Jeff Moden string splitter") to split the string into rows using the "," delimiter, and then match the split values against your RegEx (using LIKE '[1-9][B-HJ-LNPR-T][1-9]' )


Here's the link to the article that Kristen recommended Googling for.
Heh... I can vouch for the guy that wrote it. :wink:


Thank you very much!

This function worked like a charm!

All that I needed to do was to write a WHILE loop that searches for the pattern in my string and isolate it in a separate var. Then I just ran the function against this var — and voilà! — I got my list!

To whose may interest here it is what I did:


SET @found = 0
SET @allcodes = ''

/* start selecting the contents from your table and loading into the @mystring var (input string) */
SELECT @mystring = from [YOURTABLE] where [YOUR CONDITION]

/* now it will scan the input string to find the codes /
WHILE (@found = 0)
pick the next code found using a custom Regex function /
SELECT @mycode = MyDatabase.dbo.regexObjMatch(@mystring,'[1-9][BCDEFHJKLNPRSGT][1-9]')
IF ((@mycode <> '') AND (@mycode IS NOT NULL))
add the code found to the @allcodes var */
SET @allcodes = @allcodes + @mycode + ','

		/* remove the code from input string */
		SET @mystring = RTRIM(REPLACE(@mystring,@mycode,''))

		/* remove comma at the end of the input string if any */
		WHILE (SUBSTRING(@mystring,LEN(@mystring),1) = ',')
			SET @mystring = SUBSTRING(@mystring,1,LEN(@mystring)-1)
	/* no more codes found */
	SET @found = 1


/* remove the comma at the end of @allcodes var */
IF (SUBSTRING(@allcodes,LEN(@allcodes),1) = ',')
SET @allcodes = SUBSTRING(@allcodes,1,LEN(@allcodes)-1)

/* run the function agains the @allcodes var */
select * from MyDatabase.dbo.DelimitedSplit8k (@allcodes,',')


While this would work for a single variable, it will scale poorly. What I mean is that if you had a large number of rows in a table for which you had to do this operation, it would be very slow. The method that @Kristen was suggesting would be much faster. The code would be more compact as well.


Well... that's exactly what I did (used the function).
The problem is that I have to extract the desired items from a bigger messy string first.

The SP that uses my poor code above just isolate the items or fragments from a looooong and MESSY string that contains all sort of crazy and confused product detailing. The guys that generates this string just append anything they want to the column and the coder (me) has to pull out the hair to extract human readable data from this! LoL!

In this case I had to extract these fragments (each one represents a car part damage) and turn them in URL links in the product page they belong to. Luckily, I don't have to loop along several rows of data and I always will have to deal with only one row. That messy string from where I have to extract the fragments comes in one single variable so it will work well and quickly.

I agree that it wouldn't be well scalable but I will think about this in the future IF I need to make a multiple record extraction (I doubt that it will happen for THIS case though).



You can us a CROSS JOIN .to fire a Function (such as a string splitter) at multiple rows. I haven't looked closely at your code, but that would be preferable to a loop (unless the loop is only doing a handful of iterations)


Cool, thanks Kristen, it seems to be a good idea. I am not that good in T-SQL (do mainly the basics) so I will need to take a deeper look on it!