Extract a substring

I'm stumbling all over this one; thought it time to just ask for some help. :slight_smile:

I need to extract the substring that begins with the final "." followed by an "A" and ends with the character before the final "-"

For example, given the string "1987-AB.99.A185a-15", I need to extract the "A185a"

Assistance greatly appreciated!


;WITH cte_data AS (
    SELECT '1987-AB.99.A185a-15' AS string
)
SELECT *, LEFT(string2, string2_length) AS result
FROM cte_data
CROSS APPLY ( SELECT RIGHT(string, CHARINDEX('A.', REVERSE(string))) AS string2 ) AS ca1
CROSS APPLY ( SELECT CHARINDEX('-', string2) - 1 AS string2_length ) AS ca2

Hello! (I'm Italian too :smile:)

I think you want to find all occurrences and not just one.
For example, if your string started with the character '.':

".1987-AB.99.A185a-15"

then I think you want to get two strings:

  1. "1987"
  2. "A185a"

and so on in all other cases.

I propose, then, this solution:

declare @str varchar(50) = '1987-AB.99.A185a-15'
;with cte as
(	
	select
		row_number() over(order by @str) as rn
		, value as v from string_split(@str, '.')
)
select
	left(v, charindex('-', v)-1) as s
from 
	cte
where
	rn > 1 and
	charindex('-', v) > 1

Try changing the @str string yourself now to see the effect.
I hope it works!

P.S.
I forgot!!!
It is now easy to also include the condition that after the "." there must be the letter "A".
In fact, it is enough to add the condition to WHERE:

and left(v,1)='A'

What about string?:
declare @str varchar(50) = '1987-AB.99-2.A185a-15' --added "-2" to the string

It falls into the more general case that I hypothesized in my answer, where the "A" after the "." is not taken into account.
But, as you can see, I added a P.S. to my response to take this into account.

@ScottPletcher, what about string? :slight_smile:
SELECT '.A1987-AB.99.A185a-15' AS string --I added ".A" to the beginning of the string.

I'm working with your replies. A couple of added details:

  1. I am (of course) working with a data table that includes the original string data. I have changed Scott's "WITH cte_data" code block accordingly

  2. The sought after pattern, '%.A%-%' should appear only once in each string in the data table; there are other controls in place at the time the strings are created that constrain the string contents

  3. Ultimately, I need a list of the unique substrings - in my current data, that comes to about 400 unique substrings out of 22,000 data rows.

After tweaking Scott's original code, I have seemingly exactly what I needed.

Thanks for the assistance!

hi

I hope this helps.

i have tried this in a different, simpler way.

create data script

drop table #data
create table #data(strng varchar(30) )

insert into #data select '1987-AB.99.A185a-15'
insert into #data select '1988-DC.100.A2000000b-17'
insert into #data select '1987-AB.99-2.A185a-15'
insert into #data select '.A1987-AB.99-2.A185a-15'

select     
   reverse(substring(reverse(strng),charindex('-',reverse(strng))+1,charindex('A.',reverse(strng))-3)) as strng 
from #data