Substring/Charindex Help...Splitting delimited column to four columns

Naheka · May 30, 2019, 2:36pm

First post so I hope I get this right.

I have a address column (Street, City, State, Zip) that is delimited by " > " (space,greater than, space).

I do not understand Substring/Charindex syntax but I was able to use some old code of mine to split out the Street, City and Zip. I just can't seem to figure out the syntax for the state value.

At this point, I'm able to get "State > Zip" together but not sure how to split it out just for State.

current code:
SUBSTRING(full_address,CHARINDEX(' > ',full_address,CHARINDEX(' > ',full_address)+1)+3,LEN(full_address)) as State

I would appreciate any help with this syntax and if anyone can point me to a resource that explains the syntax (ELI5), that would be even more so appreciated.

Note: DBAs will not likely allow functions so I resorted to SUBSTR and CHARINDEX

yosiasz · May 30, 2019, 4:44pm

create table #naheka(full_address  varchar(150))
insert into #naheka
select '1204 Big Boa Ave S > Novità > PR > 98118'


;WITH cteAddresses
AS
(
    SELECT *,
    CONVERT(XML,'<Names><name>'  
    + REPLACE(full_address,' > ', '</name><name>') + '</name></Names>') AS xmlname
      FROM #naheka
)

 SELECT *,      
 xmlname.value('/Names[1]/name[1]','varchar(100)') AS Address,    
 xmlname.value('/Names[1]/name[2]','varchar(100)') AS City,
 xmlname.value('/Names[1]/name[3]','varchar(100)') AS State,
 xmlname.value('/Names[1]/name[4]','varchar(100)') AS zipCode
 FROM cteAddresses


drop table #naheka

ScottPletcher · May 30, 2019, 5:29pm

create table #naheka(address varchar(150)) 
insert into #naheka values('1204 Big Boa Ave S > Novità > PR > 98118')
insert into #naheka values('9876 partial address > ABC_Name > XQ')
insert into #naheka values('8765 only 2 part address > ABC_Name')

SELECT
    address,
    CASE WHEN delim_1 > 0 THEN LEFT(address, delim_1 - 1)
         ELSE address END AS address_part1,
    CASE WHEN delim_2 > 0 THEN SUBSTRING(address, delim_1 + 3, delim_2 - delim_1 - 3)
         WHEN delim_1 > 0 THEN SUBSTRING(address, delim_1 + 3, 500)
         ELSE '' END AS address_part2,
    CASE WHEN delim_3 > 0 THEN SUBSTRING(address, delim_2 + 3, delim_3 - delim_2 - 3)
         WHEN delim_2 > 0 THEN SUBSTRING(address, delim_2 + 3, 500)
         ELSE '' END AS address_part3,
    CASE WHEN delim_3 > 0 THEN SUBSTRING(address, delim_3 + 3, 500)
         ELSE '' END AS address_part4
FROM #naheka
CROSS APPLY (
    SELECT CHARINDEX(' > ', address) AS delim_1
) AS ca1
CROSS APPLY (
    SELECT CASE WHEN delim_1 = 0 THEN 0 
        ELSE CHARINDEX(' > ', address, delim_1 + 3) END AS delim_2
) AS ca2
CROSS APPLY (
    SELECT CASE WHEN delim_2 = 0 THEN 0
        ELSE CHARINDEX(' > ', address, delim_2 + 3) END AS delim_3
) AS ca3

Naheka · May 30, 2019, 8:13pm

Excellent solutions! I was able to follow XML much easier than CHARINDEX and SUBSTR.

harishgg1 · May 31, 2019, 8:15am

hi

there is
STRING_SPLIT ( string , separator )

function from SQL Server 2016 or 2017

hope it helps

i love any feedback
thanks

yosiasz · May 31, 2019, 11:08pm

but if tomorrow a new > appears with a fifth data point it will go MIA

ScottPletcher · June 3, 2019, 5:17pm

I was trying to avoid the overhead of XML and multiple xml parses of data. If your data volume is low, you may not care about the overhead (at least for now).