SQLTeam.com | Weblogs | Forums

I need to split street adress / number

tsql

#1

Hi
I have a table with street name and number in same col. I need to separate Street name and number in separate col.

"Haukelandsbakken 40"
"Nesttunveien 90A"
"Vestre Torggate 12"

needs to ble splitted to two colums:
"Haukelandsbakken" + "40"
"Nesttunveien" + "90A"
"Vestre Torggate" + "12"

In Excel I'd find posistion of first numer, and made a split based on position. I've been away from TSQL a few years, so I appretiate a hint
br
Svein


#2

What if it's a two-part street name? What if the "number" is not always a number? e.g.

Nieuwe Prinsengracht A90

What if "A90" is a street name (it can be the name of a highway, e.g.)

I can spot the name and "number". Can a program (any program?) do it consistently? You need some AI to get it done. In fact, you probably need HI (human intelligence) to get close to 100% (even humans have been known to make mistakes! :slightly_smiling:

However if you're willing to tolerate less than 100% correct results it shouldn't be too hard.


#3

The street name can be one / two words - yes
Street name are always written in letters (her in Norway)

aaa bb 12
aaaaa 12A

And, yes - I can accept less than 100% , so some kind of human control afterwards is acceptable.

  • Have heard of humans doing mistake - yes
  • Have heard of humans admintting dioing a mistake - rarly :slight_smile:

#4

If splitting on last space in string is good enough for you, then this might work for you:

select left(yourcol,len(yourcol)-charindex(' ',reverse(yourcol)))
      ,right(yourcol,charindex(' ',reverse(yourcol))-1)
  from yourtable
;

Otherwise you might want to use frontend program or a string splitter (DelimitedSplit8K by Jeff Moden)


#5

This perhaps? Might give you a bit more control over the patter than you need to match

;WITH CTE AS
(
	SELECT	[Address] = 'Haukelandsbakken 40'
	UNION ALL SELECT 'Nesttunveien 90A'
	UNION ALL SELECT 'Vestre Torggate 12'
)
, CTE2 AS
(
	SELECT	[Offset] = PATINDEX('% [0-9]%', [Address])
		, [Address]
	FROM	CTE
)
SELECT	[FirstPart] = LEFT([Address], Offset-1)
	, [SecondPart] = STUFF([Address], 1, Offset, '')
FROM	CTE2

#6

Due to some rows missing data in this col, I had to adjust, but thanks for giving a good tip.. Ended up with ltrim, and dropped the '-1' in 'right' statment:
select ...
left(street,len(street)-charindex(' ',reverse(street))) as Street,
ltrim(right(street,(charindex(' ',reverse(street))))) as StreetNumber
..

With -1, the step failed for rows with no data 0-1 = -1, and the right functtion does not like negative numbers

:smile:


#7

Better to exclude them from the process perhaps? Either in WHERE clause, or using CASE

Your LTrim() is probably more CPU, but also will be a problem if you need to include any other separator characters - such as "-" or "," (might not be an issue of course). OTOH LTrim() will also take care of multiple spaces which probably exists in your data (although when using REVERSE you would need Trim() on the Street as well as LTrim() on the StreetNumber)

Do you not have street names with two, or more, words which have no Street Number? They will split to give you the last word as the Street Number.

;WITH cte AS
(
	SELECT	[Address] = 'Haukelandsbakken 40'
	UNION ALL SELECT 'Nesttunveien 90A'
	UNION ALL SELECT 'Vestre Torggate 12'
	UNION ALL SELECT 'Two Words'
	UNION ALL SELECT 'OneWord'
)
, CTE2 AS
(
	SELECT	[OffsetPatIndex] = PATINDEX('% [0-9]%', [Address])
		, [OffsetSpace] = charindex(' ',reverse([Address]))
		, [Address]
	FROM	cte
)
SELECT	[Kristen1] = CASE WHEN [OffsetPatIndex] > 0
				THEN LEFT([Address], OffsetPatIndex-1)
				ELSE [Address]
				END
	, [Kristen2] = CASE WHEN [OffsetPatIndex] > 0
				THEN STUFF([Address], 1, OffsetPatIndex, '')
				ELSE NULL	-- Could use Blank String here instead
				END
	, [Bitsmed_1] = CASE WHEN [Address] LIKE '% %'
				THEN left([Address], len([Address]) - OffsetSpace)
				ELSE [Address]
				END
	, [Bitsmed_2] = CASE WHEN [Address] LIKE '% %'
				THEN right([Address], OffsetSpace-1)
				ELSE NULL	-- Could use Blank String here instead
				END
	, [sfrette1] = left([Address], len([Address]) - OffsetSpace)
	, [sfrette2] = ltrim(right([Address], OffsetSpace))
FROM	CTE2

EDIT: Improved [Bitsmed_1]