Loop through results returned by Pivot Table

barnabeck · June 4, 2016, 7:12pm

I have an Item Table with text string column that contains information on how to build that item. There are 8000 items.
Example: "length: 150 mm, width: 65 mm, shape: conic" The features can be up to 150 and not every item is described with the same features.

I need to build a table with the columns: ItemID and all possible 150 features

Step: I build a table 'Features' with all possible features by splitting the text with delimiter ',' and removing the values so I get only 'length', 'width', 'shape'.
Step: I build a pivot query with the item table and the features table that returns:

Item Length Width Shape ....

TW24    150      65      conic  ....

The problem I'm having is that the pivot table returns just one line for each item I do the split for and I can't imagine how to loop through all items.

declare @Categories nvarchar(max) = ''
declare @sql nvarchar(max)
declare @testString varchar(max)

set @testString = (SELECT 'ITEMID:'+ItemID+'<br />'+Text_T from (
select WK_ITEMTEXT.ITEMID, Text_T, row_number() over(order by TEXT_T) as ord	
           FROM WK_ITEMTEXT
				  left outer join Inventtable a on WK_ITEMTEXT.ITEMID = a.itemid and a.dataareaid = 'ES75'
                  left outer joinDOCUREF b on a.RECID = b.REFRECID AND b.ACTUALCOMPANYID = 'ES75' and b.REFTABLEID = 175 and b.name not like '%Cert%'
                  left outer join DOCUVALUE c on b.VALUERECID =  c.RECID
                  left outer join DOCUPARAMETERS d on d.DATAAREAID = 'ES75'					  
		   WHERE left(a.itemname,2) ='TW' and b.Name = 'Plano' and Text_T is NOT NULL)a
where ord = 1)

;with query as (
SELECT Tag, Value from (SELECT Item, CASE WHEN CHARINDEX(':', Item, 0) = 0 and left(item,11) = 'Certificado' THEN 'Certificado' 
              WHEN CHARINDEX(':', Item, 0) = 0 THEN 'Categoria' ELSE left(item, CHARINDEX(':', Item, 0)) END as Tag, LTRIM(right(item, len(item) -CHARINDEX(':', Item, 0))) as Value 
FROM dbo.Split(@testString, '<br />'))a)



select @Categories += QUOTENAME(Tag) +',' from Model_tags
set @Categories = left (@Categories, len(@Categories)-1)

declare @param1 nvarchar(20)
declare @param2 nvarchar(20)
declare @param3 nvarchar(20)
declare @param4 nvarchar(20)
declare @param5 nvarchar(20)
declare @param6 nvarchar(20)
declare @param7 nvarchar(20)
declare @param8 nvarchar(20)
declare @param9 nvarchar(20)

set @param1 = 'ES75'
set @param2 = '%Cert%'
set @param3 = 'Plano'
set @param4 = 'TW'
set @param5 = 'Certificado'
set @param6 = 'Categoria'
set @param7 = '<br />'
set @param8 = ':'
set @param9 = 'ItemID:'

set @sql =
'declare @testString varchar(max)

set @testString = (SELECT ''' + @param9 + '''+ ItemID +''' + @param7 + '''+Text_T from (
select WK_ITEMTEXT.ITEMID, Text_T, row_number() over(order by TEXT_T) as ord	
           FROM WK_ITEMTEXT
				  left outer join Inventtable a on WK_ITEMTEXT.ITEMID = a.itemid and a.dataareaid = ''' + @param1 + '''
                  left outer join DOCUREF b on a.RECID = b.REFRECID AND b.ACTUALCOMPANYID = ''' + @param1 + ''' and b.REFTABLEID = 175 and b.name not like ''' + @param2 + '''
                  left outer join DOCUVALUE c on b.VALUERECID =  c.RECID
                  left outer join DOCUPARAMETERS d on d.DATAAREAID = ''' + @param1 + '''					  
		   WHERE left(a.itemname,2) =''' + @param4 + ''' and b.Name = ''' + @param3 + ''' and Text_T is NOT NULL)a
where ord = 1)

;with query as (
SELECT Tag, Value from (SELECT Item, CASE WHEN CHARINDEX(''' + @param8 + ''', Item, 0) = 0 and left(item,11) = ''' + @param5 + ''' THEN ''' + @param5 + ''' 
           WHEN CHARINDEX(''' + @param8 + ''', item, 0) = 0 THEN ''' + @param6 + ''' ELSE left(item, CHARINDEX(''' + @param8 + ''', item, 0)) END as Tag, LTRIM(right(item, len(item) -CHARINDEX(''' + @param8 + ''', item, 0))) as Value 
FROM dbo.Split(@testString, ''' + @param7 + '''))a)


SELECT * from(
SELECT CASE WHEN Tag = ''' + @param9 + ''' THEN ''' + @param9 + ''' ELSE Tag END as Tag, CASE WHEN Tag = ''' + @param9 + ''' THEN Value ELSE VALUE END as Value from query)a

PIVOT (Min(Value) for Tag in (' + @Categories +')) as pvt'

exec sp_executesql @sql

JeffModen · June 5, 2016, 12:06am

What's the datatype of your string column?
What's the maximum length of the string column (might not have anything to do with datatype).
What's the PK column (including datatype) of the table that contains the string column.
Please post your "split" function. There's a pretty good chance that it's slow (almost everyone's is) and we can show you how to fix it.

Why do you actually want to pivot the columns back to single rows? All that's going to do for you is continue to make your life difficult. An EAV type of table would be much more effective for any kind of look you may want to do and it would never need to be "adjusted" when you add another "feature".

Also, stop double posting. It only serves to split up talent.

barnabeck · June 5, 2016, 12:21am

The string column is NVARCHAR(100)
The lonest feature is 46 characters long, but might get longer according to the records I gather
There is an self incrementing ID
-underneath goes the function

I need that format, each item being described in a single row in order to compare all items and features for further conclusions. Speed is not an issue here.

I consider this question completely different than the post on how to implement a loop in the Pivot table. This one is about the creation of the temp table I want to insert the results into. In my understanding this is no double posting.

ALTER FUNCTION [dbo].[Split] (
  @InputString                  VARCHAR(8000),
  @Delimiter                    VARCHAR(50)
)

RETURNS @Items TABLE (
      Item                          VARCHAR(8000)
)

AS
BEGIN
      IF @Delimiter = ' '
     BEGIN
        SET @Delimiter = ','
        SET @InputString = REPLACE(@InputString, ' ', @Delimiter)
  END

  IF (@Delimiter IS NULL OR @Delimiter = '')
        SET @Delimiter = ','

--INSERT INTO @Items VALUES (@Delimiter) -- Diagnostic
--INSERT INTO @Items VALUES (@InputString) -- Diagnostic

  DECLARE @Item                 VARCHAR(8000)
  DECLARE @ItemList       VARCHAR(8000)
  DECLARE @DelimIndex     INT

  SET @ItemList = REPLACE(@InputString+'XXX123XX', @Delimiter++'XXX123XX','')
  SET @DelimIndex = CHARINDEX(@Delimiter, @ItemList, 0)
  WHILE (@DelimIndex != 0)
  BEGIN
        SET @Item = SUBSTRING(@ItemList, 0, @DelimIndex)
        INSERT INTO @Items VALUES (@Item)

        -- Set @ItemList = @ItemList minus one less item
        SET @ItemList = SUBSTRING(@ItemList, @DelimIndex+len(@Delimiter), LEN(@ItemList)-@DelimIndex)
        SET @DelimIndex = CHARINDEX(@Delimiter, @ItemList, 0)
  END -- End WHILE

  IF @Item IS NOT NULL -- At least one delimiter was encountered in @InputString
  BEGIN
        SET @Item = @ItemList
        INSERT INTO @Items VALUES (@Item)
  END

  -- No delimiters were encountered in @InputString, so just return @InputString
  ELSE INSERT INTO @Items VALUES (@InputString)

  RETURN

END -- End Function

Kristen · June 5, 2016, 7:47am

If it were me I would have a Key / Value table with the individual pieces of the Features and then reassemble them (e.g. as a comma or delimited list, using a VIEW for those reports that want them in that format. There is a nice way to get a delimited list (i.e. from multiple rows of a Feature Items table) in SQL using an XML trick

If you want to put all the Features into separate columns then that would be easy from the Feature Items table.

Your SPLIT() function would need to be modified to return an Item Number as well as a Value. I've assumed column names, from the resulset for the SPLIT, as ItemNumber and Item (@JeffModen I think its a lousy column name, [Value] would be better than [Item] )

For example

CREATE TABLE #SourceData
(
	SD_ID 		int,
	SD_FeatureList	varchar(8000),
	PRIMARY KEY
	(
		SD_ID
	)
)

CREATE TABLE #FeatureItems
(
	FI_ID		int IDENTITY(1, 1),	-- Tie-break in case of duplicates
	FI_SD_ID	int,
	FI_SetNo	int,
	FI_Key		varchar(50),
	FI_Value	varchar(100),
	PRIMARY KEY
	(
		FI_SD_ID,
		FI_SetNo,
		FI_Key,
		FI_ID
	)
)

INSERT INTO #SourceData
(
	SD_ID,
	SD_FeatureList
)
SELECT 1, 'length: 150 mm, width: 65 mm, shape: conic'
UNION ALL
SELECT 2, 'length: 200 mm, width: 50 mm, height: 75 mm, shape: cube'
UNION ALL
SELECT 3, 'length: 150 mm, width: 65 mm, shape: conic<br>length: 200 mm, width: 50 mm, height: 75 mm, shape: cube'
UNION ALL
SELECT 4, 'duplicate: 150 mm, duplicate: 65 mm, shape: duplicate'
UNION ALL
SELECT 5, 'Certificado ABC'	-- "Certificado" instance
UNION ALL
SELECT 6, 'Xxx XYZ'		-- "Categoria" instance

-- Process data:
INSERT INTO #FeatureItems
(
	FI_SD_ID,
	FI_SetNo,
	FI_Key,
	FI_Value
)
SELECT	SD_ID,
	T_SetItem,
	T_Key,
	T_Value
FROM	#SourceData
	CROSS APPLY
	(
		SELECT	ItemNumber AS T_SetItem,
			Item AS T_SetValues
		FROM dbo.Split(SD_FeatureList, '<br>', NULL)
	) AS X
	CROSS APPLY
	(
		SELECT	T_Offset = CHARINDEX(':', Item, 0),
			Item
		FROM dbo.Split(T_SetValues, ',', NULL)
	) AS Y
	CROSS APPLY
	(
		SELECT	T_Key = LTRIM(CASE
					WHEN T_Offset = 0 and LEFT(Item, 11) = 'Certificado' THEN 'Certificado' 
					WHEN T_Offset = 0 THEN 'Categoria' 
					ELSE LEFT(Item, T_Offset - 1)
					END),
			T_Value = LTRIM(RIGHT(Item, LEN(Item) - T_Offset))
	) AS Z


SELECT TOP 100 *
FROM	#FeatureItems
ORDER BY FI_SD_ID,
	FI_SetNo,
	FI_Key,
	FI_ID

Kristen · June 5, 2016, 7:52am

P.S. I recommend that you use Jeff's Splitter Function; whilst you say that performance is not important you may well use a splitter function in other processes in the future, where performance is important, and it would be better to have one that performs well rather than having two different ones in future. Your loop-based solution it likely to be 25x - 100x slower than set-based solution

http://www.sqlservercentral.com/articles/Tally+Table/72993/

We use separate splitter functions for INT / String and varchar(MAX) solutions, rather than one generic string-splitter, because performance is significantly different between them

barnabeck · June 5, 2016, 8:48pm

Kristen, your post gives a lot of Input but I'm afraid I didn't understand everything 100%.

I would have a Key / Value table with the individual pieces of the Features

I think I have exactly what you mean: The feature table has an ID Key and the column feature (length, width, shape,...)

I recommend that you use Jeff's Splitter Function

I would love to use a faster split function, but I had to change the length of the delimiter to 6 as I have the  and then the function just returned a scalar. Actually speed might become more important so I'm interested to use this function.

Your SPLIT() function would need to be modified to return an Item Number as well as a Value

Why? In order to build a table with ITEMID, Feature, Value? If I go for the solution with all features in separated columns, the ItemID becomes yet another feature and there is no need for the function to return the ItemID separately.

The code you posted...

...I couldn't execute, as you refer to a split function with 3 arguments... and I couldn't understand what this was aiming at.

Anyway, beside the split function and the general concept, can someone comment on my "real problems"? The code that I posted executes in 1 second and gives me the desired result and format and speed, but just for ONE single Item, while I need to loop through all Items. I inserted a simple loop function, but that works just in the first loop. Entering the second loop and after the row had been successfully saved to the #temp table I get a syntax error "Incorrect syntax near itemid". How can the syntax be fine on the first loop? It's not related to the data, as I can start the loop at row#2 with the same error showing up after inserting successfully row#2.

This is my code now. Something is not working with the loop; if I place "select @row" right before the @SQL parameter I can see how @row is counting up to 44 (which is the stack of the records found) BEFORE the first row has even been inserted. This should give you experts the right clue! What am I doing wrong? So much text... hopefully someone has the patience to read all this. Martin

declare @Categories nvarchar(max) = ''
declare @sql nvarchar(max)
declare @row int
declare @rowMax int
declare @testString varchar(max)
set @row = 1

set @rowMax = (Select COUNT(ID) FROM WK_ITEMTEXT
				  left outer join DE01150_DAX_ESSBP1_PROD_50.dbo.Inventtable a on WK_ITEMTEXT.ITEMID = a.itemid and a.dataareaid = 'ES75'
                  left outer join DE01150_DAX_ESSBP1_PROD_50.dbo.DOCUREF b on a.RECID = b.REFRECID AND b.ACTUALCOMPANYID = 'ES75' and b.REFTABLEID = 175 and b.name not like '%Cert%'
                  left outer join DE01150_DAX_ESSBP1_PROD_50.dbo.DOCUVALUE c on b.VALUERECID =  c.RECID
                  left outer join DE01150_DAX_ESSBP1_PROD_50.dbo.DOCUPARAMETERS d on d.DATAAREAID = 'ES75'					  
		   WHERE left(a.itemname,2) ='TW' and b.Name = 'Plano' and Text_T is NOT NULL)

WHILE @row <= @rowMax
BEGIN

set @testString = (SELECT 'itemid:'+ItemID+'<br />'+Text_T from (
select WK_ITEMTEXT.ITEMID, Text_T, row_number() over(order by WK_ITEMTEXT.ITEMID) as ord	
           FROM WK_ITEMTEXT
				  left outer join DE01150_DAX_ESSBP1_PROD_50.dbo.Inventtable a on WK_ITEMTEXT.ITEMID = a.itemid and a.dataareaid = 'ES75'
                  left outer join DE01150_DAX_ESSBP1_PROD_50.dbo.DOCUREF b on a.RECID = b.REFRECID AND b.ACTUALCOMPANYID = 'ES75' and b.REFTABLEID = 175 and b.name not like '%Cert%'
                  left outer join DE01150_DAX_ESSBP1_PROD_50.dbo.DOCUVALUE c on b.VALUERECID =  c.RECID
                  left outer join DE01150_DAX_ESSBP1_PROD_50.dbo.DOCUPARAMETERS d on d.DATAAREAID = 'ES75'					  
		   WHERE left(a.itemname,2) ='TW' and b.Name = 'Plano' and Text_T is NOT NULL)a
where ord =  @row)

;with query as (
SELECT Replace(Replace(Replace(a.Tag,':',''),'[','('),']',')') as Tag, Value 
from (SELECT Item, CASE WHEN CHARINDEX(':', Item, 0) = 0 and left(item,11) = 'Certificado' THEN 'Certificado' 
                    WHEN CHARINDEX(':', Item, 0) = 0 THEN 'Categoria' 
					ELSE left(item, CHARINDEX(':', Item, 0)) END as Tag, LTRIM(right(item, len(item) -CHARINDEX(':', Item, 0))) as Value 
FROM dbo.Split(@testString, '<br />'))a)

select @Categories += QUOTENAME(Tag) +',' from Model_tags
set @Categories = left (@Categories, len(@Categories)-1)

declare @param1 nvarchar(20)
declare @param2 nvarchar(20)
declare @param3 nvarchar(20)
declare @param4 nvarchar(20)
declare @param5 nvarchar(20)
declare @param6 nvarchar(20)
declare @param7 nvarchar(20)
declare @param8 nvarchar(20)
declare @param9 nvarchar(20)
declare @param10 nvarchar(20)
declare @param11 nvarchar(20)
declare @param12 nvarchar(20)
declare @param13 nvarchar(20)
set @param1 = 'ES75'
set @param2 = '%Cert%'
set @param3 = 'Plano'
set @param4 = 'TW'
set @param5 = 'Certificado'
set @param6 = 'Categoria'
set @param7 = '<br />'
set @param8 = ':'
set @param9 = 'itemid:'
set @param10 = '['
set @param11 = '('
set @param12 = ']'
set @param13 = ')'

set @sql =
'declare @testString nvarchar(max)

set @testString = (SELECT ''' + @param9 + '''+ ITEMID +''' + @param7 + '''+Text_T from (
select WK_ITEMTEXT.ITEMID, Text_T, row_number() over(order by WK_ITEMTEXT.ITEMID) as ord	
           FROM WK_ITEMTEXT
				  left outer join DE01150_DAX_ESSBP1_PROD_50.dbo.Inventtable a on WK_ITEMTEXT.ITEMID = a.itemid and a.dataareaid = ''' + @param1 + '''
                  left outer join DE01150_DAX_ESSBP1_PROD_50.dbo.DOCUREF b on a.RECID = b.REFRECID AND b.ACTUALCOMPANYID = ''' + @param1 + ''' and b.REFTABLEID = 175 and b.name not like ''' + @param2 + '''
                  left outer join DE01150_DAX_ESSBP1_PROD_50.dbo.DOCUVALUE c on b.VALUERECID =  c.RECID
                  left outer join DE01150_DAX_ESSBP1_PROD_50.dbo.DOCUPARAMETERS d on d.DATAAREAID = ''' + @param1 + '''					  
		   WHERE left(a.itemname,2) =''' + @param4 + ''' and b.Name = ''' + @param3 + ''' and Text_T is NOT NULL)a
where ord = '+CAST(@row as NVARCHAR)+')



;with query as (
SELECT Replace(Replace(Replace(a.Tag,''' + @param8 + ''',''''),''' + @param10 + ''',''' + @param11 + '''),''' + @param12 + ''',''' + @param13 + ''') as Tag,
   Value from
(SELECT Item, CASE WHEN CHARINDEX(''' + @param8 + ''', Item, 0) = 0 and left(item,11) = ''' + @param5 + ''' THEN ''' + @param5 + ''' 
                  WHEN CHARINDEX(''' + @param8 + ''', item, 0) = 0 THEN ''' + @param6 + ''' 
				  ELSE left(item, CHARINDEX(''' + @param8 + ''', item, 0)) END as Tag, LTRIM

(right(item, len(item) -CHARINDEX(''' + @param8 + ''', item, 0))) as Value 
FROM dbo.Split(@testString, ''' + @param7 + '''))a)


SELECT * from(
SELECT CASE WHEN Tag = ''' + @param9 + ''' THEN ''' + @param9 + ''' ELSE Tag END as Tag, CASE WHEN Tag = ''' + @param9 + ''' THEN Value ELSE VALUE END as Value from query)a

PIVOT (Min(Value) for Tag in (' + @Categories +')) as pvt'

INSERT INTO VAINAS 
exec sp_executesql @sql
set @row = @row +1
END

JeffModen · June 6, 2016, 12:08am

Heh... dunno about that. To me, "Value" implies something numeric and it never is, especially by definition.. Oxford Languages | The Home of Language Data

I called it "Item" because it 1) will always be an "item" Oxford Languages | The Home of Language Data and 2) it's short, and 3) it's not even close to being a reserved word and my bet would be that it won't be. I could see "value" becoming a reserved word someday.

It's like some folks don't understand why I called a table of sequential numbers a "Tally Table". After all, it doesn't contain "tallies". Another definition of what "Tally" means is "Count" as found in the "origin" of the word. See Tally Definition & Meaning - Merriam-Webster where it says...

Origin of tally

Middle English talye, from Anglo-French talie, taille, in part from tailler to cut, measure, count; in part from Medieval Latin tallia, alteration of Latin talea plant cutting, thin piece of wood

First Known Use: 15th century

... and is, indeed, used to cut, measure, and count not to mention being a very "thin" table. Although it goes a bit against the grain for most folks, I didn't name it after what it contains... I named it after what it's used for, like "Tally Stick" except "Tally Table".

JeffModen · June 6, 2016, 12:15am

So use REPLACE as a preprocessor to replace your 6 character delimiter with a single character delimiter.

JeffModen · June 6, 2016, 1:14am

For 1 above... you have a couple of possible problems. Your function takes a VARCHAR(8000) as the input and there may be something that literally gets lost in the translation from NVARCHAR(100) to VARCHAR(8000). It's also an implicit conversion that can take quite a bit of extra time.

Most seriously, it won't take the "all possible 150 features" that you say that the final table will have.

With that, I have no actual idea of what your string data actually looks like for sure. Would you please provide a readily consumable example in the form of a CREATE TABLE statement and provide the code to do a bit of a table population using INSERT/VALUES so that we can work more closely with your actual situation?

Also I'm still not getting what you're saying about some 150 features... are you saying that you have a table already defined with more than 150 columns that you need to populate???

barnabeck · June 6, 2016, 8:30am

So use REPLACE as a preprocessor to replace your 6 character delimiter with a single character delimiter

Off course I thought about it, but it looked too risky to pick just a single character delimiter as there is so much text, and the chance that this character is somewhere within the text is not irrelevant. But I appreciate your comment and I suppose a '╣' will be fine and will work with for your algorithm.

barnabeck · June 6, 2016, 9:41am

The thing is that I'm doing the reverse algorithm of what had been done in the forehand and what for some reason is not processed: Our ERP has an external configurator for all articles we produce. There you define length, width, shape as one of the common but then there are very special features that rarely show up. This configurator builds a 150 character long modelcode which is like the DNA of the product. While configuring the article, a text string is packed with all that data that later becomes essential for both production planning and manufacturing. I consider this highly inefficient for many processes, as you only have a massive text block, that one has to scan with the naked eye.

This project is about regaining the lost data structure and assigning the values to actual features to make everything more visible.

In most of the cases you'll have a structure like "length: 140mm
width: 40 mm
..." with ":" as a second delimiter. But then there are lines without the ":"-division between the feature and it's value. These lines are generally much longer with more than 300 characters. I'm not yet decided about how to treat those lines, as for consistency their text is supposed to become the column's name. I have to talk to production to see if this information is crucial or can be skipped. The best thing right now seems to stuff all those lines without ':' into one feature "Else"

(Internally the data had been saved as NTEXT with char(10) carriage return as the delimiter; I convert that to NVARCHAR in a automated night JOB, replacing the char(10) by a html-break to be able to work decently with that column)

Also I'm still not getting what you're saying about some 150 features... are you saying that you have a table already defined with more than 150 columns that you need to populate

I build this table dynamically:
1.Step: Define the score; which records to analyse from the 500'000 in the itemmaster
2.Step: Create the table Columns were all features of the score are grouped in one column. They become the column names.
3.Step: Create a table dynamically with as many columns as the Columns table has records and using them as their column-names.
4.Step: Run the pivot table code to populate the table.

This is what one row looks like: (In the meantime I expanded all nvarchar to nvarchar(max); I have to read your post in more detail to improve, but unfortunately I don't have enough time for investigation. Things have to work in the first place, once the run you can tune them)

CREATE TABLE [dbo].[Vainas](
[ID] [int] IDENTITY(1,1) NOT NULL,
[itemid] [nvarchar](max) NULL,
[Categoria] [nvarchar](max) NULL,
[Datos técnicos según hoja técnica] [nvarchar](max) NULL,
[Forma de la vaina] [nvarchar](max) NULL,
[Unidad de medida] [nvarchar](max) NULL,
[Diámetro del cabezal] [nvarchar](max) NULL,
[Conexión a instrumento (N)] [nvarchar](max) NULL,
[Longitud de inmersión(U)] [nvarchar](max) NULL,
[Longitud de la conexión (H)] [nvarchar](max) NULL,
[Material] [nvarchar](max) NULL,
[Taladro (B or d1)] [nvarchar](max) NULL,
[Diámetro de raíz (Q)] [nvarchar](max) NULL,
[Diámetro de la punta (V o F3)] [nvarchar](max) NULL,
[Notas adicionales] [nvarchar](max) NULL,
[Profundidad de taladro] [nvarchar](max) NULL,
[Protección rosca] [nvarchar](max) NULL,
[Grosor de la punta] [nvarchar](max) NULL,
[Certificado] [nvarchar](max) NULL,
[Idioma del certificado] [nvarchar](max) NULL,
[Pruebas / certificados para] [nvarchar](max) NULL,
[Certificado de materiales] [nvarchar](max) NULL,
[Prueba de presión y de estabilidad] [nvarchar](max) NULL,
[Ejecución de cabezal] [nvarchar](max) NULL,
[Conexión a proceso (P o E)] [nvarchar](max) NULL,
[Diámetro extensión (Bd or F2)] [nvarchar](max) NULL,
[Ejecución de la soldadura] [nvarchar](max) NULL,
[Tamaño de la brida] [nvarchar](max) NULL,
[Presión nominal] [nvarchar](max) NULL,
[Superficie de la brida] [nvarchar](max) NULL,
[Grosor de la pared del tubo] [nvarchar](max) NULL,
[Material brida] [nvarchar](max) NULL,
[Marcaje] [nvarchar](max) NULL,
[Prueba de penetración de líquidos] [nvarchar](max) NULL,
[Longitud total (L)] [nvarchar](max) NULL,
[Longitud de inmersión (U1)] [nvarchar](max) NULL,
[Longitud del cono] [nvarchar](max) NULL,
[Diámetro de soldadura para tuberia] [nvarchar](max) NULL,
PRIMARY KEY CLUSTERED 
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

GO
SET IDENTITY_INSERT [dbo].[Vainas] ON 

INSERT [dbo].[Vainas] ([ID], [itemid], [Categoria], [Datos técnicos según hoja técnica], [Forma de la vaina], [Unidad de medida], [Diámetro del cabezal], [Conexión a instrumento (N)], [Longitud de inmersión(U)], [Longitud de la conexión (H)], [Material], [Taladro (B or d1)], [Diámetro de raíz (Q)], [Diámetro de la punta (V o F3)], [Notas adicionales], [Profundidad de taladro], [Protección rosca], [Grosor de la punta], [Certificado], [Idioma del certificado], [Pruebas / certificados para], [Certificado de materiales], [Prueba de presión y de estabilidad], [Ejecución de cabezal], [Conexión a proceso (P o E)], [Diámetro extensión (Bd or F2)], [Ejecución de la soldadura], [Tamaño de la brida], [Presión nominal], [Superficie de la brida], [Grosor de la pared del tubo], [Material brida], [Marcaje], [Prueba de penetración de líquidos], [Longitud total (L)], [Longitud de inmersión (U1)], [Longitud del cono], [Diámetro de soldadura para tuberia]) VALUES (180, N'44781300', N'Vaina roscada de barra taladrada tipo TW15', N'TW 95.15', N'Cónica', N'Métrico (mm)', NULL, N'1/2 NPT hembra', N'75 mm', N'45 mm', N'Acero inox. 316/316L', N'6,6 mm', N'27 mm', N'19 mm', NULL, N'113 mm', NULL, N'6,4 mm (0,250 pulgadas)', NULL, NULL, NULL, NULL, NULL, N'Hexagonal', N'1 NPT', N'Hexagonal 36 mm', NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL)
SET IDENTITY_INSERT [dbo].[Vainas] OFF

barnabeck · June 6, 2016, 9:49am

I finally solved what had been my "real" problem:
For some reason I don't understand, the loop returned the desired results after I moved

select @Categories += QUOTENAME(Tag) +',' from Model_tags
set @Categories = left (@Categories, len(@Categories)-1)

outside the loop. Being inside that command messes up everything. (???) Now everything is working. I will implement this code in my solution and afterwards I come back to tune and gain speed. Unfortunately that has to be the order...

Jeff & Kristen. Thank you for your support,

Kristen · June 6, 2016, 10:01am

If you want to split:

length: 150 mm, width: 65 mm, shape: conic<br>length: 200 mm, width: 50 mm, height: 75 mm, shape: cube

You have two different properties ("conic" and "cube") but you only have one ID, hence my suggestion that a Splliter (on " ") which gives you an Item Number would give you:

1 = length: 150 mm, width: 65 mm, shape: conic
2 = length: 200 mm, width: 50 mm, height: 75 mm, shape: cube

and then you could split those to get the individual attributes, so combined with the ID (lets call it "123" for this row) you would get

123 - 1- length: 150 mm
123 - 1- width: 65 mm
123 - 1- shape: conic
123 - 2 length: 200 mm
123 - 2 width: 50 mm
123 - 2 height: 75 mm
123 - 2 shape: cube

without the Item number "123 + Length" (and width, & shape) would have two entries, and thus no unique key.

Personally, if I was building this, I would have an EAV table and deal with producing a multi-column output when producing a report for users. What happens (with your dynamically-created table) if tomorrow a column is "removed" (it was there yesterday, but today no products have that attribute). If some code somewhere is including that column name in a SELECT it will now fail. Perhaps that's not a problem for you.

My instinct is to suggest you don't do it that way (using a loop), because it will be a problem downstream

I missed that because I did't read your original code. If I have read it correctly [WK_ITEMTEXT] is an EAV table?

If so can you just report from that directly (instead of turning it into a multi-column table using dynamic SQL to create all the necessary columns (representing all the possible attributes)?

If you have some difficult parsing some things using SQL (" there are lines without the ":"-division between the feature and it's value ... I'm not yet decided about how to treat those lines,") it might be worth looking at some translation process during the import - something like a sequential-editor, for example, which can use regular expressions and plenty of sequential-style logic to parse and arrange the data, and then present it "oven ready" for SQL to import. Just a thought ...

I would also keep the CHAR(10) delimiter (assuming it is unique within the data, and the user cannot, ALSO!!, create a line break within an attribute name or its value). It is a single character, thus easier for Splitting, and could be converted to " " at the output stage (because only needed for a browser; for an export to, say, Excel the output could substitute TAB or CHAR(10)+CHAR(13) instead ... or for some process "|" even <sigh!>

Interesting, thanks. I'm dyslexic, so should never enter the fray on such topics! Maybe "Data" instead of "Value" then. Having ItemNumber and Item has ambiguity for me (probably because I'm dyslexic ...) as I cannot be confident which-is-which (well, "ItemNumber" is clearly the 1, 2, 3 bit ... but when I see "Item" on its own I lack certainty that that is the "value" part. )

but there again, you did call the table an EAV ... Wikipedia (on EVA) says the third element is "The value of the attribute. This would depend on the data type, ..." ...; I give up!!

barnabeck · June 6, 2016, 11:56am

You have two different properties ("conic" and "cube") but you only have one ID

This won't happen as the occurrence of every feature for one ItemID is UNIQUE.

What happens (with your dynamically-created table) if tomorrow a column is "removed"

The dynamically created table is build according to all the features of the scope; the records I gathered to analyze. Therefore a column can't get los.

Personally, if I was building this, I would have an EAV table

I start seeing your point and even how I could build a multi-column view with the pivot function. Probably there a other ways? I'm not too familiar, so this is new... but interesting. The key question to me is, how to populate the EAV table without looping through the scope, if the split function returns me a table for every ItemID? I guess this is the Code that you had posted before with the 3 argument Split function.

Kristen · June 6, 2016, 1:01pm

That makes life easier! Could you give me an example pls - I assumed that the " " divider was to separate "similar" sets of attributes; if it is just a list of unique attributes it might help to understand why some are aggregated together, and others treated as being "different". Obvious I made up the example that I used in my test code ...

Hopefully it is, assuming that I understood the requirements correctly.

if there is no need for a tie-break Item Number then your existing Splitter Function will do Just Fine (although i think using Jeffs optimised version would be worthwhile) (Note that you will only be able to have one-each of your Categoria and Certificado scenarios, per record)

Also, in that case you could just use one split, rather than nested splits, by just replacing " " with ","

I've taken out the Duplicate Test, and the "Set No" part to accommodate Duplicates, and I've also disabled the IDENTITY (which would act as a tie-break in the event of duplicates).

The SPLIT() function no longer needs an ItemNumber, just an Item (TAG)

If this works OK for you as a test rig perhaps stuff a reasonable number of actual data values into the #SourceData table and make sure it behaves OK.


DROP TABLE #SourceData
GO
DROP TABLE #FeatureItems
GO
CREATE TABLE #SourceData
(
	SD_ID 		int,
	SD_FeatureList	varchar(8000),
	PRIMARY KEY
	(
		SD_ID
	)
)

CREATE TABLE #FeatureItems
(
--N/A	FI_ID		int IDENTITY(1, 1),	-- Tie-break in case of duplicates
	FI_SD_ID	int,
	FI_Key		varchar(50),
	FI_Value	varchar(100),
	PRIMARY KEY
	(
		FI_SD_ID,
		FI_Key
--N/A		, FI_ID
	)
)

INSERT INTO #SourceData
(
	SD_ID,
	SD_FeatureList
)
SELECT 1, 'length: 150 mm, width: 65 mm, shape: conic'
UNION ALL
SELECT 2, 'length: 200 mm, width: 50 mm, height: 75 mm, shape: cube'
UNION ALL
SELECT 3, 'length: 150 mm, width: 65 mm, shape: conic<br>lengthX: 200 mm, widthX: 50 mm, height: 75 mm, shapeX: cube'
UNION ALL
-- NOTE: Test for Duplicate Attribute TAGs disabled
-- SELECT 4, 'duplicate: 150 mm, duplicate: 65 mm, shape: duplicate'
-- UNION ALL
SELECT 5, 'Certificado ABC'	-- "Certificado" instance
UNION ALL
SELECT 6, 'Xxx XYZ'		-- "Categoria" instance
UNION ALL
SELECT 7, 'length: 150 mm, width: 65 mm, shape: conic<br>Certificado ABC<br>Xxx XYZ'


-- Process data:
INSERT INTO #FeatureItems
(
	FI_SD_ID,
	FI_Key,
	FI_Value
)
SELECT	SD_ID,
	T_Key,
	T_Value
FROM	#SourceData
	CROSS APPLY
	(
		SELECT	T_Offset = CHARINDEX(':', Item, 0),
			Item
		FROM dbo.split(REPLACE(SD_FeatureList, '<br>', ','), ',')	-- Convert "<BR>" to YetAnotherCommaDelimiter
	) AS Y
	CROSS APPLY
	(
		SELECT	T_Key = LTRIM(CASE
					WHEN T_Offset = 0 and LEFT(Item, 11) = 'Certificado' THEN 'Certificado' 
					WHEN T_Offset = 0 THEN 'Categoria' 
					ELSE LEFT(Item, T_Offset - 1)
					END),
			T_Value = LTRIM(RIGHT(Item, LEN(Item) - T_Offset))
	) AS Z

SELECT TOP 100 *
FROM	#FeatureItems
ORDER BY FI_SD_ID,
	FI_Key
--N/A	, FI_ID

JeffModen · June 6, 2016, 1:52pm

Absolutely understood. The real reason I named them "Item" and "Item Number" and didn't want to admit to is that I'm old and suffering from major CRS so if I didn't name them with "Item", I'd have to look them up every time.

Can't defend anything on the use of "V" in "EAV" and "NVP" tables. Heck of a language we have. Keeps it interesting.

JeffModen · June 6, 2016, 1:57pm

THAT's actually the easy part and we can show you that. The thing I'm still having problems with is that I've not seen an example of your data that actually contains the "
" stuff. If you could post several lines of data, we can show you what a cake walk that can actually be.

barnabeck · June 6, 2016, 2:44pm

I really appreciate your help, my head is glowing and I'm trying to enter into all details you ask me.

Could you give me an example pls

Flanged Thermowell TW10, solid machined Specifications according to data sheet: TW 95.10 / TW 95.11 / TW 95.12 Welding options: Partial penetration (fillet a=3 mm minimum) Thermowell style: Tapered Unit of measure: Metric (mm) Flange size: ASME 2 inch Pressure rating: Class 300 Flange face: RF (Raised face) Connection to thermometer [N]: G 1/2 female Pipe sched. /Intern. nozzle dia.: WAKI (Standard Germany) Insertion length [U]: 200 mm Head length [H]: 57 mm Flange material: Stainless steel 316/316L Material: Stainless steel 316/316L Bar diameter [Bd or F2]: 34 mm Bore size [B or d1]: 0.276 inch (7.0 mm) Root diameter [Q]: 25 mm Tip diameter [V or F3]: 19 mm Bore depth / stem length (US) [A]: 250 mm Tip thickness [Tt]: 0.250 inch (6.4 mm) Inspection certificate acc. to EN 10204 3.1 Certificate Language: English and Russian Certificate for: Thermowell / protection tube Material certificate: Wetted parts material (metal contents) with mill certificate incl. chemical analysis Pressure and stability test: Hydrostatic, external pressure test - 1.5 flange rating - 3 min. Liquid dye penetration: Yes

Thermowell TW55, solid machined DIN 43772 (Form 4,4F) Specifications according to data sheet: TW 95.55 Thermowell style: Form 4 (welding connection) Bar diameter [Bd or F2]: 24 mm Connection to thermometer [N]: M18 x 1,5 female Bore size [B or d1]: 0.276 inch (7.0 mm) Material: Stainless steel 316/316L Total length [L]: 260 mm Insertion length [U1]: variable Tapering length [U]: 150 mm Tip diameter [V or F3]: 12.5 mm Inspection certificate acc. to EN 10204 3.1 Certificate Language: English and German Certificate for: Thermowell / protection tube Material certificate: Wetted parts material (metal contents) with mill certificate incl. chemical analysis

If this works OK for you as a test rig

Cool. Yes now it works, although I had to place the drop table commands at the end... and I have still have to understand the logic. This is basically all I need... beside the code that gives me the view of the table with all their features as column. This I do with the pivot, right?

Kristen · June 6, 2016, 3:10pm

Blimey ... that could mean ... almost anything!

http://acronyms.thefreedictionary.com/CRS

That is certainly one way. My guess is that it is probably the most "generic" way ...

What are you going to do with the output? Just display it in an HTML TABLE in a browser? or something else?

For an HTML TABLE there might be mileage in just outputting it "flat" from the FeaturesTable and having the "application end" do a bit of formatting / arranging.

For example

; WITH CTE1 AS
(
	SELECT DISTINCT FI_Key
	FROM	#FeatureItems
),
CTE2 AS
(
	SELECT DISTINCT FI_SD_ID
	FROM	#FeatureItems
)
SELECT	CTE2.FI_SD_ID, CTE1.FI_Key, FI.FI_Value
FROM	CTE1
	JOIN CTE2
		ON 1=1
	LEFT OUTER JOIN #FeatureItems AS FI
		 ON FI.FI_SD_ID = CTE2.FI_SD_ID
		AND FI.FI_Key = CTE1.FI_Key
ORDER BY CTE2.FI_SD_ID,
	CTE1.FI_Key

or even something like this which will "assembled" all the columns (including blank ones) into Table Cells (for HTML)



; WITH CTE1 AS
(
	SELECT DISTINCT FI_Key
	FROM	#FeatureItems
),
CTE2 AS
(
	SELECT DISTINCT FI_SD_ID
	FROM	#FeatureItems
)
SELECT	CTE2.FI_SD_ID,
	[MyValueHTML] = 
--N/A STUFF(
		(
			SELECT	'<td>' + COALESCE(FI.FI_Value, '&nbsp;') + '</td>'
			FROM	CTE1
				LEFT OUTER JOIN #FeatureItems AS FI
					ON FI.FI_SD_ID = CTE2.FI_SD_ID
					AND FI.FI_Key = CTE1.FI_Key
			ORDER BY CTE2.FI_SD_ID,
				CTE1.FI_Key
			FOR XML PATH(''), TYPE
		).value('.', 'varchar(max)')	-- NOTE: "'value" is case sensitive!!
--N/A , 1, 1, '')	-- Use ", 1, 2, '')" if the delimiter includes a space e.g. ", "
FROM	CTE2
ORDER BY CTE2.FI_SD_ID

(You'd need an additional query to get all the TH column headings, in the same order as the TD's!!)

I had mine at the front, deliberately, so that the tables "persist" after running in (in case I wanted to do any adhoc SELECTs) but they generate errors if the #Tables don't exist, so I left that bit as an "exercise for the reader"

Kristen · June 6, 2016, 3:17pm

Ah ... I now realise that your delimiter is " " ... from your original example "length: 150 mm, width: 65 mm, shape: conic" I thought you had two different kinds of delimiter - both "," and " " and had assumed they represented two different types of "sub records"

The whole business with replacing " " should probably be changed to something (other than COMMA) which cannot be present in the data - maybe the original APP's delimiter of CHAR(10) is a good choice.