Calculate LEN with embedded TABs

I have some code that generates SQL. It has some embedded TABs in the output that is being generated - so that the code all lines up nicely :slight_smile: but I also want to add a "-- Comment" to the end of the line, and I would like those to line up too, but I need help with how to calculate how long the string is, including "expanded TABs" where appropriate.

SELECT	*
INTO	#TEMP
FROM
(
	SELECT	[TheData] = '1234567'
	UNION ALL SELECT '12345678'
	UNION ALL SELECT '123456789'
	UNION ALL SELECT '123456789.'
	UNION ALL SELECT '123456789.1'
	UNION ALL SELECT '123456789.12'
	UNION ALL SELECT '123456789.123'
	UNION ALL SELECT '123456789.1234'
	UNION ALL SELECT '123456789.1234'
	UNION ALL SELECT '123456789.12345'
	UNION ALL SELECT '123456789.123456'
	UNION ALL SELECT '123456789.1234567'
	UNION ALL SELECT '123456789.12345678'
	UNION ALL SELECT '123456789.123456789'
	UNION ALL SELECT 'A' + CHAR(9) + '9.123456789'
	UNION ALL SELECT 'AA' + CHAR(9) + '9.123456789'
	UNION ALL SELECT 'AAA' + CHAR(9) + '9.123456789'
	UNION ALL SELECT 'AAAA' + CHAR(9) + '9.123456789'
	UNION ALL SELECT 'AAAAA' + CHAR(9) + '9.123456789'
	UNION ALL SELECT 'AAAAAA' + CHAR(9) + '9.123456789'
	UNION ALL SELECT 'AAAAAAA' + CHAR(9) + '9.123456789'
	UNION ALL SELECT 'A' + CHAR(9) + 'B' + CHAR(9) + '789'
	UNION ALL SELECT 'A' + CHAR(9) + 'BB' + CHAR(9) + '789'
	UNION ALL SELECT 'A' + CHAR(9) + 'BBB' + CHAR(9) + '789'
	UNION ALL SELECT 'A' + CHAR(9) + 'BBBB' + CHAR(9) + '789'
	UNION ALL SELECT 'A' + CHAR(9) + 'BBBBB' + CHAR(9) + '789'
	UNION ALL SELECT 'A' + CHAR(9) + 'BBBBBB' + CHAR(9) + '789'
	UNION ALL SELECT 'A' + CHAR(9) + 'BBBBBBB' + CHAR(9) + '789'
) AS T
SELECT	
	TheData
	+ REPLICATE(CHAR(9), ((MaxTheDataLen/8+1)*8 - (LEN(TheData)/8+1)*8) / 8 + 1)
	+ '-- Some aligned data'
--	+ ', Len=' + CONVERT(varchar(20), LEN(TheData))
--	+ ', Calc=' + CONVERT(varchar(20), ((MaxTheDataLen/8+1)*8 - (LEN(TheData)/8+1)*8) / 8 + 1)
FROM	#TEMP
	CROSS APPLY
	(
		SELECT	[MaxTheDataLen] = MAX(LEN(TheData))
		FROM	#TEMP
	) AS X

Output is like this:

1234567                 -- Some aligned data
12345678                -- Some aligned data
123456789               -- Some aligned data
123456789.              -- Some aligned data
123456789.1             -- Some aligned data
123456789.12            -- Some aligned data
123456789.123           -- Some aligned data
123456789.1234          -- Some aligned data
123456789.1234          -- Some aligned data
123456789.12345         -- Some aligned data
123456789.123456        -- Some aligned data
123456789.1234567       -- Some aligned data
123456789.12345678      -- Some aligned data
123456789.123456789     -- Some aligned data

With no embedded TABs the calculation works fine, but WITH embedded TABs the character-count gives the wrong value, of course:

A       9.123456789             -- Some aligned data
AA      9.123456789             -- Some aligned data
AAA     9.123456789             -- Some aligned data
AAAA    9.123456789     -- Some aligned data
AAAAA   9.123456789     -- Some aligned data
AAAAAA  9.123456789     -- Some aligned data
AAAAAAA 9.123456789     -- Some aligned data
A       B       789                     -- Some aligned data
A       BB      789             -- Some aligned data
A       BBB     789             -- Some aligned data
A       BBBB    789             -- Some aligned data
A       BBBBB   789             -- Some aligned data
A       BBBBBB  789             -- Some aligned data
A       BBBBBBB 789             -- Some aligned data

What I want is the right hand bit all aligned, even when the left hand bit has embedded TABs.

Maybe as below? Assuming a tab is 4 spaces, if not, naturally adjust as needed:

...
	CROSS APPLY
	(
		SELECT	[MaxTheDataLen] = MAX(LEN(REPLACE(TheData, CHAR(9), SPACE(4))))
		FROM	#TEMP
	) AS X

Using DelimitedSplit8K from @JeffModen you can do:

with cte
  as (select thedata
            ,sum(l1+case when rn=1 then 0 else l2 end) as thelen
        from (select a.thedata
                    ,len(b.item) as l1
                    ,(8-len(b.item)%8)%8 as l2
                    ,row_number() over(partition by a.thedata
                                           order by b.itemnumber desc
                                      )
                     as rn
                from #temp as a
                     cross apply dbo.DelimitedSplit8K(a.thedata,char(9)) as b
             ) as a
       group by thedata
     )
select a.thedata
      +replicate(char(9),((b.maxthedatalen/8+1)*8-(a.thelen/8+1)*8)/8+1)
      +'-- Some aligned data'
  from cte as a
       cross apply (select max(thelen) as maxthedatalen
                      from cte
                   ) as b
;

I wish I could think of an easier way :thinking:

Don't think so, because in effect the CHAR(9) just "rounds up to the next tab stop", rather than being a substitute for a fixed number of spaces.

Haven't tried that yet, but "split on TAB" looks promising, thanks.

However

you suggestion caused me to wonder whether this would work:

JOIN the individual characters to NUMBERS table (as DelimitedSplit8K presumably does) and then just "round up " the Number Positions to figure out "how many extra padding spaces would be added"

Bit short of time over the weekend, nice weather forecast and I want to crack on in the garden before Winter weather arrives ... plus we lose Summer Time soon - I'll take a look on Monday.

I found following problems with the query I suggested:

  • duplicates mess up calculation of max length
  • multiple tabs next to each other mess up things aswell

To correct this, try:

with cte1
  as (select row_number() over(order by (select null)) as id
            ,thedata
        from #temp
     )
    ,cte2
  as (select id
            ,thedata
            ,sum(l1+case when rn=1 then 0 else l2 end) as thelen
        from (select a.id
                    ,a.thedata
                    ,len(b.item) as l1
                    ,case
                        when len(b.item)=0
                        then 8
                        else ceiling(len(b.item)/8.)*8-len(b.item)
                     end
                     as l2
                    ,row_number() over(partition by a.id
                                                   ,a.thedata
                                           order by b.itemnumber desc
                                      )
                     as rn
                from cte1 as a
                     cross apply dbo.DelimitedSplit8K(a.thedata,char(9)) as b
             ) as a
       group by id,thedata
     )
select a.thedata
      +replicate(char(9),(floor(b.maxthedatalen/8)+1)-floor(a.thelen/8))
      +'-- Some aligned data'
  from cte2 as a
       cross apply (select max(thelen) as maxthedatalen
                      from cte2
                   ) as b
;

It keeps getting more and more ugly - sorry :frowning:

This appears to be working according to the description...

SELECT 
	TheData = CASE 
					WHEN s1.Space_1 < MAX(s1.Space_1) OVER () 
					 AND s2.Space_2 < MAX(s2.Space_2) OVER ()
						THEN STUFF(
								STUFF( d1.TheData, s2.Space_2, 1, REPLICATE(' ', MAX(s2.Space_2) OVER () - s2.Space_2 + 1))
								, s1.Space_1, 1, REPLICATE(' ', MAX(s1.Space_1) OVER () - s1.Space_1 + 1)
							)
					WHEN s2.Space_2 < MAX(s2.Space_2) OVER ()
						THEN STUFF( d1.TheData, s2.Space_2, 1, REPLICATE(' ', MAX(s2.Space_2) OVER () - s2.Space_2 + 1))
					WHEN s1.Space_1 < MAX(s1.Space_1) OVER ()
						THEN STUFF( d1.TheData, s1.Space_1, 1, REPLICATE(' ', MAX(s1.Space_1) OVER () - s1.Space_1 + 1))
					ELSE d1.TheData
				END
FROM
	#TEMP t
	CROSS APPLY ( VALUES (REPLACE(t.TheData, CHAR(9), ' ')) ) d1 (TheData)
	CROSS APPLY ( VALUES (NULLIF(CHARINDEX(' ', d1.TheData, 1), 0)) ) s1 (Space_1)
	CROSS APPLY ( VALUES (NULLIF(CHARINDEX(' ', d1.TheData, s1.Space_1 + 1), 0)) ) s2 (Space_2);

The output...

1234567
12345678
123456789
123456789.
123456789.1
123456789.12
123456789.123
123456789.1234
123456789.1234
123456789.12345
123456789.123456
123456789.1234567
123456789.12345678
123456789.123456789
A       9.123456789
AA      9.123456789
AAA     9.123456789
AAAA    9.123456789
AAAAA   9.123456789
AAAAAA  9.123456789
AAAAAAA 9.123456789
A       B       789
A       BB      789
A       BBB     789
A       BBBB    789
A       BBBBB   789
A       BBBBBB  789
A       BBBBBBB 789