SQL Coding Standard - Things On The Right End Of The Line

Trying to pull JamesK and ScottPletcher back in, I agree with Kristen that losing things out at the end of the line is a problem, but without the ability to get tools, my solution is line continuation:

IF OBJECT_ID('tempdb..#Denials') IS NOT NULL
DROP TABLE #Denials
GO
CREATE TABLE #Denials
(
	rejection_code				CHAR(10)		NULL,
	rej_cd_description			CHAR(40)		NULL,
	rsn_ds					CHAR(40)		NULL,
	payment_date				DATETIME	NULL,
	rejected_amount			MONEY		NULL,
	co_insurance_amount			MONEY		NULL,
	total						MONEY		NULL
)
INSERT INTO #Denials
(
	rejection_code,
	rej_cd_description,
	rsn_ds,
	payment_date,
	rejected_amount,
	co_insurance_amount,
	total
)
SELECT
	Tpb483.rejection_code					AS [Rejection Code],
	Tpb483.rej_cd_description					AS [Rejection Code Description],
	Tpb483.rsn_ds							AS [Rejection Reason],
	Tpb484.payment_date					AS [Payment Date],
	ISNULL(Tpb484.rejected_amount, 0)			AS Rejected,
	ISNULL(Tpb484.co_insurance_amount, 0)		AS [Co-insurance],
	(SELECT "Total" =
		ISNULL(Tpb484.noncovered_amount, 0) + 
		ISNULL(Tpb484.rejected_amount, 0))		AS [Total Rejection]
FROM
	TPB484_ER_FILE_PATIENT_HISTORY		AS Tpb484
		ON Tpb484.ivo_int_id		= Tpb311.ivo_int_id
		AND Tpb484.plan_int_id	= Tpb311.plan_int_id
		AND Tpb484.pyr_seq_no	= Tpb311.pyr_seq_no
	JOIN 
		TPB486_REASON_HISTORY_DETAIL	AS Tpb486
		ON Tpb484.er_file_pat_hst_int_id = Tpb486.er_file_pat_hst_int_id
	JOIN
		TPB483_ER_REJECTION_CODES		AS Tpb483
		ON Tpb486.rsn_cod_int_id = Tpb483.er_rejection_codes_int_id
		AND Tpb483.rej_cd_description <> '16'
WHERE
	Tpb483.rej_cd_description		= 'DENIED'
	AND 
		ISNULL(Tpb484.noncovered_amount, 0) + 
		ISNULL(Tpb484.rejected_amount, 0) > 250.00

Although ragged when pasted in, the aliases are set past the closing parenthesis at an exact distance that is easily viewable on a 'small' monitor. I align the equals signs and the assignment column (tab is my friend!), and explicitly define everything for readability. I admit that the 'INSERT INTO' block may be overkill, but six months from now, when a new feature needs to be added, I will be less confused.


Not to me :smile: We include every column in the table, with the unused ones commented out. The column-list is mechanically generated, and we can easily freshen it up (e.g. if/when the DDL changes).

We do the same with SELECT's column lists if they include "most" of the columns. It is as important to me to see what is NOT there, as what is included :smile:

Personally I don't see a benefit in aligning the AS alias - to my mind it goes with the column on the left, so I would want it close to that column to avoid accidentally following the wrong line across the white-space; but if the reason is to ensure that all lines have one, then I can see that it could catch an error / non-conformance.

Couple of observations:

  1. We prefix ALL tables (and SProc calls etc.) with "dbo." as it used to be the case that doing so prevented the optimiser from checking whether the same object existed in a schema matching the user's connection ID (or somesuch). Is it still the case that there is some (slight??) performance benefit from that, I wonder? A similar reason applies for not prefixing SProc names with "sp_", because SQL checks MASTER for those before checking the current DB.

  2. Curious why you have

		...,
	(SELECT "Total" =
		ISNULL(Tpb484.noncovered_amount, 0) + 
		ISNULL(Tpb484.rejected_amount, 0))	AS [Total Rejection]

instead of just

		...,
	ISNULL(Tpb484.noncovered_amount, 0) + 
		ISNULL(Tpb484.rejected_amount, 0)	AS [Total Rejection]

and, for me, I would have the "+" at the left-end of the second line (again, I see things like that getting lost at the Right-End and I might overlook a MINUS that should be a PLUS or similar)
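i.e. with the operator leading the continuation line, the same expression would read:

```sql
	...,
	ISNULL(Tpb484.noncovered_amount, 0)
		+ ISNULL(Tpb484.rejected_amount, 0)	AS [Total Rejection]
```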

On another point:

We only use left-side commas, for continuation lines, in a few instances, favouring right-side for them. I find the need to comment out the first line of a list of columns is as frequent as for any other line! So I haven't found that left-side commas are a gain, but I'd be interested in other views.

However, we do have templates which have things like:

    PRIMARY KEY
    (
        MyPKeyColumn1
        , MyPKeyColumn2
        , MyPKeyColumn3
        , MyPKeyColumn4
    )

A global Find&Replace substitutes each MyPKeyColumnN for its actual column name, and then we just need to delete / comment out the unused ones. PITA to have to go through the code to do it, and for anything "frequently used" we set up some mechanically-generated code instead, but that takes a lot longer to write than just a template, and a lot longer to maintain if the template needs to change (we do improve things when we can :smile: ), so our infrequently-ish used templates are "manual".

I have seen the entire table's columns with those not in use commented out and like that style. The only reason I do not use that is a stored procedure may only use ~ 8 tables, but at least three will have 100+ columns and I have to type them in manually. Maybe you can tell me how you mechanically generate column lists (script table creation?).

I remember back at the turn of the century, I aligned the AS with the SELECT, FROM, and WHERE. The only excuse I have for out to the right and aligned is that when parsing the two in house programmers' code, they had the aliases embedded in the block of code that ran on from line to line and I parsed it there the first time by accident. I had to parse their code just to read it. The aligning was part of the same process of 'over parsing' to see what was going on. I may return to the alignment I used to use now that you point that out.

1.) Good tip. I tried it once, had errors to sort out, and removed it to make the code as simple as possible; I never added it back once fixed. From my previous reading, the performance gain is measurable. As for sp_, I/we use the hospital acronym (OMH_). I was happy to follow their style for that very reason. Their older stored procs (2-8 years old) started with either sp_ or usp_ (different authors).

2.) There was some sort of bug that I resolved by adding the "Total" at the beginning, and I never had time to go back and figure it out. I have avoided the left-side comma solely for readability for non-programmers looking at the code. I don't mind the few extra keystrokes for the first and last lines; commenting out the rest of the lines is just as simple. I wish this hospital did have some standards and used templates, but as an outside programmer 'horning in' from their owning company, I walk on glass just to keep the connection open (I fail to be able to log in 3 out of 5 days) and am careful not to push the unionized in-house programmers. I do send them links to articles dealing with optimization, coding style, and technical coverage of T-SQL features like temp tables.

I hate the practice of putting all the column names in, commented out. That's a huge waste of SQL proc buffer space, making the entire instance run more slowly. Just stop that, it's horribly inefficient.

...not to mention, if the schemas change, who changes the commented column names? No one, I'll bet.

In my case, schema changes always involve a new table and the old one falling into disuse. The vendor needs to maintain the code base for previous clients.

In SSMS you can drag the [COLUMNS] label in the object tree onto the Query Window to get a list of columns. Don't think you have any control over the format though ...

We have an SProc, with a short easy-to-type name (KLC - Kristen's List Columns, natch), which displays a table and its columns in various suitable formats - both single-line comma-delimited and one-column-per-line. We have metadata which provides descriptive names for columns, which are included as comments. The SProc will take partial table, or column, names: if specific to a single table it will display that table, otherwise a list of possible matches. It also handles VIEWs, and if no match on any of those it will display SProcs with matching names - and, again, if a single SProc matches, display its parameter list.

We tend to retain a comment in the code adjacent to the FROM statement such as

SELECT ...
-- KLC MyTable2
FROM MyTable1
    JOIN MyTable2
        ON T2_ID = T1_ID
...

it is easy enough to double-click to select a different table name, then double-click and paste to replace the parameter in the KLC line, and highlight and execute that snippet to get a column list. We deliberately don't include quotes around the parameter - except when we want additional parameters - which makes it easier to fiddle with; our table and column names are A-Z, 0-9 and "_" only, so no word-break stuff to have to deal with :100: We have an additional @CMD parameter that delivers the column list in different, less frequently used, ways such as

@ColName1 = ColName1,
@ColName1 = @ColName1,
@ColName1, @ColName2,

and "INSERT" which generates an

INSERT INTO
(
    ColList
)
SELECT
    [Col1]='xxx20',
    [Col2]=99,

etc., which indicates the size and datatype of the target column, making it easy to cut & paste actual columns, or CONST data.

Code is proprietary to the Metadata tables we use, otherwise I'd publish the SProc for anyone interested. Perhaps better, if you like the idea, to knock up your own as it can evolve to meet your needs.
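For anyone knocking up their own, a minimal sketch of just the column-listing part, using the standard INFORMATION_SCHEMA views (the metadata lookups, partial-name matching and VIEW/SProc handling described above are left out, and @TableName is an illustrative parameter):

```sql
-- Minimal column-listing sketch (the real KLC SProc does much more)
DECLARE @TableName SYSNAME = 'TPB484_ER_FILE_PATIENT_HISTORY'

-- One column per line, ready to paste into a SELECT / INSERT list
SELECT '    ' + COLUMN_NAME + ','
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = @TableName
ORDER BY ORDINAL_POSITION

-- Single-line, comma-delimited version
SELECT STUFF((
    SELECT ', ' + COLUMN_NAME
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE TABLE_NAME = @TableName
    ORDER BY ORDINAL_POSITION
    FOR XML PATH('')
), 1, 2, '')
```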

We pass ALL our SQL code through a compaction tool before putting it into production, as I consider every space and every comment a waste of buffer space. I've never done any performance tests, but surely it has to be slower to parse verbose code than compact code? Plus I don't intend to make it easy for anyone snooping on our code to help themselves. Perhaps WITH ENCRYPTION is more robust now ... it used to be trivially easy to reverse engineer.

On the contrary, we see it as hugely efficient as it reduces the number of bugs in the code during DEV

Correct, it's usually only relevant when the code is written, or significant changes are undertaken (at which time the list would be refreshed), so it is valid (and maintained) during a code development cycle (6-months-ish) only.

As I mentioned, this is not something we do every time, only when the majority of the columns are selected, as we think the exceptions are useful to see - i.e. we see it as Defensive Code that may highlight a bug.

We version the table - so the original table becomes MyTable_V2 and a view is created called MyTable which provides backwards compatibility. We usually locate all INSERT/UPDATE code and fix that to reference the new table, but SELECTs referencing the old table will often only get updated when that Sproc is changed for some other reason.
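A minimal sketch of that versioning step (table and column names are illustrative only):

```sql
-- Rename the original table to its versioned name (illustrative names)
EXEC sp_rename 'dbo.MyTable', 'MyTable_V2'
GO
-- Backwards-compatibility view exposing the old shape under the old name
CREATE VIEW dbo.MyTable
AS
SELECT
	MyColumn1,
	MyColumn2	-- columns added to MyTable_V2 are simply not exposed here
FROM dbo.MyTable_V2
GO
```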

Love that idea! Home-grown?

Yeah, VBScript I think ... it's got a bug, because we have 3 or 4 SProcs that have a big comment at the top saying "Don't compact this one" !!

We wrote it as I couldn't find anything out there, but I'm pretty sure I've seen some stuff out there quite recently - the last 6 months or so

Let me have a go at Googling that for you :wink:

Darn! It would be a tool I have no access to!

"sql source code minifier" seems to be the right sort of search phrase for Google.

Anything along the lines of "SQL Compaction" seems to wind up with Data Compression which is no use.

And "SQL Compressor" got me nowhere useful either

http://poorsql.com/ seems to do a reasonable job. However, it splits lines at close-to a specific point. Mine retains all the original line break positions - which gives me a chance of debugging it in that state if I have to

Original 3,036 bytes - not lavishly commented
My compaction tool 1,704 bytes
PoorSQL 1,678 bytes

http://codebeautify.org/sqlformat puts the code all on one line, including "GO", which obviously doesn't work ... (Result = 1,754 bytes)

Unless I'm doing something wrong?? This put the result all on one line, comments and all, so the first comment would inactivate the rest of the line ...

http://www.webtoolkitonline.com/sql-minifier.html

You still using Query Analyser? That does it too (but you have to open the COLUMNS tree arm before it will work - you can close it and then drag, so I guess it just has to have cached the column metadata).

Otherwise you'll just have to knock up a column listing SProc :smile:

In case of interest we had an interesting discussion on SQL Source Code compression / compaction / minify in the old forum:

http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=138560

No, since I am working remotely, they have given me sharply defined permissions. I ran profiler one time on a large result set and crashed their server (low memory and no TempDB ceiling). Their server is the production server with the 'test' environment packed alongside. I no longer have access to even profiler.

In my defense, the stored proc I was building was required to return the entire history of every patient since the purchase of the hospital two years ago, and even with temp tables retrieving only 25 columns out of 180, there were still several million rows to report. The attempt at using the profiler was to reduce the run time and memory footprint.

What do you use to write SQL? Maybe just an editor of some sort?

Sounds like it might be worth having a copy database locally, even if no data (assuming there is sensitivity over that being off-site), so that column names and the like are available to you.

I could do with some suggestions for how to test performance of a Minified Sproc.

I am thinking just to change every white space to a much larger one, and add a long comment to each line.

Would that do, or is there something better?

How to actually test? I can get a Parse Time from SET STATISTICS - is that my best option?

Maybe I need to

DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE

between each test, otherwise maybe the query plan is cached and there will be no parse time?

I can also use SET STATISTICS for the runtime, and a DATETIME2 to calculate the elapsed time.

Are those my best options?
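A minimal sketch of such a harness, assuming a throw-away test server (dbo.MySproc is a placeholder; the DBCC commands should never be run on a shared/production box):

```sql
-- Flush cached data pages and plans so the next EXEC must parse/compile
-- from scratch (do NOT run these on a shared/production server)
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE
GO
SET STATISTICS TIME ON	-- reports parse/compile time and execution time
GO
DECLARE @Start DATETIME2 = SYSDATETIME()

EXEC dbo.MySproc	-- placeholder for the minified / verbose SProc

SELECT DATEDIFF(MICROSECOND, @Start, SYSDATETIME()) AS ElapsedMicroseconds
GO
SET STATISTICS TIME OFF
```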

Interesting conversation - just one note:

I always change the tab settings in SSMS to replace tabs with spaces - and modify the size to 8 characters. 8 characters will line things up nicely for SQL whereas 4 just doesn't work (for me).

Converting to spaces helps ensure that when someone views my code - they view it the way I meant it to be viewed, and not all over the place because they have different tab settings.

Of course, if they have changed the font it might not quite line up...but still better than tabs that don't line up.

So your files will be bigger than mine (retained TABs) ... will be interesting to see if there is any discernible performance difference once the whitespace has been removed.

My gut feeling (with absolutely no basis in fact - pure speculation) is that, it will show no discernible difference in performance. My experience has been that the biggest bang for the buck comes from optimizing the queries to minimize data retrieval/sorting and that everything else has very little (orders of magnitude less) impact.

So when I write code, I write it the way that seems best and easiest to maintain/read, with absolutely no regard to how much space the code is taking. Then again, I work with puny databases - my biggest database is only 40GB - and I am the only database guy. So there is no one to point a finger at me and say "you are screwing up here".

If the Procedure is taking seconds to run then for sure any parse will be trivial by comparison, but parse is mostly pure CPU. Every system I have ever worked on has had improved compile/parse times from compressed source code, because the parser doesn't have to step over the whitespace and comments; unless computer science has moved on, there is no easy way to "know" where the next keyword is in the file - it is found by walking the characters in the line sequentially, ignoring any spaces etc. I agree that it is not a reason to work with poorly formatted code, but if it turns out to make even a modest saving it would be a reason to consider compressing the source code prior to deployment. For some, such as me, some obfuscation is welcome too.

That said, right now I'd just like to make a performance test, and I'd appreciate any ideas folk have as to the best way of doing that.