SQLTeam.com | Weblogs | Forums

Please critique my code for file comparison and exclusion


#1

I am still fairly new to the coding world, but I also realize that just because I may have made something that works, it may not be the best way. So instead of feeling secure in the knowledge of a job well done and potentially building upon potentially faulty code under the pretense that it is perfect I ask for you to critique the hell out of it.

The code in question is a stored procedure I made to handle file loading. Because a number of the processes are not contained within the code I feel the need to explain them so it can be properly criticized. On a daily basis files are received in the form of .txt and .csv files. The .txt files are moved to the \dmserver\shared\polling\partsraw\ directory where they are automatically loaded to the system then copies of them are sent to the corresponding archive directory depending if they are a buy or a sell file. The .csv files are moved to the \dmserver\shared\polling\partsraw\handsfreesql\b4wash\csv\ where they are converted to .txt files and moved to the \dmserver\shared\polling\partsraw\ and subsequently loaded (then the .csv file is copied to the corresponding archive directory based on if it is a buy or sell file). For unknown reasons, some .txt and .csv files managed to skip the rails and end up in the corresponding archive directories without actually loading. The stored procedure in question captures the contents of the corresponding directory and compares it to the loading dates and if the file is newer sends it back to be reloaded. Then it checks the .csv files to verify if there is a corresponding .txt file and if not will send the .csv file out to be converted and therefore processed.

Also as a side note, when it mentions rp_filemove this is a table that functions like a variable entry batch file that moves, copies, or deletes files based upon the parameters inputted and is run every 15 minutes.

and here is the procedure:

ALTER PROCEDURE [dbo].[DMSP_ReloadFiles]
AS
BEGIN
SET NOCOUNT ON;
DELETE
FROM rp_filemove
WHERE filemove_system = 'ken'
TRUNCATE TABLE files
TRUNCATE TABLE filessell
TRUNCATE TABLE filesbuy
TRUNCATE TABLE buyfiledate
TRUNCATE TABLE sellfiledate
INSERT files (
	filename
	,depth
	,type
	)
EXEC master.sys.xp_dirtree '\\dmserver\shared\polling\archive\buy'
	,1
	,1;
INSERT filesbuy (
	dmnumber
	,filename
	)
SELECT LEFT(filename, 7)
	,filename
FROM files
WHERE type = 1
	AND RIGHT(filename, 4) = '.txt'
INSERT buyfiledate (
	dmnumber
	,filename
	,datepolled
	)
SELECT f.dmnumber
	,f.filename
	,max(p.datepolled) AS datepolled
FROM filesbuy AS f
INNER JOIN pollingdata AS p ON f.dmnumber = p.dmnumber
WHERE p.verbage = 'buy'
GROUP BY f.dmnumber
	,f.filename
ORDER BY f.dmnumber
DELETE
FROM buyfiledate
WHERE datepolled > getdate() - 1
	AND datepolled < getdate()
UPDATE buyfiledate
SET [system] = 'KEN'
	,sequence = '100'
	,directory = '\\dmserver\shared\polling\archive\buy\'
	,action1 = 'FileNameContains'
	,action2 = 'MoveFile'
	,parameter2 = '\\dmserver\shared\polling\partsraw\'
WHERE dmnumber <> ''
INSERT INTO rp_filemove (
	filemove_system
	,filemove_sequence
	,filemove_directory
	,filemove_action1
	,filemove_parameter1
	,filemove_action2
	,filemove_parameter2
	)
SELECT [system]
	,sequence
	,directory
	,action1
	,[filename]
	,action2
	,parameter2
FROM buyfiledate
WHERE dmnumber <> ''
TRUNCATE TABLE filesbuycsv
INSERT filesbuycsv (
	dmnumber
	,filename
	)
SELECT LEFT(filename, 7)
	,filename
FROM files
WHERE type = 1
	AND RIGHT(filename, 4) = '.csv'
UPDATE filesbuycsv
SET txt = '1'
WHERE dmnumber IN (
		SELECT LEFT(filename, 7)
		FROM files
		WHERE type = 1
			AND RIGHT(filename, 4) = '.txt'
		)
DELETE
FROM filesbuycsv
WHERE txt = '1'
UPDATE filesbuycsv
SET [system] = 'KEN'
	,sequence = '100'
	,directory = '\\dmserver\shared\polling\archive\sell\'
	,action1 = 'FileNameContains'
	,action2 = 'MoveFile'
	,parameter2 = '\\dmserver\shared\polling\partsraw\handsfreesql\b4wash\csv\'
WHERE dmnumber <> ''
INSERT INTO rp_filemove (
	filemove_system
	,filemove_sequence
	,filemove_directory
	,filemove_action1
	,filemove_parameter1
	,filemove_action2
	,filemove_parameter2
	)
SELECT [system]
	,sequence
	,directory
	,action1
	,[filename]
	,action2
	,parameter2
FROM filesbuycsv
WHERE dmnumber <> ''
TRUNCATE TABLE files
INSERT files (
	filename
	,depth
	,type
	)
EXEC master.sys.xp_dirtree '\\dmserver\shared\polling\archive\sell'
	,1
	,1;
INSERT filessell (
	dmnumber
	,filename
	)
SELECT LEFT(filename, 7)
	,filename
FROM files
WHERE type = 1
	AND RIGHT(filename, 4) = '.txt'
INSERT sellfiledate (
	dmnumber
	,filename
	,datepolled
	)
SELECT f.dmnumber
	,f.filename
	,max(p.datepolled) AS datepolled
FROM filessell AS f
INNER JOIN pollingdata AS p ON f.dmnumber = p.dmnumber
WHERE p.verbage = 'sell'
GROUP BY f.dmnumber
	,f.filename
ORDER BY f.dmnumber
DELETE
FROM sellfiledate
WHERE datepolled > getdate() - 1
	AND datepolled < getdate()
UPDATE sellfiledate
SET [system] = 'KEN'
	,sequence = '100'
	,directory = '\\dmserver\shared\polling\archive\sell\'
	,action1 = 'FileNameContains'
	,action2 = 'MoveFile'
	,parameter2 = '\\dmserver\shared\polling\partsraw\'
WHERE dmnumber <> ''
INSERT INTO rp_filemove (
	filemove_system
	,filemove_sequence
	,filemove_directory
	,filemove_action1
	,filemove_parameter1
	,filemove_action2
	,filemove_parameter2
	)
SELECT [system]
	,sequence
	,directory
	,action1
	,[filename]
	,action2
	,parameter2
FROM sellfiledate
WHERE dmnumber <> ''
TRUNCATE TABLE filessellcsv
INSERT filessellcsv (
	dmnumber
	,filename
	)
SELECT LEFT(filename, 7)
	,filename
FROM files
WHERE type = 1
	AND RIGHT(filename, 4) = '.csv'
UPDATE filessellcsv
SET txt = '1'
WHERE dmnumber IN (
		SELECT LEFT(filename, 7)
		FROM files
		WHERE type = 1
			AND RIGHT(filename, 4) = '.txt'
		)
DELETE
FROM filessellcsv
WHERE txt = '1'
UPDATE filessellcsv
SET [system] = 'KEN'
	,
SET sequence = '100'
	,directory = '\\dmserver\shared\polling\archive\sell\'
	,action1 = 'FileNameContains'
	,action2 = 'MoveFile'
	,parameter2 = '\\dmserver\shared\polling\partsraw\handsfreesql\b4wash\csv\'
WHERE dmnumber <> ''
INSERT INTO rp_filemove (
	filemove_system
	,filemove_sequence
	,filemove_directory
	,filemove_action1
	,filemove_parameter1
	,filemove_action2
	,filemove_parameter2
	)
SELECT [system]
	,sequence
	,directory
	,action1
	,[filename]
	,action2
	,parameter2
FROM filessellcsv
WHERE dmnumber <> ''
END

#2

First off, I wouldn't do this in a proc. I'd use SSIS

Since you're using a proc, one immediate thing is that multiple updates, e.g.

update filessellcsv set sequence = '100' where dmnumber <> ''
update filessellcsv set directory = '\\dmserver\shared\polling\archive\sell\' where dmnumber <> ''
update filessellcsv set action1 = 'FileNameContains' where dmnumber <> ''
update filessellcsv set action2 = 'MoveFile' where dmnumber <> ''

You're forcing SQL to read the table several times in a row. Better to use one update:

update filesselccsv set sequence = '100'
                                 , set direcgtory = '\\dmserver\shared\polling\archive\sell\' 
                                 , ... etc.
where dmnumber <> ''

#3

Wonderful, I have updated the code to reflect the suggestion of Gbritton as I have already applied it to my system.


#4

Another improvement:

TRUNCATE TABLE is generally much faster than DELETE FROM, as you have used. Also, it does not log the deletes.


#5

Strangely the "truncate" command was never covered in any of my SQL classes. Thank you again and I have altered the code to reflect the changes.


#6

Tip: See Poorsql.com for online formatting


#7

Cool, I've always hated having to go back and make my code legible. It's good to see that someone has taken the pain out of doing so.