Archiving old file when using a variable to pick file name

Hi
I've got a package that has a ForEach loop container with data flow inside it, at the moment the collection inside the loop is set to the variable I created for the file name - any .xls file, and is set to fully qualified using the path where the file resides.

It runs fine and picks up the name of the xls file placed in the path, however I want to now be able to archive the old file once the job is finished, I tried to put in a couple of source and destination variables, but the issue I am finding is that I need to populate another variable with the file name itself, not the whole path, for some reason if I change the setting in the for loop to have name only, it fails to run, I'm guessing that's because it can't find the full path at the start of runtime.

So is it possible to populate the file name only, from the initial filename it picks up, into a variable, once the full path has been mapped to the file name?

I'm trying to follow a guide below that I reckon should work, but it's just the variables I need to actually get the file name on it's own so I can then pass it to the archive directory.

SSIS Moving File Settings

name   : value
dstDir : D:\Target_folder\Old 
srcDir : D:\Target_folder

FileName : 
srcPath : this should be an expression concating srcDir **and FileName**
dstPath :this should be an expression concating dstDir **and FileName**


Maybe there is another way, if not, any suggestions for above set up?

Thanks

G

Since you already know what the srcDir and dstDir are, you can change the ForEach loop to get only the filename and extension. Then cat that to the filename when moving. Probably a good idea to set DelayValidation to True. I Used a script task to move the files since I don't trust the File System Task.

I don't use the file enumeration in the For Each container - I prefer building an object in a script task to parse a directory and then enumerate that object.

Next - I use project deployment so I have access to project parameters. In the project parameters I setup a root directory parameters (RootDirectory - String = UNC path to root folder) which points to the location where the files will be dropped.

In the For Each - the Enumerator is 'Foreach ADO Enumerator', the object created in the script task and 'Rows in the first table'.

The variables populated from the object are: User::Filename, User::ArchiveFileName, User::FileDate

I then have package variables defined:

ArchiveDirectory string = @[$Project::RootDirectory] + "Archive\"
ArchiveFileName string
FileDate datetime
FileName string
FullFileName string = @[$Project::RootDirectory] + @[User::FileName]
SortedFileList Object

With this - I then use a script task inside the for each. Assign read only variables: User::ArchiveFileName, User::FullFileName

The script has this code:

            bool fireAgain = true;

            // Get the source and destination files
            string fullFileName = Dts.Variables["User::FullFileName"].Value.ToString();
            string archiveFileName = Dts.Variables["User::ArchiveFileName"].Value.ToString();

            try
            {
                if (File.Exists(fullFileName))
                {
                    File.Copy(fullFileName, archiveFileName, true);
                    File.Delete(fullFileName);

                    Dts.Events.FireInformation(0, "Archive File Task", fullFileName + " archived to " + archiveFileName, string.Empty, 0, ref fireAgain);
                }
    
                Dts.TaskResult = (int)ScriptResults.Success;
            }
            catch (Exception ex)
            {
                Dts.Events.FireError(0, "Archive File Task", ex.ToString(), string.Empty, 0);
            }

The reason we copy and delete - instead of move - is because Windows will retain file settings on move and if the destination directory is compressed the file will not be compressed. When a file is copied - the file picks up the settings from the directory and will be compressed.

The key is creating package variables that use the root directory to build variables for the full path to the source file and the archive file name.

To use the same concept with the 'Foreach File Enumerator' - you set a project or package parameter to the root directory and use an expression in the foreach loop container to set the Folder property. Retrieve the name and extension only and set that into your variable User::FileName. Using the same variables I have set up - you can then use the file system task and specify the variables for the move option to archive the file.

1 Like

Or if you are not fully vested in ssis you could use scripting languages such as python or powershell script. Eliminates a lot of the ssis widgets and configuration

You could do that - but then you have to code all the 'widgets' yourself - including building a framework for parallel execution of tasks, logging of errors, etc... :wink:

1 Like

In the for each loop container I have the enumerator configuration set to the folder where the file sits. I have files set to "*.xls"

There is one expression in the foreach loop set up called "filenameretrieval" which has the variable @[User::xlFileName] in it, this file is set to blank in the variable definition. I'm not sure if this has to have a hard coded path to the destination, but does the folder path in the enumerator configuration not do that?

Variable mapping is set to index = 0

The problem seems to be with the excel connection manager,

Exception from HRESULT: 0xC020801C
Error at Package [Connection manager "Excel Connection Manager"]: SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80040E4D.

Error at Data Flow Task [Excel Source [2]]: SSIS Error Code DTS_E_CANNOTACQUIRECONNECTIONFROMCONNECTIONMANAGER. The AcquireConnection method call to the connection manager "Excel Connection Manager" failed with error code 0xC0202009. There may be error messages posted before this with more information on why the AcquireConnection method call failed.

I'm trying to load the file name only but it seems the connection is looking for a path, every time I put the path in to the excel connection it disappears back to an empty path.

I've got delay validation set to true on all objects,

If I go into the Excel Source and click on Columns a message pops up at the bottom saying:

Exception from HRESULT: 0xC020801C
Error at Package [Connection manager "Excel Connection Manager"]: The connection string format is not valid. It must consist of one or more components of the form X=Y, separated by semicolons. This error occurs when a connection string with zero components is set on database connection manager.

Error at Package: The result of the expression "@[User::varSourceFolder]" on property "\Package.Connections[Excel Connection Manager].Properties[ConnectionString]" cannot be written to the property. The expression was evaluated, but cannot be set on the property.

Error at Package [Connection manager "Excel Connection Manager"]: SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80040E4D.

Error at Data Flow Task [Excel Source [2]]: SSIS Error Code DTS_E_CANNOTACQUIRECONNECTIONFROMCONNECTIONMANAGER. The AcquireConnection method call to the connection manager "Excel Connection Manager" failed with error code 0xC0202009. There may be error messages posted before this with more information on why the AcquireConnection method call failed.

I had this working before but nothing seems to work now. My thinking on this is that I need 2 variables with source and destination, and one extra variable with the file name, then use a combination of source + Filename for picking up the file, then dest + Filename to move file into the archive.

But I seem to need a full path to define the variable file name, rather than just name only, but this wont then concatenate onto the variable for the destination because it would be like concatenating 2 file paths.

G

In the ForEach - you specify both the folder and the file pattern. You can use a variable to define the Directory in the Expressions in the Collection:

image

Set the Directory to an expression that defines where to find your files. Set the enumerator to pull name and extension.

Create a variable for the FullFileName as an expression using the variable/parameter defined for the directory + the variable returned from the for each loop container. Create a second variable for the archive directory - and a third variable using the archive directory + the variable returned from the for each for the archive file name.

Now - in the file system task...use the variable that defines the full file name as the source, and the archive file name as the destination - with either Move or Copy.

In the connection manager - use the variable for the full file name as the connection or something like this:

image

just a teaser with python. No widgets @jeffw8713 :wink:

import os
import shutil
archive_directory = 'C:\mcc\Projects\archive'
directory = 'C:\mcc\Projects\snipeit\IT'
for filename in os.listdir(directory):
    print(filename)
    
    shutil.move(os.path.join(directory, filename), 
os.path.join(archive_directory, filename))

Not really much different from the code I posted for the script task to archive files - this uses 'move' which is a bad idea if the destination is compressed, since the file would not be compressed after the move.

This does not import the file into a table - although that wouldn't be hard to do in either PS or python it does not easily provide for derived columns, conditional splits, merge, join or other transformations in SSIS. Nor would it be easy to setup a process that reads multiple files and unions/merges those files into a single destination - or easily allow splitting the data from a single source to multiple destinations...all of which would need to be hand-coded.

Yes - you are correct that you can do this in PS or python (or any language - including .NET), but there is functionality in SSIS that isn't easily reproducible.

yep just a teaser for the move part. a lot of the stuff you mention is doable in python though