I have a two-part question I hope someone can assist me with.
Part 1
I have multiple .txt files that I am loading into a SQL table, and there is no indicator within the data as to which file each row came from (e.g., file1, file2, etc.). The first file loaded perfectly, but the second hit an error partway through and still loaded some lines. I know the line count of the first file, so the question is: can I delete all the rows that come after the last line of the first file, or does the new data from the second file just get mixed in with the existing data?
Part 2
To avoid this in the future, is there a way to add a column to the text file that indicates which file it is (for example, adding a File# column at the end)? That way I could delete the rows from a specific file number if I needed to. Unfortunately, the files are too big for me to open and add the column manually.
Mike, I checked out the link, but I am not that advanced in SQL. I did try to find other articles that were a little less advanced, but apparently all of those seem to be targeted at DBAs and other advanced users as well. Trying something on my own, I was able to add a new column (see below).
As you can see, when I got to the data source page I added a column and named it FileName, but I didn't know how to get it to populate that field with the file's name. When I run it, I get an error during execution.
It looks like you are using the Import Wizard from SSMS to import your files. If that is the case, what you should do is build the import in the wizard, but instead of running it immediately, save it as an SSIS package.
Once you have that, you can create a project in SSDT and add that package to it. From there you can modify how the package works: add a Derived Column transformation that appends a column with the data you want included (sketched below).
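In case it helps, here is a rough sketch of how that Derived Column transformation could be configured; the column name and file name below are placeholders, not anything from your actual setup:

Derived Column Name: FileName
Derived Column: <add as new column>
Expression: "file2.txt"
Data Type: Unicode string [DT_WSTR], length 260

A hard-coded string literal like "file2.txt" works when the package loads a single file. If you later loop over several files with a Foreach Loop container, you would map the current file into a package variable (for example, User::CurrentFileName) and use @[User::CurrentFileName] as the expression instead.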
If that is too much (for now), then import each file into its own staging table. Once both files have been imported, you can write a query against each staging table that includes your additional column(s), then use UNION ALL to combine them into a single result; see the example below.
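For the staging-table route, a minimal sketch might look something like this; the table and column names (File1_Staging, File2_Staging, CombinedData, LineData) are assumptions, not your actual schema:

-- Combine both staging tables into one target table,
-- tagging each row with the file it came from.
INSERT INTO dbo.CombinedData (LineData, FileName)
SELECT LineData, 'file1.txt' AS FileName
FROM dbo.File1_Staging
UNION ALL
SELECT LineData, 'file2.txt' AS FileName
FROM dbo.File2_Staging;

-- If one file's load goes bad later, you can back out just that file:
DELETE FROM dbo.CombinedData
WHERE FileName = 'file2.txt';

That FileName column also covers your Part 2: once every row carries the file it came from, a failed load only costs you one DELETE instead of re-loading everything.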
Thanks Jeff, and yes, I am using the wizard. That is only half of it though... I have an encrypted external hard drive that gives me less than 3 minutes to get things running after I enter the password. If the files sit idle for about 3 minutes, the hard drive locks down again. I actually have 20 files of 45 GB each that I need to load. This problem came about because I loaded the first file without an issue yesterday afternoon and started the second one before I went to sleep (that's how long each file takes to run). When I woke up, I saw that the load had stopped but had still written some rows to the table. Because I didn't have the file names, I had to delete all 108 million rows and start over instead of just backing out the data from the second file. I am not sure I would have the time to save it as an SSIS package. Also, is SSDT something I have to download?
Even more of a reason to save the package and run it from SSDT. Yes, you have to download Visual Studio and enable the data services tools, but that is a free download (Community Edition). Once it is installed, use the Visual Studio Installer to modify the installation and enable the SSIS tools. You then create a project using the Integration Services template and add the saved import/export package to that project.
Once you have that, it is very easy to make changes, and for new files you can just copy/paste to a new project. That saves you from having to reconfigure everything, and you can run the package directly from SSDT.
I was able to download Visual Studio, and there are some things I am not sure of, but I first wanted to take your suggestion to build the import in the wizard and save it as an SSIS package. However, I am getting the following error:
Thanks @yosiasz, but that's another very complicated procedure. My Visual Studio doesn't even look close to what's in that article and, as stated before, this is beyond my level. I appreciate the help from everyone, but I am going to try to figure something else out. The article says in its introduction that it is a true friend for DBAs and developers. I am neither of those and not at an advanced level. But again, thanks for the help; I got a few ideas from all the comments.