SQLTeam.com | Weblogs | Forums

Import text file


#1

Hi please help..
I have difficulties of importing a csv file from excel into sql server. The problem is the file has 2 kind rows.
first row have the header data and the second have detail transaction for example

'1002','2016-04-11','Bell,Mr','2','15.000'
'1002','M-001','1','10.000'
'1002','M-002','1','5.000'

note
first row is the header of the record 1002= transaction number,'2016-04-11' is transaction date, 'Bell.Mr' is customer name, ' 2' is number of record, '15.000' is total value.
second row and the third is the detail data.

Please tell me how to import it into sql server..

thanks,


#2

If it was me I would import it into a Staging table, with each column defined as VARCHAR(8000). I would add an IDENTITY column so that I could reference the rows in the order they were imported (i.e. Header Row followed by one/many Item Rows).

I would add addition columns for Type (and update the table so that was set to Header or Item) and perhaps also an Error Message column so that I could put a suitable message in that column for any data that was goofy. Then I would import the data, from the staging table, into the appropriate Header and Item tables.

I would either exclude any rows that had Error Messages from the import, put suitable, dummy, data in the column with goofy data, or not run the import at all if there were any rows with error messages (whatever makes more sense for the client in order to fix up the source data and re-run the import)


#3

do you have any example of doing this ?

thx


#4

Don't think I do, no.

  1. Create table with enough columns (at least as many as the row with the most columns). Include columns for IDENTITY, Type and ErrorMessage
  2. Import your XLS into that table (using whatever method you are most comfortable with - it will need to be a method that allows variable numbers of columns, per row)
  3. UPDATE the table to assign a value to the Type column. I suspect the easiest way to do that would be to reference the previous row, if COL1 (TransactionNumber) is the same value then it is a Detail Row, otherwise it is a Header Row.

Folk here can help with that if you aren't sure about that

  1. (THAT IS 4!! The useless formatting in this forum is changing it to a "1") UPDATE the ErrorMessage column for any goofy data. I would use multiple update statements for that, one for each data validation "rule".
UPDATE U
SET ErrorMessage = COALESCE(ErrorMessage, ''), + 'Invalid date [' + COL2 + ']. ',
    COL2 = NULL  --- Remove the goofy data (so that the row will not cause an error, IF you then import it)
FROM YourStagingTable
WHERE Type = 'Header'
AND IsDate(COL2) = 0 AND COL2 IS NOT NULL

UPDATE U
SET ErrorMessage = COALESCE(ErrorMessage, ''), + 'Customer Name too long [' + COL3 + ']. ',
    COL3 = LEFT(COL3, 29) + '+'  --- Shorten data which is too long
FROM YourStagingTable
WHERE Type = 'Header'
AND LEN(COL3) > 30 -- Max length for Customer name

... and so on ...

Then update the actual table

INSERT INTO YourHeaderTable(TransactionNumber, TransactionDate, ...)
SELECT COL1, COL2, ...
FROM YourStagingTable
WHERE Type = 'Header'
--    AND ErrorMessage IS NOT NULL --- Comment this in if you want to EXCLUDE any rows that have error messages

Repeat for YourDetailTable (but beware if you have excluded any HeaderTable rows that have error messages, as you should not insert DetailTable rows where there is no HeaderTable row!!


#5

thanks a lot