Deserialize SQL Server image field back in Excel format

Hello,
I have a SQL Server table that contains serialized Excel files, with 3 fields:

IdDocument -> int (PK)
DataFile -> image
FileName -> nvarchar

where DataFile contains the Excel file serialized, and FileName the name of the file (with path).

Something like this:

0xD0CF11E0A1B11AE100.....
U:\SAP_R3V4_Validation_Documents\March2012.xls

Now I need to get these files back in Excel format.

How can I accomplish this?

Thank you in advance.

Luis

You can do this from the command line using the bcp utility:

You will need to use a format file, copy and paste the text below and save to an ANSI/ASCII (non-Unicode) txt file (e.g. excel.fmt)

9.0
1
1 SQLBINARY 0 0 "" 1 DataFile ""

You can then do the following using bcp (must all be on a single line):

bcp "SELECT DataFile FROM myTable WHERE IDDocument=1" queryout "U:\SAP_R3V4_Validation_Documents\March2012.xls" -S myServer -T -f excel.fmt

You would need to do some additional programming to dynamically assign the filename to each Excel file. If that's the case let me know, I'll post a solution here.

Yes, I should iterate through every Id in the table, and create the Excel with the name in the FileName column.
Then I can save them where I want, that's not a problem.
The important thing is to have the Excel files somewhere.
Maybe an aspx page, or a C# Console Application could be fine.

A C# console app could also work, but bcp is installed by default on your SQL Server, and can be downloaded and installed from the link I provided. It's much simpler than writing additional code.

If you go that route, or use an ASPX page, you need to ensure that you get the raw bytes from the image column and write them out exactly as-is. bcp by default will include some metadata, that's why the format file is required. I'm rusty on C# but I believe the StreamReader class will provide what you need.

Using the BCP road, how can I iterate for each Id?
I have 600 rows in the table, so I should restore 600 Excel files.
The DB name is:
SapProfile
and the server is localhost.

This bcp will create a text file of IDs and file names:

bcp "select IdDocument,FileName FROM SapProfile.dbo.myTable ORDER BY 1" queryout docs.txt -S localhost -T -c -t,

The docs.txt file will have both columns separated by a comma.

This next command will read that file and parse each column based on a comma delimiter, assigning the ID to variable %a and the file name to %b. It will then construct a bcp command to query for the correct document and write it to the correct file name:

for /F "tokens=1-2 delims=," %a in (docs.txt) do bcp "SELECT DataFile FROM SapProfile.dbo.myTable WHERE IDDocument=%a" queryout "%b" -S localhost -T -f excel.fmt

If you need to change the file path, you can use the following to extract just the file name:

bcp "select IdDocument,reverse(substring(reverse(FileName),1,charindex('\',reverse(FileName))-1)) FROM SapProfile.dbo.myTable ORDER BY 1" queryout docs.txt -S localhost -T -c -t,

You can then embed a fixed path in the subsequent command:

for /F "tokens=1-2 delims=," %a in (docs.txt) do bcp "SELECT DataFile FROM SapProfile.dbo.myTable WHERE IDDocument=%a" queryout "D:\MyNewPath\%b" -S localhost -T -f excel.fmt

Let me know if this is not clear.

I've tried this command:

bcp "select IdDocumento,reverse(substring(reverse(Nomefile),1,charindex('',reverse(Nomefile))-1)) FROM DBGestioneProfiliSAP.dbo.tblDocumento ORDER BY 1" queryout D:\SAPdocs.txt -S XSPW11P491F -U sanapp -P SQL01 -T -c -t

(with the real names)

but I receive this error:

SQLState = 28000, NativeError = 18456
Error = [Microsoft][SQL Server Native Client 11.0][SQL Server]Login failed for user 'PHARMA\PU129359'.

The PU129359 is my username that I use to connect via Remote Desktop to the server.

Maybe because I'm not an Admin in this server?

Luis

Now I've added my login to the server, but I got this error:

Starting copy...
SQLState = 37000, NativeError = 537
Error = [Microsoft][SQL Server Native Client 11.0][SQL Server]Invalid length parameter passed to the LEFT or SUBSTRING
function.

This one works:

bcp "select IdDocumento,Nomefile FROM DBGestioneProfiliSAP.dbo.tblDocumento ORDER BY 1" queryout D:\SAPdocs.txt -S XSPW11P491F -U sanapp -P SQL01 -T -c -t

and I get a txt file like this:

9 U:\SAP_R3V4_Validation_Documents\Revisione 2012\primo trimestre 2012\Arena profili semplici marzo 2012.xls
10 U:\SAP_R3V4_Validation_Documents\Revisione 2012\primo trimestre 2012\Bongiorni profili semplici marzo 2012.xls

The columns are separated by TAB, not by comma.
Is a problem?
If I manually change the paths in this file, could be enough all the same?

And how can I run the
for.....
statement in Command console?

Regarding the "separated by tabs", you need to have

-t,

... with the trailing comma at the end of the bcp command, otherwise it will default to tab as column separator.

Regarding the SUBSTRING error, this should avoid it:

bcp "select IdDocument,reverse(substring(reverse(FileName)+'\',1,charindex('\',reverse(FileName)+'\')-1)) FROM SapProfile.dbo.myTable ORDER BY 1" queryout docs.txt -S localhost -T -c -t,

My fault on that last one, I hadn't considered file names without a path.

Running the "for" command is done in any Windows command console, but NOT in a PowerShell command console. I DO NOT recommend trying to run these via xp_cmdshell either. There will be Start menu items for "Command Prompt" under "Windows System" or "Windows Administrative Tools" if they don't appear elsewhere.

If you can't find them on the Start menu, choose the "Run" option and enter "cmd.exe" without quotes.

1 Like

Regarding the "separated by tabs", you need to have

-t,

... with the trailing comma at the end of the bcp command, otherwise it will default to tab as column separator.

**** Yes, I've lost the comma, sorry for that (default behaviour is TAB, right). *******

Regarding the SUBSTRING error, this should avoid it:

bcp "select IdDocument,reverse(substring(reverse(FileName)+'',1,charindex('',reverse(FileName)+'')-1)) FROM SapProfile.dbo.myTable ORDER BY 1" queryout docs.txt -S localhost -T -c -t,

My fault on that last one, I hadn't considered file names without a path.

***** I'll try tomorrow. I'll let you know. ****

Running the "for" command is done in any Windows command console, but NOT in a PowerShell command console. I DO NOT recommend trying to run these via xp_cmdshell either. There will be Start menu items for "Command Prompt" under "Windows System" or "Windows Administrative Tools" if they don't appear elsewhere.
If you can't find them on the Start menu, choose the "Run" option and enter "cmd.exe" without quotes.

*** For sure I'd prefer not use PowerShell. I've just seen the option about using xp_cmdshell in SSMS but, ok, I'll try with the Command Prompt ****

THANK YOU for now Robert. Really appreciated.

Luis

I've tried this command from the Command Prompt:

for /F "tokens=1-2 delims=," %a in (D:\SAPdocs3Rows.txt) do bcp "SELECT FileData FROM DBGestioneProfiliSAP.dbo.tblDocumento WHERE IDDocument=%a"
queryout "D:\Excel%b" -S XSPW11P491F -U sanapp -P SQL01mil -T -f excel.fmt

where the folder D:\Excel is already present in the server, and the file SAPdocs3Rows has just 3 rows like these:

9,D:\Estrazioni\SAP_R3V4_Validation_Documents\Revisione_2012\primo_trimestre_2012\Arena_profili_semplici_marzo_2012.xls
10,D:\Estrazioni\SAP_R3V4_Validation_Documents\Revisione_2012\primo_trimestre_2012\Bongiorni_profili_semplici_marzo_2012.xls
11,D:\Estrazioni\SAP_R3V4_Validation_Documents\Revisione_2012\primo_trimestre_2012\Bongiorni_profili_composti_marzo_2012.xls

but I got this error, and I don't understand why:

SQLState = S1000, NativeError = 0
Error = [Microsoft][SQL Server Native Client 11.0]Unable to open BCP host data-file

Simplify more the filedata (only 1 row), I've got this error:

Starting copy...
SQLState = S1000, NativeError = 0
Error = [Microsoft][SQL Server Native Client 11.0]Format file could not be opened. Invalid name specified or access denied.
SQLState = S1000, NativeError = 0
Error = [Microsoft][SQL Server Native Client 11.0]Unable to resolve column level collations

BCP copy out failed

Now I have the file with just one row:

9,D:\Excel\Arena_profili_semplici_marzo_2012.xls

The name is:

D:\Excel\SAPdocs1Row.txt

and I run this command:

for /F "tokens=1-2 delims=," %a in (D:\Excel\SAPdocs1Row.txt) do bcp "SELECT FileData FROM DBGestioneProfiliSAP.dbo.tblDocumento WHERE IDDocumento=%a" queryout "%b" -S XSPW11P491F -U sanapp -P SQL01mil -T -f excel.fmt

and I got this error:

C:\Users\PU129359>for /F "tokens=1-2 delims=," %a in (D:\Excel\SAPdocs1Row.txt) do bcp "SELECT FileData FROM DBGestioneProfiliSAP.dbo.tb
lDocumento WHERE IDDocumento=%a" queryout "%b" -S XSPW11P491F -U sanapp -P SQL01mil -T -f excel.fmt

C:\Users\PU129359>bcp "SELECT FileData FROM DBGestioneProfiliSAP.dbo.tblDocumento WHERE IDDocumento=9" queryout "D:\Excel\Arena_profili_
semplici_marzo_2012.xls" -S XSPW11P491F -U sanapp -P SQL01mil -T -f excel.fmt

Starting copy...
SQLState = S1000, NativeError = 0
Error = [Microsoft][SQL Server Native Client 11.0]Format file could not be opened. Invalid name specified or access denied.
SQLState = S1000, NativeError = 0
Error = [Microsoft][SQL Server Native Client 11.0]Unable to resolve column level collations

BCP copy out failed

C:\Users\PU129359>

The Excel file is present, but is empty and could not be opened.

My original example assumed that excel.fmt file is in the same directory that you're running bcp from. If it's not, then include the full drive and path for it:

-f D:\path_to_format_file\excel.fmt

If that still doesn't work, make sure your account has permissions on the folder to read the format file. (which I would already assume if you were able to write the file, but never hurts to check)

1 Like

Here the solutions that works:

The more important aspect is to have a FMT file.
For Excel extractions, should be like this:

9.0
1
1 SQLBINARY 0 0 "" 1 FileData ""

and saved with a .fmt extension.

"FileData" is the name of the image/varbinary column in the SQL Server table, from which we need
to extract the Excel files.

Then I've used this script to extract the file name:

bcp "select IdDocumento,reverse(substring(reverse(Nomefile)+'',1,charindex('',reverse(Nomefile)+'')-1)) FROM DBGestioneProfiliSAP.dbo.tblDocumento ORDER BY 1" queryout D:\Excel\SAPdocsFileNames.txt -S -S XSPW11P491F -U sanapp -P SQL01 -T -c -t,

and save the file names, with IDs, in the TXT file "D:\Excel\SAPdocsFileNames.txt".

Then, from the Command Prompt, I run this BCP command:

for /F "tokens=1-2 delims=," %a in (D:\Excel\SAPdocs101-200.txt) do bcp "SELECT FileData FROM DBGestioneProfiliSAP.dbo.tblDocumento WHERE IDDocumento=%a" queryout "%b" -T -f D:\Excel\excel.fmt

Thanks a lot Robert, your help has been incalculable.

Luis

1 Like

Glad it worked for you.

Also note that this will work for any binary file stored in an image or varbinary(max) column in the database. I've extracted JPEG and PDF files using this technique. The same format file will work for them all, you may just need to change the column name in the file, or modify your bcp query to alias the real column name to the one in the format file.

1 Like