SQLTeam.com | Weblogs | Forums

Pros / Cons of Database Backup Compression

compression

#1

Any reason NOT to use SQL's inbuilt Backup Compression?

Seems to me it saves disk space and may reduce performance of the backup, but I suspect that in practical terms the performance issue is non existent?

From what I have read (nothing easily found that was recent) the concensus was to turn it on and forget about it, with possible exception of databases storing Image/PDF/etc data or using Transparent Data Encryption.

For me the gains would be saved disk space (not a current issue ...) and tape space (much more of a benefit) plus the dramatically (I presume) reduced time if we ever had to retrieve from tape and then restore. I gather that restore-from-compressed-backup-file-on-disk is faster (less I/O) which is definitely an advantage, not that we restore often - but when we do its normally DEVs on TEST servers and they'll only sit twiddling their thumbs during the Restore :slight_smile: so its actual money-saved. Reduced backup time (i.e. I/O reduction) is not of particular interest to me (or so I think??) as backup [well ... "big backups"] happen out-of-hours and on dedicated backup-only devices.

Only other PRO that occurs to me is that our backups are synchronous - one database after the other - so when the backups are big their is a delay before the last databases in the list gets serviced.

Makes me wonder why Compressed is not now the default setting in SQL install ...


I want to take a database backup automatically and to zip the database automatically
Log shipping and log backups with truncate
Backup compression
#2

You can make it the default:

USE master;
GO
EXEC sp_configure ‘backup compression default’, '1';
RECONFIGURE WITH OVERRIDE;

#3

Thanks - is if the default in YOUR shop? (and if not then it begs the question "why not"? :slight_smile: )


#4

Frankly I don't know. We have thousands of instances from 2005 through 2014 and I'm not on the team that manages new deployments.


#5

I'll be interested to hear what others do.

It just strikes me that I haven't considered this until now. It become available without massive licence ("Enterprise") cost in SQL2008R2 (I think) and strikes me that there is no reason NOT to use it and, as such, it could be turned on by default on new installs (as .docX is the default SAVE AS type in Word, rather than the original .DOC).

I'm guessing that there are lots of shops, like mine, merrily upgrading through the versions (heading for SQL2014 here) who overlook gaining a benefit like this.

An install option "MS Recommend changing your current setting" would do me just fine :smile:


#6

It improves performance time and has huge storage savings. We have it as the default on our servers.


#7

We backup without compression to another local computer (seated on secondary fast network, to avoid network "jam" for normal users), which then compress using 7-zip, distributes compressed archive to various cloud locations and finally puts archive in queue for next tape backup session (together with other archives to be backed up).

We found 7-zip produces much smaller archives than sql server with compression (mssql 2008 r2). Maybe compression is better on newer versions (I haven't tested).


#8

Interesting, however post-processing to e.g. ZIP requires I/O bandwidth to create the original (full size) backup in the first place. I've read that that actually uses more CPU than compressing (i.e. within SQL) and thus writing a smaller file all in one step. I suppose you also have the extra downtime of decompressing the ZIP file before restore can start - with the Business Manager marching up and down in your office ... sweating !!!

But if maximum compression is the goal then you need the best compression tool you can get. I expect that SQL is doing a compromise of minimal-CPU at the risk of inefficient compression. When I have used ZIP compression tools with the "maximum compression" settings they take much longer, and usually achieve very little extra compression - in fact if I have a huge file to ZIP-and-transfer I usually use the FAST compression method as the filesize for transfer is not that much bigger, but the time saved on compression (and decompression :slight_smile: ) is significant.

There used to be a tool (forgotten the name now) which was installed as a Driver. Given the location of SQL backups (and BCP output folders etc.) it intercepted the file generated by SQL and compressed it on-the-fly, and similarly decompressed it when read. It was able to optionally use the ZIP file format, so compressed backup files could be used on other servers by just de-compressing with a ZIP tool. Maybe it's died a death now, superseded by the in-built compression etc., it must have been 2010, maybe earlier, when I knew of it. But that still suffered from needing the same I/O bandwidth as an uncompressed backup


#9

How big are your databases? The backup time savings is huge on VLDBs. I forget what the ratio is, but I believe that we were getting at least 50%, maybe even 75%, faster backups. And the storage savings was around 90%.

Regarding the restores, we use SAN snapshots and avoid backups/restores. We can have a VLDB "restored" in just a few minutes to a test environment.


#10

The only possible reason I could think of to not use native backup compression would be if your server was already extremely tight on CPU time, which nowadays should be extraordinarily rare.


#11

I admit my solution might not be optimal with our current setup, but at the time it was implemented, it was the best I could come up with, taken the demands from management and customers into account. Price of storing in cloud and speed of wan were the "killer" issues at the time.

Having changed much to our environment, I will have another look into our setup.

@Kristen: the size of our databases are, what most databases experts will call small. The IO is on "dedicated" secondary lan and the setup capacity of the full backup, are well within acceptable margins. But you are absolutely right about the restore process - its much more difficult. Thankfully we only used restore when setting up test/demo (which is not that often).

@TaraKizer: speed is not the only factor - at least it wasn't for us at the time of implementation. But as our environment have changed a lot, it is definitely worth exploring(re-evaluating) our setup. Our SAN is currently due for retirement, and only used by fileservers. A new SAN (faster and bigger) is on my top 5 wishlist, in order to also have Exchange and SQL Server data moved to SAN. But we can't avoid backup, due to requirments from our customers.

But first, its vacation time (at least for me) :smile: Have a nice vacation when your time comes :sunny:


#12

I'm not saying to avoid backups. They are absolutely necessary. I'm saying we don't use the backups when we need a copy of the production database in a non-prod environment. We use SAN snapshots for that. We still have backups.

Backup speed is a huge factor for VLDBs. Backup compression is a must.


#13

I did a COMPRESSION / NO_COMPRESSION comparison.

Biggest database I have on the server we are testing is only 5GB. No activity on that server, so LOG should be "empty" and not growing and Server shouldn't be doing anything else at the time.

COMPRESSION : 23 seconds - Filesize: 1,231,308,288 bytes
Processed 697416 pages for database 'MyDatabase', file 'MyDatabase_data' on file 1.
Processed 4 pages for database 'MyDatabase', file 'MyDatabase_log' on file 1.
BACKUP DATABASE successfully processed 697420 pages in 23.221 seconds (234.640 MB/sec).
--
NO_COMPRESSION : 39 seconds - Filesize: 5,714,305,536 bytes
Processed 697416 pages for database 'MyDatabase', file 'MyDatabase_data' on file 1.
100 percent processed.
Processed 2 pages for database 'MyDatabase, file 'MyDatabase_log' on file 1.
BACKUP DATABASE successfully processed 697418 pages in 38.142 seconds (142.849 MB/sec)

(Not sure why LOG is 2 pages smaller in the second backup?? - there was definitely NO intervening LOG backup)

I make that a 78% disk saving, 39% time saving, 64% improvment in apparent MB/sec write speed


#14

Found a bigger database (35GB) on a client's SQL2008R2 machine

Their server is absolutely rubbish - their IT people have hooked it up to a SAN with no dedicated spindles and thus physical media shared with Exchange and gawd knows what else ...

Its middle of the night here, but I expect plenty of other processes running batch tasks ... but here are the figures. The database is a Transfer Database, running in SIMPLE Recovery Model. batch processes may be populating tables from BCP imports etc. If I have time I'll re-run this during the day, during the weekend, when the server is relatively quiet

Non-Compressed - 35,255,873,024 bytes

Processed 4303336 pages for database 'MyDatabase2', file 'MyDatabase2_data' on file 1.
Processed 283 pages for database 'MyDatabase2', file 'MyDatabase2_log' on file 1.
BACKUP DATABASE successfully processed 4303619 pages in 1189.981 seconds (28.254 MB/sec).

Compressed - 4,843,690,496 bytes

Processed 4303552 pages for database 'MyDatabase2', file 'MyDatabase2_data' on file 1.
Processed 22 pages for database 'MyDatabase2', file 'MyDatabase2_log' on file 1.
BACKUP DATABASE successfully processed 4303574 pages in 707.470 seconds (47.523 MB/sec).

86% disk saving, 41% time saving, 68% improvement in apparent MB/sec write speed


#15

Just ran a test and using db compression gives me 86% disk saving, where as 7-zip gives me 96%.
I don't even want to try measure time, as I know the 7-Zip solution is way slower.
So 7-zip wins focusing on size (which we did).


#16

I did an experiment. Sample Size of One, so not sure how much worth it is ...

Database Backup file 5GB (5,714,305,536)

Size Time(s) Ratio MB/s Method  
---- ------- ----- ---- ------- 
87%     68    1.62 10.1 Fastest
88%     42    1.00 15.3 Fast
90%    538   12.81  1.0 Normal
91%    585   13.93  0.9 Max

(Got bored so didn't run Ultra. All other 7Zip parameters left at Default)

On my hardware, at least, Normal or Max is 13x the elapsed time for just a couple of percentage points more disk space saved. Hence my earlier thought that for a Zip-and-Copy the "Fastest" (which wasn't! on this test run) or "Fast" method is preferable


#17

I don't disagree with you :slight_smile: , but in my case (back then) size mattered VERY much. Time, not so much as we have a big "wiggle room"/enough ressources to manage.
If I were to set it up from scratch, with todays environment, I most likely would choose the db server to do the compression.


#18

Good discussion. Good recommendations. I cannot think of a good reason to not use compression in my experience. I cannot add much more, other than disc space and backup time is future money saved. It may not be a problem yet, but often will be eventually.


#19

We use commvault to backup our SQL databases ... Commvault uses deduplication to reduce backup size :

http://documentation.commvault.com/commvault/v10/article?p=features/deduplication/c_deduplication_overview.htm

Commvault best practice section states "To improve the SQL backup performance with deduplication, ensure that the Native SQL backup compression option is 'OFF' on the Advanced Backup Options dialog box." :

http://documentation.commvault.com/commvault/v10/article?p=products/sql/best_practices.htm

I guess this is one possible situation where it is best to have compression disabled as a default.


#20

I think backup compression is a must, even if you have mid-size database. The benefit of the backup and restore time saving is huge, as it just a little option you have to check.

many other companies might use 3rd party software for backup, eg. Litespeed, redgate etc. I did a comparison using native and redgate here - http://www.sqlservercentral.com/blogs/denniss-sql-blog-1/2015/07/02/backup-compression-native-vs-redgate/

I think MS should have a standard post configuration document for having this as default :slightly_smiling: