SQLTeam.com | Weblogs | Forums

SQL Compression vs Hardware compression when using flash storage for SQL 2016


#1

Where we were: Test and Prod environments on SQL Server 2012, using database compression on spindle storage.

We moved to Pure Storage earlier this year and left SQL compression on, since we didn't have the storage space or a third server to test how long changing over to hardware compression would take, or how much space we would lose or gain. We were also waiting for the first service pack for SQL Server 2016.

Now we are getting a third DEV environment, and the non-technical people are trying to decide whether to rely on hardware compression or use SQL compression. I am hoping someone here has experience with SQL Server 2016 on flash storage and can say which compression performed better.

Our VLDW contains a few terabytes of data from EMR systems, used to populate QlikView applications and SSAS tabular models.
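Before deciding, it helps to know how much of the warehouse is already SQL-compressed. A quick inventory sketch (run in the target database; schema and table names will be whatever exists in yours):

```sql
-- List which tables/indexes currently use ROW, PAGE, or no compression
SELECT s.name AS schema_name,
       o.name AS table_name,
       i.name AS index_name,
       p.data_compression_desc,
       SUM(p.rows) AS row_count
FROM sys.partitions AS p
JOIN sys.objects  AS o ON o.object_id = p.object_id
JOIN sys.schemas  AS s ON s.schema_id = o.schema_id
LEFT JOIN sys.indexes AS i
       ON i.object_id = p.object_id AND i.index_id = p.index_id
WHERE o.is_ms_shipped = 0
GROUP BY s.name, o.name, i.name, p.data_compression_desc
ORDER BY s.name, o.name;
```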


#2

My guess is that hardware compression would be faster. If you decide to try to compress log files, be sure to test that very carefully to make sure it truly is transparent to SQL Server.


#3

Thanks for the quick response. I could not find a good white paper on the Pure Storage site, but I did find one for EMC's XtremIO flash array. I assume they are somewhat similar. It looks like for our scenario you are right: hardware compression is the way to go, and it will save a lot of CPU cycles in addition to drive space.


#4

Exactly. I should have explicitly stated that. The reason MS's compression is much slower is its CPU usage.

If you don't have disk compression available, MS compression is definitely still worthwhile. But table rebuilds take a lot longer than they do without compression.


#5

Thanks. I think the plan is to go with hardware compression for now and then test SQL compression when we upgrade to 2016.


#6

In my admittedly limited testing, I've found that data compression in SQL Server is a source of overhead that isn't very effective. You're not going to get the wicked compression ratios you might with something like ZIP, and the performance was relatively bad. Hardware compression might be fine, but remember that unless the data arrives at the server in a compressed state and is decompressed at the server, you've gained nothing for transmission rates, which are a major part of performance.

I DO do compressed backups. Microsoft really got that right, and it's proof that compressing and decompressing at the server can be well worth it. I back up about 2 TB of databases from my production server, single-threaded and serial, in a couple of minutes under an hour. Restores are nearly as quick if you don't let transaction log files get out of hand (log files still need their VLFs formatted, which instant file initialization doesn't help with). That has made it much easier for me to do full restores of my most important large databases every night, both as a test of the backups and as an area to test emergency fixes.

I was told that the backup drives we use automatically do compression and deduplication, and that I shouldn't need any SQL Server compression. But that didn't help much for backup times, because the uncompressed data still had to go through the pipe, as it does with all SAN and NAS storage. When I shifted to SQL Server's backup compression, the duration of the backups dropped through the floor.
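For anyone trying the same comparison, backup compression is set per backup (or instance-wide via the `backup compression default` sp_configure option), and the achieved ratio is recorded in msdb. A sketch, with hypothetical database name and path:

```sql
-- Compress at the server so less data crosses the pipe to the backup target
BACKUP DATABASE YourBigDatabase                      -- hypothetical name
TO DISK = N'X:\Backups\YourBigDatabase.bak'          -- hypothetical path
WITH COMPRESSION, CHECKSUM, STATS = 10;

-- Check the achieved compression ratio from backup history
SELECT TOP (10) database_name,
       backup_size / compressed_backup_size AS compression_ratio,
       backup_finish_date
FROM msdb.dbo.backupset
ORDER BY backup_finish_date DESC;
```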


#7

Interesting. I've found data compression to be extremely good at reducing execution time for nearly all queries against large tables. We do have lots of idle CPU time available to do the compression/decompression.

For most large tables, page compression is worth it for us, and I use row compression on virtually every table.

I have noticed that rebuilds are much slower. That makes sense when you think about it, but it's still something to be aware of.
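For anyone weighing the trade-off described above, SQL Server can estimate the savings before you commit to a long rebuild. A sketch with hypothetical schema/table names:

```sql
-- Estimate PAGE-compression savings without touching the table
EXEC sys.sp_estimate_data_compression_savings
     @schema_name      = N'dbo',          -- hypothetical schema
     @object_name      = N'BigFactTable', -- hypothetical table
     @index_id         = NULL,            -- all indexes
     @partition_number = NULL,            -- all partitions
     @data_compression = N'PAGE';

-- If the estimate looks good, apply it; ONLINE = ON (Enterprise Edition)
-- keeps the table usable during the slow rebuild
ALTER TABLE dbo.BigFactTable
REBUILD WITH (DATA_COMPRESSION = PAGE, ONLINE = ON);
```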


#8

Question for you: what recovery model are you using in DEV? Are you restoring from prod often? Do you really need all of that data in DEV?


#9

That's encouraging. Thanks, Scott. Perhaps I need to revisit compression on the larger tables just to make sure I wasn't introducing some unknown element that made it look like compression was bad for us.


#10

It looks like I have a lot of testing to do.