So, I wasn't sure of the correct category to post this in, but I hope this is it!
Imagine the following:
A SQL Server instance with quite a few databases. It DOES get hammered by many different SQL Agent jobs and by external applications spread across the company and across teams.
Given the high usage, is there a way to take a before-and-after snapshot of the databases when a particular job runs? (The ones I have control over, at least!)
Since I am exploring this black box, I want a way of knowing the state of the data before and after the job runs, so I can see the delta.
Other than taking a backup before the particular job runs, I'm not sure of another way. I am worried, though, about this statement: "exploring this black box."
Why is it a black box? Or is that part of the forensic work you are doing, since you are trying to find what is causing a certain thing you are investigating?
Basically, what is your end game? Why are you doing this?
Likely easiest would be to use database snapshots, if you know the table(s) you need to compare.
(1) Snapshot the db;
(2) let the job run;
(3) snapshot the db again;
(4) compare data between the two(+) snapshots.
(!!) Be sure to drop the snapshots as soon as possible because they take up space (if/as data is modified) and thus have some overhead.
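A minimal sketch of those four steps, assuming hypothetical names throughout: a source database AppDb whose logical data file name is AppDb_Data, and a table of interest dbo.Orders:

```sql
-- (1) Snapshot the db before the job runs.
-- The NAME must match the source db's logical data file name (assumed here to be AppDb_Data).
CREATE DATABASE AppDb_Before ON
    (NAME = AppDb_Data, FILENAME = 'D:\Snapshots\AppDb_Before.ss')
AS SNAPSHOT OF AppDb;

-- (2) ...let the job run...

-- (3) Snapshot the db again.
CREATE DATABASE AppDb_After ON
    (NAME = AppDb_Data, FILENAME = 'D:\Snapshots\AppDb_After.ss')
AS SNAPSHOT OF AppDb;

-- (4) Compare. Rows updated or deleted by the job:
SELECT * FROM AppDb_Before.dbo.Orders
EXCEPT
SELECT * FROM AppDb_After.dbo.Orders;

-- Rows updated or inserted by the job:
SELECT * FROM AppDb_After.dbo.Orders
EXCEPT
SELECT * FROM AppDb_Before.dbo.Orders;

-- (!!) Drop the snapshots as soon as you're done with them.
DROP DATABASE AppDb_Before;
DROP DATABASE AppDb_After;
```

The EXCEPT pair is a cheap way to get the delta when the table has no convenient "last modified" column, which is common in undocumented 10+ year old schemas.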
Because of project discovery. Since there is no real documentation, and I don't know what tables/databases the jobs touch or the expected outcomes after a job, I am trying to untangle and understand the black box. That's why. Think 10+ year old system.
A snapshot uses zero space until/unless a page changes (which is why creating a snapshot is almost instantaneous). Only when a page changes does the original page get copied into the snapshot's sparse file, using space for that page.
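If you want to watch that copy-on-write growth happen, you can check how much disk the sparse file has actually consumed. A sketch, assuming a hypothetical snapshot named AppDb_Before:

```sql
-- size_on_disk_bytes reports the sparse file's real on-disk size,
-- which grows as changed pages are copied into the snapshot.
SELECT f.name, s.size_on_disk_bytes
FROM sys.dm_io_virtual_file_stats(DB_ID('AppDb_Before'), NULL) AS s
JOIN sys.master_files AS f
    ON f.database_id = s.database_id
   AND f.file_id = s.file_id;
```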
Thanks! I guess I need to look into this more, but ultimately I want to take a snapshot and then review it whenever I "get to it" (could be hours or days later), and maybe replicate/restore it in a lower environment.
Thanks, Scott. Yeah, this is the problem: there is pretty much only a prod environment and no lower environments, and due to other processing dependencies it's difficult to replicate the jobs in lower environments (the jobs are external to SQL but use SQL).
That's why I thought maybe in PROD, if we could take a delta (before and after) and then take that delta "offline" and restore it to a lower environment to see the data differences, that would help big time and nail down which job does what (as multiple jobs could overlap or run at the same time).
A diff backup would allow you to do that too. Make a diff backup; let the job run; make a separate diff backup.
Restore both diff backups separately to different db names and then do the comparisons you want.
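A sketch of the diff-backup route, with hypothetical names and paths throughout (AppDb, AppDb_Data/AppDb_Log logical file names, D:\ paths). Note a differential restore always needs the most recent full backup as its base, and taking a new full backup resets the differential base, so on prod you'd normally reuse the existing scheduled full backup rather than create one:

```sql
-- Diff backup before the job runs.
BACKUP DATABASE AppDb TO DISK = 'D:\Backups\AppDb_Before.dif' WITH DIFFERENTIAL;

-- ...let the job run...

-- Separate diff backup after the job finishes.
BACKUP DATABASE AppDb TO DISK = 'D:\Backups\AppDb_After.dif' WITH DIFFERENTIAL;

-- Restore each diff under its own db name: base full first WITH NORECOVERY,
-- then the diff WITH RECOVERY. MOVE relocates the files so the copies don't collide.
RESTORE DATABASE AppDb_Before FROM DISK = 'D:\Backups\AppDb_Full.bak'
    WITH NORECOVERY,
    MOVE 'AppDb_Data' TO 'D:\Restores\AppDb_Before.mdf',
    MOVE 'AppDb_Log'  TO 'D:\Restores\AppDb_Before.ldf';
RESTORE DATABASE AppDb_Before FROM DISK = 'D:\Backups\AppDb_Before.dif' WITH RECOVERY;

RESTORE DATABASE AppDb_After FROM DISK = 'D:\Backups\AppDb_Full.bak'
    WITH NORECOVERY,
    MOVE 'AppDb_Data' TO 'D:\Restores\AppDb_After.mdf',
    MOVE 'AppDb_Log'  TO 'D:\Restores\AppDb_After.ldf';
RESTORE DATABASE AppDb_After FROM DISK = 'D:\Backups\AppDb_After.dif' WITH RECOVERY;
```

Once both copies are online you can run the same EXCEPT-style comparisons between them, and they can live on a lower-environment instance.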
You might also try CDC (Change Data Capture), but the internal system procs you're forced to use by default are a royal pita. I ended up writing my own versions that make pulling the changes vastly easier and far, far less buggy.
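For reference, enabling CDC and pulling changes looks roughly like this (hypothetical names: AppDb, dbo.Orders; CDC also requires SQL Agent to be running, since agent jobs harvest the log):

```sql
USE AppDb;
EXEC sys.sp_cdc_enable_db;          -- enable CDC for the database

EXEC sys.sp_cdc_enable_table        -- enable CDC for one table
    @source_schema = N'dbo',
    @source_name   = N'Orders',
    @role_name     = NULL;          -- NULL = no gating role required to read changes

-- Later, pull everything captured so far for that table.
-- __$operation: 1 = delete, 2 = insert, 3 = value before update, 4 = value after update.
SELECT *
FROM cdc.fn_cdc_get_all_changes_dbo_Orders(
    sys.fn_cdc_get_min_lsn('dbo_Orders'),
    sys.fn_cdc_get_max_lsn(),
    N'all');
```

The upside over snapshots/backups is that CDC records every intermediate change with before/after values, not just the net start-vs-end delta, which helps when multiple jobs overlap.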
Excellent question.
Pretty simple: to know which tables are being touched and what the data changes are. This will help with analysis and expectations, to build a roadmap and have a much clearer understanding of the process than we have right now.
Sure, I would do that, but I thought there would be a better/more efficient way, more "to the point" of finding the data changes, since there are MANY databases and MANY tables.
I'm working on that. But even so, the setup process is going to be big, and replicating prod behavior is even harder due to external, out-of-company dependencies.