Reverting or rollback on an Availability Group upgrade to new nodes

I have a question about rolling back from an Availability Group upgrade. I have searched the Internet and consulted the all-knowing LLM / AI / bot / clankers.

I thought I would see if someone else has experience before request new nodes and go through the pain of testing this.

The LLMs tell me it can’t be done, but in the same response they say two or three things I know are wrong. When prodded, it provides links and says the answer is “implied”. :man_shrugging:. The Internet has lots of instructions on how to upgrade, but no good articles on how to go back beyond backup and restore. (Which may be my answer :frowning:)

My plan

  1. I have a two node AG on Windows 2016 and SQL Server 2016
  2. I add two new nodes on Windows 2019 and SQL Server 2019
  3. I failover to one of the SQL Server 2019 thus upgrading the databases

So far so good. Until something goes horribly wrong and I have to go back. I assume the data in the new 2019 version of the databases cannot be brought back except by manual export/import.

I want to know about the state of the databases on SQL 2016 nodes. Could I

  1. Drop all databases from the AG
  2. Move the AG primary back to one of the SQL Server 2016 nodes
  3. Restore each database WITH RECOVERY to make them readable and writeable. At this point, the databases should be available on the old node with the Listener name and IP.
  4. Evict the two new nodes from the AG and cluster and put these newly restored databases back in the AG.

The LLMs are telling me that this can’t happen because failing to the new upgraded AG node will send enough log information back to the old SQL Server 2016 node that the databases cannot be recovered. But they already told me things I know are wrong so I’m skeptical.

Does anyone have any first-hand experience testing this?

1 Like

Once you fail over to an upgraded version, the data files are stamped with the upgraded file version and cannot be restored/attached/failed back to an earlier version of SQL Server. Nothing is sent back from a higher version AG replica to a lower version. You would need to restore from an earlier version’s restore chain.

The plan you laid out in steps 1-4 will work, but only with the 2016 restored to the point before failover/upgrade. You might not have to kick all the databases out of the AG, except maybe the 2019 nodes. I haven’t tried that, but as long as the version 2016 log is intact you can restore with recovery on any 2016 node and join the DB to the AG on any other node. Adding additional nodes would then work as normal, auto-seed, backup/restore, whatever you choose.

1 Like

You should treat the failover to SQL Server 2019 as a one-way cutover and design your rollback strategy before that point rather than trying to revert the AG afterward. The safest approach is to keep SQL Server 2016 as primary during the upgrade, add the 2019 replicas, and run them in sync for a validation/burn-in period; only fail over once you’re confident. Right before the cutover, take full and log backups and verify restores this becomes your rollback point. If something goes wrong after failover, restore those backups back to the 2016 environment. If preserving post-upgrade data is critical, then instead of rollback you’ll need a side-by-side fallback plan (e.g., replication, ETL, or export/import). In short, don’t rely on AG mechanics for rollback use validated backups for fallback and delay the failover until you’re sure you won’t need to go back.