Multiple Databases

Hi All,

I want to set up several client databases and one master database, where the master collects all data from all clients.

Connected to each client database is an application. In this application a user can update, insert, or delete records. A client is not always connected to the master. All clients have the same tables, but each is installed on a different computer.

I want to connect each client to the master, which gathers all data. Then, after updating the master, each client can update itself with records that were altered by other clients.

I'm torn between merge replication and just writing queries manually to merge the data. But what about priorities, if the same record is edited on both clients?
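By "queries manually" I mean something like the statement below, one per table. The table and column names are just an example, and it assumes the client's rows are first copied into a staging schema on the master:

MERGE dbo.Orders AS tgt
USING staging.Orders AS src
    ON tgt.OrderID = src.OrderID
WHEN MATCHED AND src.LastModified > tgt.LastModified THEN
    UPDATE SET tgt.CustomerName = src.CustomerName,
               tgt.Quantity     = src.Quantity,
               tgt.LastModified = src.LastModified
WHEN NOT MATCHED BY TARGET THEN
    INSERT (OrderID, CustomerName, Quantity, LastModified)
    VALUES (src.OrderID, src.CustomerName, src.Quantity, src.LastModified);

Deletes seem to be the tricky part with this approach, because a row deleted on a client is simply missing from the staged data, which is part of why I'm also looking at merge replication.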

How do I prevent concurrency errors/deadlocks?

In the master the data cannot be edited. I want to solve this in the database, not in the applications.

Can someone guide me in the right direction, or recommend a good book or online course? I have followed a Udemy course, but it didn't cover this enough.

This statement:

"Then, after updating the master, each client can update itself with records that were altered by other clients."

Seems contradictory to this:

"In the master the data cannot be edited."

Also, regarding SQL Server: it uses a system database named "master", so to avoid confusion you might refer to this database of yours as "central" or something similar. I'm going to use the term "central".

You mention replication, and that the databases are maintained on separate systems. Are these databases distributed as, for instance, separate offices of one customer? Do these offices or locations need to operate independently, but then synchronize their data changes to a central location or database?

We will need these questions clarified before we can give a solid answer. Part of your post suggests a data warehouse, loading data from different sources. But other parts suggest a hub-and-spoke database architecture, where each component can originate writes.

Hi Robert,

Thank you for your answers.

The clients are all local databases installed on laptops used on-site. On-site, a connection to the 'master' is not always possible.

The user can edit records (insert, update, and delete) in several tables in the local database.
When they come back to the office, the database needs to synchronize with the 'master'.
At the same time, two other users can have done the same, which may result in conflicts, but we can assume they have not altered the same data.

The 'master' gathers all the data from the three clients and is purely used for data storage.

Please let me know if you require more info,

Thanks. Since any of your database instances can originate writes, this scenario is probably best served by merge replication.

"The 'master' gathers all the data from the three clients and is purely used for data storage."

Again, I'll use "central" instead, for my own sanity. Based on this description, it sounds more like a data warehouse or repository. If your users won't be modifying data in this central database, and it's only written by synchronization operations, you have other options besides merge replication, such as SSIS packages or Change Data Capture (CDC). However, merge replication could still be used to sync it with the remote instances, as and when they can make a connection.
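For example, with CDC you'd enable change capture on each client table and then pull the captured changes into the central database with SSIS or a scheduled job. A minimal sketch, using a made-up dbo.Orders table (run on the client database):

-- Enable CDC for the database (requires sysadmin) and for one table
EXEC sys.sp_cdc_enable_db;

EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'Orders',
    @role_name     = NULL;   -- NULL = no gating role required to read changes

-- Later, read everything captured so far for this table
DECLARE @from_lsn binary(10) = sys.fn_cdc_get_min_lsn('dbo_Orders');
DECLARE @to_lsn   binary(10) = sys.fn_cdc_get_max_lsn();
SELECT * FROM cdc.fn_cdc_get_all_changes_dbo_Orders(@from_lsn, @to_lsn, N'all');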

On the other hand, I'm not seeing how this central database would matter, other than as a redundant copy, possibly for reporting. It wouldn't be necessary for a merge replication topology.

Regarding your question about deadlocks and concurrency: since the remote instances can't directly communicate, and users are not updating the central database directly, these are unlikely to be a problem. Merge conflicts are different from deadlocks and typically will not block multiple processes.

As for merge replication conflicts, you'll have to decide on a policy for handling them; the best way to start is to familiarize yourself with the built-in conflict detection and resolution features.

There's no magic or shortcut for handling conflicts. If the default handling won't cover your needs, you'll have to look at the more advanced options.
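To give you a starting point, here's roughly what the publication side looks like. The database, publication, table, and column names are placeholders, and you'd still need to run the Snapshot Agent and create a subscription on each laptop. The default resolution is priority-based (the publisher wins over subscribers); the resolver shown below is one of the built-in alternatives.

EXEC sp_replicationdboption
    @dbname  = N'FieldData',
    @optname = N'merge publish',
    @value   = N'true';

EXEC sp_addmergepublication
    @publication = N'FieldDataPub',
    @retention   = 14;   -- days a laptop may stay offline before its subscription expires

EXEC sp_addmergearticle
    @publication      = N'FieldDataPub',
    @article          = N'Orders',
    @source_owner     = N'dbo',
    @source_object    = N'Orders',
    -- built-in "later wins" resolver: the most recent change (per LastModified) wins
    @article_resolver = N'Microsoft SQL Server DATETIME (Later Wins) Conflict Resolver',
    @resolver_info    = N'LastModified';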

We have a similar setup: the central database has an interface to SAP and receives the SAP data. This data is pushed to all local DBs through transactional replication. Deliveries are created locally and are sent to the central server every 15 minutes using merge replication. Once they are on the central server, the SAP interface sends the deliveries to SAP. It has worked like a charm for 5 years now.
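In case it helps: the 15-minute upload is nothing fancy, it's just the schedule of the Merge Agent on each local server's pull subscription. Something like this (server, database, and publication names are placeholders, and it assumes the pull subscription itself has already been created):

EXEC sp_addmergepullsubscription_agent
    @publisher                 = N'CENTRALSRV',
    @publisher_db              = N'CentralDB',
    @publication               = N'DeliveriesPub',
    @frequency_type            = 4,   -- daily
    @frequency_subday          = 4,   -- sub-day unit: minutes
    @frequency_subday_interval = 15;  -- run every 15 minutes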