Peer to Peer replication locking issue on multiple deletes?

kayakingJeff · November 6, 2018, 4:19pm

About a year ago I asked about what the best way to delete large quantities of records while minimizing the number of locks on the table. The solution that I went with at the time was to use partitions to isolate the data that was being lopped off. Original Question Now I have a similar situation where I need to delete large quantities of records but now my table participates in peer to peer replication. Dropping/adding partitions are not allowed in this case. So what I've done is to have a stored procedure run periodically that deletes records in batches of a 1000. This had been working great up until a couple days ago. On a couple of the other servers that participate in the replication I'm starting to see timeouts on the table in question. My first thought was that a transaction was being kept open longer than necessary by one of the applications my users use. But then it happened again late at night over the weekend when I know there was no other activity on the system except for my deletion process.

So my first question is: When a row is inserted/updated/deleted in a table that participates in peer to peer replication, does the lock that is issued stay active until all the other servers report success? I typically see a latency of about 1 second between the data centers. We have 3 data centers that participate in peer to peer replication.

My second quest is: When I delete several batches of 1000 records at a time, does replication keep the deletions in batches of 1000 or does it lump them all together into a larger batch?

Or... am I barking up the wrong tree altogether?

Thanks,
Jeff