Another Easy Foreign Key Question

So if you've been following my posts, you know I have a slew (150 of them) of foreign keys spread across 76 tables. I need to purge old data in these tables and I CAN do that if I painstakingly delete them in the exact sequence needed.
My question is if I drop all FKs, then un my delete statements.... Will I have to create the FKs in a certain order/sequence? That would be just as painful as what I'm doing now with all FKs in place.
Thanks.

I have done the following at a previous job:

  1. Disabled the foreign key constraints (ALTER TABLE child NOCHECK CONSTRAINT FK_To_Parent)
  2. Deleted from the parent and child tables those IDs I no longer want
  3. Re-enabled the FKs (ALTER TABLE child WITH CHECK CHECK CONSTRAINT FK_To_Parent)

If I get FK violations, I run DBCC CHECKCONSTRAINTS on the affected tables to get all the affected rows, and handle accordingly, usually just delete from the child table(s). I've done variations on this in a 1400 table database with 2500+ foreign key constraints. (FYI, one table had over 300 FKs referencing it, and about half a dozen other tables had 100+ FK references)

One technique that might help is to create temp tables of each referenced (parent) table that had just an ID column, with a unique or PK constraint, and insert just the IDs you want to delete. You can then:

DELETE A FROM childTable A INNER JOIN #tempParent B ON A.ParentID=B.ID

Repeat that for all the child tables you want to delete, then join the temp to the parent when you're done. The point of disabling the constraints is that you can then delete tables in any order without getting a violation.

I wrote a script to generate a cascading FK delete, but I don't think I can post it (and it's spaghetti-Medusa code anyway, would need serious cleanup). You can find similar scripts on StackOverflow or possibly an article on SQLTeam.com if you go to the home page.

2 Likes

Also, depending on how much data you need to delete, you might want to consider inverting the operation: create new tables (possibly in a new database), and INSERT ONLY the rows you want to keep. FKs can be disabled as well, but it's usually a lot easier to write the correct sequence of INSERT statements.

You can then point your application to the new database, and then keep the old one as an archive. Any new data would be added to the new DB, and you'd repeat your archive task as needed.

If you're finding 40-50% of you data is going to be removed, inverting it to an INSERT will likely save you a lot more time. It's also friendlier to your transaction log, especially if you have availability groups, replication, or other log reading activities.

Another benefit to a new separate DB is that you can leave the index creation until after all the INSERTs are done, this can be significantly faster than if they exist at time of INSERT.

1 Like

" one table had over 300 FKs referencing it." Good Lord. Was that the day you took your profile photo? :slight_smile:

That photo is about 20 years old (taken Dec. 2001 or Jan. 2002). I WISH I had that job still. (well, it got bad after I left, so...)

If you think 300 FKs to 1 table is bad, there were child, grandchild, and great-grandchild tables that had multiple FK paths to the same parent table. So it wasn't enough to generate a single DELETE on a table, I had to enumerate all the lineage tables in between and JOIN them along the way to ensure all the proper rows were deleted. If you can imagine what a mess that would be, congrats, that's why I'm not posting the code. You'll just shoot yourself reading it.

I think the deepest path had seven generations, e.g. joining 7 tables between the temp and the child. There was 1 table that had something like five 6th generation paths, and a few 4th and 5th generation paths. For a total of about 300 tables to be deleted, there were over 700 DELETE statements to run.

Fortunately this kind of bad design is pretty difficult to stumble into, it's not a problem I'd expect other people to have.

1 Like