I have a script that will be clearing down 10 tables that will be run frequently.
However, 3 of these tables have tens of thousands of records and this causes the script to take over 2 and a half hours to run. This isn't good enough because it will be run during the workday to test a suite of scripts that are migrating data.
We have tried a few different approaches:
DELETE FROM x
Requires nulling some foreign keys but this is the most reliable way we have found. However, like I've mentioned, the runtime is far too long for our purposes.
- A mixture of
DELETE FROM xandTRUNCATE TABLE x
This is the fastest way we've found. Deleting tables as normal and then truncating the problematic tables. But this requires an explicit order to clearing down the tables. This is awkward in and of itself but our DB schema is also changing frequently which makes this cumbersome.
- Disable constraints and then
TRUNCATE TABLE x
We have also began writing a script to disable all PK/FK constraints related to the tables we would like to empty, truncate those tables and their child/sibling/parent tables, and then reenable those constraints. This seems like a good approach but is likely overengineering.
There is one further approach that I have considered but I'm not sure would it make a difference. This would be removing all FK constraints and adding a ON DELETE CASCADE to them. This might hopefully optimise approach 1. But I said I would throw it to SO first and see if anybody else has had to deal with this before.
CodePudding user response:
The TRUNCATE statement is way faster than the DELETE statement to empty a table.
In Oracle (as in PostgreSQL), there is a CASCADE option on TRUNCATE statement (as on DELETE statement), which might help about the "overengineering" part (when you want to disable FK, then reable them).
TRUNCATE TABLE x CASCADE;
CodePudding user response:
Truncating is better than deleting, especially for large tables. As you already found out, it is faster, but it also resets the high water mark so queries you run afterwards will run better.
Whichever option you choose (2 or 3) is OK with me, as long as you pick the one you prefer better. Knowing the order matters, so - yes, that's kind of tedious.
I wouldn't remove foreign key constraints; they exist for reason. If you do remove them, don't forget to recreate them afterwards.
You didn't say whether there are other tables involved in that schema. If not, perhaps you could even drop/create user and then recreate tables within the same script. Drawback: "normal" users can't do that, you'll need someone with DBA privileges.
