DELIMITER $$
CREATE PROCEDURE sp_delete_data()
BEGIN
DECLARE i INT DEFAULT 0;
DECLARE loop_counter INT DEFAULT 0;
DECLARE retain_days datetime;
DECLARE delete_days datetime;
SET loop_counter=(SELECT ROUND(count(*)/100,0) FROM data2 WHERE datetime<(SELECT DATE_ADD(min(datetime),INTERVAL 1 DAY) FROM data2));
SET retain_days=(SELECT DATE_SUB(now(),INTERVAL 5 DAY)); -- 5 days data will keep
SET delete_days =(SELECT DATE_ADD(min(datetime),INTERVAL 1 DAY) FROM data2); -- check old data from table data2
WHILE i <= loop_counter DO
IF retain_days>delete_days THEN
DELETE FROM data2 where datetime<delete_days LIMIT 1000;
END IF;
SET i = i 1;
END WHILE;
END$$
DELIMITER ;
I want to keep only 5 days data and delete rest of the data if it is less than the retention date. Since each day data generate almost 2000000 rows that is why it is difficult to delete whole data by one shot. That is why I want to delete 100000 data in each loop. Here, loop_counter variable used to find how many loop we should use for this day data. retain_days variable define to find retention date delete_days variable define to find deleted date. based on retain_days and delete_days varoable data will retain and delete. Fnally this procedure will call by event every 1 day.
My code loop is not working as expected. Need expert solution. If is there any performance issue to delete data like this please let me know. Thanks In Advance
CodePudding user response:
Just create an event that runs once a day:
CREATE EVENT purge_old_data
ON SCHEDULE EVERY '1' DAY
STARTS CURRENT_TIMESTAMP()
ON COMPLETION PRESERVE
COMMENT 'Delete rows older than 5 days'
DO
BEGIN
DELETE
FROM data2
WHERE `datetime` < DATE_SUB(CURRENT_TIMESTAMP(), INTERVAL 5 DAY);
END;
You should not use a loop to delete smaller chunks. In SQL databases loops actually make performance worse. If you have to delete 100s of millions of rows in the first run it is really not a problem for MariaDB.
CodePudding user response:
PARTITION BY RANGE with each partition being, say, 2 hours' worth of data. Then DROP PARTITION will very rapidly drop data -- much better than DELETE.
More on using partitioning: http://mysql.rjweb.org/doc.php/partitionmaint
Alternatives: http://mysql.rjweb.org/doc.php/deletebig
In particular, the second link show how to 'continually' run through the data via the PRIMARY KEY, deleting 1000 rows at a time. Repeat when finished.
Note: The following is problematic:
DELETE FROM data2
where datetime<delete_days LIMIT 1000;
Without INDEX(datetime), it will spend much of its looking for any rows to delete. With such an index, there is still the overhead of bouncing between the index and the data 1000 times. In either case, the 1000 rows must be put into the redo log for the off chance of a crash.
