I am working on 2 tables in AWS- Aurora which is using my-sql as driver. Lets call two tables as Table A and Table B.
- Table B is my main source of truth.
- Table A is mostly out of sync with Table B.
So, I started updating my Table A using left join with Table B. I started running my update statement in a cron job. It was working all smooth until I got to know that update is creating deadlocks which is hampering my complete data pipeline.
I am looking out for solution to sync my Table A:
- Where I can avoid deadlocks on Table A.
- Also I would like to avoid making any change to data pipeline.
CodePudding user response:
deadlock is caused by two or more transactions running at the same time. And the reality is, you can do a lot to reduce deadlocks, but in a practical sense with a lot of different kinds of activity, eliminating deadlocks completely isn't generally possible.
The only way to deal with deadlocks is to retry any transactions that fail due to a deadlock.
Now, if your update is causing transactions to be rolled back in the existing application and you don't want to change that, the best bet is to keep the number of rows you update to a minimum in each statement. If you update one row at a time, they are unlikely. Of course, that is slow.
Now, you could use lock table to lock the tables before running your cron. That will cause everything else to wait and possibly cause performance issues, but it would work.
You can do a select ... for update to lock the rows you are planning to update. But I suspect you are trying to do all rows at the same time so you'd effectively be locking the entire table anyway.
