Home > OS >  Concurrency issues on adding entries to db
Concurrency issues on adding entries to db

Time:01-11

We have a table called USER_INVOICE where the user_id, invoice_id and invoice_status are inserted. The table has pk constraint. Status of the invoice can be ACTIVE or CLOSED. An user can have just one ACTIVE invoice at any moment, but more than one CLOSED invoice. Since there is no way of adding a unique constraint to the table, we created a temp table called LOCK_TABLE that creates a new lock entry before adding a new invoice and releases the entry lock after the insert of the invoice and the operations on it are done.

Lately, I saw that there are still some duplicate entries somehow added to USER_INVOICE for some requests that are concurrent. The user clicks on add invoice multiple times and it creates the lock, adds the active invoice, then a constraint violation on LOCK_TABLE is displayed with error:

"HHH000010: On release of batch it still contained JDBC statements"

and another active invoice is added afterwards. I have not figured out how the second entry to invoice is added.

CodePudding user response:

You need to check the so-called transaction isolation level of your database, and, of course, use transactions.

There are 4 levels.

You probably want the TRANSACTION_SERIALIZABLE, but be aware of what that means. There are only 2 ways to do serializable, kinda like how you can't go faster than the speed of light (it's not a matter of "nobody has written something better", it's a matter of: "That's fundamental to the universe"):

  • Lock a lot of things, even when just selecting. Meaning, your app is going to be incredibly slow.
  • Use optimistic locking. Which means retries.

Here's what optimistic locking means:

When you start the transaction, no other transactions are locked out. Every thread just does its thing. When you commit() the transaction, the DB goes back and checks if you were to go back in time and act like you never ran, and then you run every DB statement you've run in this transaction again, if you would get the same results, then it's okay. If you don't, then the commit fails, indicating back to the code: Just.. do it again. This is called retry.

Sounds nuts? Don't knock it - it powers ethernet and is generally much, much faster than the 'lock all the things' approach!

In other words, given an ATM and a bank teller in the building, and simultaneously, you run this operation on both sides:

  • User wants €50,-
  • start transaction
  • Fetch user's balance
  • Subtract 50 euros from it
  • Write this adjusted balance back to the DB
  • Commit transaction
  • Hand over €50,-

Then only SERIALIZABLE is safe here. Either we lock all the things, or we use optimistic locking. With optimistic locking, they both simultaneously do their thing, but one of them will fail their commit as the 'fetch user's balance' operation would no longer be returning the same result as it did.

The solution is that you need to capture all your transactions in a closure block, because your DB 'runner' needs to catch the SQLException that means 'retry it', and then start the transaction from the top again. It also means all such code needs to be idempotent (it may run more than once). For example, if you spit out €50,- as part of it, that is not idempotent (spitting out 50,- 10 times in a row is different from doing it once. Contrast to writing 'hello' into the same row in a database 10x over and over again - the end result is no different from doing that once). There's a reason I put the hand over the 50,- outside of the transaction - it would be a money-costing bug that can be abused to swindle the bank for millions if you put it inside!

So, if you did not start out with SERIALIZABLE, you can't 'just' introduce it, because doing so will either break a ton of stuff (at first, SQLExceptions in the log that should have been addressed by retrying instead, and if you wrap all your db code in loops, problems where the code you wrapped wasn't idempotent), or make things slow as molasses.

The alternatives, however, are tricky. As you discovered, testing that you've truly 100% protected your app from ever messing up is very difficult to do (testing concurrency is always very very difficult). A lot is lost in translation between those 4 levels and what a DB actually does (for example, the JDBC abstraction has no idea what retry is, so to write code to detect that some SQLException is a retry requires custom code for every DB engine!).

Thus, go back to your DB documentation and figure out how to do this safely. For example, at lesser levels, SELECT * FROM foo FOR UPDATE (the FOR UPDATE part) makes read/write locks, and you probably have to use that when you run your SELECT statement on the LOCK_TABLE.

To summarize the advice:

  • Read up on SERIALIZABLE and consider rewriting all DB interactions into retry blocks, for example, see The homepage of JDBI which right off the bat shows how a serializable db interaction model works (the .withHandle(db -> {...}) part). Read up on what dirty, phantom, and non-repeatable reads are. Understand what SERIALIZABLE is for even if you aren't ready to change your codebase to switch to that.
  • No matter what you do, everything is a transaction (there's a reason its called .setAutoCommit() and not .setTransactions() - if you run without transactions, newsflash: You are, the DB is just injecting a commit after every statement).
  • Unless switching to SERIALIZABLE is an option, look into SELECT .. FOR UPDATE for the read on the lock_table and check your DB engine's docs about this sort of thing

CodePudding user response:

An user can have just one ACTIVE invoice at any moment, but more than one CLOSED invoice. Since there is no way of adding a unique constraint to the table

Well there is a way using the fact that Oracle ignores nulls in the index, you may define a functional based index as follows:

create unique index USER_INVOICE_UX1 on USER_INVOICE( decode(invoice_status,'ACTIVE',user_id));

Note that in the rows where the status is 'ACTIVE' the index key is mapped to the user_id, for other status 'CLOSED' is will be nulland therefor ignored (not indexed) - this allows the uniqueness limited to the active state.

Full Example

create table USER_INVOICE 
(user_id int, 
invoice_id int,
invoice_status varchar2(10),
CONSTRAINT CHK_Status CHECK (invoice_status in ('ACTIVE','CLOSED'))
);

alter table USER_INVOICE add primary key (invoice_id);

create unique index USER_INVOICE_UX1 on USER_INVOICE( decode(invoice_status,'ACTIVE',user_id));

-- user_id (1) can have more CLOSED invoices
insert into USER_INVOICE(user_id,invoice_id,invoice_status) values (1,1,'CLOSED');
insert into USER_INVOICE(user_id,invoice_id,invoice_status) values (1,2,'CLOSED');

-- but only one ACTIVE invioce
insert into USER_INVOICE(user_id,invoice_id,invoice_status) values (1,3,'ACTIVE');
insert into USER_INVOICE(user_id,invoice_id,invoice_status) values (1,4,'ACTIVE');
-- ORA-00001: unique constraint (XXXX.USER_INVOICE_UX1) violated
  •  Tags:  
  • Related