MariaDB Transaction Isolation Levels-CodePudding

I am facing a problem, and I am trying to wrap my head around the isolation levels. To understand these isolation levels, I've read the documentation found on the MariaDB website.

The base isolation level used by InnoDB tables is stated to be REPEATABLE_READ.

Consider the following problem. I have the following two tables structure:

/** tableA **/
id INT AUTO_INCREMENT PRIMARY_KEY

/** tableB **/
id INT
claimedBy INT NULLABLE

and also have a function, which pseudocode looks like this

/** should create a new row in "tableA", and update the rows from "tableB" which ids match the ones in the array from the input parameter, and that are null, to the id of this new row - in case the number of updated rows does not match the length of the input array, it should roll back everything **/
claim(array what) {
   - starts transaction
   - inserts a row into "tableA" and retrieve's it's id, storing it inside "variableA"
   - updates "claimedBy" to "variableA" on all rows from "tableB" that have "claimedBy" set to null and have "id" that is in "what"
   - counts the number of rows from "tableB", where "claimedBy" equals to "variableA"
     - if the count does not match the length of the "what" parameter, rolls back the transaction
     - if the count matches, commits the transaction 
}

My questions, which would help me understand isolation levels more concretly are the following:

In case two separate calls are made concurrently to this function which both have "what" arrays that intersect at any point, if I understand correctly, REPEATABLE_READ would prevent my data to become corrupted, because all the rows will be locked in the table as soon as the first update begins to perform, thus whichever function calls update is executed second, will be completely rolled back. Am I right in this? Based on the example on the official documentation it would seem like that rows are checked for the where condition and locked one-by-one. Is this the case? If yes, is it possible, that on concurrent calls to the function, both queries get rolled back? Or worse, is it possible that a deadlock would occur here?
In this concrete example, I could safely decrease the isolation level for the transaction to READ_COMMITED, which would also prevent the data corruption, but would not retain the locks for the duration of the update for rows that are not affected by the update, am I correct in this?
The lock retaining for manual TRANSACTIONS in MariaDB are for the duration of the query operation that create these locks, or for the duration of the complete transaction operation? (ie, until the transaction is either rolled back or commitd?)

FOLLOWUP QUESTION

Am I mistaken, that if using READ_COMMITED isolation, the following two concurrent calls could execute at the same time (without one, waiting for the lock of the other to be released), but not if REPEATABLE_READ isolation was used?

/** Session #1 **/
claim(array(1,2,3));

/** Session #2 **/
claim(array(4,5,6));

CodePudding user response：

There's very little difference between REPEATABLE-READ and READ-COMMITTED in the scenario you describe.

The same locks are acquired in both cases. Locks are always held until the end of the transaction.

REPEATABLE-READ queries may also acquire gap locks to prevent new rows inserted, if those rows would change the result of some SELECT query. The MySQL manual explains gap locks better, and it works the same way in MariaDB: https://dev.mysql.com/doc/refman/8.0/en/innodb-locking.html#innodb-gap-locks Regardless, I don't think this will be an issue.

I don't think you're at risk of a deadlock in the scenario you describe. Your UPDATE should lock all the rows examined. Rows are not locked one by one; the lock request is atomic. That is, if any of the set of examined rows cannot be locked because another session already has some or all of them locked, then the new lock request waits.

Once your UPDATE succeeds (locks are acquired and then the rows are updated), then your session has them locked and keeps them locked until the end of the transaction. Subsequently doing a count would reference only locked rows, so there's no way another session could slip in and cause a deadlock.

One subtle point about locking that you may not notice in the documentation: locking SQL statements act as if they are run in READ-COMMITTED mode, even if your transaction is REPEATABLE-READ. In other words, locks are acquired on the most recently committed version of a row, even if a non-locking SELECT query would not read the most recent version of that row. This is surprising to some programmers.

Re your comments:

I wrote a demo of the locking/nonlocking odd behavior in my answer here: How MVCC works with Lock in MySql?

Regarding releasing locks, yes, that's correct, in READ-COMMITTED mode, a lock is released if your UPDATE doesn't make any net change to the row. That is, if your update sets a column to the value that it already had. But in your case, you are changing values on rows that match your conditions. You specifically select for rows where the claimedBy is NULL, and you are setting that column to a non-NULL value.

Regarding your followup question, you don't have an index on the claimedBy column, so your query will have to at least examine all the rows. In READ-COMMITTED mode, it will be able to release the lock pretty promptly on rows that don't match the search conditions. But it would be better to have an index on claimedBy so it is able to examine only rows that match the condition. I would think it's better (if only by a slight margin) to avoid locking extra rows, instead of locking them and releasing the locks.

I don't think that transaction isolation is such an important factor in performance optimization. Choosing indexes to narrow down the set of examined rows is a much better strategy in most cases.