Best practice for writing with primary key constraint?-CodePudding

Before each Update/Insert statement, should I :

IF...EXIST to test the primary key
Just let a transaction fail if primary key is already there (and rely on @@rowcount if I have some logic related to primary key already being there)
TRY ... CATCH an error (raised by the Update/Insert statement itself or have a trigger test primary key and raise errors)
Other solutions ?

How do you write with primary key constraint ?

CodePudding user response：

My preferred method for single-row upsert is:

BEGIN TRANSACTION;

UPDATE dbo.t WITH (HOLDLOCK, SERIALIZABLE)
  SET ...
  WHERE [key] = @key;

IF @@ROWCOUNT = 0
BEGIN
  INSERT dbo.t ...
END

COMMIT TRANSACTION;

If you believe you will much more often be performing an insert, you can swap the logic around so you try that first:

BEGIN TRANSACTION;
 
INSERT dbo.t ...
  SELECT @key, ...
  WHERE NOT EXISTS
  (
    SELECT 1 FROM dbo.t WITH (UPDLOCK, SERIALIZABLE)
      WHERE [key] = @key
  );
 
IF @@ROWCOUNT = 0
BEGIN
  UPDATE dbo.t SET val = @val WHERE [key] = @key;
END
 
COMMIT TRANSACTION;

Some background:

CodePudding user response：

What you describe is often called an "UPSERT" (in case you need a Google term for further research).

We use MERGE statements, since they allow us to specify both actions in one statement.

However, the syntax is a bit complex and there are some gotchas (don't forget to use HOLDLOCK, etc.), so we have abstracted away the actual SQL generation into an InsertOrUpdate(table, fieldsAndValuesToUpdate, keyFieldsAndValues) helper method in our source code. This also allows us to change the implementation later, if required.

When writing SQL code manually, I use IF...EXISTS (inside a transaction and also with HOLDLOCK), since it's easier to read and easier to write.

CodePudding user response：

That depends on the situation.
Suppose you would write an insert with a where not exists clause and when @@rowcount = 0 you would do an update because this row already seems to exist.

If this is the most performant way to do it, that depends on your data.

if you would know that for example in 80% of the cases the insert would succeed, then this approach would actually perform very good.
If it seems that most of the times an update is needed, then you could turn the code around, do the update and then check the @@rowcount.

This only works off course if you can determine before you start if you will have mostly updates or mostly inserts.

The advantage of this method (certainly when you do update first) is that you do not need to check each row first with an if...exists first, you just do you insert/update and find out after if it worked or not. And because you know before that the insert or update will succeed most of the times, you gain performance