I'm currently in a scenario where we have a clunky, old, monolith of a database powering one or two of our less user-friendly systems (this is controlled by a third-party, so I can't change that).
Going forward, I'm looking to push the necessary data into a new (better structured) database and implement a mechanism to keep the data in sync between the two.
One of the quirky traits of this old database is that, instead of just using null values, it uses empty strings and treats them as if they're null.
In the new database, I want to use nulls where there's no data (because I can't think of any good reasons not to).
My issue is when I'm pulling the data from the old database to the new one (using MERGE statements), I'm using a NULLIF([myCol], '') to check there's actually any data - and just treating it as a null if not.
For example, when syncing the data for students I'm planning on doing something along the lines of (the name of the databases and schemas isn't relevant to the question so I'm anonymizing the structure a bit):
USE [NewDB];
DROP TABLE IF EXISTS #myTempTable;
SELECT *
INTO #myTempTable
FROM [OldDB].[schemaName].[Students];
MERGE [schema].[Person] p
USING #myTempTable s
ON TRIM(s.STUD_ISN) = p.StudentDBID
--WHEN MATCHED and not equal, update
WHEN NOT MATCHED
THEN INSERT (
[Surname] ,
[PreferredSurname] ,
[FormerSurname] ,
[Forename] ,
[MiddleNames] ,
[PreferredForename] ,
[DoB] ,
[Gender] ,
[GenderIdentity] ,
[Title] ,
[Deceased] ,
[DeceasedDate] ,
[StudentDBID])
VALUES (
NULLIF(TRIM([s].[STUD_Surname]), '') ,
NULLIF(TRIM([s].[STUD_Preferred_Surname]), '') ,
NULLIF(TRIM([s].STUD_Former_Surname), '') ,
NULLIF(TRIM([s].[STUD_Forename_1]), '') ,
NULLIF(TRIM([s].[STUD_Forename_2]), '') ,
NULLIF(TRIM([s].[STUD_Known_As]), '') ,
[s].[STUD_DOB] ,
NULLIF(TRIM([s].[STUD_Gender]), '') ,
NULLIF(TRIM([s].STUD_Gender_Identity), '') ,
NULLIF(TRIM([s].STUD_Title), '') ,
[s].STUD_Deceased ,
[s].STUD_Date_of_Death ,
TRIM([s].STUD_ISN)
);
Now, obviously this current query only deals with the following scenario:
The old database has data and the new one doesn't
Because I'm having some grief with the logic for the following scenario:
The new database has the corresponding record, but it needs updating.
For the text based fields there's 3 different scenarios where the "new" data would need updating:
- Neither value is null/empty, but they're not equal
- The record in the "old" database is null/empty and the "new" one isn't
- The record in the "new" database is null and the "old" one isn't
I was hoping to do something relatively simple like:
WHEN MATCHED AND (
(NULLIF(TRIM([s].[STUD_Surname]), '') IS NOT NULL
AND p.Surname IS NULL)
OR (NULLIF(TRIM([s].[STUD_Surname]), '') IS NULL
AND p.Surname IS NOT NULL)
OR (NULLIF(TRIM([s].[STUD_Surname]), '') != p.Surname)
) -- do the same for the other columns
THEN UPDATE SET
Surname = s.STUD_Surname
But, NULLIF(TRIM([s].[STUD_Surname]), '') IS NOT NULL is apparently not valid?
Is my best option to just switch out the NULLIF for an ISNULL in the WHEN MATCHED clause... Or am I missing something that'll make it more efficient?
CodePudding user response:
You can compare the two values like so:
COALESCE(NULLIF(TRIM(s.STUD_Surname), ''), '@') <> COALESCE(p.Surname, '@')
...
SET Surname = TRIM(s.STUD_Surname)
That being said, I would rather trim and/or replace empty strings with nulls inside the temp table, after select into and before merge.
CodePudding user response:
The checks for the update can be simplified.
Since the source temp table doesn't have NULL's.
You just need to compare the trimmed values.
MERGE Person t USING #tmpTable s ON s.STUD_ISN = t.StudentDBID WHEN NOT MATCHED THEN INSERT ( StudentDBID , Forename , PreferredForename , Surname ) VALUES ( NULLIF(TRIM(s.STUD_ISN), '') , NULLIF(TRIM(s.STUD_Forename), '') , NULLIF(TRIM(s.STUD_Forename_1), '') , NULLIF(TRIM(s.STUD_Surname), '') ) WHEN MATCHED AND ( COALESCE(t.Forename, '') != TRIM(s.STUD_Forename) OR COALESCE(t.PreferredForename, '') != TRIM(s.STUD_Forename_1) OR COALESCE(t.Surname, '') != TRIM(s.STUD_Surname) ) THEN UPDATE SET UpdatedOn = SYSDATETIME() , Forename = NULLIF(TRIM(s.STUD_Forename), '') , PreferredForename = NULLIF(TRIM(s.STUD_Forename_1), '') , Surname = NULLIF(TRIM(s.STUD_Surname), '') ;
SELECT * FROM PersonStudentDBID | Forename | PreferredForename | Surname | UpdatedOn | CreatedOn :---------- | :------- | :---------------- | :------ | :-------------------------- | :-------------------------- B9701 | Bob | Bobby | Modest | 2022-01-14 12:58:05.9862269 | 2022-01-14 12:58:05.9862269 J0402 | Jane | JD | Doe | 2022-01-14 12:58:05.9862269 | 2022-01-14 12:58:05.9862269 J0504 | Jim | Jimmy | Savage | null | 2022-01-14 12:58:05.9862269 J7201 | John | null | Doe | null | 2022-01-14 12:58:05.9862269 M0803 | Mike | null | Hammer | 2022-01-14 12:58:05.9862269 | 2022-01-14 12:58:05.9862269
Demo on db<>fiddle here
CodePudding user response:
You can use the Coalesce Function available in SQL to achieve the desired result
It will return the first Non-Null value from the passed parameters
SELECT COALESCE( NULL ,'First Non Null','Alex')
