I have a customer master table called Customer. Everything in this table comes from a key/value style table called Cust_Property.
The Cust_Property table has 3 columns:
CustomerID, Property, Value
The Property column may contain First_Name with a value of John. Sort of like a pre-pivoted table. I need to update the columns in Customer table with the values of the associated Property columns in Cust_Property table.
Rules
- If there is a new
CustomerIDin theCust_Propertytable, it will need to be added as a new row to theCustomertable, as well as all appropriate properties. - All data in the
Customertable will also be in theCust_Propertytable. That means that not every record needs to be updated. Only those that have changed or are new. - Records are only added/updated in
Customertable, not removed. - There are properties in the
Propertytable where a corresponding column doesn't exist in theCustomertable, so those are just ignored.
DDL
CREATE TABLE #Customer
(
Customerid int,
FirstName varchar(50),
LastName varchar(50),
Address1 varchar(100),
Address2 varchar(100),
Address3 varchar(100)
)
CREATE TABLE #Cust_Property
(
CustomerID int,
Property varchar(50),
Value varchar(50)
)
INSERT INTO #Customer (Customerid, FirstName, LastName, Address1, Address2, Address3)
VALUES(1, N'John', N'Smith', N'123 happy lane', NULL, NULL);
INSERT INTO #Customer (Customerid, FirstName, LastName, Address1, Address2, Address3)
VALUES(2, N'Dwight', N'Schrute', N'33 1st Ave', N'Apt 5', NULL);
INSERT INTO #Customer (Customerid, FirstName, LastName, Address1, Address2, Address3)
VALUES(3, NULL, NULL, NULL, NULL, NULL);
INSERT INTO #Cust_Property (CustomerID, Property, Value)
VALUES(3, N'First_Name', N'Michael');
INSERT INTO #Cust_Property (CustomerID, Property, Value)
VALUES(3, N'Last_Name', N'Scott');
INSERT INTO #Cust_Property (CustomerID, Property, Value)
VALUES(8, N'First_Name', N'Jim');
INSERT INTO #Cust_Property (CustomerID, Property, Value)
VALUES(8, N'Last_Name', N'Halpert');
INSERT INTO #Cust_Property (CustomerID, Property, Value)
VALUES(8, N'Address1', N'644 Scranton Rd');
INSERT INTO #Cust_Property (CustomerID, Property, Value)
VALUES(8, N'Nickname', N'Jimmy');
INSERT INTO #Cust_Property (CustomerID, Property, Value)
VALUES(1, N'First_Name', N'John');
Table screenshots:
Customer: https://i.imgur.com/yePz1Di.pngCust_Property: https://i.imgur.com/YwvZExx.png
Desired Customer table end result:
Customer3'sFirst_NameandLast_Namecolumns are updated- Add customer 8 to
Customertable b/c it doesn't already exist - Update all of customer 8's properties except property
Nicknamebecause that doesn't exist in theCustomertable - Ignore the
First_Nameproperty forCustomerID = 1because it is the same in theCustomertable, so no update needed.
My current method: first find and insert new CustomerIDs
INSERT INTO #Customer (CustomerID)
SELECT DISTINCT CustomerID
FROM #Cust_Property a
WHERE NOT EXISTS (SELECT * FROM #Customer x
WHERE a.CustomerID = x.CustomerID)
Then update properties
UPDATE #Customer
SET #Customer.FirstName = a.Value
FROM #Cust_Property a
WHERE #Customer.CustomerID = a.CustomerID
AND a.Property = 'First_Name'
UPDATE #Customer
SET #Customer.LastName = a.Value
FROM #Cust_Property a
WHERE #Customer.CustomerID = a.CustomerID
AND a.Property = 'Last_Name'
UPDATE #Customer
SET #Customer.Address1 = a.Value
FROM #Cust_Property a
WHERE #Customer.CustomerID = a.CustomerID
AND a.Property = 'Address1'
UPDATE #Customer
SET #Customer.Address2 = a.Value
FROM #Cust_Property a
WHERE #Customer.CustomerID = a.CustomerID
AND a.Property = 'Address2'
UPDATE #Customer
SET #Customer.Address3 = a.Value
FROM #Cust_Property a
WHERE #Customer.CustomerID = a.CustomerID
AND a.Property = 'Address3'
In my actual tables, there are hundreds of different properties in Cust_Properties table, 40 columns in the Customer table that need updating and ~2M customer records.
Is there a better way that separate update statements for each? At present, I can't use an ETL tool, though I could technically use Python if that would be more efficient/faster.
CodePudding user response:
maybe one update like -
UPDATE c
set c.[FirstName] = isnull(cp.[First_Name],c.[FirstName])
,c.[LastName] = isnull(cp.[Last_Name],c.[FirstName])
,c.[Address1] = isnull(cp.[Address1],c.[Address1])
FROM #Customer c
INNER JOIN (
SELECT *
FROM #Cust_Property tb
pivot(
max(value)
for Property in ( [First_Name],[Last_Name],[Address1],[Nickname])
)pv
)cp
on c.Customerid = cp.CustomerID
