SQLTeam.com | Weblogs | Forums

Simple Update to table from staging table


#1

I am new to SQL.... I need a stored procedure that can update existing records and copy new records from one table in azure to another.

The staging table is dbo.OV_IncontactStage and the prod table is dbo.OV_Incontact. Contact_ID is the primary key on the prod table. The staging table only has new records or updated records.

What I currently have does not work exactly right because in the staging table, there are multiple records with the same unique identifier (Contact_ID). I only need the last modified one.

I did some reading on this and I believe there is a select Row_Number () Over (partition) statement to place an instance number on the records so I can get the most recent one if there are multiple, but I am not sure where to put this in the statement. Here is what I currently have

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

ALTER PROCEDURE [dbo].[spUpsertCallRecordv2]
AS
BEGIN
WITH dbo.OV
AS(
Select ROW_NUMBER() OVER PARTITION by Contact_ID, Contact_Code, Start_Date
ORDER BY Start_Date DESC) AS Rownum,
Contact_ID,
Contact_Code,
Start_Date
FROM dbo.OV_IncontactStage)

--if succeeds
MERGE dbo.OV_Incontact AS target_sqldb
USING dbo.OV_IncontactStage AS source_tblstg
ON ( target_sqldb.Contact_ID = source_tblstg.Contact_ID)
When MATCHED 
    AND RowNum = 1
    Then
	Update
	Set
	   Contact_ID=source_tblstg.Contact_ID,
		Contact_Code=source_tblstg.Contact_Code,
		Start_Date=source_tblstg.Start_Date
            WHEN NOT MATCHED THEN
         INSERT(
                       Contact_ID, 
                     Contact_Code, 
                     Start_Date
                      )
           VALUES(
                       source_tblstg.Contact_ID,
			source_tblstg.Contact_ID,
			source_tblstg.Contact_Code
                          );

END


#2

You would change your Using Clause to be

USING (select Contact_ID, Contact_Code, Start_date, 
			   Row_Number() over (partition by ContactID order by Start_Date) RowNum
         from dbo.OV_IncontactStage) AS source_tblstg

#3

I changed the table name for simplification. teststage is the staging table and test2 is the prod table. Here is what I have so far. No errors when running it. I will test to see if it does what I want it to.

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

create PROCEDURE [dbo].[spUpserttest]
AS
BEGIN
WITH TEST
AS(
Select ROW_NUMBER() OVER (PARTITION by Contact_ID, Master_Contact_ID, Start_Date
ORDER BY Start_Date DESC) AS Rownum,
Contact_ID,
Master_Contact_ID,
Start_Date
FROM dbo.teststage)

--if succeeds
MERGE dbo.test2 AS target_sqldb
USING (select Contact_ID, Master_Contact_ID, Start_date,
Row_Number() over (partition by Contact_ID order by Start_Date) RowNum
from dbo.teststage) AS source_tblstg
ON ( target_sqldb.Contact_ID = source_tblstg.Contact_ID)
When MATCHED
AND RowNum = 1
Then
Update
Set
Contact_ID=source_tblstg.Contact_ID,
Master_Contact_ID=source_tblstg.Master_Contact_ID,
Start_Date=source_tblstg.Start_Date
WHEN NOT MATCHED THEN
INSERT(
Contact_ID,
Master_Contact_ID,
Start_Date
)
VALUES(
source_tblstg.Contact_ID,
source_tblstg.Master_Contact_ID,
source_tblstg.Start_Date
);
END


#4

I did some testing with this. It will create a new entry. This will update an entry. If I have multiple updates on one entry (Contact_ID) then it only takes the first one. I need it to take the last one (depending on Start_Date). Any help is appreciated.


#5

See if this is what you are looking for:

   With baseData
     As (
 Select Contact_ID
      , Master_Contact_ID
      , start_date
      , row_number() over(Partition By Contact_ID, Master_Contact_ID Order By start_date desc) As Rownum
   From dbo.teststage
        )
      , sourceData
     As (
 Select bd.Contact_ID
      , bd.Master_Contact_ID
      , bd.Start_Date
   From baseData        bd
  Where bd.RowNum = 1
        )
  Merge dbo.test2 As tgt
  Using sourceData As src
     On tgt.Contact_ID = src.Contact_ID
    And tgt.Master_Contact_ID = src.Master_Contact_ID

   When Matched Then
 Update 
    Set tgt.Start_Date = src.Start_Date
   When Not Matched By Target Then

 Insert (
        Contact_ID
      , Master_Contact_ID
      , Start_Date
        )
 Values (
        src.Contact_ID
      , src.Master_Contact_ID
      , src.Start_Date
        );

It looks like you are keying of both contact_id and master_contact_id - so I included that, but if the master contact id can change then you need to change the row number and the relationship.


#6

I actually meant to only look for Contact_ID. If it exists already then take the most recent record based on Start_Date.


#7

Then you would remove that from the row number - and the using statement and include it in the update.

   With baseData
     As (
 Select Contact_ID
      , Master_Contact_ID
      , start_date
      , row_number() over(Partition By Contact_ID Order By start_date desc) As Rownum
   From dbo.teststage
        )
      , sourceData
     As (
 Select bd.Contact_ID
      , bd.Master_Contact_ID
      , bd.Start_Date
   From baseData        bd
  Where bd.RowNum = 1
        )
  Merge dbo.test2 As tgt
  Using sourceData As src
     On tgt.Contact_ID = src.Contact_ID

   When Matched Then
 Update 
    Set tgt.Start_Date = src.Start_Date
      , tgt.Master_Contact_ID = src.Master_Contact_ID
   When Not Matched By Target Then

 Insert (
        Contact_ID
      , Master_Contact_ID
      , Start_Date
        )
 Values (
        src.Contact_ID
      , src.Master_Contact_ID
      , src.Start_Date
        );

If that contact is not found in the target - it will be inserted. If that contact is found - it will be updated with the latest start date. This will update the record regardless of whether or not it has changed.