SQLTeam.com | Weblogs | Forums

Find differences for SQL Server tables and data

I want to compare the data between two tables (article1) and (article2). Remember both the tables have the same column name. I want to insert a difference data into a differ table. Please check the screenshots. I am troubling in writing SQL queries for generating difference data.Here is the SQL query. id field is a primary key.

image
image

CREATE DATABASE dbtest01
GO
USE dbtest01
GO

CREATE TABLE [dbo].[article1] ([id] [nchar](10) NOT NULL, [type] [nchar](10) NULL, [cost] [nchar](10) NULL,[name] [varchar](500) NULL,
  CONSTRAINT [PK_article1] PRIMARY KEY CLUSTERED
  (
  [id] ASC
  ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO

INSERT INTO [dbo].[article1]
  VALUES ('001', '1', '40','Rahul'),
  ('002', '2', '80','abc'),
  ('003', '3', '120','John'),
  ('004', '40', '160','Mary')
GO

USE dbtest01
GO

CREATE TABLE [dbo].[article2] ([id] [nchar](10) NOT NULL, [type] [nchar](10) NULL, [cost] [nchar](10) NULL,[name] [varchar](500) NULL,
  CONSTRAINT [PK_article2] PRIMARY KEY CLUSTERED
  (
  [id] ASC
  ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO

INSERT INTO [dbo].[article2]
  VALUES ('001', '1', '40','Rahul'),
  ('002', '2', '80','abc'),
  ('003', '3', '120','Jo'),
  ('004', '4', '160','Marie')
GO

CREATE TABLE [dbo].[differ](
	[id] [nchar](10) NOT NULL,
	[columnname] [nchar](10) NOT NULL,
	[valuearticle1] [varchar](1000) NOT NULL,
	[valuearticle2] [varchar](1000) NOT NULL
) ON [PRIMARY]
GO

Can anyone help me to write sql queries to generate differ table? Differ table contains data only for those columns where value has been updated or changed.

I would solve it like this:

INSERT INTO dbo.differ (id, columnname, valuearticle1, valuearticle2)
SELECT 
	article1.id,
	'name',
	article1.[name],
	article2.[name]
FROM article1 
	INNER JOIN article2
		ON article1.id=article2.id
WHERE
	article1.[name] <> article2.[name]
UNION ALL
SELECT 
	article1.id,
	'cost',
	article1.cost,
	article2.cost
FROM article1 
	INNER JOIN article2
		ON article1.id=article2.id
WHERE
	article1.[cost] <> article2.[cost]
UNION ALL
SELECT 
	article1.id,
	'type',
	article1.[type],
	article2.[type]
FROM article1 
	INNER JOIN article2
		ON article1.id=article2.id
WHERE
	article1.[type] <> article2.[type];

If you need to query it daily/more often and it is a large table then you can also create a hash in both tables so you can select all the diffrences in once by selecting the id and the hash values.

1 Like

I prefer an approach using CROSS APPLY, both for performance and for ease of maintenance. Since you allowed NULLs in the columns, technically you must check for that separately, since just doing <> will not find a column that changed from NULL to a value or from a value to NULL.


SELECT ca1.id, ca1.columname, ca1.valuearticle1, ca1.valuearticle2
FROM dbo.article1 a1
LEFT OUTER JOIN dbo.article2 a2 ON a2.id = a1.id
CROSS APPLY (
    SELECT *
    FROM ( VALUES
        (a1.id, 1, 'All', a1.id, '<no matching id found in table2>',
            CASE WHEN a2.id IS NULL THEN 1 ELSE 0 END),
        (a1.id, 2, 'type', a1.type, a2.type,
            CASE WHEN a2.type <> a1.type OR CASE WHEN a2.type IS NULL THEN 1 ELSE 0 END + CASE WHEN a1.type IS NULL THEN 1 ELSE 0 END = 1 THEN 1 ELSE 0 END),
        (a1.id, 3, 'cost', a1.cost, a2.cost,
            CASE WHEN a2.cost <> a1.cost OR CASE WHEN a2.cost IS NULL THEN 1 ELSE 0 END + CASE WHEN a1.cost IS NULL THEN 1 ELSE 0 END = 1 THEN 1 ELSE 0 END),
        (a1.id, 4, 'name', a1.name, a2.name,
            CASE WHEN a2.name <> a1.name OR CASE WHEN a2.name IS NULL THEN 1 ELSE 0 END + CASE WHEN a1.name IS NULL THEN 1 ELSE 0 END = 1 THEN 1 ELSE 0 END)
    ) AS changes(id, columnsequence, columname, valuearticle1, valuearticle2, is_changed)
) AS ca1
WHERE 
    ca1.is_changed = 1
ORDER BY id, columnsequence
1 Like