Missing records in groups within same table

ITTrucker · March 7, 2017, 5:08pm

I am trying to check and ensure that all the employees in a table have matching records for all stores. So each employee needs to be listed with the same data multiple times in the same table and I want to make sure they are included in every store.

I can do this by validating a single store and then using CTE's to compare to the rest (store 12 is the master store in query below), but I have 20 stores and 2,000 employees for each store and I'm looking for differences in names or Employee #'s so I'll have to run it multiple times on multiple criteria. (yes I know this is a terrible table structure, but it's what I was given)

Is there a better way to check for records that are missing or incorrect data or am I stuck with massive LEFT OUTER queries?

DECLARE @TempStore Table (Store int, EmpID1 varchar(10), EmpID2 varchar(10), EmpFirst varchar(50), EmpLast varchar(50), EmpStart date, EmpAdded date)
INSERT INTO @TempStore
VALUES
(3,'ABC123','123456','Joe','Smith','20170101','20161231'),
(12,'ABC123','123456','Joe','Smith','20170101','20161231'),
(9,'DEF234','987654','Anne','Jones','20170201','20161231'),
(12,'DEF234','987654','Anne','Jones','20170201','20161231'),
(3,'GHI564','24680','Pat','Williams','20170208','20161231'),
(9,'GHI564','24680','Pat','Williams','20170208','20161231'),
(12,'GHI564','24680','Pat','Williams','20170208','20161231')
;
WITH C12 AS
(
SELECT * FROM @TempStore WHERE Store = 12
),C9 AS
(
SELECT * FROM @TempStore WHERE Store = 9
),
C3 AS
(SELECT * FROM @TempStore WHERE Store = 3
)

select * from C12 LEFT OUTER JOIN C9 ON C12.EmpID1 = C9.EmpID1 LEFT OUTER JOIN C3 ON C12.EmpID1 = C3.EmpID1
WHERE
C9.Store IS NULL OR C3.Store IS NULL

ScottPletcher · March 7, 2017, 5:11pm

SELECT EmpID1
FROM @TempStore
GROUP BY EmpID1
HAVING COUNT(DISTINCT Store) < (SELECT DISTINCT Store FROM <store_master>)

ITTrucker · March 7, 2017, 5:44pm

@ScottPletcher Thanks, COUNT was a good starting point, the interesting thing is when I just looked at a table that had 2 less employees on the counts, I have 3 NULL values when comparing employee ID's so someone is duplicated (which supposedly shouldn't happen...)

So now that I know there can be dupes, I can't just go off the counts, I have to look for HAVING COUNT>1 in each store, and then clean that and then I can try to pinpoint the values that are missing.

ScottPletcher · March 7, 2017, 6:06pm

Note, though, that I used "COUNT(DISTINCT" so dupes **won'**t affect the count, viz:

SELECT COUNT(DISTINCT value), COUNT() FROM (VALUES('A'),('B'),('B'),('C'),('C'),('C')) x(value)
--vs.
SELECT COUNT(value), COUNT() FROM (VALUES('A'),('B'),('B'),('C'),('C'),('C')) x(value)

JeffModen · March 7, 2017, 7:41pm

This is a pretty bad idea design-wise. It violates 1st normal form and will continue to plague you with differences.

The proper way to do this would be to have 1 table for employees, 1 table for stores, and a "bridge" table to join the two in a many-to-many fashion.