Find users with most accordance in a survey

Kai23 · July 19, 2019, 7:22am

Hi all,
I have a datatable with results of a survey. As a result, i want to create an output for each user with the greatest and the least accordance.

For Example, this is my datatable: (QuestionID and UserID are UniqueIdentifiers while Answer is Integer)

+------------+--------+--------+
| QuestionID | UserID | Answer |
+------------+--------+--------+
| a          | 1      |    1   |
| a          | 2      |    1   |
| a          | 3      |    2   |
| a          | 4      |    3   |
| b          | 1      |    2   |
| b          | 2      |    1   |
| b          | 3      |    2   |
| b          | 4      |    1   |
| c          | 1      |    3   |
| c          | 2      |    3   |
| c          | 3      |    2   |
+------------+--------+--------+

Now I want to create a stored procedure, using a UserID as input parameter. In this example, I'm using the first user '1' (UniqueIdentifier in live table) as input parameter. The SP should now compute a list with the top 3 users with the greatest accordance (so in the survey the answered most questions like user '1') and the top 3 users with the least accordance (so in the survey they answered most questions not like user '1').

Regarding the accordance, there could be only 'true' (same answer) or 'false' (not same answer). There is no relation in the answers like "3 is a higher miss than 2 if the user answered 1". So if the user answers "1", only "1" is correct accordance while "2" and "3" are both mismatches.

It's also possible that users did not answer all questions, e.g. see in above table that user 4 did not answer last question c. This will count as mismatch. So if user 1 answered 10 question and user 2 answered only 5 questions, he could not get more than 50% accordance even if he answered all 5 questions same as user 1. (This would be a Bonus later on: Only consider the users which answered at least 50% of all questions which are also answered by user 1. If so, missing answered will not count as mismatch but just be ignored. So if user 2 answered 6 questions and all with the same answer as user 1, he would have 100% accordance. But I'm already struggling with the basic part, so that's out of scope)

Of course the live table contains more than just 4 users so Top 3 would be best to return. The return table should than look like this: (Accordance has the number of same question answers and percentage is Accordance divided by number of questions answered by user 1 (in current example: 3))

+--------+------------+------------+
| UserID | Accordance | Percentage |
+--------+------------+------------+
| 2      | 2          | 66.67      |
| 3      | 1          | 33.33      |
| 4      | 0          | 0          |
+--------+------------+------------+

Any ideas how to do this? I've tried with several GROUP BY but was not successfull

harishgg1 · July 19, 2019, 12:37pm

hi

i am trying to do this ..not able to make any sense of what you are saying!!!!

please explain this in a way that we can understand ....

this is a very very easy thing to do
if we can understand you

Kai23 · July 19, 2019, 1:09pm

Hi Harishgg1,
hm... now I have the same problem with your reply as you had with my description: I don't understand it Would have been helpful if you describe what is not understandable... Everything?

I could give you more details about this, hopefully this helps:

We have an intranet where our users need to login.
Now there is a survey with a dozen of questions, e.g. "Would you like to have a christmas party this year?" Users could answer "Yes", "No", "Don't know" or skip this question. As a result, the number of the selected option will be stored in the database table. So that's either "1" (for Yes), "2" (for No), "3" (for Don't know) or no record for this question for this user (for Skip).

That's the 1st table I've shown above. As said, within the database, 'QuestionID' and 'UserID' are UniqueIdentifiers.

When the survey is over, I want to show for each user, which other users voted in most cases the same as this user does. That's the 2nd table. For example: When User 1 opens the result, I want to show "User 2 voted in 2 of 3 questions with the same result, giving you a total accordance of 66.67%".

Was that helpful?

harishgg1 · July 19, 2019, 1:14pm

I will try to understand it
Thanks Kai

How to make others understand
What you are saying

Does not matter
If you make
Mistakes or anything

Is a p.h.d subject by itself
And an art and skill

Diagrams
2 or 3 words
How things are linked?

I am very very interested
In this subject
Also

Please forgive me
If I offended you

Thanks
Harish

Kai23 · July 19, 2019, 1:28pm

No problem Harish, everything fine!

Maybe it helps speaking in SQL language? I've created this little script which shows the datatable and the expected result. Of course we have dozens of questions and hundreds of users, but I kept it simple in this example:

DECLARE @tblResults TABLE 
	(
		QuestionID UNIQUEIDENTIFIER,
		UserID UNIQUEIDENTIFIER,
		Answer INT
	)

INSERT INTO @tblResults VALUES ('00000000-0000-0000-0000-000000000001', '11111111-1111-1111-1111-111111111111', 1)
INSERT INTO @tblResults VALUES ('00000000-0000-0000-0000-000000000001', '11111111-1111-1111-1111-111111111112', 1)
INSERT INTO @tblResults VALUES ('00000000-0000-0000-0000-000000000001', '11111111-1111-1111-1111-111111111113', 3)
INSERT INTO @tblResults VALUES ('00000000-0000-0000-0000-000000000001', '11111111-1111-1111-1111-111111111114', 4)
INSERT INTO @tblResults VALUES ('00000000-0000-0000-0000-000000000002', '11111111-1111-1111-1111-111111111111', 3)
INSERT INTO @tblResults VALUES ('00000000-0000-0000-0000-000000000002', '11111111-1111-1111-1111-111111111112', 3)
INSERT INTO @tblResults VALUES ('00000000-0000-0000-0000-000000000002', '11111111-1111-1111-1111-111111111113', 2)
INSERT INTO @tblResults VALUES ('00000000-0000-0000-0000-000000000002', '11111111-1111-1111-1111-111111111114', 1)
INSERT INTO @tblResults VALUES ('00000000-0000-0000-0000-000000000003', '11111111-1111-1111-1111-111111111111', 1)
INSERT INTO @tblResults VALUES ('00000000-0000-0000-0000-000000000003', '11111111-1111-1111-1111-111111111112', 2)
INSERT INTO @tblResults VALUES ('00000000-0000-0000-0000-000000000003', '11111111-1111-1111-1111-111111111113', 2)
INSERT INTO @tblResults VALUES ('00000000-0000-0000-0000-000000000004', '11111111-1111-1111-1111-111111111111', 1)
INSERT INTO @tblResults VALUES ('00000000-0000-0000-0000-000000000004', '11111111-1111-1111-1111-111111111112', 1)
INSERT INTO @tblResults VALUES ('00000000-0000-0000-0000-000000000004', '11111111-1111-1111-1111-111111111113', 2)
INSERT INTO @tblResults VALUES ('00000000-0000-0000-0000-000000000004', '11111111-1111-1111-1111-111111111114', 1)

SELECT * FROM @tblResults

-- This should be the result when I call the SP with UserID 11111111-1111-1111-1111-111111111111 as input parameter
SELECT '11111111-1111-1111-1111-111111111112' as UserID, 3 as Accordance, 75 as 'Percentage'
UNION
SELECT '11111111-1111-1111-1111-111111111113', 0, 0
UNION
SELECT '11111111-1111-1111-1111-111111111114', 1, 25
ORDER BY [Percentage] DESC

So User 1111 has

3 of 4 questions (questions 0001, 0002, 0004) answered same as User 2
0 of 4 questions answered same as user 3
1 of 4 questions (question 0004) answered same as User 4. (User 4 did not answer question 0003 but it's nevertheless 1 of 4 (not 3) because User 1 answered 4 questions).

Does this help? Feel free to ask in case there is still something unclear!

Many thanks in advance!

harishgg1 · July 19, 2019, 4:24pm

hi

i got the accordance and average part also

please check
hope it helps
i love feedback thanks

drop create data ...

drop table tblresults
go 

create table tblResults  
	(
		QuestionID UNIQUEIDENTIFIER,
		UserID UNIQUEIDENTIFIER,
		Answer INT
	)

INSERT INTO tblResults VALUES ('00000000-0000-0000-0000-000000000001', '11111111-1111-1111-1111-111111111111', 1)
INSERT INTO tblResults VALUES ('00000000-0000-0000-0000-000000000001', '11111111-1111-1111-1111-111111111112', 1)
INSERT INTO tblResults VALUES ('00000000-0000-0000-0000-000000000001', '11111111-1111-1111-1111-111111111113', 3)
INSERT INTO tblResults VALUES ('00000000-0000-0000-0000-000000000001', '11111111-1111-1111-1111-111111111114', 4)
INSERT INTO tblResults VALUES ('00000000-0000-0000-0000-000000000002', '11111111-1111-1111-1111-111111111111', 3)
INSERT INTO tblResults VALUES ('00000000-0000-0000-0000-000000000002', '11111111-1111-1111-1111-111111111112', 3)
INSERT INTO tblResults VALUES ('00000000-0000-0000-0000-000000000002', '11111111-1111-1111-1111-111111111113', 2)
INSERT INTO tblResults VALUES ('00000000-0000-0000-0000-000000000002', '11111111-1111-1111-1111-111111111114', 1)
INSERT INTO tblResults VALUES ('00000000-0000-0000-0000-000000000003', '11111111-1111-1111-1111-111111111111', 1)
INSERT INTO tblResults VALUES ('00000000-0000-0000-0000-000000000003', '11111111-1111-1111-1111-111111111112', 2)
INSERT INTO tblResults VALUES ('00000000-0000-0000-0000-000000000003', '11111111-1111-1111-1111-111111111113', 2)
INSERT INTO tblResults VALUES ('00000000-0000-0000-0000-000000000004', '11111111-1111-1111-1111-111111111111', 1)
INSERT INTO tblResults VALUES ('00000000-0000-0000-0000-000000000004', '11111111-1111-1111-1111-111111111112', 1)
INSERT INTO tblResults VALUES ('00000000-0000-0000-0000-000000000004', '11111111-1111-1111-1111-111111111113', 2)
INSERT INTO tblResults VALUES ('00000000-0000-0000-0000-000000000004', '11111111-1111-1111-1111-111111111114', 1)

drop create execute procedure

DROP PROCEDURE dataproc 

go 

CREATE PROCEDURE Dataproc @userid UNIQUEIDENTIFIER 
AS 
    ; WITH cte 
         AS (SELECT DISTINCT userid 
             FROM   tblresults 
             WHERE  userid <> @userid), 
         cte1 
         AS (SELECT a.userid, 
                    a.questionid, 
                    a.answer 
             FROM   (SELECT * 
                     FROM   tblresults 
                     WHERE  userid <> @userid) a 
                    JOIN (SELECT * 
                          FROM   tblresults 
                          WHERE  userid = @userid) b 
                      ON a.questionid = b.questionid 
                         AND a.answer = b.answer), 
         ctetotcount 
         AS (SELECT Count(answer) + 0.0 AS cnttot 
             FROM   cte1), 
         ctecnta 
         AS (SELECT userid, 
                    Count(answer) AS cnta 
             FROM   cte1 
             GROUP  BY userid) 
    SELECT a.userid, 
           a.cnta            AS Accordance, 
           a.cnta / b.cnttot AS avg 
    FROM   ctecnta a, 
           ctetotcount b 
    UNION ALL 
    SELECT userid, 
           0, 
           0 
    FROM   (SELECT userid 
            FROM   cte 
            WHERE  userid NOT IN (SELECT userid 
                                  FROM   cte1)) a 

go 

EXECUTE Dataproc 
  '11111111-1111-1111-1111-111111111111'

Kai23 · July 22, 2019, 5:49am

Brillant Harish!!! Works perfekt, and I'm sure I never came to that solution on my own!!

Really great!!! Thanks a lot Harish!!!