Thank you @ScottPletcher, @harishgg1 and @Wim_Leys for your input. I think I was close with my code but it had a bug and was going my head in
Yeah, this was an interesting and complex issue. I came up with several approaches before I settled on the one above.
I do think this ultimately could be done using recursion, but the performance would be truly hideous, so it's not worth the trouble to do.
Similar (sort of) to @ScottPletcher
use sqlteam
go
create table dbo.StudyOverSeas(id varchar(50),
name nvarchar(50), nationality nvarchar(50))
create table groups(groupid int, groupname varchar(50));
INSERT INTO StudyOverSeas (id, name, nationality)
VALUES
('1234567', 'steve', 'Nigeria'),
('1234568', 'stuart', 'Nigeria'),
('1234569', 'alix', 'Australia'),
('1234561', 'heather', 'Nigeria'),
('1234562', 'john', 'Sweden'),
('1234563', 'julie', 'Nigeria'),
('1234564', 'tracy', 'Nigeria'),
('1234565', 'Pam', 'Australia'),
('1234566', 'Sam', 'Sweden'),
('1234567', 'Ukuele', 'UK'),
('1234568', 'Ursula', 'USA'),
('1234569', 'Max', 'Mexico'),
('1234570', 'Bill', 'Brasil')
go
/*
create multi groups then assign one group of people
to the other where it is not already assigned.
*/
declare @counter int, @jedi int = 1 ;
create table dbo.grouper(id varchar(50), name nvarchar(50),
nationality nvarchar(50), groupid int)
;with src
as
(
select count(nationality) _count
from StudyOverSeas
group by nationality
)
select @counter = max(_count)
from src;
WHILE (@jedi) <= @counter
BEGIN
insert into dbo.groups(groupid, groupname)
select distinct @jedi, nationality from StudyOverSeas;
IF (@jedi = @counter)
begin
BREAK
end
ELSE
begin
insert into grouper
select distinct top 3 so.id, so.name, so.nationality , g.groupid
from dbo.StudyOverSeas so
join dbo.groups g on so.nationality = g.groupname
where groupid = @jedi
and not exists(select 1
from grouper tgt
where tgt.id = so.id and tgt.name = so.name
and so.nationality = tgt.nationality and tgt.groupid <> @jedi)
set @jedi = @jedi + 1
CONTINUE
end
END
select * from grouper;
drop table StudyOverSeas
drop table groups
drop table grouper
hi Wim
I tried and i tried and i tried and i tried and i tried and i tried
not able to come up non recursive solution
if somebody gave me a million dollars
I could keep trying till i find a solution
so could any one else !!!
i give up !!!
I think we have spend enough time on this. Let's move on.
Okay - deleted the original post because I found an issue, here is an updated version:
Declare @StudyOverSeas Table (StudentId int, StudentName varchar(20), Nationality varchar(20));
Insert Into @StudyOverSeas (StudentId, StudentName, Nationality)
Values ('1234567', 'steve', 'Australia')
, ('1234568', 'stuart', 'Japan')
, ('1234569', 'alix', 'Australia')
, ('1234561', 'heather', 'China')
, ('1234562', 'john', 'Sweden')
, ('1234563', 'julie', 'Africa')
, ('1234564', 'tracy', 'Africa')
, ('1112212', 'dave', 'United States')
, ('1113312', 'david', 'United States')
, ('2221121', 'jack', 'England')
, ('3334333', 'jason', 'Germany')
, ('2221122', 'jackson', 'England')
, ('4443433', 'mike', 'Spain')
, ('4423242', 'scott', 'Italy');
Declare @totalStudents int = (Select count(*) From @StudyOverSeas sos);
With studentGroups
As (
Select *
, GroupID = ntile(@totalStudents / 3) over(Order By sos.StudentId, sos.Nationality)
From @StudyOverSeas sos
)
, studentRanks
As (
Select *
, RankID = dense_rank() over(Partition By GroupID Order By Nationality)
From studentGroups
)
, uniqueGroups
As (
Select *
, rn = row_number() over(Partition By GroupID, Nationality Order By StudentId)
From studentRanks
)
Select *
From uniqueGroups
Where rn = 1;
Needs further testing and validation - this will cut short a group if 2 of the students placed in that group have the same nationality and only 3 students were identified in the group.
In this method - the StudentId will cause students with id's that are close together to be grouped together...and if they have the same nationality then one (or more) of them would be removed. To get those to be moved to another group we would need some other value that will group differently....
You have the same nationality in the same group. The core rule is that each member of the group must be a different nationality.
hi Jeff
Many thanks for your Trying
Two things have to happen !!!!
- It must be divided into groups of 3
- In each group all nationalities must be different
I did not see it in your SQL Output
Unless I am missing something
i used a different data set ..
that could be the reason WHY !!!!
..... My different data set
drop table StudyOverSeas
go
create table StudyOverSeas
(
Studentid varchar(100),
name varchar(100),
nationality varchar(100)
)
go
INSERT INTO StudyOverSeas (Studentid, name, nationality)
VALUES
('1234567', 'steve', 'Africa'),
('1234568', 'stuart', 'Africa'),
('1234569', 'alix', 'Australia'),
('1234561', 'heather', 'Africa'),
('1234562', 'john', 'Sweden'),
('1234563', 'julie', 'Africa'),
('1234564', 'tracy', 'Africa'),
('1234565', 'Pam', 'Australia'),
('1234566', 'Sam', 'Sweden'),
('1234567', 'Ukuele', 'UK'),
('1234568', 'Ursula', 'USA'),
('1234569', 'Max', 'Mexico'),
('1234570', 'Bill', 'Brasil'),
('1234572', 'Sil', 'Brasil'),
('1234575', 'Kill', 'Brasil'),
('1234577', 'Chill', 'Brasil'),
('1234579', 'Lill', 'Brasil')
go
Hello - I think this meets your requirements..
if object_id('tempDb..#StudyOverSeas' , N'U') is not null drop table #StudyOverSeas;
create table #StudyOverSeas
(
id int primary key,
name varchar(100),
nationality varchar(20)
)
INSERT INTO #StudyOverSeas (id, name, nationality)
VALUES ('1234567', 'steve', 'Australia'),
('1234568', 'stuart', 'Japan'),
('1234569', 'alix', 'Australia'),
('1234561', 'heather', 'China'),
('1234562', 'john', 'Sweden'),
('1234563', 'julie', 'Africa'),
('1234564', 'tracy', 'Africa');
Declare @cnt int = (select count(*) from #StudyOverSeas);
;with cte as
(
select *
,ROW_NUMBER() over(partition by a.nationality order by newId()) idx
,ROW_NUMBER() over(order by newId()) grp
from #StudyOverSeas a
)
,cte_2
as
(
select *
from cte a
outer apply
(
select distinct top (3) b.nationality
from cte b
where a.nationality <> b.nationality
) b( grp_nat)
outer apply
(
select (abs(checksum(newId())) % count(*)) + 1 nat_idx
from #StudyOverSeas c
where c.nationality = b.grp_nat
) c
)
select top (@cnt) b.id, b.name , b.nationality, a.grp
from cte_2 a
left join cte b on a.grp_nat = b.nationality and b.idx = a.nat_idx
order by a.grp;
Please let me know if this helps
hi Wim
Since I have a lot of free time !!!! I am still on it ...
I have added a little variety to your data set
and created my own data set
Here's how far I got ... with a new approach Please share your opinion and thoughts
A little filtering magic from here should do the job
please click arrow to the left for DROP CREATE data ..
drop table StudyOverSeas
go
create table StudyOverSeas
(
Studentid varchar(100),
name varchar(100),
nationality varchar(100)
)
go
INSERT INTO StudyOverSeas (Studentid, name, nationality)
VALUES
('1234567', 'steve', 'Africa'),
('1234568', 'stuart', 'Africa'),
('1234569', 'alix', 'Australia'),
('1234561', 'heather', 'Africa'),
('1234562', 'john', 'Sweden'),
('1234563', 'julie', 'Africa'),
('1234564', 'tracy', 'Africa'),
('1234565', 'Pam', 'Australia'),
('1234566', 'Sam', 'Sweden'),
('1234567', 'Ukuele', 'UK'),
('1234568', 'Ursula', 'USA'),
('1234569', 'Max', 'Mexico'),
('1234570', 'Bill', 'Brasil'),
('1234572', 'Sil', 'Brasil'),
('1234575', 'Kill', 'Brasil'),
('1234577', 'Chill', 'Brasil'),
('1234579', 'Lill', 'Brasil')
go
please click arrow to the left for SQL ..
; with cte as
(
select
ROW_NUMBER() over(order by (select null)) as rn,
a.Studentid as aStudentid ,
a.name as aname,
a.nationality as anationality,
b.Studentid as bStudentid ,
b.name as bname,
b.nationality as bnationality,
c.Studentid as cStudentid ,
c.name as cname,
c.nationality as cnationality
from
StudyOverSeas a
join StudyOverSeas b
on a.nationality <> b.nationality
join StudyOverSeas c
on a.nationality <> c.nationality
and b.nationality <> c.nationality
)
select rn, aStudentid , aname, anationality from cte
union all
select rn, bStudentid , bname, bnationality from cte
union all
select rn, cStudentid , cname, cnationality from cte
order by rn
hi femiolan
Your solution is doing it ...
Amazing stuff ... i have been breaking my head for hours and hours hours !!!
Will see how you did it and memorize it ...
Many thanks
Thanks harishgg1! Appreciate the feedback. I agree that this one was pretty complex, I hope it helps the OP.
Thanks again
My solution does separate into groups of 3 with different nationalities. The issue with this approach is that it will also group students into the same group if their student ID is within the same range...
When that occurs - we drop one of the duplicates which means some groups may have less than 3 members. To account for that we need a different mechanism for distributing the students...
Declare @totalStudents int = (Select count(*) From @StudyOverSeas sos) / 3 - 1;
With studentGroups
As (
Select *
, GroupID = ntile(@totalStudents) over(Order By newid(), sos.Nationality)
From @StudyOverSeas sos
)
, studentRanks
As (
Select *
, RankID = dense_rank() over(Partition By GroupID Order By Nationality)
From studentGroups
)
, uniqueGroups
As (
Select *
, rn = row_number() over(Partition By GroupID, Nationality Order By StudentId)
From studentRanks
)
Select *
From uniqueGroups
--Where rn = 1
-- And RankID <= 3;
A small change in this - using newid() instead of StudendId in the ntile function will randomly distribute the students across each group...however, even this will sometimes place a student with the same nationality into the same group.
How you approach that depends on the source set of data and how many groupings you need. At some point you won't have enough data that can be evenly distributed - for example, if the set of data has 10 students from Australia - and another 10 students split across five other nationalities - there is no way to evenly distribute across groups of 3 without duplicates...
With the above change we can reduce the number of groups to 1 less than the total available...this will allow 'extra' rows to be assigned to each group...we can then use the row number and the RankID to remove the 'duplicate/extra' rows from the results.
If you want the 4 groups with the possibility that one or more groups will have less than 3 members - change the @totalStudents to:
Declare @totalStudents int = (Select count(*) From @StudyOverSeas sos) / 3;
Again - depending on the set of data you may end up with groups that have less than the total number of members requested. Even in a cursor approach you will end up with extra rows that cannot be placed into any grouping - so you either end up with groups with less than the total or you eliminate those rows from the results.
I am not sure this meets the requirements...although not explicitly stated by the OP I assumed the requirement was that a student/member could only be listed one time and this result will place members into multiple groups.
TOTALLY agree with you Jeff
If you don't have the count of rows exactly a mutiple of 3
then what do you do with the extra rows !!
All depends on what is the requirement !!!
N number of things can be done
Many thanks for your efforts !!!
We can simplify the query to this:
Declare @totalStudents int = (Select count(*) From @StudyOverSeas sos) / 3; --- 1;
With studentGroups
As (
Select *
, GroupID = ntile(@totalStudents) over(Order By newid(), sos.Nationality)
From @StudyOverSeas sos
)
, studentRanks
As (
Select *
, RankID = dense_rank() over(Partition By GroupID Order By Nationality)
, rn = row_number() over(Partition By GroupID, Nationality Order By StudentId)
From studentGroups
)
Select *
From studentRanks sr
--Where rn = 1
-- And RankID <= 3;
I ran the code, using Harish's test data (post #31) and Jeff's query (post #37) , including the WHERE clause (otherwise groups of more than 3 people are formed).
Where rn = 1
And RankID <= 3;
There are 17 students, only 14 end up in the resultset.
The query should not discriminate people based on their nationality
I just ran the query, now going to have a better look at how it works.
And I learned something new: NTILE().
Harlington most likely never imagined his question would result in such a lengthy and vivid discussion.