I bumped into a problem that in my opinion is really interesting. I found a relatively easy itinerary solution, but it completely lacks performance when applied to a bigger stack of records... and in the end it turned out to be even wrong!
Imagine a list of names with an underlying, BUT NOT KNOWN order: (we make it visible to make it easier to follow )
Peter, Paul, John, George, Mary, Anne, Eric, Thomas, Susan, Liz
The only thing you have are samples with subsets of those names, listed in the correct hierarchical order.
sample1: John, Anne, Susan
sample2: George, Thomas, Susan, Liz
sample3: Peter, Paul, Eric, Liz
sample4: Paul, Susan
sample5: Mary, Eric, Susan
sample6: Paul, George, Mary, Anne, Thomas
sample7: Paul, John, Mary, Eric
The goal is to find the correct order; or at least a possible correct order for the amount of given samples. Just imagine that there will be 400 names and 500'000 samples.
I post what I thought was a solution: Taking the average ranking within the samples as a measure. This will for sure bring #1 in Top position but lower ranked names will be calculated wrong. You'll need a splitter function that returns the name and its position. like this one -> http://www.sqlservercentral.com/articles/Tally+Table/72993/
Create TABLE #Sample ( ID int, Sample varchar(100) ) INSERT into #Sample (ID,Sample) values (1,'John,Anne,Susan'),(2,'George,Thomas,Susan,Liz'), (3,'Peter,Paul,Eric,Liz'),(4,'Paul,Susan'),(5,'Mary,Eric,Susan'),(6,'Paul,George,Mary,Anne,Thomas'), (7,'Paul,John,Mary,Eric') Create TABLE #Names ( ID int, Name varchar(10), rank int ) Insert into #Names ( ID, Name, rank ) SELECT ID, Item, ItemNumber FROM #Sample CROSS APPLY ( SELECT Item, Itemnumber FROM dbo.Split(Sample, ',') ) AS Y select * from #Names select name, 1.0*sum(rank)/count(rank) as ord from #Names Group by name order by ord asc drop Table #Sample drop Table #Names