well, of course you can apply it to a DB table, Why else would it be in SQL Server? If you install Master data Services, you can use this algorithm and others:
DQS matching uses nGrams. This algorithm tokenizes strings to all possible substrings of length n.
These tokens are called nGrams. Then it checks how many different
nGrams two strings have in common. The more they have in common, the
more similar the strings are. By dividing the number of nGrams in common
with the number of all possible nGrams, you can get a coefficient with a
value between 0 and 1 that estimates the similarity of the strings. The
closer the coefficient is to the value 1, the more similar the strings
SSIS fuzzy transformations use a much more sophisticated algorithm.
The algorithm was developed by Microsoft Research and is not public.
However, some details are described in the Books Online for SQL Server
2012 article “Fuzzy Lookup Transformation” at http://msdn.microsoft.com.ezproxy.torontopubliclibrary.ca/en-us/library/ms137786.aspx.
In addition to DQS and SSIS, MDS has some public string similarity
algorithms as well. You can find these algorithms implemented in the MDS
database through the mdq.Similarity function. The function implements
the following four public algorithms:
- Levenshtein distance (also called edit distance)
- Jaccard index
- Jaro-Winkler distance
- Simil (longest common substring; also called Ratcliff/Obershelp)
In addition, you can get nGrams with the mdq.NGrams function, also
provided in the MDS database. Finally, there is also a function that
finds similar dates called mdq.SimilarityDate. You can use all of the
MDS functions in Transact-SQL queries.