UDF. Returns a float: 0 means no match and 1 means a full match.
UDF. Returns a boolean: true if the Soundex codes of the two strings match exactly.
StringMetricUDFs is a collection of string similarity measures, implemented using the Scala StringMetrics library.
UDFs with Boolean returns
- soundexMatch: true if the Soundex codes of the two strings match exactly
UDFs with Float returns
N-gram based measures
- nGram2: 2-gram, with formula (number of overlapping grams) / max(s1.gramCnt, s2.gramCnt)
- nGram3: 3-gram, with the same formula as above
- diceSorensen: 2-gram, with formula (2 * number of overlapping grams) / (s1.gramCnt + s2.gramCnt)
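The n-gram formulas above can be illustrated with a small standalone sketch. This is not the StringMetricUDFs source and does not call the underlying library; the names `NGramSketch`, `grams`, `overlap`, and `nGram` are hypothetical. It also assumes that overlapping grams are counted as a multiset intersection and that strings shorter than n score 0, neither of which is stated in the documentation.

```scala
// Illustrative sketch only; names and edge-case handling are assumptions.
object NGramSketch {
  // All contiguous n-character substrings of s (empty if s is shorter than n).
  def grams(s: String, n: Int): Seq[String] =
    if (s.length < n) Seq.empty else s.sliding(n).toSeq

  // Number of grams present in both sequences, counting repeats
  // (Seq.intersect is a multiset intersection). ASSUMPTION: this is how
  // "overlapping grams" are counted.
  def overlap(a: Seq[String], b: Seq[String]): Int = a.intersect(b).size

  // overlap / max(s1.gramCnt, s2.gramCnt); use n = 2 for nGram2, n = 3 for nGram3.
  def nGram(s1: String, s2: String, n: Int): Float = {
    val g1 = grams(s1, n)
    val g2 = grams(s2, n)
    if (g1.isEmpty || g2.isEmpty) 0f // ASSUMPTION: too-short input scores 0
    else overlap(g1, g2).toFloat / math.max(g1.size, g2.size)
  }

  // (2 * overlap) / (s1.gramCnt + s2.gramCnt), computed on 2-grams.
  def diceSorensen(s1: String, s2: String): Float = {
    val g1 = grams(s1, 2)
    val g2 = grams(s2, 2)
    if (g1.isEmpty || g2.isEmpty) 0f
    else 2f * overlap(g1, g2) / (g1.size + g2.size)
  }
}
```

For example, "night" and "nacht" each have four 2-grams and share only "ht", so `NGramSketch.nGram("night", "nacht", 2)` and `NGramSketch.diceSorensen("night", "nacht")` both evaluate to 0.25 under these assumptions.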
Edit distance measures
- levenshtein
- jaroWinkler
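The documentation does not say how an integer edit distance is mapped onto the 0 to 1 range that the Float UDFs return. The sketch below shows one common way to do it, purely as an assumption: divide the Levenshtein distance by the length of the longer string and subtract from 1. The names `LevenshteinSketch`, `distance`, and `similarity` are hypothetical and are not part of StringMetricUDFs.

```scala
// Illustrative sketch only; the normalization is an assumption, not the
// documented behavior of the levenshtein UDF.
object LevenshteinSketch {
  // Classic dynamic-programming edit distance: insertion, deletion, and
  // substitution each cost 1. Keeps only the previous row in memory.
  def distance(a: String, b: String): Int = {
    var prev = Array.range(0, b.length + 1) // distance from "" to each prefix of b
    for (i <- 1 to a.length) {
      val curr = new Array[Int](b.length + 1)
      curr(0) = i // distance from a's first i characters to ""
      for (j <- 1 to b.length) {
        val cost = if (a(i - 1) == b(j - 1)) 0 else 1
        curr(j) = math.min(
          math.min(curr(j - 1) + 1, prev(j) + 1), // insertion, deletion
          prev(j - 1) + cost                      // substitution or match
        )
      }
      prev = curr
    }
    prev(b.length)
  }

  // ASSUMPTION: similarity = 1 - distance / length of the longer string,
  // with two empty strings treated as a full match.
  def similarity(a: String, b: String): Float = {
    val longest = math.max(a.length, b.length)
    if (longest == 0) 1f else 1f - distance(a, b).toFloat / longest
  }
}
```

With this normalization, identical strings score 1 and strings that differ in every position, such as "abc" and "xyz", score 0.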