This blog post I will show you how to implement did you mean by in JavaScript. This algorithm can be easily translated to any language like C# ,Java or python. The implementation of Did You Mean By
can be done by using edit distance algorithm. This algorithm calculates how close two strings are based on edit operations like insert, delete, or substitute. It does this by taking a string and comparing it to all other strings using these operations until it finds one that produces an edit distance of zero.
What is Edit Distance ( Levenshtein )Algorithm?
The Levenshtein distance (LD) measures the similarity of two strings, which we will refer to as the source string (s) and the target string (t). The number of deletions, insertions, or substitutions required to transform s into t is the distance. As an example,
- If
s
is “temp” andt
is “temp”, then LD(s,t) = 0, because no transformations are needed. The strings are already identical. - if
s
is ‘temp’ and t is ‘tenp’ then LD(s,t)=1, because one substitution (change “m” to “n”) is sufficient to transform s into t.
The Levenshtein distance algorithm has been used in:
- Spell checking
- Speech recognition
- DNA analysis
- Plagiarism detection
function levenshteinDistance(a: string, b: string) {
if (a.length === 0) return b.length;
if (b.length === 0) return a.length;
let matrix:any = [];
// increment along the first column of each row
let i;
for (i = 0; i <= b.length; i++) {
matrix[i] = [i];
}
// increment each column in the first row
let j;
for (j = 0; j <= a.length; j++) {
matrix[0][j] = j;
}
// Fill in the rest of the matrix
for (i = 1; i <= b.length; i++) {
for (j = 1; j <= a.length; j++) {
if (b.charAt(i - 1) == a.charAt(j - 1)) {
matrix[i][j] = matrix[i - 1][j - 1];
} else {
matrix[i][j] = Math.min(matrix[i - 1][j - 1] + 1, // substitution
Math.min(matrix[i][j - 1] + 1, // insertion
matrix[i - 1][j] + 1)); // deletion
}
}
}
return matrix[b.length][a.length];
}
It is now time to put the algorithm into action. The concept is straightforward. In this case, I’m searching a dictionary for the given word and calculating the edit distance. I’m returning the word once I find one that has a minimum edit distance.
function didYouMeanBy(word: string, words: string[]) {
let min = 100000;
let similarWord = "";
for (let i = 0; i < words.length; i++) {
let dist = levenshteinDistance(word, words[i]);
dist;
if (dist < min) {
min = dist;
similarWord = words[i];
}
}
return similarWord;
}
Output
//Dictionary of words (you can use prefilled dictionary) or you can use `Trie` data structure to
//find the prefix of the word and then search for the word in the dictionary
const dictionary=["santos", "sandal", "saturday", "sunday", "same"];
console.log(didYouMeanBy("sandy",dictionary));
if you run the above program you will see the output sunday
Happy Coding 😊😊