Good afternoon! Here's our prompt for today.
The Edit Distance Challenge
What's the Challenge?
Your task is to find the Edit Distance between two strings. The edit distance measures how different two strings are by calculating the minimum number of transformations required to convert one string into another. The allowed transformations are:
- Insertion: Adding a character.
- Deletion: Removing a character.
- Substitution: Replacing one character with another.
For instance, if the two strings are identical, such as cat
and cat
, the edit distance is 0
. That's because no transformations are needed to make the two strings the same.
An Example to Illuminate
Consider two strings mitten
and sitting
. If we call getEditDistance('mitten', 'sitting')
, the function should return 3
. Here's how we arrive at this number:
- mitten → sitten: Substitute "s" for "m."
- sitten → sittin: Substitute "i" for "e."
- sittin → sitting: Insert "g" at the end.
Method Signature
You are to implement a function called getEditDistance(a, b)
that will return the edit distance between strings a
and b
.
1function getEditDistance(a, b) {
2 // Your code here
3}
Constraints to Keep in Mind
- The lengths of both given strings
a
andb
will be less than or equal to1000
. - The answer will fit within the integer range.
- You are only allowed to change one of the strings.
- Let
m
andn
be the lengths ofa
andb
, respectively. - Time Complexity:
O(n * m)
- Space Complexity:
O(n * m)
Try to solve this here or in Interactive Mode.
How do I practice this challenge?
xxxxxxxxxx
console.log('PASSED: ' + "getEditDistance('busa', 'ebcar') should return 4");
var assert = require('assert');
function getEditDistance(a, b) {
return '';
}
class Graph {
constructor() {
this.adjacencyList = new Map();
this.verticesCount = 0;
}
addVertex(nodeVal) {
this.adjacencyList.set(nodeVal, []);
this.verticesCount++;
}
addEdge(src, dest) {
this.adjacencyList.get(src).push(dest);
this.adjacencyList.get(dest).push(src);
// push to both adjacency lists
}
removeVertex(val) {
if (this.adjacencyList.get(val)) {
this.adjacencyList.delete(val);
}
this.adjacencyList.forEach((vertex) => {
Here's our guided, illustrated walk-through.
How do I use this guide?
The Essence of Levenshtein Edit Distance: A Mathematical Lens
The Birth of the Concept
The Levenshtein edit distance algorithm was introduced by Vladimir Levenshtein in 1965. It's a mathematical formula that calculates the least number of single-character edits needed to change one string into another.
The Intimidating Equation
Upon first glance, the formula for Levenshtein Distance may seem complex:
Breaking Down the Formula

The Min Function
The min
function is key to understanding the formula. What we're doing is considering three options at each step:
- Insertion: This is represented by
- Deletion: This is represented by
- Substitution: This is represented by
We take the minimum of these three options because we want to find the least number of edits needed.
The Cost Factor
The "cost" is either 0 or 1, depending on whether the current characters in the strings a
and b
are equal or not. If they are equal, the cost is 0; otherwise, it's 1.
A More Intuitive View
Think of this as a journey where you start at one corner of a grid. Your destination is the opposite corner. The grid cells represent the characters in the strings you're comparing. Each move from one cell to another corresponds to an edit operation:
- Moving horizontally represents insertion.
- Moving vertically represents deletion.
- Moving diagonally represents substitution.
You're trying to find the shortest path from one corner to another, making the least number of moves (edits). That's your Levenshtein Distance.
Skip this screen if you don't care about the math and want to know how the implementation works.
The mathematical definition is as follows: the Levenshtein distance between two strings a
and b
can be calculated by:
This makes sense, we're inputting a
and b
and are arriving an outcome after processing the two values somehow.
So how do we process them?
By using an indicator function
, which is a defined as a function defined on a set X
that indicates membership of an element in a subset A
of X
. This is represented by a 1
in all elements of A
where there's a membership and a 0
for all elements of X
not in A
.
So the following:
Is the indicator function
equal to 0
when ai
= bj
(when the letters are the same in the same position) and equal to 1
otherwise (if they differ).
To calculate the actual distance, we use:
The above is the distance between the first i
characters of a
and the first j
characters of b
.
Confused? Don't worry, we'll move beyond the math in the next section.
Unlocking the Intuition: The Shopping Analogy for Edit Distance
Shopping for String Transformations: The Essence of Edit Distance
Imagine you're at a shopping mall, but this is no ordinary mall. Instead of clothes and gadgets, you're shopping for string transformations. Your goal is to transform one string into another with the least amount of "money" spent. Each "store" in this mall specializes in one kind of string operation: insertion, deletion, or substitution.
The Cost of String Operations: Understanding the Pricing System
Just like in any shopping spree, understanding the pricing system is crucial:
No Purchase Necessary: If both strings have the same letter at a given position, no action is needed. It's like finding an item you already own; you don't need to buy it again. Cost:
0
.Make a Purchase: If the letters are different, then it's time to shop! You have three stores to consider:
- The "Insertion" store
- The "Deletion" store
- The "Substitution" store
Your aim is to make the most cost-effective purchase to align the strings.
Why the Matrix? Your Shopping List for Optimal Choices
So, why are we filling out a matrix? Think of the matrix as your shopping list or, better yet, your comparison shopping chart. Each cell in the matrix represents a decision point where you have to choose the most cost-effective operation.
From Above (Insertion Store): Imagine you're considering adding a new letter to string 1. Look at the price from the cell directly above and add 1.
From the Left (Deletion Store): Now imagine you're contemplating deleting a letter from string 1. Check the price from the cell directly to the left and add 1.
From the Diagonal (Substitution Store): Finally, you're thinking about substituting one letter for another. Look at the diagonal price and add the cost, which is
0
if the letters are the same and1
otherwise.
Completing Your Shopping List: The Bottom-Right Cell
As you fill in the matrix (your shopping list), you're effectively tracking your "spending" at each point, always opting for the cheapest viable operation. When you finally reach the bottom-right cell of the matrix, you've found the least expensive route to make both strings identical. That's your total "spending," or the final edit distance.
The Beauty of This Approach
This approach works beautifully because it allows you to compare every possible pair of prefixes between the two strings. You are essentially exploring all potential shopping routes and jotting down the least expensive "purchases" you can make at each decision point. By the time you reach the bottom-right cell, you've done your due diligence and found the best deal. That's why this method is so effective and why the final number in the bottom-right cell truly represents the minimum edit distance between the two strings.
Are you sure you're getting this? Is this statement true or false?
Suppose we're trying to find the edit distance
of two strings, string A = "milk"
and string B = "melt"
.
When comparing the character of string A at index 0, and string B at index 0, no edit is neeed so we should not add 1
to our running distance.
However, at index 1 for the two strings, we want to add 1
to account for I
and E
.
Press true if you believe the statement is correct, or false otherwise.
So we now know when there is a deletion, addition, or substitution, we'll need to account for that. The key to understanding the levenshtein
edit distance is that we'll use an array matrix
to account for the n^2
possible transformations.
Here's a nice visual to grok how we'll apply this logic across a matrix:

Image credit to https://www.python-course.eu.
Building the Edit Distance Matrix: A Concrete Example
Let's dive into another example and step through the entire thing.
The Core Idea: Cumulative Transformation Distance
To find the smallest number of transformations needed, we will fill in a matrix while keeping track of the cumulative transformation distance. This is the running total of changes needed to make the strings match up to that point.
Here's how we fill in each cell of the matrix:
- From Above: Take the value from the cell immediately above and add 1:
- From the Left: Take the value from the cell immediately to the left and add 1:
- From the Diagonal: Take the value from the cell diagonally above and to the left, and add the cost:
Step-by-Step Matrix Building: The Example of best
vs test
Let's make this real by constructing the matrix for the strings best
and test
. First, we initialize a row and a column based on the lengths of our two strings:
b | e | s | t | ||
---|---|---|---|---|---|
0 | 1 | 2 | 3 | 4 | |
t | 1 | START | |||
e | 2 | ||||
s | 3 | ||||
t | 4 |
Now let's fill it in, guided by our three rules:
- From Above: If we just want to insert a letter, we look at the cell above and add 1.
- From the Left: If we want to delete a letter, we look at the cell to the left and add 1.
- From the Diagonal: If we want to substitute a letter, we look diagonally and add the cost (which is 0 if the letters are the same, and 1 otherwise).
Moving Left to Right: Starting with Column "B"
To understand how to fill in the matrix, let's begin with the column corresponding to the letter B
from the string best
. Our goal is to move left to right and populate each cell with the least "costly" way to transform the strings up to that point.
The Three Choices for Each Cell
Remember our guidelines for each cell:
- From Above: The cell immediately above plus 1.
- From the Left: The cell immediately to the left plus 1.
- From the Diagonal: The cell diagonally above and to the left, plus the cost (0 if the letters are the same, and 1 otherwise).
Filling In the "B" Column
Starting at the *
in the prior table, we see that B
and T
are different. So, the edit cost is 1
. With this in mind, we have three calculations:
- From Above: (1 + 1 = 2)
- From the Left: (1 + 1 = 2)
- From the Diagonal: (0 + 1 = 1)
The minimum of these is 1
, so we place it in the cell where B
intersects with T
.
Here's how our matrix looks after filling in the "B" column:
B | e | s | t | ||
---|---|---|---|---|---|
0 | 1 | 2 | 3 | 4 | |
T | 1 | 1 | |||
e | 2 | 2 | |||
s | 3 | 3 | |||
t | 4 | 4 |
We've successfully filled in the first column for the letter B
. We'd continue this process for each of the remaining columns to complete the matrix and find the total edit distance.
The Column for Letter "E"
Now that we've filled in the column for the letter B
, let's move on to the column for the letter E
from the string best
.
Calculating the Costs
Starting with E
and T
, we again find that the two letters are different, so the edit cost is 1
. Here are the calculations for the cell:
- From Above: (1 + 1 = 2)
- From the Left: (2 + 1 = 3)
- From the Diagonal: (1 + 1 = 2)
The minimum of these is 2
, so we place it in the cell where E
intersects with T
.
Next, we move to E
and e
. The two letters are the same, so the edit cost is 0
. The calculations for this cell are:
- From Above: (2 + 1 = 3)
- From the Left: (1 + 1 = 2)
- From the Diagonal: (0 + 1 = 1)
The minimum is 1
, so we place it in the cell where E
intersects with e
.
The Updated Matrix
Here's how our matrix looks after filling in the "E" column:
b | E | s | t | ||
---|---|---|---|---|---|
0 | 1 | 2 | 3 | 4 | |
T | 1 | 1 | 2 | ||
e | 2 | 2 | 1 | ||
s | 3 | 3 | 2 | ||
t | 4 | 4 | 3 |
We've successfully filled in the second column for the letter E
. As before, we'd continue this process for the remaining columns to complete the matrix and find the total edit distance.
The Column for Letter "S"
Having filled the columns for B
and E
, let's now move on to the column for the letter S
from the string best
.
Calculating the Costs for "S"
Starting with S
and T
, we find that the two letters are different, so the edit cost is 1
. The calculations for this cell are:
- From Above: (2 + 1 = 3)
- From the Left: (2 + 1 = 3)
- From the Diagonal: (2 + 1 = 3)
The minimum of these is 3
, so we place it in the cell where S
intersects with T
.
The Updated Matrix with "S" Column Filled
Here's how our matrix looks after filling in the "S" column:
b | e | S | t | ||
---|---|---|---|---|---|
0 | 1 | 2 | 3 | 4 | |
T | 1 | 1 | 2 | 3 | |
e | 2 | 2 | 1 | 2 | |
s | 3 | 3 | 2 | 1 | |
t | 4 | 4 | 3 | 2 |
With the column for S
filled in, we're getting closer to completing our matrix. We would continue this step-by-step approach for the remaining columns to determine the final edit distance between the two strings.
The Column for Letter "T"
Now that we've successfully filled in the columns for B
, E
, and S
, we can wrap up our matrix with the last column for the letter T
from the string best
.
Calculating the Costs for "T"
Starting with the first row, T
and t
are the same characters, so the edit cost is 0
. The calculations for this cell are:
- From Above: (4 + 1 = 5)
- From the Left: (3 + 1 = 4)
- From the Diagonal: (3 + 0 = 3)
The minimum of these is 3
, so we place it in the cell where T
intersects with t
.
The Final Matrix
Here's how our fully completed matrix looks:
b | e | s | T | ||
---|---|---|---|---|---|
0 | 1 | 2 | 3 | 4 | |
t | 1 | 1 | 2 | 3 | 3 |
e | 2 | 2 | 1 | 2 | 3 |
s | 3 | 3 | 2 | 1 | 2 |
t | 4 | 4 | 3 | 2 | 1 |
Congratulations, you've successfully completed the matrix! The value in the bottom-right corner tells us that the edit distance between best
and test
is 1
, confirming that it would take a minimum of 1 edit to transform one string into the other.
Decoding the Matrix: The Final Edit Distance
After filling in our matrix, we look to the bottom-right corner to find our final edit distance. In this example, that value is 1
, which confirms that it takes just a single edit to transform "best" into "test". Specifically, we can swap the first b
with a t
.
Here's the final matrix for quick reference:
b | e | s | T | ||
---|---|---|---|---|---|
0 | 1 | 2 | 3 | 4 | |
t | 1 | 1 | 2 | 3 | 3 |
e | 2 | 2 | 1 | 2 | 3 |
s | 3 | 3 | 2 | 1 | 2 |
T | 4 | 4 | 3 | 2 | 1 (*) |
Complexity Analysis of the Final Solution
Space Complexity
The matrix we've built has dimensions that depend on the lengths of our two strings. Specifically, if m
and n
are the lengths of the two strings, the matrix is of size m x n
. This gives us a space complexity of O(m x n)
.
Time Complexity
To fill in this matrix, we used two nested loops that iterate through each row and column, performing constant-time operations at each cell. This results in a time complexity of O(m x n)
as well.
Both the time and space complexities for this solution are O(m x n)
, making it a quadratic algorithm.
One Pager Cheat Sheet
- The Edit Distance between two strings is calculated by finding the minimum number of
insertion
,deletion
, andsubstitution
transformations required to turn the first string into the second. - The Levenshtein edit distance is a
mathematical formula
developed by Vladimir Levenshtein in 1965, which measures the number ofedits
required to convert one string to another. - The Levenshtein distance between two strings
a
andb
can be calculated using anindicator function
and a mathematical formula. - The cost of transformation between two strings is measured by the distance between them, being
0
for no difference and increasing for any differences. - The running distance between two strings should be incremented by
1
if their characters at the same index are different. - The Levenshtein edit distance can be calculated using an
array matrix
to track then^2
possible transformations. - The algorithm uses a table to accumulate costs in order to find the minimum cumulative transformation distance between two strings.
- We need to move from the
*
indicated cell to the right, adding up the cost of 1 at each intersection as we go, and selecting the minimum of the three options at every step. - By comparing the letters
e
andt
first, and noting thate
ande
have no cost, we can determine the optimal cost of1 + 1
for the first2
columns. - We arrive at
3
(2
+1
) bys
, the bottom-right corner. - The
2x2
matrix with values1 through 4
results in the best score whendiagonally mirrored
. - Our Final Solution has
O(m*n)
time and space complexity.
This is our final solution.
To visualize the solution and step through the below code, click Visualize the Solution on the right-side menu or the VISUALIZE button in Interactive Mode.
xxxxxxxxxx
}
var assert = require('assert');
function getEditDistance(a, b) {
if (a.length == 0) return b.length;
if (b.length == 0) return a.length;
var matrix = [];
// increment along the first column of each row
var i;
for (i = 0; i <= b.length; i++) {
matrix[i] = [i];
}
// increment each column in the first row
var j;
for (j = 0; j <= a.length; j++) {
matrix[0][j] = j;
}
// Fill in the rest of the matrix
for (i = 1; i <= b.length; i++) {
for (j = 1; j <= a.length; j++) {
if (b.charAt(i - 1) == a.charAt(j - 1)) {
matrix[i][j] = matrix[i - 1][j - 1];
} else {
matrix[i][j] = Math.min(
matrix[i - 1][j - 1] + 1, // substitution
Math.min(
matrix[i][j - 1] + 1, // insertion
matrix[i - 1][j] + 1
)
You're doing a wonderful job. Keep going!
If you had any problems with this tutorial, check out the main forum thread here.