A Deeper Look at the std::string Class

# Calculating Levenshtein Distance

## How can I efficiently calculate the Levenshtein distance between two std::strings?

The Levenshtein distance, also known as the edit distance, is a measure of the difference between two strings. It represents the minimum number of single-character edits (insertions, deletions, or substitutions) required to change one word intoÂ another.

Calculating this distance efficiently using std::string is an excellent exercise in dynamic programming and stringÂ manipulation.

Let's implement an efficient Levenshtein distanceÂ calculator:

#include <iostream>
#include <string>
#include <vector>
#include <algorithm>

int levenshteinDistance(const std::string& s1,
const std::string& s2) {
const size_t m{s1.length()};
const size_t n{s2.length()};

std::vector<std::vector<int>> dp(
m + 1, std::vector<int>(n + 1)
);

// Initialize first row and column
for (size_t i{0}; i <= m; ++i) dp[i][0] = i;
for (size_t j{0}; j <= n; ++j) dp[0][j] = j;

// Fill the dp table
for (size_t i{1}; i <= m; ++i) {
for (size_t j{1}; j <= n; ++j) {
if (s1[i - 1] == s2[j - 1]) {
dp[i][j] = dp[i - 1][j - 1];
} else {
dp[i][j] = 1 + std::min({
dp[i - 1][j],     // deletion
dp[i][j - 1],     // insertion
dp[i - 1][j - 1]  // substitution
});
}
}
}

return dp[m][n];
}

int main() {
std::string word1{"kitten"};
std::string word2{"sitting"};

int distance{levenshteinDistance(word1, word2)};

std::cout
<< "The Levenshtein distance between '"
<< word1 << "' and '" << word2 << "' is: "
<< distance << '\n';
}
The Levenshtein distance between 'kitten' and 'sitting' is: 3

Let's break down the levenshteinDistance()Â function:

1. We create a 2D vector dp to store our dynamic programming table.
2. We initialize the first row and column of dp to represent the distance from an empty string.
3. We iterate through both strings, filling the dp table:
• If the characters match, we copy the diagonal value (no edit needed).
• If they don't match, we take the minimum of the three possible operations (deletion, insertion, substitution) and add 1.
4. The final cell dp[m][n] gives us the Levenshtein distance.

This implementation has a time complexity of $O(mn)$ and a space complexity of $O(mn)$, where $m$ and $n$ are the lengths of the inputÂ strings.

We can optimize this further to use only $O(min(m,n))$Â space:

#include <iostream>
#include <string>
#include <vector>
#include <algorithm>

int levenshteinDistance(
const std::string& s1, const std::string& s2
) {
const size_t m{s1.length()};
const size_t n{s2.length()};

// Ensure s1 is the shorter string
if (m > n) return levenshteinDistance(s2, s1);

std::vector<int> prevRow(m + 1);
std::vector<int> currRow(m + 1);

for (size_t i{0}; i <= m; ++i) prevRow[i] = i;

for (size_t j{1}; j <= n; ++j) {
currRow[0] = j;
for (size_t i{1}; i <= m; ++i) {
if (s1[i - 1] == s2[j - 1]) {
currRow[i] = prevRow[i - 1];
} else {
currRow[i] = 1 + std::min(
{prevRow[i],
currRow[i - 1],
prevRow[i - 1]
});
}
}
std::swap(prevRow, currRow);
}

return prevRow[m];
}

int main() {/*...*/}

The Levenshtein distance between 'kitten' and 'sitting' is: 3

This optimized version uses only two rows of the DP table at a time, significantly reducing memory usage for longÂ strings.

The Levenshtein distance has many practical applications,Â including:

• Spell checking and correction
• DNA sequence alignment in bioinformatics
• Fuzzy string matching in search algorithms

By implementing this algorithm efficiently with std::string, we've demonstrated advanced string manipulation techniques and the power of dynamic programming inÂ C++.

This Question is from the Lesson:

### A Deeper Look at the std::string Class

A detailed guide to std::string, covering the most essential methods and operators

Answers to questions are automatically generated and may not have been reviewed.

This Question is from the Lesson:

### A Deeper Look at the std::string Class

A detailed guide to std::string, covering the most essential methods and operators

Part of the course:

## Professional C++

Comprehensive course covering advanced concepts, and how to use them on large-scale projects.

Free, unlimited access

### This course includes:

• 124 Lessons
• 550+ Code Samples
• 96% Positive Reviews
• Regularly Updated
• Help and FAQ
Free, Unlimited Access

### Professional C++

Comprehensive course covering advanced concepts, and how to use them on large-scale projects.