Mark As Completed Discussion

Good evening! Here's our prompt for today.

Given a string of characters, can you find the longest substring of the string that has no repeating characters?

An example is shown below:

Description
JAVASCRIPT
1lengthOfLongestSubstring("abracadabar");
2// 4 because it is 'brac'

Constraints

  • Length of the string <= 100000
  • The string will only contain lowercase letters
  • Expected time complexity : O(n)
  • Expected space complexity : O(1)

In this challenge, we'll discuss a simple but efficient method for finding the longest substring with no repeating characters.

Try to solve this here or in Interactive Mode.

How do I practice this challenge?

JAVASCRIPT
OUTPUT
:001 > Cmd/Ctrl-Enter to run, Cmd/Ctrl-/ to comment

We'll now take you through what you need to know.

How do I use this guide?

Brute Forcing It

This is one of those problems which is deceptively challenging. When you're given the prompt, it sounds like an easy iteration problem-- but we'll need to have it perform at a decent complexity.

Before even considering optimization, though, let's consider the most straightforward solution possible. So we want to know the longest substring with no duplicate characters. How would we do this manually, given the word algodaily? To achieve this, why not just look at every single substring?

Here's how the steps would go:

  1. First, we consider a. Longest max is 1.
  2. Then, we look at al. Longest max is now 2.
  3. We look at alg. It's now 3.
  4. o - 4
  5. d - 5
  6. a - wait, we've seen this before. Max is still 5, and the next few steps are also invalid and won't change the max.
  7. We have substring algodai - i - substring has duplicate, so invalld
  8. algodail - l - substring has duplicate, so invalld
  9. algodaily - y - substring has duplicate, so invalld
  10. Now we can start from l, and repeat from step 1... (l, lg, etc.)

We've provided an example of how this brute force solution would look when implemented.

JAVASCRIPT
OUTPUT
:001 > Cmd/Ctrl-Enter to run, Cmd/Ctrl-/ to comment

However, at O(n^2), this is way too inefficient. Let's determine a better solution. At a quick glance, we know we'll need to iterate and have some form of on-going storage for tracking purposes.

Solution to Longest Substring Without Duplicate Characters

Our optimized method solves the problem efficiently with respect to time, yet requires very little extra space. Remember the manual steps our brute force approach above? Oftentimes, efficient solutions just cut out steps from slower ones.

Consider the step by step we already did. I highlighted/bolded some steps we probably don't need:

  1. First, we consider a. Longest max is 1.
  2. Then, we look at al. Longest max is now 2.
  3. We look at alg. It's now 3.
  4. o - 4
  5. d - 5
  6. a - wait, we've seen this before. Max is still 5, and the next few steps are also invalid and won't change the max. 1. We have substring algodai - i - substring has duplicate, so invalid 1. algodail - l - substring has duplicate, so invalid 1. algodaily - y - substring has duplicate, so invalid
  7. Now we can start from l, and repeat from step 1... (l, lg, etc.)

Notice something? The last 3 steps before we start from l could've been cut out! We knew that once we encountered the second a, all substrings that we extended to would be invalid. So why don't we jump straight to starting from l as soon as we see the duplicate?

That's the idea behind the optimal solution.

More Setup

To find the longest substring with no repeating characters, we'll read each character of the input string from left to right.

At the same time, we'll keep track of the current substring and the maximum substring length found so far. To keep track of the current substring, we'll record its starting index in a variable startInd and its last index in a variable i.

This solution uses an array, which has the same size as the size of the alphabet. The array stores the index of the character of the alphabet last found in the input string (in other words, it's tracking the last seen character element). In this tutorial, we call this array lastPos.

JAVASCRIPT
1const lastPos = [];

As we read the characters, we'll also maintain the lastPos array for storing the index of each character's last occurrence in the string. The lastPos array would be initialized to -1.

Suppose at index position i, we have read character x. Before that, we would've encountered this character at lastPos[x]. lastPos helps us remember where we last saw a duplicate character. That way, we can use lastPos[x] + 1 to "remove" the first duplicate from the current substring, and test against a valid substring.

Cases

There are two possible cases that are occurring, which are shown in the figure and explained below. For now, we'll assume that lastPos can be indexed by x. Later we'll discuss how to implement the indexing for this array.

These two cases help us make sense of when to start or stop consideration of the current substring.

Solving the Problem
  1. Case 1: x has already occurred in the current substring. In this situation startInd <= lastPos[x].

    To handle this case, start a new substring after lastPos[x]-- to do this, we can do startInd = lastPos[x]+1. Update the length of the current substring as (i-startInd+1).

  2. Case 2: x has either not occurred in the string or it has occurred in some previous substring. In this case, startInd > lastPos[x] and we can read the next character of the input string and keep updating length and maximum length found so far.

Algorithm for Longest Substring Without Duplicates

The big picture pseudo-code algorithm for finding the longest substring without duplicates is provided here. We'll make the same assumption about the lastPos array as mentioned previously (that it can be indexed by x and lastPos[x] holds the last found occurrence of x in the input string).

The figure below shows the entire process for an example.

Our Algorithm
TEXT
OUTPUT
:001 > Cmd/Ctrl-Enter to run, Cmd/Ctrl-/ to comment

Time Complexity

If n is the length of the input string, then the time complexity for the algorithm described here is O(n). This is evident from the described algorithm, which traverses the entire input string from left to right only once.

The space complexity of the algorithm is, however, O(A), where A is the size of the alphabet. Most of the time, the size of the alphabet is constant (26 for lower-case letters, 25 for all letters, and 256 for ASCII) and quite low compared to memory of a modern computer. So space complexity can be thought of O(1).

Concluding Remarks

In this challenge, we discussed a linear time algorithm for finding the shortest substring of a string such that the substring has no duplicate characters

In trade-off of very little memory, the algorithm is time-efficient. But if the alphabet is very large, a more efficient data structure, that consumes less space can be used. One possibility is a hash table that would only store the characters that occurred last in the string and not the entire alphabet.

One Pager Cheat Sheet

  • We'll use an efficient method which has an O(n) time complexity and O(1) space complexity to find the longest substring without repeating characters in a string of length 100000 or less which only contains lowercase letters.
  • The problem of finding the longest substring with no repeated characters can be solved by manually checking every substring with a brute force approach.
  • Our optimized solution efficiently solves the problem with respect to time, while requiring very little extra space, by dynamically tracking the last seen character element and jumping straight to the next valid substring when a duplicate is spotted.
  • This algorithm finds the longest substring without duplicates by iterating through each character in the string and using lastPos to store the last occurrence of the character.
  • The time complexity of the algorithm discussed is O(n) and the space complexity is O(1) or, depending on the size of the alphabet, O(A).

This is our final solution.

To visualize the solution and step through the below code, click Visualize the Solution on the right-side menu or the VISUALIZE button in Interactive Mode.

JAVASCRIPT
OUTPUT
:001 > Cmd/Ctrl-Enter to run, Cmd/Ctrl-/ to comment

You're doing a wonderful job. Keep going!

If you had any problems with this tutorial, check out the main forum thread here.