The Foundation: Understanding the Problem
To find duplicate words in a sentence, we need a systematic way to count the occurrences of each word. Essentially, our task is to examine each word in the given string and keep track of how often it appears.
Step 1: Tokenizing the Sentence into Words
What Does Tokenizing Mean?
Tokenization is the process of converting a sequence of text into individual "tokens" or units. In our context, this means splitting the given sentence into individual words.
How to Tokenize?
We can start by converting the entire string to lowercase so that our function is case-insensitive. Then we'll use the split method to break it into an array of words.
let split_s = s.toLowerCase().split(' ');
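To see what this step produces, here is a minimal sketch on a sample sentence. The sample sentence and the `tokenize` helper are illustrative assumptions, not part of the original walkthrough. The helper splits on runs of whitespace instead of a single space, so extra spaces do not yield empty tokens.

```javascript
// Hypothetical helper: lowercase the sentence, then split on runs of whitespace.
function tokenize(sentence) {
  const trimmed = sentence.toLowerCase().trim();
  // An empty string would otherwise split into [""], so return no tokens.
  return trimmed === "" ? [] : trimmed.split(/\s+/);
}

// Assumed sample input; note the double space before "other".
const sample = "The quick fox saw the  other fox";

// Splitting on a single space leaves an empty token at the double space:
// ["the", "quick", "fox", "saw", "the", "", "other", "fox"]
console.log(sample.toLowerCase().split(" "));

// Splitting on whitespace runs avoids it:
// ["the", "quick", "fox", "saw", "the", "other", "fox"]
console.log(tokenize(sample));
```

Neither version strips punctuation, so "fox" and "fox," would still count as different words. Whether that matters depends on the input you expect.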
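let s = "Original String";
// Lowercase first so that "The" and "the" count as the same word.
let splitS = s.toLowerCase().split(" ");
// Maps each word to the number of times it has been seen.
let occurrences = {};

for (let word of splitS) {
  if (!occurrences.hasOwnProperty(word)) {
    // First time we see this word.
    occurrences[word] = 1;
  } else {
    occurrences[word]++;
  }
}

console.log(occurrences);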
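With the counts in hand, the duplicates are the words whose count is greater than one. The following is a minimal sketch of that final step. It assumes a `Map` is an acceptable substitute for the plain object used above, and `findDuplicates` is a hypothetical name chosen for illustration.

```javascript
// Hypothetical helper: return the words that appear more than once.
function findDuplicates(sentence) {
  const counts = new Map();
  for (const word of sentence.toLowerCase().split(" ")) {
    if (word === "") continue; // skip empty tokens produced by repeated spaces
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  // A Map iterates in insertion order, so duplicates come out in first-seen order.
  return [...counts].filter(([, n]) => n > 1).map(([word]) => word);
}

console.log(findDuplicates("The dog is the best dog")); // ["the", "dog"]
console.log(findDuplicates("Original String"));         // []
```

A `Map` avoids any clash between words in the sentence and property names inherited by plain objects. A plain object works too, as long as the own-property check is kept.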