String Tokenization
String tokenization is the process of splitting a string into smaller parts or tokens based on a specific delimiter or set of delimiters. Each token represents a meaningful unit of information within the string.
In C++, there are multiple ways to tokenize a string:
- Using the std::stringstream class
- Using the std::string member functions
- Using the boost::tokenizer library
Let's explore each of these methods in more detail.
Tokenization using std::stringstream
The std::stringstream class provides a convenient way to split a string into tokens: wrap the string in a stream, then call std::getline with a delimiter character to read one token at a time. Here's an example:
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

int main() {
    std::string str = "Hello,World,How,Are,You";

    std::vector<std::string> tokens;
    std::stringstream ss(str);
    std::string token;

    // Read up to each ',' and store the text before it as one token.
    while (std::getline(ss, token, ',')) {
        tokens.push_back(token);
    }

    // Print each token on its own line.
    for (const auto& t : tokens) {
        std::cout << t << std::endl;
    }

    return 0;
}