Trusted answers to developer questions

Related Tags

c++
string
communitycreator

How to tokenize a string in C++

Harsh Jain

In this shot, we will learn how to tokenize a string in C++, that is, split it into substrings (tokens) at a chosen delimiter. The standard library already offers two routes: reading from a std::stringstream object, or calling the C library function strtok(). Here, however, we will write our own tokenizer.
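
For comparison, the two standard-library routes mentioned above might look like the sketch below. The helper names split_with_stream and split_with_strtok are hypothetical and chosen only for illustration; std::istringstream, std::getline, and std::strtok are the actual standard facilities. Because std::strtok writes into the buffer it scans, the sketch runs it on a copy of the input.

```cpp
#include <cstring>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split with std::istringstream and std::getline.
std::vector<std::string> split_with_stream(const std::string& text, char delim) {
    std::vector<std::string> tokens;
    std::istringstream stream(text);
    std::string token;
    // getline reads up to (and discards) the next delimiter.
    while (std::getline(stream, token, delim)) {
        tokens.push_back(token);
    }
    return tokens;
}

// Hypothetical helper: split with std::strtok.
std::vector<std::string> split_with_strtok(const std::string& text, const char* delims) {
    std::vector<std::string> tokens;
    // strtok overwrites delimiters with '\0', so work on a mutable copy.
    std::vector<char> buffer(text.begin(), text.end());
    buffer.push_back('\0');
    for (char* tok = std::strtok(buffer.data(), delims); tok != nullptr;
         tok = std::strtok(nullptr, delims)) {
        tokens.emplace_back(tok);
    }
    return tokens;
}
```

The two helpers treat repeated delimiters differently: for the input "a,b,,c" with a comma delimiter, the stream version keeps the empty token between the two commas, while the strtok version skips it.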

Follow the steps below to tokenize a string:

  • Read the complete string.

  • Select the delimiter, that is, the character at which you want to split the string. In this example, we will split the string at every space.

  • Iterate over all the characters of the string and check when the delimiter is found.

  • If the delimiter is found, push the word built so far to a vector.

  • Repeat this process until you have traversed the complete string.

  • The last token is not followed by a delimiter, so it needs to be pushed to the vector after the loop.

  • Finally, return the vector of tokens.

Now let’s look at the code for clarity.

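#include <bits/stdc++.h>  // non-standard GCC header; portable code would include <iostream>, <string>, <vector>
using namespace std;

vector<string> mystrtok(const string& str, char delim){
    vector<string> tokens;  // collected tokens
    string temp = "";       // token currently being built
    for(size_t i = 0; i < str.length(); i++){
        if(str[i] == delim){
            tokens.push_back(temp);  // delimiter found: store the finished token
            temp = "";
        }
        else
            temp += str[i];  // extend the current token
    }
    tokens.push_back(temp);  // the last token has no trailing delimiter
    return tokens;
}

int main() {
    string s = "Learn in-demand tech skills in half the time";
    vector<string> tokens = mystrtok(s, ' ');
    for(const string& token : tokens)
        cout << token << '\n';
}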
Build your own string tokenizer in C++

Explanation

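  • In line 1, we include bits/stdc++.h, a non-standard header shipped with GCC that pulls in the whole standard library, so we do not need to include each header explicitly. Portable code would include <iostream>, <string>, and <vector> instead.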
  • In line 4, we create a mystrtok() function that accepts the string and the delimiter and returns a vector of tokens.
  • In line 5, we create a vector that will store the tokens.
  • From lines 7 to 14, we run a loop to traverse each character in the string. If we find the delimiter, then we push the token to the vector. Otherwise, we continue to build our token.
  • In line 15, as discussed above, the last token is not followed by a delimiter, so the loop never pushes it. We therefore push it to the vector after the loop.
  • In line 16, we return the vector of tokens.
  • In the main() function, we call the function and then print every token.
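
One consequence of pushing a token at every delimiter is that leading, trailing, or consecutive delimiters produce empty tokens. If that is not wanted, a variant that skips empty tokens might look like the sketch below. The name tokenize_skip_empty is hypothetical, and the function is an adaptation of the approach above rather than part of the original code.

```cpp
#include <string>
#include <vector>

// Hypothetical variant of mystrtok() that drops empty tokens.
std::vector<std::string> tokenize_skip_empty(const std::string& str, char delim) {
    std::vector<std::string> tokens;
    std::string current;  // token currently being built
    for (char ch : str) {
        if (ch == delim) {
            // Only store the token if at least one character was collected.
            if (!current.empty()) {
                tokens.push_back(current);
                current.clear();
            }
        } else {
            current += ch;
        }
    }
    // The last token has no trailing delimiter, so handle it after the loop.
    if (!current.empty()) {
        tokens.push_back(current);
    }
    return tokens;
}
```

For the input " a  b " (one leading space, two spaces in the middle, one trailing space), mystrtok() returns five tokens, three of them empty, whereas this variant returns only "a" and "b".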

In this way, we can build our own string tokenizer in C++ to use in string problems.
