Up to (and including) C++17, if you wanted to check the start or the end of a string, you had to use custom solutions, Boost, or other third-party libraries. Fortunately, this changes with C++20.
Read on: I’ll show you the new functionality and discuss a couple of examples.
Here’s the main proposal that was added into C++20:
In the new C++ Standard we get the following member functions for std::string and std::string_view: starts_with for prefix checking, and ends_with for suffix checking.
Each of them has three overloads: for a string_view, a single character, and a string literal.
You can play with this basic example @Wandbox
Token Processing Example
Below, you can find an example which takes a set of HTML tokens and extracts only the text that would be rendered on that page. It skips the HTML tags and leaves only the content and also tries to preserve the line endings.
You can play with the code at @Wandbox
The most interesting parts:
- there’s a lambda convertToEol which takes a string and then returns the same string or converts it to EOL if it detects a closing HTML tag.
- the lambda is then used in the std::transform call that converts the initial set of tokens into the temporary version.
- later the temporary tokens are removed from the vector by using another predicate lambda. This time we have a simple test for an HTML token.
- you can also see the use of std::erase_if, which works nicely on our vector. This functionality is also new to C++20, so there’s no need to use the remove/erase pattern.
- at the end we can display the final tokens that are left
Prefix and a (Sorted) Container
Let’s try another use case. For example, if you have a container of strings, then you might want to search for all elements that start with a prefix.
A simple example with an unsorted vector:
Play with code @Wandbox
In the sample code, I’m computing the foundNames vector, which contains the entries from names that start with a given prefix. The code uses copy_if with a predicate that leverages the starts_with() member function.
On the other hand, if you want better complexity for this kind of query, it might be wiser to store those strings (or string views) in a sorted container. That is the case when you have a std::set, or when you sort your container. Then, we can use lower_bound to quickly (in logarithmic time) find the first element that could match the prefix and then perform a linear search over the neighbouring elements.
Play with the code @Wandbox
As a side note, you might also try a different approach, which should be even faster. Rather than checking elements one by one starting from the lower bound iterator, we can modify the last letter of the pattern so that it’s “later” in the order, and then find lower_bound for that modified pattern as well. Then you have both ends of the matching range and better complexity (two log(n) searches). I’ll leave that experiment for you as “homework”.
All the examples that I’ve shown so far used regular std::string objects, and thus we could only compare strings case-sensitively. But what if you want to compare them case-insensitively?
For example, in Boost there are separate functions that do the job: boost::algorithm::istarts_with and boost::algorithm::iends_with.
In Qt, similar functions take an additional argument that selects the comparison technique (QString Class – starts_with).
In the Standard Library, we can go another way… and write our own trait for the string object.
As you may recall, std::string is just a specialisation of the std::basic_string<CharT, Traits, Allocator> class template, with CharT set to char and the default traits and allocator.
The traits class is used for all core operations that you can perform on characters, so you can implement a trait that compares strings case-insensitively.
You can find the examples in the following websites:
After implementing the trait, you’ll end up with a string type that is different from std::string.
Is that a limitation? For example, you won’t be able to easily copy from std::string into your new istring. For some designs, that might be fine, but on the other hand, it can also be handy to have just a simple runtime parameter or a separate function that checks case-insensitively. What’s your opinion on that?
Another option is to “normalise” the string and the pattern – for example, make them lowercase. This approach, unfortunately, requires creating extra copies of the strings, so it might not be the best.
Most recent compilers already support the new functionality!
In this article, you’ve seen how to leverage new functionality that we get with C++20: string prefix and suffix checking member functions.
You’ve seen a few examples, and we also discussed options if you want your comparisons to be case insensitive.
And you can read about other techniques of prefix and suffix checking in: