std::regex

The regular expressions library provides a class that represents regular expressions, a kind of mini-language used to perform pattern matching within strings.

Raw strings

A raw string literal can be stored as follows.

std::string expr = R"(string)";

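Note that the opening R"( and the closing )" are fixed syntax. If the content contains many parentheses and becomes hard to read, use the general form below, which adds a custom delimiter.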

prefix(optional) R "delimiter( raw_characters )delimiter"
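
As a minimal sketch of the delimiter form, the pattern below contains the sequence )" itself, so a plain R"(...)" literal would end too early. The helper name containsQuotedParens and the delimiter re are hypothetical, chosen only for illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if text contains a double-quoted,
// parenthesized group such as "(abc)".
// The pattern ends with \)" so it needs a custom delimiter ("re" here);
// with a plain R"(...)" the literal would stop at the first )" it sees.
bool containsQuotedParens(std::string const & text) {
    static std::regex const pattern(R"re("\([^)]*\)")re");
    return std::regex_search(text, pattern);
}
```

Any short identifier that does not appear after a ) inside the content can serve as the delimiter.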

Match check

The following code checks whether a path contains a Windows absolute path.

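// Returns true if the path starts with a Windows drive prefix such as "C:".
static bool hasAbsoluteOfWindows(std::string const & path) {
    // Compile the pattern once instead of on every call.
    static std::regex const drive_prefix(R"(^[a-zA-Z]:)");
    return std::regex_search(path, drive_prefix);
}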
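
std::regex_search succeeds if any part of the input matches, while std::regex_match requires the whole input to match. The sketch below illustrates both, and also shows how a capture group can be read through std::smatch. The helper names driveLetterOf and isBareDrive are hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the drive letter of a Windows absolute
// path, or '\0' if the path does not start with a drive prefix.
char driveLetterOf(std::string const & path) {
    static std::regex const drive_prefix(R"(^([a-zA-Z]):)");
    std::smatch match;
    if (std::regex_search(path, match, drive_prefix)) {
        // match[0] is the whole match ("C:"), match[1] the first group ("C").
        return match[1].str()[0];
    }
    return '\0';
}

// Hypothetical helper: true only if the entire text is a drive prefix,
// because regex_match must consume the whole input.
bool isBareDrive(std::string const & text) {
    static std::regex const drive_only(R"([a-zA-Z]:)");
    return std::regex_match(text, drive_only);
}
```

For example, isBareDrive accepts "D:" but rejects "D:\Temp", whereas a regex_search with the same pattern would accept both.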

Replace

The following code is a simple way to normalize path separators.

/**
 * Collapses every run of '\' or '/' into a single '/'.
 * A trailing run of separators also becomes a single '/'.
 * "O:\\\\Temp\\Directory/..\\.////\\\\/File.tmp\\/" -> "O:/Temp/Directory/.././File.tmp/"
 */
static std::string cleanSeparator(std::string const & path) {
    return std::regex_replace(path, std::regex(R"([\\/]+)"), "/");
}
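
A run of separators at the end of the path collapses to a single trailing "/" rather than disappearing. If that trailing separator is unwanted, one option is a second step that removes it, sketched below. The name cleanSeparatorNoTrailing is hypothetical, and the sketch assumes that a lone "/" (the root) should be kept; it does not treat a drive root such as "C:/" specially.

```cpp
#include <regex>
#include <string>

// Collapses every run of '\' or '/' into a single '/'.
std::string cleanSeparator(std::string const & path) {
    static std::regex const separators(R"([\\/]+)");
    return std::regex_replace(path, separators, "/");
}

// Hypothetical helper: same as cleanSeparator, but also drops one
// trailing '/' unless the result is just "/" (assumed to mean root).
std::string cleanSeparatorNoTrailing(std::string const & path) {
    std::string cleaned = cleanSeparator(path);
    if (cleaned.size() > 1 && cleaned.back() == '/') {
        cleaned.pop_back();
    }
    return cleaned;
}
```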

Example

#include <iostream>
#include <iterator>
#include <string>
#include <regex>

int main()
{
    std::string s = "Some people, when confronted with a problem, think "
        "\"I know, I'll use regular expressions.\" "
        "Now they have two problems.";

    std::regex self_regex("REGULAR EXPRESSIONS",
            std::regex_constants::ECMAScript | std::regex_constants::icase);
    if (std::regex_search(s, self_regex)) {
        std::cout << "Text contains the phrase 'regular expressions'\n";
    }

    std::regex word_regex("(\\S+)");
    auto words_begin = 
        std::sregex_iterator(s.begin(), s.end(), word_regex);
    auto words_end = std::sregex_iterator();

    std::cout << "Found "
              << std::distance(words_begin, words_end)
              << " words\n";

    const std::string::size_type N = 6; // same type as size(), avoids a signed/unsigned comparison
    std::cout << "Words longer than " << N << " characters:\n";
    for (std::sregex_iterator i = words_begin; i != words_end; ++i) {
        std::smatch match = *i;
        std::string match_str = match.str();
        if (match_str.size() > N) {
            std::cout << "  " << match_str << '\n';
        }
    }

    std::regex long_word_regex("(\\w{7,})");
    std::string new_s = std::regex_replace(s, long_word_regex, "[$&]");
    std::cout << new_s << '\n';
}

Output:

Text contains the phrase 'regular expressions'
Found 19 words
Words longer than 6 characters:
  people,
  confronted
  problem,
  regular
  expressions."
  problems.
Some people, when [confronted] with a [problem], think "I know, I'll use [regular] [expressions]." Now they have two [problems].
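
The replacement string "[$&]" in the example uses the default ECMAScript format syntax, in which $& stands for the whole match and $1, $2, and so on stand for capture groups. As a small sketch of the capture-group form, the hypothetical helper swapPairs below rewrites each key=value pair as value=key.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: rewrites every "key=value" pair as "value=key".
// $1 and $2 refer to the first and second capture groups of the pattern.
std::string swapPairs(std::string const & text) {
    static std::regex const pair(R"((\w+)=(\w+))");
    return std::regex_replace(text, pair, "$2=$1");
}
```

Text that is not part of any match is copied to the result unchanged, so swapPairs("a=1, b=2") yields "1=a, 2=b".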

Troubleshooting

undefined reference to std::regex_iterator

On Linux systems, the linker may fail to find std::regex_iterator, as shown below.

./libworld.a(Values.world.o): In function `world::Values::replaceEnvString(std::string&) const':
Values.cpp:(.text+0x795): undefined reference to `std::regex_iterator<__gnu_cxx::__normal_iterator<char const*, std::string>, char, std::regex_traits<char> >::regex_iterator()'
collect2: error: ld returned 1 exit status

The GNU C++ standard library (libstdc++) supports <regex> only from GCC 4.9.0 onward. (The headers were present in earlier versions, but were unusable.) With an older GCC, use a different regular expression library, such as Boost.Regex.

In other words, <regex> is supported starting with GCC 4.9.0.
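
One way to surface this at compile time instead of at link time is a version guard, sketched below. The macro name HAS_USABLE_STD_REGEX and the function regexSmokeTest are hypothetical. The check relies on the compiler's __GNUC__ and __GNUC_MINOR__ macros, so it assumes that the compiler and libstdc++ versions match; Clang paired with an old libstdc++ would not be caught.

```cpp
#include <regex>
#include <string>

// GCC releases before 4.9 ship <regex> headers that do not work.
// Clang also defines __GNUC__, so it is excluded from the version test.
#if defined(__GNUC__) && !defined(__clang__) && \
    (__GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 9))
#  define HAS_USABLE_STD_REGEX 0  // hypothetical macro name
#else
#  define HAS_USABLE_STD_REGEX 1
#endif

static_assert(HAS_USABLE_STD_REGEX, "<regex> needs GCC 4.9.0 or newer");

// Hypothetical smoke test: exercises construction and searching once,
// so a broken <regex> shows up early rather than deep inside the program.
bool regexSmokeTest() {
    return std::regex_search(std::string("abc"), std::regex("b"));
}
```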
