std.regex
Regular expressions are a powerful method of string pattern matching. The regular expression language used in this library is the same as that in common use; however, some of the very advanced forms may behave slightly differently. The standard observed is the ECMA standard for regular expressions.

std.regex is designed to work only with valid UTF strings as input: UTF-8 (char), UTF-16 (wchar), or UTF-32 (dchar). To validate untrusted input, use std.utf.validate().

In the following guide, pattern[] refers to a regular expression. attributes[] refers to a string controlling the interpretation of the regular expression. It consists of a sequence of one or more of the following characters:

Attribute | Action |
---|---|
g | global; repeat over the whole input string |
i | case insensitive |
m | treat as multiple lines separated by newlines |
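
The effect of each attribute can be checked with a short sketch. It assumes the regex and match functions described in this document and a current Phobos; countMatches is a hypothetical helper introduced only for illustration.

```d
import std.regex;

// Hypothetical helper: counts how many matches iteration over match() yields.
size_t countMatches(string input, string pattern, string attributes)
{
    size_t n = 0;
    foreach (m; match(input, regex(pattern, attributes)))
        ++n;
    return n;
}

void main()
{
    // "g": iteration continues over the whole input string.
    assert(countMatches("a1b22c333", `\d+`, "g") == 3);

    // "i": letter case is ignored, so all three spellings match.
    assert(countMatches("Hello HELLO hello", "hello", "gi") == 3);
    assert(countMatches("Hello HELLO hello", "hello", "g") == 1);

    // "m": ^ and $ also match at line boundaries inside the input.
    assert(countMatches("a\nb\nc", "^b$", "m") == 1);
    assert(countMatches("a\nb\nc", "^b$", "") == 0);
}
```

Attributes can be combined in one string, as "gi" does above.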

The format string passed to replace may contain the following sequences:

Format | Replaced With |
---|---|
$$ | $ |
$& | The matched substring. |
$` | The portion of string that precedes the matched substring. |
$' | The portion of string that follows the matched substring. |
$n | The nth capture, where n is a single digit 1-9 and n is not followed by a decimal digit. |
$nn | The nnth capture, where nn is a two-digit decimal number 01-99. If the nnth capture is undefined or nn exceeds the number of parenthesized subexpressions, the empty string is used instead. |
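
A minimal sketch of the format sequences in the table, assuming the replace function described later in this document and a current Phobos. applyFormat is a hypothetical wrapper, not part of the library.

```d
import std.regex;

// Hypothetical wrapper: replaces the first match of pattern using format.
string applyFormat(string input, string pattern, string format)
{
    return replace(input, regex(pattern), format);
}

void main()
{
    // $& is the matched substring.
    assert(applyFormat("noon", "^n", "[$&]") == "[n]oon");
    // $` and $' are the text before and after the match.
    assert(applyFormat("abc", "b", "<$`>") == "a<a>c");
    assert(applyFormat("abc", "b", "<$'>") == "a<c>c");
    // $1 and $2 are the first and second captures.
    assert(applyFormat("John Smith", `(\w+) (\w+)`, "$2, $1") == "Smith, John");
    // $$ produces a literal dollar sign.
    assert(applyFormat("cost: 5", `\d+`, "$$$&") == "cost: $5");
}
```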
Wikipedia
License:
Boost License 1.0.
Authors:
Walter Bright, Andrei Alexandrescu
Source:
std/regex.d
- Regular expression to extract an email address.
References:
How to Find or Validate an Email Address; RFC 2822 Internet Message Format
- Regular expression to extract a URL.
- A Regex stores a regular expression engine. A Regex object
is constructed from a string and compiled into an internal format for
performance.
The type parameter E specifies the character type recognized by
the regular expression. Currently char, wchar, and dchar are supported. The encoding of the regex string and of the
recognized strings must be the same.
A Regex is usually obtained by calling the regex function,
which deduces the character type automatically.
Example:
Declare two variables and assign a Regex object to each. The first matches UTF-8 strings; the second matches UTF-16 strings (note the w suffix on the pattern literal) and also has the global option set.
    auto r = regex("pattern");
    auto s = regex(r"p[1-5]\s*"w, "g");
- RegexMatch is the type returned by a call to match. It
stores the matching state and can be inspected and iterated.
- Get or set the engine of the match.
- Range primitives that allow incremental matching against a string.
Example:
    import std.stdio;
    import std.regex;

    void main()
    {
        // The "g" attribute makes the iteration visit every match.
        foreach (m; match("abcabcabab", regex("ab", "g")))
        {
            writefln("%s[%s]%s", m.pre, m.hit, m.post);
        }
    }
    // Prints:
    // [ab]cabcabab
    // abc[ab]cabab
    // abcabc[ab]ab
    // abcabcab[ab]
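
The same iteration can be written with the range primitives directly. This is a sketch assuming that the match result exposes empty, front, and popFront as in current Phobos; matchOffsets is a hypothetical helper.

```d
import std.regex;

// Hypothetical helper: returns the start offset of every match of pattern.
size_t[] matchOffsets(string input, string pattern)
{
    size_t[] offsets;
    auto m = match(input, regex(pattern, "g"));
    while (!m.empty)
    {
        // The length of the text before the match is the match's start offset.
        offsets ~= m.front.pre.length;
        m.popFront();
    }
    return offsets;
}

void main()
{
    assert(matchOffsets("abcabcabab", "ab") == [0, 3, 6, 8]);
    assert(matchOffsets("xyz", "ab").length == 0);
}
```

Stepping manually is useful when matching should stop early, since each popFront call searches only as far as the next match.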
- Retrieve the captured parenthesized matches, in the form of a
random-access range. The first element in the range is always the full
match.
Example:
    // Requires: import std.stdio; import std.regex;
    foreach (m; match("abracadabra", regex("(.)a(.)", "g")))
    {
        foreach (c; m.captures)
            write(c, ';');
        writeln();
    }
    // writes:
    // rac;r;c;
    // dab;d;b;
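
Because the captures form a random-access range, individual groups can also be read by index. The sketch below assumes the captures property described above; dateParts and its date pattern are illustrative only.

```d
import std.regex;

// Hypothetical helper: returns the full match followed by each capture,
// or an empty array when the input contains no date-like text.
string[] dateParts(string input)
{
    auto m = match(input, regex(`(\d+)-(\d+)-(\d+)`));
    if (m.empty)
        return null;

    string[] parts;
    foreach (c; m.captures)
        parts ~= c;
    return parts;
}

void main()
{
    auto parts = dateParts("released 2024-01-15");
    // Element 0 is the full match; elements 1..3 are the parenthesized groups.
    assert(parts == ["2024-01-15", "2024", "01", "15"]);

    // Direct indexing into the captures range.
    auto m = match("2024-01-15", regex(`(\d+)-(\d+)-(\d+)`));
    assert(m.captures[2] == "01");

    assert(dateParts("no date here").length == 0);
}
```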
- Returns the slice of the input that precedes the matched substring.
- The matched portion of the input.
- Returns the slice of the input that follows the matched substring.
- Returns hit (converted to string if necessary).
- Returns whether string s matches this.
- Matches a string against a regular expression. This is the main entry to the module's functionality. A call to match(input, regex) returns a RegexMatch object that can be used for direct inspection or for iterating over all matches (if the regular expression was built with the "g" option).
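
For direct inspection, the result of match can be tested with empty before its hit is read. A minimal sketch, assuming the behavior described above; firstHit is a hypothetical helper.

```d
import std.regex;

// Hypothetical helper: returns the first match of pattern in input,
// or an empty string when nothing matches.
string firstHit(string input, string pattern)
{
    auto m = match(input, regex(pattern));
    return m.empty ? "" : m.hit;
}

void main()
{
    assert(firstHit("order 66 shipped", `\d+`) == "66");
    assert(firstHit("no digits", `\d+`) == "");
}
```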
- Search input for matches of the regular expression regex.
Replace the first match with the string generated from format. If the regular expression has the "g" (global)
attribute, continue and replace all matches.
Parameters:
input: Range to search.
regex: Regular expression pattern.
format: Replacement string format.
Returns:
The resulting string.
Example:
    auto s = "ark rapacity";
    // Without "g", only the first match is replaced.
    assert(replace(s, regex("r"), "c") == "ack rapacity");
    // With "g", every match is replaced.
    assert(replace(s, regex("r", "g"), "c") == "ack capacity");
The replacement format can reference the match using the $&, $$, $', $`, $n, and $nn notation:
    assert(replace("noon", regex("^n"), "[$&]") == "[n]oon");
- Search the string for matches of the regular expression pattern with
the given attributes. Pass each match to function fun and replace the match
with the return value from fun.
Parameters:
s: String to search.
pattern: Regular expression pattern.
dg: Delegate.
Returns:
The resulting string.
Example:
Capitalize the letters 'a' and 'r':
    string baz(RegexMatch!(string) m)
    {
        return std.string.toupper(m.hit);
    }
    auto s = replace!(baz)("Strap a rocket engine on a chicken.", regex("[ar]", "g"));
    assert(s == "StRAp A Rocket engine on A chicken.");
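
The replacement function can also compute a new value from each match. The sketch below assumes a current Phobos, where the callback may be given as a lambda whose parameter type is inferred; doubleNumbers is a hypothetical helper.

```d
import std.conv : to;
import std.regex;

// Hypothetical helper: doubles every decimal number found in the input.
string doubleNumbers(string input)
{
    // The lambda receives each match and returns its replacement text.
    return replace!(m => to!string(to!int(m.hit) * 2))(input, regex(`\d+`, "g"));
}

void main()
{
    assert(doubleNumbers("a1 b20") == "a2 b40");
    assert(doubleNumbers("none") == "none");
}
```

Leaving the parameter type to inference avoids naming the match type, which has differed between library versions.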
- Range that splits another range using a regular expression as a
separator.
Example:
    // Requires: import std.algorithm; import std.regex;
    auto s1 = ", abc, de, fg, hi, ";
    assert(equal(splitter(s1, regex(", *")),
        ["", "abc", "de", "fg", "hi", ""][]));
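
A common use of splitter is breaking a delimited line into trimmed fields. This is a sketch assuming the splitter function shown above together with std.array.array from current Phobos; fields is a hypothetical helper.

```d
import std.array : array;
import std.regex;

// Hypothetical helper: splits a line on commas, discarding the
// whitespace around each comma, and collects the pieces into an array.
string[] fields(string line)
{
    return splitter(line, regex(`\s*,\s*`)).array;
}

void main()
{
    assert(fields("one, two,three ,  four") == ["one", "two", "three", "four"]);
    // A line without separators yields a single field.
    assert(fields("single") == ["single"]);
}
```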