

regular expression syntax

This node is not a tutorial, just a quick synopsis of syntax.

Current Muq MUF regular expression syntax is implemented primarily in muq/pkg/175-C-rex.t -- which is incidentally a good example of a compiler for an embedded language in MUF -- with some C-coded support in muq/c/rex.t.

Muq MUF follows the lead of Larry Wall's Perl language in regular expression syntax: his syntax has a simple quoting rule and minimizes the number of backslashes needed in typical uses of regular expressions. (I wince every time I read a regular expression written in Emacs Lisp.) The current implementation follows Perl v4: one of these days we'll probably add support for the v5 extensions.

Here are the regular expression operators Muq MUF currently recognizes:

a       Match given char.  Ditto other chars not listed below.
.       Match any single character except a newline.
[a-z]   Match any single character in the given ASCII range.
[^a-z]  Match any single character not in the given ASCII range.

\d      Match any single decimal digit: Same as [0-9].
\D      Inverse of above:  Same as [^0-9].

\s      Match any single whitespace char.
\S      Inverse of above.

\w      Match any single word-component char:  Same as [a-zA-Z0-9_].
\W      Inverse of above:  Same as [^a-zA-Z0-9_].

^ $     Match beginning or end of given string, respectively.

\n \r \t \f \0 Match newline/carriage-return/tab/formfeed/nul (respectively).

(<rex>) Parens group interior <rex> into a unit for exterior operators;
        Parens also remember the substring they matched.

<rex>*  Matches <rex> zero or more times.

<rex>+  Matches <rex> one or more times.

<rex>?  Matches <rex> zero or one times.

<rex>{n} Matches <rex> exactly n times.

<rex>{n,} Matches <rex> at least n times.

<rex>{n,m} Matches <rex> at least n and at most m times.

<rex0>|<rex1>   Match exactly one of <rex0> or <rex1>.
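
The operators above can be tried out in any engine that shares the Perl-style syntax. The sketch below uses Python's re module purely as a stand-in: it is not Muq MUF code, and the variable names and sample strings are made up for illustration. The re.ASCII flag is passed so that \w means [a-zA-Z0-9_] as described above.

```python
import re

# Python's re module stands in for the Muq MUF matcher here (an assumption:
# both follow Perl-style syntax for the operators shown).

# Parens group a sub-expression and remember the substring it matched.
m = re.search(r"(\w+)@(\w+)", "mail bob@example today", flags=re.ASCII)
user, host = m.group(1), m.group(2)

# ^ and $ anchor the match; {n} and {n,m} bound the repetition count.
phone = re.search(r"^\d{3}-\d{2,4}$", "555-0199")

# Alternation matches one branch or the other.
pets = re.findall(r"cat|dog", "cat, dog, cow")

# "." matches any single character except a newline.
dot = re.search(r"a.c", "a\nc")
```

Here user and host hold the two remembered substrings, phone is a match object, pets lists the two words that matched a branch, and dot is None because the only character between "a" and "c" is a newline.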

Hints:

To include ^ in a character set, make it non-first: [a-z^].
To include - in a character set, make it first: [-az].
To subdue all operators, precede all nonalphanumeric chars with "\".
In particular, to include a literal backslash, insert "\\".
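
The hints above can be checked the same way. Again this is only a sketch in Python's re module, assumed to behave like the Perl-style syntax described here; it is not Muq MUF code, and the sample strings are illustrative.

```python
import re

# A non-first ^ inside a character set is an ordinary character.
caret = re.findall(r"[a-z^]", "a^B")

# A leading - inside a character set is an ordinary character.
dash = re.findall(r"[-az]", "a-m-z")

# A backslash before a nonalphanumeric char subdues the operator.
dots = re.findall(r"\.", "a.b.c")

# Two backslashes in the pattern match one literal backslash.
# (The subject string "C:\\tmp" contains a single backslash character.)
backslash = re.search(r"\\", "C:\\tmp")
```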

