MUF For Hackers - Regular Expressions


Regular Expressions

Muq MUF provides a simple regular expression syntax roughly comparable to that of Perl 4. (4)

Regular expressions are treated as a sublanguage for compiling named functions, which may then be called just like any other MUF function. A regular expression function definition begins with the rex: keyword, followed by the function name and then the regular expression proper, delimited by a printable character of your choice (typically slashes):

root:
rex: re1 /^abc[def]/
root:
"abcd" re1
root: t
pop "abcg" re1
root: nil
pop
root:

rex: re2 /^abc[d-f]/
root:
"abcd" re2
root: t
pop "abcg" re2
root: nil
pop
root:

rex: re3 /^abc|def|ghi/
root:
"abc" re3   "jkl" re3
root: t nil
pop pop
root:

rex: re4 /^a[bc]*d/
root:
"acbd" re4   "acb" re4
root: t nil
pop pop
root:
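
For readers more familiar with another regex engine, the four definitions above can be mirrored with Python's standard re module. This is only a comparison sketch, not Muq code: the rex_like helper is a hypothetical name, and it assumes that a rex: function behaves like an unanchored search returning true or false, which the examples above suggest but do not confirm.

```python
import re

def rex_like(pattern):
    """Return a predicate that is True when `pattern` is found in the text.

    Assumption: this mirrors the t / nil result of a Muq rex: function.
    It uses Python's `re` engine, not Muq's.
    """
    compiled = re.compile(pattern)
    return lambda text: compiled.search(text) is not None

re1 = rex_like(r"^abc[def]")      # explicit character set
re2 = rex_like(r"^abc[d-f]")      # the same set written as a range
re3 = rex_like(r"^abc|def|ghi")   # alternation
re4 = rex_like(r"^a[bc]*d")       # zero or more of a set

print(re1("abcd"), re1("abcg"))   # True False
print(re2("abcd"), re2("abcg"))   # True False
print(re3("abc"), re3("jkl"))     # True False
print(re4("acbd"), re4("acb"))    # True False
```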

The basic Perl 4 set of special characters and escapes is supported:

^      match start of string
$      match end of string
[a-z]  any char from a through z
[^a-z] any char BUT a through z
\s     whitespace
\S     non-whitespace
\d     digit: [0-9]
\D     non-digit
\w     word character: [0-9a-zA-Z]
\W     non-word character
x*     zero or more 'x's.
x+     one  or more 'x's.
x?     zero or one 'x'.
x{2,4} two to four 'x's.
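
The table above can be exercised one entry at a time. The sketch below does so with Python's re module rather than Muq; the whole_match helper is a hypothetical name. One known difference: Python's \w also matches the underscore, whereas the table above lists only [0-9a-zA-Z], so results for "_" may not carry over.

```python
import re

def whole_match(pattern, text):
    """True when `pattern` matches all of `text` (Python `re`, ASCII classes)."""
    return re.fullmatch(pattern, text, flags=re.ASCII) is not None

print(whole_match(r"x*", ""))            # True:  zero or more
print(whole_match(r"x+", ""))            # False: one or more
print(whole_match(r"x?", "xx"))          # False: zero or one
print(whole_match(r"x{2,4}", "xxx"))     # True:  two to four
print(whole_match(r"x{2,4}", "xxxxx"))   # False: five is too many
print(whole_match(r"[^a-z]", "Q"))       # True:  anything but a through z
print(whole_match(r"\s\S", " a"))        # True:  whitespace, then non-whitespace
print(whole_match(r"\d\D", "7x"))        # True:  digit, then non-digit
print(whole_match(r"\w\W", "a-"))        # True:  word char, then non-word char
```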

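Parens in a regular expression group part of it, and also mark a part whose matched text is returned as an additional result of the function. As the last example below shows, a failed match returns nil along with an empty string in place of the matched text: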

root:
rex: re6 /^a(b*)c/
root:
"ac" re6
root: t ""
pop pop
root:
"abbbbbbbc" re6
root: t "bbbbbbb"
pop pop
root:
"adc" re6
root: nil ""
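
The result shape shown above (a success flag followed by the text of each paren group, with empty strings when the match fails) can be imitated in Python for comparison. This is a sketch under assumptions: rex_with_groups is a hypothetical helper built on Python's re module, and padding a failed match with one empty string per group is inferred from the "adc" example rather than from any documented Muq rule.

```python
import re

def rex_with_groups(pattern):
    """Build a function returning (matched, group1, group2, ...).

    Assumption: on failure every group slot is filled with "",
    imitating the `nil ""` result in the session above.
    """
    compiled = re.compile(pattern)

    def run(text):
        m = compiled.search(text)
        if m is None:
            return (False,) + ("",) * compiled.groups
        # A group that took no part in the match is None in Python.
        return (True,) + tuple(g if g is not None else "" for g in m.groups())

    return run

re6 = rex_with_groups(r"^a(b*)c")
print(re6("ac"))          # (True, '')
print(re6("abbbbbbbc"))   # (True, 'bbbbbbb')
print(re6("adc"))         # (False, '')
```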

As usual, a numeric escape such as \2 matches whatever the corresponding pair of parens in the regular expression (the second, for \2) matched. The following example uses \1 to refer to the first pair:

root:
rex: re5 /^(abc)\1$/
root:
"abcabc" re5
root: t "abc"
pop pop
root:
"abcacb" re5
root: nil ""
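
Backreferences work the same way in most Perl-derived engines, so the behaviour above can be compared against Python's re module. The sketch below is not Muq code; the second pattern, mirrored4, is a hypothetical example added only to show \2 alongside \1.

```python
import re

# Same pattern as re5 above: "abc" followed by a repeat of group 1.
re5 = re.compile(r"^(abc)\1$")
m = re5.search("abcabc")
print(m.group(1))             # abc
print(re5.search("abcacb"))   # None

# Two groups: \2 repeats the second char, \1 repeats the first.
mirrored4 = re.compile(r"^(\w)(\w)\2\1$")
print(mirrored4.search("abba") is not None)   # True
print(mirrored4.search("abab") is not None)   # False
```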

Regular expressions can be a great aid in digesting formatted text:

root:
rex: matchDate /^\s*(\w+)\s*(\d+)(\w*),\s*(\d+)\s*$/
root:
" July 4th, 1999  " matchDate
root: t "July" "4" "th" "1999"
pop pop pop pop pop
root:
"Aug8,1999" matchDate
root: t "Aug" "8" "" "1999"
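
For comparison, the same date pattern can be tried in Python's re module, where it yields the same four pieces for both inputs. This is a sketch rather than Muq code: parse_date and the dictionary keys are hypothetical names chosen for illustration.

```python
import re

# Same pattern as matchDate above.
DATE_RE = re.compile(r"^\s*(\w+)\s*(\d+)(\w*),\s*(\d+)\s*$")

def parse_date(text):
    """Split text like ' July 4th, 1999 ' into its parts, or return None."""
    m = DATE_RE.search(text)
    if m is None:
        return None
    month, day, suffix, year = m.groups()
    return {"month": month, "day": int(day), "suffix": suffix, "year": int(year)}

print(parse_date(" July 4th, 1999  "))
# {'month': 'July', 'day': 4, 'suffix': 'th', 'year': 1999}
print(parse_date("Aug8,1999"))
# {'month': 'Aug', 'day': 8, 'suffix': '', 'year': 1999}
print(parse_date("1999-07-04"))
# None
```

In the second input, \w+ first swallows "Aug8" and then gives the "8" back so that \d+ has a digit to match; that backtracking is why the month comes out as "Aug".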
