Use Regular Expressions in Emacs

What is a regular expression? A regular expression is a special text string that describes a set of strings to match. Generally, a regular expression is made up of a few special characters mixed with ordinary characters. If you have used Linux or Windows, you are probably familiar with wildcard notations such as *.txt to find all text files in a folder. That is a close relative of a simple regular expression (thanks to netcasper for the correction: the regex equivalent of “*.txt” is .*\.txt). Nowadays, most text editors support regular expressions, such as vi, Emacs and UltraEdit (not free software). Today, I would like to introduce some basic knowledge about regular expressions and show how to use them in Emacs to improve the efficiency of your coding work.
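To make the wildcard comparison concrete, here is a minimal sketch in Python rather than Emacs (an assumption on my part, chosen only because it is easy to run). The file names are made up for illustration. It shows the wildcard and the regex from the correction above selecting the same names.

```python
import fnmatch
import re

# Hypothetical file names, used only for illustration.
filenames = ["notes.txt", "report.txt", "image.png", "txt"]

# Shell-style wildcard: `*` alone means "any run of characters".
by_wildcard = [name for name in filenames if fnmatch.fnmatchcase(name, "*.txt")]

# Regular expression: `.*` means "any run of characters", `\.` is a literal dot.
pattern = re.compile(r".*\.txt")
by_regex = [name for name in filenames if pattern.fullmatch(name)]

print(by_wildcard)
print(by_regex)
```

In the regex, `*` does nothing on its own; it only repeats whatever comes before it, which is why the wildcard `*` has to become `.*`.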


If you are not familiar with Emacs, you can visit GNU Emacs to find out more details about this great tool.
Let’s begin with the basics of regular expressions. I have already said that a regular expression contains some special characters. Here are the special characters: `$’, `^’, `.’, `*’, `+’, `?’, `[‘, `]’ and `\’. These special characters help you express the text you want to match. Any other character appearing in a regular expression is ordinary, unless a `\’ precedes it.

.
is a special character that matches any single character except a newline. Using concatenation, we can make regular expressions like `a.b’, which matches any three-character string that begins with `a’ and ends with `b’.
*
is not a construct by itself; it is a postfix operator that means to match the preceding regular expression repetitively as many times as possible. Thus, `o*’ matches any number of `o’s (including no `o’s).
+
is a postfix operator, similar to `*’ except that it must match the preceding expression at least once. So, for example, `ca+r’ matches the strings `car’ and `caaaar’ but not the string `cr’, whereas `ca*r’ matches all three strings.
?
is a postfix operator, similar to `*’ except that it can match the preceding expression either once or not at all. For example, `ca?r’ matches `car’ or `cr’; nothing else.
[ … ]
is a character set, which begins with `[‘ and is terminated by `]’. Thus, `[ad]’ matches either one `a’ or one `d’, and `[ad]*’ matches any string composed of just `a’s and `d’s (including the empty string).
You can also include character ranges in a character set, by writing the starting and ending characters with a `-‘ between them. Thus, `[a-z]’ matches any lower-case ASCII letter. When you use a range in case-insensitive search, you should write both ends of the range in upper case, or both in lower case, or both should be non-letters. The behavior of a mixed-case range such as `A-z’ is somewhat ill-defined, and it may change in future Emacs versions.
[^ … ]
`[^’ begins a complemented character set, which matches any character except the ones specified. Thus, `[^a-z0-9A-Z]’ matches all characters except ASCII letters and digits.
`^’ is not special in a character set unless it is the first character. The character following the `^’ is treated as if it were first (in other words, `-‘ and `]’ are not special there).
A complemented character set can match a newline, unless newline is mentioned as one of the characters not to match. This is in contrast to the handling of regexps in programs such as grep.
^
is a special character that matches the empty string, but only at the beginning of a line in the text being matched. Otherwise it fails to match anything. Thus, `^foo’ matches a `foo’ that occurs at the beginning of a line.
$
is similar to `^’ but matches only at the end of a line. Thus, `x+$’ matches a string of one `x’ or more at the end of a line.

\
has two functions: it quotes the special characters (including `\’), and it introduces additional special constructs.
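The examples in the list above can be tried out quickly. The following is a small sketch in Python, not Emacs Lisp (my assumption, for easy experimenting). For these basic operators Python's `re` module behaves the same way, with one difference worth noting: Python needs the `re.MULTILINE` flag before `^` and `$` match at line boundaries. The helper `whole` and the sample text are hypothetical names introduced just for this sketch.

```python
import re

def whole(pattern, text):
    """True when `pattern` matches all of `text` (Python `re` syntax)."""
    return re.fullmatch(pattern, text) is not None

checks = {
    "a.b matches axb": whole(r"a.b", "axb"),
    "a.b does not match across a newline": not whole(r"a.b", "a\nb"),
    "o* matches the empty string": whole(r"o*", ""),
    "o* matches ooo": whole(r"o*", "ooo"),
    "ca+r matches car and caaaar": whole(r"ca+r", "car") and whole(r"ca+r", "caaaar"),
    "ca+r does not match cr": not whole(r"ca+r", "cr"),
    "ca*r matches cr": whole(r"ca*r", "cr"),
    "ca?r matches car and cr": whole(r"ca?r", "car") and whole(r"ca?r", "cr"),
    "ca?r does not match caar": not whole(r"ca?r", "caar"),
    "[ad]* matches adda and the empty string": whole(r"[ad]*", "adda") and whole(r"[ad]*", ""),
    "[ad]* does not match abd": not whole(r"[ad]*", "abd"),
    "[a-z] matches q but not Q": whole(r"[a-z]", "q") and not whole(r"[a-z]", "Q"),
    "[^a-z0-9A-Z] matches - and a newline": whole(r"[^a-z0-9A-Z]", "-") and whole(r"[^a-z0-9A-Z]", "\n"),
    "[^a-z0-9A-Z] does not match q": not whole(r"[^a-z0-9A-Z]", "q"),
    "a\\.b matches only a literal dot": whole(r"a\.b", "a.b") and not whole(r"a\.b", "axb"),
}

# `^` and `$` match at line boundaries; Python needs re.MULTILINE for that.
sample = "foo bar\nbar foo\nxxx"
line_starts = re.findall(r"^foo", sample, flags=re.MULTILINE)
line_ends = re.findall(r"x+$", sample, flags=re.MULTILINE)

for description, ok in checks.items():
    print(("ok   " if ok else "FAIL ") + description)
print(line_starts, line_ends)
```

Inside Emacs itself, the same patterns can be tried interactively with M-x re-builder or an incremental regexp search (C-M-s).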

For more information about regular expressions, you can check out this site. Now it’s time to illustrate with a real example from my project. The problem is that I wanted to modify my code in the following way:
All the statements that follow this pattern

zCPiADPL2PL2Mode::theOne.isenable = true;

should be replaced with

zCPiStatusBoss::set(this,&zCPiADPL2PL2Mode::theOne,true);

And there are hundreds of statements like this. Doing this job manually would really be a challenge to your patience. Fortunately, I have my friend Emacs! Let me tell you how Emacs helped me out. It’s a very simple task for Emacs. Just use the key combination ‘Alt + x’ (M-x) to enter a command in Emacs, run the command “replace-regexp”, and enter the following regular expression:

\(zCPi[a-zA-Z0-9]*::theOne\)\.isenable = \([truefals]*\);$

and press “Enter”, then type the text you want to replace it with:

zCPiStatusBoss::set(this,&\1,\2);

OK, everything is done! The above example is very simple. The one thing that needs explaining is that the \1 in the second expression stands for the text matched by the first group, \(zCPi[a-zA-Z0-9]*::theOne\). Similarly, \2 stands for the text matched by \([truefals]*\). What does this mean? Emacs uses `\(’ and `\)’ to capture the parts of the matched text that you want to reuse in the replacement text.
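If you want to check the substitution outside Emacs, the same idea can be sketched in Python (my assumption, not part of the original workflow). Note the syntax differences: Python writes groups as plain `(` and `)` where Emacs uses `\(` and `\)`, and it needs `re.MULTILINE` for `$` to match at the end of each line. Because the first group already includes `::theOne`, the replacement here uses `\1` on its own. The second and third input lines are hypothetical, added only to show that other statements are handled or left alone.

```python
import re

source = (
    "zCPiADPL2PL2Mode::theOne.isenable = true;\n"
    "zCPiOtherMode::theOne.isenable = false;\n"  # hypothetical second statement
    "int unrelated = 0;\n"                       # hypothetical line that must not change
)

# Group 1 captures the object, group 2 captures the value being assigned.
pattern = re.compile(
    r"(zCPi[a-zA-Z0-9]*::theOne)\.isenable = ([truefals]*);$",
    flags=re.MULTILINE,
)

# \1 and \2 insert the text captured by the first and second group.
replacement = r"zCPiStatusBoss::set(this,&\1,\2);"

result = pattern.sub(replacement, source)
print(result)
```

The set `[truefals]*` accepts any run of those letters, not only `true` and `false`. A stricter choice would be an alternation, written `\(true\|false\)` in Emacs or `(true|false)` in Python.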
