Regular expressions

In addition to Perl regular expressions, UltraEdit supports two other "legacy" styles: a proprietary regular expression syntax and a basic Unix syntax. We typically recommend using Perl regular expressions, as they are far more powerful and robust than the two legacy styles.

UltraEdit (legacy) style syntax

Symbol         Function
%              Matches the start of line - Indicates the search string must be at the beginning of a line but does not include any line terminator characters in the resulting string selected.
$              Matches the end of line - Indicates the search string must be at the end of line but does not include any line terminator characters in the resulting string selected.
?              Matches any single character except newline.
*              Matches any number of occurrences of any character except newline.
+              Matches one or more of the preceding single character/character set. At least one occurrence of the character must be found.
++             Matches the preceding single character/character set zero or more times.
^b             Matches a page break.
^p             Matches a newline (CR/LF) (paragraph) (DOS Files)
^r             Matches a newline (CR Only) (paragraph) (MAC Files)
^n             Matches a newline (LF Only) (paragraph) (UNIX Files)
^t             Matches a tab character
[xyz]          A character set. Matches any characters between brackets.
[~xyz]         A negative character set. Matches any characters NOT between brackets including newline characters.
^{A^}^{B^}     Matches expression A OR B
^              Overrides the following regular expression character
^(...^)        Brackets or tags an expression to use in the replace command. A regular expression may have up to 9 tagged expressions, numbered according to their order in the regular expression. The corresponding replacement expression is ^x, for x in the range 1-9. Example: If ^(h*o^) ^(f*s^) matches "hello folks", ^2 ^1 would replace it with "folks hello".
Note: ^ refers to the character '^', not the Ctrl key.

Examples:
m?n            matches "man", "men", "min" but not "moon".
t*t            matches "test", "tonight" and "tea time" (the "tea t" portion) but not "tea time" (newline between "tea " and "time").
Te+st          matches "test", "teest", "teeeest" etc. but does not match "tst".
[aeiou]        matches every lowercase vowel
[,.?]          matches a literal ",", "." or "?".
[0-9a-z]       matches any digit, or lowercase letter
[~0-9]         matches any character except a digit (~ means NOT the following)
You may search for an expression A or B as follows: "^{John^}^{Tom^}" This will search for an occurrence of John or Tom. There should be nothing between the two expressions. You may combine A or B and C or D in the same search as follows: "^{John^}^{Tom^} ^{Smith^}^{Jones^}" This will search for John or Tom followed by Smith or Jones.

Unix (legacy) style syntax

Symbol         Function
\              Indicates the next character has a special meaning. "n" on its own matches the character "n". "\n" matches a linefeed or newline character. See examples below (\d, \f, \n etc).
^              Matches/anchors the beginning of line.
$              Matches/anchors the end of line.
*              Matches the preceding single character/character set zero or more times.
+              Matches one or more of the preceding single character/character set. At least one occurrence of the preceding character or one of the characters in preceding character set must be found.
.              Matches any single character except a newline character. Does not match repeated newlines.
(expression)   Brackets or tags an expression to use in the replace command. A regular expression may have up to 9 tagged expressions, numbered according to their order in the regular expression. The corresponding replacement expression is \x, for x in the range 1-9. Example: If (h.*o) (f.*s) matches "hello folks", \2 \1 would replace it with "folks hello".
[xyz]          A character set. Matches any characters between brackets.
[^xyz]         A negative character set. Matches any characters NOT between brackets including newline characters.
\d             Matches a digit character. Equivalent to [0-9].
\D             Matches a nondigit character. Equivalent to [^0-9].
\f             Matches a form-feed character.
\n             Matches a linefeed character.
\r             Matches a carriage return character.
\s             Matches any whitespace including space, tab, form-feed, etc but not newline.
\S             Matches any non-whitespace character but not newline.
\t             Matches a tab character.
\v             Matches a vertical tab character.
\w             Matches any alphanumeric character including underscore.
\W             Matches any character except alphanumeric characters and underscore.
\p             Matches CR/LF (same as \r\n) to match a DOS line terminator.
Note: ^ refers to the character '^', not the Ctrl key.

Examples:
m.n            matches "man", "men", "min" but not "moon".
Te+st          matches "test", "teest", "teeeest" etc. BUT NOT "tst".
Te*st          matches "test", "teest", "teeeest" etc. AND "tst".
[aeiou]        matches every lowercase vowel
[,.?]          matches a literal ",", "." or "?".
[0-9a-z]       matches any digit, or lowercase letter
[^0-9]         matches any character except a digit (^ means NOT the following)
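
The Unix-style examples above can also be tried outside UltraEdit. The following is a minimal sketch using Python's re module, which is a separate engine from UltraEdit's, so it only illustrates the syntax the two have in common. The helper name matches and the sample strings are hypothetical.

```python
import re

# UltraEdit searches ignore case unless Match Case is selected,
# so this sketch uses re.IGNORECASE to behave the same way.
FLAGS = re.IGNORECASE


def matches(pattern: str, text: str) -> bool:
    """Return True when the whole text matches the pattern."""
    return re.fullmatch(pattern, text, FLAGS) is not None


print(matches(r"m.n", "man"))    # True: "." stands for exactly one character
print(matches(r"m.n", "moon"))   # False: two characters sit between m and n
print(matches(r"Te+st", "tst"))  # False: "+" needs at least one "e"
print(matches(r"Te*st", "tst"))  # True: "*" also accepts zero "e"

# Tagged expressions: \2 \1 swaps the two tagged words.
swapped = re.sub(r"(h.*o) (f.*s)", r"\2 \1", "hello folks")
print(swapped)                   # folks hello
```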
You may search for an expression A or B as follows: "(John|Tom)" This will search for an occurrence of John or Tom. There should be nothing between the two expressions. You may combine A or B and C or D in the same search as follows: "(John|Tom) (Smith|Jones)" This will search for John or Tom followed by Smith or Jones.

If regular expressions aren't enabled for a find/replace, the following special characters are also valid in the Find and Replace fields:
Notation       Represents
^t             Tab character
^p             New line (DOS files - CR/LF, or hex 0D 0A)
^r             Carriage return (hex 0D)
^n             Line feed (new line in Unix based text files) (hex 0A)
^b             Line break
^s             Selected text
^c             Clipboard contents (up to 30,000 characters)
^^             Literal "^" character
Note: ^ refers to the character '^' , not the Ctrl key.
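
To make the non-regex notation concrete, here is a small Python sketch that expands it into the characters listed in the table. It is an illustration only, not UltraEdit's implementation: the function name expand_caret_notation and its selection and clipboard parameters are hypothetical stand-ins for state that the editor itself would supply, and ^b is deliberately left untouched because the table gives no hex value for it.

```python
# Notations whose meaning is fixed, taken from the table above.
FIXED = {"t": "\t", "p": "\r\n", "r": "\r", "n": "\n", "^": "^"}


def expand_caret_notation(text: str, selection: str = "", clipboard: str = "") -> str:
    """Expand ^t, ^p, ^r, ^n, ^^, ^s and ^c; leave anything else as typed."""
    context = {"s": selection, "c": clipboard}
    out = []
    i = 0
    while i < len(text):
        code = text[i + 1 : i + 2]  # empty string at the end of the text
        if text[i] == "^" and code in FIXED:
            out.append(FIXED[code])
            i += 2
        elif text[i] == "^" and code in context:
            out.append(context[code])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


print(repr(expand_caret_notation("name^tvalue^p")))
print(repr(expand_caret_notation("x = ^s", selection="total")))
```

Because ^^ is consumed as a pair, the text "^^t" expands to a literal "^" followed by "t" rather than to a tab.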

Regular expressions are patterns, rather than specific strings, that are used with Find/Replace operations. There are many ways that regular expressions may be used to streamline operations and enhance efficiency. We have listed below a reference key for both UltraEdit-style and UNIX-style regular expressions as well as some examples to demonstrate how regular expressions may be used in UltraEdit.

Regular Expressions in UltraEdit
UltraEdit Symbol   UNIX Symbol   Function
%                  ^             Matches/anchors the beginning of line.
$                  $             Matches/anchors the end of line.
?                  .             Matches any single character except a newline character. Does not match repeated newlines.
*                  .*            Matches any number of occurrences of any character except newline.
+                  +             Matches one or more of the preceding character/expression. At least one occurrence of the character must be found. Does not match repeated newlines.
++                 *             Matches the preceding character/expression zero or more times. Does not match repeated newlines.
^                  \             Indicates the next character has a special meaning. "n" on its own matches the character "n". "^n" (UE expressions) or "\n" (UNIX expressions) matches a linefeed or newline character. See examples below.
[ ]                [ ]           Matches any single character or range in the brackets.
[~xyz]             [^xyz]        A negative character set. Matches any characters NOT between brackets.
^b                 \f            Matches a page break/form feed character.
^p                 \p            Matches a newline (CR/LF) (paragraph) (DOS Files).
^r                 \r            Matches a newline (CR Only) (paragraph) (MAC Files).
^n                 \n            Matches a newline (LF Only) (paragraph) (UNIX Files).
^t                 \t            Matches a tab character.
[0-9]              \d            Matches a digit character.
[~0-9]             \D            Matches a non-digit character.
[ ^t^b]            \s            Matches any white space including space, tab, form feed, etc., but not newline.
[~ ^t^b]           \S            Matches any non-white space character but not newline.
(none)             \v            Matches a vertical tab character.
[a-z_]             \w            Matches any word character including underscore.
[~a-z_]            \W            Matches any non-word character.
^{A^}^{B^}         (A|B)         Matches expression A OR B.
^                  \             Overrides the following regular expression character.
^(...^)            (...)         Brackets or tags an expression to use in the replace command. A regular expression may have up to 9 tagged expressions, numbered according to their order in the regular expression.
^1                 \1            Numerical reference to tagged expressions. Text matched with tagged expressions may be used in Replace commands with this format.
Note: ^ refers to the character '^', not the Ctrl key.
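
The correspondence in the table can be read as a set of rewriting rules. The sketch below applies those rules to turn an UltraEdit-style expression into one that Python's re module accepts. It is a rough illustration that covers only the symbols in the table, not a faithful reimplementation of UltraEdit's engine. The function name ue_to_python is hypothetical, and the translated pattern assumes LF or CR/LF text searched with re.MULTILINE.

```python
import re

# UltraEdit caret codes and the escapes Python uses for the same characters.
CARET_CODES = {"p": "\\r\\n", "r": "\\r", "n": "\\n", "t": "\\t", "b": "\\f"}
# Symbols that act as operators when they appear outside a character set.
OPERATORS = {"%": "^", "$": "$", "?": "[^\\n]", "*": "[^\\n]*", "+": "+"}


def ue_to_python(pattern: str) -> str:
    """Translate a subset of UltraEdit legacy syntax into Python re syntax."""
    out = []
    i = 0
    in_set = False
    while i < len(pattern):
        ch = pattern[i]
        nxt = pattern[i + 1 : i + 2]
        if ch == "^" and nxt:
            if nxt in CARET_CODES:
                out.append(CARET_CODES[nxt])
            elif nxt in "()" and not in_set:
                out.append(nxt)                  # ^( ... ^) tags a group
            elif nxt == "{" and not in_set:
                out.append("(?:")                # ^{ opens an alternative
            elif nxt == "}" and not in_set:
                if pattern.startswith("^{", i + 2):
                    out.append("|")              # ^}^{ separates A from B
                    i += 2
                else:
                    out.append(")")
            else:
                out.append(re.escape(nxt))       # ^x makes x literal
            i += 2
            continue
        if in_set:
            if ch == "]":
                in_set = False
                out.append("]")
            else:
                out.append("\\" + ch if ch in "\\[" else ch)
        elif ch == "[":
            in_set = True
            if nxt == "~":
                out.append("[^")                 # [~...] is a negated set
                i += 1
            else:
                out.append("[")
        elif ch == "+" and nxt == "+":
            out.append("*")                      # ++ means zero or more
            i += 1
        elif ch in OPERATORS:
            out.append(OPERATORS[ch])
        else:
            out.append(re.escape(ch))
        i += 1
    return "".join(out)


find = ue_to_python("%^([a-z]+^) ^([a-z]+^), ^(*^), ^(*^), ^(*^), ^([0-9]+^)")
line = "John Smith, 385 Central Ave., Cincinnati, OH, 45238"
flags = re.IGNORECASE | re.MULTILINE
print(re.sub(find, r"\6, \2, \1, \3, \4, \5", line, flags=flags))
```

The translator treats a run such as ^{A^}^{B^}^{C^} as a single three-way alternation, which is a simplification of its own and not something this page says UltraEdit supports.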


UltraEdit/UNIX Regular Expression Examples


Simple String Matching

Simple string matching is the most basic use of regular expressions. Even so, a single pattern can match several different strings at once, which saves you from running multiple Find operations.

UltraEdit RegExp:

Find What: m?n
Matches: "man" and "men" but not "moon"

Find What: t*t
Matches: "test", "tonight" and "tea time" (the "tea t" portion) but not "tea
time" (newline between "tea " and "time").

Find What: Te+st
Matches: "test", "teest", "teeeest", etc. but does not match "tst"

UNIX RegExp:

Find What: m.n
Matches: "man" and "men" but not "moon"

Find What: t.*t
Matches: "test", "tonight" and "tea time" (the "tea t" portion) but not "tea
time" (newline between "tea " and "time").

Find What: Te+st
Matches: "test", "teest", "teeeest", etc. but does not match "tst"



Character Sets

A character set is a group of characters bounded by "[" and "]". A set may list the specific characters to be matched or give a range of characters (e.g., [aeud] or [a-z]).

UltraEdit RegExp:

Find What: [aeiou]
Matches: every vowel

NOTE: Regular Expressions in UltraEdit are not case-sensitive unless Match Case is selected in the Find dialog.

Find What: [,.^?]
Matches: a literal ",", "." or "?".

Because the "?" is a symbol used in expressions it must be "escaped" for the literal character to be matched rather than interpreted as an expression.

Find What: [0-9a-z]
Matches: any digit or letter

Find What: [~0-9]
Matches: any character except a digit (~ means NOT the following)

UNIX RegExp:

Find What: [aeiou]
Matches: every vowel

Find What: [,\.?]
Matches: a literal ",", "." or "?".

Because the "." is a symbol used in expressions it must be "escaped" for the literal character to be matched rather than interpreted as an expression.

Find What: [0-9a-z]
Matches: any digit or letter

Find What: [^0-9]
Matches: any character except a digit (^ means NOT the following)
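
The effect of Match Case and of negated sets can be seen in a short Python sketch. Python's re module is case-sensitive by default, the opposite of the UltraEdit behavior noted above, so re.IGNORECASE plays the role of leaving Match Case unselected. The sample text is made up for illustration.

```python
import re

text = "Order #A7, shipped? Yes."

# Case-sensitive: only lowercase vowels are found.
lower_only = re.findall(r"[aeiou]", text)
# Case-insensitive, like an UltraEdit search with Match Case off.
any_case = re.findall(r"[aeiou]", text, re.IGNORECASE)
# Literal punctuation inside a set.
punctuation = re.findall(r"[,\.?]", text)
# Negated set: delete every character that is not a digit.
digits_only = re.sub(r"[^0-9]", "", text)

print(lower_only)    # ['e', 'i', 'e', 'e']
print(any_case)      # ['O', 'e', 'A', 'i', 'e', 'e']
print(punctuation)   # [',', '?', '.']
print(digits_only)   # 7
```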



OR Expressions

Currently UltraEdit only allows for the specification of two operands for an OR expression. You may search for an expression A or B as follows:

UltraEdit RegExp:

Find What: ^{John^}^{Tom^}

UNIX RegExp:

Find What: (John|Tom)

There should be nothing between the two expressions. You may combine A or B and C or D in the same search as follows:

UltraEdit RegExp:

Find What: ^{John^}^{Tom^} ^{Smith^}^{Jones^}

UNIX RegExp:

Find What: (John|Tom) (Smith|Jones)

This will search for "John" or "Tom" followed by "Smith" or "Jones".
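
As a quick check of the combined UNIX-style expression, the following Python sketch runs it against a few made-up names. Python's re module is not UltraEdit's engine, so this only demonstrates how the alternation groups combine.

```python
import re

pattern = re.compile(r"(John|Tom) (Smith|Jones)")

names = ["John Smith", "Tom Jones", "John Jones", "Tom Brown", "Johnny Smith"]
found = [name for name in names if pattern.search(name)]

# "Tom Brown" fails on the surname; "Johnny Smith" fails because the
# first group must be followed directly by a space.
print(found)  # ['John Smith', 'Tom Jones', 'John Jones']
```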



Deleting Blank Lines

With Regular Expressions selected in the Replace dialog, the following expressions match a CR/LF (DOS line terminator) immediately followed by the end of a line (i.e., a blank line) and replace it with nothing, effectively deleting it:

UltraEdit RegExp:

Find What: ^p$
Replace With: (literally nothing)

UNIX RegExp:

Find What: \p$
Replace With: (literally nothing)
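
The same idea can be sketched in Python for text with DOS line endings. This is an approximation under one stated assumption: Python's "$" treats only LF as a line end, so the sketch uses a lookahead for the next CR/LF where the UltraEdit expressions use "$". The function name delete_blank_lines is hypothetical.

```python
import re


def delete_blank_lines(text: str) -> str:
    """Drop each CR/LF that is immediately followed by another CR/LF."""
    # The lookahead checks for the blank line without consuming its
    # terminator, so runs of several blank lines collapse in one pass.
    return re.sub(r"\r\n(?=\r\n)", "", text)


sample = "first\r\n\r\n\r\nsecond\r\n"
print(repr(delete_blank_lines(sample)))  # 'first\r\nsecond\r\n'
```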


To find lines that do not begin with "http":
%[~^(http^)]?*$

%[0-9a-gi-z#'"& _/]?*$

Reformatting Text With Tagged Expressions

Example 1:

Tagged expressions may be used to mark various data members so that they may be reorganized, reformatting the data. For example, it might be useful to be able to rearrange:

John Smith, 385 Central Ave., Cincinnati, OH, 45238

into:

45238, Smith, John, 385 Central Ave., Cincinnati, OH

UltraEdit RegExp:

Find What: %^([a-z]+^) ^([a-z]+^), ^(*^), ^(*^), ^(*^), ^([0-9]+^)
Replace With: ^6, ^2, ^1, ^3, ^4, ^5

UNIX RegExp:

Find What: ^([a-z]+) ([a-z]+), (.*), (.*), (.*), ([0-9]+)
Replace With: \6, \2, \1, \3, \4, \5
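
For comparison, here is the UNIX-style expression from Example 1 run through Python's re module. It is a sketch rather than a statement about UltraEdit's internals; the function name zip_first is hypothetical, and re.IGNORECASE stands in for a search with Match Case off.

```python
import re

ADDRESS = re.compile(
    r"^([a-z]+) ([a-z]+), (.*), (.*), (.*), ([0-9]+)",
    re.IGNORECASE | re.MULTILINE,
)


def zip_first(text: str) -> str:
    """Move the ZIP code to the front and swap first and last name."""
    return ADDRESS.sub(r"\6, \2, \1, \3, \4, \5", text)


print(zip_first("John Smith, 385 Central Ave., Cincinnati, OH, 45238"))
# 45238, Smith, John, 385 Central Ave., Cincinnati, OH
```

The pattern relies on each line having exactly four ", " separators. A field that itself contains ", " would shift the tagged expressions, because each (.*) matches as much as it can.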


Example 2:

If you have a web-based registration system it might be useful to rearrange the order data into a format easily used by a database:

name = John Smith
address1 = 385 Central Ave.
address2 =
city = Cincinnati
state = OH
zip = 45238

into:

John Smith, 385 Central Ave.,, Cincinnati, OH, 45238,

This can be done with the following expression:

UltraEdit RegExp:

Find What: name = ^([a-z ]+^)^paddress1 = ^([a-z 0-9.,]+^)^paddress2 = ^([a-z 0-9.,]++^)^pcity = ^([a-z]+^)^pstate = ^([a-z]+^)^pzip = ^([0-9^-]+^)
Replace With: ^1, ^2, ^3, ^4, ^5, ^6

UNIX RegExp:

Find What: name = ([a-z ]+)\paddress1 = ([a-z 0-9.,]+)\paddress2 = ([a-z 0-9.,]*)\pcity = ([a-z]+)\pstate = ([a-z]+)\pzip = ([0-9-]+)
Replace With: \1, \2, \3, \4, \5, \6
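
A multi-line record like the one in Example 2 can be flattened the same way in Python. The sketch below assumes DOS line endings, writes \p as \r\n (the equivalence given in the Unix syntax table), and assumes that "address2 = " keeps its trailing space when the field is empty. The names RECORD, record, and flat are hypothetical.

```python
import re

RECORD = re.compile(
    r"name = ([a-z ]+)\r\n"
    r"address1 = ([a-z 0-9.,]+)\r\n"
    r"address2 = ([a-z 0-9.,]*)\r\n"
    r"city = ([a-z]+)\r\n"
    r"state = ([a-z]+)\r\n"
    r"zip = ([0-9-]+)",
    re.IGNORECASE,
)

record = "\r\n".join([
    "name = John Smith",
    "address1 = 385 Central Ave.",
    "address2 = ",            # empty field, trailing space kept
    "city = Cincinnati",
    "state = OH",
    "zip = 45238",
])

# Each tagged expression becomes one comma-separated field; the empty
# address2 leaves its slot empty between two separators.
flat = RECORD.sub(r"\1, \2, \3, \4, \5, \6", record)
print(flat)
```

The third group uses "*" rather than "+" so that an empty address2 still matches, mirroring the "++" in the UltraEdit expression.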

