Summary

This article explains what regular expressions are and how to use them.

A regular expression (abbreviated as regexp or regex, with plural forms regexps, regexes, or regexen) is a string that describes or matches a set of strings, according to certain syntax rules. Regular expressions are used by many text editors and utilities to search and manipulate bodies of text based on certain patterns. In our case, regular expressions are used to match an entry on a data point to a particular pattern, or set of patterns.

Common Uses

Regular expressions in the Fountayn clinical research platform are most commonly used to verify the format of date and time fields. They provide a way to require leading zeros and to restrict entries to certain dates (for example, year 2000 or later). Regular expressions are also almost always used to check the format of patient initials and ID numbers. There are many other uses for regular expressions besides these.

Regular Expression Syntax

Regular expressions are used within a dependency expression by using the EQUALS_RX keyword, followed by the regular expression itself. Almost always, the expression checks against the question it is on, and not another question. You will almost always see “this EQUALS_RX regularexpressionhere”, which states that the question the dependency is on must match the pattern provided in the regular expression.

Much less often, the HAS_RX keyword is used to match a pattern. Where EQUALS_RX checks that the entire string matches the pattern, HAS_RX checks only that the pattern occurs somewhere within the string. Extraneous characters on either side of the pattern are ignored.
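The difference between the two keywords can be sketched in plain Java. This is only an illustration: the class and method names are hypothetical, and it assumes that EQUALS_RX behaves like a whole-string match (Matcher.matches) and HAS_RX like a search (Matcher.find), which is how the keywords are described here rather than a confirmed detail of the platform.

```java
import java.util.regex.Pattern;

// Hypothetical stand-ins for the two keywords; not the platform's own code.
final class RxKeywordSketch {
    // Assumed EQUALS_RX behaviour: the whole entry must match the pattern.
    static boolean equalsRx(String entry, String regex) {
        return Pattern.compile(regex).matcher(entry).matches();
    }

    // Assumed HAS_RX behaviour: the pattern may occur anywhere in the entry.
    static boolean hasRx(String entry, String regex) {
        return Pattern.compile(regex).matcher(entry).find();
    }

    public static void main(String[] args) {
        String regex = "[0-9]{3}"; // three digits in a row
        System.out.println(equalsRx("123", regex));      // true
        System.out.println(equalsRx("ID-123-A", regex)); // false: extra characters
        System.out.println(hasRx("ID-123-A", regex));    // true: "123" is inside
        System.out.println(hasRx("ID-12-A", regex));     // false: only two digits
    }
}
```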

A complete list of regular expression syntax can be found here: http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html. We will cover the more basic and commonly used ones (at least in our eClinical package) on this page.

Regular Expression  Description
[a-z]               A lowercase character, a through z.
[A-Z]               An uppercase character, A through Z.
[a-zA-Z]            Either a lowercase or an uppercase character, a through z or A through Z.
[abc]               a, b, or c. Any list of characters can be specified this way.
[^abc]              Any character EXCEPT a, b, or c. Any list of characters can be excluded this way.
.                   Any one character.
\d                  Any one digit, 0-9.
\.                  A single decimal point, ".".
X{n}                The character or regular expression X, exactly n times. A{2} would require "AA" to be entered.
X{n,}               The character or regular expression X, at least n times in a row, with no upper limit.
X{n,m}              The character or regular expression X, at least n times but not more than m times.
X*                  The character or regular expression X, zero or more times.
X|Y                 X or Y (but not both). X and Y can be single characters or entire regular expressions themselves.
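To try these constructs outside the platform, a short Java sketch such as the following can help. The class and method names are hypothetical; Pattern.matches is used because it requires the whole entry to match, which is how EQUALS_RX is described above. Note that a backslash must be doubled inside a Java string literal, so the regular expression \d is written "\\d" in Java source.

```java
import java.util.regex.Pattern;

// Hypothetical helper for experimenting with the constructs in the table.
final class SyntaxTableSketch {
    // Whole-string match: true only if the entire entry fits the pattern.
    static boolean fullMatch(String regex, String entry) {
        return Pattern.matches(regex, entry);
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("[a-zA-Z]", "q"));    // true
        System.out.println(fullMatch("[^abc]", "a"));      // false: a is excluded
        System.out.println(fullMatch("A{2}", "AA"));       // true
        System.out.println(fullMatch("\\d{2,}", "7"));     // false: needs two or more digits
        System.out.println(fullMatch("\\d{1,3}", "1234")); // false: more than three digits
        System.out.println(fullMatch("X*", ""));           // true: zero occurrences allowed
        System.out.println(fullMatch("yes|no", "no"));     // true
        System.out.println(fullMatch(".", "x"));           // true: unescaped dot is any character
        System.out.println(fullMatch("\\.", "x"));         // false: escaped dot is a literal "."
    }
}
```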

Examples

The following is a list of several example expressions using regular expressions, to give you an idea of how regular expressions are created and used.

questionTypeId  dependencyId  Check If Blank  Expression                                   Alert
initials        1             no              this EQUALS_RX [A-Z][A-Z-][A-Z]              You must enter three characters for patient initials. The first and last must be a capital letter, the second can be a capital letter or a dash (‘-’).
subjectStudyID  1             no              this EQUALS_RX M0-[0-9][0-9][0-9]            ‘Subject Number’ must be of the format M0-###. Please verify.
labValue        1             no              this EQUALS_RX [0-9]*\.[0-9]                 The value must contain a decimal. Please verify.
dateOfBirth     1             no              this EQUALS_RX [0-3][0-9]-[A-Za-z]{3}-\d{4}  This date must be in the format: dd-MMM-yyyy. Please correct.


initials id1
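The pattern is a capital letter, then a capital letter or a dash, then another capital letter. Acceptable entries include "QWE" and "Q-E". Unacceptable entries include "qwe", "q-e", "fd", and "123". Note that inside square brackets the | character is a literal pipe rather than an "or", so a class written as [A-Z|-] also accepts entries such as "Q|E"; writing the class as [A-Z-] allows only a capital letter or a dash.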
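The effect of a pipe inside square brackets can be checked with a small Java sketch. The class, constant, and method names are hypothetical, and whole-string matching is assumed to mirror EQUALS_RX. It compares a middle character class that contains a pipe with one that does not.

```java
import java.util.regex.Pattern;

// Hypothetical comparison of two spellings of the initials pattern.
final class InitialsSketch {
    // Inside [...], "|" is an ordinary character, so this class also allows a pipe.
    static final Pattern WITH_PIPE = Pattern.compile("[A-Z][A-Z|-][A-Z]");
    // Middle character limited to a capital letter or a dash.
    static final Pattern WITHOUT_PIPE = Pattern.compile("[A-Z][A-Z-][A-Z]");

    // Whole-string match, assumed to mirror EQUALS_RX.
    static boolean accepts(Pattern pattern, String entry) {
        return pattern.matcher(entry).matches();
    }

    public static void main(String[] args) {
        String[] entries = {"QWE", "Q-E", "qwe", "q-e", "fd", "123", "Q|E"};
        for (String entry : entries) {
            System.out.println(entry
                    + "  with pipe: " + accepts(WITH_PIPE, entry)
                    + "  without pipe: " + accepts(WITHOUT_PIPE, entry));
        }
    }
}
```

Both spellings agree on every entry except "Q|E", which only the class containing the pipe accepts.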

subjectStudyID id1
The pattern is M0-, then a three-digit number with leading zeros required. Acceptable entries include M0-123, M0-000, M0-010, and M0-999. Unacceptable entries include M0-01, M0-asd, M1-032, N0-321, and M-032.
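The acceptable and unacceptable entries listed above can be run through the same pattern in Java as a quick check. The class and method names are hypothetical, and whole-string matching is assumed to mirror EQUALS_RX.

```java
import java.util.regex.Pattern;

// Hypothetical check of the subject number pattern against the listed entries.
final class SubjectIdSketch {
    static final Pattern SUBJECT_ID = Pattern.compile("M0-[0-9][0-9][0-9]");

    // Whole-string match, assumed to mirror EQUALS_RX.
    static boolean isValid(String entry) {
        return SUBJECT_ID.matcher(entry).matches();
    }

    public static void main(String[] args) {
        String[] acceptable = {"M0-123", "M0-000", "M0-010", "M0-999"};
        String[] unacceptable = {"M0-01", "M0-asd", "M1-032", "N0-321", "M-032"};
        for (String entry : acceptable) {
            System.out.println(entry + " -> " + isValid(entry)); // true for each
        }
        for (String entry : unacceptable) {
            System.out.println(entry + " -> " + isValid(entry)); // false for each
        }
    }
}
```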

labValue id1
The pattern is any number of digits 0-9 (including none), then one decimal point, then exactly one digit 0-9. Acceptable entries include 1.0, .4, and 3542665.9. Unacceptable entries include 43, 3534., and 473.42.
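This example also shows why whole-string matching matters. In the Java sketch below (hypothetical names; matches() is assumed to mirror EQUALS_RX and find() to mirror HAS_RX), 473.42 is rejected only because the entire entry must fit the pattern. A search anywhere in the entry would find "473.4" and succeed.

```java
import java.util.regex.Pattern;

// Hypothetical check of the lab value pattern with both matching styles.
final class LabValueSketch {
    static final Pattern LAB_VALUE = Pattern.compile("[0-9]*\\.[0-9]");

    // Whole-string match, assumed to mirror EQUALS_RX.
    static boolean matchesWhole(String entry) {
        return LAB_VALUE.matcher(entry).matches();
    }

    // Match anywhere in the entry, assumed to mirror HAS_RX.
    static boolean matchesAnywhere(String entry) {
        return LAB_VALUE.matcher(entry).find();
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("1.0"));       // true
        System.out.println(matchesWhole(".4"));        // true
        System.out.println(matchesWhole("3542665.9")); // true
        System.out.println(matchesWhole("43"));        // false: no decimal point
        System.out.println(matchesWhole("3534."));     // false: no digit after the point
        System.out.println(matchesWhole("473.42"));    // false: two digits after the point
        System.out.println(matchesAnywhere("473.42")); // true: "473.4" occurs inside
    }
}
```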

dateOfBirth id1
The format is a digit 0-3, a digit 0-9, a dash, three alphabetic characters, a dash, and four digits 0-9. This adds a check for leading zeros to the built-in Java format checker. Note that "[A-Za-z]{3}" is equivalent to "[A-Za-z][A-Za-z][A-Za-z]", and that "\d{4}", "\d\d\d\d", "[0-9]{4}", and "[0-9][0-9][0-9][0-9]" are all equivalent.
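The equivalence of these spellings can be confirmed with a short Java sketch. The class and method names are hypothetical, and whole-string matching is assumed to mirror EQUALS_RX. The sketch also shows that the pattern checks only the shape of the entry: a value such as 39-Xyz-0000 fits the pattern even though it is not a real date, which is why the expression supplements the date format check rather than replacing it.

```java
import java.util.regex.Pattern;

// Hypothetical comparison of four equivalent spellings of the date pattern.
final class DateFormatSketch {
    static final String[] SPELLINGS = {
        "[0-3][0-9]-[A-Za-z]{3}-\\d{4}",
        "[0-3][0-9]-[A-Za-z][A-Za-z][A-Za-z]-\\d\\d\\d\\d",
        "[0-3][0-9]-[A-Za-z]{3}-[0-9]{4}",
        "[0-3][0-9]-[A-Za-z][A-Za-z][A-Za-z]-[0-9][0-9][0-9][0-9]"
    };

    // Whole-string match against the first spelling, assumed to mirror EQUALS_RX.
    static boolean matchesDateFormat(String entry) {
        return Pattern.matches(SPELLINGS[0], entry);
    }

    // True when every spelling gives the same answer for this entry.
    static boolean allAgree(String entry) {
        boolean expected = matchesDateFormat(entry);
        for (String regex : SPELLINGS) {
            if (Pattern.matches(regex, entry) != expected) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[] entries = {"01-Jan-2000", "1-Jan-2000", "01-January-2000", "39-Xyz-0000"};
        for (String entry : entries) {
            System.out.println(entry + " -> " + matchesDateFormat(entry)
                    + " (all spellings agree: " + allAgree(entry) + ")");
        }
    }
}
```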


Need more help?

Please visit the Fountayn Contact Information page.