Amadeus cookies policy - you'll see this message only once.

Amadeus use cookies on this website. They help us to know a little bit about you and how you use our website, which improves the browsing experience and marketing - both for you and for others. They are stored locally on your computer or mobile device. To accept cookies, continue browsing as normal. Or, go to the privacy policy for more information.

Regular Expressions in Proc SQL

Among the functions you can use in Proc SQL is PRXMATCH. At a single stroke, this increases the power of the SELECT statement quite dramatically. Now you can select records that match regular expressions.

This first example is looking for product names that contain words beginning with "h" and end with "r". For matching records, PRXMATCH will return the position of the substring that matched the regular expression; if there was no match, it will return 0.

Regular Expressions in proc SQL image 1

A quick refresher on regular expressions:

  • "!" is being used here as a delimiter. "/" is also commonly used in this way.
  • "b" matches a "word boundary".
  • "[Hh]" is a character class that matches either "H" or "h".
  • "." matches any single character, and "*" is a repeat count meaning "0 or more times"

Among the matches found by the above query are:

Large Hover Mower
Easy Patio Heater
Hand Cultivator (Wood)

The last of these is one that was not wanted - there is a word boundary between the "h" and the "r". We could get rid of this in a number of ways, for example by changing the regular expression to "!b[Hh]w*[rR]b!", where "w" matches any "word character". Word characters are defined as alphanumerics, plus the underscore character.

Here is another example using the same dataset, in which we look for cats and dogs:

Regular Expressions in proc SQL image 2 opt

Here the "|" means "or", and the "i" at the end of the regular expression makes it case-insensitive. Among the matches found are:

Cat Flap
Kennel (Large Dog)
Catering Barbecue

This time the regular expression said nothing about word boundaries.