Evaluate Regular Expression Assertion

The Evaluate Regular Expression assertion is a powerful tool for detecting, filtering, and/or changing service messages. The assertion allows you to define one or more values that, when present in an incoming request message, will yield a specific processing outcome, change the content of the matched message, or detect particular patterns.
The Evaluate Regular Expression assertion has a wide range of uses. For example, it can be configured to enforce a consistent telephone number format in request and response messages: the assertion scans messages for telephone numbers, and any number that does not conform to the specified format is altered before the message is processed. The assertion can also be used for message mediation: using its match and replace functionality, it can make small changes to a message, such as adding a special tag after a particular keyword is detected.
Pattern matching and replacement are also useful for protecting service applications from various web service and XML application threats.
A policy can contain multiple Evaluate Regular Expression assertions, placed anywhere before or after the routing assertion.
To learn about selecting the target message for this assertion, see Selecting a Target Message.
To learn more about regular expressions, see a web tutorial such as http://www.regular-expressions.info/.
Context Variables Created by This Assertion
The Evaluate Regular Expression assertion can optionally populate a multivalued context variable with the values from designated capture groups in the expression.
Example:
  • Multivalued context variable name: phone
  • Regular expression containing three capture groups: \((\d{3})\) (\d{3})-(\d{4})
  • Input string: (800) 555-1234
Using the default settings for the assertion, the multivalued variable phone will contain the following values:
  • ${phone[0]} is set to the entire string matched by the regular expression: "(800) 555-1234"
  • ${phone[1]} is set to the first capture group: "800"
  • ${phone[2]} is set to the second capture group: "555"
  • ${phone[3]} is set to the third capture group: "1234"
Note that the variable phone is created even if the "Do not proceed if pattern matches" option is selected.
Capture groups always exist if any parentheses are present in the expression. However, they are saved only when a context variable is specified.
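The layout of the multivalued variable can be sketched outside the Gateway. The following Python snippet is only an illustration: it uses Python's `re` module rather than the Gateway's own regex engine, the helper name `capture_groups` is hypothetical, and the pattern is written with a literal space after the area code so that it matches the sample input.

```python
import re

# Three capture groups. The literal space after "\)" is an assumption made
# so that the pattern matches the sample input "(800) 555-1234".
PHONE_PATTERN = r"\((\d{3})\) (\d{3})-(\d{4})"


def capture_groups(pattern, text):
    """Return [whole match, group 1, group 2, ...] for the first match.

    Index 0 holds the entire matched string and the capture groups follow
    in order, mirroring the ${phone[0]}..${phone[3]} layout described above.
    An empty list is returned when nothing matches.
    """
    match = re.search(pattern, text)
    if match is None:
        return []
    return [match.group(0), *match.groups()]


phone = capture_groups(PHONE_PATTERN, "(800) 555-1234")
print(phone)
```

In this sketch, `phone[1]` corresponds to `${phone[1]}` (the area code), and so on.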
Using the Assertion
  1. Do one of the following:
    • To add the assertion to the Policy Development window, see Add an Assertion.
    • To change the configuration of an existing assertion, proceed to step 2 below.
  2. When adding the assertion, the Regular Expression Properties automatically appear; when modifying the assertion, right-click <target>: Evaluate Regular Expression in the policy window and select Regular Expression Properties, or double-click the assertion in the policy window. The assertion properties are displayed.
  3. Configure the display name and regular expressions:
    • Display Name:
      Optionally enter a "friendly name" for this regular expression. This name is for display purposes only in the policy window. A friendly name should briefly describe the purpose of the assertion; it helps you distinguish between several Evaluate Regular Expression assertions in a policy.
    • Regular Expression:
      Enter the regular expression to be matched. You may reference context variables within the expression. Note that context variables are treated as literal text when the syntax is checked at design time, but are resolved to their actual values at runtime. For details, see "Example: Context variables in the regular expression" below.
      The Regular Expression field can contain only a single expression. To evaluate multiple expressions, configure multiple Evaluate Regular Expression assertions within a policy.
    • Ignore Case:
      Select this check box to ignore the case of any matching values in the incoming request message. Clear this check box to enforce case matching.
    • Save in context variable:
      Optionally specify a context variable that will hold the regular expression pattern in effect at runtime. This is most useful if context variables were used in the pattern, because the exact pattern then depends on the values resolved at runtime. If no context variables were used, this variable will contain the exact string entered in the Regular Expression field.
    • Replacement (only used with the "Match and Replace" option):
      Enter the replacement value or format that will replace the value or format matched by the Regular Expression field. You may reference context variables.
      The '$' character is a reserved symbol in replacements. To use it as a literal character in a replacement, escape it with the '\' character (for example, '\$'); otherwise, unexpected errors may occur.
      Use the [Test] tab to verify that the replacement works as you expect.
    Example: Context variables in the regular expression
    Suppose you enter the following regular expression:
    hi${there}bob
    When you test this expression during design time, the assertion will match all the characters literally: h, i, $, {, t, h, e, r, e, }, b, o, b.
    At runtime, the assertion will match the characters h, i, followed by whatever is currently in the variable ${there}, followed by the characters b, o, b.
    If at runtime ${there} contains regular expression metacharacters such as [, ], |, ^, $, etc., they will be matched as literals (in other words, they lose their metacharacter interpretation). For example, consider this policy fragment:
    Set variable ${there} to [a-z]
    Request: match regex hi${there}bob
    This fragment will match this request:
    We all scream hi[a-z]bobbies again!
    But it will not match this request:
    We all scream hipbobbies again!
    If you modify the fragment to replace the context variable with its actual content, the metacharacters will be interpreted as expected:
    Request: match regex hi[a-z]bob
    This will result in the second example above being matched, but not the first.
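The literal treatment of resolved variables can be approximated in a short Python sketch. This is not the Gateway's implementation: the helper `build_runtime_pattern` and the `${name}` expansion logic are assumptions made for illustration, and `re.escape` stands in for whatever quoting the Gateway applies.

```python
import re


def build_runtime_pattern(template, variables):
    """Expand ${name} references, escaping each value so it matches literally.

    `variables` is a plain dict standing in for the policy's context
    variables (a hypothetical stand-in, not a Gateway API).
    """
    def substitute(match):
        return re.escape(variables[match.group(1)])

    return re.sub(r"\$\{([^}]+)\}", substitute, template)


first = "We all scream hi[a-z]bobbies again!"
second = "We all scream hipbobbies again!"

# Variable resolved at runtime: its metacharacters are matched as literals.
runtime_pattern = build_runtime_pattern("hi${there}bob", {"there": "[a-z]"})
print(bool(re.search(runtime_pattern, first)))   # literal "[a-z]" is present
print(bool(re.search(runtime_pattern, second)))  # "p" is not the literal text

# Same content typed directly into the expression: "[a-z]" is a character class.
print(bool(re.search("hi[a-z]bob", second)))
```

The first request matches only the escaped pattern, and the second matches only the pattern in which `[a-z]` is typed directly into the expression.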
  4. In the [Source and Destination] tab, configure the assertion as follows:
    • Source:
      Specify whether to match against the Request, the Response, or an Other Context Variable that contains the value to analyze. For Other Context Variable, specify the variable name in the box. (You do not need to enclose the variable name within the "${ }" characters.)
      The message target can also be set outside of the assertion properties. For more information, see Selecting a Target Message.
    • Destination:
      Specify how to proceed based on the results of the pattern matching:
      • Proceed if pattern matches:
        Causes the assertion to return success if the pattern matches. Whether the message is permitted to proceed ultimately depends on the outcome of the policy.
      • Do not proceed if pattern matches:
        Causes the assertion to return failure if the pattern matches. Whether the message is blocked ultimately depends on the outcome of the policy. This option is particularly useful for protecting against specific service threats.
      • Stop searching after first successful match:
        This check box is available when either Proceed if pattern matches or Do not proceed if pattern matches is selected.
        • Select this check box to instruct the assertion to stop after the first successful match. This setting is the default.
        • Clear this check box to instruct the assertion to find (and capture, if applicable) all matches in the target string, not stopping after the first match.
      • Match and Replace (always proceed):
        If the content matches the regular expression, replace it with the content from the Replacement field.
        Tip: If multiple replacements are required, use several Evaluate Regular Expression assertions in the policy.
        • Repeat successful replacements up to x times:
          • Select this check box to repeat any "replace all" step that made at least one replacement. This continues until either there is nothing left to replace or the 'x' iteration limit has been reached. See "Example: Repeat successful replacements" below for an example of how this option works.
          • Clear this check box to not repeat any successful replacement step. This setting is the default.
      • Context Variable:
        Optionally enter the name of a context variable if you wish to record capture groups. For more information, see "Context Variables Created by This Assertion" in the introduction to this assertion. For information on naming rules, see "Context Variable Naming Rules" under Context Variables.
      • Include matched substring in capture:
        This check box has an effect only when a Context Variable has been specified. It controls whether the matched substring is included in the capture. For a detailed explanation of this option, see "Example: Including matched substring" below.
    • MIME/Multipart Messages:
      Specify how to handle MIME/multipart messages:
      • MIME Part:
        For multipart messages, specify which part of the message should be matched against the regular expression value, where '0' is the first part, '1' is the second part, etc. The default is '0'.
        This setting is not used for messages that have only a single part.
      • Character Encoding:
        Select Default to use the default character encoding or Override to override how the Gateway decodes the message. For example, if a UTF-8 encoded message arrives with a Content-Type incorrectly declaring its character encoding as ISO8859-1, then enter "UTF-8" to override. For more information, see Character Encoding.
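To see why the Character Encoding override matters, consider the following Python sketch. It only simulates the decoding step: `decode_body` is a hypothetical helper and the sample text is invented for illustration. It does not reproduce how the Gateway reads the Content-Type header.

```python
# A UTF-8 encoded body whose Content-Type wrongly declares ISO-8859-1.
raw_body = "café".encode("utf-8")
declared_charset = "ISO-8859-1"


def decode_body(body, declared, override=None):
    """Decode bytes with the override charset if given, else the declared one."""
    return body.decode(override or declared)


garbled = decode_body(raw_body, declared_charset)                  # Default
fixed = decode_body(raw_body, declared_charset, override="UTF-8")  # Override

print(garbled)
print(fixed)
```

With the wrongly declared encoding, the two-byte UTF-8 sequence for "é" is decoded as two separate characters, so a regular expression written against the intended text would not match until the override is applied.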
    Example: "Repeat successful replacements"
    The following example illustrates the functionality of the Repeat successful replacements check box: adding commas to a large number.
    Say you have a large integer and you wish to add commas for improved readability. The length of this integer is unknown ahead of time. You can accomplish this using the following settings in the Evaluate Regular Expression assertion:
    Regular expression: ^(-?\d+)(\d{3})
    Replacement: $1,$2
    [X] Repeat successful replacements up to 9999 times
    Before processing, the target message contains this integer: 92349854732933493424982745249587
    After processing, it will contain: 92,349,854,732,933,493,424,982,745,249,587
    How this works behind the scenes:
    The assertion examines the integer and attempts to match the leading run of digits that contains no commas, ending with three digits. It then replaces this match with the same string, but with a comma inserted before the last three digits. This is repeated until the number no longer begins with at least four consecutive digits. Here is a sample of the assertion in progress:
    Repeat up to 0 times: 92349854732933493424982745249,587
    Repeat up to 1 times: 92349854732933493424982745,249,587
    Repeat up to 2 times: 92349854732933493424982,745,249,587
    Repeat up to 7 times: 92349854,732,933,493,424,982,745,249,587
    Repeat up to 9999 times: 92,349,854,732,933,493,424,982,745,249,587
    If the Repeat successful replacements check box is not selected, the assertion stops after one pass and the resulting output is "92349854732933493424982745249,587".
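The repeat behavior in this example can be reproduced with a small loop. The following Python sketch is an approximation, not the Gateway's code: the function name `replace_repeatedly` is hypothetical, and Python writes the replacement as `\1,\2` where the assertion uses `$1,$2`.

```python
import re

NUMBER = "92349854732933493424982745249587"


def replace_repeatedly(text, max_repeats,
                       pattern=r"^(-?\d+)(\d{3})", replacement=r"\1,\2"):
    """Run one replace pass, then repeat while passes keep replacing.

    The first pass always runs; it is repeated at most `max_repeats` more
    times, stopping early once a pass makes no replacement.
    """
    regex = re.compile(pattern)
    text, count = regex.subn(replacement, text)
    repeats = 0
    while count > 0 and repeats < max_repeats:
        text, count = regex.subn(replacement, text)
        repeats += 1
    return text


for limit in (0, 1, 2, 7, 9999):
    print(f"Repeat up to {limit} times: {replace_repeatedly(NUMBER, limit)}")
```

Because the pattern is anchored with `^` and needs at least four consecutive leading digits, the loop ends on its own well before the 9999 limit; the limit only guards against patterns that would otherwise never stop matching.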
    Example: "Including matched substring"
    As described under "Context Variables Created by This Assertion", you can optionally save the capture groups that are automatically created by this assertion by entering a variable name in the assertion properties. You indicate each part of the pattern to be captured by enclosing it within parentheses. When the assertion runs and the pattern matches, the text matched by each parenthesized part is captured and saved to the context variable.
    For example, suppose you have the following data:
    name="John Smith", phone=604-555-1234
    name="Sue Smith", phone=604-555-4332
    You wish to extract the phone number (for this example, assume that it is always in the xxx-xxx-xxxx format). This can be accomplished with the following regular expression:
    phone=(\d\d\d-\d\d\d-\d\d\d\d)
    Scenario 1: Run the assertion with the following settings:
    Stop searching after first successful match = enabled
    Include matched substring in capture = enabled
    Context variable = p
    This will be the result:
    ${p[0]} = "phone=604-555-1234"
    ${p[1]} = "604-555-1234"
    A few points to note about these results:
    • The variable "p" that is created is a multivalued context variable.
    • The first value in "p" (${p[0]}) contains the entire substring matched by the regular expression.
    • The second value in "p" (${p[1]}) contains the first capture group, which is as indicated in the regular expression.
    • The phone number for Sue Smith is not captured, because you've instructed the assertion to stop searching after the first successful match.
    Scenario 2: Run the assertion with the following settings:
    Stop searching after first successful match = disabled
    Include matched substring in capture = enabled
    Context variable = p
    This will be the result:
    ${p[0]} = "phone=604-555-1234" (entire substring matched by the regex, for the 1st successful match)
    ${p[1]} = "604-555-1234" (1st capture group, for the 1st successful match)
    ${p[2]} = "phone=604-555-4332" (entire substring matched by the regex, for the 2nd successful match)
    ${p[3]} = "604-555-4332" (1st capture group, for the 2nd successful match)
    Note that when the "Stop searching..." option is disabled, Sue Smith's phone number is captured.
    Scenario 3: Run the assertion with the following settings:
    Stop searching after first successful match = disabled
    Include matched substring in capture = disabled
    Context variable = p
    This will be the result:
    ${p[0]} = "604-555-1234" (1st capture group, for the 1st successful match)
    ${p[1]} = "604-555-4332" (1st capture group, for the 2nd successful match)
    Notice that when you disable "Include matched substring", the matched substrings are no longer saved.
    Scenario 4: Use the same settings as Scenario 3, but introduce a new capture group by enclosing the entire regular expression within parentheses (remember, capture groups are indicated by parentheses):
    (phone=(\d\d\d-\d\d\d-\d\d\d\d))
    This will be the result:
    ${p[0]} = "phone=604-555-1234" (1st capture group, for the 1st successful match)
    ${p[1]} = "604-555-1234" (2nd capture group, for the 1st successful match)
    ${p[2]} = "phone=604-555-4332" (1st capture group, for the 2nd successful match)
    ${p[3]} = "604-555-4332" (2nd capture group, for the 2nd successful match)
    Notice that this is identical to the output from Scenario 2. Thus, enabling the "Include matched substring" option has the same effect as disabling it and enclosing the entire expression within parentheses.
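The four scenarios can be checked with a short Python sketch. This is an approximation built on Python's `re.finditer`, not the Gateway's implementation; the function `capture` and its two flags are hypothetical names standing in for the "Stop searching after first successful match" and "Include matched substring in capture" check boxes.

```python
import re

DATA = ('name="John Smith", phone=604-555-1234\n'
        'name="Sue Smith", phone=604-555-4332')
PATTERN = r"phone=(\d\d\d-\d\d\d-\d\d\d\d)"


def capture(pattern, text, stop_after_first=True, include_matched=True):
    """Collect the values that would be stored in the multivalued variable."""
    values = []
    for match in re.finditer(pattern, text):
        if include_matched:
            values.append(match.group(0))  # entire matched substring
        values.extend(match.groups())      # each capture group, in order
        if stop_after_first:
            break
    return values


scenario1 = capture(PATTERN, DATA, stop_after_first=True, include_matched=True)
scenario2 = capture(PATTERN, DATA, stop_after_first=False, include_matched=True)
scenario3 = capture(PATTERN, DATA, stop_after_first=False, include_matched=False)
# Scenario 4: same settings as Scenario 3, whole expression wrapped in parentheses.
scenario4 = capture("(" + PATTERN + ")", DATA,
                    stop_after_first=False, include_matched=False)

print(scenario1, scenario2, scenario3, scenario4, sep="\n")
```

In this sketch, `scenario4` comes out equal to `scenario2`, which is the equivalence noted above.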
  5. Select the [Test] tab to test your regular expression and to determine whether the replacement string was entered correctly.
    • In the Test Input box, type some sample text that includes the value or format from the Regular Expression field. As you type, the assertion attempts to match your input against the regular expression that was entered.
    • The Test Result box shows the results of the match. Examine the results carefully to confirm that they are what you intended. The figure below illustrates how the assertion interprets the test input given the sample regular expression and replacement string shown:
      • For test input "888-555-1234", the test result is "<phone country="" area="888" num="555-1234"/>". This means the assertion was able to locate the area code and phone number, but not the country code because it was not specified.
      • For test input "1-888-687-2234", all three groups were matched successfully.
      • For test input "some test", no replacement was made because this is not a phone number.
      • For test input "879-1234", no replacement was made because the phone number is missing the area code, which is a required element.
      • For test input "604-681-9387", the area code and phone number were matched.
  6. Click [OK] when done.
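The [Test] tab results listed in step 5 can be approximated outside the Gateway. The sample regular expression and replacement string from that step are not reproduced in the text, so the pattern and replacement below are assumptions chosen only to give the same kind of results (an optional country code, and a required area code and number). The sketch uses Python's `re` module, which writes group references as `\1` where the assertion uses `$1`.

```python
import re

# Hypothetical pattern: optional one-digit country code, then a required
# three-digit area code and a seven-digit number.
PHONE = re.compile(r"^(?:(\d)-)?(\d{3})-(\d{3}-\d{4})$")
# Hypothetical replacement. In Python 3.5+, a group that did not take part
# in the match is substituted as an empty string.
REPLACEMENT = r'<phone country="\1" area="\2" num="\3"/>'


def preview_replacement(text):
    """Return the text after match-and-replace, or unchanged if no match."""
    return PHONE.sub(REPLACEMENT, text)


for sample in ("888-555-1234", "1-888-687-2234", "some test", "879-1234"):
    print(f"{sample!r} -> {preview_replacement(sample)!r}")
```

Inputs that do not match, such as "some test" or "879-1234", are returned unchanged, which corresponds to "no replacement was made" in the Test Result box.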