Limitations of data identifier support for PCRE regular expressions

Beginning with DLP 16.0, you can create data identifier patterns by using PCRE syntax to define regular expressions. However, support for regular expressions is subject to the following limitations:
  • Data identifiers do not support the following PCRE regular expression constructs:
    • Backreferences and capturing sub-expressions.
    • Zero-width assertions.
    • Subroutine references and recursive patterns.
    • Conditional patterns.
    • Backtracking control verbs.
    • The \C “single-byte” directive (which breaks UTF-8 sequences).
    • The \R newline match.
    • The \A beginning of string match.
    • The \Z and \z end of string matches.
    • The \K start of match reset directive.
    • The ^ beginning of line match.
    • The $ end of line match.
    • Callouts and embedded code.
    • Atomic grouping and possessive quantifiers.
  • Pattern compilation can fail on detection servers even if it succeeds on the Enforce Server. This happens because the detection server verifies the pattern with pre- and post-validator characters included, whereas the Enforce Server verifies it without them.
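
As an illustration only, the constructs listed above can be screened for before a pattern is entered in a data identifier. The following Python sketch is not part of DLP: the function name find_unsupported, the lookup tables, and the labels are hypothetical, and the scan is a non-exhaustive heuristic over the pattern text (for example, it does not parse (?#...) comments or POSIX classes precisely). It does not replace validation on the Enforce Server or on a detection server.

```python
import re

# Hypothetical lookup tables; labels paraphrase the limitations list above.
ESCAPES = {
    "C": r"\C single-byte directive",
    "R": r"\R newline match",
    "A": r"\A beginning of string match",
    "Z": r"\Z end of string match",
    "z": r"\z end of string match",
    "K": r"\K start of match reset",
    "g": "backreference or subroutine reference",
    "k": "backreference",
}
# Group openers, tried in order at each unescaped "(" outside a character class.
GROUPS = [
    (re.compile(r"\(\?<?[=!]"), "zero-width assertion"),
    (re.compile(r"\(\?>"), "atomic group"),
    (re.compile(r"\(\?\("), "conditional pattern"),
    (re.compile(r"\(\?(R|[-+]?\d+|&|P>)"), "subroutine reference or recursion"),
    (re.compile(r"\(\?P="), "backreference"),
    (re.compile(r"\(\?C"), "callout"),
    (re.compile(r"\(\*"), "backtracking control verb or other (*...) item"),
    (re.compile(r"\((?!\?)|\(\?(P?<|')"), "capturing sub-expression"),
]
QUANT = re.compile(r"\{\d+(,\d*)?\}")  # {n}, {n,} or {n,m}


def find_unsupported(pattern):
    """Return labels of listed constructs that appear in the pattern text."""
    found = []
    i, n = 0, len(pattern)
    in_class = False          # inside [...]
    after_quantifier = False  # previous token was *, +, ? or {n,m}
    brace = None              # index of the last unescaped "{"
    while i < n:
        ch = pattern[i]
        was_quantifier, after_quantifier = after_quantifier, False
        if ch == "\\" and i + 1 < n:
            nxt = pattern[i + 1]
            if nxt == "Q":  # \Q...\E quotes literal text; skip it whole
                end = pattern.find("\\E", i + 2)
                i = n if end == -1 else end + 2
                continue
            if not in_class:
                if nxt in ESCAPES:
                    found.append(ESCAPES[nxt])
                elif nxt in "123456789":
                    found.append("backreference")
            i += 2
            continue
        if in_class:
            in_class = ch != "]"
        elif ch == "[":
            in_class = True
            # A leading "^" negates the class and a leading "]" is literal.
            if pattern[i + 1:i + 2] == "^":
                i += 1
            if pattern[i + 1:i + 2] == "]":
                i += 1
        elif ch == "^":
            found.append("^ beginning of line match")
        elif ch == "$":
            found.append("$ end of line match")
        elif ch == "(":
            match = None
            for rx, label in GROUPS:
                match = rx.match(pattern, i)
                if match:
                    found.append(label)
                    break
            # Unflagged openers such as "(?:" or "(?i)" start with "(?".
            i = match.end() if match else i + 2
            continue
        elif ch == "+" and was_quantifier:
            found.append("possessive quantifier")
        elif ch in "*+?":
            after_quantifier = True
        elif ch == "{":
            brace = i
        elif ch == "}" and brace is not None:
            after_quantifier = bool(QUANT.fullmatch(pattern, brace, i + 1))
            brace = None
        i += 1
    return list(dict.fromkeys(found))  # de-duplicate, keep first-seen order
```

Calling find_unsupported on a candidate pattern returns an empty list when none of the screened constructs is found. A non-empty result names the constructs to rewrite, for example by replacing a capturing group with a non-capturing (?:...) group or a possessive quantifier with a plain one.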
DLP 15.8 endpoints support only the legacy data identifier pattern language, which is also a subset of the PCRE regular expression language. For more information, see Using the legacy data identifier pattern language.