Pattern pat = Pattern.compile( "\s*\w+\s*" );
No. The backslashes must be doubled:
Pattern pat = Pattern.compile( "\\s*\\w+\\s*" );
(Recall from chapter 3 that \s matches a whitespace character.)
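As a small sketch of the doubled backslashes in action (the class name DoubledBackslashDemo and the helper isPaddedWord are made up for this illustration), the following compiles the pattern and tries it on two strings. The Java compiler turns each \\ in the string literal into a single backslash, so the regular expression engine receives \s*\w+\s*.

```java
import java.util.regex.Pattern;

class DoubledBackslashDemo {
    // The string literal "\\s*\\w+\\s*" holds the regular expression \s*\w+\s*
    static final Pattern PAT = Pattern.compile("\\s*\\w+\\s*");

    // Hypothetical helper: true if text is one word with optional
    // whitespace on either side.
    static boolean isPaddedWord(String text) {
        return PAT.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(PAT.pattern());               // the regex as the engine sees it
        System.out.println(isPaddedWord("  hello  "));   // one word with padding
        System.out.println(isPaddedWord("two words"));   // two words, so no match
    }
}
```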
One form of the factory method is
public static Pattern compile(String regex, int flags)
The parameter flags is a single int variable that has particular bits set to select particular options.
Recall that an int variable has 32 bits.
If all those bits are zero, then no special options are turned on.
Several bits may be set (to one) to turn on the corresponding options.
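The idea of options as bits can be checked directly. The sketch below is only an illustration: the class FlagBitsDemo and its helpers isSingleBit and hasOption are invented names, while Pattern.CASE_INSENSITIVE is one of the option constants defined by Pattern and flags() is the Pattern method that reports the flags a pattern was compiled with.

```java
import java.util.regex.Pattern;

class FlagBitsDemo {
    // Hypothetical helper: true when exactly one of the 32 bits is set.
    static boolean isSingleBit(int value) {
        return Integer.bitCount(value) == 1;
    }

    // Hypothetical helper: true when every bit of option is also set in flags.
    static boolean hasOption(int flags, int option) {
        return (flags & option) == option;
    }

    public static void main(String[] args) {
        // All bits zero: no special options are turned on.
        Pattern plain = Pattern.compile("ant", 0);
        System.out.println(plain.flags() == 0);

        // An option constant is an int with one bit set.
        System.out.println(isSingleBit(Pattern.CASE_INSENSITIVE));

        // A pattern compiled with that option reports the bit in its flags.
        Pattern loose = Pattern.compile("ant", Pattern.CASE_INSENSITIVE);
        System.out.println(hasOption(loose.flags(), Pattern.CASE_INSENSITIVE));
    }
}
```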
To create a Pattern that matches "ant" or "bat" or "cat" or "dog" regardless of case, do this:
Pattern pattern = Pattern.compile( "ant|bat|cat|dog", Pattern.CASE_INSENSITIVE );
Pattern.CASE_INSENSITIVE is a 32-bit int that has the proper bit set to request case-insensitive matching.
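Here is a short sketch of that pattern at work. The class name CaseInsensitiveDemo and the helper isAnimal are made up for this example. The matches() method of Matcher requires the whole input string to match.

```java
import java.util.regex.Pattern;

class CaseInsensitiveDemo {
    static final Pattern ANIMAL =
        Pattern.compile("ant|bat|cat|dog", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: true if word is one of the four animals, in any case.
    static boolean isAnimal(String word) {
        return ANIMAL.matcher(word).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAnimal("cat"));   // lower case
        System.out.println(isAnimal("DoG"));   // mixed case
        System.out.println(isAnimal("cow"));   // not in the list
    }
}
```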
However, this assumes that the characters are US-ASCII.
If you need case-insensitive matching for Unicode characters, do this:
Pattern pattern = Pattern.compile( "ant|bat|cat|dog", Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
Pattern.UNICODE_CASE is a 32-bit int that has the proper bit set to request that case-insensitive matching work with Unicode.
When the two 32-bit ints are OR-ed together:
Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE
the resulting 32-bit int contains bits that are set for both options.
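The difference between the two settings shows up with a letter outside US-ASCII. In this sketch (the class UnicodeCaseDemo and the helper matches are invented for illustration), the pattern is the lower case letter e with an acute accent, written as the escape \u00e9, and the input is its upper case form, \u00c9.

```java
import java.util.regex.Pattern;

class UnicodeCaseDemo {
    // Hypothetical helper: compile regex with the given flags and
    // test whether it matches the whole input.
    static boolean matches(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        int asciiOnly = Pattern.CASE_INSENSITIVE;
        int unicode = Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE;

        // The combined int has the bits of both options set.
        System.out.println((unicode & Pattern.CASE_INSENSITIVE) != 0);
        System.out.println((unicode & Pattern.UNICODE_CASE) != 0);

        // Accented e, lower case pattern against upper case input.
        System.out.println(matches("\u00e9", asciiOnly, "\u00c9")); // false: not US-ASCII
        System.out.println(matches("\u00e9", unicode, "\u00c9"));   // true: Unicode case folding
    }
}
```

With CASE_INSENSITIVE alone the accented letters are compared exactly, so the match fails. Adding UNICODE_CASE lets the upper and lower case forms match.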
What is the bit-wise OR operator in Java?