Regular Expressions: Groups

In Python, you can capture groups of characters with a regular expression like this.

>>> import re
>>> print("Match is '"
... + re.search('\\s([a-z]+)\\s',
... 'My text string.').group(1) + "'")
Match is 'text'
>>>

This is quite straightforward. In C#, you can write something similar.

using System.Text.RegularExpressions;

namespace MhNeifer.Samples.CSharp {
    public class MyRegex {
        static void Main() {
            // Group 1 captures the lowercase word between two whitespace characters.
            Regex rgx = new Regex(@"\s([a-z]+)\s");
            System.Console.WriteLine("Match is '"
                + rgx.Matches("My text string.")[0].Groups[1].Value
                + "'");
        }
    }
}

If you ignore that C# is wordier in general (the namespace and class definitions and so on), this is straightforward as well.

I thought that in Java it would be straightforward too. But it seems that there’s a catch. Or I’m too dumb to see the simple solution. I found the following.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Regex {
    public static void main(String[] args) {
        // The leading and trailing .* let the pattern cover the whole input.
        Pattern p = Pattern.compile(".*\\s([a-z]+)\\s.*");
        Matcher m = p.matcher("My text string.");
        // A match has to be attempted before group() can be called.
        m.matches();
        System.out.println("Match is '" + m.group(1) + "'");
    }
}

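While it looks straightforward, it is not. You have to attempt a match, for example with Matcher.matches(), before calling Matcher.group(), or you get an IllegalStateException. I was surprised by this. Please note the ‘.*’ at the beginning and the end of the regular expression. Matcher.matches() only succeeds if the regular expression matches the whole string, so the pattern has to account for everything around the group. Matcher.find(), in contrast, looks for the pattern anywhere in the string, much like re.search() in Python, and does not need the ‘.*’. For me, this took a while to remember.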
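For comparison, here is a minimal sketch of the same lookup using Matcher.find() instead of Matcher.matches(). The class name RegexFind and the helper firstGroup are made up for this illustration; only Pattern, Matcher, find() and group() come from the standard library. Since find() scans for the pattern anywhere in the input, the regular expression can stay the same as in the Python and C# versions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexFind {

    // Hypothetical helper: returns group 1 of the first match found
    // anywhere in the input, or null if the pattern does not occur.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        // find() both attempts the match and reports whether it succeeded,
        // so group(1) is only called when there is a match to read from.
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // No leading or trailing .* needed here.
        System.out.println("Match is '"
            + firstGroup("\\s([a-z]+)\\s", "My text string.") + "'");
    }
}
```

Checking the return value of find() also avoids the exception you would otherwise get from group() when nothing matched.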
