How to get the string/part/word/text within brackets in Python using Regex

PROBLEM DEFINITION

For example, you have a string like the following:

[lagril] L.A. Girl Pro. Setting HD Matte Finish Spray

While you are scanning the line, you would like to extract the following word from it ‘lagril’, which you are interested in. How to do that?

GETTING TEXT WITHIN BRACKETS USING REGEX IN PYTHON

Our problem falls into a common string extraction problem we face in software engineering. We usually do this using Regular Expressions. Let’s build the regular expression logic first, using regex101.com

We need to find a string that starts with ‘[‘ bracket and ends with ‘]’ bracket, and in the middle, we expect alphanumeric word with small or capital letters, and they can be anything from 0 to any. So, this should be as simple as the following:

\[[A-Za-z0-9]*\]

Now, this should help us target the words that comes within bracket in a sentence/large string. But the trick to grab the text within the bracket is to group them. To use group in regex, we use () brackets without back slash in front. So if the regex is as following:

\[([A-Za-z0-9]*)\]

This will put the matching string in group 1. Now, how can you get what is in the group 1 of a regular expression engine? Let’s dive into python now:

# let's import regular expression engine first
import re

# our string
txt = '[lagril] L.A. Girl Pro. Setting HD Matte Finish Spray'

# our regex search would be as following:
x = re.search(r"\[([A-Za-z0-9]*)\]", txt)

# we know this will put the inner text in group 1. regex object that returned by re.search, has a method called 'group()' to catch the groups matches regex. You may use the following

x.group(1) # prints lagril

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.