PROBLEM DEFINITION
Suppose you have a string like the following:
[lagril] L.A. Girl Pro. Setting HD Matte Finish Spray
While scanning the line, you want to extract the word inside the square brackets, ‘lagril’. How do you do that?
GETTING TEXT WITHIN BRACKETS USING REGEX IN PYTHON
This is a common string extraction problem in software engineering, and the usual tool for it is a regular expression. Let’s build the regular expression first, using regex101.com to test it.
We need to find a string that starts with a ‘[’ bracket and ends with a ‘]’ bracket. Because square brackets have a special meaning in regex, each one is escaped with a backslash. In between, we expect an alphanumeric word made of lowercase letters, uppercase letters, or digits, repeated zero or more times. So, this should be as simple as the following:
\[[A-Za-z0-9]*\]
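As a quick check, here is a minimal sketch in Python (the language used later in this post) that applies this pattern to the sample string. The variable names match and whole are only illustrative. Note that the match includes the brackets themselves.

```python
import re

txt = '[lagril] L.A. Girl Pro. Setting HD Matte Finish Spray'

# Pattern without a capture group: the whole match includes the brackets.
match = re.search(r"\[[A-Za-z0-9]*\]", txt)
whole = match.group(0)  # group(0) is always the entire match
print(whole)  # prints [lagril]
```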
This pattern targets any word that comes within brackets in a sentence or larger string, but the match still includes the brackets themselves. The trick to grabbing only the text inside the brackets is to put it in a group. To create a group in regex, wrap that part of the pattern in parentheses ( ) with no backslash in front. With a group added, the regex looks like this:
\[([A-Za-z0-9]*)\]
This puts the text between the brackets in group 1. Now, how do you read what is in group 1 of a match? Let’s dive into Python:
# let's import the regular expression engine first
import re

# our string
txt = '[lagril] L.A. Girl Pro. Setting HD Matte Finish Spray'

# our regex search would be as follows:
x = re.search(r"\[([A-Za-z0-9]*)\]", txt)

# we know this will put the inner text in group 1. The match object returned by
# re.search has a method called group() that returns what each group captured.
# re.search returns None when nothing matches, so check before calling group().
if x:
    print(x.group(1))  # prints lagril
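When scanning many lines, some of them may have no bracketed word at all, and some may have more than one. The following is a minimal sketch of one way to handle both cases, not the only approach. The names BRACKET_TAG and extract_bracket_tag are hypothetical, and the second and third input strings are made up for illustration.

```python
import re

# Same pattern as above, compiled once so it can be reused for every line.
BRACKET_TAG = re.compile(r"\[([A-Za-z0-9]*)\]")

def extract_bracket_tag(line):
    """Return the text inside the first [...] in line, or None if there is none."""
    match = BRACKET_TAG.search(line)
    return match.group(1) if match else None

print(extract_bracket_tag('[lagril] L.A. Girl Pro. Setting HD Matte Finish Spray'))  # lagril
print(extract_bracket_tag('L.A. Girl Pro. Setting HD Matte Finish Spray'))           # None

# With exactly one group in the pattern, findall returns a list of group-1 strings.
print(BRACKET_TAG.findall('[abc] first [x9] second'))  # ['abc', 'x9']
```

Because the pattern uses *, empty brackets such as [] also match and yield an empty string; changing * to + would require at least one character inside the brackets.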