Regular Expressions (RegEx)
What Is RegEx?
- RegEx, short for Regular Expression, is a sequence of characters that forms a search pattern. It's used to check if a string contains the specified search pattern.
RegEx Module in Python:
Python has a built-in package called `re` for working with regular expressions.
```python
import re
```
Example: Using RegEx in Python:
Check if a string starts with "The" and ends with "Spain":
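```python
import re

txt = "The rain in Spain"
# ^ anchors the start, $ anchors the end, .* matches anything in between
x = re.search(r"^The.*Spain$", txt)
print("Match found!" if x else "Match not found")
```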
RegEx Functions:
- The `re` module offers several functions to work with regular expressions:
  - `findall()`: Returns a list of all matches.
  - `search()`: Returns a Match object if there's a match anywhere in the string.
  - `split()`: Returns a list where the string has been split at each match.
  - `sub()`: Replaces one or many matches with a string.
Metacharacters:
- Special characters with specific meanings, such as `[]`, `\`, `.`, `^`, `$`, `*`, `+`, `?`, `{}`, `|`, `()`.
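A few of these metacharacters can be seen in action in the short sketch below. The sample strings other than "The rain in Spain" are made up for illustration.

```python
import re

txt = "The rain in Spain"

# "^" and "$" anchor the start and end; "." is any character, "*" repeats it.
starts_and_ends = re.search(r"^The.*Spain$", txt)

# "+" means one or more of the preceding character.
one_or_more = re.findall(r"ai+", txt)

# "|" means either the left or the right alternative.
either = re.findall(r"rain|Spain", txt)

# "?" makes the preceding character optional (hypothetical sample string).
optional = re.findall(r"colou?r", "color and colour")

# "{2}" means exactly two repetitions; "\." is an escaped, literal dot.
price = re.search(r"\d+\.\d{2}", "Total: 12.50 EUR")

print(bool(starts_and_ends))  # True
print(one_or_more)            # ['ai', 'ai']
print(either)                 # ['rain', 'Spain']
print(optional)               # ['color', 'colour']
print(price.group())          # 12.50
```

Writing patterns as raw strings (`r"..."`) keeps Python from interpreting the backslashes before the `re` module sees them.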
Special Sequences:
- Special sequences start with a backslash (`\`) followed by a character, like `\A`, `\b`, `\B`, `\d`, `\D`, `\s`, `\S`, `\w`, `\W`, `\Z`.
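As a rough illustration, the sketch below tries several of these sequences on one string. The sample text (with "2024" appended) is an assumption chosen so that the digit sequences have something to match.

```python
import re

txt = "The rain in Spain 2024"  # hypothetical sample text

digits = re.findall(r"\d+", txt)            # \d: any digit
words = re.findall(r"\w+", txt)             # \w: letters, digits, underscore
pieces = re.split(r"\s", txt)               # \s: any whitespace character
at_start = re.search(r"\AThe", txt)         # \A: only at the start of the string
at_end = re.search(r"\d+\Z", txt)           # \Z: only at the end of the string
word_endings = re.findall(r"ain\b", txt)    # \b: at a word boundary
word_beginnings = re.findall(r"\bain", txt) # no word here starts with "ain"

print(digits)           # ['2024']
print(words)            # ['The', 'rain', 'in', 'Spain', '2024']
print(pieces)           # ['The', 'rain', 'in', 'Spain', '2024']
print(bool(at_start))   # True
print(at_end.group())   # 2024
print(word_endings)     # ['ain', 'ain']
print(word_beginnings)  # []
```

The uppercase forms `\D`, `\S`, `\W`, and `\B` match the opposite of their lowercase counterparts.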
Sets:
- A set is a group of characters inside square brackets `[]` with a special meaning, such as `[a-n]`, `[0-9]`, `[a-zA-Z]`, etc.
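The three sets named above could be tried out as follows. This is a minimal sketch; the second sample string is invented for the example.

```python
import re

txt = "The rain in Spain"
room = "Room 101, floor 3"  # hypothetical sample text

# [a-n]: any single lowercase letter from a to n.
a_to_n = re.findall(r"[a-n]", txt)

# [0-9]: any single digit; "+" collects runs of digits.
numbers = re.findall(r"[0-9]+", room)

# [a-zA-Z]: any single letter, lowercase or uppercase.
letters_only = re.findall(r"[a-zA-Z]+", room)

print(a_to_n)        # ['h', 'e', 'a', 'i', 'n', 'i', 'n', 'a', 'i', 'n']
print(numbers)       # ['101', '3']
print(letters_only)  # ['Room', 'floor']
```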
Example Functions:
- The `findall()` Function: Returns a list of all matches.
```python
import re

txt = "The rain in Spain"
x = re.findall("ai", txt)
print(x)  # Output: ['ai', 'ai']
```
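- The `search()` Function: Searches for a match and returns a Match object for the first one found.
```python
import re

txt = "The rain in Spain"
x = re.search(r"\s", txt)  # raw string so "\s" reaches the regex engine unchanged
print("The first white-space character is located in position:", x.start())
# Output: The first white-space character is located in position: 3
```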
- The `split()` Function: Splits the string at each match.
```python
import re

txt = "The rain in Spain"
x = re.split(r"\s", txt)
print(x)  # Output: ['The', 'rain', 'in', 'Spain']
```
- The `sub()` Function: Replaces matches with a specified string.
```python
import re

txt = "The rain in Spain"
x = re.sub(r"\s", "9", txt)
print(x)  # Output: The9rain9in9Spain
```
Match Object:
- Contains information about the search and the result. Useful members include the methods `.span()` and `.group()` and the attribute `.string`.
Examples:
Print the position of the first match:
```python
import re

txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print(x.span())  # Output: (12, 17)
```
Print the string passed into the function:
```python
import re

txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print(x.string)  # Output: The rain in Spain
```
Print the part of the string where there was a match:
```python
import re

txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print(x.group())  # Output: Spain
```
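Note that `re.search()` returns `None` when nothing matches, so calling `.span()` or `.group()` directly on the result would raise an `AttributeError` in that case. One possible way to guard against this is sketched below; the helper name `first_match_span` is made up for this example and is not part of the `re` module.

```python
import re


def first_match_span(pattern, text):
    """Return the (start, end) span of the first match, or None if there is none.

    Hypothetical helper for illustration only.
    """
    match = re.search(pattern, text)
    if match is None:
        return None
    return match.span()


txt = "The rain in Spain"
found = first_match_span(r"\bS\w+", txt)    # a word starting with "S"
missing = first_match_span(r"\bP\w+", txt)  # no word starts with "P"

print(found)    # (12, 17)
print(missing)  # None
```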
Exercises:
Check if a string contains the word "rain":
```python
import re

txt = "The rain in Spain"
x = re.search("rain", txt)
print("Match found!" if x else "Match not found")  # Output: Match found!
```
Extract all numbers from a string:
```python
import re

txt = "There are 12 apples and 34 bananas."
numbers = re.findall(r"\d+", txt)
print(numbers)  # Output: ['12', '34']
```
Replace all spaces with underscores:
```python
import re

txt = "The rain in Spain"
result = re.sub(r"\s", "_", txt)
print(result)  # Output: The_rain_in_Spain
```
Summary:
- Python RegEx allows you to create search patterns to find, match, and manipulate strings. The `re` module provides essential functions for working with regular expressions. Practice using RegEx to become proficient in pattern matching and string manipulation!