The Ultimate Guide to Using the Python regex module
One of the main tasks when working with text data is creating text-based features.
You might want to find certain patterns in a text, such as email addresses or phone numbers buried in a large document.
Such tasks may sound fairly trivial to implement by hand, but they become much simpler with the power of Python’s regex module.
For example, let’s say you are tasked with counting the punctuation marks in a particular piece of text. The example here uses a passage from Dickens.
How would you normally go about it?
A simple enough way is to do something like:
target = [';','.',',','–']
string = "It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness, it was the epoch of belief, it was the epoch of incredulity, it was the season of Light, it was the season of Darkness, it was the spring of hope, it was the winter of despair, we had everything before us, we had nothing before us, we were all going direct to Heaven, we were all going di…
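To make the comparison concrete, here is a minimal sketch of both approaches. It assumes the manual approach is a character-by-character check against the target list, and it uses the standard-library re module for the regex version. The sample sentence and the names sample, targets, manual_count and regex_count are hypothetical stand-ins for illustration, not the article's full passage or code.

```python
import re

# Hypothetical short sample, used here instead of the full Dickens passage.
sample = "It was the best of times, it was the worst of times; it was the age of wisdom."
targets = [';', '.', ',', '–']

# Manual approach: walk the text and count characters found in the target list.
manual_count = sum(1 for ch in sample if ch in targets)

# Regex approach: build a character class from the same targets.
# re.escape guards against characters such as '.' that are special in patterns.
pattern = re.compile('[' + ''.join(re.escape(t) for t in targets) + ']')
regex_count = len(pattern.findall(sample))

print(manual_count, regex_count)
```

Both counts agree on this sample. The regex version starts to pay off once the pattern is more than a fixed set of characters, for instance when matching emails or phone numbers.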