MLWhiz | AI Unwrapped


NLP Learning Series: Part 1 - Text Preprocessing Methods for Deep Learning

Rahul Agarwal
Jan 17, 2019

Recently, I started working on an NLP competition on Kaggle called the Quora Question Insincerity Challenge. It is an NLP challenge on text classification, and as the problem became clearer through working on the competition and reading the invaluable kernels put up by the Kaggle experts, I thought of sharing what I learned.

Since we have a large amount of material to cover, I am splitting this post into a series of posts. The first post, i.e. this one, will be based on preprocessing techniques that work with deep learning models, and we will also talk about increasing embeddings coverage. In the second post, I will try to take you through some basic conventional models like TFIDF, Count Vectorizer, Hashing etc. that have been used in text classification and try to assess their performance to create a baseline. We will delve deeper into deep learning models in the third post, which will focus on different architectures for solving the text classification problem. We will …
