Posts

What my first Silver Medal taught me about Text Classification and Kaggle in general?

Kaggle is an excellent place for learning, and I learned a lot from the recently concluded Quora Insincere Questions Classification competition, in which I finished 182nd out of 4,037 teams. In this post, I will summarize the things I tried, along with the ideas I missed that turned up in other winning solutions. As a side note: if you want to know more about NLP, I recommend the excellent course on Natural Language Processing in the Advanced Machine Learning specialization.
NLP Learning Series: Part 2 - Conventional Methods for Text Classification

This is the second post in the NLP text classification series. To recap: I recently entered an NLP text classification competition on Kaggle, the Quora Insincere Questions Classification challenge, and decided to share what I learned through a series of blog posts. The first post covered preprocessing techniques that work well with deep learning models and ways to increase embedding coverage. In this post, I will walk you through basic conventional approaches like TF-IDF, count vectorization, and hashing vectorization.
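As a quick taste of what is coming, here is a minimal sketch of those three vectorizers using scikit-learn; the toy corpus is purely illustrative and not from the post.

```python
# Minimal sketch of the three conventional vectorizers, via scikit-learn.
from sklearn.feature_extraction.text import (
    CountVectorizer, TfidfVectorizer, HashingVectorizer)

corpus = [
    "why do people ask insincere questions",
    "how do I learn text classification",
]

# Bag-of-words: raw token counts per document
counts = CountVectorizer().fit_transform(corpus)

# TF-IDF: counts reweighted by inverse document frequency
tfidf = TfidfVectorizer().fit_transform(corpus)

# Hashing: fixed-size feature space, no vocabulary kept in memory
hashed = HashingVectorizer(n_features=2**10).fit_transform(corpus)

print(counts.shape, tfidf.shape, hashed.shape)
```
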
NLP Learning Series: Part 1 - Text Preprocessing Methods for Deep Learning

Recently, I entered an NLP competition on Kaggle, the Quora Insincere Questions Classification challenge. It is a text classification problem, and as it became clearer to me while working through the competition and reading the invaluable kernels put up by Kaggle experts, I thought of sharing the knowledge. Since there is a lot of material to cover, I am splitting this post into a series.
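To give a flavor of the embeddings-coverage idea from that first post, here is a hedged sketch of how one might measure what fraction of a corpus vocabulary is found in a pretrained embedding file. The helper name and its inputs are illustrative, not code from the series.

```python
# Sketch: what share of the corpus is covered by a pretrained embedding?
# `embeddings_index` is assumed to be a dict of word -> vector, loaded
# elsewhere (e.g. from GloVe); `texts` is a list of raw strings.
from collections import Counter

def embedding_coverage(texts, embeddings_index):
    vocab = Counter(word for text in texts for word in text.split())
    known = sum(count for word, count in vocab.items()
                if word in embeddings_index)
    total = sum(vocab.values())
    oov = [w for w in vocab if w not in embeddings_index]
    return known / total, oov

# coverage, oov_words = embedding_coverage(train_texts, glove_index)
```
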
A Layman's Guide to moving from Keras to Pytorch

Recently I entered a text classification competition on Kaggle, and as part of it I had to move to Pytorch to get deterministic results. I have always worked with Keras in the past, and it has given me pretty good results, but I found out that the CuDNNGRU/CuDNNLSTM layers in Keras are not deterministic, even after setting the seeds.
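For context, the usual recipe for getting (more) deterministic PyTorch runs looks something like the sketch below; the exact cuDNN flags can vary across PyTorch versions, so treat this as a starting point rather than the post's exact code.

```python
# Seed every source of randomness we control; the cuDNN flags trade
# speed for reproducibility.
import random
import numpy as np
import torch

def seed_everything(seed=1234):
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

seed_everything()
```
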
What Kagglers are using for Text Classification

With the problem of image classification more or less solved by deep learning, text classification is the next developing frontier. For those who don't know, text classification is a common task in natural language processing that assigns a category to a sequence of text of indefinite length. Where could you use it? To find the sentiment of a review, to detect toxic comments on a platform like Facebook, or to flag insincere questions on Quora.
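To make the task concrete, here is a toy illustration of mapping variable-length texts to categories. A linear model on TF-IDF features stands in for the deep learning models the post actually discusses, and the data is made up.

```python
# Toy text classification: variable-length text in, category out.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["great movie, loved it", "terrible film, waste of time"]
labels = [1, 0]  # 1 = positive, 0 = negative

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(texts, labels)
print(clf.predict(["what a wonderful movie"]))
```
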
To all Data Scientists - The one Graph Algorithm you need to know

Graphs give us a very useful data structure: they can help us find structure within our data. With the advent of machine learning and big data, we need to extract as much information as possible from our data, and learning a little graph theory can certainly help with that. Here is a Graph Analytics for Big Data course on Coursera by UC San Diego, which I highly recommend for learning the basics of graph theory.
Object Detection: An End to End Theoretical Perspective

We all know about the image classification problem: given an image, can you find out the class it belongs to? We can solve any new image classification problem with ConvNets and transfer learning using pretrained networks. ConvNet as a fixed feature extractor: take a ConvNet pretrained on ImageNet, remove the last fully-connected layer (whose outputs are the 1,000 class scores for the original ImageNet task), and treat the rest of the ConvNet as a fixed feature extractor for the new dataset.
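As a sketch of that recipe in PyTorch (the post itself is theoretical, so this is an illustration rather than its code), one might drop the final fully-connected layer of a pretrained ResNet and freeze everything else; the choice of resnet18 here is arbitrary.

```python
# ConvNet as a fixed feature extractor: pretrained backbone, head removed.
import torch
import torch.nn as nn
from torchvision import models

backbone = models.resnet18(pretrained=True)  # newer torchvision prefers weights=...
backbone.fc = nn.Identity()      # remove the 1000-class ImageNet head
for param in backbone.parameters():
    param.requires_grad = False  # freeze: features only, no training

backbone.eval()
with torch.no_grad():
    images = torch.randn(4, 3, 224, 224)  # a dummy batch
    features = backbone(images)            # shape: (4, 512)
print(features.shape)
```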