Graphs are a very useful data structure. They can help us find structure within our data. With the advent of machine learning and big data, we need to get as much information as possible out of our data. Learning a little bit of graph theory can certainly help ...

# Using XGBoost for time series prediction tasks

Recently, Kaggle master Kazanova, along with some of his friends, released a Coursera course called "How to win a data science competition". You can start for free with the 7-day free trial. The course involved a final project, which was itself a time series prediction problem. Here I will describe how ...

# Good Feature Building Techniques - Tricks for Kaggle - My Kaggle Code Repository

It often happens that we fall short of creativity, and creativity is one of the basic ingredients of what we do. Creating features needs creativity. So here is the list of ideas I gather in day-to-day life, where people have used creativity to get great results on ...

# Maths Beats Intuition probably every damn time

Einstein once said that **"God does not play dice with the universe"**. But actually he does. Everything happening around us can be explained in terms of probabilities. We repeatedly watch things around us happen by chance, yet we never learn. We always get dumbfounded by the playfulness of nature ...

# Today I Learned This Part I: What are word2vec Embeddings?

Recently Quora put out a question similarity competition on Kaggle. This was the first time I attempted an NLP problem, so there was a lot to learn. The one thing that blew my mind was the word2vec embeddings.

Until now, whenever I heard the term word2vec, I visualized it as ...

# Top Data Science Resources on the Internet right now

I have been looking to create this list for a while now. Many people on Quora ask me how I started in the data science field, so I wanted to create this reference.

To be frank, when I first started learning, it all looked very utopian ...

# Top advice for a Data Scientist

A data scientist needs to be critical and always on the lookout for things that others miss. So here are some pieces of advice that you can apply in day-to-day data science work to get better at it:

## 1. Beware of the Clean Data Syndrome

You need to ask ...

# Pandas For All - Some Basic Pandas Functions

I have been working with Pandas for quite a few days now, and I feel I have gotten quite good at it. (Quite a braggart, I know.) So I thought about adding a post about Pandas usage here. I intend to make this post quite practical, and since I ...

# Behold the power of MCMC

Last time I wrote an article on MCMC methods and how they can be useful. We learned how MCMC chains can be used to simulate from a random variable whose distribution is only partially known, i.e. we don't know the normalizing constant.
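To make the "unknown normalizing constant" point concrete, here is a minimal sketch of a random-walk Metropolis sampler. It is only an illustration, not the code from the earlier article: the target `unnormalized_density` (a standard normal with its constant dropped), the function name `metropolis`, and the step size and seed are all assumptions chosen for the example.

```python
import math
import random


def unnormalized_density(x):
    # Hypothetical target: proportional to a standard normal density,
    # with the 1/sqrt(2*pi) constant deliberately left out.
    return math.exp(-0.5 * x * x)


def metropolis(density, n_samples, step=1.0, start=0.0, seed=0):
    """Random-walk Metropolis sampler (illustrative sketch)."""
    rng = random.Random(seed)
    x = start
    samples = []
    for _ in range(n_samples):
        # Propose a move by adding Gaussian noise to the current point.
        proposal = x + rng.gauss(0.0, step)
        # Only the ratio of densities is needed, so any unknown
        # normalizing constant cancels out here.
        accept_prob = min(1.0, density(proposal) / density(x))
        if rng.random() < accept_prob:
            x = proposal
        # Record the current state whether or not the move was accepted.
        samples.append(x)
    return samples


samples = metropolis(unnormalized_density, 20000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(mean, var)
```

With enough samples, the printed mean and variance should land close to 0 and 1, even though the sampler never saw the normalizing constant.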

So MCMC methods may sound interesting to some ...

# My Tryst With MCMC Algorithms

The things that I find hard to understand push me to my limits. One of the things that I have always found hard is **Markov Chain Monte Carlo Methods**.
When I first encountered them, I read a lot about them, but mostly it ended like this.

The meaning is normally ...