I am a mechanical engineer by education, and I started my career with a core job in the steel industry.
But I didn’t like it, so I left.
Around 2013, I made it my goal to move into the analytics and data science space. Since then, the shift has taken a lot of failures and a lot of effort.
Now, people on social networks ask me how I got started in the data science field.

Data Science is the study of algorithms.
I grapple with many algorithms on a day-to-day basis, so I thought of listing some of the most common and most used algorithms one will end up using in this new DS Algorithm series.
How many times has it happened that you create a lot of features and then need to come up with ways to reduce their number?

Data Science is the study of algorithms.
I grapple with many algorithms on a day-to-day basis, so I thought of listing some of the most common and most used algorithms one will end up using in this new DS Algorithm series.
This post is about some of the most common sampling techniques one can use while working with data.
Simple Random Sampling: Say you want to select a subset of a population in which each member of the population has an equal probability of being chosen.
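As a minimal sketch of simple random sampling, the snippet below draws a subset without replacement using Python's standard `random` module. The helper name `simple_random_sample`, the population of 100 IDs, and the seed are all made up for illustration.

```python
import random


def simple_random_sample(population, k, seed=None):
    """Draw k distinct members so that every member is equally likely to be chosen."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(population, k)  # sampling without replacement


# Hypothetical population: 100 member IDs.
population = list(range(1, 101))
subset = simple_random_sample(population, k=10, seed=42)
print(subset)
```

Sampling without replacement guarantees that no member appears twice in the subset. If repeats are acceptable, `random.choices` samples with replacement instead.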

Exploration and Exploitation play a key role in any business.
Any good business will try to “explore” various opportunities where it can make a profit.
At the same time, any good business also tries to focus on an opportunity it has already found and “exploit” it.
Let me explain this further with a thought experiment.
Thought Experiment: Assume that we have infinitely many slot machines. Every slot machine has some win probability.
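To make the trade-off concrete, here is a rough simulation sketch. It rests on several assumptions that are not part of the thought experiment itself: a small finite set of machines instead of infinitely many, made-up win probabilities, and an epsilon-greedy rule (explore a random machine with probability `epsilon`, otherwise exploit the machine with the best observed win rate). It is one simple strategy among many, not necessarily the one this series settles on.

```python
import random


def epsilon_greedy(win_probs, n_pulls, epsilon=0.1, seed=None):
    """Simulate pulling slot machines with an epsilon-greedy rule.

    win_probs holds the true win probability of each machine; the player
    never sees it and only learns from observed wins.
    """
    rng = random.Random(seed)
    pulls = [0] * len(win_probs)  # how often each machine was played
    wins = [0] * len(win_probs)   # how often each machine paid out
    total_reward = 0
    for _ in range(n_pulls):
        if rng.random() < epsilon:
            # Explore: try a machine chosen uniformly at random.
            arm = rng.randrange(len(win_probs))
        else:
            # Exploit: play the machine with the best observed win rate.
            # Machines that were never played count as rate 0.0.
            rates = [w / p if p else 0.0 for w, p in zip(wins, pulls)]
            arm = rates.index(max(rates))
        reward = 1 if rng.random() < win_probs[arm] else 0
        pulls[arm] += 1
        wins[arm] += reward
        total_reward += reward
    return total_reward, pulls


# Hypothetical machines with made-up win probabilities.
total_reward, pulls = epsilon_greedy([0.2, 0.5, 0.8], n_pulls=1000, seed=0)
print(total_reward, pulls)
```

A larger `epsilon` spends more pulls exploring. A smaller one commits sooner to whichever machine looks best so far, at the risk of settling on the wrong one.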

Distributions play an important role in the life of every statistician. Coming from a non-statistics background, I am not so well versed in them and keep forgetting the properties of these famous distributions. That is why I chose to write down my own understanding in an intuitive way, to keep track. One of the most helpful ways to learn more about them is the STAT110 course by Joe Blitzstein and his book.

Einstein once said that “God does not play dice with the universe”. But actually he does. Everything happening around us could be explained in terms of probabilities. We repeatedly watch things around us happen by chance, yet we never learn. We always get dumbfounded by the playfulness of nature.
One such way our intuition gets fooled is the Birthday problem.
Problem Statement: In a room full of N people, what is the probability that 2 or more people share the same birthday? (Assumption: 365 days in a year.)
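One way to get a feel for the answer is to compute it through the complement: the probability that all N birthdays are distinct is (365/365) × (364/365) × ... × ((365 − N + 1)/365), and the probability of at least one shared birthday is one minus that. The sketch below assumes birthdays are independent and uniformly spread over the year; the function name `birthday_probability` is just for illustration.

```python
def birthday_probability(n, days=365):
    """Probability that at least two of n people share a birthday."""
    if n > days:
        return 1.0  # more people than days forces a shared birthday
    p_all_distinct = 1.0
    for i in range(n):
        # Person i + 1 must avoid the i birthdays already taken.
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


for n in (10, 23, 50):
    print(n, round(birthday_probability(n), 3))
```

With only 23 people in the room, the probability already climbs just above 50%, which is why the result feels so counterintuitive.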

I have been looking to create this list for a while now. There are many people on Quora who ask me how I started in the data science field, so I wanted to create this reference.
To be frank, when I first started learning, it all looked very utopian and out of this world. The Andrew Ng course felt like black magic. And it still doesn’t cease to amaze me.