Statistics

The story of every distribution - Discrete Distributions

Distributions play an important role in the life of every statistician. Coming from a non-statistics background, I am not well versed in them and keep forgetting the properties of the famous distributions. That is why I chose to write down my own understanding in an intuitive way to keep track of them. One of the most helpful ways to learn more about them is the STAT110 course by Joe Blitzstein, along with his book.

Maths Beats Intuition probably every damn time

Einstein once said that “God does not play dice with the universe”. But actually he does. Everything happening around us can be explained in terms of probabilities. We repeatedly watch things happen by chance, yet we never learn. We are always dumbfounded by the playfulness of nature. One of the ways intuition plays with us is the Birthday problem. Problem Statement: In a room full of N people, what is the probability that 2 or more people share the same birthday (Assumption: 365 days in a year)?
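The problem statement above can be checked numerically. Here is a minimal sketch, assuming birthdays are independent and uniformly spread over the 365 days; the function name `birthday_collision_probability` is made up for this illustration. It computes the probability that everyone has a distinct birthday and subtracts it from 1.

```python
def birthday_collision_probability(n: int, days: int = 365) -> float:
    """Probability that at least two of n people share a birthday.

    Assumes each birthday is independent and uniform over `days` days.
    """
    if n > days:
        # Pigeonhole principle: more people than days forces a shared birthday.
        return 1.0
    p_all_distinct = 1.0
    for k in range(n):
        # Person k + 1 must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


if __name__ == "__main__":
    for n in (10, 23, 50):
        print(n, round(birthday_collision_probability(n), 4))
```

Under these assumptions the probability crosses one half at N = 23, which is far fewer people than intuition usually suggests.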

Top Data Science Resources on the Internet right now

I have been looking to create this list for a while now. There are many people on Quora who ask me how I started in the data science field, and so I wanted to create this reference. To be frank, when I first started learning, it all looked very utopian and out of this world. The Andrew Ng course felt like black magic. And it still doesn’t cease to amaze me.

Top advice for a Data Scientist

A data scientist needs to be critical and always on the lookout for something that others miss. So here is some advice that one can include in day-to-day data science work to get better at it: 1. Beware of the Clean Data Syndrome. You need to ask yourself questions even before you start working on the data. Does this data make sense? Falsely assuming that the data is clean could lead you towards wrong hypotheses.

Machine Learning Algorithms for Data Scientists

As a data scientist I believe that a lot of work has to be done before Classification/Regression/Clustering methods are applied to the data you get, data which may be messy, unwieldy and big. So here is a list of algorithms that help a data scientist build better models using the data they have: 1. Sampling Algorithms. In case you want to work with a sample of data.

Things to see while buying a Mutual Fund

This post deviates from the pattern of the blogs I have written so far, but I found that finance also uses a lot of statistics, so it is not a stretch to put it on my blog. I recently started investing in mutual funds, so I thought of researching the area before going all in. Here is the result of some of my research.

Behold the power of MCMC

Last time I wrote an article on MCMC and how it could be useful. We learned how MCMC chains can be used to simulate from a random variable whose distribution is only partially known, i.e. we don’t know the normalizing constant. MCMC methods may sound interesting to some (for them, what follows is a treat), and for those who don’t really appreciate MCMC yet, I hope I will be able to pique your interest by the end of this blog post.
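To make the "unknown normalizing constant" idea concrete, here is a minimal sketch of a random-walk Metropolis sampler, one common MCMC method and not necessarily the one used in the post. The target `unnormalized_density`, the function `metropolis` and its parameters are all assumptions chosen for illustration. The target is exp(-x²/2), a standard normal with its constant dropped, so the samples should end up with mean near 0 and variance near 1.

```python
import math
import random


def unnormalized_density(x: float) -> float:
    # Illustrative target: a standard normal without its 1/sqrt(2*pi) constant.
    return math.exp(-0.5 * x * x)


def metropolis(n_samples: int, step: float = 1.0, start: float = 0.0, seed: int = 0) -> list[float]:
    """Random-walk Metropolis sampler for `unnormalized_density`."""
    rng = random.Random(seed)
    x = start
    samples = []
    for _ in range(n_samples):
        # Symmetric Gaussian proposal centred on the current state.
        proposal = x + rng.gauss(0.0, step)
        # Only a ratio of densities is needed, so the unknown constant cancels.
        accept_ratio = unnormalized_density(proposal) / unnormalized_density(x)
        if rng.random() < accept_ratio:
            x = proposal
        # A rejected proposal repeats the current state in the chain.
        samples.append(x)
    return samples


if __name__ == "__main__":
    draws = metropolis(20000)
    mean = sum(draws) / len(draws)
    var = sum((d - mean) ** 2 for d in draws) / len(draws)
    print(f"mean={mean:.3f} variance={var:.3f}")
```

The acceptance step uses only the ratio of the target at two points, which is why the sampler never needs the normalizing constant.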