The Most Complete Guide to pySpark DataFrames

Big Data has become synonymous with Data Engineering. But the line between Data Engineering and Data Science is blurring day by day. At this point, I think that Big Data must be in the repertoire of all data scientists. The reason: too much data is being generated every day. And that brings us to Spark, which is one of the most used tools when it comes to working with Big Data.

Don’t Democratize Data Science

Every few years, some academic and professional field gets a lot of cachet in the popular imagination. Right now, that field is data science. As a result, a lot of people are looking to get into it. Add to that the news outlets calling data science sexy and various academic institutes promising to make a data scientist out of you in just a few months, and you’ve got the perfect recipe for disaster.

Five Cognitive Biases In Data Science (And how to avoid them)

Recently, I was reading Rolf Dobelli’s The Art of Thinking Clearly, which made me think about cognitive biases in a way I never had before. I realized how deeply seated some cognitive biases are. In fact, we often don’t even consciously realize when our thinking is being affected by one. For data scientists, these biases can really change the way we work with data and make our day-to-day decisions, and generally not for the better.

Stop Worrying and Create your Deep Learning Server in 30 minutes

I have found myself creating a Deep Learning Machine time and time again whenever I start a new project. You start with installing Anaconda and end up creating different environments for PyTorch and TensorFlow, so they don’t interfere. And in the middle of it, you inevitably end up messing up and starting from scratch. And this often happens multiple times. It is not just a massive waste of time; it is also mighty (trying to avoid profanity here) irritating.

How and Why to use f strings in Python3?

Python provides us with many styles of coding. And with time, Python has regularly come up with new coding standards and tools that adhere even more closely to the principles in the Zen of Python: “Beautiful is better than ugly.” In this series of posts named Python Shorts, I will explain some simple but very useful constructs provided by Python, some essential tips, and some use cases I come across regularly in my Data Science work.

Using Deep Learning for End to End Multiclass Text Classification

Have you ever thought about how toxic comments get flagged automatically on platforms like Quora or Reddit? Or how mail gets marked as spam? Or what decides which online ads are shown to you? All of the above are examples of how text classification is used in different areas. Text classification is a common task in natural language processing (NLP) that maps a text sequence of indefinite length to a single category.

A Newspaper for COVID-19 — The CoronaTimes

It seems that the way I consume information has changed a lot. I have become quite a news junkie recently. One thing, in particular, is that I have been reading quite a lot of international news to determine the stages of COVID-19 in my country. To do this, I generally visit a lot of news media sites in various countries to read up on the news. This gave me an idea.