Parallelization is awesome. We data scientists have laptops with quad-core and octa-core processors and turbo boost, and we work with servers that have even more cores and computing power. But do we really utilize the raw power we have at hand? Instead, we wait for time-consuming processes to finish, sometimes for hours, even when urgent deliverables are due. Can we do better? In this series of posts named Python Shorts, I will explain some simple constructs provided by Python, some essential tips, and some use cases I come up with regularly in my Data Science work.
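As a taste of what utilizing those cores can look like, here is a minimal sketch using Python's standard-library `multiprocessing` module. The `square` function and the inputs are made-up stand-ins for any CPU-bound task, not code from the post itself:

```python
from multiprocessing import Pool

def square(x):
    # A stand-in for any CPU-bound piece of work.
    return x * x

if __name__ == "__main__":
    # Spread the work across 4 worker processes instead of one core.
    with Pool(processes=4) as pool:
        results = pool.map(square, range(10))
    print(results)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

`Pool.map` behaves like the built-in `map`, but each call runs in a separate process, so a machine with spare cores finishes the batch roughly in parallel.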
Python supports many styles of coding; in that sense, it is pretty inclusive. One can come from any language and start writing Python. However, learning to write a language and writing it in an optimized way are two different things. In this series of posts named Python Shorts, I will explain some simple but very useful constructs provided by Python, some essential tips, and some use cases I come up with regularly in my Data Science work.
Learning a language is easy. Whenever I start with a new language, I focus on a few things in the following order, and it is a breeze to get started with writing code in any language:

- Operators and data types: +, -, int, float, str
- Conditional statements: if, else, case, switch
- Loops: for, while
- Data structures: lists, arrays, dicts, hashmaps
- Defining functions

However, learning to write a language and writing a language in an optimized way are two different things.
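To make the checklist concrete, here is a toy Python snippet of my own (not from the original post) that touches every item on it in a few lines:

```python
def describe(numbers):
    # Function definition taking a list (data structure).
    total = 0.0                      # float data type
    for n in numbers:                # loop
        total += n                   # arithmetic operator
    if total > 10:                   # conditional statement
        label = "big"
    else:
        label = "small"
    return {"total": total, "label": label}  # dict data structure

print(describe([1, 2, 3, 4, 5]))  # {'total': 15.0, 'label': 'big'}
```

Once these five pieces feel natural, reading and writing basic programs in the language stops being the bottleneck.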
Visualizations are awesome. However, a good visualization is annoyingly hard to make. Moreover, it takes time and effort to present these visualizations to a bigger audience. We all know how to make bar plots, scatter plots, and histograms, yet we don't pay much attention to beautifying them. This hurts our credibility with peers and managers. You won't feel it now, but it happens.
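A few low-effort touches already go a long way. The sketch below, assuming matplotlib is installed and using made-up data, shows the kind of small cleanups (a title, an axis label, dropping the boxy top and right spines) that make a default bar plot look more deliberate:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen for this sketch
import matplotlib.pyplot as plt

# Made-up data purely for illustration.
labels = ["A", "B", "C", "D"]
values = [4, 7, 3, 8]

fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(labels, values, color="#4C72B0")

# Small touches that beautify a default bar plot.
ax.set_title("Units sold by region (illustrative)")
ax.set_ylabel("Units sold")
ax.spines["top"].set_visible(False)
ax.spines["right"].set_visible(False)

fig.tight_layout()
fig.savefig("barplot.png")
```

None of this changes the data; it only removes visual noise so the numbers, not the chart's chrome, get the audience's attention.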
Chatbots are the in thing now. Every website must implement one. Every Data Scientist must know about them. Anytime we talk about AI, chatbots must be discussed. But they look intimidating to someone very new to the field. We struggle with a lot of questions before we even begin working on them. Are they hard to create? What technologies should I know before attempting to work on them?
Just kidding, nothing is hotter than Jennifer Lawrence. But as you are here, let's proceed. Practitioners in any field turn out only as good as the tools they use, and Data Scientists are no different. But sometimes we don't even know which tools we need, or whether we need them at all. We cannot fathom whether there could be a more natural way to solve the problem we face.
This post is the fourth in the NLP text classification series. To give you a recap, I started with an NLP text classification competition on Kaggle called the Quora Question insincerity challenge, and I thought I would share the knowledge through a series of blog posts on text classification. The first post talked about the different preprocessing techniques that work with Deep Learning models and about increasing embedding coverage.