Accelerating Spark 3.0 Google DataProc Project with NVIDIA GPUs in 6 simple steps

Data Exploration is a key part of Data Science. And does it take long? Ahh. Don’t even ask. Preparing a dataset for ML not only requires understanding the dataset, cleaning it, and creating new features; it also involves repeating these steps until we have a fine-tuned system. As we moved towards bigger datasets, Apache Spark came as a ray of hope. It gave us a scalable, distributed in-memory system to work with Big Data.

Deployment could be easy — A Data Scientist’s Guide to deploy an Image detection FastAPI API using Amazon ec2

I recently wrote a simple tutorial on FastAPI, which covered how APIs work and how to create a simple API using the framework. That post got quite a good response, but the most-asked question was how to deploy the FastAPI API on EC2 and how to use image data rather than simple strings, integers, and floats as input to the API. I scoured the net for this, but all I could find was some undercooked documentation and many different approaches people were taking to deploy using NGINX or ECS.

How to Create an End to End Object Detector using Yolov5

Ultralytics recently launched YOLOv5 amid controversy surrounding its name. For context, the first three versions of YOLO (You Only Look Once) were created by Joseph Redmon. Following this, Alexey Bochkovskiy created YOLOv4 on darknet, which boasted higher Average Precision (AP) and faster results than previous iterations. Now, Ultralytics has released YOLOv5, with comparable AP and faster inference times than YOLOv4. This has left many asking: is a new version warranted given similar accuracy to YOLOv4?

A Layman’s Guide for Data Scientists to create APIs in minutes

Have you ever been in a situation where you want to provide your model predictions to a frontend developer without them having access to model-related code? Or has a developer ever asked you to create an API that they can use? I have faced this a lot. As data scientists and web developers try to collaborate, APIs become an essential piece of the puzzle for making code as well as skills more modular.

A definitive guide for Setting up a Deep Learning Workstation with Ubuntu

Building my own workstation has long been a dream of mine. I knew the process involved, yet I somehow never got to it. But this time I just had to do it. So I found some free time to build a Deep Learning rig with a lot of assistance from the NVIDIA folks, who were very helpful. On that note, special thanks to Josh Patterson and Michael Cooper.

End to End Pipeline for setting up Multiclass Image Classification for Data Scientists

Have you ever wondered how Facebook takes care of the abusive and inappropriate images shared by some of its users? Or how Facebook’s tagging feature works? Or how Google Lens recognizes products through images? All of the above are examples of image classification in different settings. Multiclass image classification is a common task in computer vision, where we categorize an image into three or more classes. In the past, I always used Keras for computer vision projects.

How to run your ML model Predictions 50 times faster?

With the advent of so many computing and serving frameworks, it is getting more stressful by the day for developers to put a model into production. If the question of which model performs best on my data was not enough, now the question is which framework to choose for serving a model trained with Sklearn, LightGBM, or PyTorch. And new frameworks are being added every day.