MLWhiz | AI Unwrapped

100x faster Hyperparameter Search Framework with Pyspark

Rahul Agarwal
Feb 22, 2020
Recently I was working on tuning hyperparameters for a huge Machine Learning model.

Manual tuning was not an option since I had to tweak a lot of parameters. Hyperopt was not an option either, since it runs serially: only a single model is trained at a time. Each model took a long time to train, and I was pretty short on time.

I had to come up with a better, more efficient approach if I were to meet the deadline. So I thought of the one thing that helps us data scientists in many such scenarios — parallelization.

Can I parallelize my model hyperparameter search process?

As you would have guessed, the answer is Yes.

This post is about setting up a hyperparameter tuning framework for data science using scikit-learn/XGBoost/LightGBM and PySpark.
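The core idea can be sketched before diving into the framework: every hyperparameter combination trains independently of the others, so the search is embarrassingly parallel — build the grid, map a "train and score" function over it, and pick the best result. Here is a minimal, hypothetical sketch in plain Python; `train_and_score` is a toy stand-in for real model training, and the parameter names are illustrative, not from the post.

```python
# Sketch of parallel hyperparameter search: each combination is an
# independent task, so the whole grid can be mapped over in parallel.
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def train_and_score(params):
    # Toy stand-in for "fit a model and return its validation score".
    # A real version would train e.g. an XGBoost model on the data
    # and return a cross-validated metric. Here we just pretend the
    # best settings are max_depth=5, learning_rate=0.1.
    return -abs(params["max_depth"] - 5) - abs(params["learning_rate"] - 0.1)

# Build the full grid of candidate hyperparameter combinations.
grid = [
    {"max_depth": d, "learning_rate": lr}
    for d, lr in product([3, 5, 7], [0.01, 0.1, 0.3])
]

# Evaluate every combination concurrently. For real CPU-bound training
# you would use processes or, as in this post, Spark executors.
with ThreadPoolExecutor() as pool:
    scores = list(pool.map(train_and_score, grid))

# Keep the combination with the highest score.
best = grid[max(range(len(grid)), key=scores.__getitem__)]
print(best)
```

With PySpark, `sc.parallelize(grid).map(train_and_score).collect()` plays the same role as the thread pool, with each executor training one model (the training data is typically broadcast to all executors first) — that is what makes the search scale across a cluster.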
