MLWhiz | AI Unwrapped

MLWhiz | AI Unwrapped

Share this post

MLWhiz | AI Unwrapped
MLWhiz | AI Unwrapped
Minimal Pandas Subset for Data Scientists on GPU
Copy link
Facebook
Email
Notes
More

Minimal Pandas Subset for Data Scientists on GPU

Rahul Agarwal's avatar
Rahul Agarwal
Feb 22, 2020
∙ Paid

Share this post

MLWhiz | AI Unwrapped
MLWhiz | AI Unwrapped
Minimal Pandas Subset for Data Scientists on GPU
Copy link
Facebook
Email
Notes
More
Share
Minimal Pandas Subset for Data Scientists on GPU

Data manipulation is a breeze with pandas, and it has become such a standard for it that a lot of parallelization libraries like Rapids and Dask are being created in line with Pandas syntax.

Sometimes back, I wrote about the subset of Pandas functionality I end up using often. In this post, I will talk about handling most of those data manipulation cases in Python on a GPU using cuDF.

With a sprinkling of some recommendations throughout.

PS: for benchmarking, all the experiments below are done on a Machine with 128 GB RAM and a Titan RTX GPU with 24 GB RAM.

Keep reading with a 7-day free trial

Subscribe to MLWhiz | AI Unwrapped to keep reading this post and get 7 days of free access to the full post archives.

Already a paid subscriber? Sign in
© 2025 Rahul Agarwal
Privacy ∙ Terms ∙ Collection notice
Start writingGet the app
Substack is the home for great culture

Share

Copy link
Facebook
Email
Notes
More