I have found myself creating a Deep Learning Machine time and time again whenever I start a new project.
You start with installing Anaconda and end up creating different environments for Pytorch and Tensorflow, so they don’t interfere. And in the middle of it, you inevitably end up messing up and starting from scratch. And this often happens multiple times.
It is not just a massive waste of time; it is also mighty (trying to avoid profanity here) irritating. You go through all those Stack Overflow threads, often wondering what has gone wrong.
So is there a way to do this more efficiently?
It turns out there is. In this blog, I will try to set up a deep learning server on EC2 with minimal effort so that I can focus on more important things.
This blog consists of two parts:
Setting up an Amazon EC2 Machine with preinstalled deep learning libraries.
Setting Up Jupyter Notebook using TMUX and SSH tunneling.
Don’t worry; it’s not as difficult as it sounds. Just follow the steps and click Next.
I am assuming that you have an AWS account and access to the AWS Console. If not, you might need to sign up for an Amazon AWS account.
In this tutorial, I have gone with a p2.xlarge instance, which provides an NVIDIA K80 GPU with 2,496 parallel processing cores and 12 GiB of GPU memory. To learn about the different instance types, you can look at the documentation here and the pricing here.
Keep this key pair safe, as it will be required whenever you want to log in to your instance.
To connect to your instance, open a terminal window on your local machine, browse to the folder where you have kept your key pair file, and restrict the file's permissions:
chmod 400 aws_key.pem
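The `chmod 400` step matters because the SSH client refuses to use a private key file that other users on your machine can read. As a rough sketch, a check like the one below could confirm the permissions before you connect. The function name `key_is_private` is hypothetical, and the check assumes that your `ls -l` prints the usual ten-character mode string.

```shell
# Hypothetical helper: succeeds only when the key file's mode is exactly
# -r-------- (read-only for the owner, which is what chmod 400 sets).
key_is_private() {
    # The first 10 characters of `ls -l` output are the mode string.
    perms=$(ls -l "$1" 2>/dev/null | cut -c1-10)
    [ "$perms" = "-r--------" ]
}

# Example usage: only touch the permissions when the key file exists.
if [ -f aws_key.pem ]; then
    key_is_private aws_key.pem || chmod 400 aws_key.pem
fi
```

The check is deliberately strict: it accepts only the exact mode that `chmod 400` produces.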
Once you do that, you will be able to connect to your instance by SSHing. The SSH command will be of the form:
ssh -i "aws_key.pem" ubuntu@<Your PublicDNS(IPv4)>
For me, the command was:
ssh -i "aws_key.pem" ubuntu@ec2-54-202-223-197.us-west-2.compute.amazonaws.com
Also, keep in mind that the Public DNS might change once you shut down your instance.
But there are still a few things you will require to use your machine fully, one of them being Jupyter Notebooks. To set up Jupyter Notebooks on your machine, I recommend using TMUX and SSH tunneling. Let us go through setting up the Jupyter notebook step by step.
We will first use TMUX to run the Jupyter notebook on our instance. We mainly use this so that our notebook still runs even if the terminal connection gets lost.
To do this, you will need to create a new TMUX session using:
tmux new -s StreamSession
Once you do that, you will see a new screen with a green border at the bottom. You can start your Jupyter Notebook in this machine using the usual jupyter notebook command. You will see something like:
It helps to copy the login URL now so that we have the token when we log in to the Jupyter notebook later. In my case, it is:
http://localhost:8888/?token=5ccd01f60971d9fc97fd79f64a5bb4ce79f4d96823ab7872
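If you only want the token itself, it is the part of the URL after `token=`. Here is a minimal sketch using plain shell parameter expansion; the function name `token_from_url` is hypothetical and not part of Jupyter.

```shell
# Hypothetical helper: prints the token part of a Jupyter login URL.
# If the URL contains no "token=", the input is printed unchanged.
token_from_url() {
    token="${1##*token=}"   # drop everything up to and including "token="
    token="${token%%&*}"    # drop any further query parameters after the token
    printf '%s\n' "$token"
}

# Prints the token from the URL shown above.
token_from_url 'http://localhost:8888/?token=5ccd01f60971d9fc97fd79f64a5bb4ce79f4d96823ab7872'
```

If you lose the URL, reattaching to the TMUX session shows the Jupyter log again. On many installations, `jupyter notebook list` also prints the running servers with their tokens, though that depends on your Jupyter version.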
The next step is to detach our TMUX session so that it continues running in the background even when you leave the SSH shell. To do this, just press Ctrl+B and then D (don’t press Ctrl when pressing D). You will come back to the initial screen with a message that you have detached from your TMUX session.
If you want, you can reattach to the session again using:
tmux attach -t StreamSession
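When you come back after a break, you may not remember whether the session still exists. The sketch below picks between the two TMUX commands used above by asking `tmux has-session`. The function name `tmux_cmd_for` is hypothetical, and it only prints the command rather than running it.

```shell
# Hypothetical helper: prints the TMUX command to use for a session name.
# It only prints the command; it does not start or attach to anything.
tmux_cmd_for() {
    session="$1"
    if command -v tmux >/dev/null 2>&1 \
        && tmux has-session -t "$session" >/dev/null 2>&1; then
        echo "tmux attach -t $session"   # session exists: reattach to it
    else
        echo "tmux new -s $session"      # no such session (or no tmux): create it
    fi
}

tmux_cmd_for StreamSession
```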
The second step is to tunnel into the Amazon instance so that you can open the Jupyter notebook in your local browser. As we can see, the Jupyter Notebook is actually running on localhost on the cloud instance. How do we access it? We use SSH tunneling. Worry not, it is a straightforward fill-in-the-blanks exercise. Just use this command in a terminal window on your local machine:
ssh -i "aws_key.pem" -L <Local Machine Port>:localhost:8888 ubuntu@<Your PublicDNS(IPv4)>
For this case, I have used:
ssh -i "aws_key.pem" -L 8001:localhost:8888 ubuntu@ec2-54-202-223-197.us-west-2.compute.amazonaws.com
This means that I will be able to use the Jupyter Notebook if I open localhost:8001 in my local machine's browser. And I surely can. We can now just enter the token that we saved in one of the previous steps to access the notebook. For me, the token is 5ccd01f60971d9fc97fd79f64a5bb4ce79f4d96823ab7872
You can just log in using your token and, voila, we get the notebook in all its glory.
You can now choose to work on a new project by selecting any of the different environments you want. Whether you come from Tensorflow or Pytorch, or want the best of both worlds, this notebook will not disappoint you.
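To spell out the tunnel: `-L 8001:localhost:8888` forwards port 8001 on your local machine to port 8888 on the instance, so the login URL you copied earlier works locally once you swap the port. The helpers below are a sketch with hypothetical names (`tunnel_cmd`, `local_url`). They only print the command and the URL, and they assume Jupyter is on the remote port 8888 as in this post.

```shell
# Hypothetical helpers; the names and argument order are illustrative only.

# Prints the SSH tunnel command for a key file, local port, and Public DNS.
tunnel_cmd() {
    key_file="$1"; local_port="$2"; public_dns="$3"
    printf 'ssh -i "%s" -L %s:localhost:8888 ubuntu@%s\n' \
        "$key_file" "$local_port" "$public_dns"
}

# Prints the URL to open locally: same token, but the local port
# instead of the remote port 8888.
local_url() {
    local_port="$1"; token="$2"
    printf 'http://localhost:%s/?token=%s\n' "$local_port" "$token"
}

tunnel_cmd aws_key.pem 8001 ec2-54-202-223-197.us-west-2.compute.amazonaws.com
local_url 8001 5ccd01f60971d9fc97fd79f64a5bb4ce79f4d96823ab7872
```

Since the Public DNS can change after a restart, keeping it as an argument makes it easy to rebuild the command with the new value.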
It might happen that once the machine is restarted, you face some problems with the NVIDIA graphics card. Specifically, in my case, the nvidia-smi command stopped working. If you encounter this problem, the solution is to download the graphics driver from the NVIDIA website.
Above are the settings for the particular AMI I selected. Once you click on Search you will be able to see the next page:
Just copy the download link by right-clicking and copying the link address, then run the following commands on your instance. You might need to change the link address and the file name in these commands.
# When nvidia-smi doesn't work:
# Download the driver installer (quote the URL, since it contains & characters).
wget "https://www.nvidia.in/content/DriverDownload-March2009/confirmation.php?url=/tesla/410.129/NVIDIA-Linux-x86_64-410.129-diagnostic.run&lang=in&type=Tesla"
# Run the installer non-interactively.
sudo sh NVIDIA-Linux-x86_64-410.129-diagnostic.run --no-drm --disable-nouveau --dkms --silent --install-libglvnd
# Show the installed kernel module's details, then load the module.
modinfo nvidia | head -7
sudo modprobe nvidia
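Before and after reinstalling the driver, it can help to tell apart "the tool is missing" from "the tool is there but fails". This is a small sketch of such a check; the function name `driver_status` is hypothetical, and it treats any non-zero exit code from `nvidia-smi` as a broken driver.

```shell
# Hypothetical helper: reports whether a GPU tool (default: nvidia-smi)
# is missing, present but failing, or working.
driver_status() {
    tool="${1:-nvidia-smi}"
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "missing"   # the command is not on the PATH at all
    elif "$tool" >/dev/null 2>&1; then
        echo "ok"        # the command ran and exited with status 0
    else
        echo "broken"    # the command exists but exited with an error
    fi
}

driver_status
```

If this prints `broken` after a restart, that matches the situation described above, and reinstalling the driver is the fix that worked for me.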
And that’s it. You have got an up-and-running Deep Learning machine at your disposal, and you can work with it as much as you want. Just keep in mind to stop the instance whenever you stop working, so you won’t need to pay Amazon when you are not using it. You can do this on the instances page by right-clicking on your instance. Just note that when you need to log in to this machine again, you will need to get the Public DNS (IPv4) address from the instances page again, as it might have changed.
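If you prefer the terminal to the console, the same two chores (stopping the instance and looking up the new Public DNS) can be sketched with the AWS CLI. This is an assumption on my part: it requires the AWS CLI to be installed and configured, which this post does not cover. The instance ID below is a placeholder, and the function names are hypothetical. By default the script only prints the commands.

```shell
# Assumption: the AWS CLI is installed and configured; this post uses the
# web console instead. DRY_RUN defaults to 1, so commands are only
# printed unless you set DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Stops (does not terminate) the instance with the given ID.
stop_instance() {
    run aws ec2 stop-instances --instance-ids "$1"
}

# Looks up the current Public DNS name of the instance.
public_dns() {
    run aws ec2 describe-instances --instance-ids "$1" \
        --query 'Reservations[0].Instances[0].PublicDnsName' --output text
}

# i-0123456789abcdef0 is a placeholder instance ID.
stop_instance i-0123456789abcdef0
public_dns i-0123456789abcdef0
```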
I have always found it a big chore to set up a deep learning environment.
In this blog, we set up a new Deep Learning server on EC2 in minimal time by using Deep Learning Community AMI, TMUX, and Tunneling for the Jupyter Notebooks. This server comes preinstalled with all the deep learning libraries you might need at your work, and it just works out of the box.
So what are you waiting for? Just get started with Deep Learning with your own server.
If you want to learn more about AWS and how to use it in production settings and for deploying models, I would like to call out an excellent course on AWS. Do check it out.
Thanks for the read. I am going to be writing more beginner-friendly posts in the future too. Follow me on Medium or subscribe to my blog.
Also, a small disclaimer: there might be some affiliate links in this post to relevant resources, as sharing knowledge is never a bad idea.