Amazon Web Services (“AWS”) Elastic Compute Cloud (“EC2”) presents a powerful and scalable option for computing. It allows developers to access virtual computing environments equipped with high-performance processing units like GPUs (Graphics Processing Units). These GPUs accelerate the training of complex machine learning models, enabling tasks that would be impractical or exceedingly slow on standard computers. This is particularly vital for deep learning models, which require substantial computational power to process large datasets and perform intricate calculations.
When you spin up an EC2 instance, AWS offers you the choice of configuring that instance from scratch or leveraging a prebuilt Amazon Machine Image (AMI). A prebuilt AMI is a template that contains a software configuration (an operating system, tools, and applications) for a specific purpose. For example, you might use a prebuilt AMI configured for deep learning.
Although the prebuilt AMIs are great, they aren’t always free: some carry software charges on top of the instance price, which increases the cost of your EC2 instance. Over a long enough period of time, these increased costs can become significant. By configuring your EC2 instance from scratch, you not only save on costs but also gain a deeper understanding of the setup process and the ability to tailor your environment to your specific needs.
Recently, I had to configure an EC2 instance from scratch. I spent a whole bunch of hours trying to piece together documentation from a variety of sources. The remainder of this post details the steps I took to configure the machine, and will hopefully save someone some confusion in the future.
As a disclaimer, this tutorial might not work out of the box. You need an AWS account with the required roles and permissions to create an EC2 instance. Additionally, AWS accounts don’t come standard with access to GPU machines, so you might have to submit a quota increase request to be able to spin up an EC2 instance with a GPU. Feel free to reach out if you need help.
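If you want to check your GPU quota before going any further, here is a minimal sketch using boto3 and the Service Quotas API. It is an illustration, not the exact steps I followed. EC2 On-Demand quotas are counted in vCPUs, so the check compares your limit against the vCPU count of the instance you want. The quota code `L-DB2E81BA` (which I believe is "Running On-Demand G and VT instances"), the `us-east-1` region, and the g4dn.xlarge instance size are all assumptions; verify them in the Service Quotas console for your own account. The helper names are made up for this example.

```python
# Assumption: quota code for "Running On-Demand G and VT instances".
# Verify it in the Service Quotas console for your account and region.
G_AND_VT_QUOTA_CODE = "L-DB2E81BA"


def get_vcpu_quota(client, quota_code=G_AND_VT_QUOTA_CODE):
    """Return the current vCPU limit for an EC2 quota as a float."""
    response = client.get_service_quota(ServiceCode="ec2", QuotaCode=quota_code)
    return response["Quota"]["Value"]


def needs_increase(quota_vcpus, instance_vcpus):
    """Return True when the quota is too small to launch even one instance."""
    return quota_vcpus < instance_vcpus


def main():
    # Imported here so the helpers above can be used without boto3 installed.
    import boto3

    # Assumption: us-east-1; quotas are set per region, so use your own.
    client = boto3.client("service-quotas", region_name="us-east-1")
    quota = get_vcpu_quota(client)

    # Assumption: targeting a g4dn.xlarge, which has 4 vCPUs.
    instance_vcpus = 4
    if needs_increase(quota, instance_vcpus):
        print(f"Quota is {quota:.0f} vCPUs; request an increase before launching.")
    else:
        print(f"Quota is {quota:.0f} vCPUs; enough for one instance.")
```

With AWS credentials configured, calling `main()` prints whether a quota increase is needed. Note that this sketch ignores vCPUs already consumed by running instances, so treat a passing check as a lower bar, not a guarantee.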
There are a multitude of ways you can interact with AWS, ranging from the AWS Management Console to Terraform…