EC2 Fundamentals
Overview
Elastic Compute Cloud (EC2) provides virtual servers and is the backbone of many of the other services AWS offers.
Aspects of launching an instance
The following options are selected to start up a new instance:
- Instance Type
- Operating System
- Application Server
- Applications
Tenancy
Tenancy describes whether instances share hardware or are physically isolated on their own machine. Multi-tenant arrangements can be viewed as a security risk and may not meet compliance requirements in cases that necessitate high security. Examples include government-related applications and HIPAA compliance in the US, for which AWS has a white paper titled Architecting for HIPAA Security and Compliance on Amazon Web Services.
Shared (Multi-tenant)
- Shares hardware in a multi-tenant arrangement where the separation of instances is logical and achieved via virtualization
Dedicated (Single-tenant)
- Reserves an entire server that is physically isolated and hosts only that customer's instances
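As a rough illustration of where the tenancy choice shows up at launch time, the sketch below builds the keyword arguments one might pass to boto3's `run_instances` call. The helper name `build_launch_request` and the AMI ID are placeholders made up for this example. `Placement.Tenancy` takes `default` for shared hardware and `dedicated` for single-tenant hardware. Nothing here contacts AWS.

```python
def build_launch_request(image_id, instance_type, dedicated=False):
    """Return keyword arguments for an EC2 launch request (hypothetical helper)."""
    return {
        "ImageId": image_id,            # the OS / machine image to boot
        "InstanceType": instance_type,  # e.g. "t3.micro"
        "MinCount": 1,
        "MaxCount": 1,
        # "default" = shared (multi-tenant), "dedicated" = single-tenant
        "Placement": {"Tenancy": "dedicated" if dedicated else "default"},
    }


# "ami-EXAMPLE" is a placeholder, not a real image ID.
shared = build_launch_request("ami-EXAMPLE", "t3.micro")
isolated = build_launch_request("ami-EXAMPLE", "m5.large", dedicated=True)
print(shared["Placement"]["Tenancy"])
print(isolated["Placement"]["Tenancy"])
```

With boto3 installed, credentials configured, and a real AMI ID, the dictionary could be passed along as `boto3.client("ec2").run_instances(**isolated)`.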
Instance types
The full list is here. Each instance type is named according to naming conventions that are explained here. It isn't super important to know the name of every kind of instance, but it's nice to know which category one is in just by looking at the first letter. I've listed the starting letters below. Though each letter represents a different family of instances, every family is bucketed into one of the main instance types.
Compute Optimized
- More performant CPUs
- Names starting with `c`
- The highest-performance ones start with `hpc`
Memory Optimized
- Best when data must be preloaded from storage into memory
- Great for in-memory databases that need high performance, like Redis (their docs recommend the memory optimized instances starting with `r`)
- Names starting with either `r`, `u`, `x`, or `z`
Storage Optimized
- Designed to handle frequent, sequential I/O
- Used for data warehousing or distributed file systems (DFS)
- Names starting with either `d`, `h`, or `i`
General Purpose
- Well-rounded across compute, memory, and networking
- Names starting with `m` or `t`
Accelerated
- Great for graphically intense workloads like streaming
- Names starting with either `g`, `p`, or `f` (but there are more)
- `g` and `p` are GPU-based instances
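The first-letter rule above can be sketched as a small lookup. This is only the heuristic from this page, not an official AWS mapping: the function name `categorize` and the table are made up for illustration, and real families outside the listed letters (for example `inf` or `trn`, which are accelerated instances) would be misfiled by it.

```python
# Prefixes from the lists above. Longer prefixes come first so that
# "hpc" is matched before the single letter "h".
PREFIX_TO_CATEGORY = [
    ("hpc", "Compute Optimized"),
    ("c", "Compute Optimized"),
    ("r", "Memory Optimized"),
    ("u", "Memory Optimized"),
    ("x", "Memory Optimized"),
    ("z", "Memory Optimized"),
    ("d", "Storage Optimized"),
    ("h", "Storage Optimized"),
    ("i", "Storage Optimized"),
    ("m", "General Purpose"),
    ("t", "General Purpose"),
    ("g", "Accelerated"),
    ("p", "Accelerated"),
    ("f", "Accelerated"),
]


def categorize(instance_type):
    """Guess the category of a name like 'c5.large' from its leading letters."""
    family = instance_type.split(".")[0].lower()
    for prefix, category in PREFIX_TO_CATEGORY:
        if family.startswith(prefix):
            return category
    return "Unknown"


for name in ["c5.large", "hpc6a.48xlarge", "r5.xlarge", "h1.2xlarge", "t3.micro", "g4dn.xlarge"]:
    print(name, "->", categorize(name))
```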
Instance Store Volumes (Storage)
Some instance types launch with Instance Store Volumes, which are physically attached to the AWS host hardware. This is local disk storage allocated to and used by the instance over its lifespan. Because the storage lives on the host rather than on a separate service, it is ephemeral: data survives a reboot, but it is lost when the instance stops, hibernates, or terminates.
Pricing
On-demand
- Pay for the time you use with no lock-in contracts or discounts
- Good for when workloads are irregular or short-term
Reserved
- Get discounted rates by committing to a term of 1 or 3 years
- Two types:
  - Standard Reserved - Locks in a specific instance type, size, OS, region, and tenancy. Optionally, an Availability Zone can be specified, which also reserves capacity in that zone (a capacity reservation).
  - Convertible Reserved - Offers less of a discount than standard, but allows greater flexibility to change instance family, type, or tenancy.
- When the term ends, instances that keep running are billed at on-demand rates unless the reservation is renewed
Spot
- Best used for background workloads with flexible, irregular timing
- Offers the greatest discount off of on-demand pricing (up to 90% off)
- Leverages unused hardware to spin up a temporary instance to handle the job
- Could be delayed or interrupted whenever spare capacity isn't available
Dedicated hosts
- Provides a single-tenant physical server
- Can be purchased on-demand or reserved for a discount
- Is the most expensive type
Savings plans
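- Requires an hourly spend commitment to an instance family and region (this is the EC2 Instance Savings Plan; the Compute Savings Plan drops the family and region requirement for a smaller discount)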
- Commit for a 1 or 3 year term to receive a discounted rate
- Discounts usage up to the commitment amount, then charges on-demand rates above that
- Unlike a standard reserved instance, you don't need to commit to a number of instances or any of their attributes like tenancy, size, or type
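To make the "discounted up to the commitment, on-demand above it" rule concrete, here is a simplified model of one hour of billing. It assumes a single flat discount rate and treats the commitment as dollars per hour measured at the discounted rate. The function name `hourly_bill` and the numbers are made up for illustration, and real bills apply per-instance rates that this sketch ignores.

```python
def hourly_bill(on_demand_usage, commitment, discount):
    """Estimate one hour's charge under a savings plan (simplified model).

    on_demand_usage: what the hour's usage would cost at on-demand rates ($)
    commitment:      committed spend per hour ($), counted at discounted rates
    discount:        fractional discount, e.g. 0.3 for 30% off
    """
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    discounted_cost = on_demand_usage * (1 - discount)
    if discounted_cost <= commitment:
        # The commitment is charged in full even when it is under-used.
        return commitment
    # On-demand value of the usage that the commitment pays for.
    covered_on_demand = commitment / (1 - discount)
    # Everything beyond that is billed at plain on-demand rates.
    return commitment + (on_demand_usage - covered_on_demand)


# $7/hour commitment at a 30% discount covers about $10 of on-demand usage.
print(round(hourly_bill(20.0, 7.0, 0.3), 2))  # busy hour: commitment + overflow
print(round(hourly_bill(5.0, 7.0, 0.3), 2))   # quiet hour: commitment only
```

In this model a busy hour costs the commitment plus the uncovered remainder, while a quiet hour still costs the full commitment, which is why the commitment amount is worth sizing against baseline rather than peak usage.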
Flexibility
Auto Scaling
Auto Scaling adds and removes EC2 instances automatically to prevent under- or over-utilization of resources. This reduces costs when traffic is low and ensures availability when traffic is high. In this context, the collection of instances is called an Auto Scaling group. There are two types of scaling:
Dynamic
- Responds to changes in demand as they occur
Predictive
- Schedules the number of instances based on predictions of demand
Combine Dynamic & Predictive
- By combining these, EC2 Auto Scaling can both predict and react to changes for greater efficiency
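A toy sketch of the two approaches and their combination follows. It assumes dynamic scaling works like a proportional rule that resizes the group so average CPU moves toward a target, and that the combined result is simply the larger of the forecast and the reactive number. Both are simplifications for illustration, and the function names and figures are invented here.

```python
import math


def reactive_capacity(current_instances, observed_cpu, target_cpu):
    """Dynamic scaling: resize so that average CPU moves toward the target."""
    return math.ceil(current_instances * observed_cpu / target_cpu)


def combined_capacity(forecast_instances, current_instances, observed_cpu, target_cpu):
    """Predictive plus dynamic: never go below the forecast, but still react to load."""
    reactive = reactive_capacity(current_instances, observed_cpu, target_cpu)
    return max(forecast_instances, reactive)


# 4 instances running at 90% CPU with a 60% target suggests 6 instances.
print(reactive_capacity(4, 90, 60))
# A forecast of 8 wins over the reactive 6; a forecast of 3 does not.
print(combined_capacity(8, 4, 90, 60))
print(combined_capacity(3, 4, 90, 60))
```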
To configure an Auto Scaling group, set the following:
Minimum
- Lower bound for number of instances
- Can be as low as 0
Maximum
- Upper bound for number of instances
Desired (Optional)
- Defaults to the minimum
- The group maintains this number by replacing instances that terminate, which could happen in cases like a Spot instance interruption or some other disruption or failure
- Pair with CloudWatch alarms to trigger scaling policies that raise or lower this number reactively, for example based on load or cost constraints
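The interplay of minimum, maximum, and desired can be sketched with a small in-memory model. This is not the AWS API: the class `ScalingGroupModel` and its methods are hypothetical, and clamping an out-of-range request is a simplification made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScalingGroupModel:
    min_size: int
    max_size: int
    desired: Optional[int] = None  # falls back to min_size when omitted
    running: int = 0

    def __post_init__(self):
        if not 0 <= self.min_size <= self.max_size:
            raise ValueError("need 0 <= min_size <= max_size")
        if self.desired is None:
            self.desired = self.min_size
        self.request_capacity(self.desired)

    def request_capacity(self, wanted):
        """Set desired capacity, kept within [min_size, max_size]."""
        self.desired = max(self.min_size, min(self.max_size, wanted))
        return self.desired

    def reconcile(self):
        """Launch (positive) or terminate (negative) instances to match desired."""
        change = self.desired - self.running
        self.running = self.desired
        return change


group = ScalingGroupModel(min_size=2, max_size=6)
print(group.reconcile())           # initial launch up to the desired count
group.running -= 1                 # e.g. a Spot interruption removes one
print(group.reconcile())           # the lost instance is replaced
print(group.request_capacity(10))  # a request above the maximum is capped
```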
Elastic Load Balancing (ELB) - The basics
This is a much larger topic, so I'll keep it brief until a future page in the series. ELB distributes traffic across the instances in an Auto Scaling group (or other targets like a Lambda function) so that all instances are used more efficiently. The primary example is keeping traffic from crowding a single instance when scaling has made several available. However, there are different load balancing algorithms that handle different use cases. AWS has these listed here.
Additionally, ELB can forward traffic through security services that inspect and approve any requests and responses en route.
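To give a feel for how the choice of algorithm changes which instance receives a request, here is a minimal sketch of two common strategies: round robin and least outstanding requests. The instance IDs and in-flight counts are invented, and this models only the selection step, not ELB itself.

```python
from itertools import cycle


def round_robin(targets):
    """Return an iterator that hands out targets in a repeating order."""
    return cycle(targets)


def least_outstanding(in_flight):
    """Pick the target with the fewest requests currently in flight."""
    return min(in_flight, key=in_flight.get)


picker = round_robin(["i-a", "i-b", "i-c"])
print([next(picker) for _ in range(4)])  # wraps back to the first target

busy = {"i-a": 5, "i-b": 2, "i-c": 7}    # requests in flight per target
print(least_outstanding(busy))
```

Round robin spreads requests evenly regardless of how long each one takes, while least outstanding requests favors whichever target is currently least loaded, which tends to help when request durations vary widely.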
There are 4 types of load balancers:
Classic Load Balancer
- Older generation, built for the EC2-Classic network, that primarily routes traffic to EC2 instances
Application Load Balancer
- Routes traffic for HTTP, gRPC, or WebSocket requests (application layer)
Network Load Balancer
- Routes TCP or UDP traffic based on IP protocol data (transport layer, layer 4)
Gateway Load Balancer
- Routes private connections to third-party targets (AWS PrivateLink)
- Secures traffic by passing it through security appliances like a network firewall