AWS Notes

Storage

Overview

The EC2 post earlier in this series introduced Instance Store Volumes, which are storage disks attached to an instance. They are temporary and are not supported on all instance types. When stored files must persist beyond the lifespan of the instance, another AWS storage service should be used instead.

Storage Services

Elastic Block Store (EBS)

EBS volumes are independently provisioned storage resources that are not physically attached to the AWS host an instance runs on. A volume has many configurable properties, such as its size, its type, and the EC2 instance it is attached to. Typically, a volume can be attached to only one instance at a time. However, the io1 and io2 volume types support Multi-Attach, which allows one volume to be attached to multiple instances. For types that do not support this, a snapshot can be taken and restored to a new EBS volume, which is then attached to another instance. For most volume types the maximum size is 16 TiB, and volumes are solid state (SSD) by default.
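As a rough illustration of the configuration options described above, the sketch below builds the parameters for a volume request and rejects Multi-Attach on types that do not support it. The helper name `build_volume_request`, the gp3 default, the Availability Zone, and the sizes are assumptions made for this example; the dictionary keys are intended to mirror the boto3 `create_volume` parameters, and nothing here calls AWS.

```python
# Volume types that support Multi-Attach, per the paragraph above.
MULTI_ATTACH_TYPES = {"io1", "io2"}


def build_volume_request(az, size_gib, volume_type="gp3", multi_attach=False, iops=None):
    """Return keyword arguments for an EBS create-volume request (hypothetical helper)."""
    if multi_attach and volume_type not in MULTI_ATTACH_TYPES:
        raise ValueError(f"Multi-Attach is not supported on {volume_type}")
    request = {
        "AvailabilityZone": az,   # a volume lives in a single Availability Zone
        "Size": size_gib,         # size in GiB
        "VolumeType": volume_type,
    }
    if iops is not None:
        request["Iops"] = iops
    if multi_attach:
        request["MultiAttachEnabled"] = True
    return request


# Illustrative values only: an io2 volume that several instances could share.
shared = build_volume_request("us-east-1a", 100, "io2", multi_attach=True, iops=3000)
print(shared)
```

Validating the request before sending it keeps the Multi-Attach rule in one place rather than relying on an API error at creation time.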

Taking snapshots is also important for preventing data loss. EBS snapshots are incremental backups: after the first snapshot, only the blocks that have changed since the previous snapshot are saved, which keeps them efficient. Best practice is to schedule snapshots at regular intervals, such as daily or weekly. A retention period can also be set to determine how long each snapshot is kept.
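The idea of an incremental backup can be shown with a toy model. The sketch below treats a volume as a dictionary of numbered blocks and saves only the blocks that differ from the previous snapshot. It is a simplification for illustration, not how EBS stores snapshots internally, and the function name and block contents are made up.

```python
def incremental_snapshot(volume_blocks, previous_blocks):
    """Return only the blocks that changed since the previous snapshot."""
    return {
        index: data
        for index, data in volume_blocks.items()
        if previous_blocks.get(index) != data
    }


volume = {0: b"boot", 1: b"logs-v1", 2: b"data"}

# The first snapshot has nothing to compare against, so every block is saved.
first = incremental_snapshot(volume, {})
state_at_first = dict(volume)

# Only block 1 changes before the second snapshot.
volume[1] = b"logs-v2"
second = incremental_snapshot(volume, state_at_first)

print(sorted(first))   # [0, 1, 2]
print(sorted(second))  # [1]
```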

Structurally, a new volume is a raw block device, so data stored on it has no hierarchical structure by default. A filesystem such as XFS or NTFS can be created on the volume to provide one. The volume must be formatted this way before it can be mounted and used on an instance.
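A minimal sketch of the format-then-mount order on a Linux instance follows. It only builds and prints the commands rather than running them, since `mkfs` erases whatever is on the device. The device name `/dev/xvdf`, the mount point `/data`, and the helper name are assumptions for illustration; the real device name depends on the instance.

```python
import shlex


def prepare_volume_commands(device, mount_point, fs_type="xfs"):
    """Return the commands to format a blank volume and then mount it."""
    return [
        ["mkfs", "-t", fs_type, device],  # create the filesystem (destroys existing data)
        ["mkdir", "-p", mount_point],     # make sure the mount point exists
        ["mount", device, mount_point],   # mount only after formatting
    ]


for command in prepare_volume_commands("/dev/xvdf", "/data"):
    print(shlex.join(command))
```

These commands would normally need root privileges, and formatting applies only to a new, empty volume. A volume restored from a snapshot already contains a filesystem.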

Simple Storage Service (S3)

S3 stores files as objects within buckets. Buckets use a flat namespace and do not directly support a hierarchy. However, a folder structure can be simulated by using a delimiter such as / in object keys, as though they were paths. An object is made up of the file itself, metadata about the file, and a unique key. Once added to a bucket, an object cannot be modified in place (a change means uploading a replacement object), which is why S3 is described as write once, read many. Furthermore, every object is web-enabled, with its own URL. A bucket can even be configured to host a static website, with an index.html file and an error.html file specified in the bucket's configuration. Buckets are replicated across several of AWS's data centers for high availability and fault tolerance.
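To make the simulated folder structure concrete, the sketch below groups a flat list of keys by a prefix and delimiter, in the spirit of the Prefix and Delimiter options of an S3 list request. It runs on a plain Python list, not against S3, and the function name and keys are invented for the example.

```python
def list_folder(keys, prefix="", delimiter="/"):
    """Split the keys under a prefix into direct objects and simulated subfolders."""
    objects, folders = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to the next delimiter acts like a subfolder name.
            folders.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return sorted(objects), sorted(folders)


keys = ["index.html", "error.html", "img/logo.png", "img/icons/home.svg", "css/site.css"]

print(list_folder(keys))          # (['error.html', 'index.html'], ['css/', 'img/'])
print(list_folder(keys, "img/"))  # (['img/logo.png'], ['img/icons/'])
```

The bucket itself still holds five flat keys. The "folders" exist only in how the keys are grouped when listed.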

Buckets have no size limit. Instead, size limits apply to individual objects, whose maximum size is 5 TB. S3 also supports object versioning, which, when enabled on a bucket, retains previous versions of each object. An AWS account can create only a limited number of buckets by default (100 at the time of writing), but that quota can be increased by request.
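Versioning can be pictured with a small in-memory model: each upload to the same key adds a new version instead of discarding the old one. This is a toy sketch, not the S3 API. The class name is hypothetical, and real S3 version IDs are opaque strings, not the simple counters used here.

```python
class VersionedBucket:
    """Toy in-memory model of a bucket with versioning enabled."""

    def __init__(self):
        self._versions = {}  # key -> list of object bodies, oldest first

    def put(self, key, body):
        """Store a new version of the object and return its 1-based version number."""
        history = self._versions.setdefault(key, [])
        history.append(body)
        return len(history)

    def get(self, key, version=None):
        """Return the latest body, or a specific earlier version if requested."""
        history = self._versions[key]
        return history[-1] if version is None else history[version - 1]


bucket = VersionedBucket()
v1 = bucket.put("report.txt", b"draft")
v2 = bucket.put("report.txt", b"final")

print(bucket.get("report.txt"))      # b'final'
print(bucket.get("report.txt", v1))  # b'draft'
```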

Access permissions can be configured at the bucket level or the object level. The same EC2 instance can create and use multiple buckets as long as it has permission to do so. Unlike EBS volumes, buckets are not attached to a single instance.

S3 offers several storage classes for the objects in a bucket:

  1. Standard

    • Designed for data that is regularly accessed
    • Stored in 3 or more AZs
  2. Infrequent Access

    • Designed for data that isn't accessed regularly that must remain highly available
    • Discounts storage costs
    • Increases access costs
    • Can be discounted further by storing data in only a single AZ (One Zone-IA), which is recommended only if the data can be easily recreated
  3. Intelligent-Tiering

    • Great for objects with more dynamic access patterns
    • Introduces extra costs for automation and monitoring, charged on a per-object basis
    • Objects accessed frequently are kept in the frequent access tier, equivalent to Standard
    • Objects not accessed for 30 days are automatically moved to the infrequent access tier and moved back when they are accessed again, so objects switch between the equivalents of the two previous classes: Standard and Standard-Infrequent Access
  4. Glacier

    • Best for archived data
    • Has 3 variants that dictate how quickly objects can be retrieved
      • Instant Retrieval provides immediate access within milliseconds
      • Flexible Retrieval will retrieve objects within a few minutes to a few hours
      • Deep Archive retrieves objects within 12 hours
    • Across these variants, storage costs decrease as retrieval times increase
  5. Outposts

    • This class offers bucket creation on on-premises hardware, an AWS Outpost.
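The trade-offs in the list above can be summarized as a small decision helper. The sketch below is one possible reading of that list, not official guidance. The function name and the category labels ("frequent", "hours", and so on) are assumptions for this example, while the returned strings are the storage class identifiers used by the S3 API. Outposts is left out because it depends on where the bucket lives rather than on access patterns.

```python
def pick_storage_class(access, retrieval="instant", recreatable=False):
    """Suggest a storage class from rough access needs (hypothetical helper).

    access:      "frequent", "infrequent", "unknown", or "archive"
    retrieval:   for archives only: "instant", "hours", or "half-day"
    recreatable: True if the data could easily be rebuilt if lost
    """
    if access == "frequent":
        return "STANDARD"
    if access == "infrequent":
        # A single AZ is cheaper but only sensible for data that can be recreated.
        return "ONEZONE_IA" if recreatable else "STANDARD_IA"
    if access == "unknown":
        return "INTELLIGENT_TIERING"
    if access == "archive":
        glacier = {"instant": "GLACIER_IR", "hours": "GLACIER", "half-day": "DEEP_ARCHIVE"}
        return glacier[retrieval]
    raise ValueError(f"unknown access pattern: {access}")


print(pick_storage_class("frequent"))                      # STANDARD
print(pick_storage_class("infrequent", recreatable=True))  # ONEZONE_IA
print(pick_storage_class("archive", retrieval="half-day")) # DEEP_ARCHIVE
```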

Elastic File System (EFS)

EFS is a cloud-based shared file system. It can be accessed by multiple EC2 instances within the cloud, or by on-premises resources through Direct Connect, as long as they are in the same region. The file system uses the NFS protocol and integrates with IAM for access control. An EFS file system scales up and down automatically as files are added and removed.