AWS Notes

DynamoDB

Overview

DynamoDB is a serverless non-relational (NoSQL) database that stores data as tables of items, each with a unique primary key. Each item takes a JSON-like structure containing key-value pairs called attributes. Tables can be configured with secondary indexes for greater flexibility around which attributes can be used to access data.
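
To make the item structure concrete, here is a minimal sketch of one item in DynamoDB's typed attribute-value format, plus a small hand-written converter to plain Python values. The table, the attribute names, and the `to_plain` helper are hypothetical and only cover a handful of types; the type tags (`S`, `N`, `BOOL`, `L`, `M`) are the ones DynamoDB uses.

```python
from decimal import Decimal

# Hypothetical item for an assumed "Users" table whose primary key is "user_id".
# Type tags: "S" = string, "N" = number (transmitted as a string),
# "BOOL" = boolean, "L" = list, "M" = map.
item = {
    "user_id": {"S": "u-1001"},  # primary key attribute
    "name": {"S": "Ada"},
    "login_count": {"N": "42"},
    "is_active": {"BOOL": True},
    "tags": {"L": [{"S": "admin"}, {"S": "beta"}]},
    "address": {"M": {"city": {"S": "London"}}},
}


def to_plain(value):
    """Convert one typed attribute value to a plain Python value.

    Illustrative helper: handles only the five type tags used above.
    """
    (type_tag, raw), = value.items()  # each typed value has exactly one tag
    if type_tag in ("S", "BOOL"):
        return raw
    if type_tag == "N":
        return Decimal(raw)  # Decimal avoids float rounding surprises
    if type_tag == "L":
        return [to_plain(element) for element in raw]
    if type_tag == "M":
        return {name: to_plain(inner) for name, inner in raw.items()}
    raise ValueError(f"unsupported type tag: {type_tag}")


plain = {name: to_plain(value) for name, value in item.items()}
```

Here `plain` holds the same attributes as ordinary Python values, which is usually the more convenient shape for application code.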

The underlying infrastructure is managed by AWS, including auto-scaling storage, backups, and replication across multiple Availability Zones. Reads and writes typically complete in single-digit milliseconds, which can offer a significant performance advantage over a traditional relational database. Though it isn't perfect for every use case, this low latency is the primary strength of using a NoSQL database. Benefits around speed are due in part to DynamoDB not needing to handle complex queries such as joins or aggregations. Instead, everything is optimized around simple reads and writes. Additionally, horizontal scaling happens automatically to distribute workloads from increased traffic.

Use cases for DynamoDB include performance-critical applications that must scale to extreme volumes of read/write operations (potentially trillions of requests) and serverless applications. In general, a NoSQL database like DynamoDB is a good starting point whenever the relationships between data are less important. If the database itself needs to handle joins and aggregations, an SQL database would be a better fit.

DynamoDB Accelerator (DAX)

DAX is an in-memory caching service that runs on nodes in a DAX cluster. A cluster is deployed into a VPC and sits in front of DynamoDB. It caches frequently accessed data so that when a client requests an item the cluster already holds, a cache hit occurs and the response is sent without reading from the database. Items are cached when returned as results from GetItem or BatchGetItem requests and expire after 5 minutes by default. Much of the functionality is configurable, including how long to cache items or whether to cache them at all. Because frequently accessed items are held in memory, cached reads can return in microseconds rather than the single-digit milliseconds of a regular read to DynamoDB.
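
The read-through behaviour described above can be illustrated with a toy cache. This is a sketch of the general technique only, not DAX's implementation: the `ReadThroughCache` class, its `fetch_item` callback, and the in-memory `table` dictionary standing in for DynamoDB are all hypothetical.

```python
import time


class ReadThroughCache:
    """Toy read-through item cache with a time-to-live (TTL)."""

    def __init__(self, fetch_item, ttl_seconds=300, clock=time.monotonic):
        self._fetch_item = fetch_item  # called on a miss; stands in for a DynamoDB read
        self._ttl = ttl_seconds        # 300 s mirrors the 5-minute default expiry
        self._clock = clock            # injectable so expiry can be shown without sleeping
        self._entries = {}             # key -> (expires_at, item)
        self.hits = 0
        self.misses = 0

    def get_item(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1             # cache hit: no read against the backing store
            return entry[1]
        self.misses += 1               # cache miss or expired entry
        item = self._fetch_item(key)
        self._entries[key] = (now + self._ttl, item)
        return item


# Usage with a fake clock and a dictionary in place of a real table.
table = {"u-1001": {"name": "Ada"}}
fake_now = [0.0]
cache = ReadThroughCache(table.get, clock=lambda: fake_now[0])

first = cache.get_item("u-1001")   # miss: reads the backing store
second = cache.get_item("u-1001")  # hit: served from memory
fake_now[0] = 301.0                # move past the 300-second TTL
third = cache.get_item("u-1001")   # miss again: the entry has expired
```

Injecting the clock keeps the example deterministic; a real cache would also need eviction and a policy for writes, which this sketch leaves out.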

Interacting with Data

Operations

As a NoSQL database, DynamoDB has no native SQL query language. Instead, it exposes a set of API operations, one of which facilitates SQL-style queries. The 11 main data operations are:

  1. PutItem

    • Adds or replaces an item
  2. GetItem

    • Gets an item by its primary key
  3. UpdateItem

    • Updates attributes of an existing item, or creates the item if it does not already exist
  4. DeleteItem

    • Deletes a single item by its primary key
  5. Query

    • Gets multiple items from a specified table by specifying conditions
    • Requires an equality condition on the partition key, with an optional condition on the sort key; both are components of an item's primary key
  6. Scan

    • Gets all items or a subset of items from a table by specifying conditions
    • Allows conditions to be based on any attribute in a table's items
  7. BatchGetItem

    • Gets multiple items from one or more tables by their primary keys
  8. BatchWriteItem

    • Adds, replaces, or deletes multiple items in one or more tables in a single call
  9. ExecuteStatement

    • Simulates SQL queries using PartiQL statements
    • Can query, insert, update, or delete with an SQL-style syntax
  10. TransactWriteItems

    • Can perform multiple write operations in the same call
    • Acts as an all-or-nothing write where all operations will fail if one fails
  11. TransactGetItems

    • Can perform multiple read operations in the same call
    • Acts as an all-or-nothing read that returns all items or none of them
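
To show how a few of these operations differ in practice, the sketch below builds request parameters for Query, Scan, ExecuteStatement, and TransactWriteItems. The parameter names follow the DynamoDB API as exposed by the boto3 low-level client; the "Orders" table, its attributes, and the `build_*` helper functions are hypothetical. Nothing is sent to AWS here, so the block runs without credentials.

```python
def build_query(table_name, partition_key_name, partition_key_value):
    """Query: the key condition must match one partition key value."""
    return {
        "TableName": table_name,
        "KeyConditionExpression": "#pk = :pk",
        "ExpressionAttributeNames": {"#pk": partition_key_name},
        "ExpressionAttributeValues": {":pk": {"S": partition_key_value}},
    }


def build_scan(table_name, attribute_name, attribute_value):
    """Scan: the filter may reference any attribute, not just key attributes."""
    return {
        "TableName": table_name,
        "FilterExpression": "#attr = :val",
        "ExpressionAttributeNames": {"#attr": attribute_name},
        "ExpressionAttributeValues": {":val": {"S": attribute_value}},
    }


def build_partiql_select(table_name, key_name, key_value):
    """ExecuteStatement: a PartiQL SELECT with one positional parameter."""
    return {
        "Statement": f'SELECT * FROM "{table_name}" WHERE "{key_name}" = ?',
        "Parameters": [{"S": key_value}],
    }


def build_transact_write(table_name, items):
    """TransactWriteItems: several Put actions that succeed or fail together."""
    return {
        "TransactItems": [
            {"Put": {"TableName": table_name, "Item": item}} for item in items
        ]
    }


query_params = build_query("Orders", "customer_id", "c-42")
scan_params = build_scan("Orders", "status", "SHIPPED")
statement_params = build_partiql_select("Orders", "customer_id", "c-42")
transact_params = build_transact_write(
    "Orders",
    [
        {"customer_id": {"S": "c-42"}, "order_id": {"S": "o-1"}},
        {"customer_id": {"S": "c-42"}, "order_id": {"S": "o-2"}},
    ],
)

# Assuming boto3 is installed and credentials are configured, these could be sent as:
#   client = boto3.client("dynamodb")
#   client.query(**query_params)
#   client.scan(**scan_params)
#   client.execute_statement(**statement_params)
#   client.transact_write_items(**transact_params)
```

Note that the Scan filter is applied after items are read, so a Scan still reads the whole table, whereas the Query only reads items under the given partition key.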

Event-driven Streams

Streams can be enabled on a DynamoDB table to capture change data for items in near real time. This can be done using either DynamoDB Streams or Kinesis Data Streams for DynamoDB. When an item is added, updated, or deleted, the event is written to a time-ordered stream as a stream record. These records are organized into groups called shards and, in DynamoDB Streams, are retained for 24 hours (retention in Kinesis Data Streams is configurable). Streams are best used when applications must respond to specific database events, for example, when paired with Lambda so that a function fires after a specific DynamoDB event.
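
As a rough illustration of the Lambda pairing, the handler below walks a batch of stream records and reacts to the event type. The `Records`, `eventName`, and `dynamodb.Keys` fields follow the DynamoDB Streams event shape that Lambda receives; the handler's return value and the sample event, including the `user_id` key, are made up for the example.

```python
def handler(event, context):
    """Count change events by type and collect the keys of newly inserted items."""
    counts = {"INSERT": 0, "MODIFY": 0, "REMOVE": 0}
    inserted_keys = []
    for record in event.get("Records", []):
        event_name = record.get("eventName")
        if event_name not in counts:
            continue  # ignore anything that is not a known change type
        counts[event_name] += 1
        if event_name == "INSERT":
            # Keys holds the primary key of the changed item in typed format.
            inserted_keys.append(record["dynamodb"]["Keys"])
    return {"counts": counts, "inserted_keys": inserted_keys}


# Hypothetical event with one insert and one delete, for local experimentation.
sample_event = {
    "Records": [
        {
            "eventName": "INSERT",
            "dynamodb": {
                "Keys": {"user_id": {"S": "u-1001"}},
                "NewImage": {"user_id": {"S": "u-1001"}, "name": {"S": "Ada"}},
            },
        },
        {
            "eventName": "REMOVE",
            "dynamodb": {"Keys": {"user_id": {"S": "u-0007"}}},
        },
    ]
}

result = handler(sample_event, None)
```

Whether `NewImage` or `OldImage` appears in a record depends on the stream's view type, which is why the sketch relies only on `Keys`.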