DynamoDB Notes

21 Mar 2017 • document store key value nosql sql snippet

Scaling, optimization, etc. are all handled by DynamoDB

Types of NoSQL

Key-value databases
Document databases
Column-family databases
Graph databases

DynamoDB is a mix of key-value and document database

Data model

Items are like rows

Attributes are like columns

Keys -- single or composite

An item can only be up to 400 KB of data, counting attribute names as well as values

Creating tables

Keys

Hash Primary Key (called a partition key in current AWS docs)
- Unique, required, single attribute
- DynamoDB creates an unordered hash index on it
  - You can't run range or sorted queries on an unordered index, only exact-key lookups
- Use when you know the id/key
Hash and Range Primary Key (partition key plus sort key in current AWS docs)
- Key is composed of two attributes
- Combination must be unique
- Unordered index + sorted range index
- Best for query, grouping scenarios
- example: client_id as hash and order_id as range
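
The client_id / order_id example above could be written as a table definition roughly like this. It is a minimal sketch: the helper name table_definition, the table name "orders", the string key types, and the 5/5 throughput defaults are all assumptions for illustration. The dictionary keys follow the shape that boto3's create_table call accepts.

```python
def table_definition(table_name, hash_key, range_key=None, rcu=5, wcu=5):
    """Build keyword arguments shaped for boto3's client.create_table.

    Assumes every key attribute is a string ("S"); adjust for numbers ("N").
    """
    attribute_definitions = [{"AttributeName": hash_key, "AttributeType": "S"}]
    key_schema = [{"AttributeName": hash_key, "KeyType": "HASH"}]
    if range_key is not None:
        # A composite key adds a sorted range attribute next to the hash key.
        attribute_definitions.append(
            {"AttributeName": range_key, "AttributeType": "S"}
        )
        key_schema.append({"AttributeName": range_key, "KeyType": "RANGE"})
    return {
        "TableName": table_name,
        "AttributeDefinitions": attribute_definitions,
        "KeySchema": key_schema,
        "ProvisionedThroughput": {
            "ReadCapacityUnits": rcu,
            "WriteCapacityUnits": wcu,
        },
    }


orders = table_definition("orders", "client_id", "order_id")
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("dynamodb").create_table(**orders)
```

Leaving out range_key gives the hash-only variant, which suits lookups where the id is already known.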

Data Types

Scalar
- string, number, binary, boolean, null
Multi-value types
- String, number, and binary sets
Document types
- List (array)
- Map (Object) (should use this for OJ Stripe object)
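
To see how these types look on the wire, here is a rough sketch that converts plain Python values into DynamoDB's tagged attribute-value format ("S", "N", "BOOL", "NULL", "SS", "L", "M" and so on). The function name to_attribute_value and the sample charge object are made up for illustration; boto3 ships its own converter (boto3.dynamodb.types.TypeSerializer), which is probably the better choice in real code.

```python
from decimal import Decimal

NUMBER_TYPES = (int, float, Decimal)


def is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, NUMBER_TYPES) and not isinstance(value, bool)


def to_attribute_value(value):
    """Convert a Python value into DynamoDB's tagged attribute-value form."""
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):
        return {"BOOL": value}
    if is_number(value):
        return {"N": str(value)}  # numbers travel as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, bytes):
        return {"B": value}
    if isinstance(value, (set, frozenset)):
        if not value:
            raise ValueError("DynamoDB sets cannot be empty")
        if all(isinstance(v, str) for v in value):
            return {"SS": sorted(value)}
        if all(is_number(v) for v in value):
            return {"NS": sorted(str(v) for v in value)}
        if all(isinstance(v, bytes) for v in value):
            return {"BS": sorted(value)}
        raise TypeError("set members must all be strings, numbers, or bytes")
    if isinstance(value, (list, tuple)):
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_attribute_value(v) for k, v in value.items()}}
    raise TypeError(f"unsupported type: {type(value).__name__}")


# Hypothetical nested object stored as a single Map attribute.
charge = {"id": "ch_1", "amount": 1200, "paid": True, "refunds": [], "note": None}
charge_attribute = to_attribute_value(charge)
```

A nested object like this ends up as one Map attribute, so the whole thing still has to fit inside the per-item size limit.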

Throughput units

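Reads are metered in blocks of 4 KB
Writes are metered in blocks of 1 KB
Read capacity units per second: 1 unit = one strongly consistent read per second of an item up to 4 KB
Write capacity units per second: 1 unit = one write per second of an item up to 1 KB
Eventually consistent vs strongly consistent reads: eventually consistent reads cost half as much
Impact of secondary indexes: writes that touch indexed attributes also consume write capacity for the index updates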
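
The block sizes above turn into capacity numbers with a bit of rounding. This is a small sketch of that arithmetic; the function names are my own, and it ignores index overhead and anything beyond the basic 4 KB / 1 KB rules.

```python
import math

READ_BLOCK_BYTES = 4 * 1024   # reads are metered in 4 KB blocks
WRITE_BLOCK_BYTES = 1024      # writes are metered in 1 KB blocks


def read_capacity_units(item_bytes, reads_per_second, strongly_consistent=False):
    """Read capacity units needed to sustain the given read rate."""
    units_per_read = math.ceil(item_bytes / READ_BLOCK_BYTES)
    total = units_per_read * reads_per_second
    if not strongly_consistent:
        total = total / 2  # eventually consistent reads cost half
    return math.ceil(total)


def write_capacity_units(item_bytes, writes_per_second):
    """Write capacity units needed to sustain the given write rate."""
    return math.ceil(item_bytes / WRITE_BLOCK_BYTES) * writes_per_second


# A 6 KB item rounds up to two 4 KB read blocks.
strong = read_capacity_units(6 * 1024, 10, strongly_consistent=True)
eventual = read_capacity_units(6 * 1024, 10)
# A 1.5 KB item rounds up to two 1 KB write blocks.
writes = write_capacity_units(1536, 5)
```

Item sizes round up per item, so many small items just over a block boundary cost noticeably more than their raw size suggests.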

Partitioning

A single partition can hold ~10 GB of data
Throughput is spread evenly across partitions
DynamoDB automatically partitions based on the hash key
Partitions are limited to about 3000 read capacity units and 1000 write capacity units
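DynamoDB will never merge partitions back together, so lowering throughput later spreads it over the same number of partitions
Unused capacity is reserved for bursts (up to about 5 minutes' worth)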
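
The limits above suggest a rule of thumb for how many partitions a table ends up with, and therefore how thin the throughput gets sliced. This is only an estimate built from the numbers in these notes (10 GB, 3000 RCU, 1000 WCU per partition); the real partitioning is internal to DynamoDB and the function names are mine.

```python
import math

MAX_PARTITION_GB = 10
MAX_PARTITION_RCU = 3000
MAX_PARTITION_WCU = 1000


def estimate_partitions(rcu, wcu, size_gb):
    """Estimate partition count from provisioned throughput and table size."""
    by_throughput = math.ceil(rcu / MAX_PARTITION_RCU + wcu / MAX_PARTITION_WCU)
    by_size = math.ceil(size_gb / MAX_PARTITION_GB)
    return max(by_throughput, by_size, 1)


def per_partition_throughput(rcu, wcu, size_gb):
    """Throughput each partition gets if capacity is split evenly."""
    partitions = estimate_partitions(rcu, wcu, size_gb)
    return rcu / partitions, wcu / partitions


# 5000 RCU and 2000 WCU on an 8 GB table: throughput drives the count.
partitions = estimate_partitions(5000, 2000, 8)
share = per_partition_throughput(5000, 2000, 8)
```

The per-partition share is what a single hot hash key can actually use, which is why the table-level numbers can look fine while one key is being throttled.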

Table design

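Avoid hot keys
- A hot key concentrates traffic on one partition, which gets throttled while the others sit idle
- Need uniform access across hash keys
- Add a random extension (suffix) to the hash key to spread writes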
Time series data in multiple tables
- put into tables based on monthly or weekly data
Test applications ahead of time
Store large items elsewhere (for example in S3, keeping only a pointer in the item)
Use caching solutions for popular items
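
Two of the ideas above, the random hash key extension and the per-month tables, are easy to sketch. Everything here is illustrative: SHARD_COUNT, the "#" separator, and the table naming pattern are assumptions, not anything DynamoDB prescribes.

```python
import random
from datetime import date

SHARD_COUNT = 10  # hypothetical number of suffixes per hot key


def sharded_key(hash_key, rng=random):
    """Append a random suffix so writes for one key spread over partitions."""
    return f"{hash_key}#{rng.randrange(SHARD_COUNT)}"


def all_shard_keys(hash_key):
    """Every suffixed key a reader must query to see all of the data."""
    return [f"{hash_key}#{n}" for n in range(SHARD_COUNT)]


def monthly_table(prefix, day):
    """Name of the table holding time series data for the given day."""
    return f"{prefix}_{day.year:04d}_{day.month:02d}"


write_key = sharded_key("client_42")
read_keys = all_shard_keys("client_42")
table_name = monthly_table("events", date(2017, 3, 21))
```

The trade-off with random suffixes is on the read side: getting everything for one logical key means one query per suffix. With monthly tables, old months can be given lower throughput or dropped entirely.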

Querying

GetItem
- uses primary key
- eventually consistent by default
Query
- Find items via primary key attribute(s) for a table or index
- Retrieved in sorted order when using range key
Scan
- Reads every item in a table or index
- slow as table grows
- you can run parallel scans
Filters and pagination support for query or scan (filters are applied after the items are read, so they do not reduce consumed capacity)
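
Pagination works by passing the LastEvaluatedKey of one page back as the ExclusiveStartKey of the next request. A minimal sketch of that loop follows; query_all is a made-up helper, and query_fn stands for anything with the same call shape as boto3's client.query or client.scan. The table and key names reuse the client_id example from earlier.

```python
def query_all(query_fn, **params):
    """Yield every item, following LastEvaluatedKey from page to page."""
    while True:
        page = query_fn(**params)
        yield from page.get("Items", [])
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            return  # no more pages
        params["ExclusiveStartKey"] = last_key


# Request parameters for "all orders of one client", sorted by the range key.
orders_for_client = {
    "TableName": "orders",
    "KeyConditionExpression": "client_id = :c",
    "ExpressionAttributeValues": {":c": {"S": "client_42"}},
    "ConsistentRead": True,  # opt in to a strongly consistent read
}
# With boto3 installed and credentials configured:
#   items = list(query_all(boto3.client("dynamodb").query, **orders_for_client))
```

The same loop works for Scan, which makes it obvious why scans get slow: every page is a full read of that slice of the table.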

Secondary Indexes

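Alternate keys for querying and scanning
Up to 5 of each type allowed per source table at the time of writing (the default global secondary index quota is now 20)
Contains all or a subset of attributes
Automatically maintained
Uses provisioned throughput
- A global index has its own throughput capacity units; a local index shares the table's
No size limit on global indexes; local indexes cap each hash key at 10 GB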

Global Secondary Index

For querying non-key attributes
Hash key or hash and range key
- different view into the data
Projected attributes get copied into the index
Query or scan only (no GetItem), and reads are always eventually consistent
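
A global secondary index definition could look roughly like this. The helper global_index and the attribute names (status, created_at, total) are hypothetical; the dictionary shape follows the GlobalSecondaryIndexes entries that boto3's create_table accepts. Any attribute used as an index key also has to be listed in the table's AttributeDefinitions.

```python
def global_index(name, hash_key, range_key=None, include=None, rcu=5, wcu=5):
    """Build one GlobalSecondaryIndexes entry for a create_table request."""
    key_schema = [{"AttributeName": hash_key, "KeyType": "HASH"}]
    if range_key is not None:
        key_schema.append({"AttributeName": range_key, "KeyType": "RANGE"})
    if include:
        # Copy only the named non-key attributes into the index.
        projection = {
            "ProjectionType": "INCLUDE",
            "NonKeyAttributes": list(include),
        }
    else:
        projection = {"ProjectionType": "KEYS_ONLY"}
    return {
        "IndexName": name,
        "KeySchema": key_schema,
        "Projection": projection,
        # A global index is provisioned separately from the table.
        "ProvisionedThroughput": {
            "ReadCapacityUnits": rcu,
            "WriteCapacityUnits": wcu,
        },
    }


by_status = global_index(
    "orders_by_status", "status", "created_at", include=["total"]
)
```

Projecting fewer attributes keeps the index smaller and cheaper to write, at the cost of going back to the table for anything that was left out.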

Local Secondary Index

Alternate range key for hash key
For each hash key, 10GB max
Must be defined when creating the table; cannot be added later
Basically a different sort order over the same hash key
Projected attributes copied into the index
Query or scan only
Throughput taken from the main table throughput
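
For comparison, a local secondary index reuses the table's hash key and only swaps the range key. A small sketch, where local_index and the created_at attribute are made up for illustration; the entry has no throughput setting of its own, matching the note above.

```python
def local_index(name, table_key_schema, alternate_range_key, projection_type="ALL"):
    """Build one LocalSecondaryIndexes entry for a create_table request."""
    # The hash key must be the same one the table uses.
    hash_key = next(
        k["AttributeName"] for k in table_key_schema if k["KeyType"] == "HASH"
    )
    return {
        "IndexName": name,
        "KeySchema": [
            {"AttributeName": hash_key, "KeyType": "HASH"},
            {"AttributeName": alternate_range_key, "KeyType": "RANGE"},
        ],
        "Projection": {"ProjectionType": projection_type},
    }


orders_key_schema = [
    {"AttributeName": "client_id", "KeyType": "HASH"},
    {"AttributeName": "order_id", "KeyType": "RANGE"},
]
by_date = local_index("orders_by_date", orders_key_schema, "created_at")
```

This would let a query fetch one client's orders sorted by date instead of by order_id.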