Databases on AWS

Michael Wittig – 30 Apr 2020

Andy Jassy, CEO of AWS, proclaimed #DBFreedom, aka use whatever database you like. AWS offers them all. At least, that’s what AWS marketing wants us to understand.

In the real world, AWS offers a wide variety of databases for different use cases. Your job is to pick the right solution for your problem. Knowing all the options improves the quality of your architectural decisions. In this blog post, I introduce all the database options that AWS offers.

This is a cross-post from the Cloudcraft blog.

Amazon Relational Database Service (RDS)

Amazon RDS provides traditional relational databases operated by AWS. You can create a new database with the click of a button, wait 5-15 minutes, and you are ready to go. AWS takes care of patching, monitoring, backups, high-availability (HA) deployments, and read replicas.

The following engines are supported:

  • MySQL
  • MariaDB
  • PostgreSQL
  • Oracle Database
  • Microsoft SQL Server

Use RDS if no other database is a better fit or if you are doing a lift-and-shift migration.

Amazon Aurora (Serverless)

Amazon Aurora is also part of RDS, but it deserves a closer look. Aurora is a proprietary database engine developed by Amazon. The core of the technology is a unique storage layer that makes it possible to scale relational databases horizontally without hassle. The following figure demonstrates the replicated storage layer and the horizontally scalable database instance.

[Figure: Amazon Aurora]

Aurora provides a MySQL- or PostgreSQL-compatible database that is easy to scale.

If you go with the Serverless offering, you get a database that scales in and out depending on load. Does your workload run on MySQL or PostgreSQL? Give Aurora Serverless a try!

Amazon DynamoDB

Amazon DynamoDB is a NoSQL database with virtually unlimited scaling, both in terms of storage and queries per second. The downside is that DynamoDB is not a relational database and does not support SQL. You can think of it as a document or key-value store if you are familiar with those concepts. When you create a data model, you have to work backward from the queries: first list the queries your application needs, then design the keys so that each query can be answered efficiently. The following figure shows a Serverless application that uses DynamoDB as a data store.

[Figure: Amazon DynamoDB]

You interact with DynamoDB via the AWS API. Usually, you use one of the AWS SDKs to call the API from your programming language of choice.
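To make "working from the queries backward" a bit more concrete, here is a minimal sketch in Python. It assumes one access pattern ("all orders of a customer in a given year") and derives a key design from it. The table name `shop`, the attribute names `pk` and `sk`, and the key prefixes are all made up for illustration. The functions only build the parameters of a DynamoDB Query call as plain dictionaries, so the code runs without AWS credentials.

```python
from typing import Any, Dict


def order_key(customer_id: str, order_date: str) -> Dict[str, Dict[str, str]]:
    """Primary key of one order item (hypothetical key design).

    The partition key groups all orders of a customer, the sort key
    orders them by date so that a date prefix can be queried.
    """
    return {
        "pk": {"S": f"CUSTOMER#{customer_id}"},
        "sk": {"S": f"ORDER#{order_date}"},
    }


def orders_of_customer_request(customer_id: str, year: str) -> Dict[str, Any]:
    """Parameters for a Query call: all orders of a customer in one year."""
    return {
        "TableName": "shop",  # hypothetical table name
        "KeyConditionExpression": "pk = :pk AND begins_with(sk, :prefix)",
        "ExpressionAttributeValues": {
            ":pk": {"S": f"CUSTOMER#{customer_id}"},
            ":prefix": {"S": f"ORDER#{year}"},
        },
    }


request = orders_of_customer_request("42", "2020")
print(request)

# With boto3 installed and credentials configured, the request could be sent as:
#   import boto3
#   boto3.client("dynamodb").query(**request)
```

The point of the sketch is the order of decisions: the query came first, and the key layout follows from it. A second access pattern might need a different key design or a secondary index.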

Amazon DocumentDB

Amazon DocumentDB provides a MongoDB-compatible database hosted by AWS. DocumentDB is powered by Aurora storage technology. MongoDB is a document database that can be used as a primary database.

Not all features of MongoDB 3.6 are supported. If you are looking for a real MongoDB, check out MongoDB Atlas.

Typical use cases include more complex data models, as found in business applications.

Amazon ElastiCache

Amazon ElastiCache provides in-memory caches operated by AWS. Choose between Redis and Memcached, two popular open-source in-memory databases. You can expect features similar to those RDS provides: high availability, snapshots, and many more. Caches are usually not used as the primary data source. Instead, a cache is used to offload reads from the database.

Typical use cases are caching, low-latency and read-intensive lookups, and volatile data (e.g., a session store).

[Figure: Amazon ElastiCache]

Cached data can become outdated, so a cache is not always consistent with the primary data source. In most cases, the performance benefits are worth it.
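The trade-off between offloading reads and serving stale data can be sketched in a few lines of Python. This is an illustration only: `TtlCache` is a tiny in-memory stand-in for Redis or Memcached, the dictionary `db` plays the role of the primary database, and all names are invented for this example.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class TtlCache:
    """Tiny in-memory stand-in for a cache such as Redis or Memcached."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._items: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired entries are dropped on read
            return None
        return value

    def set(self, key: str, value: str) -> None:
        self._items[key] = (self._clock() + self._ttl, value)


def read_through(cache: TtlCache, key: str,
                 load_from_db: Callable[[str], str]) -> str:
    """Answer from the cache if possible, otherwise ask the database."""
    value = cache.get(key)
    if value is None:
        value = load_from_db(key)
        cache.set(key, value)
    return value


db: Dict[str, str] = {"user:1": "Alice"}
db_reads: List[str] = []


def load(key: str) -> str:
    db_reads.append(key)  # count how often the database is hit
    return db[key]


now = [0.0]  # fake clock so the example does not need to sleep
cache = TtlCache(ttl_seconds=60, clock=lambda: now[0])

first = read_through(cache, "user:1", load)   # cache miss, reads the database
db["user:1"] = "Bob"                          # the primary data changes
second = read_through(cache, "user:1", load)  # cache hit, value is outdated
now[0] = 61.0                                 # the entry expires
third = read_through(cache, "user:1", load)   # cache miss, fresh value
```

Three reads hit the database only twice, and the second read returns an outdated value until the entry expires. A shorter time-to-live reduces staleness but offloads fewer reads.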

Amazon Elasticsearch Service

Amazon Elasticsearch Service provides Elasticsearch hosted by AWS. Elasticsearch is a document store with a search engine. Usually, Elasticsearch is not used as the primary database. Instead, data is replicated into Elasticsearch to provide search functionality to end-users.

[Figure: Amazon Elasticsearch Service]

Due to some licensing conflicts with Elastic, Amazon provides the Open Distro for Elasticsearch, which is a flavor of Elasticsearch. Don’t be confused by Open Distro. It’s the same as Elasticsearch, but open-source implementations replace the commercial plugins.

Typical use case: full-text search for documents as well as faceted search for an online shop.
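As a rough sketch of what such a request might look like, the following Python function builds a search body that combines a full-text `match` query with a `terms` aggregation for the facets. The field names `description` and `brand` are assumptions for a hypothetical shop index, and `brand` is assumed to be mapped as a keyword field. The body would be sent to the `_search` endpoint of the index. The code only builds and prints the JSON.

```python
import json
from typing import Any, Dict


def shop_search_body(text: str, facet_field: str = "brand") -> Dict[str, Any]:
    """Build a search body: full-text match plus one facet."""
    return {
        # Full-text search on the (hypothetical) description field.
        "query": {"match": {"description": text}},
        # Facet: number of matching documents per value of facet_field.
        "aggs": {"by_" + facet_field: {"terms": {"field": facet_field}}},
        # Return at most ten documents alongside the facet counts.
        "size": 10,
    }


body = shop_search_body("waterproof jacket")
print(json.dumps(body, indent=2))
```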

Amazon Redshift

Amazon Redshift provides a data warehouse managed by AWS. Redshift can deal with up to 8 PB of data! It’s a relational database that supports SQL for queries. Redshift is good at inserting large data chunks in one shot. It doesn’t like to receive frequent but small updates.

The typical use case is a data warehouse.
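Because Redshift prefers large chunks over frequent small updates, a common approach is to collect small writes and load them together. Here is a minimal sketch of that idea in Python. `MicroBatcher` and `load_chunk` are invented names, and the load step is only simulated with a list. In a real setup, the load step might write the chunk to a file on S3 and then run a Redshift COPY statement, but that part is an assumption and not shown here.

```python
from typing import Callable, List, Tuple

Row = Tuple[str, float]


class MicroBatcher:
    """Buffers small writes and hands them over in larger chunks."""

    def __init__(self, batch_size: int,
                 load_chunk: Callable[[List[Row]], None]) -> None:
        self._batch_size = batch_size
        self._load_chunk = load_chunk
        self._buffer: List[Row] = []

    def add(self, row: Row) -> None:
        self._buffer.append(row)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        """Load whatever is buffered, e.g. on shutdown or on a timer."""
        if self._buffer:
            self._load_chunk(self._buffer)
            self._buffer = []  # start a new buffer, the chunk stays intact


loaded_chunks: List[List[Row]] = []  # stands in for the data warehouse
batcher = MicroBatcher(batch_size=3, load_chunk=loaded_chunks.append)

for i in range(7):
    batcher.add((f"event-{i}", float(i)))
batcher.flush()  # load the remainder

chunk_sizes = [len(chunk) for chunk in loaded_chunks]
print(chunk_sizes)
```

Seven small writes turn into three load operations. In practice, the batch size would be far larger than three.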

Amazon Neptune

Amazon Neptune provides a graph database operated by AWS. Neptune supports Gremlin and SPARQL to interact with the data. Graph databases shine when your data is highly interconnected. Graphs can be used to answer questions such as “people like you bought this” or “friends of friends like that”.

The typical use case is highly interconnected data.
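To illustrate what a "friends of friends like that" question involves, here is a small in-memory sketch in Python. It does not use Neptune at all: the graph is held in plain dictionaries, and the people and interests are made up. With Neptune, a query language such as Gremlin or SPARQL would express the same hops as a query against the database.

```python
from typing import Dict, Set

# Edges of a tiny, made-up social graph.
friends: Dict[str, Set[str]] = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave"},
    "carol": {"alice", "erin"},
    "dave": {"bob"},
    "erin": {"carol"},
}
likes: Dict[str, Set[str]] = {
    "bob": {"cooking"},
    "dave": {"hiking"},
    "erin": {"chess", "hiking"},
}


def friends_of_friends(person: str) -> Set[str]:
    """People two hops away, excluding the person and direct friends."""
    direct = friends.get(person, set())
    two_hops: Set[str] = set()
    for friend in direct:
        two_hops |= friends.get(friend, set())
    return two_hops - direct - {person}


def liked_by_friends_of_friends(person: str) -> Set[str]:
    """Things that friends of friends like."""
    items: Set[str] = set()
    for other in friends_of_friends(person):
        items |= likes.get(other, set())
    return items


print(sorted(friends_of_friends("alice")))
print(sorted(liked_by_friends_of_friends("alice")))
```

Each additional hop in a relational database typically means another join. A graph database is built to follow such edges directly.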

Amazon Quantum Ledger Database (QLDB)

Amazon QLDB provides a ledger database. This is a new category of databases. You cannot alter or delete the history of your data in QLDB: every change is appended to an immutable journal. QLDB goes one step further: you can cryptographically verify that the data has not changed. That makes it a perfect fit for a system that deals with financial transactions that must never change. QLDB comes with zero operational effort for you. AWS takes care of everything.
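The idea behind cryptographic verification can be shown with a simple hash chain. The following Python sketch is not how QLDB is implemented, and `Journal` is an invented class. It only demonstrates the principle: each entry's digest depends on the previous digest, so changing an old entry breaks every check that follows.

```python
import hashlib
import json
from typing import Any, Dict, List


def entry_digest(previous_digest: str, document: Dict[str, Any]) -> str:
    """SHA-256 over the previous digest and the document content."""
    payload = previous_digest + json.dumps(document, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class Journal:
    """Append-only list of documents, chained by digests."""

    GENESIS = "0" * 64  # digest used before the first entry

    def __init__(self) -> None:
        self.entries: List[Dict[str, Any]] = []

    def append(self, document: Dict[str, Any]) -> str:
        previous = self.entries[-1]["digest"] if self.entries else self.GENESIS
        digest = entry_digest(previous, document)
        self.entries.append({"document": document, "digest": digest})
        return digest

    def verify(self) -> bool:
        """Recompute the chain and compare it with the stored digests."""
        previous = self.GENESIS
        for entry in self.entries:
            if entry_digest(previous, entry["document"]) != entry["digest"]:
                return False
            previous = entry["digest"]
        return True


journal = Journal()
journal.append({"account": "A", "amount": 100})
journal.append({"account": "A", "amount": -30})
valid_before = journal.verify()

# Simulate tampering with an old entry behind the journal's back.
journal.entries[0]["document"]["amount"] = 1000000
valid_after = journal.verify()

print(valid_before, valid_after)
```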

Amazon Keyspaces (for Apache Cassandra)

Amazon Keyspaces provides a Cassandra-compatible database. Keyspaces comes with a Serverless offering with built-in auto-scaling. All you have to do is query the database. Cassandra is known as a wide-column database with a proven track record. Be warned, Cassandra is not easy to use!

Previously known as Amazon Managed Apache Cassandra Service (MCS).

Amazon Timestream

Amazon Timestream provides a time-series database. Time-series data is all about time. Good fits are sensor data, stock market data, FX rates, and so on. Whenever tuples of time and value are stored, and your queries deal with time spans, this database might be of value. Unfortunately, little information about Timestream has been published so far. Timestream is in private preview and, therefore, not ready for most of us and certainly not for production workloads!
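Since Timestream's interface is not documented yet, the following Python sketch says nothing about Timestream itself. It only illustrates the data shape described above: tuples of time and value, kept sorted by time, and queried by time span. The class `TimeSeries` and the sample readings are invented for this example.

```python
from bisect import bisect_left, bisect_right, insort
from typing import List, Optional, Tuple

Point = Tuple[int, float]  # (unix timestamp in seconds, value)


class TimeSeries:
    """Keeps (time, value) tuples sorted by time."""

    def __init__(self) -> None:
        self._points: List[Point] = []

    def add(self, timestamp: int, value: float) -> None:
        insort(self._points, (timestamp, value))

    def between(self, start: int, end: int) -> List[Point]:
        """All points with start <= timestamp <= end."""
        lo = bisect_left(self._points, (start, float("-inf")))
        hi = bisect_right(self._points, (end, float("inf")))
        return self._points[lo:hi]

    def average(self, start: int, end: int) -> Optional[float]:
        values = [value for _, value in self.between(start, end)]
        return sum(values) / len(values) if values else None


temperature = TimeSeries()
# Readings may arrive out of order; insort keeps them sorted.
temperature.add(1120, 23.0)
temperature.add(1000, 20.0)
temperature.add(1180, 22.0)
temperature.add(1060, 21.0)

span = temperature.between(1060, 1120)
mean = temperature.average(1060, 1120)
print(span, mean)
```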

Comparison

The following table provides a comparison of the different database options on AWS.

Database                                | Max. Data Volume    | Interface                  | Replication                   | DB Model
----------------------------------------|---------------------|----------------------------|-------------------------------|------------------------
RDS / Aurora / Serverless               | 64 TiB              | SQL                        | None, Multi-AZ, Multi-Region  | Relational
DynamoDB                                | Unlimited           | AWS API                    | Multi-AZ, Multi-Region        | Key-value, Document
DocumentDB                              | 64 TB               | Subset of MongoDB API      | Multi-AZ                      | Document
ElastiCache                             | 155 TiB             | Redis/Memcached API        | Multi-AZ                      | Key-value
Elasticsearch Service                   | 3 PB                | Elasticsearch API          | Multi-AZ                      | Document, Search engine
Redshift                                | 8 PB                | SQL                        | Multi-AZ                      | Relational
Neptune                                 | 64 TB               | Subset of Gremlin & SPARQL | Multi-AZ                      | Graph
QLDB                                    | Unlimited           | Subset of PartiQL          | Multi-AZ                      | Ledger
Amazon Keyspaces (for Apache Cassandra) | Unlimited           | Subset of CQL              | Multi-AZ                      | Wide column
Timestream*                             | Not documented yet. | Not documented yet.        | Not documented yet.           | Time series

* In Preview: Not for production workloads!

Michael Wittig

I’m an independent consultant, technical writer, and programming founder. All these activities have to do with AWS. I’m writing this blog and all other projects together with my brother Andreas.

In 2009, we joined the same company as software developers. Three years later, we were looking for a way to deploy our software—an online banking platform—in an agile way. We got excited about the possibilities in the cloud and the DevOps movement. It’s no wonder we ended up migrating the whole infrastructure of Tullius Walden Bank to AWS. This was a first in the finance industry, at least in Germany! Since 2015, we have accelerated the cloud journeys of startups, mid-sized companies, and enterprises. We have penned books like Amazon Web Services in Action and Rapid Docker on AWS, we regularly update our blog, and we are contributing to the Open Source community. Besides running a 2-headed consultancy, we are entrepreneurs building Software-as-a-Service products.

We are available for projects.

You can contact me via Email, Twitter, and LinkedIn.