AWS year in review
If you want to provide an excellent AWS consulting service you need to stay up-to-date with all the new AWS stuff (450 announcements in 2015). As an AWS Cloud Consultant it’s part of my job to read all AWS announcements and evaluate the new features as early as possible.
In this post I have aggregated and commented on the new stuff released in 2015 for you. To organize this a little bit I grouped it into 9 categories (hottest stuff at the top):
- Compute
- Storage & Content Delivery
- Analytics
- Application Services
- Databases
- Networking
- Internet of Things
- Developer & Management Tools
- Security & Identity
I wish you a happy new year, stay AWSome in 2016!
Compute
Serverless code execution
AWS Lambda was announced in 2014 but it became available to all of us in 2015. I award Lambda with the hottest new service award! This will likely be a game changer!
You can use Lambda to run code written in:
- Node.js
- JVM (Java, Scala, …)
- Python
A Lambda function can be triggered by:
- S3 bucket notifications
- Amazon DynamoDB Streams: in response to a database change
- Amazon CloudWatch logs: in response to new logs
- Amazon Simple Email Service: in response to a received email
- Amazon Kinesis streams: in response to new records in your stream
- Amazon SNS: in response to a pub/sub message
- Amazon API Gateway: when calling an HTTP REST endpoint
- Scheduled events
- manually by invoking the API
- …
Lambda is often called serverless because you no longer need to worry about (virtual) machines. You only provide the code and AWS executes it.
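To make "you only provide the code" concrete, here is a minimal sketch of a Python function reacting to an S3 bucket notification. The function name `handler` and the bucket and key in the usage note are my own placeholders, and the code is written for a current Python 3 runtime; only the `Records` / `s3` layout comes from the S3 notification format.

```python
import urllib.parse


def handler(event, context):
    """Return the s3:// URLs of all objects mentioned in an S3 notification."""
    urls = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in the notification (a space becomes "+").
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        urls.append("s3://{}/{}".format(bucket, key))
    return urls
```

Called with a notification for the hypothetical object `photos/my+cat.png` in `my-bucket`, the function returns the decoded URL of that object. A real function would of course do something with the object instead of just returning its location.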
Docker containers
The newly launched Amazon EC2 Container Service provides everything between raw EC2 instances and a running Docker container. It integrates nicely with EBS, IAM and ELB. The EC2 Container Registry makes it easy for developers to store, manage, and deploy Docker container images.
Spot instances
Besides on-demand and reserved EC2 instances you can also bid on spare EC2 capacity in availability zones (data centers). The AWS spot market is not new but in 2015 AWS made spot instances useful for many more use cases.
- If the market price goes above your bid price you now have up to two minutes to save your data before the instance is terminated: EC2 Spot Instance Termination Notices.
- The new EC2 Spot Bid Advisor tells you how likely it is that your bid price will fall below the market price by analyzing the spot market price history.
- You can now use Spot Instance Fleets to bid on compute power (cores, memory, …) instead of a single instance in a specific availability zone. The Spot Instance Fleet will fulfill your resource demand by mixing instance types and availability zones while using the lowest available spot prices.
- With the newly added EC2 Spot Instances for Specific Duration Workloads you can now bid for spot instances but with the guarantee that they are not terminated during your defined duration block (1, 2, …, 5, 6 hours). Spot blocks are priced differently than spot instances (6 hours guarantee is more expensive than 1 hour guarantee).
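The termination notice from the first bullet shows up in the instance metadata, so a process on the instance can poll for it. Below is a rough sketch of such a check. The function names and the injectable `fetch` argument are my own; the metadata path is the one used for spot termination notices.

```python
import urllib.request

# This metadata path only returns a value once the instance is marked for termination.
TERMINATION_URL = "http://169.254.169.254/latest/meta-data/spot/termination-time"


def fetch_termination_time(url=TERMINATION_URL, timeout=2):
    """Return the announced termination timestamp, or None if there is no notice."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8").strip()
    except OSError:
        # HTTP 404 (no notice yet), timeouts and connection errors are all OSError subclasses.
        return None


def should_shut_down(fetch=fetch_termination_time):
    """True once a termination notice is present; `fetch` is injectable for testing."""
    return fetch() is not None
```

Run `should_shut_down()` every few seconds from a small daemon or cron-like loop and start saving your data as soon as it returns `True`.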
New instances: C4, D2, M4, g2.8xlarge, t2.large, t2.nano
- C4 instances are the latest generation of compute optimized instances
- D2 instances are the latest generation of HDD storage optimized instances
- M4 instances are the latest generation of general purpose instances
- t2.large is the new high end burstable performance instance (8.0 GiB RAM) while t2.nano is the smallest (0.5 GiB RAM)
Auto Recovery
Auto Recovery for Amazon EC2 solves the problem of replacing a single EC2 instance if the underlying hardware fails, without changing the instance id, private IP address, EIP addresses (not the public IP!), EBS volume attachments, and other configuration details. Auto Recovery can be triggered by a CloudWatch alarm to automate the whole process. My personal opinion: better use an Auto Scaling Group with Min/Max/Desired set to one and don’t use fixed IP addresses at all!
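If you do want Auto Recovery, the CloudWatch alarm could look roughly like the sketch below. It only builds the parameters; the helper name, the alarm name, and the period and evaluation numbers are my own choices. The result is meant to be passed to the CloudWatch `put_metric_alarm` call, for example with boto3.

```python
def recovery_alarm_params(instance_id, region):
    """Build CloudWatch alarm parameters that recover an instance on system failure."""
    return {
        "AlarmName": "auto-recover-{}".format(instance_id),  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Minimum",
        "Period": 60,            # assumption: evaluate one-minute data points
        "EvaluationPeriods": 2,  # assumption: two failed minutes in a row
        "Threshold": 0,
        "ComparisonOperator": "GreaterThanThreshold",
        # The recover action is addressed by a special ARN instead of an SNS topic.
        "AlarmActions": ["arn:aws:automate:{}:ec2:recover".format(region)],
    }
```

With boto3 installed and credentials configured, something like `boto3.client("cloudwatch").put_metric_alarm(**recovery_alarm_params("i-12345678", "us-east-1"))` would create the alarm; the instance id here is a placeholder.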
Load Balancers
- You can now attach or detach Elastic Load Balancers in your Auto Scaling Group on-the-fly.
- Elastic Load Balancing now supports all ports (1-65535).
Storage & Content Delivery
Network attached storage
You can now create EBS Provisioned IOPS volumes that can store up to 16 TB, and process up to 20,000 IOPS. You can also create Amazon EBS General Purpose (SSD) volumes that can store up to 16 TB, and process up to 10,000 IOPS. These volumes are designed for five 9s of availability and up to 320 megabytes per second of throughput when attached to EBS optimized instances.
Content Delivery
- New CDN edge locations in Seoul, Korea and Chicago, Illinois.
- You can now invalidate multiple objects by providing invalidation patterns like *.png* or even *.
- You can now configure default & max TTL in case your origin does not provide proper headers.
- CloudFront now supports Gzip compression at the edge.
- You can now add or modify request headers forwarded from CloudFront to your origin.
Object store and archival
- Introduction of cross-region replication to copy S3 objects into another region
- Amazon Glacier Vault Access Policies provide a second way to grant access to a Glacier vault besides IAM.
- You can now meet regulatory storage requirements with Amazon Glacier Vault Lock by setting a “write once read many” policy.
- Lower Glacier prices
- A new Amazon S3 storage class (Standard - Infrequent Access) where the per GB price is lower but you pay an additional retrieval fee.
Analytics
Data stream
Amazon Kinesis is a continuous data stream. It is real-time and elastic and you can use it to reliably deliver any amount of data to your mission-critical applications. Many producers write to the stream on the one side while a bunch of consumers read on the other side typically in batches for efficient data processing.
- Amazon Kinesis reduced the time between inserting and retrieving data from seconds to “no time”.
- Amazon Kinesis Streams can now remember data up to 7 days.
- The Amazon Kinesis Client Library is now available for Ruby and Node.js developers.
Machine Learning
The new service Amazon Machine Learning provides binary classification (spam/no spam), multi-class classification (buy, sell, do nothing), or regression models (temperature prediction). To create a model you need data to train with. This training data can be stored in S3, Redshift or RDS. Your data contains one column that provides the answer to your prediction question (e.g. spam/no spam) and other columns that are used to predict. One part of your training data is then used by Amazon Machine Learning to create the model while the other part is used to cross-validate your model: the service feeds the columns used for prediction into the model and compares each prediction with the known answer column. If the model satisfies your needs you can use it to do batch or real-time predictions with new data.
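The split-and-compare idea can be sketched in a few lines of plain Python. This is not how Amazon Machine Learning is implemented, just an illustration of the principle; the function names, the 70/30 default and the fixed seed are my own assumptions.

```python
import random


def split_training_data(rows, train_share=0.7, seed=42):
    """Shuffle the rows and split them into a training part and an evaluation part."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_share)
    return shuffled[:cut], shuffled[cut:]


def accuracy(predict, rows, answer_column):
    """Share of rows where predict(features) equals the known answer."""
    hits = 0
    for row in rows:
        # Hide the answer column from the model, then compare its guess with it.
        features = {k: v for k, v in row.items() if k != answer_column}
        if predict(features) == row[answer_column]:
            hits += 1
    return hits / len(rows)
```

You would train on the first part, then call `accuracy` with the trained model's predict function and the second part to see how well the model does on data it has never seen.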
Hadoop Ecosystem
The new version 4.2 of the managed “Hadoop”/“Big Data” service Elastic Map Reduce provides:
- Apache Hadoop 2.6.0
- Apache Hive 1.0.0
- Apache Pig 0.14.0
- Apache Spark 1.5.2
- Hue 3.7.1
- Ganglia 3.6.0
- Apache Mahout 0.11.0
Application Services
Managed HTTP API backend
With Amazon API Gateway you can build and run scalable application backends. API Gateway provides an HTTP API endpoint that is fully configurable. You define the HTTP resources (like /user), the HTTP methods on those resources (like POST, GET, DELETE, …) and the integration (e.g. Lambda function) that should be called to process the request. The Lambda function can then run whatever logic is needed to answer the request. The result of the Lambda function is returned by the API Gateway to the caller.
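A Lambda function behind a /user resource could look roughly like this. The event fields `httpMethod`, `userId` and `body` are assumptions: they are whatever your own API Gateway mapping passes to the function, and the in-memory `USERS` dict only stands in for a real data store.

```python
USERS = {}  # in-memory stand-in for a real data store (lost between cold starts)


def handler(event, context):
    """Dispatch on the HTTP method that the (hypothetical) mapping passes in."""
    method = event["httpMethod"]
    user_id = event["userId"]
    if method == "POST":
        USERS[user_id] = event.get("body", {})
        return {"status": "created", "userId": user_id}
    if method == "GET":
        return {"status": "ok", "user": USERS.get(user_id)}
    if method == "DELETE":
        USERS.pop(user_id, None)
        return {"status": "deleted", "userId": user_id}
    raise ValueError("Unsupported method: {}".format(method))
```

Whatever the function returns is what API Gateway hands back to the caller, so the return values double as the response bodies.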
Managed Elasticsearch
Besides the proprietary search solution called Amazon CloudSearch, AWS added Amazon Elasticsearch to their service portfolio. Amazon Elasticsearch comes with the following plugins:
- Kibana 3 & 4
- jetty
- cloud-aws
- kuromoji
- icu
Receive emails with SES
Amazon Simple Email Service is not a new service but in 2015 AWS added inbound email functionality. You are now able to forward incoming emails to an SNS topic or call a Lambda function directly. You can now create pretty cool features like comment by email.
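As a starting point for comment by email, a Lambda function called by SES could pull the sender and subject out of the event. This is only a sketch: the function name and the shape of the returned dicts are my own, and the event passed to Lambda carries the mail headers, not the message body.

```python
def handler(event, context):
    """Collect sender and subject of every received email in the SES event."""
    comments = []
    for record in event["Records"]:
        mail = record["ses"]["mail"]
        comments.append({
            "from": mail["source"],
            # commonHeaders holds the parsed headers; the subject may be missing.
            "subject": mail["commonHeaders"].get("subject", ""),
        })
    return comments
```

To get at the text of the comment you would additionally have to store the message somewhere (for example in S3) and read it from there.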
Databases
RDS
- Encryption using keys managed in AWS Key Management Service
- New maximum database storage size up to 6TB
- Updated / new engines:
  - Amazon Aurora: A new MySQL-compatible & proprietary database engine made by Amazon
  - MariaDB: A new engine made by the original developers of MySQL (10.0)
  - Oracle Database: 12.1 (EE, SE, SE One, SE Two)
  - PostgreSQL: 9.4
  - Microsoft SQL Server: 2014 Express, Web and Standard Editions
  - MySQL: 5.6
ElastiCache
- Updated engines:
  - ElastiCache for Redis: 2.8.23
  - ElastiCache for Memcached: 1.4.24
DynamoDB
- Enhanced metrics for better operational insight are now provided every minute (instead of every 5): This means you can react to provisioning shortages earlier.
- DynamoDB Streams was added. A stream contains all the changes to a table and can be read in time sequence. Typical use-cases are cross-region replication and replicating changes to Elasticsearch / ElastiCache.
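Replicating changes from a stream boils down to looking at each record's event name and key. The sketch below shows a Lambda function doing just that; the "upsert" / "delete" step names are my own, and a real function would write to the target system instead of returning a list.

```python
def handler(event, context):
    """Translate DynamoDB stream records into (action, keys) replication steps."""
    steps = []
    for record in event["Records"]:
        keys = record["dynamodb"]["Keys"]
        if record["eventName"] == "REMOVE":
            steps.append(("delete", keys))
        else:
            # INSERT and MODIFY both mean: write the current item to the target.
            steps.append(("upsert", keys))
    return steps
```

Because the records arrive in time sequence, applying the steps in order keeps the replica consistent with the table.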
Amazon Redshift
- Now supports Scalar User-Defined Functions in Python.
- Now supports modifying cluster accessibility (VPC, public).
- Now supports specifying sort order for NULL values.
- Now supports tag-based permissions (with IAM) and default access privileges (inside the database).
- Now supports BZIP2 compression format.
Networking
VPC network debugging
VPC flow logs can report allowed and/or denied traffic on an Elastic Network Interface (ENI) level. Traffic can be allowed/denied by either Security Groups or ACLs. Flow logs are stored in CloudWatch logs and can be searched and monitored like CloudWatch logs. I find VPC flow logs very useful when troubleshooting network problems in VPCs.
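When troubleshooting, I usually only care about the rejected traffic. A flow log record is a single line of space-separated fields, so a small parser is enough. This is a sketch assuming the default record format; the names `FlowRecord`, `parse_flow_record` and `rejected` are my own.

```python
from collections import namedtuple

# Field order of a default-format flow log record.
FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]
FlowRecord = namedtuple("FlowRecord", FIELDS)


def parse_flow_record(line):
    """Split one flow log line into named fields (all values stay strings)."""
    parts = line.split()
    if len(parts) != len(FIELDS):
        raise ValueError("Expected {} fields, got {}".format(len(FIELDS), len(parts)))
    return FlowRecord(*parts)


def rejected(lines):
    """Return only the records whose traffic was denied."""
    return [r for r in map(parse_flow_record, lines) if r.action == "REJECT"]
```

Feeding it the exported log lines gives you a quick list of who tried to talk to which port and was blocked by a Security Group or ACL.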
VPC pain point: NAT
Private subnets that use a NAT instance to access the internet have their pitfalls. AWS released a Managed NAT gateway to tackle the problem. The gateway has built-in redundancy for high availability. Each gateway that you create can handle up to 10 Gbps of bursty TCP, UDP, and ICMP traffic, and is managed by Amazon. The 10 Gbps is still a pitfall but at least you don’t need to worry about HA any more.
DNS with Amazon Route 53
- Calculated health checks are logical combinations of health checks.
- Latency measurement health checks allow you to switch your traffic if one endpoint is too slow.
- Route 53 Traffic Flow is a tool to graphically design your weights and health checks to control the flow of your traffic.
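The "logical combination" behind calculated health checks can be thought of as a threshold over the child checks. The function below is only a model of that idea in plain Python, not Route 53 code; the name `calculated_health` and its arguments are my own.

```python
def calculated_health(child_states, threshold):
    """Healthy if at least `threshold` of the child health checks are healthy."""
    return sum(1 for healthy in child_states if healthy) >= threshold
```

A threshold equal to the number of children behaves like a logical AND, a threshold of 1 behaves like a logical OR, and anything in between gives you "at least N of M".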
Internet of Things
re:Invent 2015 focused heavily on IoT. A lot of use cases and customers were shown. The message is clear: AWS is, or is trying to become, the place to store and evaluate data from any kind of IoT device or service. To achieve that, AWS offers a managed cloud platform called AWS IoT. An important part of this platform is an MQTT broker that makes it easy to collect data from IoT devices over the Internet.
Developer & Management Tools
Simple Systems Manager
Executing scripts or commands on all your EC2 instances (Linux & Windows) is now integrated into the AWS ecosystem (including IAM permissions). The new Amazon EC2 Simple Systems Manager enables you to remotely manage the configuration of your Amazon EC2 instance. See how Remote Instance Management at Scale works.
AWS Elastic Beanstalk
Added support for:
- Go
- Docker 1.5 (also Multi-Container environments)
- Ruby 2.2
- Node.js 0.12.0
- PHP 5.6
- Generic Java Applications
CloudFormation
- Now supports additional parameter types like Security Group id, Subnet id, and many more.
- With the new CloudFormation Designer you can visually author templates.
- The AWS Marketplace now supports AWS CloudFormation templates besides AMI images.
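The new parameter types from the first bullet are declared in the Parameters section of a template. Here is a small fragment, built as a Python dict and rendered to JSON. The parameter names, descriptions and the `render` helper are placeholders of mine, and a complete template would also need a Resources section.

```python
import json

# Template fragment: only the Parameters section, no resources.
template_fragment = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Parameters": {
        "WebSecurityGroup": {  # hypothetical parameter name
            "Type": "AWS::EC2::SecurityGroup::Id",
            "Description": "Security group for the web servers",
        },
        "WebSubnet": {  # hypothetical parameter name
            "Type": "AWS::EC2::Subnet::Id",
            "Description": "Subnet to launch the web servers in",
        },
    },
}


def render(template):
    """Serialize the template dict to pretty-printed JSON."""
    return json.dumps(template, indent=2, sort_keys=True)
```

The nice thing about these types is that the values are validated against existing resources in your account when the stack is created, instead of failing later with a cryptic error.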
Security & Identity
Following the principle of least privilege
To help you follow the principle of least privilege the IAM service now lets you quickly identify when an access key was last used, so you can delete keys that are no longer used.
To keep your managed policies, users, groups and roles as strict as possible the Access Advisor tab now shows which services are allowed and when the last access was recorded. If you see services without access these are candidates for removal!
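Once you have the last-used dates, finding deletion candidates is a simple filter. The sketch below assumes you have already collected a mapping from key id to last-used timestamp (for example via the IAM `get_access_key_last_used` call); the function name and the 90-day default are my own choices.

```python
from datetime import datetime, timedelta


def unused_keys(last_used_by_key, now, max_idle_days=90):
    """Return the ids of keys never used or not used within `max_idle_days`."""
    cutoff = now - timedelta(days=max_idle_days)
    return sorted(
        key_id
        for key_id, last_used in last_used_by_key.items()
        # None means the key has no recorded usage at all.
        if last_used is None or last_used < cutoff
    )
```

Review the result before deleting anything: a key that is only used once a quarter will show up here as well.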
Web Application Firewall (WAF)
AWS added a managed WAF service that integrates with CloudFront. A WAF analyzes HTTP traffic to block malicious traffic like SQL injection or traffic from a certain IP source. This is an interesting service for clients who are not confident enough that their web applications handle input validation properly. WAFs are popular in the enterprise IT security world, but I recommend validating user input properly and building an elastic architecture for your application.
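To illustrate what handling input properly means for SQL injection, here is a tiny sketch using Python's built-in sqlite3 module. The `users` table and the `find_user` function are made up for the example; the point is the `?` placeholder, which keeps user input out of the SQL text.

```python
import sqlite3


def find_user(connection, name):
    """Look up users by name with a bound parameter instead of string formatting."""
    cursor = connection.execute(
        "SELECT id, name FROM users WHERE name = ?",
        (name,),  # the driver passes this as data, never as SQL
    )
    return cursor.fetchall()
```

A classic injection attempt like `alice' OR '1'='1` is then treated as a (non-existing) user name and simply matches nothing, with or without a WAF in front of the application.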
Additional notes
If you want to stay up to date I can recommend Jeff Barr’s weekly and official Week in Review (~ 50 blog posts a year) or the official What’s New from Amazon Web Services page (~ 450 entries a year).
I have not collected the new stuff for Mobile Services and Enterprise Applications because I don’t use this stuff at the moment.