Adrian Cantrill AWS Solutions Architect Associate Course
Introduction
Site Tools and features
Scenario - Animals4Life
- Animals4life.org is used throughout the course to model a real-world situation to show use cases
- Animals4life is an animal rescue and awareness org that rehouses and monitors animals, using IoT and big data
- global company
- call center/admin/IT/Marketing/Legal & accounts
- 200 staff, half remote, half onsite in Australia
- also has lobbying staff and major offices in London, New York, Seattle
- Currently have small on prem data center in Brisbane
- They previously had a bad AWS Trial in Sydney Aus
- Also tried other cloud services
- All remote workers rely on infrastructure in AUS
- company is cost conscious
- needs to be available 24/7
Problems
- On-prem hardware is failing and must be replaced in 18 months
- Previous AWS/Azure experiences were messy and did not follow best practice
- Performance issues exist for field workers
- Lack of high-availability systems and scalability
- Small IT team with little automation/cloud experience
- global expansion concerns - cost for new infrastructure
Ideal Outcomes
- fast performance for field workers
- able to deploy into new regions quickly when required
- low cost and scalable base infrastructure
- Agility - the ability to spin up new marketing campaigns, social and progressive applications (IoT, big data, etc.) is important
- Automation - low base staffing costs
AWS Accounts - The Basics
An AWS account is a container for identities and AWS resources
- when creating an account you need a name, a unique email address, and a credit card
- This creates an account root user
- AWS is billed on a pay-as-you-go basis
- this course will use free-tier as much as possible
- Account root user has full control over the AWS account
- you can create sub-identities with restricted permissions using Identity and Access Management (IAM): IAM users, groups, and roles
- permissions must be explicitly granted
- the orange line in the below picture is the boundary of the account, it helps keep things out and things in
Setting it up
Cost Management
- Setting up a budget
IAM Identity and Access Management
Each AWS account has a root user with full unrestricted access
IAM lets us grant access to different users following the principle of least privilege
IAM identities start with no permissions and can be granted permissions
Each account comes with its own IAM database.
IAM is a globally resilient service - any data is always secure across all AWS regions
The IAM you see is your own dedicated instance of IAM. Your account fully trusts its instance of IAM.
Inside IAM you can create different identities.
IAM Basics
IAM lets you create three different types of identity objects:
- IAM users: represents humans or applications that need access to your account
- IAM groups: Collection of related users, eg dev team, finance, HR
- IAM roles: can be used by AWS services or for granting external access to your account.
IAM policies are documents that can be used to allow or deny access to AWS services only when attached to users, groups, or roles
IAM has three main jobs:
- It manages identities - it's an ID provider
- It authenticates the identities it manages
- It authorizes access based on policies
IAM is free, there are no costs associated with creating users, groups, or roles
there are limits on the quantities of users etc
IAM is a global service and is globally resilient
IAM allows or denies its identities on its AWS account
There is no direct control over external accounts or users
IAM lets you make use of identity federation and MFA
normal practice is to replace the root user with an IAM admin identity
IAM Access Keys
command line access is done via IAM access keys
they are a long-term credential.
They have to be explicitly changed
An IAM user has 1 username and 1 password
An IAM user can have 0,1, or 2 keys which can be created, deleted, made inactive, or active and are defaulted to active
Access keys are made from two parts:
- the Access key id: shorter
- the secret access key: longer, more complex
The only time you can get the secret access key is when you create it
Rotating access keys is when you create a new access key and delete the old ones
IAM users are the only IAM identity that uses long-term access keys
Creating IAM access keys
This is done through the security console
you can deactivate and reactivate keys
The AWS CLI must be configured with credentials before it can be used, for example with a named profile:
aws configure --profile iamadmin-general
our region - us-east-1
AWS Fundamentals
AWS Public vs Private Services
AWS services can be categorized into two types:
- Public: Accessed through public endpoints from anywhere, e.g. S3
- Private: Runs within a VPC(Virtual Private Cloud). Only things in the VPC can access
These refer to networking only
When thinking of networks, people generally think in two zones:
- Public Internet zone
- Private networks: Like home private network - only things connected to a port in your house can operate in the personal network
AWS has private zones called VPCs. Nothing from the internet can reach a VPC unless it's configured to allow outside connections
The AWS public zone runs between the public internet zone and the AWS private zone.
AWS Global Infrastructure
AWS is a collection of individual infrastructure worldwide and consists of AWS regions and AWS edge locations
Regions
Regions don't map directly onto continents or countries. Each region contains a full deployment of AWS infrastructure (compute, storage, database, and other services)
we can use this concept to design systems that are resilient to global disasters
Regions have 3 main benefits:
- Geographic Separation: Each region is geographically separate. Disasters in one region won't impact other regions
- Geopolitical Separation: you can isolate yourself from different governance
- Location Control: Allows you to tune your architecture for performance
Regions are usually referred to by the region code or the region name, e.g. ap-southeast-2 vs Asia Pacific (Sydney)
Availability Zone
A lower level component that gives isolated infrastructure within a region
As a solutions architect we can deploy across multiple availability zones
Edge Locations
Edge locations are much smaller than regions and typically only have content distribution services as well as some types of edge computing.
They are more prevalent than regions
How To Define Service Resilience
Service resilience can be described in one of 3 ways:
- Globally Resilient: Relatively few of these, operates globally with a single product replicated across multiple regions. It would take the world to fail to experience a full outage. Eg IAM
- Region Resilient: Services that operate in a single region with one set of data per region. A db in Sydney is different from a db in N. Virginia. They normally replicate to multiple Availability Zones
- Availability Zone resilient services: if the AZ fails, the service will fail
Virtual Private Cloud Basics (VPC)
(Comes Up A Lot On Exams)
A VPC is a virtual network inside of AWS. When you create a VPC it's created inside 1 account and 1 region. They are regionally resilient
By default, VPC's are private and isolated unless configured otherwise
Services deployed in the VPC can communicate
Two types of VPC:
- default: max 1 per region
- Custom VPC: can have many in a region - Used in almost all serious deployments
Default VPC's are configured in a very specific way
Every VPC is allocated a range of IP addresses (VPC CIDR)
Default VPC always has the same CIDR (172.31.0.0/16)
A VPC can be divided into subnets and each subnet is assigned to one availability zone. The default VPC creates one subnet in each availability zone
These subnets also determine the start and end IP addresses
Default VPC facts
- Max one per region - can be removed and recreated
- Default VPC CIDR is always 172.31.0.0/16
- /20 subnet in each availability zone in the region
- Provided with an Internet Gateway (IGW), Security Group(SG), and NACL
- Subnets assign public IPv4 addresses
- default VPCs can be deleted
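The CIDR facts above can be checked with Python's standard `ipaddress` module. This is a minimal sketch for study purposes; the variable names are illustrative, and which /20 block AWS assigns to which AZ is not modelled here.

```python
import ipaddress

# Default VPC CIDR from the notes above
default_vpc = ipaddress.ip_network("172.31.0.0/16")

# Split the /16 into /20 blocks, the size used for default subnets
subnets = list(default_vpc.subnets(new_prefix=20))

print(len(subnets))              # number of /20 blocks that fit in the /16
print(subnets[0].num_addresses)  # addresses in each /20 block
for net in subnets[:3]:
    # network, first address, last address
    print(net, net[0], net[-1])
```

A /16 holds sixteen /20 blocks of 4,096 addresses each, so a default VPC with one subnet per AZ only uses part of the range.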
EC2 Elastic Compute Cloud Basics
Anything you need to deploy that needs compute requirements should be done on EC2
EC2 Key Facts & Features
- IAAS: Infrastructure as a Service. Provides Virtual Machines -> EC2 Instances
- Unit of consumption is instance
- Private service by default, uses VPC networking (By default runs in private zones).
- EC2 is Availability Zone resilient. The instance fails if the AZ fails
- Different instance sizes and capabilities are available. Some of these can be changed afterward
- Because EC2 is IaaS, the user manages the OS and upwards
- Billing is ON DEMAND either per second or per hour. You only pay for what you consume
- Can use local on-host storage or Elastic Block Store (EBS) which is network storage made available to the instance
Instance Lifecycle
An instance has a state attribute
The most important to remember right now are:
- Running: When an instance is launched after provisioning it goes into running state
- Stopped: A shut down instance
- Terminated: One way change. Non-reversible and the instance is fully deleted
These states influence the charges
An instance is composed of:
- CPU
- Memory
- Storage
- Networking
When an instance is in the running state, you're charged for all 4 of these items
When it is in a stopped state, you are only charged for storage
Terminating an instance is the only way to fully stop all charges
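The relationship between instance state and charges can be written down as a small lookup table. This is a simplified study model of the rules stated above, not an AWS pricing API; the names `BILLED_COMPONENTS` and `billed_components` are made up for illustration.

```python
# Which instance components are billed in each state, per the notes above.
# Simplified model: real bills depend on instance type, storage type, etc.
BILLED_COMPONENTS = {
    "running": {"cpu", "memory", "storage", "networking"},
    "stopped": {"storage"},
    "terminated": set(),
}


def billed_components(state: str) -> set[str]:
    """Return the components that incur charges in the given state."""
    return BILLED_COMPONENTS[state]


for state in ("running", "stopped", "terminated"):
    print(state, sorted(billed_components(state)))
```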
Amazon Machine Image (AMI)
an AMI is an image of an EC2 instance and can be used to create an EC2 instance, or be created from an EC2 instance
An AMI is similar to a server image and contains a few important things:
- Attached Permissions: Which accounts can use the AMI (Public, Owner-Implicitly Allowed, Those with explicit permissions to use the AMI)
- Contains the Boot/Root Volume of the Instance
- Contains Block Device Mapping: Config that links the volumes to the device IDs the instance sees, i.e. which is boot and which is data
Connecting to EC2
EC2 can run different operating systems and you connect via different methods. For instance, Windows uses RDP on port 3389, whereas Linux uses SSH on port 22 with an SSH key
S3 Basics
S3 aka simple storage service
S3 is a global storage platform and is region based/resilient and can be accessed from anywhere
Your data doesn't leave the region unless you configure it to
Data is replicated across AZ's
You choose the region when you create things in s3
S3 is a public service.
It's good for hosting large amounts of data
It scales from nothing to near unlimited levels
Can be accessed from:
- gui
- cli
- api
- http
S3 should be your default storage service for AWS
S3 has two main things it delivers:
- objects: Objects are individual items, a movie, a photo etc
- buckets: A bucket is a container for objects
Objects
you can think of objects like files.
An object is made up of two things:
- An object key: think of this as similar to a file name. The key identifies an object in a bucket
- Value: the data/contents of the object. The value of an object can vary from 0 bytes up to 5 TB in size
Objects also have a version id, metadata, access control, subresources
Buckets
Buckets are created in a specific AWS region.
your data in a bucket has a primary home region and it doesn't leave that region unless you configure it to leave that region.
By creating a bucket in a region, you can control the laws and regulations that apply to that data
A bucket is identified by its name and the bucket name needs to be globally unique
Most AWS resource names only need to be unique within a region or within your account. Bucket names are different.
A bucket is infinitely expandable
A bucket has a flat structure, all objects in the bucket are stored at the same level. There are no folders
However, the UI displays it similarly to folders
Inside S3, there is no concept of file type based on the name.
"Folders" in S3 appear when an object key contains a path-like name such as /old/koala3.jpg. The UI presents this as a folder called "old" and inside of that koala3.jpg.
Folders are often referred to as 'prefixes' in S3
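How a flat set of keys ends up looking like folders can be sketched in plain Python. The function `list_level` and all keys except the koala3.jpg example are hypothetical; the idea is similar in spirit to listing a bucket with a prefix and a delimiter, but this is only an illustration and does not call S3.

```python
def list_level(keys, prefix="", delimiter="/"):
    """Group flat object keys the way a folder-style UI would at one level."""
    folders, objects = set(), []
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to the next delimiter is shown as a "folder"
            folders.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return sorted(folders), sorted(objects)


# Every object lives at the same level; only the key names contain "/"
keys = ["koala1.jpg", "old/koala3.jpg", "old/koala4.jpg", "new/2024/koala5.jpg"]
print(list_level(keys))                 # top level of the bucket
print(list_level(keys, prefix="old/"))  # inside the "old" prefix
```

Nothing is nested in storage here: the "folders" are computed from the key strings at display time.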
Buckets are the default place to configure the way S3 works
Bucket Names are Globally Unique
3-63 characters, all lower case, no underscores
must start with a lowercase letter or a number
can't be formatted like an IP address
Buckets 100 soft limit, 1000 hard limit per account
buckets can be "divided" using prefixes
Unlimited objects in a bucket, 0-5tb
key = name, value = data
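The naming rules listed above can be turned into a quick local check. This sketch covers only the rules in these notes, plus the assumption that dots and hyphens are the only allowed non-alphanumeric characters; AWS applies further rules, and global uniqueness can only be checked by AWS itself. The function names are made up for illustration.

```python
import ipaddress
import re

# Rules from the notes: 3-63 characters, lower case, no underscores,
# starts with a lowercase letter or a number.
# Assumption: dots and hyphens are the only other characters allowed.
_NAME_RE = re.compile(r"[a-z0-9][a-z0-9.-]{2,62}")


def looks_like_ip(name: str) -> bool:
    """True if the name is formatted like an IPv4 address."""
    try:
        ipaddress.IPv4Address(name)
        return True
    except ValueError:
        return False


def is_plausible_bucket_name(name: str) -> bool:
    """Check a name against the subset of rules listed in the notes."""
    return bool(_NAME_RE.fullmatch(name)) and not looks_like_ip(name)


for candidate in ("koalacampaigna4l-13184151", "My_Bucket", "192.168.5.4", "ab"):
    print(candidate, is_plausible_bucket_name(candidate))
```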
S3 Patterns and Anti-Patterns
- S3 is an object store - not a file or block storage
- You cannot mount S3 buckets as a drive /images or k:\
- great for large scale data storage or distribution or upload
- great for 'offloading' things
- INPUT and or OUTPUT to MANY AWS services
All resources in AWS have a unique identifier: ARN = Amazon Resource Name.
arn:aws:s3:::koalacampaigna4l-13184151
CloudFormation (CFN) Basics
CloudFormation is a tool that lets you create, update, and delete AWS resources based on a template
A CloudFormation Template is written in either JSON or YAML
All templates have a list of resources that tells CloudFormation what to do
The description area is used to give useful information that should be relayed to users. If you have both a description and an AWS format version, the description must directly follow the format version
Metadata: Can control how things are presented in the UI (groupings, order, labels, etc)
Parameters: You can add fields here that prompt the user for more information such as "which size of instance to create" or "which region"
Mappings: Another optional section. Allows you to create look up tables
Conditions: Allow for decision making in the template such as things that will only occur if a condition is met. Ie "if the parameter is set to prod, do something specific"
Outputs: presented when a template is done being applied, such as the instance ID of an instance
How it works
CloudFormation starts with a template.
When you give a template to CloudFormation, it creates a stack out of all of the logical resources from the template.
For any logical resources in the stack, CloudFormation makes a physical resource in your AWS account
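To make the template sections concrete, here is a minimal sketch that builds a JSON template as a Python dictionary. The logical resource name `StudyBucket` and the description text are hypothetical; the template declares one logical resource, which CloudFormation would turn into one physical S3 bucket when a stack is created from it.

```python
import json

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    # The description directly follows the format version
    "Description": "Study sketch: a single S3 bucket",
    "Resources": {
        # "StudyBucket" is the logical resource name (hypothetical)
        "StudyBucket": {"Type": "AWS::S3::Bucket"},
    },
    "Outputs": {
        # Ref on a bucket resolves to the physical bucket name
        "BucketName": {"Value": {"Ref": "StudyBucket"}},
    },
}

rendered = json.dumps(template, indent=2)
print(rendered)
```

The printed JSON could be saved to a file and supplied to CloudFormation as a template; only the Resources section is mandatory.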
CloudWatch Basics
CloudWatch is a core AWS support product and is used by almost all AWS services
CloudWatch is a product that collects and manages operational data on your behalf
CloudWatch is 3 main products in one:
- Metrics - AWS products, apps, on-premises systems (e.g. CPU usage, traffic, etc.)
- CloudWatch Logs - Collection of logging data from AWS, apps, on-premises systems... anything that can be ingested
- CloudWatch Events: AWS services and schedules. If an AWS service does something, e.g. an EC2 instance is terminated or started, CloudWatch can react. You can also trigger events on a schedule
Can do things like automatically scale EC2 for demand
CloudWatch Core Concepts
Namespace: think of it like a container for monitoring data
- Name: can be anything that follows the namespace naming rules
- All AWS data goes into a namespace called AWS/ followed by the service name, e.g. AWS/EC2
- Metric: A collection of related data points in a time ordered structure, e.g. CPU usage, disk usage, in/out, disk I/O
- Datapoint: One measurement of a metric. For example, any time the CPU usage is measured, it is a data point and includes a timestamp and a value
- Dimension: Dimensions separate datapoints for different things or perspectives within a metric, e.g. CPU usage from different EC2 instances. Dimensions often include instance ID and instance type
- Alarms: Linked to a specific metric and takes an action based on that metric. An alarm means that the metric is not in a good state, and you can define an action that takes place based on an alarm
Shared Responsibility Model
Most of AWS is in the IaaS column where the red is AWS responsibility, and the Blue is the Customer perspective
AWS is responsible for the security OF the cloud, whereas you are responsible for the security IN the cloud.
AWS: Hardware/Global Infrastructure in regions, Availability Zones, and Edge Locations, compute, storage, DB, networking, and software that assists in these items
Customer: OS Upwards, Client Side data, encryption, network traffic protection. OS Config, firewall, platform, applications, identity, access management and customer data
High-Availability (HA) vs Fault-Tolerance (FT) vs Disaster Recovery
High-Availability
Aims to ensure an agreed level of operation performance, usually uptime, for a higher than normal period.
HA isn't aiming to stop failure. It's just meant to be online as often as possible, and when it fails, its components can be fixed or replaced as quickly as possible. It is about maximizing a system's online time.
Say you have a requirement of 99.9% uptime...this means you can have only 8.77 hours of downtime per year
Think of this like an SUV going into the desert with a spare tire
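The 99.9% figure above is simple arithmetic, sketched here so other targets can be checked the same way. The function name is illustrative, and the year length of 365.25 days is an assumption (an average year including leap years).

```python
HOURS_PER_YEAR = 365.25 * 24  # assumption: average year, including leap years


def allowed_downtime_hours(availability_percent: float) -> float:
    """Hours of downtime per year permitted by an uptime target."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


for target in (99.9, 99.99, 99.999):
    print(f"{target}% -> {allowed_downtime_hours(target):.2f} hours/year")
```

For 99.9% this gives roughly 8.77 hours per year, matching the number in the notes; each extra nine cuts the allowance by a factor of ten.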
Fault-Tolerance
Fault tolerance is the property that enables a system to continue operating properly in the event of the failure of some (one or more) of its components
Designed to work through failure
Disaster Recovery
A set of policies, tools, and procedures to enable the recovery or continuation of vital technology infrastructure and systems following a natural or human-induced disaster.
DR is designed to keep the crucial parts of your system safe
High-Availability - Minimize Any Outage
Fault-Tolerance - Operate through Faults
Disaster Recovery - Used when the above two don't work
Route53 Fundamentals
Route53 is AWS's managed DNS service
Provides two main services:
- Register Domains:
- Host Zones on managed nameservers
Route53 is a global service on a single database
Replicated through regions so it is globally resilient
Register Domains
Has relationships with all major domain registries
Hosted Zones
Zone Files hosted in AWS on four managed name servers. Can be public and accessible from anywhere. Can also be private and only accessible from within a vpc
A hosted zone stores records (record sets)
DNS Record Types
A&AAAA
Given a DNS zone, these records map host names to IP addresses. A record maps to IPv4, AAAA maps to IPv6
CNAME
Canonical Name records let you create the equivalent of DNS shortcuts. They are host to host records. Let's say we have an A record called server pointing to an IPv4 address. CNAMEs for ftp, mail, and www can then all point to the A record (this often shows up as a trick question in exams)
TXT
Text records allow you to add arbitrary information to a domain. This can be something such as proving domain ownership.
MX
MX records are used as part of email services. MX records have two main parts, a priority and a value. The value can be just a host, in which case it is assumed to be inside the same zone, but if it ends with a '.' it's a fully qualified domain name, which could be inside the zone or outside the zone. A sending mail server looks at the "to" address and then does an MX query on that address's domain.
Lower values on the priority field are actually higher priority
The server gets the result of the query back then connects to the other server via SMTP. MX records facilitate email
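The priority rule is easy to get backwards, so here is a small sketch of how a sending server might order the MX records it gets back. The record values are hypothetical, and real mail servers add retry and fallback logic that is not shown.

```python
from typing import NamedTuple


class MXRecord(NamedTuple):
    priority: int
    host: str


def delivery_order(records):
    """Lowest priority value first: that server is the most preferred."""
    return sorted(records, key=lambda r: r.priority)


# Hypothetical records: "mail" is relative to the zone,
# "mail.other.domain." ends with a dot so it is fully qualified
records = [MXRecord(20, "mail.other.domain."), MXRecord(10, "mail")]
print([r.host for r in delivery_order(records)])
```

The record with priority 10 is tried before the one with priority 20, which is then available as a fallback.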
NS
Name server (NS) records are how delegation happens in DNS. A parent zone such as .com holds NS records that point at the name servers hosting a domain's zone. This is how delegation works end to end in DNS
TTL (Time to Live)
Measured in seconds
Using TTL we can indicate how long records are cached for
IAM, Accounts, and AWS Organizations
IAM Identity Policies
IAM policies get attached to IAM identities in AWS
Policies allow you to either allow or deny access to AWS resources
Identity Policies are created using JSON
A policy block is just one or more "statements"
AWS knows which resource or resources you are trying to interact with, then works through all the statements and sees which apply to a particular identity
Statements
A statement must match the action and resource
- Sid - Statement ID: Informs the reader what the statement does. Optional, but best practice is to always include it
- Effect: Either "Allow" or "Deny". It applies when the request matches the statement's action and resource
- Action: "service:operation", e.g. ["s3:*"] would match all of S3 (can be specific or general)
- Resource: The resource the action applies to, e.g. ["*"] for all, or ["arn:aws:s3:::catgifs"]
It is possible to be allowed and denied at the same time
Statement Priorities
- Explicit Deny. Explicit Denies always take priority
- Explicit Allow
- Default - Implicit Deny. With the exception of the account root user, identities start with no access
"If they're not allowed access, they have no access"
DENY, ALLOW, DENY
There may be multiple policies involved.
When a given identity accesses a resource, it gathers all the statements that apply and evaluates them all at the same time, but the same rule applies, DENY, ALLOW, DENY.
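The DENY, ALLOW, DENY rule can be sketched as a small evaluator over a list of statements. This is a study model only: it assumes simple `*` wildcards and ignores conditions, resource policies, and anything outside identity policies. The function names and the example statements are hypothetical.

```python
from fnmatch import fnmatchcase


def _matches(patterns, value, ignore_case=False):
    """True if value matches any pattern; '*' acts as a wildcard."""
    if isinstance(patterns, str):
        patterns = [patterns]
    if ignore_case:
        value = value.lower()
        patterns = [p.lower() for p in patterns]
    return any(fnmatchcase(value, p) for p in patterns)


def evaluate(statements, action, resource):
    """Explicit deny wins, then explicit allow, otherwise implicit deny."""
    decision = "ImplicitDeny"
    for s in statements:
        applies = (_matches(s["Action"], action, ignore_case=True)
                   and _matches(s["Resource"], resource))
        if not applies:
            continue
        if s["Effect"] == "Deny":
            return "ExplicitDeny"  # nothing can override an explicit deny
        decision = "Allow"
    return decision


# Statements gathered from every policy that applies to the identity
statements = [
    {"Sid": "FullS3", "Effect": "Allow", "Action": ["s3:*"], "Resource": ["*"]},
    {"Sid": "DenyCatGifs", "Effect": "Deny", "Action": ["s3:*"],
     "Resource": ["arn:aws:s3:::catgifs", "arn:aws:s3:::catgifs/*"]},
]

print(evaluate(statements, "s3:GetObject", "arn:aws:s3:::catgifs/cat1.gif"))
print(evaluate(statements, "s3:GetObject", "arn:aws:s3:::koalas/k1.jpg"))
print(evaluate(statements, "ec2:RunInstances", "*"))
```

The first request is both allowed and denied, so the deny wins; the second is only allowed; the third matches nothing and falls through to the implicit deny.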
Inline Policies
Applying a JSON policy to each identity individually, e.g. three users means 3 separate policies
Generally used in exceptions or special circumstances
Managed Policies
One policy created as its own object, which you then attach to any identity that should gain those access rights.
Managed policies should be used for normal default rights
Benefits:
- Reusable
- Low Management overhead. DRY
Two main types:
- AWS Managed
- Customer Managed
IAM Users and ARN's
IAM Users are an identity used for anything requiring long-term AWS access. For example, Humans, Applications, or service accounts
IAM starts with "principals".
A principal is a person or application that makes a request to IAM in order to interact with resources. IAM authenticates against an identity in IAM and then authorizes the principal
Authentication is done with:
- Username and Password
- Access Keys
Once a principal has gone through the authentication process, it is now an "Authenticated Identity"
AWS then knows which IAM policies apply to that identity
ARN's - Amazon Resource Names
Uniquely identify resources within any AWS account.
An ARN such as arn:aws:s3:::catgifs references the actual bucket, while arn:aws:s3:::catgifs/* references anything in the bucket, but not the bucket itself
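An ARN is a colon-separated string, so its parts can be pulled apart with a simple split. This sketch assumes the general layout arn:partition:service:region:account-id:resource; the `Arn` type and `parse_arn` function are made up for illustration, and the catgifs bucket name is reused from the policy examples earlier in these notes.

```python
from typing import NamedTuple


class Arn(NamedTuple):
    partition: str
    service: str
    region: str
    account_id: str
    resource: str


def parse_arn(arn: str) -> Arn:
    """Split an ARN into its fields; the resource part may contain colons."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not an ARN: {arn!r}")
    return Arn(*parts[1:])


bucket = parse_arn("arn:aws:s3:::catgifs")     # the bucket itself
objects = parse_arn("arn:aws:s3:::catgifs/*")  # anything inside the bucket
print(bucket)
print(objects)
```

For S3 the region and account fields are empty, which is why the ARN contains three colons in a row: bucket names are globally unique, so neither is needed.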
You can only ever have 5000 IAM users per account
An IAM user can be a member of a max of 10 groups
Simple Identity Permissions in AWS
IAM Groups
IAM groups are containers for Users
There are no credentials for IAM groups, and you cannot log into an IAM group
They're used solely for organizing users
A user can be a member of multiple groups
Groups can have both managed and inline policies
When thinking about allows/denies for someone with both a group policy and an individual policy, you need to apply the same deny, allow, deny rule across all of them
There is no limit for number of users in a group
There is no built in all users group in IAM
you cannot nest groups
there is a limit of 300 groups/account but this can be increased via support
Groups are not a true identity, they can't be referenced as a principal in a policy
IAM Roles
A role is one type of identity that exists in an AWS account (the other is USER)
A role is best suited for an unknown number of principals, not just one
Roles are generally used on a temporary basis
IAM users can assume roles when the role's trust policy allows it
Two types of policy can be attached to an IAM role: a trust policy (who can assume the role) and a permissions policy (what the role can do)
If someone assumes a role, they are assigned temporary security credentials
The credentials are checked against permission policies
sts:AssumeRole is the Security Token Service (STS) operation used to assume a role; STS generates the temporary credentials.