Adrian Cantrill AWS Solutions Architect Associate Course

Introduction

Site Tools and features

github
slack

Scenario - Animals4Life

Animals4life.org is used throughout the course as a real-world scenario to illustrate use cases
Animals4life is an animal rescue and awareness organization that rehouses and monitors animals using IoT and big data
- global company
- call center/admin/IT/Marketing/Legal & accounts
- 200 staff, half remote, half onsite in Australia
- also has lobbying staff and major offices in London, New York, Seattle
- Currently has a small on-premises data center in Brisbane
- Previously had a bad AWS trial in Sydney, Australia
- Also tried other cloud services
- All remote workers rely on infrastructure in AUS
- company is cost conscious
- needs to be available 24/7

Problems

On-prem hardware is failing and must be replaced within 18 months
Previous AWS/Azure trials were messy and did not follow best practice
Performance issues exist for field workers
Lack of high-availability systems and scalability
Small IT team with little automation/cloud experience
Global expansion concerns - cost of new infrastructure

Ideal Outcomes

fast performance for field workers
able to deploy into new regions quickly when required
low cost and scalable base infrastructure
Agility - the ability to spin up new marketing campaigns and social and progressive applications (IoT, big data, etc.) is important
Automation - low base staffing costs

AWS Accounts - The Basics

An AWS account is a container for identities and AWS resources

When creating an account you need an account name, a unique email address, and a credit card
This creates the account root user
AWS is billed on a pay-as-you-go basis
This course will use the free tier as much as possible
The account root user has full control over the AWS account
You can create restricted sub-identities using Identity and Access Management (IAM): IAM users, groups, and roles, each with limited permissions
Permissions must be explicitly granted
The orange line in the picture below is the boundary of the account; it helps keep things out and things in

Setting it up

Pasted image 20230710123757.png
Pasted image 20230710123811.png

Cost Management

Setting up a budget

IAM Identity and Access Management

Pasted image 20230813155049.png
Each AWS account has a root user with full unrestricted access
IAM lets us give different users restricted access, following the principle of least privilege
IAM identities start with no permissions and can be granted permissions
Each account comes with its own IAM database.
IAM is a globally resilient service - its data is replicated securely across all AWS regions
The IAM you see is your account's own dedicated instance of IAM, and your account fully trusts it.
Inside IAM you can create different identities.

IAM Basics

IAM lets you create three different types of identity objects:
- IAM users: represent humans or applications that need access to your account
- IAM groups: collections of related users, e.g. dev team, finance, HR
- IAM roles: can be used by AWS services or for granting external access to your account.
IAM policies are documents that allow or deny access to AWS services, but only once they are attached to users, groups, or roles

IAM has three main jobs:

It manages identities - it's an ID provider
It authenticates the identities it manages
It authorizes access based on policies

IAM is free; there are no costs associated with creating users, groups, or roles
There are limits on the number of users, groups, and roles

IAM is a global service and is globally resilient

IAM allows or denies actions by its own identities within its own AWS account
There is no direct control over external accounts or users
IAM lets you make use of identity federation and MFA

Normal practice is to stop using the root user for day-to-day work and use an IAM admin identity instead

IAM Access Keys

Command line access is done via IAM access keys
They are long-term credentials.
They do not change automatically; they have to be explicitly rotated
An IAM user has 1 username and 1 password
An IAM user can have 0, 1, or 2 access keys, which can be created, deleted, made inactive, or made active; they are active by default

Access keys are made from two parts:

the Access key id: shorter
the secret access key: longer, more complex
The only time you can get the secret access key is when you create it

Rotating access keys means creating a new access key, updating whatever uses the old key, and then deleting the old key

IAM users are the only identity that uses access keys

Creating IAM access keys

This is done from the security credentials page in the console
You can deactivate and reactivate keys
The AWS CLI needs credentials configured before it can be used, for example with a named profile:
aws configure --profile iamadmin-general
Default region used for this profile: us-east-1
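
As a rough illustration of what configuring the iamadmin-general profile stores, the sketch below builds the two INI-style files the AWS CLI reads (~/.aws/credentials and ~/.aws/config) in memory. The key values are placeholder example strings, not real keys, the "json" output format is an assumed choice, and nothing is written to disk.

```python
import configparser
import io

PROFILE = "iamadmin-general"

# Credentials file: one section per profile, named after the profile.
credentials = configparser.ConfigParser()
credentials[PROFILE] = {
    "aws_access_key_id": "AKIAIOSFODNN7EXAMPLE",  # placeholder, not a real key
    "aws_secret_access_key": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",  # placeholder
}

# Config file: named profiles use a "profile " prefix in the section name.
config = configparser.ConfigParser()
config[f"profile {PROFILE}"] = {"region": "us-east-1", "output": "json"}


def render(parser):
    """Return the INI text that the parser would write to a file."""
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


credentials_text = render(credentials)
config_text = render(config)
print(credentials_text)
print(config_text)
```

The access key ID and the secret access key always travel together, which is why losing the secret means creating a new key pair.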

AWS Fundamentals

AWS Public vs Private Services

AWS services can be categorized into two types:

Public: Accessed through public endpoints, e.g. S3. Can be reached from anywhere with an internet connection
Private: Runs within a VPC (Virtual Private Cloud). Only things in the VPC, or connected to it, can access it
Public and private refer to networking only

When thinking of networks, people generally think in two zones:

Public Internet zone
Private networks: like a home network - only things connected to it can operate on it
AWS has private zones called VPCs. Nothing from the internet can reach a VPC unless it's configured to allow outside connections

The AWS public zone runs between the public internet zone and the AWS private zone.
Pasted image 20230814171428.png
Pasted image 20230814171349.png

AWS Global Infrastructure

AWS global infrastructure is a collection of individual infrastructure deployments located worldwide. It consists of AWS regions and AWS edge locations

Regions

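Regions don't map directly onto continents or countries. Each region contains a full deployment of AWS infrastructure (compute, storage, databases, and so on), though not every service is available in every region
We can use regions to design systems that are resilient to large-scale disasters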

Regions have 3 main benefits:

Geographic Separation: Each region is geographically separate. Disasters in one region won't impact other regions
Geopolitical Separation: Each region is subject to the laws of the place it is located in, so you choose which governance applies to your infrastructure
Location Control: Lets you place infrastructure close to your customers and tune your architecture for performance

Regions are usually referred to by the region code or the region name, e.g. ap-southeast-2 vs Asia Pacific (Sydney)

Availability Zone

A lower-level component that gives isolated infrastructure within a region
As solutions architects we can deploy across multiple availability zones so that the failure of one zone doesn't take the whole system down

Pasted image 20230814173011.png

Edge Locations

Edge locations are much smaller than regions and typically only host content distribution services and some types of edge computing.
There are many more edge locations than regions

How To Define Service Resilience

Service resilience can be described in one of 3 ways:

Globally Resilient: Relatively few of these. They operate globally as a single product, with data replicated across multiple regions. It would take the world to fail to experience a full outage, e.g. IAM
Region Resilient: Services that operate in a single region with one set of data per region. A database in Sydney is separate from a database in N. Virginia. They normally replicate to multiple Availability Zones in that region
AZ Resilient: Services that run from a single Availability Zone. If the AZ fails, the service fails

Virtual Private Cloud Basics (VPC)

(Comes Up A Lot On Exams)
A VPC is a virtual network inside AWS. When you create a VPC, it's created inside 1 account and 1 region. VPCs are regionally resilient

By default, VPCs are private and isolated unless configured otherwise
Services deployed in the same VPC can communicate with each other

Two types of VPC:

Default VPC: max 1 per region
Custom VPC: can have many in a region - used in almost all serious deployments
Default VPCs are configured in a very specific way

Every VPC is allocated a range of IP addresses (the VPC CIDR)
The default VPC always has the same CIDR (172.31.0.0/16)
A VPC can be divided into subnets, and each subnet sits in exactly one availability zone. The default VPC has one subnet in each availability zone
Pasted image 20230814193944.png

Each subnet's CIDR determines its start and end IP addresses
Pasted image 20230814194029.png

Default VPC facts

Max one per region - can be deleted and recreated
Default VPC CIDR is always 172.31.0.0/16
One /20 subnet in each availability zone in the region
Provided with an Internet Gateway (IGW), a default Security Group (SG), and a default NACL
Subnets assign public IPv4 addresses by default
default VPCs can be deleted
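
To make the CIDR arithmetic concrete, here is a small sketch using Python's standard ipaddress module. It only does the maths of splitting 172.31.0.0/16 into /20 blocks; which block ends up in which availability zone is decided by AWS and is not modelled here.

```python
import ipaddress

# The default VPC CIDR from the notes above.
default_vpc = ipaddress.ip_network("172.31.0.0/16")

# Every /20 block that fits inside the /16.
subnets = list(default_vpc.subnets(new_prefix=20))
print(len(subnets))  # 16 possible /20 subnets

# Show the start address, end address, and size of the first few blocks.
for subnet in subnets[:3]:
    print(subnet, subnet[0], subnet[-1], subnet.num_addresses)
```

A /16 holds sixteen /20 blocks of 4096 addresses each, so a region uses one block per availability zone and the rest stay spare.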

EC2 Elastic Compute Cloud Basics

EC2 is the default choice for anything you deploy that needs compute

EC2 Key Facts & Features

IaaS: Infrastructure as a Service. Provides virtual machines, known as EC2 instances
The unit of consumption is the instance
Private service by default: uses VPC networking and runs in private zones
EC2 is Availability Zone resilient. The instance fails if the AZ fails
Different instance sizes and capabilities are available. Some of these can be changed afterward
Because EC2 is IaaS, the user manages the OS and everything above it
Billing is on-demand, either per second or per hour. You only pay for what you consume
Can use local on-host storage or Elastic Block Store (EBS), which is network storage made available to the instance

Instance Lifecycle

An instance has a state attribute
The most important to remember right now are:

Running: When an instance is launched after provisioning it goes into running state
Stopped: A shut down instance
Terminated: One way change. Non-reversible and the instance is fully deleted
These states influence the charges

An instance is composed of:

CPU
Memory
Storage
Networking
When an instance is in the running state, you're charged for all 4 of these items
When it is in a stopped state, you are only charged for storage

Pasted image 20230814204147.png

Terminating an instance is the only way to fully stop all charges
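
The charging rules above can be written down as a tiny lookup table. This is only a model of the notes (which of the four components is billed in which state), not a pricing calculator, and the names are made up for illustration.

```python
# Which instance components incur charges in each state, per the notes above.
BILLED_COMPONENTS = {
    "running": {"cpu", "memory", "storage", "networking"},
    "stopped": {"storage"},
    "terminated": set(),
}


def billed_components(state):
    """Return the set of components that are charged for in the given state."""
    if state not in BILLED_COMPONENTS:
        raise ValueError(f"unknown instance state: {state}")
    return BILLED_COMPONENTS[state]


for state in ("running", "stopped", "terminated"):
    print(state, sorted(billed_components(state)))
```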

Amazon Machine Image (AMI)

an AMI is an image of an EC2 instance and can be used to create an EC2 instance, or be created from an EC2 instance

An AMI is similar to a server image and contains a few important things:

Attached Permissions: Which accounts can use the AMI (Public, Owner-Implicitly Allowed, Those with explicit permissions to use the AMI)
Contains the Boot/Root Volume of the Instance
Contains Block Device Mapping: Configuration that links the volumes to the device IDs the OS sees, identifying which volume is the boot volume and which are data volumes

Connecting to EC2

EC2 can run different operating systems, and the connection method depends on the OS. Windows instances are accessed via RDP on port 3389, whereas Linux instances use SSH on port 22, authenticated with an SSH key pair

S3 Basics

S3 stands for Simple Storage Service
S3 is a global storage platform: it is region based and regionally resilient, and can be accessed from anywhere
Your data doesn't leave the region unless you configure it to
Data is replicated across AZs in that region
You choose the region when you create things in S3
S3 is a public service.
It's good for hosting large amounts of data
It scales from nothing to near unlimited levels
Can be accessed from:

GUI
CLI
API
HTTP
S3 should be your default storage service for AWS

S3 has two main things it delivers:

objects: Objects are individual items, a movie, a photo etc
buckets: A bucket is a container for objects

Objects

you can think of objects like files.
An object is made up of two things:

An object key: think of this as similar to a file name. The key identifies an object in a bucket
Value: the data/contents of the object. The value of an object can vary from 0 bytes up to 5 TB in size

Objects also have a version id, metadata, access control, subresources
Pasted image 20230817150525.png

Buckets

Buckets are created in a specific AWS region.
your data in a bucket has a primary home region and it doesn't leave that region unless you configure it to leave that region.
By creating a bucket in a region, you can control the laws and regulations that apply to that data

A bucket is identified by its name, and the bucket name needs to be globally unique
Most AWS resource names only need to be unique within a region or within your account. Buckets are different.

A bucket is infinitely expandable
A bucket has a flat structure, all objects in the bucket are stored at the same level. There are no folders
However, the UI displays it similarly to folders
Inside S3, there is no concept of file type based on the name.

"Folders" in S3 appear when an object key contains slashes, like /old/koala3.jpg. The UI presents this as a folder called "old" with koala3.jpg inside it.
Folders are often referred to as 'prefixes' in S3
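
A minimal sketch of how a flat list of keys can be presented as folders. The function below is a local simulation written for these notes (it does not call S3), and the key names are made up; it groups keys by a prefix and a delimiter in the same spirit as the console's folder view.

```python
def list_level(keys, prefix="", delimiter="/"):
    """Split flat keys into direct objects and folder-like common prefixes."""
    objects, common_prefixes = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue  # key lives under a different prefix
        rest = key[len(prefix):]
        if delimiter in rest:
            # Anything past the next delimiter is shown as a "folder".
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)


# The bucket itself is flat: just four keys at the same level.
keys = ["koala1.jpg", "koala2.jpg", "old/koala3.jpg", "old/koala4.jpg"]

print(list_level(keys))          # (['koala1.jpg', 'koala2.jpg'], ['old/'])
print(list_level(keys, "old/"))  # (['old/koala3.jpg', 'old/koala4.jpg'], [])
```

Nothing called "old" is stored anywhere; the folder exists only because two keys share the same prefix.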

Buckets are the default place to configure the way S3 works

Info

Bucket Names are Globally Unique
3-63 characters, all lower case, no underscores
must start with a lowercase letter or a number
can't be formatted like an IP address
Bucket limit per account: 100 soft limit, 1000 hard limit
Buckets can be "divided" using prefixes
Unlimited objects in a bucket, each from 0 bytes to 5 TB
key = name, value = data
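
The naming rules listed above can be sketched as a checker. This covers only the rules in these notes, plus the assumption that hyphens and dots are the only other characters allowed; the full S3 rule set has a few more restrictions, so treat it as a study aid rather than a complete validator.

```python
import ipaddress
import re

# Lowercase letters, digits, hyphens, and dots; must start with a letter or digit.
_NAME_RE = re.compile(r"[a-z0-9][a-z0-9.-]*")


def is_valid_bucket_name(name):
    """Check a bucket name against the rules listed in the notes above."""
    if not 3 <= len(name) <= 63:
        return False
    if not _NAME_RE.fullmatch(name):
        return False  # uppercase, underscores, or a bad first character
    try:
        ipaddress.IPv4Address(name)
    except ValueError:
        return True  # not formatted like an IP address, so it is fine
    return False


for candidate in ("koalacampaigna4l-13184151", "My_Bucket", "192.168.5.4", "ab"):
    print(candidate, is_valid_bucket_name(candidate))
```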

S3 Patterns and Anti-Patterns

S3 is an object store - not a file or block storage
You cannot mount an S3 bucket as a drive or mount point such as /images or K:\
Great for large-scale data storage, distribution, or upload
Great for 'offloading' things
Input and/or output for many AWS services
All resources in AWS have a unique identifier called an ARN (Amazon Resource Name), e.g. arn:aws:s3:::koalacampaigna4l-13184151
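
An ARN follows the general layout arn:partition:service:region:account-id:resource. The sketch below splits the bucket ARN from the notes into those fields; parse_arn and Arn are names made up for this example.

```python
from typing import NamedTuple


class Arn(NamedTuple):
    partition: str
    service: str
    region: str
    account_id: str
    resource: str


def parse_arn(arn):
    """Split an ARN into its colon-separated fields."""
    # Limit the split so colons inside the resource part are preserved.
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not an ARN: {arn!r}")
    return Arn(*parts[1:])


bucket = parse_arn("arn:aws:s3:::koalacampaigna4l-13184151")
print(bucket)
```

The region and account fields are empty for a bucket (hence the ":::"), which fits with bucket names being globally unique.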

CloudFormation (CFN) Basics

CloudFormation is a tool that lets you create, update, and delete AWS resources based on a template

A CloudFormation Template is written in either JSON or YAML
All templates have a list of resources that tells CloudFormation what to do
The description section is used to give useful information to users of the template. If a template has both a description and an AWS format version, the description must immediately follow the format version

Metadata: Can control how things are presented in the UI (groupings, order, labels, etc)
Parameters: You can add fields here that prompt the user for more information such as "which size of instance to create" or "which region"
Mappings: Another optional section. Allows you to create look up tables
Conditions: Allow for decision making in the template such as things that will only occur if a condition is met. Ie "if the parameter is set to prod, do something specific"
Outputs: presented when a template is done being applied, such as the instance ID of an instance

How it works

CloudFormation starts with a template.
When you give a template to CloudFormation, it creates a stack containing all of the logical resources from the template.

For each logical resource in the stack, CloudFormation creates a corresponding physical resource in your AWS account
Pasted image 20230817171930.png
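
For reference, a minimal sketch of what a template can look like, built as a Python dictionary and printed as JSON. The logical resource name ExampleBucket and the description text are made up; the template declares one logical resource (an S3 bucket) and outputs its name once the stack is created.

```python
import json

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    # The description comes straight after the format version.
    "Description": "Minimal illustrative template: one S3 bucket",
    "Resources": {
        # Logical resource; CloudFormation creates the physical bucket from it.
        "ExampleBucket": {"Type": "AWS::S3::Bucket"},
    },
    "Outputs": {
        # Ref on a bucket resolves to the physical bucket name.
        "BucketName": {"Value": {"Ref": "ExampleBucket"}},
    },
}

template_json = json.dumps(template, indent=2)
print(template_json)
```

Resources is the only section used here that a template must have; Description and Outputs are optional.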

CloudWatch Basics

CloudWatch is a core AWS support product and is used by almost all AWS services
It collects and manages operational data on your behalf
CloudWatch is three products in one:

CloudWatch Metrics: collects metrics from AWS products, apps, and on-premises systems (e.g. CPU usage, traffic, etc.)
CloudWatch Logs: collects logging data from AWS, apps, and on-premises systems...anything that can be ingested
CloudWatch Events: AWS services and schedules. If an AWS service does something, e.g. an EC2 instance is terminated or started, CloudWatch can react. You can also run things on a schedule

Can do things like automatically scale EC2 for demand

CloudWatch Core Concepts

Namespace: think of it like a container for monitoring data

Namespace names can be anything that fits the naming ruleset
All AWS data goes into namespaces named AWS/ followed by the service, e.g. AWS/EC2
Metric: A collection of related data points in a time-ordered structure, e.g. CPU usage, disk usage, network in/out, disk IO
Datapoint: A single measurement. E.g. every time CPU usage is measured, the result is a datapoint consisting of a timestamp and a value
Dimension: Dimensions separate datapoints for different things or perspectives within a metric, e.g. the CPU usage of different EC2 instances. Dimensions often include instance ID and instance type
Alarms: Linked to a specific metric and takes an action based on that metric. An alarm means that the metric is not in a good state, and you can define an action that takes place based on an alarm
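
To tie the concepts together, here is a small local model (not the CloudWatch API). The instance IDs, timestamps, and values are invented, and the alarm rule used here, "the last N datapoints are all above a threshold", is one simple assumed rule rather than the full set of options CloudWatch offers.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Datapoint:
    namespace: str       # container, e.g. "AWS/EC2"
    metric: str          # e.g. "CPUUtilization"
    dimensions: tuple    # e.g. (("InstanceId", "i-aaa"),)
    timestamp: datetime
    value: float


def point(instance_id, minute, value):
    """Build one CPU datapoint for a (hypothetical) instance."""
    when = datetime(2023, 8, 17, 12, minute, tzinfo=timezone.utc)
    return Datapoint("AWS/EC2", "CPUUtilization",
                     (("InstanceId", instance_id),), when, value)


def for_instance(datapoints, instance_id):
    """Use the InstanceId dimension to separate one instance's datapoints."""
    return [p for p in datapoints if ("InstanceId", instance_id) in p.dimensions]


def alarm_state(datapoints, threshold, periods):
    """ALARM if the latest `periods` values all exceed the threshold."""
    recent = sorted(datapoints, key=lambda p: p.timestamp)[-periods:]
    if len(recent) < periods:
        return "INSUFFICIENT_DATA"
    return "ALARM" if all(p.value > threshold for p in recent) else "OK"


points = [point("i-aaa", 0, 40.0), point("i-aaa", 5, 85.0), point("i-aaa", 10, 92.0),
          point("i-bbb", 0, 10.0), point("i-bbb", 5, 12.0), point("i-bbb", 10, 11.0)]

print(alarm_state(for_instance(points, "i-aaa"), threshold=80.0, periods=2))  # ALARM
print(alarm_state(for_instance(points, "i-bbb"), threshold=80.0, periods=2))  # OK
```

Both instances report into the same metric in the same namespace; only the dimension tells their datapoints apart.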

Shared Responsibility Model

Pasted image 20230818164148.png
Most of AWS is in the IaaS column, where red is AWS's responsibility and blue is the customer's responsibility

AWS is responsible for the security OF the cloud, whereas you are responsible for security IN the cloud.
AWS: Hardware/Global Infrastructure in regions, Availability Zones, and Edge Locations, compute, storage, DB, networking, and software that assists in these items
Customer: OS Upwards, Client Side data, encryption, network traffic protection. OS Config, firewall, platform, applications, identity, access management and customer data
Pasted image 20230818164553.png

High-Availability (HA) vs Fault-Tolerance (FT) vs Disaster Recovery

High-Availability

Aims to ensure an agreed level of operational performance, usually uptime, for a higher than normal period.

HA isn't aiming to stop failure. It's just meant to be online as often as possible, and when it fails, its components can be fixed or replaced as quickly as possible. It is about maximizing a system's online time.

Say you have a requirement of 99.9% uptime. This means you can have at most about 8.77 hours of downtime per year
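
The 8.77 hour figure comes from simple arithmetic, sketched below. It assumes an average year of 365.25 days (8766 hours); using a 365-day year gives 8.76 hours instead.

```python
HOURS_PER_YEAR = 365.25 * 24  # 8766 hours, averaging in leap years


def max_downtime_hours(availability_percent):
    """Hours per year a system may be down at the given availability."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


for pct in (99.0, 99.9, 99.99, 99.999):
    print(f"{pct}% uptime allows {max_downtime_hours(pct):.2f} hours of downtime per year")
```

Each extra "nine" cuts the allowed downtime by a factor of ten, which is why higher availability targets get expensive quickly.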

Think of this like an SUV going into the desert with a spare tire

Fault-Tolerance

Fault tolerance is the property that enables a system to continue operating properly in the event of the failure of some (one or more) of its components

Designed to work through failure

Disaster Recovery

A set of policies, tools, and procedures to enable the recovery or continuation of vital technology infrastructure and systems following a natural or human-induced disaster.

DR is designed to keep the crucial parts of your system safe

High-Availability - Minimize Any Outage
Fault-Tolerance - Operate through Faults
Disaster Recovery - Used when the above two don't work

Route53 Fundamentals

Route53 is AWS's managed DNS service
Provides two main services:

Registering domains
Hosting zones on managed nameservers
Route53 is a global service with a single database
It is replicated across regions, so it is globally resilient

Register Domains

Has relationships with all major domain registries

Hosted Zones

Zone files are hosted in AWS on four managed name servers. A hosted zone can be public and accessible from anywhere, or private and only accessible from within associated VPCs
A hosted zone stores records (record sets)

DNS Record Types

A & AAAA

Within a DNS zone, these records map host names to IP addresses. An A record maps to an IPv4 address, an AAAA record maps to an IPv6 address

CNAME

Canonical Name records let you create the equivalent of DNS shortcuts: host-to-host records. Let's say we have an A record called server pointing to an IPv4 address. CNAME records for ftp, mail, and www can all point to the server A record. CNAMEs can only point to other names, never directly to an IP address (this often shows up as a trick question in exams)

TXT

Text records allow you to add arbitrary information to a domain. This can be something such as proving domain ownership.

MX

MX records are used as part of email services. MX records have two main parts, a priority and a value. The value can be just a host name, which is taken to be inside the same zone, or, if it ends with a '.', a fully qualified domain name, which can be inside or outside the zone. A sending mail server looks at the domain of the 'to' address and does an MX query on that domain.
Lower values in the priority field are actually higher priority
The sending server gets the result of the query back, then connects to the mail server it names via SMTP. This is how MX records facilitate email
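
A short sketch of how a sending server could order MX results. The records and the zone name used in the usage line are made-up examples; the function sorts by priority value (lowest first) and expands a bare host into a name inside the zone, while a value ending in a dot is kept as a fully qualified name.

```python
def delivery_order(records, zone):
    """Return mail hosts in the order they should be tried."""
    ordered = sorted(records, key=lambda record: record[0])  # lowest value first
    hosts = []
    for _priority, value in ordered:
        if value.endswith("."):
            hosts.append(value.rstrip("."))   # already fully qualified
        else:
            hosts.append(f"{value}.{zone}")   # relative to the zone
    return hosts


# (priority, value) pairs as they might appear in a zone.
mx_records = [(20, "mail.other.domain."), (10, "mail")]

print(delivery_order(mx_records, "animals4life.org"))
```

The priority 10 host is tried first, and the priority 20 host acts as the fallback.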

NS

NS (name server) records are how delegation works end to end in DNS. A parent zone holds NS records that point at the name servers hosting a child zone, e.g. the .com zone has NS records pointing at the name servers for a domain registered under .com

TTL (Time to Live)

Measured in seconds
Using TTL we can indicate how long records are cached for
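
A minimal sketch of what a TTL means for a resolver's cache. DnsCache is a made-up class for these notes, and time is passed in explicitly (as seconds) so the behaviour is easy to follow; the IP address is from a documentation-only range.

```python
class DnsCache:
    """Toy resolver cache that honours each record's TTL."""

    def __init__(self):
        self._entries = {}  # name -> (value, expiry time in seconds)

    def put(self, name, value, ttl, now):
        """Store an answer that stays valid for `ttl` seconds from `now`."""
        self._entries[name] = (value, now + ttl)

    def get(self, name, now):
        """Return the cached value, or None if it is missing or expired."""
        entry = self._entries.get(name)
        if entry is None or now >= entry[1]:
            return None  # the resolver has to query the name servers again
        return entry[0]


cache = DnsCache()
cache.put("www", "203.0.113.10", ttl=3600, now=0)
print(cache.get("www", now=1800))  # still cached
print(cache.get("www", now=3600))  # expired, so None
```

A long TTL means fewer queries but slower propagation of changes, so TTLs are often lowered ahead of a planned record change.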

IAM, Accounts, and AWS Organizations

IAM Identity Policies

IAM policies get attached to IAM identities in AWS
Policies allow you to either allow or deny access to AWS resources

Identity Policies are created using JSON
A policy block is just one or more "statements"
Pasted image 20230822201734.png

AWS knows which resource or resources you are trying to interact with, then works through all the statements and sees which ones apply to the request

Statements

A statement only applies if it matches both the action and the resource of the request

Sid - Statement ID: Informs the reader what the statement does. Optional, but should always be used
Effect: Either "Allow" or "Deny". Applied when the request matches the statement's action and resource
Action: "service:operation", e.g. ["s3:*"] would match all S3 operations (can be specific or general)
Resource: The resource the action applies to, e.g. ["*"] for all, or ["arn:aws:s3:::catgifs"]
It is possible to be allowed and denied at the same time by different statements

Statement Priorities

Explicit Deny. Explicit denies always take priority
Explicit Allow
Default - Implicit Deny. With the exception of the account root user, identities have no access unless it is granted
"If they're not allowed access, they have no access"
DENY, ALLOW, DENY

There may be multiple policies involved.
When a given identity accesses a resource, AWS gathers all the statements that apply and evaluates them together, and the same rule applies: DENY, ALLOW, DENY.
Pasted image 20230822203113.png
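
The DENY, ALLOW, DENY rule can be sketched as a small evaluator. This is a simplified model written for these notes, not how AWS implements it: it only looks at Effect, Action, and Resource, uses shell-style wildcard matching as an approximation, and ignores conditions, principals, and other policy types. The policy, bucket names, and account ID are made-up examples.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """Policy fields may be a single string or a list of strings."""
    return value if isinstance(value, list) else [value]


def _matches(statement, action, resource):
    """True if the statement covers both the action and the resource."""
    action_ok = any(fnmatchcase(action.lower(), pattern.lower())
                    for pattern in _as_list(statement["Action"]))
    resource_ok = any(fnmatchcase(resource, pattern)
                      for pattern in _as_list(statement["Resource"]))
    return action_ok and resource_ok


def evaluate(policies, action, resource):
    """Explicit deny first, then explicit allow, otherwise implicit deny."""
    applicable = [statement for policy in policies
                  for statement in policy["Statement"]
                  if _matches(statement, action, resource)]
    if any(statement["Effect"] == "Deny" for statement in applicable):
        return "Deny (explicit)"
    if any(statement["Effect"] == "Allow" for statement in applicable):
        return "Allow"
    return "Deny (implicit)"


policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "FullS3Access", "Effect": "Allow",
         "Action": ["s3:*"], "Resource": ["*"]},
        {"Sid": "DenyCatBucket", "Effect": "Deny", "Action": ["s3:*"],
         "Resource": ["arn:aws:s3:::catgifs", "arn:aws:s3:::catgifs/*"]},
    ],
}

print(evaluate([policy], "s3:GetObject", "arn:aws:s3:::catgifs/cat1.gif"))
print(evaluate([policy], "s3:GetObject", "arn:aws:s3:::koalas/koala1.jpg"))
print(evaluate([policy], "ec2:RunInstances",
               "arn:aws:ec2:us-east-1:111122223333:instance/i-abc"))
```

The first request matches both statements and is denied, because an explicit deny always wins over an allow; the last one matches nothing and falls through to the implicit deny.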

Inline Policies

A JSON policy is applied to each identity individually, e.g. 3 users means 3 separate policies
Generally used for exceptions or special circumstances

Managed Policies

One policy is created as its own object, then attached to any identity that should gain those access rights.
Managed policies should be used for normal default rights
Benefits:

Reusable
Low management overhead (DRY)

Two main types:

AWS Managed
Customer Managed

IAM Users and ARN's

IAM users are an identity used for anything requiring long-term AWS access, for example humans, applications, or service accounts

IAM starts with a "principal".
A principal is a person or application that makes a request to IAM in order to interact with resources. The principal authenticates against an identity in IAM, and IAM then authorizes its requests

Authentication is done with:

Username and Password
Access Keys

Once a principal has gone through the authentication process, it is an "Authenticated Identity"

AWS then knows which IAM policies apply to that identity
Pasted image 20230824133654.png

ARN's - Amazon Resource Names

Uniquely identify resources within any AWS account.

Pasted image 20230824133935.png
The top ARN references an actual bucket, the bottom ARN references anything in the bucket, but not the bucket itself
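
The bucket versus objects distinction can be checked with a few lines of code. The two ARNs below are assumed stand-ins for the ones in the picture (a bucket named catgifs), and shell-style wildcard matching is used as an approximation of how the * in an ARN behaves.

```python
from fnmatch import fnmatchcase

BUCKET_ARN = "arn:aws:s3:::catgifs"      # the bucket itself
OBJECTS_ARN = "arn:aws:s3:::catgifs/*"   # anything inside the bucket


def matches(pattern, resource_arn):
    """True if the resource ARN is covered by the (possibly wildcarded) pattern."""
    return fnmatchcase(resource_arn, pattern)


for resource in ("arn:aws:s3:::catgifs", "arn:aws:s3:::catgifs/cat1.gif"):
    print(resource, matches(BUCKET_ARN, resource), matches(OBJECTS_ARN, resource))
```

This is why a policy often needs both ARNs: bucket-level operations act on the first form, while object-level operations act on the second.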

Info

You can only ever have 5000 IAM users per account
An IAM user can be a member of a max of 10 groups

Simple Identity Permissions in AWS

IAM Groups

IAM groups are containers for Users
There are no credentials for IAM groups, and you cannot log into an IAM group
They're used solely for organizing users
A user can be a member of multiple groups
Pasted image 20241120115959.png

Groups can have both managed and inline policies
When working out allows and denies for someone with both a group policy and an individual policy, apply the same deny, allow, deny rule across all of them

There is no limit on the number of users in a group
There is no built-in "all users" group in IAM
You cannot nest groups

There is a limit of 300 groups per account, but this can be increased via support

Groups are not a true identity, they can't be referenced as a principal in a policy

IAM Roles

A role is one type of identity that exists in an AWS account (the other is the user)
A role is best suited to being used by an unknown number of principals, or by multiple principals, not just one
Pasted image 20241121114956.png

Roles are generally used on a temporary basis
IAM users, among other principals, can assume roles
IAM roles have two types of policy attached: the trust policy and the permissions policy
The trust policy controls which identities can assume the role. When someone assumes a role, they are given temporary security credentials
Those credentials are checked against the role's permissions policy
Temporary credentials are generated by the Security Token Service (STS) through the sts:AssumeRole operation
Pasted image 20241121115245.png
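
To show how the two policies on a role differ, here is a sketch of each as a Python dictionary printed as JSON. The account ID and bucket name are made-up examples; the point is only that the trust policy names a principal and the sts:AssumeRole action, while the permissions policy lists what the temporary credentials may do.

```python
import json

# Trust policy: WHO is allowed to assume the role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::111122223333:root"},  # hypothetical account
        "Action": "sts:AssumeRole",
    }],
}

# Permissions policy: WHAT the temporary credentials are allowed to do.
permissions_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject"],
        "Resource": ["arn:aws:s3:::catgifs/*"],  # hypothetical bucket
    }],
}

print(json.dumps(trust_policy, indent=2))
print(json.dumps(permissions_policy, indent=2))
```

Only the trust policy has a Principal element; a permissions policy attached to an identity does not need one because the identity it is attached to is the principal.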