Section 4: Object Storage and CDN – S3, Glacier and CloudFront

13: S3 101

https://www.udemy.com/aws-certified-solutions-architect-associate/learn/v4/t/lecture/4324004?start=0

What is S3?

  • Simple Storage Service
  • Secure, durable, highly scalable object storage
  • Web interface to store and retrieve any amount of data from anywhere on the web.
  • “A place to store your files on the cloud”
  • Data spread across multiple devices and facilities.

What are Objects?

  • Videos, documents, spreadsheets
  • aka “flat files”
  • Not good for executable or database files – better in Block storage

The Basics

  • Object based – allows you to upload files.
  • 0B – 5TB file size
  • Unlimited storage
  • Files are stored in buckets (folders)
  • S3 is a universal namespace, that is, names must be unique globally
    • https://s3-eu-west-1.amazonaws.com/bucketname
  • After file is uploaded, S3 returns an HTTP 200 code if successful.

Data Consistency Model for S3

  • Read after Write consistency for PUTS of new Objects
    • Immediate access to file
  • Eventual Consistency for overwrite PUTS (file updates) and DELETES
    • These can take some time to propagate so all locations are in sync.
    • You might get the new version or the old version. Never a partial version.

Objects consist of the following:

  • Key (Name of the object)
    • Sorted alphabetically
    • Recommended: start key names with random characters (a salt) so files are spread across S3
  • Value (The data that is the file – sequence of Bytes)
  • Version ID (Important for versioning)
  • Metadata (Data about the data you are storing)
    • Date, last update, etc.
  • Subresources
    • Access Control Lists
      • Who can access this file & different permissions
      • Can be applied on Files, or at the Bucket level
    • Torrent
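
The key-naming tip above (start names with random characters) can be sketched as follows. This is only an illustration, assuming a short hash of the original name is used as the salt so the same file always maps to the same key. The function name `salted_key` and the 4-character prefix length are hypothetical choices, not part of any AWS API.

```python
import hashlib


def salted_key(filename: str, prefix_len: int = 4) -> str:
    """Return the file name with a short hex prefix derived from a hash of the name.

    The prefix is deterministic (same name, same key), but keys no longer
    sort by date or sequence number, so they spread out alphabetically.
    """
    digest = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}-{filename}"


# Two names that would normally sort next to each other
for name in ["2017-06-01-report.pdf", "2017-06-02-report.pdf"]:
    print(salted_key(name))
```

A hash is used instead of a truly random value so the key can be recomputed later from the file name alone.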

The Basics

  • Built for 99.99% availability
  • Amazon guarantees 99.9% availability (SLA)
  • Amazon guarantees 99.999999999% durability for data integrity (11 9’s)
  • Tiered Storage Available
    • S3
      • 99.99% availability, 11 9’s durability
      • Stored redundantly across multiple devices in multiple facilities
      • Designed to sustain the loss of 2 facilities concurrently
      • Ability to host static websites (HTML only, no PHP)
    • S3-IA (Infrequently Accessed)
      • Accessed less frequently but requires rapid access when needed
      • Lower initial cost, but charges a retrieval fee.
      • Example: Payroll data or wage slips.
    • RRS (Reduced Redundancy Storage)
      • 99.99% availability (same as S3)
      • 99.99% Durability
      • Good for files that can be regenerated if lost
        • Image thumbnails
    • Glacier
      • Very low cost.
      • Used for archival only.
      • 3-5 hours for file restore.
  • Lifecycle Management
    • New files stored front of the bus
    • Older files stored middle of the bus
    • Very old stored back of the bus
  • Versioning
  • Encryption (multiple methods)
  • Secure with:
    • Access Control Lists
    • Bucket Policies

 

S3 Breakdown

Feature                             | S3            | S3-IA         | S3-RRS       | Glacier
Durability                          | 99.999999999% | 99.999999999% | 99.99%       | 99.999999999%
Availability                        | 99.99%        | 99.9%         | 99.99%       | N/A
Availability SLA                    | 99.9%         | 99%           | 99.9%        | N/A
Concurrent Facility Fault Tolerance | 2             | 2             | 1            | -
SSL Support                         | Yes           | Yes           | Yes          | Yes
Min. Object Size                    | N/A           | 128KB*        | N/A          | -
Min. Storage Duration               | N/A           | 30 Days       | -            | 90 Days
Retrieval Fee                       | N/A           | Per GB        | Per GB       | Per GB**
First Byte Latency                  | milliseconds  | milliseconds  | milliseconds | select minutes or hours***
Storage Class                       | Object Level  | Object Level  | Object Level | Object Level
Lifecycle Transitions               | Yes           | Yes           | Yes          | Yes
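
The percentages in the S3 Breakdown are easier to reason about as concrete numbers. The sketch below is plain arithmetic, not an AWS tool: it treats durability as the average share of objects that survive a year and availability as the share of the year the service is reachable. The function names and the 10,000,000-object example are my own assumptions.

```python
from decimal import Decimal

HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def expected_annual_loss(object_count: int, durability_pct: str) -> Decimal:
    """Average number of objects expected to be lost per year."""
    return object_count * (Decimal(100) - Decimal(durability_pct)) / Decimal(100)


def max_downtime_hours(availability_pct: str) -> Decimal:
    """Hours per year the service may be unavailable at this availability."""
    return HOURS_PER_YEAR * (Decimal(100) - Decimal(availability_pct)) / Decimal(100)


# Percentages are passed as strings so Decimal keeps every 9 exactly.
print("S3  loss/yr for 10M objects:", expected_annual_loss(10_000_000, "99.999999999"))
print("RRS loss/yr for 10M objects:", expected_annual_loss(10_000_000, "99.99"))
print("Downtime hours/yr at 99.99%:", max_downtime_hours("99.99"))
print("Downtime hours/yr at 99.9% :", max_downtime_hours("99.9"))
```

With 10 million objects, 11 9's works out to 0.0001 objects per year on average (roughly one object every 10,000 years), while 99.99% durability works out to about 1,000 objects per year. That gap is why RRS is only suggested for data that can be regenerated.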

 

Charges and Fees

  • Storage
  • Requests
  • Storage Management Pricing
    • Tagging data by what it is associated with (for cost tracking?)
  • Data Transfer Pricing
    • Data coming in is free
    • Moving data around
      • Replication between regions
    • Transfer Accelerations
      • Takes advantage of CloudFront’s edge locations to allow fast transfer of files (file uploads) between end users and an S3 bucket.  As data arrives at an edge location, data is routed to S3 over an optimized network path

Read the S3 FAQ Before taking the exam!!

14: Create an S3 Bucket (Lab)

  • Console > Storage > S3 > [+ Create bucket]
    • Bucket names must be DNS compliant
      • 63 characters maximum
      • lower case letters, numbers and hyphens only
      • No spaces, underscores or other special characters.
    • Select a region (does not have to be in the same region as your EC2 Instance.)
    • [Create]
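
The naming rules above can be checked before ever opening the console. This is a minimal sketch of the rules as listed in these notes, plus two rules I am adding from the AWS naming requirements (at least 3 characters, and the name must start and end with a letter or number). Periods are also allowed by AWS but are left out here to match the notes. The function name is hypothetical.

```python
import re

# 3-63 characters; lowercase letters, digits and hyphens only;
# first and last character must be a letter or digit.
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9-]{1,61}[a-z0-9]")


def is_valid_bucket_name(name: str) -> bool:
    """Return True if the name passes the simplified DNS-style rules above."""
    return _BUCKET_NAME.fullmatch(name) is not None


for candidate in ["my-bucket-01", "My_Bucket", "ab", "-starts-with-hyphen"]:
    print(f"{candidate!r}: {is_valid_bucket_name(candidate)}")
```

Passing this check does not mean the name is available: the namespace is global, so the name may already be taken by someone else.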

Bucket Properties

  • Click on the bucket to view the bucket’s properties (Not the bucket name)
  • Objects
    • Upload
    • Set Object Properties
    • Set Object Permissions
      • By default all objects are private, but you can set unique access control policies
  • Properties
    • Versioning
    • Logging
    • Static Website Hosting
    • Advanced Properties
      • Tags
        • Good for cost controls, defines ‘who the bucket is owned by’
        • Requires account with ‘tag key values’ included
        • Can then tag buckets with up to 10 ‘resources’
      • Cross-region Replication
      • Transfer Acceleration
      • Events
        • Can be used with Lambda functions
      • Requester Pays
  • Permissions
    • Access Control List
      • Who can access
      • Default = Private
    • Bucket Policy
    • CORS configuration
    • Manage Users
  • Management
    • Lifecycle
      • Tier off data to different storage tiers based on creation date
    • Analytics
      • Help to determine which storage solution would work best for the money spent
    • Metrics
      • Total storage
      • # of requests
      • Data transfer rates
    • Inventory
      • reports that tell you what is in your bucket.

Upload a file

  • Select Files
    • Bucket > Objects > [Upload] > Drag and Drop or Browse > [Next]
  • Set permissions
    • Manage users
      • Objects
        • Grant permissions to the users to list, create, overwrite or delete the object(s)
      • Object permissions
        • Grant permissions to the users to read or write to the access control list for the bucket.
    • Manage public permissions
      • Do not grant public read access to this object(s) (Default)
      • Grant public read access to this object(s)
  • Set properties
    • Storage class
      • Standard (S3)
      • Standard-IA (S3-IA infrequent access)
      • Reduced redundancy (S3-RRS)
    • Encryption
      • None
      • Amazon S3 master-key
      • AWS KMS master-key (Key Management Service)
    • Metadata
      • All name-value pairs
      • Cannot be changed in place once uploaded; changing it means copying the object over itself with new metadata.
  • Review > [Upload]
  • After the upload completes, you’re returned to the Objects tab.  Clicking the object will show you properties that were just set.
    • Included with the properties will be a link to the file.
      • Private files will display an error
      • Public files will display or download.

15: Version Control (Lab)

https://www.udemy.com/aws-certified-solutions-architect-associate/learn/v4/t/lecture/4325268?start=0

The Basics

  • To enable versioning:
    • Console > S3 > Select Bucket > Properties > Versioning
  • Versioning is enabled at the ‘bucket level’ and applies to all files within that bucket.
  • Once enabled, versioning cannot be removed; it can only be suspended.
    • To ‘remove’ Versioning, you will need to create a new bucket, then transfer the contents over to it.
  • Any change to a file is saved as a new version with the same Key as the original file, but a new Version ID.
  • The total size of the object will be the sum of all version sizes
  • A deleted version cannot be restored; a deleted object can be (by removing its delete marker).
  • Integrates with Lifecycle rules.
  • Multi-factor Authentication can be enabled to prevent object deletion.
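
The behaviour described above (every write is a new version, a 'delete' only hides the object, a deleted version is gone for good) can be modelled in a few lines. This is a toy in-memory sketch, not the S3 API. The class name, method names and the `v1`, `v2` style version IDs are all made up for illustration.

```python
import itertools
from typing import Dict, List, Optional, Tuple

DELETE_MARKER = object()  # sentinel standing in for an S3 delete marker


class VersionedBucket:
    """Toy model of a versioning-enabled bucket."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[Tuple[str, object]]] = {}
        self._ids = itertools.count(1)

    def _push(self, key: str, value: object) -> str:
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, value))
        return version_id

    def put(self, key: str, data: bytes) -> str:
        """Store data as a new version; older versions are kept."""
        return self._push(key, data)

    def delete(self, key: str) -> str:
        """'Delete' the object by stacking a delete marker on top."""
        return self._push(key, DELETE_MARKER)

    def delete_version(self, key: str, version_id: str) -> None:
        """Permanently remove one version (or one delete marker)."""
        self._versions[key] = [v for v in self._versions[key] if v[0] != version_id]

    def get(self, key: str) -> Optional[bytes]:
        """Return the current version, or None if missing or hidden."""
        stack = self._versions.get(key, [])
        if not stack or stack[-1][1] is DELETE_MARKER:
            return None
        return stack[-1][1]

    def total_size(self, key: str) -> int:
        """Sum of the sizes of all stored versions of the key."""
        return sum(len(v) for _, v in self._versions.get(key, []) if v is not DELETE_MARKER)


bucket = VersionedBucket()
bucket.put("notes.txt", b"first")
bucket.put("notes.txt", b"second draft")
marker = bucket.delete("notes.txt")          # object now looks deleted
print(bucket.get("notes.txt"))               # hidden by the delete marker
bucket.delete_version("notes.txt", marker)   # removing the marker restores it
print(bucket.get("notes.txt"))
print(bucket.total_size("notes.txt"))        # both versions count toward size
```

The `total_size` method mirrors the point that you pay for the sum of all versions, not just the current one.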

 

16: Cross Region Replication

  • Versioning must be enabled in both the source and target (destination) buckets
  • Buckets must be in different regions.
  • Possible to replicate specific folders, or the entire bucket.
  • In the Cross-region replication window, select ‘create a new role’
    • This will automatically create the ‘replication_role_for_SourceBucket_to_TargetBucket’ role
  • Existing objects will NOT automatically replicate
    • Modified objects will be replicated, including all pre-existing versions.
  • Permissions are replicated to the target (if public, will also be public on the target)
  • Replication to multiple buckets is not possible.  You can only replicate to a single target.
  • Replication cannot be ‘daisy chained’.
    • Example: bucket1 > bucket2, bucket2 > bucket3
    • File is uploaded to bucket1 and automatically replicated to bucket2
    • File is NOT replicated from bucket2 to bucket3.
  • ‘Deleting’ an object will also ‘delete’ it from the target (‘delete’==hide)
    • Restoring will likewise restore on the target.
    • Deleting a ‘version’ will NOT replicate

17: Lifecycle Management & Glacier (Lab)

https://www.udemy.com/aws-certified-solutions-architect-associate/learn/v4/t/lecture/4325274?start=0

  • If offloading to Glacier, make sure the bucket created is in a region that supports Glacier (i.e. not Singapore or SA)
  • Does not require Versioning
  • Select what ‘type’ of file to manage
    • Action on Current Version
    • Action on Previous Versions (Versioning enabled required)
    • Action on Incomplete Multipart Uploads

Exam Tips

  • Dates for current versions (or the only version) are based on file creation dates.
  • Can be used in conjunction with Versioning
  • Can be applied to current and previous versions
    • Previous version dates based on date it becomes a previous version.
  • -> Standard IA
    • 128kB min file size
    • 30 days minimum after creation date
  • -> Glacier
    • 30 days after the move to IA (if relevant).
  • Permanent Deletion possible.
  • Mostly high level understanding required.
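
The exam tips above map fairly directly onto a lifecycle rule document. The sketch below only builds and prints the configuration; nothing is sent to AWS. To the best of my knowledge the dictionary follows the shape the S3 lifecycle API accepts (for example as the `LifecycleConfiguration` argument of boto3's `put_bucket_lifecycle_configuration`), but treat that as an assumption and check the API reference. The rule ID, the 60 and 365 day values and the function name are hypothetical.

```python
import json


def lifecycle_rule(ia_days: int = 30, glacier_days: int = 60, expire_days: int = 365) -> dict:
    """Build a rule: Standard -> Standard-IA -> Glacier -> permanent delete."""
    if ia_days < 30:
        raise ValueError("objects must be at least 30 days old before moving to Standard-IA")
    if glacier_days < ia_days + 30:
        raise ValueError("the Glacier move must come at least 30 days after the IA move")
    return {
        "Rules": [
            {
                "ID": "tier-then-expire",      # hypothetical rule name
                "Filter": {"Prefix": ""},      # empty prefix = whole bucket
                "Status": "Enabled",
                "Transitions": [
                    {"Days": ia_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_days, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": expire_days},  # permanent deletion
            }
        ]
    }


print(json.dumps(lifecycle_rule(), indent=2))
```

The two `ValueError` checks encode the timing rules from the notes, so a rule that tries to move objects to IA after 10 days fails locally instead of at the console.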

18: CloudFront CDN Overview

https://www.udemy.com/aws-certified-solutions-architect-associate/learn/v4/t/lecture/4325282?start=0

Definitions

  • CDN (Content Delivery Network):  A CDN is a system of distributed servers (network of servers) that deliver webpages and other web content to a user based on the geographical location of the user, the origin of the webpage and a content delivery server.
  • Edge Location: This is the location where the content will be cached (stored).  This is separate to a Region/AZ.  (Edge locations may exist where a Region does not.)
  • Origin: This is the origin of all the files that the CDN will distribute.  This can either be an S3 bucket, an EC2 Instance, an Elastic Load Balancer, Route 53, or even a server outside of AWS.
  • Distribution: This is the name given to the CDN which consists of a collection of Edge Locations.
    • Web Distribution: Typically used for Websites
    • RTMP: Used for streaming media

How does CDN work?

Example: Webpage is served from a region in the UK.  A user in Australia views the page for the first time.

  • The user’s request will first go to a CDN Edge server in Australia for the file
    • If the file exists, it will be served from the Edge location
    • If it does not, it will pull the file from the source server in the UK, then it caches it for the time duration specified by the TTL.
  • The next user near that Edge location who requests the file will be able to download it directly from the Edge.
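
The request flow above (check the edge, fall back to the origin, cache for the TTL) can be sketched as a tiny cache. This is a toy model of a single edge location, not how CloudFront is implemented. The class and function names are invented, and the clock is passed in as a plain number of seconds so the example is repeatable.

```python
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Toy model of one edge location: serve from cache until the TTL expires."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes], ttl_seconds: int = 86400) -> None:
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._store: Dict[str, Tuple[bytes, float]] = {}  # path -> (body, expiry time)

    def get(self, path: str, now: float) -> Tuple[bytes, str]:
        """Return (body, 'HIT' or 'MISS') for a request arriving at time `now`."""
        cached = self._store.get(path)
        if cached is not None and now < cached[1]:
            return cached[0], "HIT"
        body = self._fetch(path)                     # go back to the origin
        self._store[path] = (body, now + self._ttl)  # cache for one TTL
        return body, "MISS"

    def invalidate(self, path: str) -> None:
        """Drop a cached object before its TTL expires."""
        self._store.pop(path, None)


origin_hits = []


def origin(path: str) -> bytes:
    origin_hits.append(path)  # record every trip back to the origin server
    return b"<html>hello</html>"


edge = EdgeCache(origin, ttl_seconds=86400)
print(edge.get("/index.html", now=0)[1])      # first viewer in Australia
print(edge.get("/index.html", now=3600)[1])   # second viewer, one hour later
print(edge.get("/index.html", now=90000)[1])  # after the 24-hour TTL has passed
print("trips to origin:", len(origin_hits))
```

Only the first request and the request after the TTL expires reach the origin; everything in between is served from the edge.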

CloudFront

CloudFront is optimized to work with other Amazon services, such as S3, EC2, Elastic Load Balancing, Route 53, etc. CloudFront also works seamlessly with any non-AWS origin server, which stores the original, definitive version of your files.  (Files do not need to originate from AWS)

Exam Tips

  • Edge locations are both Read and Write (You can put objects on them)
  • Objects are cached for the life of the TTL (Time to Live)
  • You can clear cached objects, but you will be charged.

 

19: Create a CloudFront CDN (Lab)

https://www.udemy.com/aws-certified-solutions-architect-associate/learn/v4/t/lecture/4325290?start=0

  • Dashboard > Networking and Distribution > CloudFront > [Create Distribution]
  • Select between Web and RTMP > [Get Started]
    • Web:
      • Files can be stored anywhere
      • Multiple origins can be added.
    • RTMP:
      • Files must be stored in an S3 bucket.
      • Used for streaming Adobe Flash files

Create Distribution

  • Origin Settings
    • Origin Domain Name (URL to origin)
      • Auto-Dropdown of account S3 Buckets
    • Origin Path
      • Used to specify a specific path within the origin
    • Origin ID
      • Unique name to identify ‘this’ origin.
      • Useful when using multiple origins
    • Restrict bucket Access
      • Yes – All requests to S3 must go through CloudFront
        • Origin Access Identity: Out of scope
          • [*] Create a New Identity
        • Grant Read Permissions on Bucket
          • Yes – Automatically update bucket permissions
          • No – Manual permissions update
        • Origin Custom Headers: Out of Scope
          • Header Name __________
          • Value ___________
          • Can add multiple Custom Header Pairs
      • No – typical access
  • Default Cache Behavior Settings
    • Path Pattern: Uses wildcard patterns (* and ?), not full regular expressions, to choose which cache behavior and origin handle a request.  Example: Client requests a .jpg file. You can route requests matching *.jpg to a specific source bucket.
    • Viewer Protocol Policy
      • HTTP & HTTPS: Both are allowed
      • Redirect HTTP to HTTPS: All HTTP requests are automatically redirected to HTTPS
      • HTTPS Only: HTTP requests are not allowed and will fail.
    • Allowed HTTP Methods
      • GET, HEAD
      • GET, HEAD, OPTIONS
      • GET, HEAD, OPTIONS, PUT, POST, PATCH, DELETE
    • Cached HTTP Methods
      • GET, HEAD by default
      • [ ] OPTIONS is optional 🙂
        • Only available if OPTIONS is an ‘Allowed HTTP Method’
    • Forward Headers (Out of Scope)
      • None
      • Whitelist
      • All
    • Object Caching
      • [x] Use Origin Cache Headers
      • [ ] Customize
    • Minimum TTL
      • Always measured in Seconds
      • Really depends on user’s needs
        • Changing a lot?  Make it small
        • Hasn’t changed in years?  Turn it up like Freedom Rock!
      • Minimum, Maximum and Default (86400=24 hrs)
    • Forward Cookies
      • None
      • Whitelist
      • All
    • Forward Query Strings
      • Yes or No
    • Smooth Streaming (Using Microsoft’s Smooth Streaming service?)
      • Yes or No
    • Restrict Viewer Access (Use Signed URLs or Signed Cookies)
      • Will need to know for Exam!
      • Yes
        • If you restrict viewer access, viewers must use CloudFront signed URLs or signed cookies to access your content.  For more information, see Serving Private Content through CloudFront in the Amazon CloudFront Developer Guide
        • Trusted Signers
          • [ ] Self
          • [ ] Specify Accounts
      • No
    • Compress Objects Automatically
      • Yes or No
  • Distribution Settings
    • Price Class
      • All Edge Locations (Best Performance)
      • Only US, Canada and Europe
      • Only US, Canada, Europe and Asia
    • AWS WAF Web ACL
      • Can use a Web Application Firewall for CloudFront
      • Because this is so new, it is likely not in the exam.
    • Alternate Domain Names (CNAMES)
      • Use your own sub.domain.com instead of the long wanky one AWS will provide to access this distribution.
    • SSL Certificate
      • [ ] Default CloudFront Cert (*.cloudfront.net)
        • Select if using HTTPS with CloudFront domain (https://blahblah.cloudfront.net)
        • Only works with semi-modern browsers
      • [ ] Custom SSL Cert (sub.domain.com)
        • Custom cert must be stored in AWS Cert Manager (ACM)
        • Grayed out unless a cert is already stored
      • Supported HTTP Versions (Not shown in class example!)
        • [ ] HTTP/2, HTTP/1.1, HTTP/1.0
        • [ ] HTTP/1.1, HTTP/1.0
      • Default Root Object: ________
        • Seems like this would be similar to ‘index.html’ if a user just browses to the URL without a filename.
      • Logging
        • On
          • Bucket for Logs _____________
          • Log Prefix ___________
          • Cookie Logging
            • On or Off
        • Off
      • Enable IPv6 (Not shown in example)
        • Enable or Disable
      • Comment _________
      • Distribution State
        • [x] Enabled
        • [ ] Disabled
  • [Create Distribution]

Creating the distribution can take a long time – even with very few or no files.

Exam Notes:

  • You can enable multiple sources.
    • Once enabled, you can use path patterns (wildcards) to sort which source a file should originate from – such as *.pdf, *.jpg, *.rtt, etc.
  • Geo-Restrictions allow you to whitelist or blacklist specific countries from accessing your CDN.
    • You CANNOT both whitelist and blacklist.  You must choose one or the other.
  • Invalidations are a way to remove objects from your cache before the TTL expires, but you will pay for these.
  • To Delete a ‘Distribution’, you must first disable it, then (after 15 minutes or so) you can delete it.
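
Routing requests to different sources by file type can be sketched as below. As far as I know, CloudFront path patterns use simple wildcards (* and ?) rather than full regular expressions, so Python's `fnmatchcase` is used here as a rough stand-in (it also understands `[...]` sets, which is a difference to keep in mind). The origin IDs and patterns are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import List, Tuple

# (path pattern, origin ID) pairs, checked in order; all names are made up.
BEHAVIORS: List[Tuple[str, str]] = [
    ("images/*.jpg", "S3-image-bucket"),
    ("*.pdf", "S3-document-bucket"),
]
DEFAULT_ORIGIN = "web-server"  # stands in for the default (*) cache behavior


def pick_origin(path: str) -> str:
    """Return the origin ID of the first behavior whose pattern matches the path."""
    for pattern, origin_id in BEHAVIORS:
        if fnmatchcase(path, pattern):  # case-sensitive wildcard match
            return origin_id
    return DEFAULT_ORIGIN


for request_path in ["images/cat.jpg", "docs/guide.pdf", "index.html"]:
    print(request_path, "->", pick_origin(request_path))
```

Order matters in this sketch: the first matching pattern wins, and anything that matches nothing falls through to the default origin.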

20: S3 – Security & Encryption

https://www.udemy.com/aws-certified-solutions-architect-associate/learn/v4/t/lecture/4325300?start=0

Securing Your Buckets

  • By default all newly created buckets are PRIVATE
  • You can set up access control to your buckets using:
    • Bucket Policies – apply to all objects in the bucket
    • Access Control Lists – apply to individual objects
  • S3 buckets can be configured to create access logs which log all requests made to the S3 bucket.
    • Logs can be stored in another AWS bucket or account.

Encryption – Important!  Know all 4

  • In Transit – while data is being transferred back and forth
    • SSL / TLS
  • At Rest
    • Server Side Encryption
      • SSE-S3 Managed Keys – Server Side Encryption – S3
        • Each object encrypted with its own key and that key is encrypted with a master key that is regularly rotated by AWS.
        • Keys are handled by AWS and are compliant with AES-256
        • Click on the Object and click [Encrypt]
      • SSE-KMS – AWS Key Management Service
        • Similar to SSE-S3, but a few added benefits and charges.
        • Separate permissions for use of an envelope key.
          • Envelope key is used to protect your encryption key
          • Audit trail of when key was used and who used it.
          • Can create and manage your own keys
      • SSE-C – Customer provided keys
  • Client Side Encryption
    • Data is pre-encrypted prior to uploading to S3
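
The 'each object has its own key, and that key is encrypted with a master key' idea (envelope encryption) is easier to see in code. The sketch below shows only the structure. The cipher is a toy SHA-256 keystream XOR that I made up so the example runs with the standard library alone; it is NOT AES-256 and must not be used to protect real data. All function names are hypothetical.

```python
import hashlib
import secrets


def _keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR data with SHA-256(key || offset) blocks. NOT AES, demo only."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


def envelope_encrypt(master_key: bytes, plaintext: bytes) -> dict:
    """Encrypt data with a fresh data key, then wrap that key with the master key."""
    data_key = secrets.token_bytes(32)  # unique key for this object
    return {
        "ciphertext": _keystream_xor(data_key, plaintext),
        "wrapped_key": _keystream_xor(master_key, data_key),
    }


def envelope_decrypt(master_key: bytes, envelope: dict) -> bytes:
    """Unwrap the data key with the master key, then decrypt the data."""
    data_key = _keystream_xor(master_key, envelope["wrapped_key"])
    return _keystream_xor(data_key, envelope["ciphertext"])


master = secrets.token_bytes(32)
sealed = envelope_encrypt(master, b"payroll data for June, do not share")
print(sealed["ciphertext"].hex())
print(envelope_decrypt(master, sealed))
```

What stays the same across SSE-S3, SSE-KMS and SSE-C is this two-layer shape; what changes is who holds and manages the master key.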

Important to know

  • Use with Bucket Policies and Access Control Lists
  • Enable Logging
  • Secure data traveling to and from S3 using SSL

21: Storage Gateway

https://www.udemy.com/aws-certified-solutions-architect-associate/learn/v4/t/lecture/4325316?start=0

AWS Storage Gateway is a service that connects an on-premises software appliance with cloud-based storage to provide seamless and secure integration between an organization’s on-premises IT environment and AWS’s storage infrastructure.  The service enables you to securely store data to the AWS cloud for scalable and cost-effective storage.

AWS Storage Gateway’s software appliance is available for download as a virtual machine (VM) image that you install on a host in your datacenter.  Storage Gateway supports either VMware ESXi or Microsoft Hyper-V.  Once you’ve installed your gateway and associated it with your AWS account through the activation process, you can use the AWS Management Console to create the storage gateway option that is right for you.

4 Types of Storage Gateways

  • File Gateway (NFS)
    • Good for flat files
    • Files are stored as objects in your S3 buckets, accessed through a Network File System (NFS) mount point.  Ownership, permissions and timestamps are durably stored in S3 in the user-metadata of the object associated with the file.  Once objects are transferred to S3, they can be managed as native S3 objects, and bucket policies such as versioning, lifecycle management and cross-region replication apply directly to objects stored in your bucket.
  • Volume Gateway (iSCSI)
    • The volume interface presents your applications with disk volumes using the iSCSI block protocol.  Data written to these volumes can be asynchronously backed up as point-in-time snapshots of your volumes, and stored in the cloud as Amazon EBS (Elastic Block Storage) snapshots.  Snapshots are incremental backups that capture only changed blocks.  All snapshot storage is also compressed to minimize your storage charges.
    • In short, the Volume Gateway takes data that is stored on virtual hard disks on premise, and backs them up to the cloud.
    • Block based storage
      • Operating Systems
      • Applications
      • Virtual drives for VMs
      • Databases
      • Can be used for flat files, but not really designed for that.
    • 2 Different types of Volume Gateways
      • Stored Volumes
        • Store an entire copy of your data On-Premises, while asynchronously backing up that data to AWS.  Stored volumes provide your on-premises applications with low-latency access to their entire datasets, while providing durable, off-site backups.  You can create storage volumes and mount them as iSCSI devices from your on-premises application servers (web servers, database servers, etc.).  Data written to your stored volumes is stored on your on-premises storage hardware (physical disk).  This data is asynchronously backed up to S3 in the form of EBS snapshots. 1GB-16TB in size for Stored Volumes.
          • The storage volumes are NOT thin provisioned, so a 1TB drive will take 1TB space.
      • Cached Volumes
        • Cached volumes let you use S3 as your primary data storage while retaining frequently accessed data locally in your storage gateway.  Cached volumes minimize the need to scale your on-premises storage infrastructure, while still providing your applications with low-latency access to their frequently accessed data.  You can create storage volumes up to 32TB in size and attach to them as iSCSI devices from your on-premises application servers.  Your gateway stores data that you write to these volumes in S3 (EBS) and retains recently read data in your on-premises storage gateway’s cache and upload buffer storage. 1GB – 32TB in size for Cached Volumes.
          • Only recently accessed data is stored on-prem.
          • The remaining data backed off to S3 (EBS)
  • Tape Gateway (VTL – Virtual Tape Library)
    • Tape Gateway offers a durable, cost-effective solution to archive your data in the AWS Cloud.  The VTL interface it provides lets you leverage your existing tape-based backup application infrastructure to store data on virtual tape cartridges that you create on your tape gateway.  Each tape gateway is preconfigured with a media changer and tape drives, which are available to your existing client backup applications as iSCSI devices.  You add tape cartridges as you need to archive your data.  Supported by NetBackup, Backup Exec, Veeam, etc.
    • These are then sent to S3 and can use Lifecycle policies to send them over to Glacier.

For the exam

  • File Gateway – For flat files, stored directly on S3 (nothing on premises).
  • Volume Gateway
    • Stored Volumes – Entire Dataset is stored on site and is asynchronously backed up to S3 (Snapshots)
    • Cached Volumes – Entire Dataset is stored on S3 and the most frequently accessed data is cached on site.
  • Gateway Virtual Tape Library (VTL) – Used for backup with popular backup applications like NetBackup, Backup Exec, Veeam, etc.

3 ways to connect Storage Gateway to AWS

  • Direct Connect
  • Internet (most common)
  • Amazon VPC (Virtual Private Cloud)
    • This could mean that the Storage Gateway VM is actually sitting at AWS instead of locally.

22: Snowball

https://www.udemy.com/aws-certified-solutions-architect-associate/learn/v4/t/lecture/4325320?start=0

AWS Import/Export Disk accelerates moving large amounts of data into and out of the AWS cloud using portable storage devices for transport.  AWS Import/Export Disk transfers your data directly onto and off of storage devices using Amazon’s high-speed internal network and bypassing the Internet.

  • Snowball
    • On Board Storage Only
    • Petabyte-scale data transport solution that uses secure appliances to transfer large amounts of data into and out of AWS.  Using Snowball addresses common challenges with large-scale data transfers including high network costs, long transfer times, and security concerns.  Transferring data with Snowball is simple, fast, secure, and can be as little as one-fifth the cost of high-speed Internet.
    • 80TB Snowball in all regions (50TB in US).  Snowball uses multiple layers of security designed to protect your data including tamper-resistant enclosures, 256-bit encryption, and an industry-standard Trusted Platform Module (TPM) designed to ensure both security and full chain-of-custody of your data.  (You can track where your Snowball is at any given time.)  Once the data transfer job has been processed and verified, AWS performs a software erasure of the Snowball appliance.
  • Snowball Edge
    • 100TB device with On Board Storage and Compute capabilities.  (Like a mini AWS datacenter you can bring on premises.  You can even run Lambda functions on it!)
    • Use Snowball Edge to move large amounts of data into and out of AWS, as a temporary storage tier for large local datasets, or to support local workloads in remote or offline locations.
    • Snowball Edge connects to your existing applications and infrastructure using standard storage interfaces, streamlining the data transfer process and minimizing setup and integration.  Snowball Edge can cluster together to form a local storage tier and process your data on-premises, helping ensure your applications continue to run even when they are not able to access the cloud.
  • Snowmobile
    • Mounted on the back of a Semi.  Petabyte and Exabytes worth of data.
    • AWS Snowmobile is an Exabyte-scale data transfer service used to move extremely large amounts of data to AWS.  You can transfer up to 100PB per Snowmobile, a 45′ long ruggedized shipping container, pulled by a semi-trailer truck.  Snowmobile makes it easy to move massive volumes of data to the cloud, including video libraries, image repositories or even a complete data center migration.  Transferring data with Snowmobile is secure, fast and cost effective.
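
To see why shipping a box can beat the network, it helps to work out how long the same data would take over a normal link. This is back-of-the-envelope arithmetic only, assuming decimal units (1 TB = 10^12 bytes) and a link that runs flat out the whole time unless a lower `utilization` is given. The function name and the example link speeds are my own.

```python
def transfer_days(terabytes: float, link_mbps: float, utilization: float = 1.0) -> float:
    """Days needed to move `terabytes` over a link of `link_mbps` megabits per second."""
    bits = terabytes * 1e12 * 8                        # TB -> bytes -> bits
    seconds = bits / (link_mbps * 1e6 * utilization)   # Mbps -> bits per second
    return seconds / 86400                             # seconds -> days


print(f"80 TB over 100 Mbps : {transfer_days(80, 100):.1f} days")
print(f"80 TB over 1 Gbps   : {transfer_days(80, 1000):.1f} days")
print(f"100 PB over 10 Gbps : {transfer_days(100_000, 10_000):.0f} days")
```

One full 80TB Snowball is roughly 74 days of a saturated 100 Mbps line, and a full 100PB Snowmobile is about two and a half years of a saturated 10 Gbps line.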

Exam Tips

  • Understand what a Snowball is
  • Understand what Import Export is (Legacy application, but still available)
  • Snowball can
    • Import to S3
    • Export from S3
    • If using Glacier, the data will need to be moved to S3 first.
  • Snowball is located in the Migration category!

23: Snowball (Lab)

Umm.  Order, track, install, upload, ship back.

24: S3 Transfer Acceleration

https://www.udemy.com/aws-certified-solutions-architect-associate/learn/v4/t/lecture/4989308?start=0

S3 Transfer Acceleration utilizes the CloudFront Edge Network to accelerate your uploads to S3.  Instead of uploading directly to your S3 bucket, you can use a distinct URL to upload directly to an edge location (something that is closer to you) which will then transfer that file to S3.  You will get a distinct URL to upload to.

  • Bucket > Properties > Transfer Acceleration > Enable / Disable
  • There is a fee to use this service.
  • You can ‘test’ the speed difference before enabling this feature
  • Generally, the farther the user is from the bucket, the better the gains by using Transfer Acceleration
    • My tests from Austin to Singapore showed using T.A. as being SLOWER than not using it!

 

25: Create a static website using S3

  • S3 > Select Bucket > Properties >Static Website Hosting > [*] Use the bucket to host a website
  • This will display the ‘endpoint’ (domain name and path)
    • http://thomasandsofia.s3-website.us-east-2.amazonaws.com
    • Index document is required.  If missing, will display a 403 Access Denied error.
      • Adding an error document is optional?  Can you set an error doc only? (No!)
    • Public access is required.  If not set, will display a 403 error.
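
One common way to meet the public access requirement is a bucket policy that lets anyone read objects. The sketch below only builds the policy JSON, which could then be pasted into Permissions > Bucket Policy. The policy grammar (Version, Statement, Effect, Principal, Action, Resource) is standard IAM policy syntax, but the function name and the example bucket name are hypothetical.

```python
import json


def public_read_policy(bucket_name: str) -> str:
    """Return a bucket policy (as JSON text) that allows anyone to GET objects."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicReadGetObject",
                "Effect": "Allow",
                "Principal": "*",                              # anyone
                "Action": "s3:GetObject",                      # read objects only
                "Resource": f"arn:aws:s3:::{bucket_name}/*",   # every object in the bucket
            }
        ],
    }
    return json.dumps(policy, indent=2)


print(public_read_policy("example-website-bucket"))  # placeholder bucket name
```

The policy grants read access only, so visitors can fetch pages but cannot list, upload or delete anything.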

 

26: S3 Summary

https://www.udemy.com/aws-certified-solutions-architect-associate/learn/v4/t/lecture/4325548?start=0

  • S3 is Object Based
    • Good for Static (Flat) Files
      • PDF
      • Word
      • Images
      • Video
    • Not good for Dynamic (changing) Files
      • Operating systems
      • Executable
      • Database
      • Games
  • Files can be 0 to 5 TB size
  • No storage limits
  • All files stored in Buckets.
  • S3 is a “Universal” name space, so each bucket must have a unique, DNS appropriate name.  (all lower case, no spaces or special characters, etc.)
  • https://s3-region.amazonaws.com/bucketname
  • Consistency
    • New objects get ‘Read after Write’ consistency. (Immediately access the new file)
    • Eventual consistency for overwrites, updates, deletes, etc. due to propagation times.
      • different users might get different version of the file until propagation completes.
  • Storage Classes (Tiers)
    • S3 (a.k.a. S3 Standard)
      • Durable, Immediately available, frequently accessed
      • No charge to access the files.
      • 2 Nodes can fail and files still accessible (99.99)
    • S3 – IA (Infrequently Accessed)
      • Durable, immediately accessible, infrequently accessed
      • Incurs additional costs as files are accessed.
      • 2 Nodes can fail and files still accessible (99.99)
    • S3 – RRS (Reduced Redundancy Storage)
      • Good for items that can be easily reproduced (thumb nails)
      • Not good for mission critical data.
      • 1 node can fail and file still accessible (99.9)
      • Durability 99.99 (Not 99.999999999)
    • Glacier
      • Least expensive, very slow access times (may be several hours)
      • Great for archive data
      • Minimum storage time or extra fee for deleting.
  • Core Fundamentals
    • Key = File Name
    • Value = Actual file data
    • Version ID
    • Metadata – Data about data
    • Access Control Lists
  • Versioning
    • Stores ALL versions (including all writes, even if you delete an object.)
    • Great backup tool.  (Deletes not actually deleted)
    • Once enabled cannot be disabled, but can be suspended.
    • Integrates with Lifecycle Rules
    • Allows MFA (Multi Factor Authentication) delete capability to provide an additional layer of security.
    • Required for Cross Region Replication.
  • Lifecycle Management
    • Plays well with Versioning
      • Can have different ‘rules’ for current versions and previous versions.
    • Options
      • Move to S3-IA 30 days after creation (objects must be at least 128KB, or are charged as 128KB anyway)
      • Move to Glacier
        • 30 days after IA if relevant.
        • Next day if NOT using IA
      • Permanently Delete using rules.
  • CloudFront
    • Edge Location: Datacenter where the data is cached.  This is NOT necessarily related to an AWS Region / AZ
    • Origin: The initial location hosting the files that the CDN will distribute.  These can be an S3 Bucket, EC2 Instance, Elastic Load Balancer or Route53.
      • Multiple Origins are allowed per Distribution (if Web Distribution is selected)
    • Distribution: The name given to the CDN which consists of a collection of Edge Locations.
      • Web Distribution: Typically used for Websites
      • RTMP: (Real Time Messaging Protocol) Used for Media Streaming / Adobe Flash files.
    • Edge Locations not just READ ONLY, but can be used to write.
    • Objects cached for the life of the TTL
      • Extra charges to clear cached objects prior to their TTL.
  • Security
    • By default, all newly created buckets are PRIVATE
    • Access Controls can be setup using:
      • Bucket Policies (Bucket Level)
      • Access Control Lists (Object Level)
    • Can be configured to create Access Logs which log all requests made to the S3 Bucket.
      • Log files can be stored to an alternate S3 bucket.
        • In your own account
        • In someone else’s account!
  • Encryption
    • In Transit
      • SSL or TLS
    • At Rest
      • Server Side Encryption, Managed Keys: SSE-S3
        • Each object is encrypted with a unique key using strong multifactor encryption.
        • Key is also encrypted with a Master key that itself is frequently rotated (AES 256)
        • AWS manages all aspects of the keys
      • Key Management Service, Managed Keys: SSE-KMS
        • Added benefits and costs
        • Separate permissions for an Envelope Key (Key that protects the data’s encryption key)
        • Allows you to create an Audit trail of when the key was used and by whom.
      • Server Side Encryption with Customer Provided Keys: SSE-C
        • Customer manages all aspects of the key
    • Client Side Encryption
      • Files are encrypted BEFORE they are uploaded to S3
    • Encryption Client
      • Is this the same as Client Side Encryption????  It was not in the videos!!
      • Using an encryption client library, such as the Amazon S3 Encryption Client, you retain control of the keys and complete the encryption and decryption of objects client-side using an encryption library of your choice. Some customers prefer full end-to-end control of the encryption and decryption of objects; that way, only encrypted objects are transmitted over the Internet to Amazon S3.
  • Storage Gateway
    • Requires a VM to be installed ‘on prem’ to access S3
    • File Gateway (NFS)
      • Used for Flat files, stored directly on S3
    • Volume Gateway (iSCSI)
      • Stored Volumes: Entire Dataset is stored on site and is asynchronously backed up to S3.
        • Virtual Machines, OS, Databases, etc.
      • Cached Volumes: Entire Dataset stored on S3 and the most frequently accessed data is cached on prem.
    • Gateway Virtual Tape Library (VTL)
      • Used for backup with popular backup applications like NetBackup, Backup Exec, Veeam etc.
      • Store these virtual tapes in S3
  • Snowball
    • Snowball Standard: Pure Storage, 50TB Starting, 80TB standard
    • Snowball Edge: Storage plus Compute for Lambda functions (Mini version of an AWS Datacenter In A Box)
    • Snowmobile
      • Storage Container moved by a semi
      • Can have armored protection
      • Only available in the US and only contiguous states.
      • Petabytes worth of data
    • Snowball Exam Tips
      • Know what a Snowball is
      • Understand what Import Export is (See above somewhere)
      • Snowballs can
        • Import to S3
        • Export from S3
  • Transfer Acceleration
    • Used to speed up transfers to S3 by using Edge Locations
      • Best if upload source is located very far away from the destination S3 bucket.
    • Added costs to use
    • Sample test available to see if Transfer Acceleration is worth the money or not.
  • Static Websites
    • Must be enabled in the Bucket Properties.
    • Serverless, so no EC2 required
    • Only good for STATIC files.  HTML, Images, etc.
    • Cannot be used for PHP.
  • Closing Thoughts
    • Writes to S3 reply with HTTP 200 code if successful.
    • Upload faster by enabling multipart uploads
    • Read the S3 FAQ!  It comes up A LOT!!
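
The multipart upload tip is mostly about splitting a file into parts. The sketch below works out a part size and part count, assuming the S3 limits as I understand them: parts of at least 5 MiB (except the last) and at most 10,000 parts per upload. The 8 MiB default and the function name are my own choices, and nothing is uploaded.

```python
MIN_PART = 5 * 1024**2   # 5 MiB minimum part size (the last part may be smaller)
MAX_PARTS = 10_000       # maximum number of parts in one multipart upload


def _ceil_div(a: int, b: int) -> int:
    """Integer division rounded up, without floating point."""
    return -(-a // b)


def plan_parts(file_size: int, part_size: int = 8 * 1024**2) -> tuple:
    """Return (part_size, part_count), growing part_size if the file needs it."""
    part_size = max(part_size, MIN_PART, _ceil_div(file_size, MAX_PARTS))
    return part_size, max(1, _ceil_div(file_size, part_size))


for label, size in [("100 MiB", 100 * 1024**2), ("5 TiB", 5 * 1024**4)]:
    size_per_part, count = plan_parts(size)
    print(f"{label}: {count} parts of up to {size_per_part} bytes")
```

For the largest file used here (5 TiB) the default part size would need far more than 10,000 parts, so the function raises the part size until the file fits.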
