Kinesis


Kinesis Lecture

Video: https://www.udemy.com/aws-certified-solutions-architect-associate/learn/v4/t/lecture/6645692?start=0
FAQ (Kinesis Streams): https://aws.amazon.com/kinesis/streams/faqs/
FAQ (Kinesis Firehose): https://aws.amazon.com/kinesis/firehose/faqs/
FAQ (Kinesis Analytics): https://aws.amazon.com/kinesis/analytics/faqs/
Kinesis Streams vs Firehose: https://www.sumologic.com/blog/devops/kinesis-streams-vs-firehose/

What is Streaming Data

Streaming data is data generated continuously by thousands of data sources, which typically send their records simultaneously and in small sizes (on the order of kilobytes).  Examples include:

  • Purchases from online stores (amazon.com)
  • Stock Prices
  • Game data
  • Social Network data
  • Geospatial data (uber.com)
    • Constantly updating where your Uber driver is.
  • IoT sensor data

What is Kinesis

Amazon Kinesis is a platform on AWS that you send your streaming data to.  Kinesis makes it easy to load and analyze streaming data, and it also lets you build your own custom applications for your business needs.

Amazon Kinesis is a fully managed service for real-time processing of streaming data at massive scale.  You can configure hundreds of thousands of data producers to continuously put data into an Amazon Kinesis stream.

For example, producers can send data from website clickstreams, application logs, and social media feeds.  Within less than a second, the data is available for your Amazon Kinesis applications to read and process from the stream.

  • Used to consume big data.
  • Stream large amounts of social media, news feeds, logs, etc. into the cloud.
  • How you process this data depends on the use case:
    • Business intelligence?  Redshift
    • Big data processing?  Elastic MapReduce (EMR)

Core Kinesis Services – Know These!

Kinesis Streams

  • Producers send data to Kinesis Streams
    • EC2, Cell Phones, Laptops, Physical Servers
  • Data is stored in ‘Shards’
    • 24 hour retention by default
    • Can be increased to 7 days (the limit when this lecture was recorded; AWS now allows up to 365 days)
  • Consumers (usually EC2 instances) take the data from the shards and turn it into something useful.
  • Consumers then often (but not necessarily) send that data to a database or some other storage system
    • DynamoDB
    • S3
    • EMR (Elastic MapReduce)
    • Redshift (Data Warehouse)
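
To make the producer/consumer flow above a bit more concrete, here is a minimal sketch in Python.  It assumes the boto3 Kinesis client (passed in as `kinesis`), and the stream name, event fields, and partition key in the usage comment are placeholders I made up, not anything from the lecture.  It reads a single batch from the start of one shard; a real consumer would keep polling with the next iterator.

```python
import json


def send_event(kinesis, stream_name, event, partition_key):
    """Producer side: put one JSON-encoded record onto the stream."""
    response = kinesis.put_record(
        StreamName=stream_name,
        Data=json.dumps(event).encode("utf-8"),
        PartitionKey=partition_key,  # decides which shard receives the record
    )
    return response["ShardId"], response["SequenceNumber"]


def read_shard(kinesis, stream_name, shard_id, limit=100):
    """Consumer side: read one batch of records from the start of a shard."""
    iterator = kinesis.get_shard_iterator(
        StreamName=stream_name,
        ShardId=shard_id,
        ShardIteratorType="TRIM_HORIZON",  # oldest record still retained
    )["ShardIterator"]
    batch = kinesis.get_records(ShardIterator=iterator, Limit=limit)
    return [json.loads(record["Data"]) for record in batch["Records"]]


# Usage against a real stream (needs boto3, AWS credentials, and an existing
# stream; "my-stream" and the event fields are placeholder values):
#   import boto3
#   kinesis = boto3.client("kinesis", region_name="us-east-1")
#   shard_id, _ = send_event(kinesis, "my-stream", {"driver": 7}, "driver-7")
#   print(read_shard(kinesis, "my-stream", shard_id))
```

Passing the client in as a parameter keeps the functions easy to try against a stub before pointing them at a real stream.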

Shards

  • Each shard supports up to 5 transactions per second for reads, up to a maximum total data read rate of 2 MB per second
  • Each shard supports up to 1,000 records per second for writes, up to a maximum total data write rate of 1 MB per second (including partition keys)
  • The data capacity of your stream is a function of the number of shards that you specify for the stream.  The total capacity of the stream is the sum of the capacities of its shards.
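
Since total stream capacity is just the sum of the per-shard limits, you can estimate how many shards a workload needs by dividing each requirement by its per-shard limit and taking the largest result.  The sketch below does that arithmetic using the limits listed above; the function name and the sample workload numbers are my own, for illustration only.

```python
import math

# Per-shard limits from the list above.
WRITE_MB_PER_SEC = 1.0
WRITE_RECORDS_PER_SEC = 1000
READ_MB_PER_SEC = 2.0


def shards_needed(write_mb_per_sec, records_per_sec, read_mb_per_sec):
    """Smallest shard count that satisfies every per-shard limit."""
    return max(
        1,  # a stream always has at least one shard
        math.ceil(write_mb_per_sec / WRITE_MB_PER_SEC),
        math.ceil(records_per_sec / WRITE_RECORDS_PER_SEC),
        math.ceil(read_mb_per_sec / READ_MB_PER_SEC),
    )


# Hypothetical workload: 10 MB/s written as 5,000 records/s, and 20 MB/s read
# in total across all consumers.
print(shards_needed(10, 5000, 20))  # 10
```

In this example the write bandwidth and the read bandwidth both call for 10 shards, while the record count alone would only need 5.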

Kinesis Firehose

Firehose is much more automated than Streams

  • Producers send data to Kinesis Firehose
    • EC2, Cell Phones, Laptops, Physical Servers
  • Data is sent to Firehose
    • No retention period.  Data is processed immediately
      • Data can optionally be analysed or transformed by a Lambda function
      • Data is sent to some sort of storage
        • If using Redshift, the data must be sent to S3 first (Firehose then copies it from S3 into Redshift).
      • Can be sent to an Elasticsearch cluster
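
For the optional Lambda step, Firehose hands the function a batch of base64-encoded records and expects each one back with the same `recordId` and a `result` of `Ok`, `Dropped`, or `ProcessingFailed`.  The sketch below follows that record format; the JSON payload shape, the `processed` flag, and the rule that drops "heartbeat" records are assumptions I invented to have something to show.

```python
import base64
import json


def lambda_handler(event, context):
    """Transform a batch of Firehose records before delivery."""
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            if not isinstance(payload, dict):
                raise ValueError("expected a JSON object")
        except ValueError:
            # Could not decode or parse: hand the record back as failed.
            output.append({"recordId": record["recordId"],
                           "result": "ProcessingFailed",
                           "data": record["data"]})
            continue

        if payload.get("type") == "heartbeat":  # hypothetical filter rule
            output.append({"recordId": record["recordId"],
                           "result": "Dropped",
                           "data": record["data"]})
            continue

        payload["processed"] = True  # hypothetical enrichment
        line = json.dumps(payload) + "\n"  # newline keeps records separable in S3
        output.append({"recordId": record["recordId"],
                       "result": "Ok",
                       "data": base64.b64encode(line.encode("utf-8")).decode("ascii")})
    return {"records": output}
```

Every incoming `recordId` appears exactly once in the response, which is how Firehose matches transformed records to the originals.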

Kinesis Analytics

  • Runs SQL queries against the streaming data, then sends the results to whichever storage solution you choose (S3, Redshift, Elasticsearch cluster, etc.)
  • Can be run against both Kinesis Streams and Kinesis Firehose.

 

Kinesis Lab

 

I got nothing out of this lab whatsoever.

 

Before getting started

  1. Download the file located here and rename it kinesis-data-vis-sample-app.template
  2. Change the AWS region to N. Virginia (us-east-1).  This lab didn’t work for me in Ohio (us-east-2).
  • Management Tools > CloudFormation > [Create new stack]
    • (*) Upload a template to Amazon S3, then choose the template file you just downloaded.
    • [Next]
  • Specify Details
    • Stack Name: MyKinesisStack
  • Parameters
    • Leave everything default
    • [Next]
  • Options
    • Leave everything default
    • [Next]
  • [x] I acknowledge that AWS CloudFormation might create IAM resources.
    • [Create]
    • The creation process may take 10-15 minutes.  Get some coffee
  • Click the {Outputs} tab and click the URL link to view the output.
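
If you would rather script the console steps above, the same stack can probably be launched with boto3.  This is a rough sketch, not something from the lab: it assumes the boto3 CloudFormation client (passed in as `cfn`) and that the template is small enough to pass inline as `TemplateBody` (larger templates have to be uploaded to S3 and passed as `TemplateURL` instead).  The stack name and file name are the ones used in the steps above.

```python
def launch_stack(cfn, stack_name, template_body):
    """Create the stack, wait until it is ready, and return its outputs."""
    cfn.create_stack(
        StackName=stack_name,
        TemplateBody=template_body,
        # Same as ticking "I acknowledge that AWS CloudFormation might
        # create IAM resources" in the console.
        Capabilities=["CAPABILITY_IAM"],
    )
    # Blocks until creation finishes (the 10-15 minute coffee break).
    cfn.get_waiter("stack_create_complete").wait(StackName=stack_name)
    stack = cfn.describe_stacks(StackName=stack_name)["Stacks"][0]
    # Equivalent of the {Outputs} tab: a dict of output key -> value.
    return {o["OutputKey"]: o["OutputValue"] for o in stack.get("Outputs", [])}


# Usage (needs boto3, AWS credentials, and the downloaded template file):
#   import boto3
#   cfn = boto3.client("cloudformation", region_name="us-east-1")
#   with open("kinesis-data-vis-sample-app.template") as f:
#       print(launch_stack(cfn, "MyKinesisStack", f.read()))
```

The returned dictionary should contain the URL you would otherwise click in the {Outputs} tab.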

Exam Tips

  • Know the differences between Kinesis Streams and Kinesis Firehose
    • Will be given different scenario questions and you must choose the most relevant service.
      • If the question mentions ‘Shards’, the answer is Streams.
      • If the question mentions ‘Lambda’, the answer is Firehose.
  • Understand what Kinesis Analytics is.

 

 
