
Monitoring performance

1. Monitoring performance

Welcome back!

2. Last lesson

In the last lesson, we created our Elasticsearch instance.

3. Lambda transform

Now we can create the Lambda transform function that will run each tweet against the Amazon Comprehend API.

4. Lambda transform

The code looks like a regular Lambda transform function. We initialize the boto3 Comprehend client in the handler, create a list to store our results, and then loop over the incoming records.

5. Lambda transform

We load the data and decode it from base64. Then we call the detect_sentiment() method of the boto3 Comprehend client. Since the Twitter data from our query comes in English and Spanish, we pass the language code straight from Twitter.

6. Lambda transform

Then we put it all together in a record that matches the format Firehose expects to receive from Lambda, and return the results once we're done looping through the records.
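
Putting slides 4 through 6 together, here is a minimal sketch of such a transform function. It assumes each payload is JSON with 'text' and 'lang' fields, as in the raw Twitter API output; adjust the field names to whatever your Firehose records actually contain.

```python
import base64
import json

import boto3


def lambda_handler(event, context):
    # Initialize the boto3 Comprehend client in the handler
    comprehend = boto3.client('comprehend')
    output = []

    # Loop over the incoming Firehose records
    for record in event['records']:
        # Load the data and decode it from base64
        payload = base64.b64decode(record['data'])
        tweet = json.loads(payload)

        # Call detect_sentiment(), passing the language code straight from Twitter
        sentiment = comprehend.detect_sentiment(
            Text=tweet['text'],
            LanguageCode=tweet['lang']
        )
        tweet['sentiment'] = sentiment['Sentiment']

        # Put it all together in the record format Firehose expects back from Lambda
        output.append({
            'recordId': record['recordId'],
            'result': 'Ok',
            'data': base64.b64encode(json.dumps(tweet).encode('utf-8')).decode('utf-8')
        })

    return {'records': output}
```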

7. Wiring it up

Now all that's left is to wire everything up.

8. Update firehoseDeliveryRole

First, let's update firehoseDeliveryRole to have full access to Elasticsearch.
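
In the console this amounts to attaching a policy to the role; a rough boto3 equivalent, assuming the AWS managed AmazonESFullAccess policy is what grants that access, would be:

```python
import boto3

iam = boto3.client('iam')

# Attach the managed Elasticsearch full-access policy to the delivery role
iam.attach_role_policy(
    RoleName='firehoseDeliveryRole',
    PolicyArn='arn:aws:iam::aws:policy/AmazonESFullAccess'
)
```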

9. Create delivery stream

When we create the new Firehose delivery stream, we repeat all the steps from Chapter 1, except this time we select Elasticsearch as the destination. We choose the Elasticsearch domain we created as the destination domain. Lastly, we specify the index, which acts like a table where all our tweets get written.

10. Create delivery stream

We will also specify the S3 bucket where our tweets will get backed up.
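
For reference, the same setup could be scripted roughly like this with boto3. The stream name, ARNs, index, and bucket below are placeholders rather than values from the course.

```python
import boto3

firehose = boto3.client('firehose')

firehose.create_delivery_stream(
    DeliveryStreamName='tweets-to-elasticsearch',   # placeholder stream name
    DeliveryStreamType='DirectPut',
    ElasticsearchDestinationConfiguration={
        'RoleARN': 'arn:aws:iam::123456789012:role/firehoseDeliveryRole',
        'DomainARN': 'arn:aws:es:us-east-1:123456789012:domain/tweets-domain',
        'IndexName': 'tweets',            # the index acts like a table for our tweets
        'S3BackupMode': 'AllDocuments',   # back up every tweet to S3
        'S3Configuration': {
            'RoleARN': 'arn:aws:iam::123456789012:role/firehoseDeliveryRole',
            'BucketARN': 'arn:aws:s3:::tweets-backup-bucket'
        },
        # Wire in the Lambda transform function from the previous slides
        'ProcessingConfiguration': {
            'Enabled': True,
            'Processors': [{
                'Type': 'Lambda',
                'Parameters': [{
                    'ParameterName': 'LambdaArn',
                    'ParameterValue': 'arn:aws:lambda:us-east-1:123456789012:function:tweet-sentiment-transform'
                }]
            }]
        }
    }
)
```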

11. CloudWatch

Let's make sure everything is running smoothly so we meet our requirement of losing as little data as possible. To do that, we're going to use the AWS CloudWatch service. CloudWatch lets us monitor numerous metrics for every AWS service. There are four crucial components to CloudWatch that build on top of each other. Logs are the raw data sent to CloudWatch by AWS services. These are analyzed to produce metrics, which are measures of a service's various activities. Metrics can be used to trigger alarms, notifications that fire when a metric moves outside a specified range. Lastly, metrics can be visualized in dashboards.
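
As a small illustration of metrics feeding alarms, here is a sketch that alarms when no records arrive on the delivery stream. The stream name is the same placeholder as above, and the threshold and period are purely illustrative.

```python
import boto3

cloudwatch = boto3.client('cloudwatch')

# Alarm if no records arrive on the delivery stream within a 5-minute window
cloudwatch.put_metric_alarm(
    AlarmName='no-incoming-tweets',
    Namespace='AWS/Firehose',
    MetricName='IncomingRecords',
    Dimensions=[{'Name': 'DeliveryStreamName', 'Value': 'tweets-to-elasticsearch'}],
    Statistic='Sum',
    Period=300,
    EvaluationPeriods=1,
    Threshold=1,
    ComparisonOperator='LessThanThreshold'
)
```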

12. Failure points

When I think about what to monitor, I like to focus on failure points: the places in the pipeline where things could go wrong and where a failure would have the most effect. Our architecture's failure points are highlighted in blue. CloudWatch can help us monitor hundreds of different metrics across these services. We'll focus on making sure that data gets written to the stream and that Elasticsearch is receiving records.
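
A quick way to check both failure points is to pull the relevant Firehose metrics. This sketch sums IncomingRecords and DeliveryToElasticsearch.Records over the last hour, again using the placeholder stream name.

```python
from datetime import datetime, timedelta

import boto3

cloudwatch = boto3.client('cloudwatch')


def records_in_last_hour(metric_name, stream_name='tweets-to-elasticsearch'):
    """Sum a Firehose metric for the given delivery stream over the last hour."""
    stats = cloudwatch.get_metric_statistics(
        Namespace='AWS/Firehose',
        MetricName=metric_name,
        Dimensions=[{'Name': 'DeliveryStreamName', 'Value': stream_name}],
        StartTime=datetime.utcnow() - timedelta(hours=1),
        EndTime=datetime.utcnow(),
        Period=3600,
        Statistics=['Sum']
    )
    return sum(point['Sum'] for point in stats['Datapoints'])


# Is data getting written to the stream, and is Elasticsearch receiving records?
print('Records written to the stream:', records_in_last_hour('IncomingRecords'))
print('Records delivered to Elasticsearch:', records_in_last_hour('DeliveryToElasticsearch.Records'))
```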

13. Let's practice!