
Serverless APIs

1. Serverless APIs

Welcome to Lesson 4 of Chapter 2!

2. Last lesson

In the last lesson, we set up a streaming data pipeline that takes data from vehicles, finds speeders, and generates a daily report.

3. Giving access to data

We can give Cody access to download these reports in S3. However, as a data engineer, you will find yourself thinking about how to give access to data, or subsets of it, programmatically. APIs allow an external developer to write code that interacts with your data in a programmatic, controlled way.

4. Simple API

Cody decided to integrate the speeding data into HR's employee performance system. She hired a developer to build a system that will use an API to get the speeding information for a certain date.

5. Simple API

Before serverless, standing up an API meant provisioning a web server, writing an app, and doing a lot of setup work. Today, we can make a quick API using Lambda! Let's create a new Lambda function using Python 3.8 and call it speederReporterAPI.

6. Simple API

Next, add the AWS Data Wrangler layer to the function.

7. Add a trigger

Create a trigger using API Gateway. Choose HTTP API with Open security, name it getSpeedersByDate, and click Add.

8. Test your new API

In the main function view, you will see a new API and a URL. If you open that URL in a new window, you will see "Hello From Lambda".

9. Our first API

That's what the function's default handler code returns. We just made our first API in under five minutes!
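For reference, the starter code that the Lambda console generates for a new Python function looks roughly like the sketch below. The exact greeting text may differ slightly between console versions.

```python
import json


def lambda_handler(event, context):
    # Return a fixed greeting regardless of what the incoming event contains.
    return {
        "statusCode": 200,
        "body": json.dumps("Hello from Lambda!"),
    }
```

API Gateway turns the returned dictionary into the HTTP response, which is why the greeting shows up in the browser.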

10. API parameters

Let's create a sample request to test with. We will simulate someone hitting the API with a request for a specific date.
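A minimal sketch of what such a test event might contain. A real API Gateway event carries many more fields; only the one our handler reads is shown. The parameter name `date`, its value, and the helper `get_filter_date` are assumptions made for illustration.

```python
# Hypothetical test event: only the field our handler reads is included.
# The parameter name "date" and the value are assumptions for illustration.
sample_event = {
    "queryStringParameters": {"date": "2020-10-01"},
}


def get_filter_date(event):
    """Return the requested date string, or None if it was not supplied."""
    # The key can be missing or None when the URL has no query string.
    params = event.get("queryStringParameters") or {}
    return params.get("date")
```

Pasting the same structure as JSON into the Lambda console's test dialog lets us exercise the function without going through the URL.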

11. Lambda handler

Now, for the function code. We import the dependencies and initialize the boto3 session.

12. Lambda handler

In the handler method we get the filter date from queryStringParameters, and use wrangler to read the CSV file.
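The two steps above could be sketched as follows. The bucket name, the folder, the one-file-per-day naming scheme, and the `date` parameter name are all assumptions, not the course's actual values. `wr.s3.read_csv` is the AWS Data Wrangler call for reading a CSV from S3 into a pandas DataFrame.

```python
def report_path(filter_date, bucket="my-speeder-bucket", prefix="reports"):
    """Build the S3 path of a daily report (hypothetical naming scheme)."""
    return f"s3://{bucket}/{prefix}/{filter_date}.csv"


def read_report(filter_date):
    """Read one daily report from S3 into a pandas DataFrame."""
    # Imported here so the path helper can be tried without AWS libraries.
    # In a deployed Lambda, the imports and session would normally sit at
    # module level so they are reused across invocations.
    import awswrangler as wr
    import boto3

    session = boto3.Session()
    return wr.s3.read_csv(path=report_path(filter_date), boto3_session=session)


def load_speeders(event):
    """Pull the filter date out of the request and load that day's report."""
    params = event.get("queryStringParameters") or {}
    filter_date = params.get("date")  # "date" is an assumed parameter name
    return read_report(filter_date)
```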

13. Respond with data

Finally, we return the result in an object containing a status code of 200 (OK), a Content-Type header specifying that we're sending JSON, and the DataFrame converted to JSON.
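A sketch of that response object. The helper name `build_response` and the example column names are hypothetical. To keep the sketch free of AWS and pandas dependencies, it takes an already-serialized JSON string.

```python
import json


def build_response(body_json):
    """Wrap an already-serialized JSON string in an API Gateway response."""
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": body_json,
    }


# With a pandas DataFrame `df`, the body could come from
# df.to_json(orient="records"); a literal stands in for it here.
# The column names are made up for illustration.
example = build_response(json.dumps([{"vehicle_id": "v1", "speed": 72}]))
```

The body has to be a string, which is why the DataFrame is converted to JSON before it goes into the response.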

14. Live response

Visiting the URL in the browser gets a full JSON response that the HR developer can use.
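From the HR developer's side, calling the API might look like the sketch below. The base URL is whatever API Gateway shows for the function, the `date` parameter name is an assumption, and both function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_url(base_url, date):
    """Append the date filter to the API URL as a query string."""
    return f"{base_url}?{urlencode({'date': date})}"


def fetch_speeders(base_url, date):
    """Call the API and decode the JSON body it returns."""
    with urlopen(build_url(base_url, date)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```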

15. Trigger another Lambda

If the request for data is for today, let's recalculate the data and create a speeder file - even if it's not midnight yet. To do this, we will need to trigger speederAggregator from speederReporterAPI.
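The "is the request for today?" check could be sketched like this, assuming dates arrive as `YYYY-MM-DD` strings and that "today" means the current UTC date. Both are assumptions; the course may use a different format or time zone. The helper name `is_today` is hypothetical.

```python
from datetime import date, datetime, timezone


def is_today(filter_date, today=None):
    """Return True when filter_date (YYYY-MM-DD) matches the current UTC date."""
    if today is None:
        today = datetime.now(timezone.utc).date()
    return filter_date == today.isoformat()
```

Passing `today` explicitly, for example `is_today("2020-10-01", today=date(2020, 10, 1))`, makes the check easy to test.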

16. Trigger another Lambda

To do this, let's define a method called trigger_recalc. It will get called by speederReporterAPI if the date requested is today. First, let's create a lambda client.

17. speederAggregator ARN

We get the ARN of speederAggregator at the top of the main Lambda function screen.

18. Invoke

Then we call the Lambda client's invoke() method. We pass speederAggregator's ARN in the FunctionName parameter and specify "Event" as the InvocationType to call the function asynchronously. If we used RequestResponse, the default, the requester would have to wait for speederAggregator to complete before receiving the response from speederReporterAPI. This way, the next request the developer makes for today will have the freshest data, because speederAggregator will have run in the background.
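A minimal sketch of trigger_recalc under a few assumptions: the ARN below is a placeholder, and the client is passed in as an argument (rather than created inside the function) so the call can be tried with a stand-in object. `make_lambda_client` is a hypothetical helper; `invoke`, `FunctionName`, and `InvocationType` are the boto3 names.

```python
# Placeholder ARN: copy the real one from the speederAggregator function screen.
AGGREGATOR_ARN = "arn:aws:lambda:us-east-1:123456789012:function:speederAggregator"


def make_lambda_client():
    """Create the boto3 Lambda client (boto3 ships with the Lambda Python runtime)."""
    import boto3

    return boto3.client("lambda")


def trigger_recalc(lambda_client, function_arn=AGGREGATOR_ARN):
    """Start speederAggregator without waiting for it to finish."""
    response = lambda_client.invoke(
        FunctionName=function_arn,
        InvocationType="Event",  # asynchronous; "RequestResponse" would block
    )
    # Lambda answers an accepted asynchronous invocation with HTTP 202.
    return response["StatusCode"]
```

Inside the handler this would be called as `trigger_recalc(make_lambda_client())`. The function's execution role also needs permission to invoke speederAggregator.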

19. Review

Whew! We've learned a lot in these two chapters. We started off using Firehose to collect data from vehicle sensors and store it in a bucket.

20. Review

Next, we used the incoming data to trigger an SMS message, and write out a subset of speeders to a different folder.

21. Review

Then, we set up a nightly lambda function that aggregates all the speeders into a daily report.

22. Review

Finally, we built an API that a developer can use to integrate with an HR system, and trigger another lambda function to recalculate available data.

23. Let's practice!

Let's practice our new data engineering skills!
