Serverless APIs
1. Serverless APIs
Welcome to Lesson 4 of Chapter 2!
2. Last lesson
In the last lesson, we set up a streaming data pipeline that takes data from vehicles, finds speeders, and generates a daily report.
3. Giving access to data
We can give Cody access to download these reports from S3. However, as a data engineer, you will often need to think about how to give access to data, or subsets of it, programmatically. APIs allow an external developer to write code that interacts with your data in a programmatic, controlled way.
4. Simple API
Cody decided to integrate the speeding data into HR's employee performance system. She hired a developer to build a system that will use an API to get the speeding information for a certain date.
5. Simple API
Before serverless, standing up an API meant provisioning a web server, writing an app, and doing a good deal of setup work. Today, we can make a quick API using Lambda! Let's create a new Lambda function using Python 3.8 and call it speederReporterAPI.
6. Simple API
Add the AWS Data Wrangler layer.
7. Add a trigger
Create a trigger using API Gateway. Choose HTTP API with Open security, call it getSpeedersByDate, and click Add.
8. Test your new API
In the main function view, you will see a new API and a URL. If you open that URL in a new window, you will see "Hello From Lambda".
9. Our first API
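For reference, the starter code that the Lambda console generates for a new Python function looks roughly like the sketch below. The exact template may differ between runtime versions, so treat it as illustrative only.

```python
import json


def lambda_handler(event, context):
    # Starter-style handler: ignore the incoming event and return a fixed greeting.
    return {
        "statusCode": 200,
        # json.dumps wraps the string in double quotes, which is why the
        # browser shows the greeting as a quoted JSON string.
        "body": json.dumps("Hello from Lambda!"),
    }
```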
That's what our callback code returns. We just made our first API in under 5 minutes!
10. API parameters
Let's create a sample request to test with. We will simulate someone hitting the API with a request for a specific date.
11. Lambda handler
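A sample test event for the request described above might look like the sketch below. API Gateway passes query-string values to the function under queryStringParameters. The parameter name date and the value shown are assumptions made for illustration, and a real event carries many more fields. In the Lambda console the same structure would be entered as JSON.

```python
# Hypothetical, trimmed-down test event simulating a GET request such as
#   <api-url>?date=2020-05-01
# Only the field the handler reads is included.
sample_event = {
    "queryStringParameters": {
        "date": "2020-05-01",  # assumed parameter name and example value
    }
}

# The handler can pull the requested date out like this.
filter_date = sample_event["queryStringParameters"]["date"]
print(filter_date)
```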
Now, for the function code. We import the dependencies and initialize the boto3 session.
12. Lambda handler
In the handler method, we get the filter date from queryStringParameters and use wrangler to read the CSV file.
13. Respond with data
Finally, we return the result in an object. It contains a status code of 200 (OK), a Content-Type header specifying that we're sending JSON, and the DataFrame converted to JSON.
14. Live response
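Putting the handler steps from the previous slides together, a minimal sketch might look like this. The bucket name, key layout, and the date parameter name are hypothetical placeholders, not the course's actual values. The 400 branch and the injectable read_csv argument are also additions for illustration, so the logic can be tried locally without the AWS Data Wrangler layer.

```python
import json

# Hypothetical location of the daily reports; replace with the real bucket and prefix.
REPORT_PATH_TEMPLATE = "s3://example-speeder-bucket/reports/{date}.csv"


def _default_read_csv(path):
    # Imported lazily so this module also loads where the layer is not installed.
    import boto3
    import awswrangler as wr

    session = boto3.Session()
    return wr.s3.read_csv(path, boto3_session=session)


def lambda_handler(event, context, read_csv=_default_read_csv):
    # queryStringParameters can be missing or None when no query string is sent.
    params = event.get("queryStringParameters") or {}
    filter_date = params.get("date")  # assumed parameter name
    if not filter_date:
        return {
            "statusCode": 400,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps({"error": "missing 'date' query parameter"}),
        }

    # Read the report for the requested date into a DataFrame.
    df = read_csv(REPORT_PATH_TEMPLATE.format(date=filter_date))

    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        # One JSON object per row of the report.
        "body": df.to_json(orient="records"),
    }
```

Lambda itself only passes event and context, so read_csv falls back to the wrangler-based reader when the function is deployed.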
Visiting the URL in the browser gets a full JSON response that the HR developer can use.
15. Trigger another Lambda
If the request for data is for today, let's recalculate the data and create a speeder file, even if it's not midnight yet. This means we need to trigger speederAggregator from speederReporterAPI.
16. Trigger another Lambda
To do this, let's define a method called trigger_recalc. It will get called by speederReporterAPI if the date requested is today. First, let's create a Lambda client.
17. speederAggregator ARN
We get the ARN of speederAggregator at the top of the main Lambda function screen.
18. Invoke
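A minimal sketch of trigger_recalc and the check that calls it, assuming a placeholder ARN (copy the real one from the speederAggregator function screen) and a requested date in YYYY-MM-DD form. The should_recalc helper and the injectable lambda_client argument are hypothetical additions so the logic can be exercised without AWS credentials.

```python
from datetime import date

# Placeholder ARN; the region and account ID here are made up.
SPEEDER_AGGREGATOR_ARN = (
    "arn:aws:lambda:us-east-1:123456789012:function:speederAggregator"
)


def should_recalc(filter_date, today=None):
    # True when the requested date (YYYY-MM-DD string) is today's date.
    # date.today() uses the runtime's local clock, which is UTC in Lambda.
    today = today or date.today()
    return filter_date == today.isoformat()


def trigger_recalc(lambda_client=None):
    if lambda_client is None:
        # Imported lazily so the sketch can be tested without boto3 installed.
        import boto3

        lambda_client = boto3.client("lambda")

    response = lambda_client.invoke(
        FunctionName=SPEEDER_AGGREGATOR_ARN,
        # "Event" queues the invocation and returns without waiting for
        # speederAggregator to finish.
        InvocationType="Event",
    )
    # Asynchronous invocations report HTTP status 202 (accepted).
    return response["StatusCode"]
```

Inside the API handler, the call would then be guarded with something like `if should_recalc(filter_date): trigger_recalc()`.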
Then we call the Lambda client's invoke() method. We pass speederAggregator's ARN in the FunctionName parameter. We specify "Event" as the invocation type to call the function asynchronously. If we were to use RequestResponse, the default, the requester would have to wait for speederAggregator to complete before receiving the response from speederReporterAPI. This way, the next request the developer makes for today will have the freshest data, because speederAggregator will have run in the background.
19. Review
Whew! We've learned a lot in these 2 chapters. We started off using Firehose to collect data from vehicle sensors and store it in a bucket.
20. Review
Next, we used the incoming data to trigger an SMS message and write out a subset of speeders to a different folder.
21. Review
Then, we set up a nightly Lambda function that aggregates all the speeders into a daily report.
22. Review
Finally, we built an API that a developer can use to integrate with an HR system, and triggered another Lambda function to recalculate available data.
23. Let's practice!
Let's practice our new data engineering skills!