1. Delivering data from Kinesis Analytics
Head over to the Kinesis console, then open our Analytics application.
Select Connect Reference Data. Choose sd-vehicle-data for our bucket, and provide the path to our reference data file.
Let's refer to this data as SENSORS_REF in SQL.
Edit the schema to uppercase the column names.
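The console makes an API call behind the scenes for this step. For reference, here is a minimal boto3 sketch of the equivalent call; the application name matches the gpsAnalytics app used later in this lesson, but the file key, role ARN, column names, and SQL types are assumptions for illustration, so adjust them to your own setup.

```python
import boto3

kda = boto3.client("kinesisanalytics")

kda.add_application_reference_data_source(
    ApplicationName="gpsAnalytics",          # our analytics application
    CurrentApplicationVersionId=1,           # must match the current app version
    ReferenceDataSource={
        "TableName": "SENSORS_REF",          # the name we will use in SQL
        "S3ReferenceDataSource": {
            "BucketARN": "arn:aws:s3:::sd-vehicle-data",
            "FileKey": "reference/sensors.csv",   # hypothetical object key
            "ReferenceRoleARN": "arn:aws:iam::123456789012:role/KinesisAnalyticsRole",
        },
        "ReferenceSchema": {
            "RecordFormat": {
                "RecordFormatType": "CSV",
                "MappingParameters": {
                    "CSVMappingParameters": {
                        "RecordRowDelimiter": "\n",
                        "RecordColumnDelimiter": ",",
                    }
                },
            },
            # Uppercased column names, matching the schema edit above
            "RecordColumns": [
                {"Name": "SENSORID", "SqlType": "VARCHAR(32)"},
                {"Name": "VIN", "SqlType": "VARCHAR(32)"},
            ],
        },
    },
)
```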
Then exit, and go to the SQL console.
Update the SQL query to join the streaming data with the reference table, then save and run it.
We will now see each sensor with its ping count and the Vehicle Identification Number (VIN) joined in from our reference data!
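As a rough sketch, the joined query might look like the following, applied here through boto3's update_application call rather than the SQL console. The source stream name, column names, window length, and over-ping threshold are assumptions and may differ from the query built in the earlier lesson.

```python
import boto3

# Tumbling-window ping count per sensor, joined against the SENSORS_REF
# reference table. SOURCE_SQL_STREAM_001, the column names, and the
# threshold of 10 pings per minute are assumptions for this sketch.
JOIN_SQL = """
CREATE OR REPLACE STREAM "DESTINATION_SQL_STREAM" (
    "sensorId"  VARCHAR(32),
    "pingCount" INTEGER,
    "VIN"       VARCHAR(32));

CREATE OR REPLACE PUMP "STREAM_PUMP" AS
  INSERT INTO "DESTINATION_SQL_STREAM"
  SELECT STREAM s."sensorId", COUNT(*) AS "pingCount", r."VIN"
  FROM "SOURCE_SQL_STREAM_001" s
  JOIN "SENSORS_REF" r
    ON s."sensorId" = r."SENSORID"
  GROUP BY s."sensorId", r."VIN",
           STEP(s.ROWTIME BY INTERVAL '60' SECOND)
  HAVING COUNT(*) > 10;
"""

kda = boto3.client("kinesisanalytics")
kda.update_application(
    ApplicationName="gpsAnalytics",
    CurrentApplicationVersionId=2,   # placeholder; read it from describe_application
    ApplicationUpdate={"ApplicationCodeUpdate": JOIN_SQL},
)
```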
Next, we will add another Firehose delivery stream as a destination for our analytics application. This stream will deliver records from the over-pinging devices to an S3 bucket.
To add a destination Firehose stream, go to Kinesis, Firehose, and Create Delivery Stream.
Let's call it buggySensorsDeliveryStream. For the source, choose Direct PUT or other sources, since our analytics application will write to this stream directly.
Next, review the Process records section; there is nothing to change here.
For the destination, select Amazon S3, then pick the sd-vehicle-data bucket. For the prefix, enter buggy-sensors/ so that all of our buggy-sensor data goes to a separate directory.
Click Next, and change the buffer size to 1 MB and the buffer interval to 60 seconds.
Let's reuse the FirehoseDeliveryRole, since this stream will be writing to the same S3 bucket.
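For anyone scripting this instead of clicking through the console, a minimal boto3 sketch of the same delivery stream follows; the account ID and role ARN are placeholders.

```python
import boto3

firehose = boto3.client("firehose")

firehose.create_delivery_stream(
    DeliveryStreamName="buggySensorsDeliveryStream",
    DeliveryStreamType="DirectPut",    # the analytics application writes directly
    ExtendedS3DestinationConfiguration={
        "RoleARN": "arn:aws:iam::123456789012:role/FirehoseDeliveryRole",
        "BucketARN": "arn:aws:s3:::sd-vehicle-data",
        "Prefix": "buggy-sensors/",    # keeps buggy-sensor data in its own folder
        "BufferingHints": {"SizeInMBs": 1, "IntervalInSeconds": 60},
    },
)
```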
Now let's head back to Kinesis Data Analytics and our gpsAnalytics application. Select "Connect to a destination".
Choose Kinesis Firehose delivery stream as the destination type, and pick buggySensorsDeliveryStream from the list.
In the in-application stream section, select the DESTINATION_SQL_STREAM we created earlier. This connects the application's output stream to the Firehose delivery stream.
Select Save and continue. Our Kinesis Data Analytics application output is now connected to the Firehose stream.
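The console step above maps to the add_application_output API. Here is a hedged boto3 sketch with placeholder ARNs; the version ID would normally come from describe_application.

```python
import boto3

kda = boto3.client("kinesisanalytics")

kda.add_application_output(
    ApplicationName="gpsAnalytics",
    CurrentApplicationVersionId=3,          # placeholder version
    Output={
        "Name": "DESTINATION_SQL_STREAM",   # the in-application stream to deliver
        "KinesisFirehoseOutput": {
            "ResourceARN": "arn:aws:firehose:us-east-1:123456789012:deliverystream/buggySensorsDeliveryStream",
            "RoleARN": "arn:aws:iam::123456789012:role/KinesisAnalyticsRole",
        },
        "DestinationSchema": {"RecordFormatType": "JSON"},
    },
)
```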
If we start our input stream and then head over to S3, we will see the buggy-sensors folder created, with records from the over-pinging sensors being written to it.
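To confirm delivery without the console, a quick object listing also works; the bucket and prefix below are the ones we configured above.

```python
import boto3

s3 = boto3.client("s3")

# List whatever Firehose has delivered under the buggy-sensors/ prefix so far.
response = s3.list_objects_v2(Bucket="sd-vehicle-data", Prefix="buggy-sensors/")
for obj in response.get("Contents", []):
    print(obj["Key"], obj["Size"])
```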
2. Let's practice!