
Returning structured prediction response

1. Returning structured prediction response

The last step in deploying our models is to create endpoints that accept data, make predictions, and return structured responses.

2. Challenges with deploying models

When deploying ML models, it's not enough to just make predictions. We need to accept input data properly, validate it before prediction, handle errors gracefully, and return the models' predictions as structured responses. Let's see how FastAPI can help us do all of this.

3. Defining request structure

We'll use Pydantic models to define our request and response structures, which gives us automatic validation. The PredictionRequest model tells our API to expect text from the user and ensures it is a valid string. The PredictionResponse model defines what our API sends back as prediction output: the text on which sentiment analysis was applied, as a string; the sentiment predicted by the model, also a string; and the model's confidence score for that prediction, as a floating-point number.
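The two models described above can be sketched as follows; the field names match those mentioned in the narration, though the exact definitions in the course may differ slightly:

```python
from pydantic import BaseModel


class PredictionRequest(BaseModel):
    # The text to run sentiment analysis on; must be a valid string
    text: str


class PredictionResponse(BaseModel):
    text: str          # the input text echoed back
    sentiment: str     # the sentiment label predicted by the model
    confidence: float  # the model's confidence score for the prediction
```

Because these are Pydantic models, FastAPI validates incoming JSON against PredictionRequest automatically and serializes PredictionResponse back to JSON.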

4. Creating the prediction endpoint

Incoming data in a POST request needs to be validated, and FastAPI automatically converts the incoming JSON into a PredictionRequest object. In this endpoint, we first check whether our model is loaded. If it is not, we raise an HTTPException with status_code 503 (Service Unavailable) and the detail "Model not loaded." We then make the prediction and store it in the result variable, whose first element contains the sentiment label and confidence score. Finally, we return a PredictionResponse object, which FastAPI serializes into JSON in our structured, pre-defined format.

5. Error handling

We can further add a try-except block to handle any error that comes up while making predictions. Here, the try block makes the prediction and returns the response. If anything fails during prediction, the except block raises an HTTPException with status code 500, indicating an internal server error. So if the model cannot infer the sentiment, FastAPI converts the exception into a formatted JSON error response.

6. Testing the endpoint

To test our endpoint, we can use the curl command or the requests Python library. Here we use the requests package, which lets us send HTTP requests and work with their responses. With our API running, we send a POST request to the local server, passing some text in the json parameter of the post method. We passed "Great product" as the text, and we get back a structured JSON response with the prediction, confidence score, and request text, as defined in the PredictionResponse model.
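A small client along these lines can exercise the endpoint; the URL below assumes a local server on port 8000 with a /predict route, so adjust it to match your deployment:

```python
import requests

# Hypothetical local address; use whatever host/port your API runs on
URL = "http://127.0.0.1:8000/predict"
payload = {"text": "Great product"}


def get_prediction(url: str, payload: dict) -> dict:
    """POST the payload as JSON and return the parsed JSON response."""
    response = requests.post(url, json=payload)
    response.raise_for_status()  # surface 4xx/5xx errors as exceptions
    return response.json()

# With the API running, get_prediction(URL, payload) returns a dict
# shaped like the PredictionResponse model (text, sentiment, confidence).
```

The json= argument serializes the payload and sets the Content-Type header for us, so the server-side PredictionRequest validation sees proper JSON.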

7. Let's practice!

Now that we have learned how to return structured responses as JSON, let's put our skills to the test with some coding exercises.