Detecting language
The City Council is wondering whether it's worth it to build a Spanish version of the Get It Done application. There is a large Spanish speaking constituency, but they are not sure if they will engage. Building in multi-lingual translation complicates the system and needs to be justified.
They ask Sam to figure out how many people are posting requests in Spanish.
She has already loaded the CSV into the dumping_df variable and subset it to the following columns:

Help Sam quantify the demand for a Spanish version of the Get It Done application. Figure out how many requesters use Spanish and print the final result!
Questo esercizio fa parte del corso
Introduction to AWS Boto in Python
Istruzioni dell'esercizio
- For each row in the DataFrame, detect the dominant language.
- Assign the first selected language to the
'lang'column. - Count the total number of posts in Spanish.
Esercizio pratico interattivo
Prova a risolvere questo esercizio completando il codice di esempio.
# For each dataframe row
for index, row in dumping_df.iterrows():
# Get the public description field
description =dumping_df.loc[index, 'public_description']
if description != '':
# Detect language in the field content
resp = comprehend.____(____=description)
# Assign the top choice language to the lang column.
dumping_df.loc[index, 'lang'] = resp['____'][0]['____']
# Count the total number of spanish posts
spanish_post_ct = len(dumping_df[dumping_df.lang == 'es'])
# Print the result
print("{} posts in Spanish".format(spanish_post_ct))