Generating English-French translations
Did you know that HSBC bank once spent $10 million on re-branding its slogan, due to a translation mistake?
We will use the trained model to predict the French translation of an English sentence using model.predict()
. You will be provided with the trained model (model
). This model has been trained for 50 epochs on 100,000 sentences which achieved around 90% accuracy on a 35000+ word validation set. It might take longer for this exercise to load as the trained model is loaded before the exercise. Furthermore, you will also be provided with a dictionary (fr_id2word
) which you can use to convert word indices to words. Finally, you will use the sents2seqs
function that you implemented earlier to preprocess the data before feeding it to the model.
You can use help(sents2seqs)
to remind yourself what's accepted by the sents2seqs()
function.
This exercise is part of the course
Machine Translation with Keras
Exercise instructions
- Preprocess source
en_st
so that it is converted to a onehot encodednumpy
array using the previously definedsents2seqs
function. - Predict the output for
en_seq
using the provided trainedmodel
. - Extract the maximum index for each prediction of
fr_pred
usingnp.argmax
and assign it tofr_seq
. - Convert the French sequence IDs to a sentence using list comprehension (remember to ignore 0s) and assign it to
fr_sent
.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
en_st = ['the united states is sometimes chilly during december , but it is sometimes freezing in june .']
print('English: {}'.format(en_st))
# Convert the English sentence to a sequence
en_seq = ____(____, en_st, ____=True, reverse=____)
# Predict probabilities of words using en_seq
fr_pred = ____.____(en_seq)
# Get the sequence indices (max argument) of fr_pred
fr_seq = ____.____(fr_pred, axis=____)[0]
# Convert the sequence of IDs to a sentence and print
fr_sent = [____[i] for i in ____ if i != ____]
print("French (Custom): {}".format(' '.join(fr_sent)))
print("French (Google Translate): les etats-unis sont parfois froids en décembre, mais parfois gelés en juin")