Exercise

Generating translations

You will now be generating French translations using an inference model trained using Teacher Forcing.

This model (nmt_tf) has been trained for 50 epochs on 100,000 sentences and achieved around 98% accuracy on a validation set of more than 35,000 sentences. This exercise might take longer than usual to initialize, as the trained model needs to be loaded. You are provided with the sents2seqs() function. You have also been given two new functions:

word2onehot(tokenizer, word, vocab_size) which accepts:

  • tokenizer - A Keras Tokenizer object
  • word - A string representing a word from the vocabulary (e.g. 'apple')
  • vocab_size - Vocabulary size
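A minimal sketch of what word2onehot() might look like internally; the ToyTokenizer stand-in is hypothetical (only its word_index mapping is used), and the real implementation may differ, so check it with inspect.getsource() as described below:

```python
import numpy as np

class ToyTokenizer:
    # Hypothetical stand-in for a Keras Tokenizer; only word_index is needed here.
    def __init__(self, word_index):
        self.word_index = word_index

def word2onehot(tokenizer, word, vocab_size):
    """Convert a word string to a one-hot vector of shape [1, vocab_size]."""
    word_id = tokenizer.word_index[word]          # word -> integer id
    onehot = np.zeros((1, vocab_size))            # all zeros ...
    onehot[0, word_id] = 1.0                      # ... except the word's index
    return onehot

tok = ToyTokenizer({'apple': 1, 'pear': 2})
print(word2onehot(tok, 'apple', 5))              # 1.0 at index 1, zeros elsewhere
```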

probs2word(probs, tok) which accepts:

  • probs - An output from the model of the shape [1,<French Vocab Size>]
  • tok - A Keras Tokenizer object
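A sketch of the idea behind probs2word(): greedily pick the highest-probability index and map it back to a word. The ToyTokenizer stand-in is hypothetical; a real Keras Tokenizer exposes a similar index_word mapping, but verify the actual source with inspect.getsource():

```python
import numpy as np

class ToyTokenizer:
    # Hypothetical stand-in for a Keras Tokenizer; only index_word is used here.
    def __init__(self, word_index):
        self.word_index = word_index
        self.index_word = {i: w for w, i in word_index.items()}

def probs2word(probs, tok):
    """Map a [1, vocab_size] probability vector to the most likely word string."""
    word_id = int(np.argmax(probs, axis=-1)[0])   # greedy: take the argmax index
    return tok.index_word[word_id]

tok = ToyTokenizer({'pomme': 1, 'chat': 2})
print(probs2word(np.array([[0.1, 0.2, 0.7]]), tok))
```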

You can peek at the source code for these functions by typing print(inspect.getsource(word2onehot)) and print(inspect.getsource(probs2word)) in the console.

Instructions
100 XP
  • Predict the initial decoder state (de_s_t) with the encoder.
  • Predict the output and the new state from the decoder, using the previous prediction (output) and the previous state as inputs. Remember to feed the new state back into the decoder at each step, so the state is generated recursively.
  • Get the word string from the probability output using the probs2word() function.
  • Convert the word string to a one-hot sequence using the word2onehot() function.
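The steps above can be sketched as a greedy decoding loop. Everything below is a self-contained toy: the encoder/decoder stand-ins, the tiny French vocabulary, and the 'sos' seed token are assumptions made so the loop runs; in the exercise you would call the trained nmt_tf encoder and decoder instead:

```python
import numpy as np

# --- Hypothetical toy setup (the exercise's trained models replace all of this) ---
vocab = ['<pad>', 'sos', 'je', 'aime', 'pommes', 'eos']
fr_vocab = len(vocab)
hidden = 4
index_word = dict(enumerate(vocab))
word_index = {w: i for i, w in index_word.items()}

rng = np.random.default_rng(0)
E = rng.normal(size=(fr_vocab, hidden))       # toy input projection
W_state = rng.normal(size=(hidden, hidden))   # toy recurrent weights
W_out = rng.normal(size=(hidden, fr_vocab))   # toy output projection

def encoder_predict(en_seq):
    # Toy encoder: pool the English input into an initial decoder state.
    return en_seq.mean(axis=1) @ np.eye(en_seq.shape[-1], hidden)

def decoder_predict(de_onehot, state):
    # Toy decoder step: (previous one-hot, previous state) -> (probs, new state).
    new_state = np.tanh(de_onehot @ E + state @ W_state)
    logits = new_state @ W_out
    probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
    return probs, new_state

def probs2word(probs, index_word):
    return index_word[int(np.argmax(probs, axis=-1)[0])]

def word2onehot(word, vocab_size):
    v = np.zeros((1, vocab_size))
    v[0, word_index[word]] = 1.0
    return v

# --- The decoding loop the instructions describe ---
en_seq = np.abs(rng.normal(size=(1, 5, 8)))       # dummy English input sequence
de_s_t = encoder_predict(en_seq)                  # initial decoder state from the encoder
de_seq = word2onehot('sos', fr_vocab)             # seed with an assumed start token
translation = []
for _ in range(4):
    probs, de_s_t = decoder_predict(de_seq, de_s_t)  # new state fed back recursively
    word = probs2word(probs, index_word)             # probabilities -> word string
    translation.append(word)
    de_seq = word2onehot(word, fr_vocab)             # word string -> next one-hot input
print(' '.join(translation))
```

The key design point the loop illustrates: at inference time there is no ground-truth French sentence to teacher-force, so each step consumes the model's own previous prediction and the recursively updated state.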