1. Learn
  2. /
  3. Courses
  4. /
  5. Machine Translation with Keras

Connected

Exercise

Splitting training and validation data

You will be creating training and validation datasets. Keeping a validation dataset and monitoring the performance of model on that set is a good practice to avoid overfitting.

For this exercise you have been provided with en_text (English sentences) and fr_text (French sentences).

Instructions

100 XP
  • Define a sequence of indices using np.arange(), that starts with 0 and has size of en_text.
  • Define train_inds as the first train_size set of indices from the sequence of indices.
  • Define tr_en and tf_fr, which contains the sentences found at the indices specified by train_inds in the lists en_text and fr_text.
  • Define v_en and v_fr which contains the sentences found at the indices specified by valid_inds in the lists en_text and fr_text.