Splitting training and validation data

You will create training and validation datasets. Keeping a separate validation dataset and monitoring the model's performance on it is good practice: a model that keeps improving on the training data while getting worse on the validation data is overfitting.

For this exercise you have been provided with en_text (English sentences) and fr_text (French sentences).

Define a sequence of indices using np.arange() that starts at 0 and has the same length as en_text.
Define train_inds as the first train_size indices of that sequence.
Define tr_en and tr_fr, which contain the sentences found at the indices specified by train_inds in the lists en_text and fr_text.
Define v_en and v_fr, which contain the sentences found at the indices specified by valid_inds in the lists en_text and fr_text.
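
The steps above can be sketched roughly as follows. This is a minimal illustration, not the exercise's reference solution: the sample sentences and the values of train_size and valid_size are made up so the snippet runs on its own, and valid_inds is assumed to be the valid_size indices that immediately follow the training indices, since the instructions use it without saying how it is built.

```python
import numpy as np

# Hypothetical stand-ins for the en_text and fr_text provided in the exercise.
en_text = ["hello", "thank you", "good morning", "good night", "yes", "no"]
fr_text = ["bonjour", "merci", "bonjour", "bonne nuit", "oui", "non"]

# Assumed sizes, chosen only for this illustration.
train_size, valid_size = 4, 2

# Sequence of indices 0, 1, ..., len(en_text) - 1.
inds = np.arange(len(en_text))

# The first train_size indices go to the training set.
train_inds = inds[:train_size]

# Assumption: the next valid_size indices go to the validation set.
valid_inds = inds[train_size:train_size + valid_size]

# Pick the sentences at the chosen indices from both lists,
# so each English sentence stays paired with its French translation.
tr_en = [en_text[ti] for ti in train_inds]
tr_fr = [fr_text[ti] for ti in train_inds]
v_en = [en_text[vi] for vi in valid_inds]
v_fr = [fr_text[vi] for vi in valid_inds]

print("Training pairs:", list(zip(tr_en, tr_fr)))
print("Validation pairs:", list(zip(v_en, v_fr)))
```

Using the same index array for both languages is what keeps the sentence pairs aligned. Slicing the two lists separately would also work here, but index arrays make it easy to change how the split is chosen later.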
