To train the model on our own dataset, do we only need to provide two plain text files, where line N of the source file is a question and line N of the target file is the corresponding response?
Also, is it true that the OpenSubtitles source and target files are already shuffled?
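
To make the question concrete, here is a minimal sketch of the layout I have in mind. It is my own assumption, not something taken from the project: the helper names `write_parallel` and `read_parallel` and the file paths passed to them are hypothetical. It writes question/response pairs to two line-aligned files and shuffles the pairs together, so that line N of the source still matches line N of the target.

```python
import random
from pathlib import Path


def write_parallel(pairs, source_path, target_path, shuffle=True, seed=0):
    """Write (question, response) pairs to two line-aligned text files.

    Hypothetical helper: line N of source_path holds the question and
    line N of target_path holds its response.
    """
    pairs = list(pairs)
    if shuffle:
        # Shuffle the pairs as units so the two files stay aligned.
        random.Random(seed).shuffle(pairs)
    # An embedded newline would break the one-example-per-line layout,
    # so collapse all whitespace runs to single spaces.
    clean = [(" ".join(q.split()), " ".join(r.split())) for q, r in pairs]
    Path(source_path).write_text(
        "".join(q + "\n" for q, _ in clean), encoding="utf-8")
    Path(target_path).write_text(
        "".join(r + "\n" for _, r in clean), encoding="utf-8")
    return clean


def read_parallel(source_path, target_path):
    """Read the two files back as a list of (question, response) pairs."""
    src = Path(source_path).read_text(encoding="utf-8").splitlines()
    tgt = Path(target_path).read_text(encoding="utf-8").splitlines()
    if len(src) != len(tgt):
        raise ValueError(
            f"line count mismatch: {len(src)} source vs {len(tgt)} target")
    return list(zip(src, tgt))
```

If the training script expects something beyond this (tokenization, a vocabulary file, a particular naming scheme), that is the part I am unsure about.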