Comments (6)
If anyone is looking for the answer, here is what I did, and it worked for me:
`import torch
# torchtext.legacy is only present in torchtext 0.9 to 0.11
from torchtext.legacy.data import Field, BucketIterator, TabularDataset
# Simple whitespace tokenizer
tokenize = lambda x: x.split(' ')
SRC = Field(tokenize=tokenize)
TRG = Field(tokenize=tokenize)
# Map CSV column names to the attribute names used on each example
fields = {'Source': ('src', SRC), 'Target': ('trg', TRG)}
train_data, valid_data, test_data = TabularDataset.splits(
    path='',
    train='My_train_Set.csv',
    validation='My_Validation_Set.csv',
    test='My_test_set.csv',
    format='csv',
    fields=fields)
# Build the vocabularies from the training split only
SRC.build_vocab(train_data, min_freq=2)
TRG.build_vocab(train_data, min_freq=2)
BATCH_SIZE = 128
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
# Bucket examples of similar source length together to reduce padding
train_iterator, valid_iterator, test_iterator = BucketIterator.splits(
    (train_data, valid_data, test_data),
    batch_size=BATCH_SIZE,
    sort_within_batch=True,
    sort_key=lambda x: len(x.src),
    device=device)`
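For anyone wondering what `min_freq=2` does in the snippet above: tokens seen fewer than twice in the training data are left out of the vocabulary and are looked up as the unknown token instead. Here is a rough plain-Python sketch of that idea. It is not torchtext's implementation; the helper names `build_vocab_sketch` and `numericalize_sketch`, the alphabetical ordering, and the example sentences are all my own assumptions for illustration.

```python
from collections import Counter

# Same whitespace tokenizer as in the snippet above
tokenize = lambda x: x.split(' ')

def build_vocab_sketch(sentences, min_freq=2, specials=("<unk>", "<pad>")):
    """Hypothetical helper: map tokens seen at least min_freq times to indices."""
    counts = Counter(tok for s in sentences for tok in tokenize(s))
    # Alphabetical order is a simplification; torchtext orders tokens differently.
    kept = sorted(tok for tok, c in counts.items() if c >= min_freq)
    return {tok: i for i, tok in enumerate(list(specials) + kept)}

def numericalize_sketch(sentence, stoi):
    """Hypothetical helper: rare or unseen tokens fall back to <unk>."""
    return [stoi.get(tok, stoi["<unk>"]) for tok in tokenize(sentence)]

# Made-up training sentences, only for illustration
sentences = ["a man walks", "a man reads", "a dog runs"]
stoi = build_vocab_sketch(sentences)
ids = numericalize_sketch("a dog walks", stoi)
```

With these toy sentences only `a` and `man` occur twice or more, so `dog` and `walks` both end up as `<unk>`. On a small custom dataset it may be worth lowering `min_freq` if too many words disappear this way.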
from pytorch-seq2seq.
I also want to ask this question!
I don't know how your data is structured, but mine was originally in Excel files, so I didn't have any problems converting it to CSV.
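If the data is not in a spreadsheet, a CSV in the shape the snippet above expects (a header row with `Source` and `Target` columns) can also be written with Python's standard `csv` module. This is just a minimal sketch: the sentence pairs are invented, and `write_pairs` / `read_pairs` are hypothetical helper names, not part of torchtext or the tutorial.

```python
import csv

# Made-up sentence pairs; replace with your own data
pairs = [
    ("ein mann geht", "a man walks"),
    ("eine frau, die liest", "a woman who reads"),
]

def write_pairs(path, pairs):
    """Write (source, target) pairs under 'Source' and 'Target' headers."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Source", "Target"])
        # csv.writer quotes fields that contain commas automatically
        writer.writerows(pairs)

def read_pairs(path):
    """Read the file back as a list of (source, target) tuples."""
    with open(path, newline="", encoding="utf-8") as f:
        return [(row["Source"], row["Target"]) for row in csv.DictReader(f)]

write_pairs("My_train_Set.csv", pairs)
```

The same function can be called once per split (train, validation, test) so that the file names match the ones passed to `TabularDataset.splits`.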
Can you tell me how to prepare your own data in CSV format?
Thanks for this great solution.
Using a model with a custom dataset is always a tedious and irritating problem.