Comments (4)
Thank you. The problem is solved, following your advice.
from unsupervisedmt.
Mmm, not sure what is happening here... but it's definitely a simple fix. Apparently NumPy is not able to compute the average of one of the lists in self.stats.
Can you try to debug it with something like this:
for k, l in mean_loss:
    if len(self.stats[l]) == 0:
        continue
    print(k, l, self.stats[l])
    print(np.mean(self.stats[l]))
to see what is failing. Then, adding a .item() might help. Check whether the line that fails still fails with:
print(np.mean([x.item() for x in self.stats[l]]))
(I had similar issues with PyTorch 0.4, but not 0.5)
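The failure mode suggested above can be reproduced without the full trainer. A minimal sketch (an assumption: the real lists hold 0-dim torch CUDA tensors; here NumPy scalars stand in as array-like values with an `.item()` method):

```python
import numpy as np

# Hypothetical stand-in for self.stats[l]: a list mixing plain floats with
# 0-dim array-like values (the real code may hold torch CUDA tensors).
stats = [np.float32(24.7), np.float32(15.3), 14.0]

# .item() converts any 0-dim array-like to a plain Python float, so
# np.mean always receives a homogeneous list of numbers.
values = [x.item() if hasattr(x, "item") else float(x) for x in stats]
print(np.mean(values))
```

This is the same idea as the `print(np.mean([x.item() for x in self.stats[l]]))` probe above, just with a guard for elements that are already plain floats.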
from unsupervisedmt.
Thank you.
I've found that some stats[l] are lists of floats, while others are lists of tensors.
As follows:
XE-fr-fr xe_costs_fr_fr [24.71696662902832, 15.35357666015625, 14.00998306274414, 10.67545223236084, 12.328499794006348, 10.166346549987793, 9.223615646362305, 10.425082206726074, 9.373058319091797, 8.913718223571777, 9.59327507019043, 10.338445663452148, 9.292261123657227, 8.838339805603027, 9.697969436645508, 8.906725883483887, 8.574493408203125, 10.022809982299805, 8.99677562713623, 8.417030334472656, 9.018220901489258, 8.298065185546875, 7.982298374176025, 7.654149055480957, 7.863222599029541, 8.192898750305176, 8.38564395904541, 7.941415786743164, 8.161163330078125, 7.49355411529541, 8.103026390075684, 8.087912559509277, 7.8237762451171875, 7.40770149230957, 7.23865270614624, 8.092958450317383, 6.812219142913818, 7.91216516494751, 6.933863162994385, 6.812775611877441, 7.149317741394043, 7.432501316070557, 6.9618120193481445, 6.810456275939941, 7.347645282745361, 7.362916469573975, 6.808451175689697, 7.550614833831787, 7.067275524139404, 7.3161444664001465]
8.91774487495
XE-en-fr-en xe_costs_en_fr_en [21.315174102783203, 14.113967895507812, 16.057392120361328, 14.214425086975098, 10.695882797241211, 12.561439514160156, 12.702580451965332, 9.774046897888184, 10.083074569702148, 9.16340446472168, 9.50981330871582, 8.895936012268066, 8.666557312011719, 9.547480583190918, 9.107008934020996, 8.7278413772583, 8.256816864013672, 8.65737533569336, 9.26156234741211, 9.16889762878418, 9.101730346679688, 8.083001136779785, 8.140402793884277, 8.307229042053223, 8.865376472473145, 8.346843719482422, 8.2429780960083, 8.382173538208008, 8.205283164978027, 7.960878372192383, 7.709842681884766, 7.594789981842041, 7.891580581665039, 7.779585361480713, 7.633982181549072, 7.272852420806885, 7.942907810211182, 7.431552886962891, 7.231876373291016, 8.1515474319458, 7.91450834274292, 6.876171112060547, 7.3396382331848145, 7.740161895751953, 7.5449442863464355, 7.542090892791748, 7.457121849060059, 8.104904174804688, 7.7293620109558105, 7.319813251495361]
9.12651616096
XE-fr-en-fr xe_costs_fr_en_fr [19.55707359313965, 13.122364044189453, 11.77707290649414, 11.066141128540039, 11.021257400512695, 13.121122360229492, 11.121556282043457, 10.597886085510254, 9.561735153198242, 9.616044998168945, 11.355292320251465, 8.618619918823242, 9.483935356140137, 8.754108428955078, 8.33616828918457, 8.592178344726562, 8.52877140045166, 8.667609214782715, 8.422994613647461, 8.415281295776367, 8.112914085388184, 8.593239784240723, 8.110265731811523, 8.457667350769043, 8.474080085754395, 8.1345796585083, 7.927401065826416, 7.714478492736816, 7.330197811126709, 7.683262825012207, 7.950503349304199, 7.904394626617432, 7.666321754455566, 8.060973167419434, 7.4695916175842285, 7.459173202514648, 7.796142578125, 8.041093826293945, 7.4564738273620605, 6.949404239654541, 7.232761859893799, 7.486824035644531, 7.231648921966553, 7.532423496246338, 7.799289703369141, 6.822357654571533, 7.232748508453369, 7.324521064758301, 7.211960315704346, 7.188131809234619]
8.8018407917
ENC-L2-en enc_norms_en [tensor(5.2854, device='cuda:0'), tensor(4.9418, device='cuda:0'), tensor(4.7892, device='cuda:0'), tensor(4.6721, device='cuda:0'), tensor(4.6508, device='cuda:0'), tensor(4.5445, device='cuda:0'), tensor(4.5432, device='cuda:0'), tensor(4.5423, device='cuda:0'), tensor(4.4875, device='cuda:0'), tensor(4.5121, device='cuda:0'), tensor(4.5438, device='cuda:0'), tensor(4.4784, device='cuda:0'), tensor(4.4609, device='cuda:0'), tensor(4.4328, device='cuda:0'), tensor(4.4258, device='cuda:0'), tensor(4.4172, device='cuda:0'), tensor(4.4131, device='cuda:0'), tensor(4.4003, device='cuda:0'), tensor(4.4294, device='cuda:0'), tensor(4.4335, device='cuda:0'), tensor(4.4042, device='cuda:0'), tensor(4.3502, device='cuda:0'), tensor(4.3336, device='cuda:0'), tensor(4.3355, device='cuda:0'), tensor(4.3133, device='cuda:0'), tensor(4.3110, device='cuda:0'), tensor(4.3024, device='cuda:0'), tensor(4.3005, device='cuda:0'), tensor(4.2883, device='cuda:0'), tensor(4.2781, device='cuda:0'), tensor(4.2677, device='cuda:0'), tensor(4.2579, device='cuda:0'), tensor(4.2693, device='cuda:0'), tensor(4.2655, device='cuda:0'), tensor(4.2553, device='cuda:0'), tensor(4.2415, device='cuda:0'), tensor(4.2314, device='cuda:0'), tensor(4.2049, device='cuda:0'), tensor(4.1991, device='cuda:0'), tensor(4.2099, device='cuda:0'), tensor(4.2063, device='cuda:0'), tensor(4.1791, device='cuda:0'), tensor(4.1645, device='cuda:0'), tensor(4.1521, device='cuda:0'), tensor(4.1385, device='cuda:0'), tensor(4.1437, device='cuda:0'), tensor(4.1399, device='cuda:0'), tensor(4.1413, device='cuda:0'), tensor(4.1175, device='cuda:0'), tensor(4.1102, device='cuda:0')]
from unsupervisedmt.
I see, yes, the enc_norms_en seems to be the problem. I was using this value to check that the encodings of sentences in different languages had similar norms, but this was only for debugging, and now I guess it is kind of useless. You can either:
Remove this line:
https://github.com/facebookresearch/UnsupervisedMT/blob/master/NMT/src/trainer.py#L469
Or replace it with:
self.stats['enc_norms_%s' % lang1].append(encoded.dis_input.data.norm(2, 1).mean().item())
if you want to see the average encoded vector norms.
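The replacement line computes a per-row L2 norm, averages over the batch, and calls .item() so only a plain Python float is appended to the stats. A small NumPy sketch of the same computation (the input matrix is a hypothetical stand-in for encoded.dis_input.data):

```python
import numpy as np

# Hypothetical stand-in for encoded.dis_input.data: a (batch, hidden) matrix.
x = np.array([[3.0, 4.0], [6.0, 8.0]])

# Per-row L2 norm (what .norm(2, 1) computes in PyTorch), then the batch
# mean; .item() extracts a plain Python float that np.mean can digest later.
norms = np.linalg.norm(x, ord=2, axis=1)
mean_norm = norms.mean().item()
print(mean_norm)
```

Because .item() is applied at logging time, the stats list stays a homogeneous list of floats and the averaging code no longer mixes floats and tensors.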
from unsupervisedmt.