leam's Issues
Reproducing the results on the MIMIC dataset?
Hi,
I read your paper and it's an interesting approach. I'm particularly interested in applications to clinical text, i.e., assigning ICD codes to medical documents. Specifically, I am attempting to reproduce the results on the MIMIC dataset, and I have already set up the system for this dataset using the code provided by Mullenbach et al., 2018.
I went through your code: is there a pipeline for trying this model on the MIMIC-III dataset and for producing visualizations of the learnt attention across documents?
Looking forward to hearing from you.
Best,
Vijay
Issues about codes?
Hi, Dr. Guo. I'm confused by a few details when running your code and would like to ask about them.
Firstly, in the class 'emb_classifier' in 'main_multiclass.py' (shown below): why do you compute only the most relevant class's embedding in the multi-class setting, and what is the purpose of y_emb, which is not used in the subsequent code?
y_pos = tf.argmax(y, -1) #?
y_emb, W_class = embedding_class(y_pos, opt, 'class_emb') # b * e, c * e
y_emb=tf.cast(y_emb,tf.float32)
W_class=tf.cast(W_class,tf.float32) # c * e
W_class_tran = tf.transpose(W_class, [1,0]) # e * c
Secondly, in 'att_emb_ngram_encoder_cnn' in 'model.py', the convolution as written cannot possibly produce an output of size 'b * s * c', and it differs from the paper. What was your thinking for this convolution, and what are the intended padding, kernel, and output?
x_emb_0 = tf.squeeze(x_emb,) # b * s * e
x_emb_1 = tf.multiply(x_emb_0, x_mask) # b * s * e
H = tf.contrib.layers.conv2d(x_emb_0, num_outputs=opt.embed_size,kernel_size=[10], padding='SAME',activation_fn=tf.nn.relu) #b * s * c
Thirdly, why does 'main_multiclass.py' compute accuracy using only the most relevant class, as follows? Shouldn't it be calculated over all classes?
correct_prediction = tf.equal(tf.argmax(prob, 1), tf.argmax(y, 1))
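As a side note, the argmax comparison above is the standard single-label accuracy; for a multi-label setup one would instead threshold each class independently. A small NumPy sketch of both readings (the 0.5 threshold is an assumption, not taken from the repo):

```python
import numpy as np

def single_label_accuracy(prob, y):
    # fraction of examples whose top-scoring class matches the gold class
    return float((prob.argmax(axis=1) == y.argmax(axis=1)).mean())

def multi_label_accuracy(prob, y, thresh=0.5):
    # per-class decisions: fraction of (example, class) cells predicted correctly
    return float(((prob >= thresh) == (y >= 0.5)).mean())
```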
How to get the cosine similarity (Eq. 2) from the function att_emb_ngram_encoder_maxout?
How is the cosine similarity of Eq. (2) in the paper obtained from the function att_emb_ngram_encoder_maxout?
And what does the following code correspond to in the paper?
Att_v = tf.contrib.layers.conv2d(G, num_outputs=opt.num_class, kernel_size=[opt.ngram], padding='SAME', activation_fn=tf.nn.relu) # b * s * c
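For what it's worth, Eq. (2) of the paper is just a cosine-similarity table between the word embeddings and the label embeddings. A minimal NumPy sketch, assuming V is the (s, e) word-embedding matrix of one document and C the (K, e) label-embedding matrix:

```python
import numpy as np

def cosine_compat(V, C, eps=1e-8):
    # V: (s, e) word embeddings; C: (K, e) label embeddings
    Vn = V / (np.linalg.norm(V, axis=1, keepdims=True) + eps)
    Cn = C / (np.linalg.norm(C, axis=1, keepdims=True) + eps)
    return Vn @ Cn.T  # (s, K) compatibility matrix G, G[l, k] = cos(v_l, c_k)
```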
Hello
d:\users\yamia\eclipse-workspace\zp\leam\preprocess_yahoo.py(182)()->None
-> pdb.set_trace()
(Pdb)
This error occurs with my own data; how can I fix it? My data format is: label (tab) one line of text.
What does Equation (9) mean?
Hello, does the embedding loss function (Eq. 9) mean only the K labels are used as training samples? How is it incorporated into a training batch?
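For context, one hedged reading of Eq. (9): the K label embeddings are themselves fed through the classifier, and their cross-entropy against the K one-hot labels is added to the batch loss; it involves only the K labels, not the batch's samples, so it is the same for every batch. A NumPy sketch under that reading (the linear softmax classifier here is an assumption):

```python
import numpy as np

def label_embedding_loss(C, W, b):
    # C: (K, e) label embeddings; W: (e, K) and b: (K,) classifier parameters
    logits = C @ W + b                                   # (K, K)
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    p = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    K = C.shape[0]
    # cross-entropy of label k's embedding against label k itself
    return -np.log(p[np.arange(K), np.arange(K)] + 1e-12).mean()
```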
Tried an experiment on a 70-class dataset and the model doesn't converge
Hi, I am very interested in your paper. I tried an experiment on my own finance-news dataset, predicting the financial event type given the news, but the model doesn't converge: the loss on both the training and development sets gets stuck at one value and changes only slowly. I look forward to your reply for a deeper discussion. Thanks.
What is the difference between main.py and main_multiclass.py?
Looking at main.py and main_multiclass.py, the network structures differ; are the two meant to solve different problems?
Error when load file yelp.p using cPickle
Hi,
I loaded your dataset yelp.p with cPickle, following your code:
data = cPickle.load(open('yelp.p', 'wb'))
It raises the error "ValueError: could not convert string to int". Could you check this dataset file again? I think there is a problem with it.
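An aside on the snippet above: it opens the file with 'wb', which truncates the file before any read; loading requires 'rb'. A minimal sketch (pickle replaces cPickle on Python 3):

```python
import pickle  # cPickle on Python 2

def load_pickle(path):
    # open read-binary; 'wb' would truncate the file and then fail to load
    with open(path, 'rb') as f:
        return pickle.load(f)
```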
data
Can you send me a new copy of the dataset you trained on? The address you posted is no longer available.
I also keep running into mistakes when training on the datasets myself, as I am a novice.
Error in embedding_class
Thanks
Thanks
Does it work for short text, e.g., microblog query?
index out of bounds
Hi,
after correcting a few small typos, I'm running into a real error, apparently related to a wrong embedding size:
Traceback (most recent call last):
File "/home/mat/repos/tmp/LEAM/main.py", line 275, in
main()
File "/home/mat/repos/tmp/LEAM/main.py", line 151, in main
opt.W_class_emb = load_class_embedding( wordtoix, opt)
File "/home/mat/repos/tmp/LEAM/utils.py", line 90, in load_class_embedding
value_list = [ [ opt.W_emb[i] for i in l] for l in id_list]
IndexError: index 514556 is out of bounds for axis 0 with size 300
This is using your latest checkout and the data downloaded from your google drive.
Thanks,
-Mathias
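A guess at the cause, given "size 300" on axis 0 in the traceback: W_emb may have been loaded transposed, as (embed_dim, vocab) instead of (vocab, embed_dim). A defensive check one could add before the lookup (the 300-dimensional assumption comes from the traceback, not the repo):

```python
import numpy as np

def ensure_vocab_major(W_emb, vocab_size, embed_dim=300):
    # W_emb rows must be indexed by word id; transpose if it was loaded as (e, V)
    if W_emb.shape == (embed_dim, vocab_size):
        W_emb = W_emb.T
    assert W_emb.shape[0] == vocab_size, W_emb.shape
    return W_emb
```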
Confusion about the paper
You take the cosine similarity of each word vector with each class embedding, so how do you know whether the maximum similarity is with the right class or a wrong one?
The code has many errors and cannot be run
Hello
d:\users\yamia\eclipse-workspace\zp\leam\preprocess_yahoo.py(182)()->None
-> pdb.set_trace()
(Pdb)
This error occurs with my own data; how can I fix it? My data format is: label (tab) one line of text.
Could you show me your original data format? I cannot download that data. Thanks!
Confusion
I'm confused about something, but got no reply by email, so I'm asking here.
In Section 4.1 (Model), once the matrix G is obtained, I thought each element of G could be seen as the score between a word and a label. But after the convolution and pooling, what does the result mean? Max-pooling loses the position information, so you can't tell which label a score corresponds to.
Besides, how did you get the test accuracy in the paper? Did you choose the model with the best validation accuracy and then evaluate it on the test set?
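One hedged reading of the paper that may resolve this: the max-pooling runs across the K labels at each word position, not along the sequence, so the position axis survives, and the softmax over positions then yields the word attention β. A NumPy sketch of that reading:

```python
import numpy as np

def word_attention(u):
    # u: (s, K) label-wise scores; the max is taken across labels per position
    m = u.max(axis=1)                   # (s,) — position axis is preserved
    m -= m.max()                        # numerical stability
    beta = np.exp(m) / np.exp(m).sum()  # softmax over word positions
    return beta
```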
Dataset
Hello, the dataset link in the code no longer works. Could you update the link or provide a new address? Thanks!
issue about the conv2d operation
As in #6 (comment), the implementation is not the same as what the paper shows in Equation (3).
I think line 50 in model.py should be replaced as follows:
G = tf.expand_dims(G, axis=-1) # b * s * c * 1; filter: ngram * 1, filter_num: 1
Att_v = tf.contrib.layers.conv2d(G, num_outputs=1, kernel_size=[opt.ngram, 1], padding='SAME', activation_fn=tf.nn.relu) # b * s * c * 1
Att_v = tf.squeeze(Att_v, axis=-1) # b * s * c
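To connect this with Eq. (3): a single filter of width 2r+1 slides along the sequence axis and is shared across all K labels. A NumPy sketch of that reading for one document (variable names here are mine, not the repo's):

```python
import numpy as np

def attention_scores(G, w, b=0.0):
    # G: (s, K) compatibility matrix; w: (2r+1,) single shared filter; b: bias
    r = len(w) // 2
    Gp = np.pad(G, ((r, r), (0, 0)))        # 'SAME' zero padding along sequence
    s, K = G.shape
    u = np.empty((s, K))
    for l in range(s):
        u[l] = w @ Gp[l:l + 2 * r + 1]      # one filter, shared across all K labels
    return np.maximum(u + b, 0.0)           # ReLU
```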
multi-label
Can you share the code that handles the multi-label classification in this paper?
Thank you very much
from utils import normalizing
Hi, there seem to be some issues with the version I cloned:
Traceback (most recent call last):
File "/home/mat/repos/tmp/LEAM/main.py", line 15, in
from model import *
File "/home/mat/repos/tmp/LEAM/model.py", line 2, in
from utils import normalizing
ImportError: cannot import name normalizing
Indeed, there is no function "normalizing" in the utils module.
Thanks,
-Mathias
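For anyone blocked by this, the missing helper appears to be an L2 normalization along a given axis, judging from how model.py uses it. A stand-in, offered as a guess, sketched in NumPy (the TF version would mirror it with tf.reduce_sum and tf.sqrt):

```python
import numpy as np

def normalizing(x, axis, eps=1e-8):
    # a plausible stand-in: L2-normalize x along the given axis
    norm = np.sqrt((x ** 2).sum(axis=axis, keepdims=True))
    return x / (norm + eps)
```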
confused with the conv layer
Att_v = tf.contrib.layers.conv2d(G, num_outputs=opt.num_class,kernel_size=[opt.ngram], padding='SAME',activation_fn=tf.nn.relu) #b * s * c
The code above performs a conv2d with num_outputs=opt.num_class filters on the match matrix G,
while the formulation in the paper (below) seems to use only one filter of size (2r+1) to produce a further match matrix (K * L). I think they are a little different. Is that true?
where W ∈ R^(2r+1), b ∈ R^K, and u_l ∈ R^K
About generate_emb
id_list = [ [ wordtoidx[i] for i in l] for l in name_list]
value_list = [ [ opt.W_emb[i] for i in l] for l in id_list]