chenyuntc / pytorchtext

1st Place Solution for the Zhihu Machine Learning Challenge (Zhihu Kanshan Cup). Implementation of various text-classification models.

Home Page: https://biendata.com/competition/zhihu/

License: MIT License

Python 58.63% Shell 4.24% Jupyter Notebook 37.13%

pytorch nlp textcnn textrnn fasttext textrcnn lstm

pytorchtext's Issues

How to solve the problem that topK's K is different for every input text?

The output topK's K is fixed now.

Do you think training a classifier to predict the value of K for every input is a good solution?

Thank you very much.
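
One possible alternative to training a separate classifier for K, sketched here under assumptions, is to keep every label whose score passes a threshold, bounded by a minimum and maximum count. The function name `select_labels`, the threshold of 0.5, and the toy scores are illustrative and are not taken from the repository.

```python
def select_labels(scores, threshold=0.5, min_k=1, max_k=5):
    """Return label indices whose score >= threshold, best first.

    At least min_k and at most max_k labels are returned, so K varies per input.
    """
    # Rank all label indices by score, highest first.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept = [i for i in ranked if scores[i] >= threshold]
    if len(kept) < min_k:
        # Nothing (or too little) passed the threshold: fall back to the best labels.
        kept = ranked[:min_k]
    return kept[:max_k]


print(select_labels([0.9, 0.1, 0.7, 0.2]))    # two labels pass the threshold
print(select_labels([0.3, 0.1, 0.2, 0.05]))   # none pass, falls back to the best one
```

Whether this beats a fixed K depends on how the task is scored, so the threshold would need tuning on validation data.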

CNNText missing?

Line https://github.com/chenyuntc/PyTorchText/blob/master/models/__init__.py#L2 imports CNNText but it doesn't seem to be in the directory. Is it missing?

Thanks

Where is your word2vec module?

In embedding2matrix.py there is this code:
```python
import word2vec
import numpy as np


def main(em_file, em_result):
    '''
    embedding -> numpy
    '''
    em = word2vec.load(em_file)
    vec = em.vectors            # embedding matrix
    word2id = em.vocab_hash     # mapping from word to row index
    # d = dict(vector=vec, word2id=word2id)
    # t.save(d, em_result)
    np.savez_compressed(em_result, vector=vec, word2id=word2id)


if __name__ == '__main__':
    import fire
    fire.Fire()
```

But I cannot find any module named 'word2vec'.

thank you
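
For reading the saved archive back, here is a sketch under assumptions. Because `word2id` is a Python dict, `np.savez_compressed` stores it as a pickled 0-d object array, so loading needs `allow_pickle=True` followed by `.item()`. The toy vectors, the toy vocabulary, and the temporary file path are made up for illustration; they stand in for `em.vectors` and `em.vocab_hash`.

```python
import os
import tempfile

import numpy as np

# Toy stand-ins for em.vectors and em.vocab_hash (illustrative values only).
vec = np.arange(6, dtype=np.float32).reshape(3, 2)
word2id = {"a": 0, "b": 1, "c": 2}

path = os.path.join(tempfile.mkdtemp(), "embedding.npz")
np.savez_compressed(path, vector=vec, word2id=word2id)

# The dict was pickled into a 0-d object array, so pickle loading must be allowed.
with np.load(path, allow_pickle=True) as data:
    loaded_vec = data["vector"]
    loaded_word2id = data["word2id"].item()  # unwrap the 0-d array into a dict

print(loaded_vec.shape, loaded_word2id["b"])
```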

Could you share which papers this solution is based on? Thanks.

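What is the decryption password for the des3 file?

I only came here after the competition ended and I saw the news coverage.
I would like to ask a question.

The official site now only offers the des3 link; the rar is no longer provided.
https://www.dropbox.com/s/auycv8lt6ntd805/ieee_zhihu_cup.des3?dl=0

What is the decryption password for the des3 file?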

A question about the labels

Are the topic labels converted to a vector like [0,1,0,0,1,......,0], or to some other format? Thanks.

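Oh

It is not a matter of treating each class as a separate 0/1 classification.
All 1999 classes are scored at once, and topk is used to take the top 5.
Since the top 5 is already fixed, there is no need to consider how many labels to predict.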
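
The reply above can be sketched as follows. This is a minimal illustration, not the repository's code: the random `scores` tensor stands in for a model's output over the 1999 topics, and the batch size of 4 is arbitrary.

```python
import torch

num_classes = 1999   # number of topics, as stated in the reply
batch_size = 4       # arbitrary, for illustration

# Stand-in for the model output: one score per topic for each question.
scores = torch.randn(batch_size, num_classes)

# Take the 5 highest-scoring topics per question; K is fixed by the task.
top_scores, top_indices = scores.topk(5, dim=1)

print(top_indices.shape)
```

`topk` returns the values sorted from highest to lowest, so the first column of `top_indices` is the most confident topic for each question.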

Use max-epoch 5 vs. early stopping

Hi, I am new to DL and I wonder what the reason is for using a small number of epochs (5) instead of early stopping.

Thanks,
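
For readers unfamiliar with the alternative being asked about, patience-based early stopping can be sketched as below. The validation scores and the patience value are invented for illustration; the repository's own training loop is not shown in this thread.

```python
def early_stopping_epoch(val_scores, patience=2):
    """Return (best_epoch, stop_epoch) for a sequence of per-epoch validation scores.

    Training stops once the score has gone `patience` epochs without improving.
    """
    best_score = float("-inf")
    best_epoch = -1
    for epoch, score in enumerate(val_scores):
        if score > best_score:
            best_score, best_epoch = score, epoch
        elif epoch - best_epoch >= patience:
            return best_epoch, epoch
    # Ran out of epochs before the patience limit was reached.
    return best_epoch, len(val_scores) - 1


# Hypothetical validation scores: improvement stalls after epoch 2.
print(early_stopping_epoch([0.30, 0.38, 0.41, 0.40, 0.39, 0.42]))
```

A fixed small epoch count is the simpler of the two: it needs no validation pass per epoch, at the cost of possibly stopping too early or too late.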

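How did you install the word2vec package used to convert the word vectors into a numpy array?

I don't have this package, and pip can't install it either.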

Hi, when I run main.py, there is an error. Do you know why?

Traceback (most recent call last):
  File "main.py", line 159, in <module>
    fire.Fire()
  File "/usr/anaconda3/lib/python3.6/site-packages/fire/core.py", line 127, in Fire
    component_trace = _Fire(component, args, context, name)
  File "/usr/anaconda3/lib/python3.6/site-packages/fire/core.py", line 366, in _Fire
    component, remaining_args)
  File "/usr/anaconda3/lib/python3.6/site-packages/fire/core.py", line 542, in _CallCallable
    result = fn(*varargs, **kwargs)
  File "main.py", line 102, in main
    for ii,((title,content),label) in tqdm.tqdm(enumerate(dataloader)):
  File "/usr/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 417, in __iter__
    return DataLoaderIter(self)
  File "/usr/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 242, in __init__
    self._put_indices()
  File "/usr/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 290, in _put_indices
    indices = next(self.sample_iter, None)
  File "/usr/anaconda3/lib/python3.6/site-packages/torch/utils/data/sampler.py", line 119, in __iter__
    for idx in self.sampler:
  File "/usr/anaconda3/lib/python3.6/site-packages/torch/utils/data/sampler.py", line 50, in __iter__
    return iter(torch.randperm(len(self.data_source)).long())
RuntimeError: invalid argument 1: must be strictly positive at /opt/conda/conda-bld/pytorch_1518243271935/work/torch/lib/TH/generic/THTensorMath.c:2247
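
The last frames show `torch.randperm(len(self.data_source))` rejecting its argument as not strictly positive, which is consistent with the dataset reporting a length of 0 (for example, because the data files were not found or not loaded). A small guard before building the `DataLoader` makes that failure easier to read. The helper name `require_non_empty` and the `ToyDataset` class are hypothetical and exist only for this illustration.

```python
class ToyDataset:
    """Stand-in for a dataset; only __len__ matters for this check."""

    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)


def require_non_empty(dataset, name="dataset"):
    """Raise a readable error if the dataset has no samples."""
    if len(dataset) == 0:
        raise ValueError(f"{name} is empty; check the data paths before training")
    return dataset


require_non_empty(ToyDataset([1, 2, 3]), "train set")  # passes silently

try:
    require_non_empty(ToyDataset([]), "train set")
except ValueError as err:
    print(err)
```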

Why RuntimeError is expected 3D tensor?

class CNNText(nn.Module):
    def __init__(self):
        super(CNNText, self).__init__()
        self.encoder_tit = nn.Embedding(3281, 64)
        self.encoder_con = nn.Embedding(496037, 512)
        
        self.title_conv_1 = nn.Sequential(
            nn.Conv1d(in_channels = 1,
                      out_channels = 1,
                      kernel_size = (1, 64)),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size=1),
        )
        
        self.title_conv_2 = nn.Sequential(
            nn.Conv1d(in_channels = 1,
                      out_channels = 1,
                      kernel_size = (2, 64)),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size=1),
        )

        self.content_conv_3 = nn.Sequential(
            nn.Conv1d(in_channels = 1,
                      out_channels = 1,
                      kernel_size = (3, 512)),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size = 50)
        )
        
        self.content_conv_4 = nn.Sequential(
            nn.Conv1d(in_channels = 1,
                      out_channels = 1,
                      kernel_size = (3, 512)),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size = 50)
        )
            
        self.content_conv_5 = nn.Sequential(
            nn.Conv1d(in_channels = 1,
                      out_channels = 1,
                      kernel_size = (3, 512)),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size = 50)
        )
        
        
            
        self.fc = nn.Linear(5, 9)

    def forward(self, title, content):
        title = self.encoder_tit(title)
        print(title.size())
        title_out_1 = self.title_conv_1(title)
        title_out_2 = self.title_conv_2(title)
        
        content = self.encoder_con(content)
        content_out_3 = self.content_conv_3(content)
        content_out_4 = self.content_conv_4(content)
        content_out_5 = self.content_conv_5(content)
            
        conv_out = t.cat((title_out_1,title_out_2,content_out_3,content_out_4,content_out_5),dim=1)
        logits = self.fc(conv_out)
        return F.log_softmax(logits)

cnnt = CNNText()

optimizer = optim.Adam(cnnt.parameters(), lr=.001)
Loss = nn.NLLLoss()

for epoch in range(50):
    loss = 0
    
    t = ''.join(title[epoch])
    c = ''.join(content[epoch])
    T, C = variables_from_pair(t, c)
#     print(T.squeeze(1).unsqueeze(0))
    T = T.squeeze(1).unsqueeze(0)
    C = C.squeeze(1).unsqueeze(0)
    optimizer.zero_grad()
    
    out = cnnt(T, C)
    target = cla[epoch]
    loss += Loss(out, target)
    
    loss.backward()
    optimizer.step()
    
print("Loss is {} at {} epoch".format(loss, epoch))

Error:

torch.Size([1, 3, 64])
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-34-328d44896eef> in <module>()
     15     optimizer.zero_grad()
     16 
---> 17     out = cnnt(T, C)
     18     target = cla[epoch]
     19     loss += Loss(out, target)

/home/quoniammm/anaconda3/envs/py3Tfgpu/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    222         for hook in self._forward_pre_hooks.values():
    223             hook(self, input)
--> 224         result = self.forward(*input, **kwargs)
    225         for hook in self._forward_hooks.values():
    226             hook_result = hook(self, input, result)

<ipython-input-31-fe95ab78725e> in forward(self, title, content)
     52         title = self.encoder_tit(title)
     53         print(title.size())
---> 54         title_out_1 = self.title_conv_1(title)
     55         title_out_2 = self.title_conv_2(title)
     56 

/home/quoniammm/anaconda3/envs/py3Tfgpu/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    222         for hook in self._forward_pre_hooks.values():
    223             hook(self, input)
--> 224         result = self.forward(*input, **kwargs)
    225         for hook in self._forward_hooks.values():
    226             hook_result = hook(self, input, result)

/home/quoniammm/anaconda3/envs/py3Tfgpu/lib/python3.6/site-packages/torch/nn/modules/container.py in forward(self, input)
     65     def forward(self, input):
     66         for module in self._modules.values():
---> 67             input = module(input)
     68         return input
     69 

/home/quoniammm/anaconda3/envs/py3Tfgpu/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    222         for hook in self._forward_pre_hooks.values():
    223             hook(self, input)
--> 224         result = self.forward(*input, **kwargs)
    225         for hook in self._forward_hooks.values():
    226             hook_result = hook(self, input, result)

/home/quoniammm/anaconda3/envs/py3Tfgpu/lib/python3.6/site-packages/torch/nn/modules/conv.py in forward(self, input)
    152     def forward(self, input):
    153         return F.conv1d(input, self.weight, self.bias, self.stride,
--> 154                         self.padding, self.dilation, self.groups)
    155 
    156 

/home/quoniammm/anaconda3/envs/py3Tfgpu/lib/python3.6/site-packages/torch/nn/functional.py in conv1d(input, weight, bias, stride, padding, dilation, groups)
     81     f = ConvNd(_single(stride), _single(padding), _single(dilation), False,
     82                _single(0), groups, torch.backends.cudnn.benchmark, torch.backends.cudnn.enabled)
---> 83     return f(input, weight, bias)
     84 
     85 

RuntimeError: expected 3D tensor

The title is already a 3D tensor. Why does the RuntimeError say "expected 3D tensor"?
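
A likely cause, offered as an assumption rather than a confirmed diagnosis: `nn.Conv1d` takes an integer `kernel_size` and input shaped `(batch, channels, length)`. Passing a 2-tuple such as `(1, 64)` may give the layer a 4D weight, so the "expected 3D tensor" message may refer to the weight rather than the input. The sketch below treats the embedding dimension as the channel axis. The 16 output channels, the kernel size of 2, and the max-over-time pooling are illustrative choices, not the asker's or the repository's design.

```python
import torch
import torch.nn as nn

vocab_size, embed_dim, seq_len = 3281, 64, 3  # sizes taken from the question

embedding = nn.Embedding(vocab_size, embed_dim)
# Integer kernel_size; the embedding dimension is the input channel axis.
conv = nn.Conv1d(in_channels=embed_dim, out_channels=16, kernel_size=2)

tokens = torch.randint(0, vocab_size, (1, seq_len))  # (batch, length)
embedded = embedding(tokens)                         # (batch, length, embed_dim)
features = conv(embedded.permute(0, 2, 1))           # (batch, 16, length - 1)
pooled = torch.relu(features).max(dim=2)[0]          # (batch, 16), max over time

print(embedded.shape, features.shape, pooled.shape)
```

With this layout each branch yields a fixed-size vector regardless of sequence length, so the branch outputs can be concatenated along `dim=1` before the final linear layer.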

Is there a way to reduce memory usage?

Can the fc layer be connected directly to MultiLabelSoftMarginLoss?
