gauravbh1010tt / deeplearn

Implementation of research papers on Deep Learning+ NLP+ CV in Python using Keras, Tensorflow and Scikit Learn.

License: MIT License

deep-learning nlp computer-vision audio-processing

deeplearn's Introduction

DeepLearn

Welcome to DeepLearn. This repository contains implementations of the following research papers on NLP, CV, ML, and deep learning.

- Latest update: added the _deeplearn_utils modules. Check the releases for a description of the new features.

[1] Correlation Neural Networks. CV, transfer learning, representation learning. code

[2] Reasoning With Neural Tensor Networks for Knowledge Base Completion. NLP, ML. code

[3] Common Representation Learning Using Step-based Correlation Multi-Modal CNN. CV, transfer learning, representation learning. code

[4] ABCNN: Attention-Based Convolutional Neural Network for Modeling Sentence Pairs. NLP, deep learning, sentence matching. code

[5] Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks. NLP, deep learning, CQA. code

[6] Combining Neural, Statistical and External Features for Fake News Stance Identification. NLP, IR, deep learning. code

[7] WIKIQA: A Challenge Dataset for Open-Domain Question Answering. NLP, deep learning, CQA. code

[8] Siamese Recurrent Architectures for Learning Sentence Similarity. NLP, sentence similarity, deep learning. code

[9] Convolutional Neural Tensor Network Architecture for Community Question Answering. NLP, deep learning, CQA. code

[10] Map-Reduce for Machine Learning on Multicore. map-reduce, hadoop, ML. code

[11] Teaching Machines to Read and Comprehend. NLP, deep learning. code

[12] Improved Representation Learning for Question Answer Matching. NLP, deep learning, CQA. code

[13] External features for community question answering. NLP, deep learning, CQA. code

[14] Language Identification and Disambiguation in Indian Mixed-Script. NLP, IR, ML. code

[15] Construction of a Semi-Automated model for FAQ Retrieval via Short Message Service. NLP, IR, ML. code

Dependencies:

The required dependencies are listed in requirements.txt. The code also uses the dl-text modules for preparing the datasets; if you haven't used them, please have a quick look.

$ pip install -r requirements.txt

deeplearn's People

Contributors: gauravbh1010tt, piyush-j

deeplearn's Issues

neural tensor network data

Hi, thanks for your code — I learned a lot from it, especially the NTN. Now I want to build a new model and evaluate it on a new dataset, but I don't know how to generate the embedding matrix used in the neural tensor network experiment. Could you please tell me how to do it? Thanks.

ABCNN some .npy files missed

In wiki_utils.py three files are referenced but cannot be found:

    feat_LS = np.load('../data/wiki/Extracted_Features/lex.npy')
    feat_read = np.load('../data/wiki/Extracted_Features/read.npy')
    feat_numeric = np.load('../data/wiki/Extracted_Features/numeric.npy')
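The Extracted_Features .npy files are not shipped with the repository. Until they are published, a small guard (a hypothetical helper, not part of wiki_utils.py) avoids a hard crash on load:

```python
import os
import numpy as np

def load_optional_feature(path):
    """Return the saved feature matrix at `path`, or None if the file is absent."""
    return np.load(path) if os.path.exists(path) else None

# Each load now degrades gracefully instead of raising an IOError:
feat_LS = load_optional_feature('../data/wiki/Extracted_Features/lex.npy')
print(feat_LS is None)  # True when the file has not been generated yet
```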

No module named dl

After installing the requirements I get a "No module named dl" error in WikiQA_CNN+Feat.

ImportError: No module named 'dl_layers'

I successfully imported dl_text.

However, in MaLSTM (Siamese)/model_Siam_LSTM.py there is this import statement:
from dl_layers.layers import Abs, Exp

and the error I received is:

    Traceback (most recent call last):
      File "main.py", line 10, in <module>
        import model_Siam_LSTM as model
      File "/home/remondn/workspace/benchmark/DeepLearn/MaLSTM (Siamese)/model_Siam_LSTM.py", line 13, in <module>
        from dl_layers.layers import Abs, Exp
    ImportError: No module named 'dl_layers'

Where can I find dl_layers ?
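The dl_layers module ships with the _deeplearn_utils modules added in release v1.1. For context, Abs and Exp are presumably thin element-wise wrappers used by the MaLSTM similarity exp(-||h1 - h2||₁) from paper [8]; a numpy sketch (hypothetical shapes) of the score those layers build:

```python
import numpy as np

def malstm_similarity(h1, h2):
    # Manhattan-LSTM similarity: exponent of the negative L1 distance between
    # the two sentence encodings; 1.0 for identical vectors, -> 0 as they diverge.
    return np.exp(-np.sum(np.abs(h1 - h2), axis=-1))

h1 = np.array([[0.2, 0.4, 0.1]])  # encoding of sentence 1
h2 = np.array([[0.2, 0.4, 0.1]])  # encoding of sentence 2
print(malstm_similarity(h1, h2))  # [1.] for identical encodings
```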

Add License

Some pretty important papers are implemented here.
It would be nice to know the license.

test.ref

Hello,

For the convolutional neural tensor network, could you please explain what the five values on each line of the test.ref file indicate?

Thanks.

Undefined names

$ python2 -m flake8 . --count --select=E901,E999,F821,F822,F823 --show-source --statistics

./TrecQA_CNN+Sim/model_sim.py:64:14: F821 undefined name 'merge'
        h1 = merge([h1] + channel_1, mode="concat")
             ^
./TrecQA_CNN+Sim/model_sim.py:68:14: F821 undefined name 'merge'
        h2 = merge([h2] + channel_2, mode="concat")
             ^
./TrecQA_CNN+Sim/model_sim.py:133:14: F821 undefined name 'merge'
        h1 = merge([h1] + channel_1, mode="concat")
             ^
./TrecQA_CNN+Sim/model_sim.py:137:14: F821 undefined name 'merge'
        h2 = merge([h2] + channel_2, mode="concat")
             ^
./fake news challenge (FNC-1)/fnc_libs.py:145:62: F821 undefined name 'd'
    X_holdout,y_holdout = generate_features(hold_out_stances,d,"holdout")
                                                             ^
./fake news challenge (FNC-1)/fnc_libs.py:147:66: F821 undefined name 'd'
        Xs[fold],ys[fold] = generate_features(fold_stances[fold],d,str(fold))
                                                                 ^
./convolution neural tensor network/model_cntn.py:66:14: F821 undefined name 'merge'
        h1 = merge([h1] + channel_1, mode="concat")
             ^
./convolution neural tensor network/model_cntn.py:70:14: F821 undefined name 'merge'
        h2 = merge([h2] + channel_2, mode="concat")
             ^
8     F821 undefined name 'merge'
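The undefined `merge` is a Keras 1 function that Keras 2 removed; the drop-in replacement for `mode="concat"` is `keras.layers.concatenate`, which joins tensors along the last axis. A numpy sketch of the operation (hypothetical shapes, since the real inputs are Keras tensors):

```python
import numpy as np

# Keras 1:  h1 = merge([h1] + channel_1, mode="concat")
# Keras 2:  h1 = keras.layers.concatenate([h1] + channel_1)
# Both concatenate along the last (feature) axis, as numpy shows:
h1 = np.ones((2, 4))                              # e.g. a (batch, features) activation
channel_1 = [np.zeros((2, 3)), np.zeros((2, 3))]  # extra channels to append
merged = np.concatenate([h1] + channel_1, axis=-1)
print(merged.shape)  # (2, 10)
```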

Released version v1.1

Added the _deeplearn_utils modules. The following features were added:

  • Added the _deeplearn_utils folder to remove redundancy.
  • Removed redundant data (Trec, Wiki in subfolders).
  • Removed redundant copies of the dl_text and dl_layers modules.
  • Updated the versions in the requirements.txt file.

If there are problems with this release, please reply to this post and let me know.

No matching distribution found for StandardScaler

Following your instructions, I ran pip install -r requirements.txt and got:

pip install -r requirements.txt
Collecting numpy==1.11.0 (from -r requirements.txt (line 6))
Using cached https://files.pythonhosted.org/packages/1a/5c/57c6920bf4a1b1c11645b625e5483d778cedb3823ba21a017112730f0a12/numpy-1.11.0.tar.gz
Requirement already satisfied: matplotlib in c:\users\programmer\anaconda3\lib\site-packages (from -r requirements.txt (line 7))
Requirement already satisfied: pandas in c:\users\programmer\appdata\roaming\python\python36\site-packages (from -r requirements.txt (line 8))
Requirement already satisfied: scikit-learn in c:\users\programmer\anaconda3\lib\site-packages (from -r requirements.txt (line 9))
Requirement already satisfied: scipy in c:\users\programmer\anaconda3\lib\site-packages (from -r requirements.txt (line 10))
Requirement already satisfied: h5py in c:\users\programmer\anaconda3\lib\site-packages (from -r requirements.txt (line 11))
Collecting keras==2.1.5 (from -r requirements.txt (line 12))
Using cached https://files.pythonhosted.org/packages/ba/65/e4aff762b8696ec0626a6654b1e73b396fcc8b7cc6b98d78a1bc53b85b48/Keras-2.1.5-py2.py3-none-any.whl
Collecting theano==0.9.0 (from -r requirements.txt (line 13))
Using cached https://files.pythonhosted.org/packages/28/03/6af9ff242da966f89de6ab81164db0d1a36fd89379b7370f07043de62f10/Theano-0.9.0.tar.gz
Collecting StandardScaler (from -r requirements.txt (line 14))
Could not find a version that satisfies the requirement StandardScaler (from -r requirements.txt (line 14)) (from versions: )

No matching distribution found for StandardScaler (from -r requirements.txt (line 14))

Can you please update your requirements.txt or give us a workaround?
Thanks
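StandardScaler is not a standalone PyPI package; it is a class inside scikit-learn, which requirements.txt already lists, so pip can never resolve it. The workaround is to delete the StandardScaler line from requirements.txt and import the class from sklearn instead:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler  # ships with scikit-learn

X = np.array([[1.0], [2.0], [3.0]])
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)  # per-column zero mean, unit variance
print(X_scaled.mean(), X_scaled.std())  # close to 0.0 and 1.0
```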

StandardScaler.transform is throwing error "ValueError: Expected 2D array, got 1D array instead:"

I just started learning data science and ML, and I am trying out code available online.

I have pandas DataFrames (Xtrain, Xtest) and pandas Series (ytrain, ytest) as output from the train_test_split function.

Normalizing Xtrain and Xtest (sc.fit_transform and sc.transform) succeeded, but I cannot get past normalizing ytrain and ytest. It gives the error below:

    ValueError                                Traceback (most recent call last)
    in ()
          1 #ytrain=StandardScaler.transform(ytrain[:,-1])
          2
    ----> 3 ytrain=sc.transform(ytrain)

    C:\Program Files (x86)\Microsoft Visual Studio\Shared\Anaconda3_64\lib\site-packages\sklearn\preprocessing\data.py in transform(self, X, y, copy)
        679         copy = copy if copy is not None else self.copy
        680         X = check_array(X, accept_sparse='csr', copy=copy, warn_on_dtype=True,
    --> 681                         estimator=self, dtype=FLOAT_DTYPES)
        682
        683         if sparse.issparse(X):

    C:\Program Files (x86)\Microsoft Visual Studio\Shared\Anaconda3_64\lib\site-packages\sklearn\utils\validation.py in check_array(array, accept_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
        439             "Reshape your data either using array.reshape(-1, 1) if "
        440             "your data has a single feature or array.reshape(1, -1) "
    --> 441             "if it contains a single sample.".format(array))
        442         array = np.atleast_2d(array)
        443         # To ensure that array flags are maintained

    ValueError: Expected 2D array, got 1D array instead:
    array=[0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 1. 0.
     0. 1. 1. 0. 1. 1. 1. 1. 1. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 1. 0. 0. 1.
     1. 1.].
    Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
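The error message itself states the fix: scikit-learn transformers expect a 2-D (n_samples, n_features) array, and a pandas Series comes out 1-D. A minimal sketch (hypothetical data):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

ytrain = np.array([0., 1., 0., 1., 1.])  # 1-D, shape (5,) -- what a Series yields
sc = StandardScaler()
# reshape(-1, 1) turns the series into a (n_samples, 1) column vector first:
y_scaled = sc.fit_transform(ytrain.reshape(-1, 1))
print(y_scaled.shape)  # (5, 1)
```

For a pandas Series, `ytrain.values.reshape(-1, 1)` does the same. Note that standardizing 0/1 class labels is usually unnecessary in the first place — scaling is for features, not targets.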
