
hamelsmu / code_search


Code For Medium Article: "How To Create Natural Language Semantic Search for Arbitrary Objects With Deep Learning"

Home Page: https://medium.com/@hamelhusain/semantic-code-search-3cd6d244a39c

License: MIT License

Languages: Jupyter Notebook 95.45%, Python 4.48%, Shell 0.01%, HTML 0.07%
code-search data-science deep-learning fastai keras machine-learning machine-learning-on-source-code ml-on-code natural-language-processing nlp python pytorch search search-algorithm searching-algorithms semantic-search semantic-search-engine tensorflow tutorial

code_search's People

Contributors

andrewnc, cheeseblubber, dependabot[bot], hamelsmu, zhaoyicc


code_search's Issues

OverflowError in 1 Preprocess Data

I'm trying to run the notebooks in my own python 3.6 conda environment.

I'm running into a problem when running this code:
pairs = flattenlist(apply_parallel(get_function_docstring_pairs_list, df.content.tolist(), cpu_cores=4))

I see the following traceback:

---------------------------------------------------------------------------
RemoteTraceback                           Traceback (most recent call last)
RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/brian/.conda/envs/tmp/lib/python3.6/site-packages/multiprocess/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/home/brian/.conda/envs/tmp/lib/python3.6/site-packages/multiprocess/pool.py", line 44, in mapstar
    return list(map(*args))
  File "<ipython-input-16-3f34f247210c>", line 40, in get_function_docstring_pairs_list
    return [get_function_docstring_pairs(b) for b in blob_list]
  File "<ipython-input-16-3f34f247210c>", line 40, in <listcomp>
    return [get_function_docstring_pairs(b) for b in blob_list]
  File "<ipython-input-16-3f34f247210c>", line 23, in get_function_docstring_pairs
    source = astor.to_source(f)
  File "/home/brian/.conda/envs/tmp/lib/python3.6/site-packages/astor/code_gen.py", line 52, in to_source
    generator.result.append('\n')
  File "/home/brian/.conda/envs/tmp/lib/python3.6/site-packages/astor/node_util.py", line 143, in visit
    return visitor(node)
  File "/home/brian/.conda/envs/tmp/lib/python3.6/site-packages/astor/code_gen.py", line 320, in visit_FunctionDef
    if not self.indentation:
  File "/home/brian/.conda/envs/tmp/lib/python3.6/site-packages/astor/code_gen.py", line 218, in body
    self.indentation -= 1
  File "/home/brian/.conda/envs/tmp/lib/python3.6/site-packages/astor/code_gen.py", line 168, in write
    elif callable(item):
  File "/home/brian/.conda/envs/tmp/lib/python3.6/site-packages/astor/node_util.py", line 143, in visit
    return visitor(node)
  File "/home/brian/.conda/envs/tmp/lib/python3.6/site-packages/astor/code_gen.py", line 472, in visit_Return
  File "/home/brian/.conda/envs/tmp/lib/python3.6/site-packages/astor/code_gen.py", line 206, in conditional_write
    # Inform the caller that we wrote
  File "/home/brian/.conda/envs/tmp/lib/python3.6/site-packages/astor/code_gen.py", line 168, in write
    elif callable(item):
  File "/home/brian/.conda/envs/tmp/lib/python3.6/site-packages/astor/node_util.py", line 143, in visit
    return visitor(node)
  File "/home/brian/.conda/envs/tmp/lib/python3.6/site-packages/astor/code_gen.py", line 659, in visit_Tuple
    with self.delimit(node, op) as delimiters:
  File "/home/brian/.conda/envs/tmp/lib/python3.6/site-packages/astor/code_gen.py", line 268, in comma_list
    self.write(',' if trailing else '')
  File "/home/brian/.conda/envs/tmp/lib/python3.6/site-packages/astor/code_gen.py", line 168, in write
    elif callable(item):
  File "/home/brian/.conda/envs/tmp/lib/python3.6/site-packages/astor/node_util.py", line 143, in visit
    return visitor(node)
  File "/home/brian/.conda/envs/tmp/lib/python3.6/site-packages/astor/code_gen.py", line 627, in visit_Num
    delimiters.discard = delimiters.pp != pow_lhs
  File "/home/brian/.conda/envs/tmp/lib/python3.6/site-packages/astor/code_gen.py", line 619, in part
    self.write(s)
OverflowError: int too large to convert to float
"""

The above exception was the direct cause of the following exception:

OverflowError                             Traceback (most recent call last)
/media/HDD/brian/code_search/notebooks/general_utils.py in apply_parallel(func, data, cpu_cores)
     75         pool = Pool(cpu_cores)
---> 76         transformed_data = pool.map(func, chunked(data, chunk_size), chunksize=1)
     77     finally:

~/.conda/envs/tmp/lib/python3.6/site-packages/multiprocess/pool.py in map(self, func, iterable, chunksize)
    265         '''
--> 266         return self._map_async(func, iterable, mapstar, chunksize).get()
    267 

~/.conda/envs/tmp/lib/python3.6/site-packages/multiprocess/pool.py in get(self, timeout)
    643         else:
--> 644             raise self._value
    645 

OverflowError: int too large to convert to float

I'm not sure what's going on here. Any help is appreciated.
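One possible workaround (an untested sketch, reusing the helpers defined in the notebook) is to make the per-blob helper tolerant of source files that astor cannot round-trip, such as files containing enormous integer literals, so that one bad file does not kill the whole parallel map:

# hypothetical wrapper around the notebook's get_function_docstring_pairs;
# drops sources that astor fails to regenerate instead of raising
def get_function_docstring_pairs_safe(blob):
    try:
        return get_function_docstring_pairs(blob)
    except (OverflowError, SyntaxError):
        return []

def get_function_docstring_pairs_list_safe(blob_list):
    return [get_function_docstring_pairs_safe(b) for b in blob_list]

pairs = flattenlist(apply_parallel(get_function_docstring_pairs_list_safe,
                                   df.content.tolist(), cpu_cores=4))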

Please Help

Hello,
I am on step 5 of the tutorial using a Jupyter notebook, and have had an interesting time with the dependencies. I am trying to run this cell in step 5:

lang_model = torch.load('./data/lang_model/lang_model_cpu_v2.torch',
                        map_location=lambda storage, loc: storage)

vocab = load_lm_vocab('./data/lang_model/vocab_v2.cls')
q2emb = Query2Emb(lang_model=lang_model.cpu(), vocab=vocab)

search_index = nmslib.init(method='hnsw', space='cosinesimil')
search_index.loadIndex('./data/search/search_index.nmslib')

however, an error is produced...

    135         # Note: be v. careful before removing this, as 3rd party device types
    136         # likely rely on this behavior to properly .to() modules like LSTM.
--> 137         self._flat_weights = [getattr(self, weight) for weight in self._flat_weights_names]
    138
    139         # Flattens params (on CUDA)

AttributeError: 'LSTM' object has no attribute '_flat_weights_names'

which originates in torch\nn\modules\module.py.

The saved model was built on an old version of PyTorch, right? There are also other libraries that I am having trouble getting hold of. I am pretty sure this is a compatibility error; is there any way you could provide the torch version you used for this tutorial, as it no longer seems to be available to download? My OS is Windows x64.
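One thing worth checking first (a guess based on the attribute name, not a confirmed diagnosis): _flat_weights_names was added to torch's RNN modules in later releases, so unpickling a checkpoint saved under an old torch into a much newer one can produce exactly this AttributeError. Printing the installed version is a quick first step:

import torch
# if this is much newer than the tutorial-era torch, a version pin
# (or re-saving the checkpoint under the new version) is the likely fix
print(torch.__version__)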

Consider refactoring f-strings and reconsidering the Python version

From @hohsiangwu, who makes some good points worth considering:

"""
I am not a big fan of f-strings yet. I know they are easier and clearer, but the downside is that they are only supported on Python 3.6 and onwards.

I ran into several problems while dealing with the Python and PyTorch dependencies of your libraries, where I might be on Python 3.5 only and, all of a sudden, none of your libraries work. I would highly recommend that we use more common patterns in a public repository.

I could be overthinking this, so if you decide to keep using f-strings, I will modify my part to follow that convention. I think nothing is more confusing than a repository with mixed patterns.
"""

The Docker container for this tutorial runs Python 3.6.3, but maybe that is not the best choice for our readers? I am posting this issue so I don't forget about it and can come back to it later!
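For anyone weighing the trade-off, the two styles produce identical output; only the minimum Python version differs. A minimal illustration (the names are arbitrary):

# f-strings require Python 3.6+; str.format() also runs on 3.5
name, step = 'code_search', 5
print(f'{name}: step {step}')             # Python 3.6+ only
print('{}: step {}'.format(name, step))   # Python 3.5 compatible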

Issue in fit method in fastai

When I run the fit function on the language model, it shows the error below:

Epoch:   0%|          | 0/7 [00:00<?, ?it/s]
  0%|          | 0/3280 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "E:/Mindtree_IDP/neural_language_model.py", line 41, in <module>
    wd=1e-6)
  File "E:\Mindtree_IDP\lang_model_utils.py", line 243, in train_lang_model
    best_save_name = 'language_model'
  File "C:\Users\GK\AppData\Roaming\Python\Python36\site-packages\fastai-0.7.0-py3.6.egg\fastai\learner.py", line 287, in fit
    return self.fit_gen(self.model, self.data, layer_opt, n_cycle, **kwargs)
  File "C:\Users\GK\AppData\Roaming\Python\Python36\site-packages\fastai-0.7.0-py3.6.egg\fastai\learner.py", line 234, in fit_gen
    swa_eval_freq=swa_eval_freq, **kwargs)
  File "C:\Users\GK\AppData\Roaming\Python\Python36\site-packages\fastai-0.7.0-py3.6.egg\fastai\model.py", line 132, in fit
    loss = model_stepper.step(V(x),V(y), epoch)
  File "C:\Users\GK\AppData\Roaming\Python\Python36\site-packages\fastai-0.7.0-py3.6.egg\fastai\model.py", line 50, in step
    output = self.m(*xs)
  File "c:\users\gk\appdata\local\continuum\anaconda3\envs\mindtree\lib\site-packages\torch\nn\modules\module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "c:\users\gk\appdata\local\continuum\anaconda3\envs\mindtree\lib\site-packages\torch\nn\modules\container.py", line 92, in forward
    input = module(input)
  File "c:\users\gk\appdata\local\continuum\anaconda3\envs\mindtree\lib\site-packages\torch\nn\modules\module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\GK\AppData\Roaming\Python\Python36\site-packages\fastai-0.7.0-py3.6.egg\fastai\lm_rnn.py", line 97, in forward
    raw_output, new_h = rnn(raw_output, self.hidden[l])
  File "c:\users\gk\appdata\local\continuum\anaconda3\envs\mindtree\lib\site-packages\torch\nn\modules\module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\GK\AppData\Roaming\Python\Python36\site-packages\fastai-0.7.0-py3.6.egg\fastai\rnn_reg.py", line 122, in forward
    return self.module.forward(*args)
  File "c:\users\gk\appdata\local\continuum\anaconda3\envs\mindtree\lib\site-packages\torch\nn\modules\rnn.py", line 179, in forward
    self.dropout, self.training, self.bidirectional, self.batch_first)
RuntimeError: shape '[1000000, 1]' is invalid for input of size 2000

Facing error while executing -> from fastai.text import *

I am facing the error below (also attached: errofile.txt):

from fastai import text
Traceback (most recent call last):

  File "", line 1, in <module>
    from fastai import text

  File "C:\Users\NJ077229\AppData\Local\Continuum\anaconda3\lib\site-packages\fastai\text\__init__.py", line 1, in <module>
    from .. import basics

  File "C:\Users\NJ077229\AppData\Local\Continuum\anaconda3\lib\site-packages\fastai\basics.py", line 1, in <module>
    from .basic_train import *

  File "C:\Users\NJ077229\AppData\Local\Continuum\anaconda3\lib\site-packages\fastai\basic_train.py", line 2, in <module>
    from .torch_core import *

  File "C:\Users\NJ077229\AppData\Local\Continuum\anaconda3\lib\site-packages\fastai\torch_core.py", line 2, in <module>
    from .imports.torch import *

ModuleNotFoundError: No module named 'fastai.imports.torch'; 'fastai.imports' is not a package

from fastai.text import *
Traceback (most recent call last):

  File "", line 1, in <module>
    from fastai.text import *

  File "C:\Users\NJ077229\AppData\Local\Continuum\anaconda3\lib\site-packages\fastai\text\__init__.py", line 1, in <module>
    from .. import basics

  File "C:\Users\NJ077229\AppData\Local\Continuum\anaconda3\lib\site-packages\fastai\basics.py", line 1, in <module>
    from .basic_train import *

  File "C:\Users\NJ077229\AppData\Local\Continuum\anaconda3\lib\site-packages\fastai\basic_train.py", line 2, in <module>
    from .torch_core import *

  File "C:\Users\NJ077229\AppData\Local\Continuum\anaconda3\lib\site-packages\fastai\torch_core.py", line 2, in <module>
    from .imports.torch import *

ModuleNotFoundError: No module named 'fastai.imports.torch'; 'fastai.imports' is not a package

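A broken or half-upgraded fastai install is a plausible cause here (an assumption, not a confirmed diagnosis): the notebooks target the fastai 0.7 API, and a 1.x install whose files were partially overwritten can leave fastai.imports in this state. Checking which distribution is actually installed is a quick first step:

import pkg_resources
# the notebooks in this repo were written against fastai 0.7.x
print(pkg_resources.get_distribution('fastai').version)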

fastai backward compatibility issue

The image hamelsmu/ml-gpu installs the latest version of fastai, which makes some parts of the code crash. Solved by uninstalling it and running: pip install fastai==0.7.0; pip install torchtext==0.2.3. Alternatively, you could use the requirements.txt specification also provided in the repo.

Q4.ipynb: file code_summary_seq2seq_model.h5 not found

seq2seq_Model = load_model(str(seq2seq_path)+'/code_summary_seq2seq_model.h5')

fails with 'no such file or directory'.

In my data/seq2seq folder, I only have these 4 files: py_code_proc_v2.dpkl, py_comment_proc_v2.dpkl, py_t_code_vecs_v2.npy, py_t_comment_vecs_v2.npy.

There is no code_summary_seq2seq_model.h5 file.
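For context, that .h5 file is not shipped with the repo; it is presumably written out when the seq2seq summarizer is trained earlier in the tutorial. A minimal sketch, assuming the trained Keras model from that training notebook is in scope as seq2seq_Model:

# saving the trained model produces the file that Q4.ipynb later loads
seq2seq_Model.save(str(seq2seq_path) + '/code_summary_seq2seq_model.h5')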

Dataset Not found

Hi @hamelsmu, I really like your work on this code search, but I couldn't find the dataset for the development of this project, which you mentioned in notebook 1. Is it possible to share the 10 CSV files that you mentioned getting?
Thank you

Part 3 - Training the language model

Hi @hamelsmu

Training the language model (train_lang_model) seems to take 13 hours, and GPU utilization is at 0%. Why does this step not utilize the GPU? Is this intentional, or is there a configuration that needs to be set to enable GPU utilization?

I verified that the environment variable is set to a GPU device.
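A quick sanity check (a generic sketch; it assumes PyTorch is the framework doing the training, which is the case for the fastai language model):

import torch
# if this prints False, training silently falls back to CPU, and a
# 13-hour run is plausible regardless of the environment variable
print(torch.cuda.is_available())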

ModuleNotFoundError: No module named 'tensorflow'

Hello,
how are you?

I'm following notebook 2 using the hamelsmu/ml-cpu docker container, but I encounter the following error when trying to load the preprocessed data (in the "Read Text From File" part):

File "/opt/project/utils/seq2seq.py", line 2, in <module>
    import tensorflow as tf
ModuleNotFoundError: No module named 'tensorflow'

Any suggestions for a solution?

Thank you very much, and congratulations on the excellent tutorial.
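A quick way to confirm whether the container's Python can see TensorFlow at all (a generic check, not specific to this image):

import importlib.util
# True means the module is importable; False means it is simply
# not installed in this environment and needs installing
print(importlib.util.find_spec('tensorflow') is not None)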

Can't load index?

Hello,

I am trying to load the index using the code provided in notebook 5:

search_index = nmslib.init(method='hnsw', space='cosinesimil')
search_index.loadIndex('./data/search/search_index.nmslib')

But, the following error happens:

Check failed: data_level0_memory_
Traceback (most recent call last):
search_index.loadIndex("./data/search/search_index.nmslib")
RuntimeError: Check failed: it's either a bug or inconsistent data!

My computer has only 8 GB of main memory. So, did this happen because the index is over 8 GB and could not be loaded into memory?

Thank you for any help.
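Checking the index size on disk is one way to test the memory hypothesis (a rough check; the in-memory footprint of an HNSW index is roughly at least its on-disk size):

import os
# compare this against available RAM; an index larger than memory
# plausibly fails inside nmslib with an internal check error like the above
print(os.path.getsize('./data/search/search_index.nmslib') / 1e9, 'GB')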

Could not vectorize functions without docstrings

Hello,

I am running notebook 4, the part where functions without docstrings are turned into embeddings. Specifically, I am having trouble executing the following code snippet:

encinp = enc_pp.transform_parallel(no_docstring_funcs)
np.save(code2emb_path/'nodoc_encinp.npy', encinp)

which returns the following error:

WARNING:root:...tokenizing data
---------------------------------------------------------------------------
RemoteTraceback                           Traceback (most recent call last)
RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/site-packages/multiprocess/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/opt/conda/lib/python3.6/site-packages/multiprocess/pool.py", line 44, in mapstar
    return list(map(*args))
  File "/opt/conda/lib/python3.6/site-packages/ktext/preprocess.py", line 88, in process_text
    return [tokenizer(cleaner(doc)) for doc in text]
  File "/opt/conda/lib/python3.6/site-packages/ktext/preprocess.py", line 88, in <listcomp>
    return [tokenizer(cleaner(doc)) for doc in text]
  File "/opt/conda/lib/python3.6/site-packages/ktext/preprocess.py", line 55, in textacy_cleaner
    no_accents=True)
  File "/opt/conda/lib/python3.6/site-packages/textacy/preprocess.py", line 256, in preprocess_text
    text = replace_urls(text)
  File "/opt/conda/lib/python3.6/site-packages/textacy/preprocess.py", line 101, in replace_urls
    return constants.RE_URL.sub(
AttributeError: module 'textacy.constants' has no attribute 'RE_URL'
"""

The above exception was the direct cause of the following exception:

AttributeError                            Traceback (most recent call last)
/opt/conda/lib/python3.6/site-packages/ktext/preprocess.py in apply_parallel(func, data, cpu_cores)
     71         pool = Pool(cpu_cores)
---> 72         transformed_data = pool.map(func, chunked(data, chunk_size), chunksize=1)
     73     finally:

/opt/conda/lib/python3.6/site-packages/multiprocess/pool.py in map(self, func, iterable, chunksize)
    265         '''
--> 266         return self._map_async(func, iterable, mapstar, chunksize).get()
    267 

/opt/conda/lib/python3.6/site-packages/multiprocess/pool.py in get(self, timeout)
    643         else:
--> 644             raise self._value
    645 

AttributeError: module 'textacy.constants' has no attribute 'RE_URL'

During handling of the above exception, another exception occurred:

UnboundLocalError                         Traceback (most recent call last)
<ipython-input-19-8e34745b4d23> in <module>
----> 1 encinp = enc_pp.transform_parallel(test_codes[:5])
      2 np.save(code2emb_path/'test_codes_encinp.npy', encinp)

/opt/conda/lib/python3.6/site-packages/ktext/preprocess.py in transform_parallel(self, data)
    375         """
    376         logging.warning(f'...tokenizing data')
--> 377         tokenized_data = self.parallel_process_text(data)
    378         logging.warning(f'...indexing data')
    379         indexed_data = self.indexer.tokenized_texts_to_sequences(tokenized_data)

/opt/conda/lib/python3.6/site-packages/ktext/preprocess.py in parallel_process_text(self, data)
    231                                                 end_tok=self.end_tok)
    232         n_cores = self.num_cores
--> 233         return flattenlist(apply_parallel(process_text, data, n_cores))
    234 
    235     def generate_doc_length_stats(self):

/opt/conda/lib/python3.6/site-packages/ktext/preprocess.py in apply_parallel(func, data, cpu_cores)
     74         pool.close()
     75         pool.join()
---> 76         return transformed_data
     77 
     78 

UnboundLocalError: local variable 'transformed_data' referenced before assignment

Thanks in advance.
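Note that the root error is the AttributeError inside the worker: ktext's textacy_cleaner expects an older textacy API in which textacy.constants.RE_URL still exists; the final UnboundLocalError is only a secondary symptom of the pool failing before transformed_data is assigned. A quick compatibility check (assuming textacy is importable at all):

import textacy
from textacy import constants
# False here means the installed textacy is too new for this ktext release,
# and pinning an older textacy is the likely fix
print(textacy.__version__, hasattr(constants, 'RE_URL'))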

Pre-Trained Model for Code Search

Is there a pre-trained model for the code search task? I couldn't find one here or in the CodeSearchNet repository. A pre-trained model for either Part 3 or Part 4 would also help.

Why did you fit the model several times?

I have been reading the tutorial; however, in notebook 3 I noticed that you fitted the model several times:

In [18]:

if not use_cache:
    fastai_learner.fit(1e-3, 3, wds=1e-6, cycle_len=2)

HBox(children=(IntProgress(value=0, description='Epoch', max=6), HTML(value='')))

epoch      trn_loss   val_loss                                
    0      3.954703   3.989164  
    1      3.907728   3.975681                                
    2      3.936994   3.976287                                
    3      3.871557   3.96412                                 
    4      3.927649   3.969976                                
    5      3.873011   3.956639                                

Then you use different parameters here:

In [19]:

if not use_cache:
    fastai_learner.fit(1e-3, 2, wds=1e-6, cycle_len=3, cycle_mult=2)

HBox(children=(IntProgress(value=0, description='Epoch', max=9), HTML(value='')))

epoch      trn_loss   val_loss                                
    0      3.925804   3.971093  
    1      3.857519   3.951696                                
    2      3.840948   3.946251                                
    3      3.907309   3.970567                                
    4      3.879899   3.956719                                
    5      3.840587   3.947983                                
    6      3.823401   3.935096                                
    7      3.838912   3.929217                                
    8      3.778818   3.930717                                

And here you use cycle_mult=10:

In [20]:

if not use_cache:
    fastai_learner.fit(1e-3, 2, wds=1e-6, cycle_len=3, cycle_mult=10)

HBox(children=(IntProgress(value=0, description='Epoch', max=33), HTML(value='')))

epoch      trn_loss   val_loss                                
    0      3.86375    3.953147  
    1      3.851326   3.930299                                
    2      3.773453   3.927069                                
    3      3.879102   3.957266                                
    4      3.858202   3.954743                                
    5      3.852824   3.951508                                
    6      3.837561   3.9509                                  
    7      3.818845   3.947756                                
    8      3.809637   3.944036                                
    9      3.835555   3.942263                                
    10     3.824583   3.935868                                
    11     3.827287   3.932043                                
    12     3.817058   3.927741                                
    13     3.778389   3.927357                                
    14     3.779933   3.925774                                
    15     3.780848   3.918761                                
    16     3.746735   3.920191                                
    17     3.743517   3.915674                                
    18     3.752455   3.911835                                
    19     3.758213   3.908067                                
    20     3.768209   3.904584                                
    21     3.711149   3.904635                                
    22     3.770484   3.898746                                
    23     3.767993   3.897296                                
    24     3.707685   3.898568                                
    25     3.694116   3.898346                                
    26     3.749094   3.89368                                 
    27     3.727432   3.894122                                
    28     3.682065   3.89575                                 
    29     3.712119   3.894845                                
    30     3.721573   3.894399                                
    31     3.668023   3.89601                                 
    32     3.710865   3.896029                                

Why do you fit the model like this? Is it just for teaching purposes, or is that the number of times the model must be fitted?
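For what it's worth, the epoch totals in the pasted progress bars (6, 9, 33) are exactly what fastai 0.7's SGDR restart schedule predicts: fit(lr, n_cycle, cycle_len=..., cycle_mult=...) runs n_cycle restarts, each cycle_mult times longer than the previous one.

def total_epochs(n_cycle, cycle_len, cycle_mult=1):
    # sum of cycle lengths: cycle_len * cycle_mult**i for each restart
    return sum(cycle_len * cycle_mult ** i for i in range(n_cycle))

print(total_epochs(3, 2))      # 6,  matches In [18]
print(total_epochs(2, 3, 2))   # 9,  matches In [19]
print(total_epochs(2, 3, 10))  # 33, matches In [20]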

Getting error with apply_parallel in 1-Preprocessing

I get UnboundLocalError: local variable 'transformed_data' referenced before assignment when running the line of code below. Please help out.

%%time
pairs = flattenlist(apply_parallel(get_function_docstring_pairs_list, df.content.tolist(), cpu_cores=32))
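One debugging approach (a sketch using the same helpers the notebook defines): run the function serially on a small sample first, so the real worker exception surfaces instead of the UnboundLocalError that apply_parallel raises when the pool fails before transformed_data is assigned:

# serial run over a small slice; any exception raised here is the true root cause
sample = df.content.tolist()[:100]
pairs = flattenlist([get_function_docstring_pairs_list(sample)])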

Questions on how to use GAN in your article

Hi, Hamel. I'm an undergraduate student at a Chinese university, and I'm currently doing a project (actually my graduation project) on code generation. I have read your article on semantic search on towardsdatascience.com, which inspired me a lot. I'm wondering about this paragraph in your article: "It should be noted that training a seq2seq model to summarize code is not the only technique you can use to build a feature extractor for code. For example, you could also train a GAN and use the discriminator as a feature extractor."

I am confused about how to do this with a GAN, because there seem to be many difficulties. Can you give me some specific advice or reference papers on how to do code search with a GAN?

I have also read some other papers, such as DeepCodeSearch (published at ICSE '18) by Xiaodong Gu of HKUST. Their work is mainly on joint embedding and got good results on Java code search. Their approach seems a little different from yours but also has good experimental results.

What's more, I want to reproduce your work in PyTorch, and I really hope I can get good results.

Sincerely

Regarding issue in parameter count

Why do Keras and PyTorch always show different parameter counts? I built a model in Torch and compared it with Keras, and Torch reports fewer learnable parameters than Keras. Why?
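A concrete way to compare (a generic sketch; the layer here is a hypothetical example) is to count trainable parameters explicitly on the PyTorch side and set the result against the total from Keras's model.summary():

import torch.nn as nn

model = nn.LSTM(input_size=100, hidden_size=128)  # hypothetical example model
# numel() counts elements per tensor; requires_grad filters out frozen weights
n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(n_params)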

NameError: name 'LanguageModelData' is not defined

@hamelsmu

Hi, I am trying to run the set of notebooks provided here. I was able to preprocess the data successfully, but when I try to run the third notebook, "3 - Train Language Model Using FastAI", I get the following error:

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-11-a18631c71446> in <module>
     12                                                   cycle_mult=2,
     13                                                   bs = 32,
---> 14                                                   wd = 1e-6)
     15 
     16 elif use_cache:

D:\code_search-github\lang_model_utils.py in train_lang_model(model_path, trn_indexed, val_indexed, vocab_size, lr, n_cycle, cycle_len, cycle_mult, em_sz, nh, nl, bptt, wd, bs)
    191 
    192     # create lang model data
--> 193     md = LanguageModelData(mpath, 1, vocab_size, trn_dl, val_dl, bs=bs, bptt=bptt)
    194 
    195     # build learner

NameError: name 'LanguageModelData' is not defined

In one of the issues in the FastAI repository, I read that 'LanguageModelData' was replaced by 'TextLMDataBunch'. I tried using that but didn't have any success.

How should I proceed so that I can run this notebook properly?

Python-Version - 3.6.8
Cuda-Version - 10.0
FastAI-Version - 1.0
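Given the listed FastAI version, the 1.0 API is the likely culprit: per the "fastai backward compatibility issue" above, this repo targets fastai 0.7, whose text module still provides LanguageModelData (lang_model_utils.py relies on it via a star import). A minimal check, assuming fastai 0.7.0 is installed as that issue suggests:

# succeeds on fastai 0.7.x; fails on fastai 1.x, where the class was removed
from fastai.text import LanguageModelData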
