xiaojunxu / sqlnet

Neural Network for generating structured queries from natural language.

License: BSD 3-Clause "New" or "Revised" License

Shell 0.16% Python 99.84%

sqlnet's Issues

Training with reinforcement learning takes an unusually long time

Even with the GPU, training takes a very long time. Has anyone else faced this issue?

ValueError: all input arrays must have the same shape

Traceback (most recent call last):
File "extract_vocab.py", line 62, in <module>
emb_array = np.stack(embs, axis=0)
File "C:\python\Anaconda3\lib\site-packages\numpy\core\shape_base.py", line 347, in stack
raise ValueError('all input arrays must have the same shape')

OS: Windows 10 64-bit
Python: 3.6.1
Command: python extract_vocab.py

Problem when using it with dataset other than WikiSQL

When I run train.py with a dataset other than WikiSQL, I get the following error:

Traceback (most recent call last):
File "train.py", line 128, in <module>
sql_data, table_data, TRAIN_ENTRY)
File "/[email protected]#0/sqlnet/utils.py", line 146, in epoch_train
gt_where=gt_where_seq, gt_cond=gt_cond_seq, gt_sel=gt_sel_seq)
File "/[email protected]#0/sqlnet/model/sqlnet.py", line 141, in forward
gt_where, gt_cond, reinforce=reinforce)
File "/opt/conda/envs/python2.7/lib/python2.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/[email protected]#0/sqlnet/model/modules/sqlnet_condition_predict.py", line 253, in forward
cond_str_score[b, :, :, num:] = -100
IndexError: too many indices for tensor of dimension 3

Can anyone help?

How to use it with a database other than WikiSQL

I would like to understand how to use it with a database other than WikiSQL. I'm new to ML and would like to use it for querying attendance data. Can you please provide instructions for setting it up?

Why is cond_num_out fixed to 5?

self.cond_num_out = nn.Sequential(nn.Linear(N_h, N_h),nn.Tanh(), nn.Linear(N_h, 5))

https://github.com/xxj96/SQLNet/blob/master/sqlnet/model/modules/sqlnet_condition_predict.py#L22

@xxj96 Thank you!!

About the "order matters" problem

Hi, different orderings of the conditions produce the same query result, and our goal is the query result. Why, then, does the order of the conditions affect performance? I don't understand this point.
Thanks
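One way to picture the question: two WHERE clauses that differ only in condition order are different token sequences, so a decoder trained with a per-token loss is penalised for emitting the "other" order, while a set comparison treats them as equal. The sketch below only illustrates that distinction; the condition tuples and both helper functions are made up for the example and are not taken from the repository.

```python
# Each condition is (column, operator, value); the values are illustrative.
gold = [("age", ">", "30"), ("city", "=", "Paris")]
pred = [("city", "=", "Paris"), ("age", ">", "30")]


def sequence_match(a, b):
    """Order-sensitive comparison, as a sequence-to-sequence loss sees it."""
    return list(a) == list(b)


def set_match(a, b):
    """Order-insensitive comparison, as a sequence-to-set view sees it."""
    return set(a) == set(b)


print(sequence_match(gold, pred))  # False
print(set_match(gold, pred))       # True
```

Both lists describe the same query, yet only the set comparison counts the prediction as correct.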

dataset error: sqlite3.ProgrammingError, sqlalchemy.exc.ProgrammingError:

Hi, when I run 'python test.py --ca' to get execution results, it fails at 'print("Dev execution acc: {}".format(epoch_exec_acc(model, BATCH_SIZE, val_sql_data, val_table_data, DEV_DB)))'.
The error is like this:
Error closing cursor
Traceback (most recent call last):
File "/data/anaconda/envs/py35/lib/python3.5/site-packages/sqlalchemy/engine/result.py", line 1268, in fetchone
row = self._fetchone_impl()
File "/data/anaconda/envs/py35/lib/python3.5/site-packages/sqlalchemy/engine/result.py", line 1148, in _fetchone_impl
return self.cursor.fetchone()
sqlite3.ProgrammingError: Cannot operate on a closed database.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/data/anaconda/envs/py35/lib/python3.5/site-packages/sqlalchemy/engine/base.py", line 1325, in _safe_close_cursor
cursor.close()
sqlite3.ProgrammingError: Cannot operate on a closed database.
Traceback (most recent call last):
File "/data/anaconda/envs/py35/lib/python3.5/site-packages/sqlalchemy/engine/result.py", line 1268, in fetchone
row = self._fetchone_impl()
File "/data/anaconda/envs/py35/lib/python3.5/site-packages/sqlalchemy/engine/result.py", line 1148, in _fetchone_impl
return self.cursor.fetchone()
sqlite3.ProgrammingError: Cannot operate on a closed database.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "test.py", line 83, in <module>
model, BATCH_SIZE, test_sql_data, test_table_data, TEST_DB)))
File "/data/home/naturallanguage/text2sql/sqlnet/utils.py", line 178, in epoch_exec_acc
ret_gt = engine.execute(tid, sql_gt['sel'], sql_gt['agg'], sql_gt['conds'])
File "/data/home/naturallanguage/text2sql/sqlnet/lib/dbengine.py", line 25, in execute
table_info = self.db.query('SELECT sql from sqlite_master WHERE tbl_name = :name', name=table_id).all()[0].sql.replace('\n','')
File "/data/anaconda/envs/py35/lib/python3.5/site-packages/records.py", line 195, in all
rows = list(self)
File "/data/anaconda/envs/py35/lib/python3.5/site-packages/records.py", line 126, in iter
yield next(self)
File "/data/anaconda/envs/py35/lib/python3.5/site-packages/records.py", line 136, in next
nextrow = next(self._rows)
File "/data/anaconda/envs/py35/lib/python3.5/site-packages/records.py", line 365, in
row_gen = (Record(cursor.keys(), row) for row in cursor)
File "/data/anaconda/envs/py35/lib/python3.5/site-packages/sqlalchemy/engine/result.py", line 946, in iter
row = self.fetchone()
File "/data/anaconda/envs/py35/lib/python3.5/site-packages/sqlalchemy/engine/result.py", line 1276, in fetchone
e, None, None, self.cursor, self.context
File "/data/anaconda/envs/py35/lib/python3.5/site-packages/sqlalchemy/engine/base.py", line 1458, in _handle_dbapi_exception
util.raise_from_cause(sqlalchemy_exception, exc_info)
File "/data/anaconda/envs/py35/lib/python3.5/site-packages/sqlalchemy/util/compat.py", line 296, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb, cause=cause)
File "/data/anaconda/envs/py35/lib/python3.5/site-packages/sqlalchemy/util/compat.py", line 276, in reraise
raise value.with_traceback(tb)
File "/data/anaconda/envs/py35/lib/python3.5/site-packages/sqlalchemy/engine/result.py", line 1268, in fetchone
row = self._fetchone_impl()
File "/data/anaconda/envs/py35/lib/python3.5/site-packages/sqlalchemy/engine/result.py", line 1148, in _fetchone_impl
return self.cursor.fetchone()
sqlalchemy.exc.ProgrammingError: (sqlite3.ProgrammingError) Cannot operate on a closed database. (Background on this error at: http://sqlalche.me/e/f405)

Any hint on how to solve this? Many thanks!
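The traceback shows the schema lookup in dbengine.py failing because the SQLite connection is already closed by the time the lazily fetched rows are read. One possible workaround, offered here as an assumption rather than a confirmed fix, is to read the schema through the standard sqlite3 module on a connection that stays open for the whole evaluation. The helper name get_table_schema and the table name table_1 are hypothetical.

```python
import sqlite3


def get_table_schema(conn, table_id):
    """Return the CREATE TABLE statement for table_id, or None if absent."""
    row = conn.execute(
        "SELECT sql FROM sqlite_master WHERE tbl_name = :name",
        {"name": table_id},
    ).fetchone()
    return row[0].replace("\n", "") if row else None


# In-memory database standing in for the WikiSQL .db file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_1 (col0 TEXT, col1 REAL)")
schema = get_table_schema(conn, "table_1")
missing = get_table_schema(conn, "no_such_table")
print(schema)
conn.close()
```

The row is fetched eagerly with fetchone(), so nothing is read after the connection is closed.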

Why does the training stage seem to decode only once?

https://github.com/xiaojunxu/SQLNet/blob/master/sqlnet/model/modules/sqlnet_condition_predict.py#L229

g_str_s_flat, _ = self.cond_str_decoder(
                    gt_tok_seq.view(B*4, -1, self.max_tok_num))

Is it because PyTorch's LSTM takes a whole sequence as input and returns a whole sequence as output?

I think so...

@xiaojunxu Thanks a lot!

Column Slots Equation representation

Hi Xiaojun Xu ,

I'm trying to understand the equation below from your paper, which predicts the number of columns in the WHERE clause:
$$P_{\#col}(K \mid Q) = \mathrm{softmax}\left(U_1^{\#col} \tanh\left(U_2^{\#col} E_{Q|Q}\right)\right)_i$$

Can you please explain $E_{Q|Q}$ here?

Thanks,
Niyas

Help!! Don't have Cuda

Hi, this package uses the GPU (CUDA) for processing, which is not available on my server. Can you please guide me on how to use it without a GPU?

P.S. I am working on a critical project and would be really grateful for early help.

Thanks & Regards,
Manas

Assertion `cur_target >= 0 && cur_target < n_classes' failed

I am getting an error on the following line:

loss += self.CE(sel_score, sel_truth_var)

Error:-
RuntimeError: Assertion `cur_target >= 0 && cur_target < n_classes' failed. at c:\new-builder_3\win-wheel\pytorch\aten\src\thnn\generic/ClassNLLCriterion.c:93
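The assertion text says a target index was negative or not smaller than the number of classes the loss received. A quick way to locate the offending example is to validate the targets before calling the loss. This is a sketch with plain lists instead of tensors; find_bad_targets and the sample values are hypothetical.

```python
def find_bad_targets(targets, n_classes):
    """Return (position, target) pairs where target is outside [0, n_classes)."""
    return [(i, t) for i, t in enumerate(targets) if not 0 <= t < n_classes]


# Illustrative batch: gold SELECT column indices for a table with 5 columns.
sel_truth = [0, 3, 5, -1, 4]
bad = find_bad_targets(sel_truth, n_classes=5)
print(bad)  # [(2, 5), (3, -1)]
```

With the real data, n_classes would be the size of the last dimension of sel_score for the batch.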

What does the code mean?

SQLNet/sqlnet/model/modules/seq2sql_condition_predict.py

Line 86 in 533ac0d

while len(done_set) < B*4 and t < 100:

I'm so sorry to trouble you, but I really can not understand the logic of that code.

Thank you @xxj96

extract_vocab.py : all input arrays must have the same shape

When executing extract_vocab.py it raised this error :

(base) C:\Users\Albel\Documents\SQLNet>python extract_vocab.py
Loading from original dataset
Loading data from %s data/train_tok.jsonl
Loading data from %s data/train_tok.tables.jsonl
Loading data from %s data/dev_tok.jsonl
Loading data from %s data/dev_tok.tables.jsonl
Loading data from %s data/test_tok.jsonl
Loading data from %s data/test_tok.tables.jsonl
Loading word embedding from %s glove/glove.42B.300d.txt
Length of word vocabulary: %d 1917495
Length of used word vocab: %s 39936
Traceback (most recent call last):
File "extract_vocab.py", line 62, in <module>
emb_array = np.stack(embs, axis=0)
File "C:\Anaconda3\lib\site-packages\numpy\core\shape_base.py", line 353, in stack
raise ValueError('all input arrays must have the same shape')
ValueError: all input arrays must have the same shape
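np.stack requires every array to have the same shape, so this error may mean that at least one embedding vector was parsed with a different length, for example from a malformed line in the embedding file. The sketch below shows one way to find such rows before stacking. It uses plain lists so it runs without NumPy, and split_by_dim is a hypothetical helper, not part of the repository.

```python
def split_by_dim(embs, expected_dim):
    """Separate vectors of the expected length from the rest."""
    good, bad = [], []
    for idx, vec in enumerate(embs):
        (good if len(vec) == expected_dim else bad).append((idx, vec))
    return good, bad


# Toy data: the second vector is one element short.
embs = [[0.1, 0.2, 0.3], [0.4, 0.5], [0.6, 0.7, 0.8]]
good, bad = split_by_dim(embs, expected_dim=3)
print([idx for idx, _ in bad])  # [1]
```

For the GloVe file named in the log, expected_dim would be 300.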

The dimensions in Equation (1) of the paper don't seem to match?

E_col is [batch_size, column_num, hidden_size]
E_Q is [batch_size, question_words_num, hidden_size]

@xiaojunxu Thanks a lot!

Any reason for not using a sequence labeling model?

Memory Issue

While running extract_vocab.py, memory usage climbs to 98%. I was afraid to let the script continue, so I had to stop it.
Can anyone help?
@xiaojunxu What should I do to deal with this?
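Memory use here is probably dominated by holding every GloVe vector in memory at once (the log in a related issue reports a vocabulary of 1917495 words). One way to reduce it is to stream the file and keep only the words that occur in the dataset. The sketch below assumes the usual "word v1 v2 ..." text format; load_filtered_emb is a hypothetical helper and the three-dimensional vectors are toy data.

```python
import io


def load_filtered_emb(lines, vocab):
    """Stream 'word v1 v2 ...' lines, keeping only words found in vocab."""
    emb = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        word = parts[0]
        if word in vocab:
            emb[word] = [float(x) for x in parts[1:]]
    return emb


# Tiny stand-in for glove.42B.300d.txt (3-dimensional for brevity).
fake_glove = io.StringIO("the 0.1 0.2 0.3\ncat 0.4 0.5 0.6\nzebra 0.7 0.8 0.9\n")
emb = load_filtered_emb(fake_glove, vocab={"the", "zebra"})
print(sorted(emb))  # ['the', 'zebra']
```

Only one line is held in memory at a time, so peak usage scales with the dataset vocabulary rather than with the full embedding file.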

Error in Train.py

It seems that train.py generates errors. Are there any prerequisites?

(base) C:\Users\Albel\Documents\SQLNet>python train.py --ca
Loading from original dataset
Loading data from %s data/train_tok.jsonl
Loading data from %s data/train_tok.tables.jsonl
Loading data from %s data/dev_tok.jsonl
Loading data from %s data/dev_tok.tables.jsonl
Loading data from %s data/test_tok.jsonl
Loading data from %s data/test_tok.tables.jsonl
Loading word embedding from %s glove/glove.42B.300d.txt
Using fixed embedding
Traceback (most recent call last):
File "train.py", line 57, in <module>
gpu=GPU, trainable_emb = args.train_emb)
File "C:\Users\Albel\Documents\SQLNet\sqlnet\model\sqlnet.py", line 43, in __init__
self.agg_pred = AggPredictor(N_word, N_h, N_depth, use_ca=use_ca)
File "C:\Users\Albel\Documents\SQLNet\sqlnet\model\modules\aggregator_predict.py", line 18, in __init__
dropout=0.3, bidirectional=True)
File "C:\Anaconda3\lib\site-packages\torch\nn\modules\rnn.py", line 425, in __init__
super(LSTM, self).__init__('LSTM', *args, **kwargs)
File "C:\Anaconda3\lib\site-packages\torch\nn\modules\rnn.py", line 52, in __init__
w_ih = Parameter(torch.Tensor(gate_size, layer_input_size))
TypeError: new() received an invalid combination of arguments - got (float, int), but expected one of:
 * (torch.device device)
 * (torch.Storage storage)
 * (Tensor other)
 * (tuple of ints size, torch.device device)
      didn't match because some of the arguments have invalid types: (float, int)
 * (object data, torch.device device)
      didn't match because some of the arguments have invalid types: (float, int)
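The "(float, int)" in the message suggests that a layer size reached the LSTM constructor as a float. Under Python 3 the / operator always returns a float, so a size computed by dividing the hidden size (an assumption about how aggregator_predict.py builds its LSTM, since that line is not shown here) would need floor division instead. The value 100 is illustrative.

```python
N_h = 100  # illustrative hidden size

py2_style = N_h / 2   # true division: a float under Python 3
py3_safe = N_h // 2   # floor division: stays an int

print(type(py2_style).__name__, py2_style)  # float 50.0
print(type(py3_safe).__name__, py3_safe)    # int 50
```

If that is the cause, the same change would be needed wherever a module size is derived by division.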

Aggregation Prediction uses select column from the ground truth

The code snippets below appear to use the ground-truth SELECT column position during evaluation as well as training.

SQLNet/sqlnet/utils.py

Line 204 in 5dfb96e

gt_sel_seq = [x[1] for x in ans_seq]

SQLNet/sqlnet/model/sqlnet.py

Line 131 in 5dfb96e

col_name_len, col_len, col_num, gt_sel=gt_sel)

SQLNet/sqlnet/model/modules/aggregator_predict.py

Line 41 in 5dfb96e

chosen_sel_idx = torch.LongTensor(gt_sel)

Don't you think the model should predict the SELECT column itself before aggregation prediction, instead of using the column given as ground truth?
In a real-world application, the SELECT column will not be available at prediction time.

You already make a selection prediction, so its output could be fed to the aggregation predictor.
As a result, the reported evaluation numbers might not be comparable with those of the original Seq2SQL paper.

Why is over-fitting the biggest problem for the WHERE clause?

Epoch 300 @ 2018-03-25 16:22:07.151084
 Loss = 0.16111447376294669
 Train acc_qm: 0.957164404223
   breakdown result: [0.99918375 0.99655754 0.96026972]
 Dev acc_qm: 0.579147369671
   breakdown result: [0.880418   0.89882437 0.70383565]
 Best val acc = (0.8990618691366821, 0.9055931599572498, 0.7138107113169457), on epoch (3, 48, 259) individually

@xiaojunxu
Thank you!!

Do you think text classification is a better method for AGG prediction?

That is, use a text classification model to predict AGG; there are only a few classes, such as MIN, MAX and COUNT.
Thank you
@xiaojunxu

Why is the ground truth (gt_sel) used here?

https://github.com/xxj96/SQLNet/blob/master/sqlnet/model/modules/aggregator_predict.py#L41

What happens in test/predict mode, where there is no ground truth?

Thank you @xxj96

Error in rnn.py file

After making a few changes to utils.py and switching from Python 2 to Python 3, I get an error in rnn.py when I run train.py: "TypeError: super(type, obj): obj must be an instance or subtype of type". In the __init__ function I changed "super(LSTM, self).__init__('LSTM', *args, **kwargs)" to the two lines below,
self.as_super = super(LSTM, self)
self.as_super.__init__('LSTM', *args, **kwargs)
but to no avail.
Any help would be appreciated.

Other datasets

Has anyone been able to test this model with other datasets such as IMDB or SENLIDB? If so, could you please explain how the files need to be prepared?

Difference between cond_num_lstm and cond_num_name_enc

Could you explain the differences between cond_num_lstm and cond_num_name_enc?

Results for Logical Form Accuracy

I'm curious why the results for logical form accuracy are not included in Table 1 at https://arxiv.org/pdf/1711.04436.pdf, even though the text mentions that SQLNet outperforms Seq2SQL by 10-13 points. Can you please explain?

Could you also point me to the code used to calculate logical form accuracy?

IndexError: too many indices for array

Hi,

I got this error while training (python train.py --ca).

My guess is that at utils.py line 145,
loss.data.cpu().numpy() is an empty array.

Can you please let us know how to resolve this issue?

What does '4' mean? I'm having trouble figuring it out...

# Pad the columns to maximum (4)

https://github.com/xxj96/SQLNet/blob/master/sqlnet/model/modules/sqlnet_condition_predict.py#L191

It also appears many times elsewhere in the file.

@xxj96 Thank you!!

A typo

SQLNet/sqlnet/model/seq2sql.py

Line 179 in 8997542

cond_pred_score, cond_truth_var) / len(gt_where) )

It should be len(gt_where[b]).

Resources and runtimes

Hi,

I was wondering whether you could give some information about the resources you used and the runtimes you achieved.

How many and what kind of GPUs did you use?
What runtimes did you obtain for the normal dataset and for the toy-dataset (for debugging)?

Best regards,
Sebastian

Why do results differ between two runs?

After the first 100 epochs, the WHERE accuracy differs between two runs: 0.71547322170763572 versus 0.68685429283933019.
@xiaojunxu Thank you

Errors during training : Help needed !

Python : 3.6
OS : windows 10

Dear all,

I tried to figure out what is going wrong, but with my limited knowledge I'm still facing some issues:

1/ First: without changing anything in the code, I receive this error:

(base) C:\Users\albel\Documents\SQLNet>python train.py --ca
Loading from original dataset
Loading data from data/train_tok.jsonl
Loading data from data/train_tok.tables.jsonl
Loading data from data/dev_tok.jsonl
Loading data from data/dev_tok.tables.jsonl
Loading data from data/test_tok.jsonl
Loading data from data/test_tok.tables.jsonl
Loading word embedding from glove/glove.42B.300d.txt
Using fixed embedding
Using column attention on aggregator predicting
Using column attention on selection predicting
Using column attention on where predicting
C:\Users\albel\Documents\SQLNet\sqlnet\model\modules\aggregator_predict.py:55: UserWarning: Implicit dimension choice for softmax has been deprecated. Change 
Init dev acc_qm: 0.0
  breakdown on (agg, sel, where): [0.09250683 0.17895737 0.        ]
Epoch 1 @ 2018-08-20 14:06:54.446966
Traceback (most recent call last):
  File "train.py", line 128, in <module>
    sql_data, table_data, TRAIN_ENTRY))
  File "C:\Users\albel\Documents\SQLNet\sqlnet\utils.py", line 144, in epoch_train
    loss = model.loss(score, ans_seq, pred_entry, gt_where_seq)
  File "C:\Users\albel\Documents\SQLNet\sqlnet\model\sqlnet.py", line 152, in loss
    data = torch.from_numpy(np.array(agg_truth))
TypeError: can't convert np.ndarray of type numpy.object_. The only supported types are: double, float, float16, int64, int32, and uint8.

2/ Second: when I force "dtype = float32" (I also tried the other types), I get a different error. Whatever I do to force the type of the "data" variable, I still get errors.

(base) C:\Users\albel\Documents\SQLNet>python train.py --ca
Loading from original dataset
Loading data from data/train_tok.jsonl
Loading data from data/train_tok.tables.jsonl
Loading data from data/dev_tok.jsonl
Loading data from data/dev_tok.tables.jsonl
Loading data from data/test_tok.jsonl
Loading data from data/test_tok.tables.jsonl
Loading word embedding from glove/glove.42B.300d.txt
Using fixed embedding
Using column attention on aggregator predicting
Using column attention on selection predicting
Using column attention on where predicting

Init dev acc_qm: 0.0
  breakdown on (agg, sel, where): [0.03811899 0.14772592 0.        ]
Epoch 1 @ 2018-08-20 13:58:02.098906
Traceback (most recent call last):
  File "train.py", line 128, in <module>
    sql_data, table_data, TRAIN_ENTRY))
  File "C:\Users\albel\Documents\SQLNet\sqlnet\utils.py", line 144, in epoch_train
    loss = model.loss(score, ans_seq, pred_entry, gt_where_seq)
  File "C:\Users\albel\Documents\SQLNet\sqlnet\model\sqlnet.py", line 152, in loss
    data = torch.from_numpy(np.array(agg_truth,dtype=np.float32))
TypeError: float() argument must be a string or a number, not 'map'

Can someone guide me on how to solve this? Thanks in advance.
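Both messages are consistent with code written for Python 2 being run under Python 3, where map returns a lazy iterator instead of a list, so converting it to an array yields an object array (first error) or fails on float() (second error). Assuming agg_truth is built with map in sqlnet.py, which the "not 'map'" message suggests, wrapping the call in list() restores the Python 2 behaviour. The sample tuples below are illustrative.

```python
ans_seq = [(0, 2, 1), (3, 0, 2)]  # illustrative (agg, sel, cond_count) tuples

agg_lazy = map(lambda x: x[0], ans_seq)        # Python 3: a map object
agg_list = list(map(lambda x: x[0], ans_seq))  # a real list of ints

print(type(agg_lazy).__name__)  # map
print(agg_list)                 # [0, 3]
```

Other map calls in the same file would probably need the same treatment.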

A prediction file to get predicted query instead of agg,sel and conds

Hi,
Can anyone help me convert the predicted agg, sel and conds into a SQL query?

Also, is there a predict.py that can predict a SQL query from a custom question and table details, instead of only running on the fixed test.py dataset?
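A minimal sketch of the conversion, assuming the WikiSQL-style encoding in which agg and sel are integer indices and each condition is (column index, operator index, value). The operator lists follow the ordering used by the WikiSQL dataset as far as I know, so check them against your copy. The function to_sql, the table name and the column names are hypothetical, and the string formatting does no escaping, so it is for inspection only.

```python
AGG_OPS = ["", "MAX", "MIN", "COUNT", "SUM", "AVG"]
COND_OPS = ["=", ">", "<"]


def to_sql(table, columns, agg, sel, conds):
    """Render a WikiSQL-style (agg, sel, conds) triple as a SQL string."""
    col = '"{}"'.format(columns[sel])
    select = "{}({})".format(AGG_OPS[agg], col) if AGG_OPS[agg] else col
    sql = "SELECT {} FROM {}".format(select, table)
    if conds:
        parts = ['"{}" {} \'{}\''.format(columns[c], COND_OPS[op], val)
                 for c, op, val in conds]
        sql += " WHERE " + " AND ".join(parts)
    return sql


columns = ["district", "total trees", "old trees"]
print(to_sql("trees", columns, agg=0, sel=0, conds=[(1, 2, "150000")]))
# SELECT "district" FROM trees WHERE "total trees" < '150000'
```

A predict script would tokenize the question, run the model on it together with the table header, and pass the predicted indices to a function like this.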

Why the variable 'col_inp_var' is not used?

SQLNet/sqlnet/model/modules/seq2sql_condition_predict.py

Line 52 in 8997542

def forward(self, x_emb_var, x_len, col_inp_var, col_name_len, col_len,

Sorry to trouble you again, but I think the embedding of columns should be used to encode the hidden state.

Thank you @xxj96

Help needed

Is there anyone who can share working code, or at least a trained model? I went through the whole installation and keep ending up with errors.
Thanks

Addition of tensors of different size

I am having trouble adding two tensors of different sizes. What could be a possible solution?

In seq2sql_condition_predict_rl.py (4D tensor addition)
cond_score = self.cond_out( self.cond_out_h(h_enc_expand) +self.cond_out_g(g_s_expand) ).squeeze()

In selection_predict_rl.py (3D tensor addition)
sel_score = self.sel_out( self.sel_out_K(K_sel_expand) + self.sel_out_col(e_col) ).squeeze()

The error looks like: "The size of tensor a (26) must match the size of tensor b (15) at non-singleton dimension 0"

Test time still using ground truth sql?

Hi, thanks for providing the source code of your model. There is one thing I am not quite sure about: is the model using part of the ground-truth SQL query as input to the aggregator predictor at dev/test time?

The test script calls the epoch_acc function in sqlnet/utils.py, where the code still feeds gt_sel_seq to the model. I think that at dev/test time the model should first generate the column in the SELECT clause and then feed that result to the aggregator predictor.

    q_seq, col_seq, col_num, ans_seq, query_seq, gt_cond_seq, raw_data = to_batch_seq(sql_data, table_data, perm, st, ed, ret_vis_data=True)
    raw_q_seq = [x[0] for x in raw_data]
    raw_col_seq = [x[1] for x in raw_data]
    query_gt, table_ids = to_batch_query(sql_data, perm, st, ed)
    gt_sel_seq = [x[1] for x in ans_seq]
    score = model.forward(q_seq, col_seq, col_num,
            pred_entry, gt_sel = gt_sel_seq)

Results not as good as in paper

Hi Xiaojun,

I trained the model without changing any hyperparameter's value. (python train.py --ca)

When executing the test.py, I obtain the following accuracy scores:

Dev acc_qm: 0.584253651585;
  breakdown on (agg, sel, where): [0.90048688 0.91307446 0.68459803]
Dev execution acc: 0.654435340221
Test acc_qm: 0.571671495151;
  breakdown on (agg, sel, where): [0.90212873 0.90370324 0.67092833]
Test execution acc: 0.641768484696

These results are several points below the ones reported in your paper.

Although you do not report Acc_qm and Acc_ex for your model when the word embedding is not trained, you mention in Section 4.3 that training the word embedding gives an improvement of about 2 points.
After subtracting these 2 points from the results reported in Table 1, there is still a difference of 2-3 points between my results and yours.

My question is:
Are the results reported in the paper the best ones you obtained after running the whole training procedure multiple times? If so, were the average results closer to mine or to yours? How many times did you run the training procedure to obtain those results?

Thanks,
Thomas

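If I understand correctly...

As in the figure, once these probabilities are obtained, sequence-to-set takes the columns with the top-K probabilities as the set,
rather than only the top 1?
@xiaojunxu Thanks a lot!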

Trained model

Hello! I am studying SQLNet, and first I would like to congratulate you on the great work in this paper. I got your code from https://github.com/xiaojunxu/SQLNet but could not run the tests with your trained model because the "saved_model" folder is empty. Could you please share the trained model? Thank you!!

Tokenization script

Hi @xiaojunxu

Could you upload your tokenization script?
The reason is that I found "question" and "query_tok" sometimes differ.
For example, in the 25th entry of dev_tok.jsonl,

"question": "What is the district when the total amount of trees is smaller than 150817.6878461314 and amount of old trees is 1,928 (1.89%)?",
However, in "query_tok": ["SELECT", "district", "WHERE", "total", "amount", "of", "trees", "LT", "150817.687846", "AND", "amount", "of", "old", "trees", "EQL", "1,928", "(", "1.89", "%", ")"],

You can see that the floating-point number differs between the two.
So, if possible, I would like to modify the tokenization script.
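The shortened number matches what Python 2's str() produces for a float, since it keeps 12 significant digits. That would explain the mismatch if the tokenized query was built by converting the numeric condition value to a string under Python 2, which is an assumption because the tokenization script is not shown. The effect can be reproduced in Python 3 with an explicit format:

```python
value = 150817.6878461314

py2_like = format(value, ".12g")  # 12 significant digits, like Python 2 str()
full = repr(value)                # round-trips back to the same float

print(py2_like)  # 150817.687846
print(full)
```

If this is the cause, using repr() (or running the script under Python 3) when building query_tok would keep the full value.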

Thanks!

Prediction model

Do you think it would be possible to write a predict.py file for live testing of results with the trained model?

Issue in running python extract_vocab.py

Error while loading word embedding glove

Logs:
Loading from original dataset
Loading data from data/train_tok.jsonl
Loading data from data/train_tok.tables.jsonl
Loading data from data/dev_tok.jsonl
Loading data from data/dev_tok.tables.jsonl
Loading data from data/test_tok.jsonl
Loading data from data/test_tok.tables.jsonl
Loading word embedding from glove/glove.42B.300d.txt
Traceback (most recent call last):
File "extract_vocab.py", line 23, in <module>
use_small=USE_SMALL)
File "C:\Users\SQLNet\sqlnet\utils.py", line 274, in load_word_emb
for idx, line in enumerate(inf):
File "C:\Users\miniconda3\lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 2438: character maps to <undefined>
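On Windows, open() without an encoding argument falls back to the locale's code page (cp1252 in this traceback), while the GloVe text file is, as far as I know, UTF-8. Passing the encoding explicitly where load_word_emb opens the file should avoid the error. The sketch below demonstrates the idea on a temporary file; the file name and contents are made up.

```python
import os
import tempfile

# Write a small UTF-8 file containing non-ASCII tokens.
path = os.path.join(tempfile.mkdtemp(), "mini_glove.txt")
with open(path, "w", encoding="utf-8") as out:
    out.write("caf\u00e9 0.1 0.2\n\u201d 0.3 0.4\n")

# Read it back with an explicit encoding instead of the platform default.
with open(path, encoding="utf-8") as inf:
    words = [line.split(" ")[0] for line in inf]

print(len(words))  # 2
```

The second token is the right double quotation mark, whose UTF-8 encoding ends in byte 0x9d, the byte named in the error message.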

What does max_tok_num mean?

https://github.com/xxj96/SQLNet/blob/master/sqlnet/model/sqlnet.py#L25

To the best of my knowledge, I guess self.max_tok_num is more like a hidden dim and is not the max token number.

Thank you @xxj96

ImportError: No module named records

Getting this error when trying to run extract_vocab.py.
Pls. view the screenshot below for error details
https://drive.google.com/open?id=1Fb2f0wacXwwhd1Mzz6c3RfHbQATQnR84

permission denied 'word2idx.json' ?

when using python extract_vocab.py, there is an error,
error 13: permission denied 'word2idx.json'

when I unzip the glove.XX.zip, there is no file named word2idx.json, would you please tell me how to deal with such an error ?

How can I use SQLNet for a new database?

Very impressive work! I am curious whether SQLNet can be used on a new database and, if so, how. Could you give some hints? Thanks!

What does this line mean? Why None?

torch.autograd.backward(cond_score[1], [None for _ in cond_score[1]])

https://github.com/xiaojunxu/SQLNet/blob/master/sqlnet/model/seq2sql.py#L197

@xiaojunxu Thank you!

condition accuracy

Could anyone please assist? I ran the code, but the condition accuracy does not seem to be computed. I get the results below for both SQLNet and Seq2SQL.
best_cond_acc = init_acc[1][2] gives 0.0. I am trying to debug it but cannot find where the error is.
Init dev acc_qm: 0.0
breakdown on (agg, sel, where): [0.046875 0.125 0. ]
Thank you

Predict the number of conditions

Hi xiaojunxu, I can't find where your paper explains how the number of conditions is predicted.

I really don't understand why these three unsqueeze operations followed by the MLP are done this way...

https://github.com/xiaojunxu/SQLNet/blob/master/sqlnet/model/modules/sqlnet_condition_predict.py#L233-L235

h_ext = h_str_enc.unsqueeze(1).unsqueeze(1)
g_ext = g_str_s.unsqueeze(3)
col_ext = col_emb.unsqueeze(2).unsqueeze(2)

@xiaojunxu Thanks a lot!
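One reading of the three unsqueeze calls is that they insert size-1 axes so that the three tensors broadcast against each other when summed, producing one score vector for every (condition, decoder step, question token) combination. The sketch below only computes shapes. The input shapes are assumptions (h_str_enc as [B, L, H], g_str_s as [B, 4, T, H], col_emb as [B, 4, H]), and broadcast_shape is a hypothetical helper that mimics the broadcasting rule for shapes of equal rank.

```python
def broadcast_shape(*shapes):
    """Compute the broadcast result shape for same-rank shapes."""
    result = []
    for dims in zip(*shapes):
        sizes = {d for d in dims if d != 1}
        if len(sizes) > 1:
            raise ValueError("incompatible dimensions: {}".format(dims))
        result.append(sizes.pop() if sizes else 1)
    return tuple(result)


B, T, L, H = 2, 7, 11, 100  # batch, decoder steps, question length, hidden size

h_ext = (B, 1, 1, L, H)    # h_str_enc.unsqueeze(1).unsqueeze(1)
g_ext = (B, 4, T, 1, H)    # g_str_s.unsqueeze(3)
col_ext = (B, 4, 1, 1, H)  # col_emb.unsqueeze(2).unsqueeze(2)

print(broadcast_shape(h_ext, g_ext, col_ext))  # (2, 4, 7, 11, 100)
```

Under these assumptions the sum has shape [B, 4, T, L, H], and the MLP that follows would reduce the last axis to a single score per question token.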
