Hi again, Is it possible to extract a beam with N queries, instead o

Extract beam instead of best query about smbop HOT 8 CLOSED

PedroEstevesPT commented on September 27, 2024

Extract beam instead of best query

from smbop.

Comments (8)

OhadRubin commented on September 27, 2024

Yes, you are correct. #13 will fix that.
The beam is now sorted so that the top element is beam[-1].
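With that ordering, the N best candidates are simply the last N items of the list. A minimal sketch, assuming the beam is a plain Python list of SQL strings sorted from worst to best; the `top_n` helper and the sample data are hypothetical and not part of smbop:

```python
def top_n(beam, n):
    """Return the n best items, best first, from a beam sorted worst-to-best."""
    if n <= 0:
        # beam[-0:] would return the whole list, so handle this case explicitly.
        return []
    # Take the last n items (the best ones) and reverse so the best comes first.
    return beam[-n:][::-1]


# Hypothetical beam, sorted so that the top element is beam[-1].
beam = ["singer", "COUNT( * )", "SELECT COUNT( * ) FROM singer"]
best_two = top_n(beam, 2)
print(best_two)
```

Asking for more items than the beam holds returns the whole beam in best-first order.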

OhadRubin commented on September 27, 2024

Yeah, I have code that does this in a notebook somewhere; I'll add it soon.

OhadRubin commented on September 27, 2024

@Muradean Hey, I added beam.py. You can use it like eval.py to write all the beam items to a file.

PedroEstevesPT commented on September 27, 2024

Thanks a lot, I have been trying it out but it seems like the queries in the beam are not sorted by score:

sql: singer
sql: COUNT( * )
sql: SELECT COUNT( * ) FROM singer
sql: SELECT COUNT( * ) FROM singer
sql: SELECT COUNT( * ) FROM singer
sql: (SELECT COUNT( * ) FROM singer)
sql: SELECT COUNT( * ) FROM singer
sql: SELECT COUNT( * ) FROM singer
sql: SELECT COUNT( * ) FROM singer
sql: SELECT COUNT( * ) FROM singer
sql: (SELECT COUNT( * ) FROM singer)
sql: SELECT COUNT( * ) FROM singer
sql: singer
sql: COUNT( * )

As can be seen, both the first and the last query in this example are wrong, yet some of the queries in the middle of the beam match the reference. Oddly enough, this problem does not occur when predicting only the best query.

On closer inspection of the output of model.forward_on_instances, I see that beam_scores is None.

Maybe if the information in beam_scores was available, it would be possible to sort the generated queries?
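To make the idea concrete, here is a rough sketch of sorting a beam by score. It assumes the scores are available as a list of floats parallel to the decoded queries, which is not currently the case since beam_scores comes back as None. The `sort_beam` helper and the score values are made up for illustration:

```python
def sort_beam(queries, scores):
    """Sort queries ascending by score so the best candidate ends up last."""
    if len(queries) != len(scores):
        raise ValueError("queries and scores must have the same length")
    # Indices ordered from lowest to highest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return [queries[i] for i in order], [scores[i] for i in order]


queries = ["singer", "SELECT COUNT( * ) FROM singer", "COUNT( * )"]
scores = [-7.5, -0.2, -3.1]  # made-up values, not real model output
sorted_queries, sorted_scores = sort_beam(queries, scores)
print(sorted_queries[-1])
```

The ascending order matches the convention mentioned earlier in the thread, where the top element is beam[-1].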

Thanks a lot for resolving the issues so quickly.

PedroEstevesPT commented on September 27, 2024

Hi again,

I just wanted to point out another thing I found. Using the pre-trained model, I noticed that 14 queries differ between the results of eval.py and beam.py (taking the top query from the beam). In these cases the top-1 query from the beam is wrong, which costs about 1.5% in execution accuracy. Shouldn't the queries from eval.py match the last query of each beam?

Thanks a lot

Here are the instances where the queries differ (the query from eval.py versus beam[-1] from beam.py):

84 eval.py: has_pet , pets
84 beam.py: student

104 eval.py: SELECT DISTINCT car_names.model FROM cars_data JOIN car_names ON cars_data.id = car_names.makeid WHERE cars_data.year > 1980
104 beam.py: SELECT DISTINCT model_list.model FROM cars_data JOIN car_names ON cars_data.id = car_names.makeid WHERE cars_data.year > 1980

116 eval.py: car_makers , model_list
116 beam.py: model_list , countries , countries

172 eval.py: car_makers , car_names
172 beam.py: DISTINCT model_list.model

177 eval.py: car_makers.id = car_makers.country
177 beam.py: countries.countryid = car_makers.country AND model_list.maker = car_makers.id

178 eval.py: car_makers.id = car_makers.country
178 beam.py: countries.countryid , countries.countryname

240 eval.py: airlines , airports
240 beam.py: flights.destairport = airports.airportcode

552 eval.py: semesters , courses
552 beam.py: semesters.semester_name , semesters.semester_id

563 eval.py: SELECT AVG( transcripts.other_details ) FROM transcripts WHERE transcripts.other_details > (SELECT AVG( transcripts.other_details ) FROM transcripts)
563 beam.py: SELECT AVG( transcripts.other_details ) FROM transcripts WHERE transcripts.transcript_date > (SELECT AVG( transcripts.transcript_date ) FROM transcripts)

705 eval.py: SELECT COUNT( * ) FROM country WHERE country.governmentform = 'republic'
705 beam.py: SELECT COUNT( * ) FROM country WHERE country.governmentform = 'republics'

776 eval.py: city , country
776 beam.py: country.continent = 'Africa' AND country.population > (SELECT MAX( country.population ) FROM country WHERE country.continent = 'Africa') AND country.continent = 'Africa' AND country.population > (SELECT MAX( country.population ) FROM country WHERE country.region = 'Africa')

777 eval.py: city , country
777 beam.py: country.continent = 'Africa' AND country.population > (SELECT MAX( country.population ) FROM country WHERE country.continent = 'Africa') AND country.continent = 'Africa' AND country.population > (SELECT MAX( country.population ) FROM country WHERE country.region = 'Africa')

924 eval.py: charges , dogs
924 beam.py: dogs.name

984 eval.py: SELECT dogs.name , dogs.age , dogs.weight FROM dogs WHERE dogs.abandoned_yn = '1'
984 beam.py: SELECT dogs.name , dogs.age , dogs.weight FROM dogs WHERE dogs.abandoned_yn = 'no'
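A list like the one above can be produced by comparing the two sets of predictions index by index. This is only a sketch: it assumes both scripts' predictions have already been loaded into Python lists in the same example order, and the `differing_predictions` helper is hypothetical rather than part of smbop:

```python
def differing_predictions(eval_preds, beam_top_preds):
    """Return (index, eval query, beam query) for every mismatching example."""
    if len(eval_preds) != len(beam_top_preds):
        raise ValueError("both prediction lists must cover the same examples")
    return [
        (i, e, b)
        for i, (e, b) in enumerate(zip(eval_preds, beam_top_preds))
        if e != b
    ]


# Tiny illustrative input; the second pair is example 705 from the list above.
eval_preds = [
    "SELECT COUNT( * ) FROM singer",
    "SELECT COUNT( * ) FROM country WHERE country.governmentform = 'republic'",
]
beam_top_preds = [
    "SELECT COUNT( * ) FROM singer",
    "SELECT COUNT( * ) FROM country WHERE country.governmentform = 'republics'",
]
diffs = differing_predictions(eval_preds, beam_top_preds)
for index, eval_query, beam_query in diffs:
    print(f"{index} eval.py: {eval_query}")
    print(f"{index} beam.py: {beam_query}")
```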

aarsri commented on September 27, 2024

I think this may occur because the top beam scores are equal for some of the examples posted above. In beam.py, taking the -1 element from argsort() returns something different than the argmax() used in eval.py, but in reality the top results all have the same beam score (i.e., they are all tied). For example, example 84 has top 3 scores [-3.4028e+38, -3.4028e+38, -3.4028e+38]. I am wondering, is it expected behavior for the top 3 results to have equal scores in these cases?
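The effect of ties can be illustrated in plain Python, used here as a stand-in for the tensor operations. An argmax-style pick returns the first index holding the maximum, while taking the last element of a stable ascending sort returns the last tied index. Tensor libraries do not necessarily guarantee the same tie order, so treat this as an analogy rather than a reproduction of what eval.py and beam.py do:

```python
# The three tied top scores quoted for example 84.
scores = [-3.4028e38, -3.4028e38, -3.4028e38]

# argmax-style selection: max() returns the first index holding the maximum.
argmax_index = max(range(len(scores)), key=lambda i: scores[i])

# argsort-style selection: sort indices ascending and take the last one.
# Python's sort is stable, so among ties the highest index ends up last.
argsort_index = sorted(range(len(scores)), key=lambda i: scores[i])[-1]

print(argmax_index, argsort_index)
print(scores[argmax_index] == scores[argsort_index])
```

Both picks have the same score, yet they point at different beam items, which would explain two scripts decoding different queries.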

OhadRubin commented on September 27, 2024

That is really weird... They should be the same. Maybe the masking is incorrect?

OhadRubin commented on September 27, 2024

OK, I see. In the case of example 84, none of the items in the final beam are valid, so they are all masked, and it makes sense that any of them could be returned.
I am more interested in cases like 984, where beam.py returned a valid item and eval.py didn't,
and 705, where both are valid.
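One way to separate the fully masked case from the interesting ones is to check whether every score in the final beam sits at the mask value. The sketch below assumes that masked items are filled with the most negative float32 value, which is consistent with the -3.4028e+38 scores quoted for example 84 but is not confirmed here. The `all_masked` helper and its threshold are hypothetical:

```python
# Assumed mask threshold: scores at or below this are treated as masked.
# The most negative float32 value is about -3.4028235e38.
MASK_THRESHOLD = -3.4028e38


def all_masked(scores, threshold=MASK_THRESHOLD):
    """Return True if the beam is non-empty and every score is masked."""
    return bool(scores) and all(s <= threshold for s in scores)


masked_beam = [-3.4028235e38, -3.4028235e38, -3.4028235e38]
mixed_beam = [-3.4028235e38, -1.2, -0.4]  # made-up values for illustration

print(all_masked(masked_beam))
print(all_masked(mixed_beam))
```

Examples for which this returns True can be set aside, leaving mismatches such as 984 and 705 to investigate.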
