Comments (3)
Hello @rsancheztaksoiai
Do you mean the example that uses the Python package?
pip install schema-matching
from schema_matching import schema_matching
df_pred,df_pred_labels,predicted_pairs = schema_matching("Test Data/QA/Table1.json","Test Data/QA/Table2.json")
print(df_pred)
print(df_pred_labels)
for pair_tuple in predicted_pairs:
    print(pair_tuple)
I just tried that and it works well:
schema_matching|Loading sentence transformer, this will take a while...
schema_matching|Done loading sentence transformer
data.title ... paragraphs.context
questions.body 0.002472 ... 0.001018
questions.documents 0.000888 ... 0.000574
questions.ideal_answer 0.000896 ... 0.011124
questions.concepts 0.000594 ... 0.003764
questions.type 0.004110 ... 0.000112
questions.id 0.000075 ... 0.000093
snippets.offsetInBeginSection 0.000063 ... 0.000174
snippets.offsetInEndSection 0.000066 ... 0.000165
snippets.text 0.000282 ... 0.016571
snippets.beginSection 0.001831 ... 0.000513
snippets.document 0.000643 ... 0.000653
snippets.endSection 0.004702 ... 0.000530
triples.p 0.000357 ... 0.000383
triples.s 0.000438 ... 0.000388
triples.o 0.000965 ... 0.002000
questions.exact_answer 0.000799 ... 0.000229
[16 rows x 9 columns]
data.title ... paragraphs.context
questions.body 0 ... 0
questions.documents 0 ... 0
questions.ideal_answer 0 ... 0
questions.concepts 0 ... 0
questions.type 0 ... 0
questions.id 0 ... 0
snippets.offsetInBeginSection 0 ... 0
snippets.offsetInEndSection 0 ... 0
snippets.text 0 ... 0
snippets.beginSection 0 ... 0
snippets.document 0 ... 0
snippets.endSection 0 ... 0
triples.p 0 ... 0
triples.s 0 ... 0
triples.o 0 ... 0
questions.exact_answer 0 ... 0
[16 rows x 9 columns]
('questions.body', 'qas.question', 0.86622685)
('questions.concepts', 'qas.question', 0.17055672)
('questions.id', 'qas.id', 0.5095535)
('snippets.offsetInBeginSection', 'answers.answer_start', 0.9288852)
('snippets.offsetInEndSection', 'answers.answer_start', 0.86390895)
('questions.exact_answer', 'answers.text', 0.5319033)
('questions.exact_answer', 'plausible_answers.text', 0.56676453)
from python-schema-matching.
Hi @fireindark707 , thanks for your quick response!
Yes, I meant that example. These are the steps I'm performing:
$ mkdir python_matching
$ cd python_matching/
$ python3 -m venv env
$ source env/bin/activate
$ pip install schema-matching
$ touch matching_test.py  # then copy the example code into it
$ cp -r 'Test Data'/ ~/python_matching/TestData  # copy the Test Data directory from the repository
$ python3 matching_test.py
schema_matching|Loading sentence transformer, this will take a while...
schema_matching|Done loading sentence transformer
Traceback (most recent call last):
  File "/python_matching/matching_test.py", line 3, in <module>
    df_pred,df_pred_labels,predicted_pairs = schema_matching("TestData/QA/Table1.json","TestData/QA/Table2.json")
  File "/python_matching/env/lib/python3.10/site-packages/schema_matching/cal_column_similarity.py", line 83, in schema_matching
    features,_ = make_data_from(table1_df, table2_df, type="test")
  File "/python_matching/env/lib/python3.10/site-packages/schema_matching/relation_features.py", line 122, in make_data_from
    colnames_features = get_colnames_features(c1_name, c2_name,column_name_embeddings)
  File "/python_matching/env/lib/python3.10/site-packages/schema_matching/relation_features.py", line 88, in get_colnames_features
    colnames_features = np.array([bleu_score, edit_distance, lcs,transformer_score, one_in_one])
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (5,) + inhomogeneous part.
I was getting the same error and was able to fix it by adding the line below:
transformer_score = transformer_score.item()
to the function:
def get_colnames_features(text1,text2,column_name_embeddings):
    """
    Use BLEU, edit distance and word2vec to calculate features.
    """
    bleu_score = bleu([text1], text2, smoothing_function=smoothie)
    print(type(bleu_score))
    edit_distance = damerau.distance(text1, text2)
    lcs = metriclcs.distance(text1, text2)
    transformer_score = util.cos_sim(column_name_embeddings[text1], column_name_embeddings[text2])
    transformer_score = transformer_score.item()
    one_in_one = text1 in text2 or text2 in text1
    colnames_features = np.array([bleu_score, edit_distance, lcs,transformer_score, one_in_one])
    return colnames_features
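For anyone wondering why the extra line helps: util.cos_sim returns a tensor (a 1 x 1 matrix for a single pair of names) rather than a plain number, and it seems that newer NumPy releases refuse to build an array from a list that mixes scalars with such a nested value. Below is a minimal sketch of the same idea using NumPy only. The helper name to_scalar and the sample feature values are made up for illustration and are not part of schema_matching; a (1, 1) NumPy array stands in for the tensor.

```python
import numpy as np


def to_scalar(value):
    """Return a plain Python float for a scalar or a one-element array-like.

    Hypothetical helper for illustration; it plays the role that
    transformer_score.item() plays in the fix above.
    """
    arr = np.asarray(value, dtype=float)
    if arr.size != 1:
        raise ValueError(f"expected exactly one element, got shape {arr.shape}")
    return float(arr.reshape(-1)[0])


# Stand-in for the 1 x 1 similarity matrix computed for one pair of column names.
similarity = np.array([[0.87]])

# Made-up values for the other four features (BLEU, edit distance, LCS, substring flag).
features = np.array([0.12, 3.0, 0.4, to_scalar(similarity), float(True)])

print(features.shape, features.dtype)
```

With every element reduced to a plain number, the feature vector is a flat one-dimensional float array, which is what the rest of the pipeline appears to expect.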