Comments (8)
@Downchuck We're already bundling params per type of command sent to execute in addN. What is your proposed change?
from rdflib-sqlalchemy.
We may have been using an old version; I'll circle back after debugging.
With echo logging turned on, I saw separate INSERT statements for my `addN` / `sql_graph += memory_graph` statement.
Something like this would be significantly faster, because it uses placeholders and executemany: placeholders are about 3x faster than a dictionary in SQLAlchemy, and executemany is significantly faster for inserts on every SQL engine that supports it.
statement = self._add_ignore_on_conflict(command['statement'], with_placeholders)
connection.execute(statement, command['rows'])
Instead of
for command in commands_dict.values():
    statement = self._add_ignore_on_conflict(command['statement'])
    connection.execute(statement, command["params"])
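To make the difference between the two call patterns concrete, here is a minimal sketch using only the standard-library `sqlite3` module rather than SQLAlchemy or rdflib-sqlalchemy. The table name `statements` and its columns are made up for illustration, and `INSERT OR IGNORE` only stands in for whatever `_add_ignore_on_conflict` produces. It shows one `execute` per row next to a single `executemany` with positional placeholders; both should end with the same rows, and any speed difference would need to be measured on your own backend.

```python
import sqlite3

# Hypothetical data and schema, for illustration only.
rows = [(f"s{i}", "p", f"o{i}") for i in range(1000)]
SQL = "INSERT OR IGNORE INTO statements (s, p, o) VALUES (?, ?, ?)"


def make_db():
    """Create an in-memory database with a small, made-up triple table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE statements (s TEXT, p TEXT, o TEXT, UNIQUE (s, p, o))"
    )
    return conn


# Pattern 1: one execute() call per row.
one_by_one = make_db()
for row in rows:
    one_by_one.execute(SQL, row)

# Pattern 2: a single executemany() call with all parameter tuples.
bulk = make_db()
bulk.executemany(SQL, rows)

count_one = one_by_one.execute("SELECT COUNT(*) FROM statements").fetchone()[0]
count_bulk = bulk.execute("SELECT COUNT(*) FROM statements").fetchone()[0]
print(count_one, count_bulk)
```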
The data was loaded through the bulk-loading method, addN, because a direct parse calls add, which inserts one triple at a time:
test_graph = ConjunctiveGraph()
test_graph.parse(data = test_data, format='nt')
# bulk insert seems ok here:
%time sql_graph = sql_graph + graph
At version 0.4.0, currently on master, in

for command in commands_dict.values():
    statement = self._add_ignore_on_conflict(command['statement'])
    connection.execute(statement, command["params"])

`command["params"]` is a list of `dict`s with parameters for `statement`, a `sqlalchemy.sql.dml.Insert` like `INSERT INTO kb_bec6803d52_type_statements (id, member, klass, context, termcomb) VALUES (:id, :member, :klass, :context, :termComb)`. This should already call `executemany`. One way you can confirm this is by calling `addN` on an rdflib-sqlalchemy store backed by sqlite3 with cProfile enabled. You should see `do_executemany` called in sqlalchemy and `executemany` called on the sqlite3 cursor.
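The cProfile check described above can be sketched as follows. This is a simplified stand-in, not the rdflib-sqlalchemy code path: it profiles a plain `sqlite3` cursor, and `bulk_insert` and the `statements` table are hypothetical names. With a real store you would profile the `addN` call instead and look for `do_executemany` as well.

```python
import cProfile
import io
import pstats
import sqlite3


def bulk_insert(conn, rows):
    # Hypothetical stand-in for the store's addN path:
    # one statement, many parameter tuples.
    cursor = conn.cursor()
    cursor.executemany("INSERT INTO statements (s, p, o) VALUES (?, ?, ?)", rows)


def profile_report(func, *args):
    """Run func under cProfile and return the printed stats as text."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    buffer = io.StringIO()
    pstats.Stats(profiler, stream=buffer).print_stats()
    return buffer.getvalue()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE statements (s TEXT, p TEXT, o TEXT)")
rows = [(f"s{i}", "p", "o") for i in range(1000)]

report = profile_report(bulk_insert, conn, rows)
# The cursor's executemany appears in the profile if the bulk path was taken.
print("executemany" in report)
```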
Ah, I either misread my debug output -- "echo" works just fine for SQLite3 too -- or I did a bad job of calling addN. Closing issue.
I did notice benchmarks that show placeholder styles could be a bit faster for large data, but that's not the purpose of this report. Closing.
While this is open: I ran a view/materialization query. That one, using literal, did use placeholders, but it also wound up inserting one record at a time.
Is there something just off with this count query?
"""PREFIX tg: <http://www.context.com>
INSERT {
GRAPH <tg:po-cnt> {?p ?o ?ct}
}
WHERE {
SELECT ?p ?o (count(*) as ?ct)
{?s ?p ?o}
GROUP BY ?p ?o
}""")
For that, it would depend on the combinations of predicates and objects in your source graph -- INSERT is evaluated for every unique ?p and ?o and that happens to translate into one call to addN for each evaluation.
Please check out RDFLib support resources for further questions.
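The effect described above can be sketched in plain Python. This is not RDFLib's update code: `CountingStore` is a hypothetical stand-in that only records how often `addN` is called, and the `(?p, ?o, ?ct)` solutions are made up. It contrasts one `addN` call per solution with a single call carrying all quads.

```python
class CountingStore:
    """Hypothetical stand-in for a store; it only records addN calls."""

    def __init__(self):
        self.addn_calls = 0
        self.quads = []

    def addN(self, quads):
        self.addn_calls += 1
        self.quads.extend(quads)


# Made-up (?p, ?o, ?ct) solutions, as a grouped SELECT might return them.
solutions = [("p1", "o1", 3), ("p1", "o2", 1), ("p2", "o1", 5)]
graph_name = "tg:po-cnt"

# One addN call per solution: each call carries a single quad.
per_solution = CountingStore()
for p, o, ct in solutions:
    per_solution.addN([(p, o, ct, graph_name)])

# One addN call in total: all quads are handed over together.
batched = CountingStore()
batched.addN((p, o, ct, graph_name) for p, o, ct in solutions)

print(per_solution.addn_calls, batched.addn_calls)
```

Both stores end up holding the same quads; only the number of `addN` calls, and therefore the number of round trips a SQL-backed store would make, differs.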
Thanks @mwatts15 -- that's the issue I was wondering about. It's unfortunate that each call results in one addN, instead of a bulk insertion, given a trivial query.
That said, this does seem like a potential issue upstream -- so I will work further on understanding those abstractions to see if these trivial aggregates can be improved.