mgaare / cumulusrdf
Automatically exported from code.google.com/p/cumulusrdf
What steps will reproduce the problem?
1. The HttpRepository sends a clear request with a null context.
2. The RepositoryConnection obtained from the repository in the servletContext
executes conn.clear(null).
3. The call fails with the message "not supported: contexts == null ||
contexts.length == 0".
What is the expected output? What do you see instead?
According to the Sesame API, a null context means that the whole repository
is cleared, so this operation should be supported.
Original issue reported on code.google.com by [email protected]
on 8 Apr 2014 at 12:51
SPARQLServlet does not send error properly. See attached stacktrace.
Original issue reported on code.google.com by andreas.josef.wagner
on 24 Jan 2014 at 5:04
Simple keyword search: just a conjunction of terms tokenised from literals.
* Could be done using CQL collections: http://www.datastax.com/documentation/cql/3.0/webhelp/index.html#cql/cql_using/use_collections_c.html#useCollections
* Lucene/Solr integration
* Stargate: http://tuplejump.github.io/stargate/index.html //looks cool
* Lucandra/Solandra: https://github.com/tjake/Solandra //not maintained
* Datastax Enterprise search(DSE) //not open-source
Original issue reported on code.google.com by andreas.josef.wagner
on 12 Feb 2014 at 4:18
What steps will reproduce the problem?
1. Loader creates 4 indexes but only CSPO would be needed for proxy mode
Original issue reported on code.google.com by [email protected]
on 23 May 2012 at 10:12
Documentation is unclear.
The webapp can be configured using either a config file in /etc or WEB-INF
properties.
The client does not read the config file.
Possible solutions:
* improve documentation to make current setup clearer
* get rid of client and do loading also via webapp HTTP interface (so only
webapp needs to be configured) - should be possible with current setup as
thread buffers input and thus can iterate over the in-memory buffer for
multiple index construction
* generate *.deb which installs webapp and config file (and log files with
logrotate) in the right directories and with cassandra dependencies
* ?
Original issue reported on code.google.com by [email protected]
on 2 May 2012 at 8:56
What steps will reproduce the problem?
1. Start Cassandra
2. Start Tomcat
What is the expected output? What do you see instead?
Cumulus webapp should connect to Cassandra, but Cassandra is still booting up.
Increase timeout (or do retries) for connecting.
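A minimal sketch of the retry idea, assuming the connection attempt can be wrapped in a Supplier that throws a RuntimeException while Cassandra is still booting. The class and method names (ConnectRetry, withRetries) are hypothetical and not part of CumulusRDF; the try count and delay are placeholders.

```java
import java.util.function.Supplier;

/** Hypothetical helper: retries a connection attempt with a fixed delay. */
final class ConnectRetry {

    private ConnectRetry() {
    }

    /**
     * Calls {@code attempt} until it succeeds or {@code maxTries} is used up.
     * The last failure is rethrown so the caller still sees the root cause.
     */
    static <T> T withRetries(Supplier<T> attempt, int maxTries, long delayMillis) {
        if (maxTries < 1) {
            throw new IllegalArgumentException("maxTries must be >= 1");
        }
        RuntimeException last = null;
        for (int i = 1; i <= maxTries; i++) {
            try {
                return attempt.get();
            } catch (RuntimeException e) {
                last = e;
                if (i < maxTries) {
                    sleep(delayMillis);
                }
            }
        }
        throw last;
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            // Preserve the interrupt flag and stop retrying.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting to retry", e);
        }
    }
}
```

The webapp start-up code could wrap its existing connect call in withRetries; a growing delay between tries would be a straightforward extension.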
Original issue reported on code.google.com by [email protected]
on 25 Jan 2013 at 2:21
Currently, we use the standard nested-loop join (with index support) from
Sesame.
However, storing SPO-style indexes in sorted order is fairly easy in
Cassandra (and already implemented to some extent). Thus, a sort-merge join
could be implemented without much work. See, e.g., [1].
- Andreas
[1] http://www.informatik.uni-freiburg.de/~mschmidt/docs/sp2b_exp.pdf
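For illustration, a sort-merge join over two inputs that are already sorted by join key could look roughly like this. It is a sketch over in-memory lists, not the Sesame evaluation API: each row is a hypothetical {key, value} string pair, and the class name SortMergeJoin is made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: joins two lists of {key, value} rows sorted by key. */
final class SortMergeJoin {

    private SortMergeJoin() {
    }

    /** Returns {key, leftValue, rightValue} for every pair of rows with equal keys. */
    static List<String[]> join(List<String[]> left, List<String[]> right) {
        List<String[]> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < left.size() && j < right.size()) {
            int cmp = left.get(i)[0].compareTo(right.get(j)[0]);
            if (cmp < 0) {
                i++; // left key is smaller: advance left
            } else if (cmp > 0) {
                j++; // right key is smaller: advance right
            } else {
                String key = left.get(i)[0];
                // Find the end of the run of equal keys on the right side.
                int jEnd = j;
                while (jEnd < right.size() && right.get(jEnd)[0].equals(key)) {
                    jEnd++;
                }
                // Pair every left row of this key with every right row of this key.
                while (i < left.size() && left.get(i)[0].equals(key)) {
                    for (int k = j; k < jEnd; k++) {
                        out.add(new String[] {key, left.get(i)[1], right.get(k)[1]});
                    }
                    i++;
                }
                j = jEnd;
            }
        }
        return out;
    }
}
```

Both inputs are read once from front to back, which is what makes the approach attractive when the indexes already deliver sorted results.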
Original issue reported on code.google.com by andreas.josef.wagner
on 29 Jan 2014 at 9:14
No support for transactions in Sesame; see the Sesame documentation:
http://openrdf.callimachus.net/sesame/2.7/docs/users.docbook?view#section-repository-api6
Original issue reported on code.google.com by andreas.josef.wagner
on 22 Nov 2013 at 12:29
What steps will reproduce the problem?
1. Browser shows wrong compression message with empty results
What is the expected output? What do you see instead?
Do correct compression.
Original issue reported on code.google.com by [email protected]
on 25 Jan 2013 at 2:22
What steps will reproduce the problem?
1. svn co https://cumulusrdf.googlecode.com/svn/branches/1.0.1 cumulusRDF
2. cd cumulusRDF
3. mvn clean install
Expected output is a build success but instead a build failure is reported.
Specifically, there are two problems
1) cannot find symbol LRUMap
LRUMap (used for example in NodeDictionaryBase) comes from sesame-sail-rdbms.
Now, I'm not able to build the project using maven because that jar is
(indirectly) declared with "runtime" scope.
That means
- in an Eclipse workspace all works fine (no compilation errors) because m2e
imports runtime jars (actually it makes no distinction between scopes) in build
path;
- running a m2e or a Maven build will fail because that dependency is not found
at compile time.
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
...
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.3.2:compile (default-compile) on project cumulusrdf: Compilation failure: Compilation failure:
[ERROR] /home/agazzarini/workspaces/cumulusRDF/cumulusrdf/src/main/java/edu/kit/aifb/cumulus/store/dict/NodeDictionaryBase.java:[13,34] package org.openrdf.sail.rdbms.util does not exist
[ERROR] /home/agazzarini/workspaces/cumulusRDF/cumulusrdf/src/main/java/edu/kit/aifb/cumulus/store/dict/NodeDictionaryBase.java:[47,9] cannot find symbol
...
[ERROR] /home/agazzarini/workspaces/cumulusRDF/cumulusrdf/src/main/java/edu/kit/aifb/cumulus/util/hector/CassandraHectorMap.java:[29,34] package org.openrdf.sail.rdbms.util does not exist
[ERROR] /home/agazzarini/workspaces/cumulusRDF/cumulusrdf/src/main/java/edu/kit/aifb/cumulus/util/hector/CassandraHectorMap.java:[118,9] cannot find symbol
[ERROR] symbol : class LRUMap
[ERROR] location: class edu.kit.aifb.cumulus.util.hector.CassandraHectorMap<K,V>
2) RestServletPojoTest
This class, which is in the test/src folder, is referenced in a @see comment
in RestApplicationResource (line 483), which belongs to the main/src folder.
As a consequence, RestApplicationResource imports a class that belongs to the
tests, which are not visible during the build.
This is not immediately visible in an IDE (e.g. Eclipse), where there are no
compilation errors, but when running an m2e or a Maven build I get
[ERROR] /home/agazzarini/workspaces/cumulusRDF/cumulusrdf/src/main/java/edu/kit/aifb/cumulus/webapp/rest/RESTApplicationResource.java:[44,34] cannot find symbol
[ERROR] symbol : class RestServletPojoTest
[ERROR] location: package edu.kit.aifb.cumulus.webapp
Original issue reported on code.google.com by [email protected]
on 23 Jan 2014 at 10:31
Time to live for added data, to be able to use CumulusRDF as a buffer for
streams (e.g., always keep one year's worth of data of a given stream).
Original issue reported on code.google.com by andreas.josef.wagner
on 22 Jan 2014 at 8:22
We will provide the checkstyle configuration directly in the project. It can
be used by Maven for continuous integration builds and by developers in
Eclipse (see the corresponding wiki page for how to configure it).
Original issue reported on code.google.com by [email protected]
on 4 Feb 2014 at 10:35
As discussed here
https://groups.google.com/forum/#!topic/cumulusrdf-dev-list/1oW1mhOSHRY
we should move to Sesame API 2.7.10, which solves the BNode prefix issue.
Original issue reported on code.google.com by [email protected]
on 4 Feb 2014 at 11:35
We currently have
<groupId>org.openrdf.sesame</groupId>
<artifactId>sesame-runtime</artifactId>
in our current pom. This simply adds (almost) all Sesame libs, regardless of
whether they are needed. TODO: remove unnecessary Sesame dependencies. This
would make the jar/war smaller.
Original issue reported on code.google.com by andreas.josef.wagner
on 24 Jan 2014 at 8:24
Deleting from the store will trigger one test for deletion from a secondary
index for every triple. Using a hashtable or sorted tree as buffer would
increase performance here.
Original issue reported on code.google.com by [email protected]
on 11 Feb 2014 at 11:02
00:37:45,114 ERROR
[edu.kit.aifb.cumulus.util.hector.CassandraHectorCounterFactory] counter:
TRIPLE_COUNTER suffered an overflow! current counter value: -3
Original issue reported on code.google.com by andreas.josef.wagner
on 3 Mar 2014 at 2:01
SimpleCassandraMapDictionary has terrible performance. This, in turn, leads to
bad performance for RDF insert operations.
Original issue reported on code.google.com by andreas.josef.wagner
on 22 Jan 2014 at 8:32
Need to access all the data; either via Dump CLI or HTTP interface or both.
Original issue reported on code.google.com by [email protected]
on 8 Feb 2013 at 12:09
Cannot find the license.
Original issue reported on code.google.com by [email protected]
on 9 Oct 2013 at 4:07
A lot of test failures after running the whole suite with the new Asynch Bulk
loader. See
http://dev.aifb.kit.edu/jenkins/job/CumulusRDF-Milestone-v1.1/lastBuild/testReport/
Original issue reported on code.google.com by [email protected]
on 5 Mar 2014 at 4:29
LoadCLI does not support multithreading any more. It simply uses Sesame to add
the file. This is not the intended way LoadCLI should work.
Original issue reported on code.google.com by andreas.josef.wagner
on 13 Mar 2014 at 11:03
As briefly discussed with Andreas, I would like to create a complete SPARQL
test suite that covers as many scenarios as possible.
To do that, we could use the examples (both ttl and rq files) from the book
"Learning SPARQL" [1]. I asked the author, who is fine with it; I'm still
waiting for permission from O'Reilly.
So we will create a test case with several test methods that run the examples
in the book and assert their results.
In case O'Reilly doesn't allow such usage, I'll use those examples as a basis
to create our own set of data files.
[1] http://www.learningsparql.com/
Original issue reported on code.google.com by [email protected]
on 19 Feb 2014 at 2:42
00:38:06,669 ERROR [edu.kit.aifb.cumulus.store.CassandraRdfHectorTriple] caught
java.lang.ArithmeticException: / by zero while inserting 0 [0, tries left: 10]
java.lang.ArithmeticException: / by zero
at com.ecyrd.speed4j.StopWatch.toString(StopWatch.java:258)
at edu.kit.aifb.cumulus.util.Util.logAndStopTimer(Util.java:245)
at edu.kit.aifb.cumulus.util.Util.logAndStopTimer(Util.java:218)
at edu.kit.aifb.cumulus.store.CassandraRdfHectorTriple.batchInsert(CassandraRdfHectorTriple.java:419)
Original issue reported on code.google.com by andreas.josef.wagner
on 3 Mar 2014 at 1:59
The close() method of edu.kit.aifb.cumulus.store.AbstractCassandraRdfStore must
stop the internal worker pool (as the last step); otherwise the workers prevent
a proper shutdown of the system.
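A sketch of what such a close() could look like, assuming the workers are managed by a java.util.concurrent.ExecutorService. The class WorkerPoolStore is a hypothetical stand-in, not the actual AbstractCassandraRdfStore, and the five-second grace period is an arbitrary choice.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a store that owns a pool of worker threads. */
final class WorkerPoolStore implements AutoCloseable {

    private final ExecutorService workers;

    WorkerPoolStore(int poolSize) {
        this.workers = Executors.newFixedThreadPool(poolSize);
    }

    void submit(Runnable task) {
        workers.submit(task);
    }

    boolean isTerminated() {
        return workers.isTerminated();
    }

    @Override
    public void close() {
        // Other resources would be released before this point.
        // Last step: stop accepting work and let queued tasks finish.
        workers.shutdown();
        try {
            if (!workers.awaitTermination(5, TimeUnit.SECONDS)) {
                // Grace period expired: interrupt the remaining workers.
                workers.shutdownNow();
            }
        } catch (InterruptedException e) {
            workers.shutdownNow();
            Thread.currentThread().interrupt();
        }
    }
}
```

Without the shutdown call, the non-daemon pool threads keep the JVM (or the webapp's class loader) alive after the store is closed.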
Original issue reported on code.google.com by [email protected]
on 2 Apr 2014 at 6:21
This is just a little problem with the home page of the software, rather than
the software itself.
What steps will reproduce the problem?
1. Go to Project Home (https://code.google.com/p/cumulusrdf/)
2. Under overview, click on the link to Apache Cassandra
3. You will be redirected to the dead link http://casssandra.apache.org/
(cassandra with the letter s 3 times).
What is the expected output? What do you see instead?
I suppose it should be http://cassandra.apache.org/ (2ses)
Original issue reported on code.google.com by [email protected]
on 30 Oct 2013 at 1:16
Although this is not a real priority for CumulusRDF, I believe we should create
a nicer (simple) GUI for the web pages.
In order to keep things simple, lightweight and fast, I suggest using
- bootstrap [1] for the visual layout: there's a dashboard [2] sample page that
should fit our needs perfectly;
- velocity [3] for dynamic pages: it has a very easy and powerful scripting
language
In this way we could, at least, replace the info and the welcome page with a
more attractive dashboard. On top of that, we could gradually add some
additional functionality to the sidebar, as in the Sesame admin console
(e.g. summary, reports, export, add data, query, explore, remove data, SPARQL
query & update)
[1] http://getbootstrap.com/
[2] http://getbootstrap.com/examples/dashboard
[3] http://velocity.apache.org/
Original issue reported on code.google.com by [email protected]
on 16 Feb 2014 at 4:24
What steps will reproduce the problem?
Send a PUT request with
s=<http://a.b.c#d>
p=<http://a.b.c#e>
o="A literal"
s2=<http://a.b.c#d>
p2=<http://a.b.c#e>
o2="Another literal"
What is the expected output? What do you see instead?
I would expect the following triple in the store
<http://a.b.c#d> <http://a.b.c#e> "Another literal"
Instead, the servlet throws an exception because the object is always assumed
to be a valid URI (i.e. the line URI o = valueFactory.createURI("A
Literal") fails)
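One possible way to handle this, sketched without the Sesame classes: inspect the raw request parameter and decide whether it denotes a URI (<...>) or a literal ("...") before calling the value factory. ObjectTerm and its Kind enum are hypothetical names used only for this example.

```java
/** Hypothetical sketch: classifies an object parameter as URI or literal. */
final class ObjectTerm {

    enum Kind { URI, LITERAL }

    final Kind kind;
    final String value;

    private ObjectTerm(Kind kind, String value) {
        this.kind = kind;
        this.value = value;
    }

    /** Accepts <uri> or "literal"; anything else is rejected. */
    static ObjectTerm parse(String raw) {
        String s = raw.trim();
        if (s.length() >= 2 && s.startsWith("<") && s.endsWith(">")) {
            return new ObjectTerm(Kind.URI, s.substring(1, s.length() - 1));
        }
        if (s.length() >= 2 && s.startsWith("\"") && s.endsWith("\"")) {
            return new ObjectTerm(Kind.LITERAL, s.substring(1, s.length() - 1));
        }
        throw new IllegalArgumentException("Neither a URI nor a literal: " + raw);
    }
}
```

The servlet could then call createURI only for Kind.URI and createLiteral for Kind.LITERAL. Typed or language-tagged literals are not covered by this sketch.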
Original issue reported on code.google.com by [email protected]
on 5 Feb 2014 at 3:40
As discussed in [1], we could remove the CompositeColumns in favor of simple
byte arrays (byte array concatenations).
[1]
https://groups.google.com/d/msgid/cumulusrdf-dev-list/52FCE956.3040606%40gmail.com
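To make the byte-array idea concrete, here is a sketch that concatenates several components into one array using a 4-byte length prefix per component. The encoding and the class name TripleKeyCodec are assumptions for illustration; the layout discussed on the list may differ.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch: packs several byte arrays into a single byte array. */
final class TripleKeyCodec {

    private TripleKeyCodec() {
    }

    /** Writes each part as a 4-byte big-endian length followed by its bytes. */
    static byte[] encode(byte[]... parts) {
        int size = 0;
        for (byte[] part : parts) {
            size += Integer.BYTES + part.length;
        }
        ByteBuffer buffer = ByteBuffer.allocate(size);
        for (byte[] part : parts) {
            buffer.putInt(part.length);
            buffer.put(part);
        }
        return buffer.array();
    }

    /** Reads back {@code count} parts written by {@link #encode(byte[]...)}. */
    static byte[][] decode(byte[] encoded, int count) {
        ByteBuffer buffer = ByteBuffer.wrap(encoded);
        byte[][] parts = new byte[count][];
        for (int i = 0; i < count; i++) {
            parts[i] = new byte[buffer.getInt()];
            buffer.get(parts[i]);
        }
        return parts;
    }
}
```

One caveat with this particular layout: with length prefixes, byte-wise comparison orders keys by component length first. If the column sort order matters, fixed-width identifiers without prefixes would be the safer choice.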
Original issue reported on code.google.com by andreas.josef.wagner
on 16 Feb 2014 at 1:01
Better selectivity estimation, i.e., collect meaningful statistics for, e.g.,
triple pattern, join pattern.
Original issue reported on code.google.com by andreas.josef.wagner
on 22 Nov 2013 at 12:21
Currently, entity queries, e.g.,
?x knows ?y .
?x name "x" .
?x age "18" .
are evaluated via joins along their subject (in the above example: ?x). That
is, one would need to compute bindings for each triple pattern, and join them
using two equi-joins.
However, this (probably) could be done much more efficiently with a single
scan. That is, one would start with a scan of the pattern with the least
matches (e.g., ?x age "18"):
x1 age "18" --> scan for x1 ?p ?o
x2 age "18" --> scan for x2 ?p ?o
x3 age "18" --> scan for x3 ?p ?o
...
Each such scan (x1 ?p ?o) would return additional property/object pairs;
these could be pushed to subsequent triple pattern accesses. For instance,
"x1 ?p ?o" could find "x1 knows y1", "x1 knows y2", "x1 name "x"" ... The
former two triples could be pushed to the access for ?x knows ?y, the latter
triple ("x1 name "x"") to the pattern access for ?x name "x".
The key advantage is that scans (sorted accesses) are fairly cheap in
comparison to random access probes. Thus, when finding the first potential
result entity (e.g., x1), we could just scan over (all) its associated triples
...
- Andreas
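A rough in-memory sketch of the idea, assuming the SPO index can be modelled as a map from subject to its {predicate, object} pairs. The seed subjects would come from the most selective pattern (here ?x age "18"). A pattern is a {predicate, object} pair where a null object stands for a variable. All names are hypothetical, and bindings for ?y are not collected, to keep the example short.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: answers a star-shaped query with one scan per candidate subject. */
final class EntityScan {

    private EntityScan() {
    }

    /**
     * For every seed subject, scans its row once and checks all patterns against
     * each {predicate, object} pair. A subject qualifies if every pattern was hit.
     */
    static List<String> matchingSubjects(Map<String, List<String[]>> spo,
                                         Collection<String> seeds,
                                         List<String[]> patterns) {
        List<String> result = new ArrayList<>();
        for (String subject : seeds) {
            boolean[] hit = new boolean[patterns.size()];
            List<String[]> row = spo.getOrDefault(subject, Collections.<String[]>emptyList());
            for (String[] po : row) {
                for (int i = 0; i < patterns.size(); i++) {
                    String[] pattern = patterns.get(i);
                    boolean predicateMatches = pattern[0].equals(po[0]);
                    boolean objectMatches = pattern[1] == null || pattern[1].equals(po[1]);
                    if (predicateMatches && objectMatches) {
                        hit[i] = true;
                    }
                }
            }
            boolean all = true;
            for (boolean h : hit) {
                all &= h;
            }
            if (all) {
                result.add(subject);
            }
        }
        return result;
    }
}
```

Each candidate costs one sequential read of its row instead of one index probe per triple pattern.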
Original issue reported on code.google.com by andreas.josef.wagner
on 29 Jan 2014 at 9:33
Support further RDF serializations, e.g., JSON-LD, Turtle, etc. These
serializations could be used, e.g., in
* Dump CLI
* Loader CLI
* Servlets
Original issue reported on code.google.com by andreas.josef.wagner
on 22 Jan 2014 at 8:50
What steps will reproduce the problem?
1. Serve data with the SimpleRDFXMLFormatter that contains a Literal that
contains a space
What is the expected output? What do you see instead?
Expected output is something like >Luiz Felipe<
Instead we get >Luiz+Felipe<
The reason is that the same escape function is used for Literals and Resources.
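The difference between the two kinds of escaping can be sketched as follows. This assumes the shared function performs form-style URL encoding (which is what turns a space into "+"); that is a guess based on the output above, not a statement about the actual SimpleRDFXMLFormatter code. The class and method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical sketch: separate escaping for literal text and for URI parts. */
final class RdfXmlEscaping {

    private RdfXmlEscaping() {
    }

    /** Escapes a literal for use as XML element text; spaces stay untouched. */
    static String escapeLiteral(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':
                    sb.append("&amp;");
                    break;
                case '<':
                    sb.append("&lt;");
                    break;
                case '>':
                    sb.append("&gt;");
                    break;
                default:
                    sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Form-style encoding, shown only to reproduce the "+" seen in the report. */
    static String formEncode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }
}
```

With these two methods, escapeLiteral("Luiz Felipe") leaves the space alone, while formEncode("Luiz Felipe") yields "Luiz+Felipe".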
Original issue reported on code.google.com by [email protected]
on 23 May 2012 at 7:46
What steps will reproduce the problem?
1. Load a large (> 2 m triples) file.
2. You will see timeout messages.
What is the expected output? What do you see instead?
Higher timeouts, perhaps slowing down input.
Original issue reported on code.google.com by [email protected]
on 25 Jan 2013 at 2:23
Remove dependencies on NxParser and Yars; only use Sesame
model/parsers/writers.
Original issue reported on code.google.com by andreas.josef.wagner
on 22 Nov 2013 at 12:18
Upgrade to Sesame 2.7.11, see [1].
[1] https://openrdf.atlassian.net/browse/SES/fixforversion/11701
Original issue reported on code.google.com by andreas.josef.wagner
on 14 Apr 2014 at 11:14
Implement a Sesame HTTPRepository. See:
* org.openrdf.http.client.HTTPClient
* http://answers.semanticweb.com/questions/22068/exposing-a-triple-store-as-a-sesame-http-repository
Original issue reported on code.google.com by andreas.josef.wagner
on 22 Jan 2014 at 9:04
Two enhancements are included in this issue:
1) Refactor code in order to use a more flexible and fast logging framework
(log4j or logback). At the moment JULI is used which is optimized in Tomcat but
relies on standard java util logging which offers a limited set of
capabilities.
2) A message catalog, which would consist of an enumerated interface
(IMessageCatalog) where all CumulusRDF messages are defined. That would allow a
structured log with (for example) messages like this
...
2014-01-15 17:05:42,105 INFO <CRDF-00011> : CUMULUS-RDF 1.0.0 open for e-business.
...
As you can see, besides having all messages classified, we could associate a
code with each message and, for relevant messages (e.g. errors), we could
create a Wiki page with something like:
- Code: CDRF-000034
- Level: ERROR
- Message: Malformed configuration file.
- Suggested action: check your configuration file ...
I know, that would require more effort each time we need to write an
additional log message, but at the same time it would provide a very powerful
and meaningful log subsystem
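A minimal sketch of such a catalog, written as a Java enum rather than the interface mentioned above. The code CRDF-00011 and its text are taken from the sample log line; the second entry's code and the enum itself are hypothetical.

```java
/** Hypothetical sketch of a message catalog: one entry per log message. */
enum MessageCatalog {

    STARTUP("CRDF-00011", "CUMULUS-RDF %s open for e-business."),
    // Code chosen for illustration only.
    MALFORMED_CONFIG("CRDF-00034", "Malformed configuration file.");

    private final String code;
    private final String template;

    MessageCatalog(String code, String template) {
        this.code = code;
        this.template = template;
    }

    String code() {
        return code;
    }

    /** Renders the message as "<CODE> : text", filling in template arguments. */
    String format(Object... args) {
        return "<" + code + "> : " + String.format(template, args);
    }
}
```

A call such as log.info(MessageCatalog.STARTUP.format("1.0.0")) would then work with whichever logging framework is chosen, and the code can serve as the key of the Wiki page.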
Original issue reported on code.google.com by [email protected]
on 27 Jan 2014 at 10:11
What steps will reproduce the problem?
1. Proxy Mode
2. curl -x http://localhost:8080 -H "Accept: application/rdf+xml" http://dbpedia.org/resource/Karlsruhe
ERROR /resource/Karlsruhe 404: resource not found
Want to have the full URI, including host part.
Original issue reported on code.google.com by [email protected]
on 2 May 2012 at 8:36
NodeDictionaryBase:136 creates literals assuming the n3 string contains only
the value (no language, no datatype).
For example, given
n3 = "2012-02-01T09:53:00Z"^^<http://www.w3.org/2001/XMLSchema#dateTime>
the Literal instance creation
Literal l = ValueFactory.createLiteral(n3)
leads to a wrong value because the datatype (or language) part is treated as
part of the value. That is, a new Literal is created with the following value:
""2012-02-01T09:53:00Z"^^<http://www.w3.org/2001/XMLSchema#dateTime>"
What is the expected output? What do you see instead?
I would expect a Literal correctly created, with value, language and datatype.
This blocks a lot of unit tests because the system indexes triples but is not
able to return them correctly as part of a SELECT or DESCRIBE query
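A sketch of how the n3 string could be split into its parts before creating the Literal, using only the JDK. N3Literal is a hypothetical helper, not CumulusRDF code; it does not unescape backslash sequences in the label and only handles the "..."@lang and "..."^^<datatype> forms.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: splits an N3 literal into label, language and datatype. */
final class N3Literal {

    // "label" optionally followed by @lang or ^^<datatype>.
    private static final Pattern N3 = Pattern.compile(
        "\"((?:[^\"\\\\]|\\\\.)*)\"(?:@([A-Za-z]+(?:-[A-Za-z0-9]+)*)|\\^\\^<([^>]+)>)?",
        Pattern.DOTALL);

    final String label;
    final String language; // null if absent
    final String datatype; // null if absent

    private N3Literal(String label, String language, String datatype) {
        this.label = label;
        this.language = language;
        this.datatype = datatype;
    }

    static N3Literal parse(String n3) {
        Matcher m = N3.matcher(n3);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not an N3 literal: " + n3);
        }
        return new N3Literal(m.group(1), m.group(2), m.group(3));
    }
}
```

The parts can then be passed to the ValueFactory overload that takes a label together with a language tag or a datatype URI, instead of the single-argument createLiteral(n3).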
Original issue reported on code.google.com by [email protected]
on 2 Feb 2014 at 9:36
Make a proper shell based on our plain CLI functionality.
See also:
* http://stackoverflow.com/questions/14080604/libraries-for-constructing-an-interactive-shell-for-java-application
* http://java.dzone.com/announcements/clamshell-cli-framework
Original issue reported on code.google.com by andreas.josef.wagner
on 14 Mar 2014 at 12:10
LoadThread instances should be managed in a pool instead of creating new
threads for each bulk load.
Original issue reported on code.google.com by [email protected]
on 26 Feb 2014 at 12:09
Switch from Hector thrift client to Datastax CQL client.
Original issue reported on code.google.com by andreas.josef.wagner
on 22 Jan 2014 at 8:24
As discussed here [1], in order to enable several perspectives of the project
test suite, we should change the project layout a bit. The layout that comes
from the initial discussion [1] seems something like this:
cumulusrdf
--cumulusrdf-kernel
--cumulusrdf-integration-tests
--cumulusrdf-benchmark
--??
Where
a) cumulusrdf: a top level project with pom packaging
b) cumulusrdf-kernel: please suggest a more appropriate name :), this is the
current cumulusrdf module (war packaging). It includes sources and unit tests.
c) cumulusrdf-integration-tests: as the name suggests, this module includes
only integration / system tests
d) cumulusrdf-benchmark: a special test module dedicated to benchmarking the
corresponding release artifact
Another interesting module could be a "distribution" that uses the Maven
assembly plugin to produce different kinds of artifacts (e.g. onejar, war,
directory)
[1]
https://groups.google.com/forum/#!topicsearchin/cumulusrdf-dev-list/maven|sort:date|spell:true/cumulusrdf-dev-list/z3JegSK17gY
Original issue reported on code.google.com by [email protected]
on 16 Feb 2014 at 4:15
CumulusRDF currently only supports Cassandra 1.x. Add support for Cassandra 2.x.
Original issue reported on code.google.com by andreas.josef.wagner
on 22 Nov 2013 at 12:12
"$pageName" on load Web GUI page after successful upload.
Original issue reported on code.google.com by andreas.josef.wagner
on 7 Mar 2014 at 2:28
As discussed here
https://groups.google.com/forum/#!topic/cumulusrdf-dev-list/wRZ-2coKPs0
Value arrays will be replaced by SesameStatement(s)
Original issue reported on code.google.com by [email protected]
on 2 Feb 2014 at 3:21
CLI Loader does not load data ...
Original issue reported on code.google.com by andreas.josef.wagner
on 11 Dec 2013 at 1:42
What steps will reproduce the problem?
1. access a CumulusRDF URI with a complex accept header (e.g., using multiple
content types with preferences)
2. problem
What is the expected output? What do you see instead?
The client should get the correctly negotiated format.
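For reference, a simplified sketch of negotiation with quality values. For each format the server supports, it looks up the most specific matching range in the Accept header and uses that range's q value; the format with the highest q wins. ContentNegotiation is a hypothetical class; the sketch ignores media type parameters other than a lower-case "q=" and is not a full implementation of the HTTP rules.

```java
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch: picks the best supported media type for an Accept header. */
final class ContentNegotiation {

    private ContentNegotiation() {
    }

    /** Returns the supported type with the highest quality, or null if none is acceptable. */
    static String negotiate(String acceptHeader, List<String> supported) {
        String best = null;
        double bestQ = 0.0;
        for (String candidate : supported) {
            double q = qualityFor(candidate.toLowerCase(Locale.ROOT), acceptHeader);
            if (q > bestQ) { // ties keep the earlier (server-preferred) type
                best = candidate;
                bestQ = q;
            }
        }
        return best;
    }

    /** Quality of the most specific range in the header that matches mediaType. */
    private static double qualityFor(String mediaType, String acceptHeader) {
        int bestSpecificity = -1;
        double quality = 0.0;
        for (String part : acceptHeader.split(",")) {
            String[] tokens = part.trim().split(";");
            String range = tokens[0].trim().toLowerCase(Locale.ROOT);
            int specificity = specificity(range, mediaType);
            if (specificity < 0) {
                continue;
            }
            double q = 1.0; // default when no q parameter is given
            for (int i = 1; i < tokens.length; i++) {
                String param = tokens[i].trim();
                if (param.startsWith("q=")) {
                    try {
                        q = Double.parseDouble(param.substring(2).trim());
                    } catch (NumberFormatException e) {
                        q = 0.0; // treat a malformed q as "not acceptable"
                    }
                }
            }
            if (specificity > bestSpecificity) {
                bestSpecificity = specificity;
                quality = q;
            }
        }
        return quality;
    }

    /** 2 = exact match, 1 = type/*, 0 = star/star, -1 = no match. */
    private static int specificity(String range, String mediaType) {
        if (range.equals(mediaType)) {
            return 2;
        }
        if (range.equals("*/*")) {
            return 0;
        }
        if (range.endsWith("/*") && mediaType.startsWith(range.substring(0, range.length() - 1))) {
            return 1;
        }
        return -1;
    }
}
```

A header such as "text/html;q=0.5, application/rdf+xml;q=0.9, */*;q=0.1" would select application/rdf+xml when the server supports Turtle, RDF/XML and HTML.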
Original issue reported on code.google.com by [email protected]
on 3 Feb 2013 at 3:34
What steps will reproduce the problem?
1. Run more than one unit test that uses cassandra-unit for starting Cassandra
What is the expected output? What do you see instead?
I expect all tests to run correctly, but only the first one succeeds: from the
second test on, the embedded Cassandra complains about a duplicate index. This
seems to be related to cassandra-unit, which doesn't provide a way to shut down
the embedded instance between tests.
Original issue reported on code.google.com by [email protected]
on 31 Jan 2014 at 2:21
This is not really a bug. Instead, as discussed here
https://groups.google.com/forum/#!topic/cumulusrdf-dev-list/vOKdDAXJEqg
We could do some benchmark / test in order to see if we really need Composites.
We are already working with the low level form of serialization (byte arrays)
so maybe the abstraction and the "complexity" offered by Composites could be
avoided.
Original issue reported on code.google.com by [email protected]
on 16 Feb 2014 at 2:25