trellis-cassandra's People

Contributors

acoburn, ajs6f, gatos-jd, gregjan, jholleran

trellis-cassandra's Issues

Buffer RDF

To prevent slippage between asynchronous calls to C* and to simplify RDF management, we will change the workflow to buffer RDF on resource retrieval (on ResourceService::get) instead of spooling it (on Resource::stream).
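
A minimal sketch of the buffering idea, under stated assumptions: triples are modeled as plain strings, and BufferedResource and its backendQuery parameter are hypothetical names, not Trellis or trellis-cassandra types. The backend is drained once at retrieval time, and every later stream() call is served from memory.

```java
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical stand-in for a resource; triples are plain strings here.
final class BufferedResource {

    private final List<String> triples;

    private BufferedResource(List<String> triples) {
        this.triples = triples;
    }

    // Analogue of ResourceService::get: run the backend query once and
    // buffer every row before the resource is handed to the caller.
    static BufferedResource get(Supplier<Stream<String>> backendQuery) {
        try (Stream<String> rows = backendQuery.get()) {
            List<String> buffered = rows.collect(Collectors.toList());
            return new BufferedResource(Collections.unmodifiableList(buffered));
        }
    }

    // Analogue of Resource::stream: no backend call, only the buffer.
    Stream<String> stream() {
        return triples.stream();
    }
}
```

With this shape, calling stream() any number of times never triggers a second backend read, at the cost of holding the whole RDF body in memory.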

Upgrade Cassandra driver

Several new versions of the Cassandra driver have been released since we selected 3.6. Upgrade to the current best choice.

Binary "contains" query

@gregjan, question for you--

The current query for checking whether a binary exists is:

SELECT identifier FROM Binarydata WHERE identifier = ? and chunk_index = 0;

Am I right in thinking that the "and chunk_index = 0" clause is unnecessary and can be removed? If there is any chunk, even if it isn't the first, we can answer the question with a "yes, this binary exists". Right?
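
To make the difference between the two checks concrete, here is a small in-memory model. It is only a sketch: BinaryContains and its map are hypothetical stand-ins for the Binarydata table, and the CQL in the comment assumes that identifier is the partition key and chunk_index a clustering column, which the issue does not state.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for Binarydata:
// identifier -> (chunk_index -> chunk bytes).
final class BinaryContains {

    private final Map<String, SortedMap<Integer, byte[]>> table = new HashMap<>();

    void put(String identifier, int chunkIndex, byte[] chunk) {
        table.computeIfAbsent(identifier, k -> new TreeMap<>()).put(chunkIndex, chunk);
    }

    // Mirrors the current query: ... WHERE identifier = ? and chunk_index = 0
    boolean containsFirstChunk(String identifier) {
        SortedMap<Integer, byte[]> chunks = table.get(identifier);
        return chunks != null && chunks.containsKey(0);
    }

    // Mirrors the relaxed check: any row for the identifier is enough.
    // Under the schema assumption above, the CQL could be:
    //   SELECT identifier FROM Binarydata WHERE identifier = ? LIMIT 1;
    boolean containsAnyChunk(String identifier) {
        SortedMap<Integer, byte[]> chunks = table.get(identifier);
        return chunks != null && !chunks.isEmpty();
    }
}
```

The two methods only disagree for a binary whose first chunk is missing while later chunks exist, so the choice matters only if such partial binaries can occur.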

Versioning

CassandraResourceService and CassandraBinaryService must support Memento action from the HTTP layer. This implies a new CassandraMementoService and corresponding schema changes.

Make reading bitstreams agnostic to chunk length

While it's necessary to have a chunk length in hand to write a bitstream, it's not clear that it is necessary to read one. If not, the configured value should not be used, to provide the future possibility of varying it more dynamically.
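
A sketch of why reading may not need the chunk length, assuming chunks come back ordered by chunk index. ChunkedBitstream is a hypothetical class that models stored chunks as a list of byte arrays; it is not the CassandraBinaryService implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical model: a bitstream stored as chunks in index order.
final class ChunkedBitstream {

    // Writing needs a chunk length to decide where to split the data.
    static List<byte[]> write(byte[] data, int chunkLength) {
        if (chunkLength <= 0) {
            throw new IllegalArgumentException("chunkLength must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        for (int start = 0; start < data.length; start += chunkLength) {
            int end = Math.min(data.length, start + chunkLength);
            chunks.add(Arrays.copyOfRange(data, start, end));
        }
        return chunks;
    }

    // Reading only concatenates chunks in order; no chunk length is consulted,
    // so chunks of differing sizes would be handled the same way.
    static InputStream read(List<byte[]> chunksInIndexOrder) {
        List<InputStream> streams = new ArrayList<>();
        for (byte[] chunk : chunksInIndexOrder) {
            streams.add(new ByteArrayInputStream(chunk));
        }
        return new SequenceInputStream(Collections.enumeration(streams));
    }

    // Convenience helper for the usage example (requires Java 9+ for readAllBytes).
    static byte[] readAll(List<byte[]> chunksInIndexOrder) {
        try (InputStream in = read(chunksInIndexOrder)) {
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For example, readAll(write(data, 4)) and readAll(write(data, 1024)) both return the original bytes, although the reader never sees the value 4 or 1024.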

Replace .distinct() stream filtering with CQL

The .distinct() stream method will create buffering on the front-end nodes that may be significant for containers with many modifications/mementos. Instead we can use CQL, perhaps adding a "LIMIT 1" to the query in question. Since any row will suffice to establish the contains relationship and since the containment table is partitioned by container id, it should work.
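
The cost difference can be illustrated with a purely in-memory analogy. In this sketch, DistinctVersusLimit and its row values are hypothetical, and Stream.limit(1) merely stands in for a server-side "LIMIT 1"; no CQL is executed. The counter records how many rows each approach pulls from the source.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class DistinctVersusLimit {

    // Stand-in for containment rows: the same child appears once per modification.
    static Stream<String> rows(AtomicInteger pulled) {
        return Stream.of("child-a", "child-a", "child-a", "child-a")
                .peek(row -> pulled.incrementAndGet());
    }

    // Client-side de-duplication: every row travels to the front end, and
    // distinct() has to remember the values it has already seen.
    static int rowsPulledByDistinct() {
        AtomicInteger pulled = new AtomicInteger();
        rows(pulled).distinct().collect(Collectors.toList());
        return pulled.get();
    }

    // Analogue of "LIMIT 1": the pipeline stops after the first row.
    static int rowsPulledByLimit() {
        AtomicInteger pulled = new AtomicInteger();
        rows(pulled).limit(1).collect(Collectors.toList());
        return pulled.get();
    }
}
```

Both approaches yield the single value "child-a", but the distinct() variant touches all four rows while the limited variant touches one.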

Binary resource GET throws 500 (no logged error)

While exploring my populated test repository, I encountered a 500 when trying to get back the non-RDF resources. I am seeing similar behavior as below for all such resources. Here is the example session:

jansen@X1:~$ curl -v http://ciber-vs1.umd.edu:10080/srv/ciber/Transfer+Notes/nara1_vault10/National_Archives/Federal_Records/RG+255+-+Records+of+the+National+Aeronautics+and+Space+Administration/EOS+Data+Files/Crystal+Dynamics/pub/slr/data/fr/jason1/daily/7090
*   Trying 128.8.216.153...
* TCP_NODELAY set
* Connected to ciber-vs1.umd.edu (128.8.216.153) port 10080 (#0)
> GET /srv/ciber/Transfer+Notes/nara1_vault10/National_Archives/Federal_Records/RG+255+-+Records+of+the+National+Aeronautics+and+Space+Administration/EOS+Data+Files/Crystal+Dynamics/pub/slr/data/fr/jason1/daily/7090 HTTP/1.1
> Host: ciber-vs1.umd.edu:10080
> User-Agent: curl/7.58.0
> Accept: */*
> 
< HTTP/1.1 200 OK
< Link-Template: <http://ciber-vs1.umd.edu:10080/srv/ciber/Transfer+Notes/nara1_vault10/National_Archives/Federal_Records/RG+255+-+Records+of+the+National+Aeronautics+and+Space+Administration/EOS+Data+Files/Crystal+Dynamics/pub/slr/data/fr/jason1/daily/7090{?version}>; rel="http://mementoweb.org/ns#Memento"
< Link-Template: <http://ciber-vs1.umd.edu:10080/srv/ciber/Transfer+Notes/nara1_vault10/National_Archives/Federal_Records/RG+255+-+Records+of+the+National+Aeronautics+and+Space+Administration/EOS+Data+Files/Crystal+Dynamics/pub/slr/data/fr/jason1/daily/7090{?subject,predicate,object}>; rel="http://www.w3.org/ns/ldp#RDFSource"
< Accept-Patch: application/sparql-update
< Date: Tue, 12 Mar 2019 17:26:03 GMT
< Allow: GET,HEAD,OPTIONS,PATCH,PUT,DELETE,POST
< Connection: keep-alive
< ETag: W/"0a56b5f371e277b53e8d1e686148d9c2"
< Last-Modified: Tue, 12 Mar 2019 16:44:29 GMT
< Vary: Accept
< Vary: Prefer
< Vary: Accept-Datetime
< Accept-Post: text/turtle,application/ld+json,application/n-triples
< Transfer-Encoding: chunked
< Content-Type: text/turtle;charset=UTF-8
< Link: <http://www.w3.org/ns/ldp#BasicContainer>; rel="type"
< Link: <http://www.w3.org/ns/ldp#Container>; rel="type"
< Link: <http://www.w3.org/ns/ldp#RDFSource>; rel="type"
< Link: <http://www.w3.org/ns/ldp#Resource>; rel="type"
< Link: <http://ciber-vs1.umd.edu:10080/srv/ciber/Transfer+Notes/nara1_vault10/National_Archives/Federal_Records/RG+255+-+Records+of+the+National+Aeronautics+and+Space+Administration/EOS+Data+Files/Crystal+Dynamics/pub/slr/data/fr/jason1/daily/7090>; rel="original timegate"
< Link: <http://ciber-vs1.umd.edu:10080/srv/ciber/Transfer+Notes/nara1_vault10/National_Archives/Federal_Records/RG+255+-+Records+of+the+National+Aeronautics+and+Space+Administration/EOS+Data+Files/Crystal+Dynamics/pub/slr/data/fr/jason1/daily/7090?ext=timemap>; rel="timemap"; from="Tue, 12 Mar 2019 16:44:29 GMT"; until="Tue, 12 Mar 2019 16:44:29 GMT"; type="application/link-format"
< Link: <http://ciber-vs1.umd.edu:10080/srv/ciber/Transfer+Notes/nara1_vault10/National_Archives/Federal_Records/RG+255+-+Records+of+the+National+Aeronautics+and+Space+Administration/EOS+Data+Files/Crystal+Dynamics/pub/slr/data/fr/jason1/daily/7090?version=1552409069>; rel="memento"; datetime="Tue, 12 Mar 2019 16:44:29 GMT"
< Link: <http://ciber-vs1.umd.edu:10080/srv/ciber/Transfer+Notes/nara1_vault10/National_Archives/Federal_Records/RG+255+-+Records+of+the+National+Aeronautics+and+Space+Administration/EOS+Data+Files/Crystal+Dynamics/pub/slr/data/fr/jason1/daily/7090>; rel="self"
< 
<http://ciber-vs1.umd.edu:10080/srv/ciber/Transfer+Notes/nara1_vault10/National_Archives/Federal_Records/RG+255+-+Records+of+the+National+Aeronautics+and+Space+Administration/EOS+Data+Files/Crystal+Dynamics/pub/slr/data/fr/jason1/daily/7090>
        <http://purl.org/dc/terms/title>  "7090" ;
        <http://purl.org/dc/terms/extent>  "15" ;
        <http://www.w3.org/ns/ldp#contains>  <http://ciber-vs1.umd.edu:10080/srv/ciber/Transfer+Notes/nara1_vault10/National_Archives/Federal_Records/RG+255+-+Records+of+the+National+Aeronautics+and+Space+Administration/EOS+Data+Files/Crystal+Dynamics/pub/slr/data/fr/jason1/daily/7090/1d0e18fc-e03c-468a-9fb4-5340097e0a75> ;
        <http://www.w3.org/ns/ldp#contains>  <http://ciber-vs1.umd.edu:10080/srv/ciber/Transfer+Notes/nara1_vault10/National_Archives/Federal_Records/RG+255+-+Records+of+the+National+Aeronautics+and+Space+Administration/EOS+Data+Files/Crystal+Dynamics/pub/slr/data/fr/jason1/daily/7090/28c91744-c5c7-41f8-b6f0-ba79f2e2d9af> ;
        <http://www.w3.org/ns/ldp#contains>  <http://ciber-vs1.umd.edu:10080/srv/ciber/Transfer+Notes/nara1_vault10/National_Archives/Federal_Records/RG+255+-+Records+of+the+National+Aeronautics+and+Space+Administration/EOS+Data+Files/Crystal+Dynamics/pub/slr/data/fr/jason1/daily/7090/4222abbf-1e6c-4c5a-9d21-eac3acfba6ad> ;
        <http://www.w3.org/ns/ldp#contains>  <http://ciber-vs1.umd.edu:10080/srv/ciber/Transfer+Notes/nara1_vault10/National_Archives/Federal_Records/RG+255+-+Records+of+the+National+Aeronautics+and+Space+Administration/EOS+Data+Files/Crystal+Dynamics/pub/slr/data/fr/jason1/daily/7090/5a3650a6-ad6d-41c3-b308-5571b66804bb> ;
        <http://www.w3.org/ns/ldp#contains>  <http://ciber-vs1.umd.edu:10080/srv/ciber/Transfer+Notes/nara1_vault10/National_Archives/Federal_Records/RG+255+-+Records+of+the+National+Aeronautics+and+Space+Administration/EOS+Data+Files/Crystal+Dynamics/pub/slr/data/fr/jason1/daily/7090/1d0e18fc-e03c-468a-9fb4-5340097e0a75> ;
        <http://www.w3.org/ns/ldp#contains>  <http://ciber-vs1.umd.edu:10080/srv/ciber/Transfer+Notes/nara1_vault10/National_Archives/Federal_Records/RG+255+-+Records+of+the+National+Aeronautics+and+Space+Administration/EOS+Data+Files/Crystal+Dynamics/pub/slr/data/fr/jason1/daily/7090/28c91744-c5c7-41f8-b6f0-ba79f2e2d9af> ;
        <http://www.w3.org/ns/ldp#contains>  <http://ciber-vs1.umd.edu:10080/srv/ciber/Transfer+Notes/nara1_vault10/National_Archives/Federal_Records/RG+255+-+Records+of+the+National+Aeronautics+and+Space+Administration/EOS+Data+Files/Crystal+Dynamics/pub/slr/data/fr/jason1/daily/7090/4222abbf-1e6c-4c5a-9d21-eac3acfba6ad> ;
        <http://www.w3.org/ns/ldp#contains>  <http://ciber-vs1.umd.edu:10080/srv/ciber/Transfer+Notes/nara1_vault10/National_Archives/Federal_Records/RG+255+-+Records+of+the+National+Aeronautics+and+Space+Administration/EOS+Data+Files/Crystal+Dynamics/pub/slr/data/fr/jason1/daily/7090/5a3650a6-ad6d-41c3-b308-5571b66804bb> .
* Connection #0 to host ciber-vs1.umd.edu left intact
jansen@X1:~$ curl -v http://ciber-vs1.umd.edu:10080/srv/ciber/Transfer+Notes/nara1_vault10/National_Archives/Federal_Records/RG+255+-+Records+of+the+National+Aeronautics+and+Space+Administration/EOS+Data+Files/Crystal+Dynamics/pub/slr/data/fr/jason1/daily/7090/4222abbf-1e6c-4c5a-9d21-eac3acfba6ad
*   Trying 128.8.216.153...
* TCP_NODELAY set
* Connected to ciber-vs1.umd.edu (128.8.216.153) port 10080 (#0)
> GET /srv/ciber/Transfer+Notes/nara1_vault10/National_Archives/Federal_Records/RG+255+-+Records+of+the+National+Aeronautics+and+Space+Administration/EOS+Data+Files/Crystal+Dynamics/pub/slr/data/fr/jason1/daily/7090/4222abbf-1e6c-4c5a-9d21-eac3acfba6ad HTTP/1.1
> Host: ciber-vs1.umd.edu:10080
> User-Agent: curl/7.58.0
> Accept: */*
> 
< HTTP/1.1 500 Internal Server Error
< Connection: keep-alive
< Content-Type: text/html;charset=UTF-8
< Content-Length: 80
< Date: Tue, 12 Mar 2019 17:26:28 GMT
< 
* Connection #0 to host ciber-vs1.umd.edu left intact
<html><head><title>Error</title></head><body>Internal Server Error</body></html>

Thorntail trying IP6 before IP4

I'm getting a startup error that is related to WildFly. It looks like a protocol dependency is missing. Have you seen this before?

2018-11-26 16:19:02,222 ERROR [org.jboss.as.controller.management-operation] (Controller Boot Thread) WFLYCTL0013: Operation ("add") failed - address: ([
    ("subsystem" => "undertow"),
    ("server" => "default-server"),
    ("http-listener" => "default")
]) - failure description: {"WFLYCTL0080: Failed services" => {"org.wildfly.undertow.listener.default" => "WFLYUT0082: Could not start 'default' listener.
    Caused by: java.net.SocketException: Protocol family unavailable"}}

Stronger Memento design

The current Memento storage design uses a time series and incurs the cost of table scanning. We can do better by shunting Mementos to a separate table and using the main mutabledata table only for current information.

Conflict between WebDAV and tests?

Having added the WebDAV resource and two filters to my Trellis application, I'm now getting test failures:

[ERROR] Failures:
[ERROR]   LdpBasicContainerIT.testCreateContainerViaPut Check for an ldp:contains triple ==> expected: <false> but was: <true>
[ERROR]   MementoBinaryIT Check for a valid response to PUTting an LDP-NR ==> expected: <SUCCESSFUL> but was: <CLIENT_ERROR>
[ERROR]   MementoResourceIT Check for a valid response to PUTting an LDP-NR ==> expected: <SUCCESSFUL> but was: <CLIENT_ERROR>
[ERROR]   MementoTimeGateIT Check for a valid response to PUTting an LDP-NR ==> expected: <SUCCESSFUL> but was: <CLIENT_ERROR>
[ERROR]   MementoTimeMapIT Check for a valid response to PUTting an LDP-NR ==> expected: <SUCCESSFUL> but was: <CLIENT_ERROR>

(replace IT with Test). Do we have any information about how WebDAV and Memento APIs interact?

Configurable consistency levels

Consistency levels for write and read operations to Cassandra can be configured on a per-statement basis. As a first step, a globally constant consistency level can be made configurable for read statements and another for write statements: two settings in all.
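
A sketch of what the two settings might look like, with loud assumptions: the property keys, the default of ONE, and the ConsistencySettings class are all hypothetical, and the enum re-declares a few of Cassandra's consistency level names instead of using the driver's own type so that the example stays self-contained.

```java
import java.util.Properties;

// Hypothetical holder for the two global settings described above.
final class ConsistencySettings {

    // A subset of Cassandra's consistency level names, re-declared here
    // so the sketch does not depend on the driver.
    enum Level { ONE, QUORUM, LOCAL_QUORUM, ALL }

    // Hypothetical configuration keys, not defined by the project.
    static final String READ_KEY = "cassandra.consistency.read";
    static final String WRITE_KEY = "cassandra.consistency.write";

    final Level read;
    final Level write;

    private ConsistencySettings(Level read, Level write) {
        this.read = read;
        this.write = write;
    }

    // Falls back to ONE when a key is absent (the default is an assumption).
    // Level.valueOf rejects unknown names with IllegalArgumentException.
    static ConsistencySettings from(Properties config) {
        Level read = Level.valueOf(config.getProperty(READ_KEY, "ONE").trim());
        Level write = Level.valueOf(config.getProperty(WRITE_KEY, "ONE").trim());
        return new ConsistencySettings(read, write);
    }
}
```

Each statement the services build would then receive either the read or the write level, depending on its kind.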

Configurable chunk size for binaries

The chunking size for binaries persisted via CassandraBinaryService is currently fixed for injected services at 1MB. There is a ctor that accepts chunk length, but it is not injectable. There is no way to use Tamaya config to set the chunk length, and that is the purpose of this ticket.
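
A sketch of resolving the chunk length from configuration, using java.util.Properties as a stand-in for Tamaya. The key name and the ChunkLengthConfig class are hypothetical, and reading "1MB" as 1024 * 1024 bytes is an assumption about the current default.

```java
import java.util.Properties;

// Hypothetical resolver for the binary chunk length.
final class ChunkLengthConfig {

    // Hypothetical configuration key, not defined by the project.
    static final String KEY = "cassandra.binary.chunk.length";

    // Assumed meaning of the current fixed "1MB" default.
    static final int DEFAULT_CHUNK_LENGTH = 1024 * 1024;

    // Returns the configured length in bytes, or the default when unset.
    static int resolve(Properties config) {
        String raw = config.getProperty(KEY);
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_CHUNK_LENGTH;
        }
        int value = Integer.parseInt(raw.trim());
        if (value <= 0) {
            throw new IllegalArgumentException(KEY + " must be positive, was " + value);
        }
        return value;
    }
}
```

The resolved value could then be passed to the existing constructor that accepts a chunk length.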

Duplication possible of the contains relationship

I was running performance tests. What I found after two tests, between which I failed to reset the database, was that the root folder showed two identical ldp:contains relationships. Presumably there is only one contained resource, since the object of both triples was the same. I will look at the C* tables and report what I find there in a follow-up comment.
