Giter Site home page Giter Site logo

Comments (8)

afs avatar afs commented on July 17, 2024

HttpConnectionParams is org.apache.http.params.HttpConnectionParams.

Jena now uses java.net.http.HttpClient, not Apache HttpClient.

Apache HttpClient is only in the dependencies because jsonld-java uses it. If Titanium becomes the only JSON-LD provider, then it won't be needed. We can probably drop the explicit mention now. jena used to need to control the version.

In java.net.http, connection timeout is "TCP connection setup" timeout. See java.net.http.HttpClient.Builder.connectTimeout

In the case where a new connection needs to be established, ...
If a new connection does not need to be established, for example if a connection can be reused
from a previous request, then this timeout duration has no effect.

In the case of RDFConnectionit does reuse the java.net.http.HttpClient and hence connections.
Unless provided, there is only one HttpClient (for all HTTP activity). So likely only the first use of the RDFConnection has any chance of a connection setup to a new remote and hence timeout.

HTTP/2 : there is no TCP connection setup other that the one shared host-host connection.

HTTP/3 : different again.

I assumed the lack of timeout controls for java.net.http.HttpClient is looking to the future of HTTP/2 and HTTP/3 where things are different.


QueryExecution.setTimeout is deprecated. There is some compatibility code (a hidden delayed building) because there is a "build-use" pattern nowadays -- QueryExec does not allow changes of setup after building.

from jena.

Aklakan avatar Aklakan commented on July 17, 2024

Sigh, too many issues lately so I it happens I do things too hastily - I didn't notice that the stack overflow snippet referred to apache http client because the title referred to java http and I googled for java net http.

I see, the method now is java.net.http.HttpClient.Builder.connectTimeout (javadoc)

So I'll look into supplying my own HttpClient instance via RDFConnectionRemoteBuilder#httpClient then which would be good enough for my use case.

from jena.

afs avatar afs commented on July 17, 2024

Yes - setting up the HttpClient is the right way to do it.

It is the only way as it get frozen at that point. It is possible to manually clone the setting of an HttpClient but it's afresh setup and not having the same connection caching (HTTP/1.1).

from jena.

Aklakan avatar Aklakan commented on July 17, 2024

Hm again a twist: So with the connect timeout no longer present on the connection itself I have to set it on initialization.

However for this part I was using my own assembler-like setup which sets up a factory for RDFConnection objects from an RDF document. Back then I didn't realize that Jena's assembler system is generic so it can create any java object from RDF.

In my case, this allows for linking RDF benchmark results back to the connection setup and the involved thresholds.

Rather then extending my own system I'd rather integrate with Jena assemblers.
I think there is not yet an assembler for RDFConnection instances - is there?

So a rought draft would look like

#conn.ttl
<#httpClient> a ja:HttpClient ;
  ja:connectTimeout 10 . # seconds

<#conn> rdf:type ja:HttpRDFConnection;
     ja:destination <http://dbpedia.org/sparql> ;
     ja:httpClient <#httpClient> .

Followed by

Resource spec = ModelFactory.load("conn.ttl").getResource("#conn");
RDFConnection conn = RDFConnectionFactory.assemble(spec);

However, instead of returning only a single connection it may be better to create a factory for connections:

/** A factory/supplier of RDFConnection instances. Similar to a JDBC DataSource. */
public interface RDFDataSource
{
    RDFConnection getConnection();
}

RDFDataSourceFactory.assemble(...);

Would this be a reasonable approach? And would it make sense to add this functionality to Jena itself at some point?
(From my side it may be some time until I'd get around to do that because I'll try to resolve the other open issues first - so I am just asking for a perspective here)

from jena.

afs avatar afs commented on July 17, 2024

Reasonable yes. I use assembler to make things other than datasets quite often.

https://github.com/Telicent-io/jena-fuseki-kafka/blob/main/ex-connector.ttl

is a connector between Kafka and Jena.

Rather then extending my own system I'd rather integrate with Jena assemblers.

There is no need to be in Jena, nor use ja: just as there no need for vocabularies to be in Jena.Any code can register them. AssemblerUtils. This also means you can develop, try out and change based on your usage before it gets "frozen" by putting it in Jena.

Is it something general other users would want? From what you describe (RDFDataSource), it is part of a bigger picture.

Aside: I've been toying with making more of assembling at the graph level.

https://github.com/Telicent-io/jena-fuseki-kafka/blob/main/jena-kafka-connector/src/main/java/org/apache/jena/kafka/KafkaConnectorAssembler.java#L112

from jena.

afs avatar afs commented on July 17, 2024

If this issue is finished, could you close it please? It is very helpful to wrap things up - it helps track the state of the project.

We have the "discussions" area on GH for longer running and open ended discussions.

"Issues" are ideally actionable. There will more of them so longer discussions that involve general users may can get a little lost if mixed into a big "issues" pool.

from jena.

Aklakan avatar Aklakan commented on July 17, 2024

Is it something general other users would want?

Well it's the analogy to JDBC, where DataSource is the factory for connection instances.
Based on DataSource, there are systems such as Hikari that implement connection pooling, automatic stale connection detection and replay of statements that failed due to connection loss.
The pooling aspect is particularly nice, because it enables components to frequently acquire short-lived connections (OLTP).
Closing those connections make them go back into the local pool without actually closing the physical connection to the database.

Also in the LSQ system, I use the RDF-variant of that interface for connection recovery: If it turns out that a connection was lost (e.g. due to a system-under-test actually crashing on a query) I can call getConnection() in a loop in an attempt to reconnect (e.g. while the docker container restarts). using whatever settings were in place when the DataSource was created.

In the bigger picture, Spring has the EmbeddedDatabase interface which extends DataSource with a shutdown() method (Nowadays I'd prefer AutoClosable.close()). This approach allows for abstracting e.g. a local Fuseki and a Virtuoso instances as RDFDataSources/RDFDatabases). The close() method could then e.g. shutdown the Virtuoso docker container.

So this pattern with an (RDF)DataSource is not my invention - but I noticed it's not part of the RDFConnection API.
Conversely, having the interfaces in Jena might make it easier to provide interoperable implementations outside of Jena because applications could rely on those interfaces.

If you want more evidence I could also eventually write to the Jena dev mailing list about it.

This also means you can develop, try out and change based on your usage before it gets "frozen" by putting it in Jena.

Yes, sure, I don't mean to 'spam' Jena with my custom components. But I think an assembler for the already Jena native class RDFConnection+HttpClient (maybe later also RDFDataSource) would be nice. But for now I'll go ahead as you suggest with making the changes in my own code.

I leave the issue open in case you want to leave a comment on the RDFDataSource, other than that you can close it.

from jena.

afs avatar afs commented on July 17, 2024

I confused by whether you are arguing for RDFDataSource or the assembler.

If you are proposing RDFDataSource with all the sophistication you describe (caveat - inevitable overlaps with java.net.http.HttpClient), then great (at the RDFLink level).

An assembler on its own seems isolated and can wait until RDFDataSource.

I could also eventually write to the Jena dev mailing list about it.

That's a good idea.

And/or start a discussion.

from jena.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.