searchbox-io / jest
Elasticsearch Java Rest Client.
License: Apache License 2.0
Currently, I have to implement my own JSON parser to handle the Elasticsearch JSON because I want to use the _id as the value of my object's id field. For me, it's the easiest way to get an auto-increment id from Elasticsearch that I can use as an identifier for my document.
The other way is:
Here is a good example of doing so, using _version as an auto-increment: http://blogs.perl.org/users/clinton_gormley/2011/10/elasticsearchsequence---a-blazing-fast-ticket-server.html
log4j.xml is present in src/main/resources, so it overrides the log4j.properties of projects using Jest.
Shouldn't it be placed in src/test/resources?
Regards
When I do a bulk request, it seems the JSON is serialized twice. The JSON sent to the server contains \" and Elasticsearch cannot parse it.
Good luck
In the method JestResult.execute of version 0.0.1, I made a patch to support HEAD requests:
} else if (methodName.equalsIgnoreCase("HEAD")) {
    HttpHead httpHead = new HttpHead(elasticSearchRestUrl);
    log.debug("HEAD method created based on client request");
    response = httpClient.execute(httpHead);
    // HEAD returns only a status code, so a synthetic JSON body is attached
    if (response.getStatusLine().getStatusCode() == HttpStatus.SC_OK) {
        response.setEntity(new StringEntity("{\"ok\" : true, \"found\" : true}"));
    } else if (response.getStatusLine().getStatusCode() == HttpStatus.SC_NOT_FOUND) {
        response.setEntity(new StringEntity("{\"ok\" : false, \"found\" : false}"));
    }
}
This is a workaround because a HEAD request returns only a status code and no content:
http://www.elasticsearch.org/guide/reference/api/admin-indices-indices-exists.html
However, the current version on master does not allow me to propose a pull request, so I am putting the code directly in this issue.
How can I get highlights in the JestResult returned by jestClient.execute(search)?
CreateIndex does not support creating a mapping at the same time.
I want to execute several index actions in a bulk action.
Each index action contains data with some text fields that include whitespace.
The problem is that data indexed with a bulk action loses all whitespace.
Single index action works fine.
Here is piece of my code:
public void indexAll(List<EntityIndexData> entityIndexDataList) throws SearchRequestFailedException {
    Bulk.Builder bulkBuilder = new Bulk.Builder();
    for (EntityIndexData entityIndexData : entityIndexDataList) {
        bulkBuilder.addAction(prepareIndex(entityIndexData));
    }
    this.client.execute(bulkBuilder.build());
}

private Index prepareIndex(EntityIndexData entityIndexData) {
    return new Index.Builder(entityIndexData.getIndexContent())
            .index(INDEX_NAME)
            .type(entityIndexData.getEntityType())
            .id(entityIndexData.getEntityId())
            .build();
}
Please let me know if something is wrong in my code and suggest a way to fix this problem.
Thank you in advance.
The following code in JestHttpClient.java:
private JestResult deserializeResponse(HttpResponse response, Action clientRequest) throws IOException {
return createNewElasticSearchResult(EntityUtils.toString(response.getEntity()), response.getStatusLine(), clientRequest);
}
will throw an error if the response does not have an entity (e.g. any 40x response code). A simple way to generate a 404 response from Elasticsearch is to submit an IndicesExists request for a non-existent index. Logically, this would also affect a 401 Unauthorized response from an Elasticsearch instance with an authentication plugin enabled.
Additionally, when the request is executed in async() mode, the failed() callback method is never fired, because the complete() handler doesn't call the failed handler on exception cases.
I think that we need:
a) any exception when dealing with the response to trigger a manual call to the "failed" method, not just a log statement.
b) useful handling for 40x/50x response codes (and perhaps, this is just point (a) above... though, obviously, the error handling would be better if it detected the lack of response entity first, rather than blindly attempting to use it and failing with an exception).
However, the correct solution for 40x response codes and/or no-entity error handling may be different.
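Points (a) and (b) can be illustrated with a JDK-only sketch. Everything in it is hypothetical: `ResultHandler` stands in for the async callback, `parse` stands in for response deserialization, and none of these names are Jest or HttpClient API. The sketch checks for a missing entity before using it and routes any exception raised while handling the response to `failed()`.

```java
public class AsyncFailureSketch {

    // Hypothetical callback interface standing in for an async result handler.
    interface ResultHandler {
        void completed(String result);
        void failed(Exception ex);
    }

    // Stand-in for response parsing. A missing entity (null body) is checked
    // first instead of being dereferenced; a non-JSON body raises an exception.
    static String parse(int statusCode, String body) {
        if (body == null) {
            return "{\"status\":" + statusCode + "}";
        }
        if (!body.startsWith("{")) {
            throw new IllegalArgumentException("Response is not a JSON object: " + body);
        }
        return body;
    }

    // Any exception raised while handling the response reaches failed(),
    // rather than only being logged.
    static void onResponse(int statusCode, String body, ResultHandler handler) {
        String result;
        try {
            result = parse(statusCode, body);
        } catch (RuntimeException e) {
            handler.failed(e);
            return;
        }
        handler.completed(result);
    }

    public static void main(String[] args) {
        ResultHandler printing = new ResultHandler() {
            @Override
            public void completed(String result) {
                System.out.println("completed: " + result);
            }

            @Override
            public void failed(Exception ex) {
                System.out.println("failed: " + ex.getMessage());
            }
        };
        onResponse(404, null, printing);                       // no entity
        onResponse(502, "<html>Bad Gateway</html>", printing); // unparsable entity
    }
}
```

Whether a 40x response without an entity should count as a completed result or as a failure is a design decision the sketch does not settle; it only shows that neither case needs to end in an unhandled exception.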
I have a list of source names and I need to build a fuzzy search from it.
Say I have
sourceList ={"Bloomberg","reuters"};
I want to get query as
{"fuzzy":{"SourceName":{"value":"Bloomberg"}}},{"fuzzy":{"SourceName":{"value":"reuters"}}}
How can I append the source names to an existing query using QueryBuilder?
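One way to read the question is that any of the names may match, which in the query DSL is a bool query with one fuzzy clause per name under "should". The JDK-only sketch below builds that JSON by hand so the target structure is visible; `FuzzyClausesSketch`, `fuzzyClause` and `anyOf` are hypothetical helpers, not part of Jest or Elasticsearch. With the Elasticsearch jar on the classpath, the equivalent would presumably be a loop adding `QueryBuilders.fuzzyQuery(...)` clauses to `QueryBuilders.boolQuery()` via `should(...)`.

```java
import java.util.Arrays;
import java.util.List;

public class FuzzyClausesSketch {

    // Quotes a JSON string, escaping only backslashes and double quotes
    // (enough for plain names; not a general JSON encoder).
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // One fuzzy clause in the shape shown in the question.
    static String fuzzyClause(String field, String value) {
        return "{\"fuzzy\":{" + quote(field) + ":{\"value\":" + quote(value) + "}}}";
    }

    // Wraps one fuzzy clause per value in a bool/should query.
    static String anyOf(String field, List<String> values) {
        StringBuilder sb = new StringBuilder("{\"bool\":{\"should\":[");
        for (int i = 0; i < values.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(fuzzyClause(field, values.get(i)));
        }
        return sb.append("]}}").toString();
    }

    public static void main(String[] args) {
        List<String> sourceList = Arrays.asList("Bloomberg", "reuters");
        System.out.println(anyOf("SourceName", sourceList));
    }
}
```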
It would be great if you could add a setRouting method to add the routing parameter for the Index and Search Actions objects.
The default implementations of HttpClient and HttpAsyncClient do not check the system proxy settings, i.e. http.proxyHost and https.proxyHost.
Jest may be used in environments where a proxy connection is required.
An approach that worked in my environment (where http.proxyHost / https.proxyHost are set and must be used) is below. However, I'm not certain it is the approach you want (e.g. versus manual proxy configuration), and I'm not certain that the implementation details of ProxySelectorAsyncRoutePlanner are entirely correct, so I haven't raised a pull request for this change.
Change HttpClient as follows, around line 72 of JestClientFactory.java:
// add proxy-aware route planner
ProxySelectorRoutePlanner routePlanner = new ProxySelectorRoutePlanner(
httpclient.getConnectionManager().getSchemeRegistry(),
ProxySelector.getDefault());
httpclient.setRoutePlanner(routePlanner);
client.setHttpClient(httpclient);
(where client is changed to be a DefaultHttpClient object, not a HttpClient object).
However, this solution does not seem to work for the HttpAsyncClient. I was able to make the HttpAsyncClient work via the following code around line 80 of JestClientFactory:
// add proxy-aware async route planner
ProxySelectorAsyncRoutePlanner aSyncRoutePlanner = new ProxySelectorAsyncRoutePlanner(
asyncClient.getConnectionManager().getSchemeRegistry(),
ProxySelector.getDefault());
asyncClient.setRoutePlanner(aSyncRoutePlanner);
where ProxySelectorAsyncRoutePlanner is a class copied from:
(this represents a wrapper library to HttpClient / HttpAsyncClient which allegedly improves on a few things... proxy support by default being one of them).
Perhaps you just want to depend on that Kolich HttpClient library? Perhaps you want to adopt the above implementation? Perhaps there is a different solution.
In the readme it says "..., but Jest fills a gap, it is the missing client for ElasticSearch Http Rest interface."
After the Twitter conversation, the security use case was explained to me and I started to understand what gap Jest closes.
I suggest adding these use cases to the readme to make it clearer where you would use Jest rather than the native API.
We should let users decide whether to use es.jar, since it is only used for QueryBuilder.
How can I specify the date format so that ISO datetime is used?
I have the following serializer and deserializer for java.util.Dates:
static JsonSerializer<Date> ser = new JsonSerializer<Date>() {
    @Override
    public JsonElement serialize(Date src, Type typeOfSrc,
            JsonSerializationContext context) {
        return src == null ? null : new JsonPrimitive(src.getTime());
    }
};

static JsonDeserializer<Date> deser = new JsonDeserializer<Date>() {
    @Override
    public Date deserialize(JsonElement json, Type typeOfT,
            JsonDeserializationContext context) throws JsonParseException {
        return json == null ? null : new Date(json.getAsLong());
    }
};
I then associate it with my Jest client as such:
Gson gson = new GsonBuilder()
.registerTypeAdapter(Date.class, ser)
.registerTypeAdapter(Date.class, deser).create();
ClientConfig clientConfig = new Builder(serverURIs).gson(gson).multiThreaded(true).build();
Indexing happens correctly (Dates are stored as milliseconds since epoch). Search queries, however, do not produce milliseconds using my serializer/deserializer. I have the following search query, where 'from' and 'to' are java.util.Dates:
ssb.query(QueryBuilders.rangeQuery("timestamp").from(from).to(to));
It generates the following query:
"range" : {
"timestamp" : {
"from" : "2013-07-01T22:46:23.286Z",
"to" : "2013-08-27T22:46:23.286Z",
"include_lower" : true,
"include_upper" : true
}
}
I expected the query to render 'from' and 'to' as milliseconds, as shown below.
"from" : 1372718783286,
"to" : 1377643583286,
Am I doing this incorrectly, or is this feature not implemented?
I also saw this thread: http://stackoverflow.com/questions/7910734/gsonbuilder-setdateformat-for-2011-10-26t202959-0700 but that answer requires upgrading to Java 7, which will probably not be an option for me.
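For reference, the two representations in this report denote the same instants. The JDK-only sketch below converts between epoch milliseconds and the ISO 8601 form with `SimpleDateFormat`, which is available on Java 6; `DateFormatSketch`, `toIso` and `toEpochMillis` are illustrative names. A possible workaround, offered as an assumption rather than confirmed behaviour, is to pass `from.getTime()` and `to.getTime()` to the range query, since the query builder comes from the Elasticsearch jar and does not appear to use the Gson instance registered on the Jest client.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class DateFormatSketch {

    private static final String ISO_PATTERN = "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'";

    // SimpleDateFormat is not thread-safe, so a fresh instance is built per call.
    private static SimpleDateFormat isoFormat() {
        SimpleDateFormat format = new SimpleDateFormat(ISO_PATTERN, Locale.US);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return format;
    }

    // Epoch milliseconds to an ISO 8601 UTC timestamp.
    static String toIso(long epochMillis) {
        return isoFormat().format(new Date(epochMillis));
    }

    // ISO 8601 UTC timestamp back to epoch milliseconds.
    static long toEpochMillis(String iso) {
        try {
            return isoFormat().parse(iso).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not an ISO 8601 UTC timestamp: " + iso, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toIso(1372718783286L));                     // 2013-07-01T22:46:23.286Z
        System.out.println(toEpochMillis("2013-08-27T22:46:23.286Z")); // 1377643583286
    }
}
```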
It does not appear that there is a way to specify a parent when bulk indexing.
When building the bulk request, the builder could look for a parent parameter on the index request.
When I press the button for CreateArticles, I get an HttpHostConnectException. I have also changed the connectionUrl to http://localhost:9200. Can you please tell me what I am doing wrong?
org.apache.http.conn.HttpHostConnectException: Connection to http://localhost:9200 refused
at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:190)
at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294)
at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:645)
at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:480)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
at io.searchbox.client.http.JestHttpClient.execute(JestHttpClient.java:59)
at io.searchbox.jest.sample.service.SearchService.indexSampleArticles(SearchService.java:50)
Could you please upload version 0.0.4 to the Maven Central repository?
http://search.maven.org/#search|ga|1|g%3A%22io.searchbox%22
Thanks in advance.
The code in Bulk creates a new Gson object for every Bulk action. This causes an issue where the JSON documents sent by Bulk are not the same as sent by individual Index actions where custom serialization is desired.
A getter and a setter are provided for the Gson attribute, but the data is already serialized in the constructor, so if you want to change the serialization, you have to do so twice.
Providing a mechanism to set the Gson object in the builder would be useful so it could use the same serializers across all actions.
When an Index action is built with a String source, the generateBulkPayload function of Bulk deletes all whitespace from the source:
// line 95 in Bulk.java
if (source instanceof String) {
return StringUtils.deleteWhitespace((String) source);
}
If the source is a JSON string like {"text": "a b c"}, the whitespace inside the value is deleted as well.
Should I avoid building an Index action with a String source?
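The effect can be reproduced without Jest. In the JDK-only sketch below, `deleteAllWhitespace` mimics what `StringUtils.deleteWhitespace` does, and the hypothetical helper `stripWhitespaceOutsideStrings` removes whitespace only outside JSON string literals, assuming the goal is just to keep each bulk document on a single line. Neither method is part of Jest.

```java
public class BulkWhitespaceSketch {

    // Mimics StringUtils.deleteWhitespace: removes every whitespace character,
    // including those inside JSON string values.
    static String deleteAllWhitespace(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (!Character.isWhitespace(c)) {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Hypothetical alternative: drops whitespace between JSON tokens but
    // copies string literals (including escaped quotes) unchanged.
    static String stripWhitespaceOutsideStrings(String json) {
        StringBuilder out = new StringBuilder(json.length());
        boolean inString = false;
        boolean escaped = false;
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                out.append(c);
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
                out.append(c);
            } else if (!Character.isWhitespace(c)) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String source = "{\"text\": \"a b c\"}";
        System.out.println(deleteAllWhitespace(source));           // {"text":"abc"}
        System.out.println(stripWhitespaceOutsideStrings(source)); // {"text":"a b c"}
    }
}
```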
It doesn't seem to be already implemented. It would be very useful to me.
Thanks,
Riccardo
When I try to compile the project with mvn compile, I get this:
Try downloading the file manually from the project website.
Then, install it using the command:
mvn install:install-file -DgroupId=fr.tlrx.elasticsearch -DartifactId=elasticsearch-test -Dversion=0.0.3 -Dpackaging=jar -Dfile=/path/to/file
Alternatively, if you host your own repository you can deploy the file there:
mvn deploy:deploy-file -DgroupId=fr.tlrx.elasticsearch -DartifactId=elasticsearch-test -Dversion=0.0.3 -Dpackaging=jar -Dfile=/path/to/file -Durl=[url] -DrepositoryId=[id]
Path to dependency:
1) io.searchbox:JEST:jar:1.0-SNAPSHOT
2) fr.tlrx.elasticsearch:elasticsearch-test:jar:0.0.3
Where can I find the missing jar so that I can install it locally?
Why not use Jackson for JSON to improve performance?
Test with code like:
client.shutdownClient();
System.out.println("terminated!");
The main thread exits, but the JVM is kept alive by the executor service.
Fix could be in the NodeChecker:
protected void shutDown() throws Exception {
    super.shutDown();
    client = null;
    clientConfig = null;
    executorService.shutdown();
}

protected ScheduledExecutorService executor() {
    executorService = Executors.newSingleThreadScheduledExecutor();
    return executorService;
}
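The underlying behaviour can be shown with the JDK alone, without Jest or Guava: a scheduled executor owns a non-daemon worker thread, so the JVM cannot exit until `shutdown()` is called. In this minimal sketch the class name, the empty task and the 10 ms period are illustrative assumptions.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ExecutorShutdownSketch {

    // Starts a periodic task, shuts the executor down, and reports whether
    // its worker thread terminated within the timeout.
    static boolean startAndShutDown() {
        ScheduledExecutorService executorService = Executors.newSingleThreadScheduledExecutor();
        executorService.scheduleAtFixedRate(new Runnable() {
            @Override
            public void run() {
                // stand-in for the periodic node check
            }
        }, 0, 10, TimeUnit.MILLISECONDS);

        // Without this call the non-daemon worker thread keeps the JVM alive.
        executorService.shutdown();
        try {
            return executorService.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("terminated: " + startAndShutDown());
    }
}
```

Removing the `shutdown()` call from the sketch and running `main` reproduces the hang described above: the message is never printed as `true` and the process does not exit on its own.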
I am trying to get the search to return fewer than the 10 results that Elasticsearch returns by default, and from what I understand the way to do it is
search.addParameter("size", "5");
The method is implemented in the AbstractAction class, but the Search class overrides AbstractAction.getURI without calling the super implementation or otherwise ensuring that the parameters from AbstractAction are taken into account.
Just for reference my sample code:
QueryBuilder queryBuilder = QueryBuilders.wildcardQuery("brief", "jazz");
Search search = new Search(Search.createQueryWithBuilder(queryBuilder.toString()));
search.addIndex("music_reviews");
search.addType("review");
search.addParameter("size", "5");
JestResult result = client.execute(search);
Hi,
I'm currently using version 0.0.2 and am facing issue #15.
Is it possible to push a new version to Maven Central? If so, how soon?
What's the side-effect of not shutting down a client?
Thanks,
Rui
I have updated the code to the latest version and this method is missing. Why? Is there an alternative?
When a JestClient is created through the JestClientFactory and isDiscoveryEnabled is true, a new NodeChecker is started immediately.
At that point, however, the HttpClient has not been set yet. This makes the task fail, and it cannot be started again.
As a direct result, the server URL list is never updated.
When trying to delete documents with DeleteByQuery, the request is sent using the POST method (see DeleteByQuery.java, line 121).
According to the Elasticsearch guide, the HTTP method has to be DELETE (http://www.elasticsearch.org/guide/reference/api/delete-by-query.html)
Another problem I found while trying DeleteByQuery is in JestHttpClient.java. In constructHttpMethod(), if the detected HTTP method is DELETE, no data is put on the request (as is done for POST, line 117).
So, if I create a DeleteByQuery("\"match_all\" : {}"), the request actually sent is something like
curl -XPOST [server]/[index]/[type]/_query -d '{ "match_all" : {}}'
which brings a response like
{
"ok" : true,
"_index" : "INDEX_NAME",
"_type" : "TYPE_NAME",
"_id" : "_query",
"_version" : 1
}
and deletes nothing.
Instead, the generated request should be something like
curl -XDELETE '[server]/[index]/[type]/_query' -d '{
"match_all" : {}}'
Which brings a response like
{
"ok" : true,
"_indices" : {
"INDEX_NAME" : {
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
}
}
}
}
This one deletes the documents.
The code in AbstractJestClient seems to have some problem under high load.
Multiple calls to getElasticSearchServer on different threads gives the following stack trace :
Caused by: java.util.NoSuchElementException
at java.util.LinkedHashMap$LinkedHashIterator.nextEntry(LinkedHashMap.java:392)
at java.util.LinkedHashMap$KeyIterator.next(LinkedHashMap.java:401)
at com.google.common.collect.Iterators$5.next(Iterators.java:458)
Maybe I'm missing something ?
Are there plans to include HTTPS support ?
When JestHttpClient is used in a multi-threaded environment, an NPE is thrown by the getElasticSearchServer method, which uses a non-thread-safe iterator provided by Guava.
The simple solution is to add the synchronized keyword to the method declaration.
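As an alternative to synchronizing the method, the rotation can avoid a shared iterator altogether. The following is a minimal sketch, assuming the server list is fixed when the object is created; `RoundRobinServers` and `next()` are hypothetical names, not Jest API, and a real client would also need to handle the list being replaced by node discovery.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class RoundRobinServers {

    private final List<String> servers;
    private final AtomicInteger counter = new AtomicInteger();

    public RoundRobinServers(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("At least one server is required");
        }
        // Defensive copy: the list can no longer change under concurrent readers.
        this.servers = Collections.unmodifiableList(new ArrayList<String>(servers));
    }

    // Safe to call from many threads: the only shared mutable state is the
    // atomic counter, so no iterator can be advanced concurrently.
    public String next() {
        // Masking the sign bit keeps the index non-negative after the counter overflows.
        int index = (counter.getAndIncrement() & Integer.MAX_VALUE) % servers.size();
        return servers.get(index);
    }

    public static void main(String[] args) {
        RoundRobinServers rr = new RoundRobinServers(
                Arrays.asList("http://localhost:9200", "http://localhost:9201"));
        System.out.println(rr.next());
        System.out.println(rr.next());
        System.out.println(rr.next());
    }
}
```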
Hi, I need to pass basic HTTP authentication information (username, password) but I'm having trouble. I can't find any mention of authentication in the Jest source or in the Jest sample service. And I don't see anything in JestClientFactory I can use to specify the auth information.
I'm using version 0.0.2 of Jest because I was having some issues with 0.0.3, but if switching would make authentication easier I'm happy to.
Currently I'm setting a Map as the client config and it seems like I should be able to add a "basic_auth" Map with "username" and "password" keys, but that's not working.
Some example code is here: https://gist.github.com/colinpollock/5871423
Thanks for any help,
Colin
Using ES 0.19.4 and the latest version of Jest.
When using Search search = new Search(query,sort);, if sort is an empty array I get this error:
{"error":"SearchPhaseExecutionException[Failed to execute phase [query], total failure; shardFailures {[-IKqdhmER6-b2NH7NtO_cA][INDEXNAME][0]: RemoteTransportException[[node2][inet[/192.168.1.111:9301]][search/phase/query]]; nested: SearchParseException[[INDEXNAME][0]: from[-1],size[-1]: Parse Failure [Failed to parse source [QUERY]]]; nested: JsonParseException[Unexpected character ('{' (code 123)): was expecting either valid name character (for unquoted name) or double-quote (for quoted) to start field name\n at [Source: [B@3b31498d; line: 2, column: 2]]; }{[6-BtUm_CSzO_mww6k9XpNw][INDEXNAME][4]: SearchParseException[[INDEXNAME][4]: from[-1],size[-1]: Parse Failure [Failed to parse source [QUERY]]]; nested: JsonParseException[Unexpected character ('{' (code 123)): was expecting either valid name character (for unquoted name) or double-quote (for quoted) to start field name\n at [Source: [B@6173b5be; line: 2, column: 2]]; }]","status":500}
When I go back to
Search search = new Search(query) it works.
Does anyone know why? Might it be a version issue?
I created a ClientConfig like this
new ClientConfig.Builder(Arrays.asList("http://localhost:9200", "http://localhost:9201"))
        .multiThreaded(true)
        .discoveryEnabled(true)
        .discoveryFrequency(1, TimeUnit.MILLISECONDS)
        .build();
When I turned off the first node, I got this exception: Caused by: java.net.ConnectException: Connection refused
When I turned off the second node, everything worked fine.
Any help would be great.
Thanks,
How to reproduce:
Result:
NullPointerException
Instead of just a constructor we should implement these via builder pattern.
Docids can contain chars that have meaning in a url, e.g. a docid might look like blablat/bladibla. The slash will cause a 400 error if the id is not properly url encoded before doing a request.
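A minimal JDK-only sketch of encoding an id before it is placed in the request path. `DocIdEncodingSketch`, `encodePathSegment` and `documentUri` are hypothetical helpers, and the index and type names are made up for the example; this is not how Jest builds its URIs.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class DocIdEncodingSketch {

    // Percent-encodes a document id so it can be used as one URL path segment.
    static String encodePathSegment(String id) {
        try {
            // URLEncoder targets form encoding and writes spaces as '+';
            // in a URL path a space has to be %20 instead. A literal '+'
            // in the id is already encoded as %2B at this point.
            return URLEncoder.encode(id, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    static String documentUri(String index, String type, String id) {
        return "/" + index + "/" + type + "/" + encodePathSegment(id);
    }

    public static void main(String[] args) {
        // prints /articles/article/blablat%2Fbladibla
        System.out.println(documentUri("articles", "article", "blablat/bladibla"));
    }
}
```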
Jest should ask the ES nodes it knows about for all other nodes so it can use them.
The isOperationSucceed check needs to be improved; every request should implement its own.
Is there a way to do a delete by query?
http://www.elasticsearch.org/guide/reference/api/delete-by-query.html
Hi,
I am using Jest under Storm.
After I integrated Jest into the project, only Jest's output appeared on the console.
Perhaps Storm's console appender is overridden by Jest.
Could you give me some hint on how to fix this problem?
Thanks!
Rui
With a search like
{
"query" : { "query_string" : {"query" : "T*"} },
"facets" : {
"tags" : { "terms" : {"field" : "tags"} }
}
}
JestResult result = client.execute(search);
Rather than parsing the jsonObject inside, it would be easier to have support for a SearchResult, similar to this
Hi,
When indexing my documents everything seems to be fine, however I see some errors in the logs and I would like to know if maybe that's something I'm doing wrong on my end.
logs:
...
23:26:19.970::i.s.AbstractAction::DEBUG::Could not retrieve 'routing' parameter from action.
java.lang.NullPointerException: null
at io.searchbox.core.Bulk.generateBulkPayload(Bulk.java:71) [jest-0.0.4.jar:na]
at io.searchbox.core.Bulk.<init>(Bulk.java:28) [jest-0.0.4.jar:na]
at io.searchbox.core.Bulk$Builder.build(Bulk.java:145) [jest-0.0.4.jar:na]
...
23:26:19.970::i.s.AbstractAction::DEBUG::Could not retrieve 'percolate' parameter from action.
java.lang.NullPointerException: null
at io.searchbox.core.Bulk.generateBulkPayload(Bulk.java:71) [jest-0.0.4.jar:na]
at io.searchbox.core.Bulk.<init>(Bulk.java:28) [jest-0.0.4.jar:na]
at io.searchbox.core.Bulk$Builder.build(Bulk.java:145) [jest-0.0.4.jar:na]
...
23:26:19.970::i.s.AbstractAction::DEBUG::Could not retrieve 'parent' parameter from action.
java.lang.NullPointerException: null
at io.searchbox.core.Bulk.generateBulkPayload(Bulk.java:71) [jest-0.0.4.jar:na]
at io.searchbox.core.Bulk.<init>(Bulk.java:28) [jest-0.0.4.jar:na]
at io.searchbox.core.Bulk$Builder.build(Bulk.java:145) [jest-0.0.4.jar:na]
...
23:26:19.970::i.s.AbstractAction::DEBUG::Could not retrieve 'timestamp' parameter from action.
java.lang.NullPointerException: null
at io.searchbox.core.Bulk.generateBulkPayload(Bulk.java:71) [jest-0.0.4.jar:na]
at io.searchbox.core.Bulk.<init>(Bulk.java:28) [jest-0.0.4.jar:na]
at io.searchbox.core.Bulk$Builder.build(Bulk.java:145) [jest-0.0.4.jar:na]
...
23:26:19.971::i.s.AbstractAction::DEBUG::Could not retrieve 'ttl' parameter from action.
java.lang.NullPointerException: null
at io.searchbox.core.Bulk.generateBulkPayload(Bulk.java:71) [jest-0.0.4.jar:na]
at io.searchbox.core.Bulk.<init>(Bulk.java:28) [jest-0.0.4.jar:na]
at io.searchbox.core.Bulk$Builder.build(Bulk.java:145) [jest-0.0.4.jar:na]
...
23:26:19.971::i.s.AbstractAction::DEBUG::Could not retrieve 'retry_on_conflict' parameter from action.
java.lang.NullPointerException: null
at io.searchbox.core.Bulk.generateBulkPayload(Bulk.java:71) [jest-0.0.4.jar:na]
at io.searchbox.core.Bulk.<init>(Bulk.java:28) [jest-0.0.4.jar:na]
at io.searchbox.core.Bulk$Builder.build(Bulk.java:145) [jest-0.0.4.jar:na]
...
23:26:19.971::i.s.AbstractAction::DEBUG::Could not retrieve 'version' parameter from action.
java.lang.NullPointerException: null
at io.searchbox.core.Bulk.generateBulkPayload(Bulk.java:71) [jest-0.0.4.jar:na]
at io.searchbox.core.Bulk.<init>(Bulk.java:28) [jest-0.0.4.jar:na]
at io.searchbox.core.Bulk$Builder.build(Bulk.java:145) [jest-0.0.4.jar:na]
...
23:26:19.971::i.s.AbstractAction::DEBUG::Could not retrieve 'version_type' parameter from action.
java.lang.NullPointerException: null
at io.searchbox.core.Bulk.generateBulkPayload(Bulk.java:71) [jest-0.0.4.jar:na]
at io.searchbox.core.Bulk.<init>(Bulk.java:28) [jest-0.0.4.jar:na]
at io.searchbox.core.Bulk$Builder.build(Bulk.java:145) [jest-0.0.4.jar:na]
Any ideas?
Hello, first of all thank you for developing a successful Java REST client for Elasticsearch. I believe yours is the first library developed in this area for Java. Let me get straight to the issue. We could not determine whether the source of the problem is the version of the httpclient-4.2.2.jar library we are using or a bug in the library code. The exception we are getting is below:
java.lang.IllegalStateException: Invalid use of BasicClientConnManager: connection still allocated.
Make sure to release the connection before allocating another one.
at org.apache.http.impl.conn.BasicClientConnectionManager.getConnection(BasicClientConnectionManager.java:162)
at org.apache.http.impl.conn.BasicClientConnectionManager$1.getConnection(BasicClientConnectionManager.java:139)
at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:455)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
at io.searchbox.client.http.JestHttpClient.execute(JestHttpClient.java:55)
at com.test1.elasticsearch.Search(ElasticSearch.java:114)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
Note: We get this error frequently when performing heavy concurrent search operations.
Thank you. I wish you continued success.
0.90 will introduce a new suggest API; we need to handle its result and query.
Implement io.searchbox.cluster.Nodes* actions