
elasticsearch-client's Introduction


Elasticsearch Client

The Sumo Logic Elasticsearch library provides Elasticsearch bindings with a Scala DSL. Unlike other Scala libraries such as elastic4s, this library targets the REST API. The REST API has two primary advantages:

  1. Ability to upgrade Elasticsearch without the need to atomically also upgrade the client.
  2. Ability to use hosted Elasticsearch such as the version provided by AWS.

This project is currently targeted at Elasticsearch 6.0.x. For ES 2.3 compatibility see version 3 (release-3.* branch).

Along with a basic Elasticsearch client (elasticsearch-core), helper functionality for using Elasticsearch with Akka (elasticsearch-akka) and AWS (elasticsearch-aws) is also provided. The goal of the DSL is to keep it as simple as possible, occasionally sacrificing some end-user boilerplate to maintain a DSL that is easy to modify and extend. The DSL attempts to be type-safe in that it should be impossible to create an invalid Elasticsearch query. Rather than being as compact as possible, the DSL aims to closely reflect the JSON it generates when reasonable. This makes it easier to discover how to access functionality than a traditional maximally compact DSL.

Install / Download

The library components are offered a la carte:

  • elasticsearch-core_2.11 contains the basic Elasticsearch client and typesafe DSL
  • elasticsearch-aws_2.11 contains utilities for using AWS Hosted Elasticsearch.
  • elasticsearch-akka_2.11 contains Actors to use with Akka & Akka Streams
    <dependency>
      <groupId>com.sumologic.elasticsearch</groupId>
      <artifactId>elasticsearch-core_2.11</artifactId>
      <version>6.0.0</version>
    </dependency>

    <dependency>
      <groupId>com.sumologic.elasticsearch</groupId>
      <artifactId>elasticsearch-aws_2.11</artifactId>
      <version>6.0.0</version>
    </dependency>

    <dependency>
      <groupId>com.sumologic.elasticsearch</groupId>
      <artifactId>elasticsearch-akka_2.11</artifactId>
      <version>6.0.0</version>
    </dependency>

Usage

All client methods return futures that can be composed to perform multiple actions.

Basic Usage

val restClient = new RestlasticSearchClient(new StaticEndpoint(new Endpoint(host, port)))
val index = Index("index-name")
val tpe = Type("type")
val indexFuture = for {
  _ <- restClient.createIndex(index)
  indexResult <- restClient.index(index, tpe, Document("docId", Map("text" -> "Hello World!")))
} yield indexResult
Await.result(indexFuture, 10.seconds)
// Need to wait for a flush. In ElasticsearchIntegrationTest, you can just call "refresh()"
Thread.sleep(2000)
restClient.query(index, tpe, QueryRoot(TermQuery("text", "Hello World!"))).map { res =>
  println(res.sourceAsMap) // List(Map(text -> Hello World!))
}

https://github.com/SumoLogic/elasticsearch-client/blob/master/elasticsearch-core/src/test/scala/com/sumologic/elasticsearch/restlastic/RestlasticSearchClientTest.scala provides other basic examples.

Using the BulkIndexerActor

NOTE: to use the BulkIndexerActor you must add a dependency to elasticsearch-akka.

The BulkIndexer actor provides a single-document style interface that actually delegates to the bulk API. This allows you to keep your code simple while still getting the performance benefits of bulk inserts and updates. The BulkIndexerActor has two configurable parameters:

  • maxDocuments: The number of documents at which a bulk request will be flushed
  • flushDuration: If the maxDocuments limit is not hit, it will flush after flushDuration.
val restClient: RestlasticSearchClient = ...
val (index, tpe) = (Index("i"), Type("t"))
// Designed for potentially dynamic configuration:
val config = new BulkConfig(
  flushDuration = () => FiniteDuration(2, TimeUnit.SECONDS),
  maxDocuments = () => 2000)
val bulkActor = context.actorOf(BulkIndexerActor.props(restClient, config))
val sess = BulkSession.create()
val resultFuture = bulkActor ? CreateRequest(sess, index, tpe, Document("id", Map("k" -> "v")))
val resultFuture2 = bulkActor ? CreateRequest(sess, index, tpe, Document("id", Map("k" -> "v")))
// The reply contains a `sessionId` you can use to match up replies with requests if you do not use
// the ask pattern as above.
// The two requests will be batched into a single bulk request and sent to Elasticsearch.

You can also use the Bulk api directly via the REST client:

restClient.bulkIndex(index, tpe, Seq(doc1, doc2, doc3))

Usage With AWS

One common way to configure AWS Elasticsearch is with IAM roles. This requires you to sign every request you send to Elasticsearch with your user's access key. The elasticsearch-aws module includes a request signer for this purpose:

import com.sumologic.elasticsearch.util.AwsRequestSigner
import com.amazonaws.auth.AWSCredentials
val awsCredentials: AWSCredentials = ??? // Credentials for the AWS user that has permissions to access Elasticsearch
val signer = new AwsRequestSigner(awsCredentials, "REGION", "es")
// You can also create your own dynamic endpoint class based off runtime configuration or the AWS API.
val endpoint = new StaticEndpoint(new Endpoint("es.blahblahblah.amazon.com", 443))
val restClient = new RestlasticSearchClient(endpoint, Some(signer))

restClient will now sign every request automatically with your AWS credentials.

Contributing

Sumo Logic Elasticsearch uses Maven and the Maven GPG Plug-in for builds and testing. After cloning the repository, make sure you have a GPG key created. Then run mvn clean install.

[Dev] Building

To build project in default Scala version:

./gradlew build

To build project in any supported Scala version:

./gradlew build -PscalaVersion=2.12.8

[Dev] Testing

Tests in this project are run against local Elasticsearch servers (es23 and es63).

For testing, change your consumer pom.xml or gradle.properties to depend on the generated SNAPSHOT version. Make sure your consumer can resolve artifacts from a local repository.

[Dev] Managing Scala versions

This project supports multiple versions of Scala. Supported versions are listed in gradle.properties.

  • supportedScalaVersions - list of supported versions (Gradle prevents building with versions from outside this list)
  • defaultScalaVersion - default version of Scala used for building - can be overridden with -PscalaVersion

[Dev] How to release new version

  1. Make sure you have all credentials - access to the Open Source vault in 1Password.
    1. Can log in as sumoapi at https://oss.sonatype.org/index.html
    2. Can import and verify the signing key (api.private.key from OpenSource vault):
      gpg --import ~/Desktop/api.private.key
      gpg-agent --daemon
      touch a
      gpg --use-agent --sign a
      gpg -k
      
    3. Have nexus and signing credentials in ~/.gradle/gradle.properties
      nexus_username=sumoapi
      nexus_password=${sumoapi_password_for_sonatype_nexus}
      signing.gnupg.executable=gpg
      signing.gnupg.keyName=${id_of_imported_sumoapi_key}
      signing.gnupg.passphrase=${password_for_imported_sumoapi_key}
      
  2. Remove -SNAPSHOT suffix from version in build.gradle
  3. Make a release branch with Scala version and project version, ex. elasticsearch-client-7.1.10:
    export RELEASE_VERSION=elasticsearch-client-7.1.10
    git checkout -b ${RELEASE_VERSION}
    git add build.gradle
    git commit -m "[release] ${RELEASE_VERSION}"
    
  4. Perform a release in selected Scala versions (make sure both commands pass without any errors, otherwise go to the link below, drop created repo(s) and try again):
    ./gradlew build publish -PscalaVersion=2.11.12
    ./gradlew build publish -PscalaVersion=2.12.8
    
  5. Go to https://oss.sonatype.org/index.html#stagingRepositories, search for com.sumologic, then close and release your repo (there should only be one). NOTE: If you had to log in, reload the URL; it doesn't take you to the right page post-login.
  6. Update the README.md and CHANGELOG.md with the new version and set upcoming snapshot version in build.gradle, ex. 7.1.11-SNAPSHOT
  7. Commit the change and push as a PR:
    git add build.gradle README.md CHANGELOG.md
    git commit -m "[release] Updating version after release ${RELEASE_VERSION}"
    git push
    

elasticsearch-client's People

Contributors

a-kramarz, agearhart, ankurkkhurana, arminsumo, ben-barnes, betenkowski, cchesumo, cddude229, claymccoy, cowa, davidandrzej, duchatran, fdaca, gitter-badger, hatutah, kasia-macioszek, kumar-avijit, mfatyga-sumo, mfiglus-sumo, michaljmatusiak92, mkolodziejskisl, mvaal, ogrodnek, piotr-sumo, rcoh, seanpquig, tgimis, tomlous, wojciechatsumo


elasticsearch-client's Issues

Add supported ES version to readme

You should include the supported Elasticsearch version numbers in the readme file. I spent too much time on this only to find it doesn't support the version of ES I'm using.

I'm on 5.2; after writing some code and running it, I'm getting Received illegal response: The server-side HTTP version is not supported.

Missing nested query DSL support

Hi,

We started using your library (replacing an internal one) for the Elasticsearch 2.3 and AWS support. We are really happy so far! 👍

I didn't find any DSL to create nested queries (Nested query allows to query nested objects / docs).
More information is available on the elasticsearch doc.

I will create a pull request for the NestedQuery support in a few minutes.
Please, tell me what you think!

I'll create more issues/PRs in the following days for missing functionality that we need, if you are interested.

Thanks!
Arnaud.

Swap out spray for Akka http

Akka HTTP is the successor to spray. With the first non-experimental release of Akka HTTP, spray has reached its end of life: http://doc.akka.io/docs/akka-http/current/scala/http/migration-guide/migration-from-spray.html We would like to swap out spray for Akka HTTP. There was an initial effort in #111.

Although all the tests passed, there seems to be a bug in the PR that was not caught. As a result, a request with the change results in the following error

com.sumologic.elasticsearch.restlastic.RestlasticSearchClient$ReturnTypes$ElasticErrorResponse: ElasticsearchError(status=403): JString({"message":"The request signature we calculated does not match the signature you provided. Check your AWS Secret Access Key and signing method. Consult the service documentation for details.\n\nThe Canonical String for this request should have been\n'GET\n/_count\n\nhost:search-cche-metrics-es-23-crilv2kbplmz3epf4ykurg3fda.us-west-1.es.amazonaws.com\nx-amz-date:20170710T043641Z\n\nhost;x-amz-date\nbaa6846b65b050d71831bb2e4cd6e6f1593902f6d82b16a6c1f9979d14cfcd12'\n\nThe String-to-Sign should have been\n'AWS4-HMAC-SHA256\n20170710T043641Z\n20170710/us-west-1/es/aws4_request\nb8a1845acaa117e0d7db6b981f53738e749b98b02bfe0140eee14faa2229afb5'\n"})
at com.sumologic.elasticsearch.restlastic.RestlasticSearchClient$$anonfun$runRawEsRequest$1.apply(RestlasticSearchClient.scala:246)
at com.sumologic.elasticsearch.restlastic.RestlasticSearchClient$$anonfun$runRawEsRequest$1.apply(RestlasticSearchClient.scala:240)

I spent half a day on it; all the changes seem legitimate and I was not able to figure out what was going wrong. Reverting back to spray fixes the problem. Since it is blocking our upgrade, I am going to keep the spray version and raise an issue here to fix the bug and re-apply the changes from #111 (cc @rcoh @seanpquig).

Make the client better support multiple different versions

Quote @davidcarltonsumo here in the 2.3 upgrade PR #126

So I think this is fine, especially since 1.6 is such an old version.

Having said that, given that the client is pure REST, it seems like we should be able to design it to be able to support multiple different versions, and that we might need to do that. I haven't looked in detail at which classes would have to move, but it feels to me like there's probably a tractable subset of the classes that we could put in packages whose names include a version number, and then we could expose a version of RestlasticSearchClient in that directory that people could use to interact with a specific version and also have a version in its current location that refers to the current version. And then, when we add support for a new version, we could just copy the files from the previous version wholesale to the package for the new version, not trying to be clever about reducing duplication or anything.

I'm not completely sure about the details, admittedly. E.g. #121 changes the state machine, so that makes me not completely confident that we'd be able to do this in a way that limits scope. (I guess the flip side is that it's also not completely obvious to me that it would be awful to copy basically everything when upgrading versions!) And it's not like there are that many serious changes between versions (at least if this plus #121 is representative of version changes), so maybe it's overkill - maybe we could just leave in support for old interfaces as well as new interfaces while using a uniform client. And, finally, there's the test issue - presumably when testing, we would have to specify a single elasticsearch version to test against, which would make it hard to detect regressions against old versions. (Hopefully we could do it in a way to make it easy enough to manually test against old versions, ideally by just temporarily editing one number in the pom, but who knows.)

Anyways, I'm fine merging this specific one, given that there hopefully aren't too many other people on 1.6 and given that we believe that the new version should work with 1.6, it just might be a little less performant. But it feels like something where we'll want to develop a strategy at some point, possibly even for the 5.1 change?

This is a very useful suggestion. We need a clear story around how to make the client better support multiple different versions, especially if we go for the 5.1 change.

deleteDocument API can fail with obscure error

new test:

    "Support deleting a doc that doesn't exist" in {
      val delFut = restClient.deleteDocument(index, tpe, new QueryRoot(TermQuery("text7", "here7")))
      Await.result(delFut, 10.seconds) // May not need Await?
    }

Failure mode:

- should Support deleting a doc that doesn't exist *** FAILED ***
  com.sumologic.elasticsearch.restlastic.RestlasticSearchClient$ReturnTypes$ElasticErrorResponse: ElasticsearchError(status=400): JString({"error":{"root_cause":[{"type":"parse_exception","reason":"Failed to derive xcontent"}],"type":"parse_exception","reason":"Failed to derive xcontent"},"status":400})
  at com.sumologic.elasticsearch.restlastic.RestlasticSearchClient$$anonfun$runRawEsRequest$1.apply(RestlasticSearchClient.scala:254)
  at com.sumologic.elasticsearch.restlastic.RestlasticSearchClient$$anonfun$runRawEsRequest$1.apply(RestlasticSearchClient.scala:249)
  at scala.util.Success$$anonfun$map$1.apply(Try.scala:237)
  at scala.util.Try$.apply(Try.scala:192)
  at scala.util.Success.map(Try.scala:237)
  at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:235)
  at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:235)
  at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
  at scala.concurrent.impl.ExecutionContextImpl$AdaptedForkJoinTask.exec(ExecutionContextImpl.scala:121)
  at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
  ...

My best guess is it's this bit:

      val documents = Await.result(query(index, tpe, deleteQuery, rawJsonStr = false), 10.seconds).rawSearchResponse.hits.hits.map(_._id)
      bulkDelete(index, tpe, documents.map(Document(_, Map()))).map(res => RawJsonResponse(res.toString))

Looks like we fetch matching documents (0 matches) and then delete them. Some googling (elastic/elasticsearch#8595 (comment)) tells me that the "Failed to derive xcontent" error occurs when the body is empty - so I think that's the issue here.

I'd correct it myself, but I'm not actually sure what to do in the case where documents is empty, since the return type is Future[RawJsonResponse].

I think this was a regression introduced in #126 btw. cc @CCheSumo
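One way out, sketched below with simplified stand-ins for the client's types (RawJsonResponse and bulkDelete here are plain stdlib models, not the real API), is to short-circuit before issuing the bulk call when the query matched nothing:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Simplified stand-ins for the client's types (not the real API).
case class RawJsonResponse(jsonStr: String)
def bulkDelete(ids: Seq[String]): Future[RawJsonResponse] =
  Future.successful(RawJsonResponse(s"""{"deleted":${ids.size}}"""))

// Guard: if the query matched no documents, return an empty result
// directly instead of sending an empty bulk body (which is what
// triggers the "Failed to derive xcontent" parse_exception).
def deleteMatched(matchedIds: Seq[String]): Future[RawJsonResponse] =
  if (matchedIds.isEmpty) Future.successful(RawJsonResponse("""{"deleted":0}"""))
  else bulkDelete(matchedIds)
```

This keeps the Future[RawJsonResponse] return type intact while avoiding the empty-body request entirely.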

Batch size is too large, size must be less than or equal to: [10000]

Hi,
Is there an example of making a request with scroll batching? When I try to retrieve all indexed data, I receive this error:

Exception in thread "main" org.scalatest.exceptions.TestFailedException: The future returned an exception of type: com.sumologic.elasticsearch.restlastic.RestlasticSearchClient$ReturnTypes$ElasticErrorResponse, with message: ElasticsearchError(status=500): JString({"error":{"root_cause":[{"type":"query_phase_execution_exception","reason":"Batch size is too large, size must be less than or equal to: [10000] but was [10029]. Scroll batch sizes cost as much memory as result windows so they are controlled by the [index.max_result_window] index level setting."}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"materials","node":"zGodcjEaSrCXFO9_SJuvzg","reason":{"type":"query_phase_execution_exception","reason":"Batch size is too large, size must be less than or equal to: [10000] but was [10029]. Scroll batch sizes cost as much memory as result windows so they are controlled by the [index.max_result_window] index level setting."}}],"caused_by":{"type":"query_phase_execution_exception","reason":"Batch size is too large, size must be less than or equal to: [10000] but was [10029]. Scroll batch sizes cost as much memory as result windows so they are controlled by the [index.max_result_window] index level setting."}},"status":500}).

Here is my code:

var resFutQuery = restClient.query(index, tpe, new QueryRoot(MatchAll))

var total = 10
whenReady(resFutQuery, timeout(Span(10, Minutes))) { res =>
  total = res.rawSearchResponse.hits.total
}

val resFutScroll = restClient.startScrollRequest(index, tpe, new QueryRoot(MatchAll, sizeOpt = Some(total)))
whenReady(resFutScroll, timeout(Span(10, Minutes))) { res =>
  res._2.rawSearchResponse.hits.hits.foreach { x =>
    // process each hit
  }
}

Thank you in advance
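The error comes from asking for all hits (10029) in a single batch, which exceeds index.max_result_window (10000 by default). The usual approach is to cap each scroll batch at the window size and iterate. A stdlib-only sketch of the batching arithmetic (the helper name is hypothetical, not part of the client):

```scala
// Hypothetical helper: split a total hit count into scroll batch sizes
// that each respect index.max_result_window (default 10000).
val maxBatch = 10000
def batches(totalHits: Int, batch: Int = maxBatch): Seq[Int] =
  (0 until totalHits by batch).map(from => math.min(batch, totalHits - from))
```

Each scroll continuation would then request at most maxBatch hits (e.g. sizeOpt = Some(maxBatch)) and use the scroll id from the previous response to fetch the next page, rather than setting sizeOpt to the full hit count.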

SearchResponse has empty jsonStr for queries with sourceFilter

Hi there,

I am using the elastic search client and it is quite nice, so thanks for your work.

I have a query that uses sourceFilter, and to my surprise the jsonStr in the SearchResponse is empty.

I found the cause

I can workaround the issue using the rawSearchResponse.

I just would like to know if there is a reason for that line to be there.

Best regards,
Jonas.

deleteDocument API doesn't delete all the matching items

Test:

    "Support deleting more than 10 docs" in {
      val insertFutures = (0 to 11).map(i => restClient.index(index, tpe, Document(s"doc$i", Map("text7" -> "here7"))))
      val ir = Future.sequence(insertFutures)
      Await.result(ir, 10.seconds)
      refresh()

      val delFut = restClient.deleteDocument(index, tpe, new QueryRoot(MatchAll))
      Await.result(delFut, 10.seconds)
      refresh()

      val count = Await.result(restClient.count(index, tpe, new QueryRoot(MatchAll)), 10.seconds)
      count should be (0)
    }

Failure mode:

- should Support deleting more than 10 docs *** FAILED ***
  2 was not equal to 0 (RestlasticSearchClientTest.scala:314)

Error Json format Parsing

Hi,

The error JSON message mapping from Elasticsearch seems incorrect.

We have the following stacktrace when a document is missing:

org.json4s.package$MappingException: No usable value for error
Do not know how to convert JObject(List((root_cause,JArray(List(JObject(List((type,JString(document_missing_exception)), (reason,JString([engine_product][37035513]: document missing)), (shard,JString(1)), (index,JString(product))))))), (type,JString(document_missing_exception)), (reason,JString([engine_product][37035513]: document missing)), (shard,JString(1)), (index,JString(product)))) into class java.lang.String
  at org.json4s.reflect.package$.fail(package.scala:93)
  at org.json4s.Extraction$ClassInstanceBuilder.org$json4s$Extraction$ClassInstanceBuilder$$buildCtorArg(Extraction.scala:509)
  at org.json4s.Extraction$ClassInstanceBuilder$$anonfun$14.apply(Extraction.scala:529)
  at org.json4s.Extraction$ClassInstanceBuilder$$anonfun$14.apply(Extraction.scala:529)
  at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
  at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
  at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
  at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
  at scala.collection.TraversableLike$class.map(TraversableLike.scala:245)
  at scala.collection.AbstractTraversable.map(Traversable.scala:104)

for the following error message:

{
    "error": {
        "root_cause": [{
            "type": "document_missing_exception",
            "reason": "[engine_product][37035513]: document missing",
            "shard": "1",
            "index": "product"
        }],
        "type": "document_missing_exception",
        "reason": "[engine_product][37035513]: document missing",
        "shard": "1",
        "index": "product"
    },
    "status": 404
}

The class ElasticErrorResponse cannot be mapped by json4s because error is not a String but an object.

case class ElasticErrorResponse(error: String, status: Int) extends Exception(s"ElasticsearchError(status=$status): $error")

We can reproduce the error with the following test:

package com.sumologic.elasticsearch.restlastic

import com.sumologic.elasticsearch.restlastic.RestlasticSearchClient.ReturnTypes.ElasticErrorResponse
import org.json4s._
import org.json4s.native.JsonMethods._
import org.scalatest.{Matchers, WordSpec}

class ElasticErrorResponseTest extends WordSpec with Matchers {
  private implicit val formats = org.json4s.DefaultFormats

  "RestlasticSearchClient" should {
    "Be able to create an index and setup index setting with keyword lowercase analyzer" in {
      val jsonTree = parse(errorDocumentMissing)
      val errorMessage = jsonTree.extract[ElasticErrorResponse]
      errorMessage should be(ElasticErrorResponse("document_missing_exception", 404))
    }
  }
  val errorDocumentMissing = """{"error":{"root_cause":[{"type":"document_missing_exception","reason":"[engine_product][37035513]: document missing","shard":"1","index":"product"}],"type":"document_missing_exception","reason":"[engine_product][37035513]: document missing","shard":"1","index":"product"},"status":404} """
}

We can have also the following error from ES:

{"message":"The security token included in the request is expired"} 

So I suppose the error JSON format changes depending on the HTTP status code.

Arnaud.

Expired credentials provider

Hi,

We have an issue with the AWS credentials.

We are instantiating the service like in the README.md:

  private val signer = new AwsRequestSigner(awsCredentials, "us-east-1", "es")
  private val endpoint = new StaticEndpoint(Endpoint(elasticSearch.getHost, elasticSearch.getPort))
  private val restClient = new RestlasticSearchClient(endpoint, Some(signer))

We are using InstanceProfileCredentialsProvider, the credentials provider implementation that loads credentials from the Amazon EC2 Instance Metadata Service.
After a few hours we get the following error message:

{"message":"The security token included in the request is expired"}

We should refresh the temporary credentials but it seems it's not possible with the current implementation of the client.

Do you have any advice about handling expiring tokens?

One option could be to instantiate a new RestlasticSearchClient for each request, but this would instantiate a new ActorSystem() each time, which is a really heavy operation.
Furthermore, I do not think instantiating an actor system inside the client is a good thing. As the Akka docs say: An ActorSystem is a heavyweight structure that will allocate 1…N Threads, so create one per logical application.
One improvement could be to remove the creation of the ActorSystem within this class and add it as a dependency, like indexExecutionCtx: ExecutionContext or searchExecutionCtx: ExecutionContext, so we can have only one ActorSystem per application.

What do you think?
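The underlying fix for the expiry is to hold the provider rather than a one-time credentials snapshot, and resolve credentials on every sign call. A stdlib-only sketch of that idea (the trait and class names here are hypothetical, not the client's API):

```scala
// Hypothetical sketch: the signer keeps a provider and asks it for
// credentials on every request, so refreshed tokens are picked up
// automatically instead of using a stale snapshot from construction time.
trait CredentialsProvider { def currentToken: String }

class RefreshingSigner(provider: CredentialsProvider) {
  def sign(request: String): String =
    s"$request [signed with ${provider.currentToken}]" // fresh lookup per call
}
```

With this shape, InstanceProfileCredentialsProvider's rotation would be transparent to the client, because nothing caches the credentials beyond a single request.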


Thanks,
Arnaud.

Can't run raw request "/_cat/indices?bytes=m"

Hello,

I'm doing a raw request to list indices. My code looks like this:

client.runRawEsRequest("", "/_cat/indices?bytes=m", GET)

I get this warn log: RestlasticSearchClient$ - Failure response: {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"No feature for name [indices?bytes=m]"}],"type":"illegal_argument_exception","reason":"No feature for name [indices?bytes=m]"},"status":400}

It works when I'm using directly the ES endpoint (through Postman for example).

It seems it's the ?bytes=m query parameter that causes the error, because this request works perfectly:

client.runRawEsRequest("", "/_cat/indices", GET)

QueryDsl QueryRoot.toJson incorrect for sort

Hello,
There appears to be a bug in your API. When giving multiple field sorts, only the last one is taken.
After debugging the issue, we found the problem in QueryDsl QueryRoot.toJson:

override def toJson: Map[String, Any] = {
  Map(_query -> query.toJson) ++
    fromOpt.map(_from -> _) ++
    sizeOpt.map(_size -> _) ++
    timeout.map(t => _timeout -> s"${t}ms") ++
    sort.map(_sort -> _.toJson) ++
    sourceFilter.map(_source -> _)
}
The sort is a Seq[Sort] rather than an Option[Seq[Sort]], so the mapping is overwriting the key when we give multiple values.

At a glance, this could be resolved by making the primary constructor of QueryRoot accept an Option[Seq[Sort]] instead of converting in the apply method.
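The overwrite is easy to reproduce with plain collections (a stdlib sketch, using tuples in place of the real Sort type):

```scala
// Two sorts mapped onto the same "sort" key: the Map keeps only the last.
val sorts = Seq("fieldA" -> "asc", "fieldB" -> "desc")
val buggy: Map[String, Any] = Map("query" -> "...") ++ sorts.map(s => "sort" -> s)
// buggy("sort") is only ("fieldB", "desc"); the first sort was silently dropped.

// Wrapping the whole sequence in an Option emits a single "sort" entry
// carrying every sort, mirroring the proposed Option[Seq[Sort]] fix.
val fixed: Map[String, Any] = Map("query" -> "...") ++
  Option(sorts).filter(_.nonEmpty).map("sort" -> _)
```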

Basic authentication over https

Can you please explain to me how I could achieve something as simple as https://<username>:<password>@<host>:<port> with this client? Is that even possible?

I've found that runRawEsRequest is worth a shot but the buildUri that it uses underneath is not ready to accept <username> and <password>.

Thanks!
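Assuming the transport let you attach custom headers per request (which, as the issue notes, may not currently be exposed by buildUri or runRawEsRequest), the header value itself is just standard HTTP Basic auth. A stdlib sketch of constructing it:

```scala
import java.util.Base64

// Standard HTTP Basic auth header value: "Basic " + base64("user:password").
def basicAuthHeader(user: String, password: String): String =
  "Basic " + Base64.getEncoder.encodeToString(s"$user:$password".getBytes("UTF-8"))
```

Whether the client can attach such an Authorization header to each request is exactly the open question in this issue.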

Refactor BulkOperation

Currently, we have one API for BulkOperation and we construct different requests based on the operation type. The API gets awkward when supplying different configurations for different bulk operations, e.g. retryOnConflictOpt and upsertOpt for update operations.

When breaking changes are allowed, we should refactor BulkOperation into a trait, and break the implementations down into BulkUpdateOperation, BulkCreateOperation, BulkDeleteOperation, etc.

Readme says "6.0.0" but this has not been published to Maven

In the readme under "Install/Download," it says you download the library as:

    <dependency>
      <groupId>com.sumologic.elasticsearch</groupId>
      <artifactId>elasticsearch-core</artifactId>
      <version>6.0.0</version>
    </dependency>

However, according to Maven the most recent version is 3.0.1. Will you either update the readme, publish 6.0.0 to Maven, or tell us where 6.0.0 has been published?

Missing multi match query DSL support

Hi,

I didn't find any DSL to create multi match queries (The multi_match query builds on the match query to allow multi-field queries).
More information is available on the elasticsearch doc.

I will create in few minutes a pull request for the MultiMatchQuery support.
Please, tell me what you think!

Thanks,
Arnaud.

Invalid security token in AWS request

I am creating a request in the following manner:

// Client creation function:
def createClient(hostAddress: String): RestlasticSearchClient = {
  val credentialsProvider = new DefaultAWSCredentialsProviderChain()
  val awsCredentials = credentialsProvider.getCredentials

  val signer = new AwsRequestSigner(awsCredentials, "us-west-2", "es")
  val endpoint = new StaticEndpoint(new Endpoint(hostAddress, 443))
  new RestlasticSearchClient(endpoint, Some(signer))
}

val client = createClient("our-es-endpoint-12345-etc.us-west-2.es.amazonaws.com")
val response = client.count(index, tpe, query)

When I execute the count query, I get an HTTP 403 response with the following message:

{"message":"The security token included in the request is invalid."}

I have a similar chunk of code running in Python, using the exact same credentials, endpoint and port, and it can access the Elasticsearch cluster without issue. It is using requests_aws4auth. Is there something I've missed? It seems there is a difference between how elasticsearch-aws and requests_aws4auth are forming the HTTP headers.

Readme says "targeted at ES 1.x"

pom.xml:

        <elasticsearch.version>2.3.5</elasticsearch.version>

README.md:

This project is currently targeted at Elasticsearch 1.x. Support for newer versions is planned but not yet built.

Entire README probably needs some TLC?

AWS security token included in the request is invalid.

Hi... I keep getting the following errors:

[info] application - Connecting to Elasticsearch using 'XXXXXXXX.eu-west-1.es.amazonaws.com' on port '443'
[debug] c.s.e.r.RestlasticSearchClient$ - Got Es response: 403 Forbidden
[warn] c.s.e.r.RestlasticSearchClient$ - Failure response: {"message":"The security token included in the request is invalid."}
[warn] c.s.e.r.RestlasticSearchClient$ - Failing request: {"query":{"term":{"_id":"123"}}}

The code is:

def signer: Option[AwsRequestSigner] = {
  host match {
    case h if h.contains("es.amazonaws.com") =>
      val credentialsProviderChain = new DefaultAWSCredentialsProviderChain()
      val region = Option(Regions.getCurrentRegion).getOrElse(Region.getRegion(Regions.EU_WEST_1))
      Some(new AwsRequestSigner(credentialsProviderChain, region.getName, "es"))
    case _ => None
  }
}

override lazy val client = {
  Logger.info(s"Connecting to Elasticsearch using '$host' on port '$port'")
  new RestlasticSearchClient(new StaticEndpoint(Endpoint(host, port)), signer)
}

upgrade to Scala 2.12

Code that depends on this library will be held back from upgrading to Scala 2.12.

It looks like that would require:
• ditching spray, which is already an issue (#84)
• updating akka to 2.4.14, which is already a PR (#90), except targeting the 2.12 version
• updating scalatest
• updating json4s-native

RestlasticSearchClient ActorSystem creation by default causing issues

We are running into an issue where we are only able to run a single RestlasticSearchClient on a machine, because the ActorSystem being created is always the default and we are unable to pass props:
implicit val system: ActorSystem = ActorSystem()
Normally I would just override the val and set my own value, but unfortunately, with how Scala works, the initializer still runs and creates the ActorSystem before my value is set (so two actor systems are created).
Is there a work around for this?

Otherwise please consider making it configurable.
Possible solutions:

Lazy init:

implicit lazy val system: ActorSystem = ActorSystem()

or an implicit parameter:

class RestlasticSearchClient(endpointProvider: EndpointProvider,
                             signer: Option[RequestSigner] = None,
                             indexExecutionCtx: ExecutionContext = ExecutionContext.Implicits.global,
                             searchExecutionCtx: ExecutionContext = ExecutionContext.Implicits.global)
                            (implicit val system: ActorSystem = ActorSystem()) {}

Thanks for your time.
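The difference between the two initialization behaviors can be shown with a small stdlib sketch (String stands in for ActorSystem; the counters track how often each default initializer runs):

```scala
var eagerInits = 0
var lazyInits = 0

class EagerClient { val system: String = { eagerInits += 1; "default" } }
class LazyClient { lazy val system: String = { lazyInits += 1; "default" } }

// Overriding an eager val: the base class initializer still runs during
// construction, so the default value is created and then discarded.
val e = new EagerClient { override val system: String = "custom" }

// Overriding a lazy val: the default initializer is never evaluated,
// because the overridden accessor is the only one ever called.
val l = new LazyClient { override lazy val system: String = "custom" }
```

This is why the lazy val proposal avoids the double-ActorSystem problem: the default system is only ever built if nothing overrides it and something actually accesses it.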

Create BreakingChanges.md

Create a BreakingChanges.md where we track breaking changes along with the rectification for release notes.

support Exists query

Does the current client support the exists query? There was an old PR for adding Exists query support, but it was closed without merging.

Please advise. Thanks.

some tests depend on other test side effects

It is surprising when your passing test starts to fail when run with all tests in the file.
In RestlasticSearchClientTest the tests share state in ES and are isolated by the use of PhrasePrefixQuery on a field. It is not obvious when first working in the file.
A typical way of isolating test data in a database is to clear the data in some way before each new test run. I tried using OneInstancePerTest, which would provide a clean ES for each test, but it caused a few tests to fail. Those tests can't be run on their own either; it appears that they depend on the side effects of other tests to pass.
It seems worthwhile to make writing tests as easy as possible with few surprises in order to encourage outside contributors to write tests.

"search_type" -> "scan" is no longer supported

Attempting to run a startScrollRequest against an AWS Elasticsearch 5.1 instance results in an exception with the message:

org.json4s.package$MappingException: No usable value for error
Do not know how to convert JObject(List((root_cause,JArray(List(JObject(List((type,JString(illegal_argument_exception)), (reason,JString(No search type for [scan]))))))), (type,JString(illegal_argument_exception)), (reason,JString(No search type for [scan])))) into class java.lang.String
