philippus / elastic4s

Elasticsearch Scala Client - Reactive, Non Blocking, Type Safe, HTTP Client

License: Apache License 2.0

elastic4s - Elasticsearch Scala Client

This is a community project - PRs will be accepted and releases published by the maintainer

Elastic4s is a concise, idiomatic, reactive, type safe Scala client for Elasticsearch. The official Elasticsearch Java client can of course be used in Scala, but due to Java's syntax it is more verbose, and it naturally doesn't support classes in the core Scala library or Scala idioms such as typeclasses.

Elastic4s's DSL allows you to construct your requests programmatically, with syntactic and semantic errors manifested at compile time, and uses standard Scala futures to enable you to easily integrate into an asynchronous workflow. The aim of the DSL is that requests are written in a builder-like way, while staying broadly similar to the Java API or REST API. Each request is an immutable object, so you can create requests and safely reuse them, or further copy them for derived requests. Because each request is strongly typed, your IDE or editor can use the type information to show you what operations are available for any request type.

Elastic4s supports Scala collections so you don't have to do tedious conversions from your Scala domain classes into Java collections. It also allows you to index and read classes directly using typeclasses so you don't have to set fields or json documents manually. These typeclasses are generated using your favourite json library - modules exist for Jackson, Circe, Json4s, PlayJson and Spray Json. The client also uses standard Scala durations to avoid the use of strings or primitives for duration lengths.

Key points

Type safe concise DSL
Integrates with standard Scala futures or other effects libraries
Uses Scala collections library over Java collections
Returns Option where the java methods would return null
Uses Scala Durations instead of strings/longs for time values
Supports typeclasses for indexing, updating, and search backed by Jackson, Circe, Json4s, PlayJson and Spray Json implementations
Supports Java and Scala HTTP clients such as Akka-Http
Provides reactive-streams implementation
Provides a testkit subproject ideal for your tests

Release

Current Elastic4s versions support Scala 2.12 and 2.13. Scala 2.10 support has been dropped starting with 5.0.x and Scala 2.11 has been dropped starting with 7.2.0. For releases that are compatible with earlier versions of Elasticsearch, search maven central.

Elastic Version	Scala 2.12	Scala 2.13	Scala 3
8.11.x
8.10.x
8.9.x
8.8.x
8.7.x
8.6.x
8.5.x
8.4.x
8.3.x
8.2.x
8.1.x
8.0.x
7.17.x
7.16.x
7.15.x
7.14.x
7.13.x
7.12.x
7.11.x
7.10.x
7.9.x
7.8.x
7.7.x
7.6.x
7.5.x
7.4.x
7.3.x
7.2.x
7.1.x
7.0.x

For releases prior to 7.0 search maven central.

Quick Start

We have created sample projects in sbt, Maven and Gradle. Check them out here: https://github.com/sksamuel/elastic4s/tree/master/samples

To get started you will need to add a dependency:

elastic4s-client-esjava

// major.minor are in sync with the elasticsearch releases
val elastic4sVersion = "x.x.x"
libraryDependencies ++= Seq(
  // recommended client for beginners
  "com.sksamuel.elastic4s" %% "elastic4s-client-esjava" % elastic4sVersion,
  // test kit
  "com.sksamuel.elastic4s" %% "elastic4s-testkit" % elastic4sVersion % "test"
)

The basic usage is that you create an instance of a client and then invoke the execute method with the requests you want to perform. The execute method is asynchronous and will return a standard Scala Future[T] (or use one of the Alternative executors) where T is the response type appropriate for your request type. For example a search request will return a response of type SearchResponse which contains the results of the search.

To create an instance of the HTTP client, use the ElasticClient companion object methods. Requests are created using the elastic4s DSL. For example to create a search request, you would do:

search("index").query("findthistext")

The DSL methods are located in the ElasticDsl trait which needs to be imported or extended.

import com.sksamuel.elastic4s.ElasticDsl._

Creating a Client

The entry point in elastic4s is an instance of ElasticClient. This class is used to execute requests, such as SearchRequest, against an Elasticsearch cluster and returns a response type such as SearchResponse.

ElasticClient takes care of transforming the requests and responses, and handling success and failure, but the actual HTTP functions are delegated to an HTTP library. One such library is JavaClient, which uses the HTTP client provided by the official Java Elasticsearch library.

So, to connect to an Elasticsearch cluster, pass an instance of JavaClient to an ElasticClient. JavaClient is configured using ElasticProperties, in which you can specify protocol, host, and port in a single string.

val props = ElasticProperties("http://host1:9200")
val client = ElasticClient(JavaClient(props))

For multiple nodes you can pass a comma-separated list of endpoints in a single string:

val nodes = ElasticProperties("http://host1:9200,host2:9200,host3:9200")
val client = ElasticClient(JavaClient(nodes))

There are several http libraries to choose from, or you can wrap any HTTP library you wish. For further details, and information on how to specify credentials and other options, see the full client documentation

Example Application

An example is worth 1000 characters so here is a quick example of how to connect to a node with a client, create an index and index a one field document. Then we will search for that document using a simple text query.

Note: As of version 7.0.x the LocalNode functionality has been removed. It is recommended that you stand up a local Elasticsearch Docker container for development. This is the same strategy used in the tests.

import com.sksamuel.elastic4s.{ElasticClient, ElasticProperties, RequestFailure, RequestSuccess}
import com.sksamuel.elastic4s.fields.TextField
import com.sksamuel.elastic4s.http.JavaClient
import com.sksamuel.elastic4s.requests.common.RefreshPolicy
import com.sksamuel.elastic4s.requests.searches.SearchResponse

object ArtistIndex extends App {

  // in this example we create a client to a local Docker container at localhost:9200
  val client = ElasticClient(JavaClient(ElasticProperties(s"http://${sys.env.getOrElse("ES_HOST", "127.0.0.1")}:${sys.env.getOrElse("ES_PORT", "9200")}")))

  // we must import the dsl
  import com.sksamuel.elastic4s.ElasticDsl._

  // Next we create an index in advance ready to receive documents.
  // await is a helper method to make this operation synchronous instead of async
  // You would normally avoid doing this in a real program as it will block
  // the calling thread but is useful when testing
  client.execute {
    createIndex("artists").mapping(
      properties(
        TextField("name")
      )
    )
  }.await

  // Next we index a single document which is just the name of an Artist.
  // RefreshPolicy.Immediate asks Elasticsearch to refresh straight away so that
  // the document is searchable immediately; see the section on Index Refreshing.
  client.execute {
    indexInto("artists").fields("name" -> "L.S. Lowry").refresh(RefreshPolicy.Immediate)
  }.await

  // now we can search for the document we just indexed
  val resp = client.execute {
    search("artists").query("lowry")
  }.await

  // resp is a Response[+U] ADT consisting of either a RequestFailure containing the
  // Elasticsearch error details, or a RequestSuccess[U] that depends on the type of request.
  // In this case it is a RequestSuccess[SearchResponse]

  println("---- Search Results ----")
  resp match {
    case failure: RequestFailure => println("We failed " + failure.error)
    case results: RequestSuccess[SearchResponse] => println(results.result.hits.hits.toList)
    case results: RequestSuccess[_] => println(results.result)
  }

  // Response also supports familiar combinators like map / flatMap / foreach:
  resp foreach (search => println(s"There were ${search.totalHits} total hits"))

  client.close()
}
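The example above blocks with await for brevity. What follows is a minimal sketch of a non-blocking alternative, assuming the same "artists" index and the same Response ADT described in the comments above. ResponseSketch, summarise and searchArtists are illustrative names and are not part of elastic4s.

```scala
import com.sksamuel.elastic4s.{ElasticClient, RequestFailure, RequestSuccess, Response}
import com.sksamuel.elastic4s.ElasticDsl._
import com.sksamuel.elastic4s.requests.searches.SearchResponse
import scala.concurrent.{ExecutionContext, Future}

object ResponseSketch {

  // Collapse either branch of the Response ADT into a printable string.
  // (Hypothetical helper, not provided by the library.)
  def summarise[U](resp: Response[U])(render: U => String): String = resp match {
    case failure: RequestFailure    => "We failed " + failure.error
    case success: RequestSuccess[U] => render(success.result)
  }

  // Compose on the returned Future instead of calling await,
  // so no thread is blocked while Elasticsearch does the work.
  def searchArtists(client: ElasticClient, text: String)(implicit ec: ExecutionContext): Future[String] =
    client.execute(search("artists").query(text)).map { resp =>
      summarise(resp)((r: SearchResponse) => s"There were ${r.totalHits} total hits")
    }
}
```

The caller decides where the Future is finally consumed, for example in an HTTP framework that accepts futures directly.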

Alternative Executors

By default, elastic4s uses Scala Futures when returning responses, but any effect type can be supported.

If you wish to use ZIO, Cats-Effect, Monix or Scalaz, then read this page on alternative effects.

Index Refreshing

When you index a document in Elasticsearch, usually it is not immediately available to be searched, as a refresh has to happen to make it visible to the search API.

By default a refresh occurs every second but this can be changed if needed. Note that this only impacts the visibility of newly indexed documents and has nothing to do with data consistency and durability.

This setting can be controlled when creating an index or when indexing documents.
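As a rough sketch of both options: the per-request form uses the refresh method shown in the example application, while the index-level form assumes that the create index request exposes a refreshInterval method accepting a Scala Duration in your elastic4s version. RefreshSketch is an illustrative name only.

```scala
import com.sksamuel.elastic4s.ElasticDsl._
import com.sksamuel.elastic4s.requests.common.RefreshPolicy
import scala.concurrent.duration._

object RefreshSketch {

  // Index level: refresh every 10 seconds instead of the default of one second.
  // (Assumes refreshInterval(Duration) is available on the create index request.)
  val createReq = createIndex("places").refreshInterval(10.seconds)

  // Request level: do not respond until a refresh has made the document searchable.
  val indexReq = indexInto("places")
    .fields("name" -> "London")
    .refresh(RefreshPolicy.WaitFor)
}
```

Both values are plain immutable requests; nothing is sent until they are passed to client.execute.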

Create Index

All documents in Elasticsearch are stored in an index. We do not need to tell Elasticsearch in advance what an index will look like (eg what fields it will contain) as Elasticsearch will adapt the index dynamically as more documents are added, but we must at least create the index first.

To create an index called "places" that is fully dynamic we can simply use:

client.execute {
  createIndex("places")
}

We can optionally set the number of shards and/or replicas

client.execute {
  createIndex("places").shards(3).replicas(2)
}

Sometimes we want to specify the properties of the fields in the index in advance. This allows us to manually set the type of the field (where Elasticsearch might infer something else), set the analyzer used, or set several other options.

To do this we add mappings:

client.execute {
    createIndex("cities").mapping(
        properties(
            keywordField("id"),
            textField("name").boost(4),
            textField("content"),
            keywordField("country"),
            keywordField("continent")
        )
    )
}

Then Elasticsearch is preconfigured with those mappings for those fields. It is still fully dynamic and other fields will be created as needed with default options. Only the fields specified will have their type preset.

More examples on the create index syntax can be found here.

Analyzers

Analyzers control how Elasticsearch parses the fields for indexing. For example, you might decide that you want whitespace to be important, so that "band of brothers" is indexed as a single "word" rather than the default which is to split on whitespace. There are many advanced options available in analyzers. Elasticsearch also allows us to create custom analyzers. For more details see the documentation on analyzers.

Indexing

To index a document we need to specify the index, and optionally we can set an id. If we don't include an id then Elasticsearch will generate one for us. We must also include at least one field. Fields are specified as standard tuples.

client.execute {
  indexInto("cities").id("123").fields(
    "name" -> "London",
    "country" -> "United Kingdom",
    "continent" -> "Europe",
    "status" -> "Awesome"
  )
}

There are many additional options we can set such as routing, version, parent, timestamp and op type. See official documentation for additional options, all of which exist in the DSL as keywords that reflect their name in the official API.

Indexable Typeclass

Sometimes it is useful to create documents directly from your domain model instead of manually creating maps of fields. To achieve this, elastic4s provides the Indexable typeclass.

If you provide an implicit instance of Indexable[T] in scope for any class T that you wish to index, then you can invoke doc(t) on the IndexRequest.

For example:

// a simple example of a domain model
case class Character(name: String, location: String)

// turn instances of characters into json
implicit object CharacterIndexable extends Indexable[Character] {
  override def json(t: Character): String = s""" { "name" : "${t.name}", "location" : "${t.location}" } """
}

// now index requests can directly use characters as docs
val jonsnow = Character("jon snow", "the wall")
client.execute {
  indexInto("gameofthrones").doc(jonsnow)
}

Some people prefer to write typeclasses manually for the types they need to support. Other people like to just have it done automagically. For the latter, elastic4s provides extensions for the well known Scala Json libraries that can be used to generate Json generically.

To use this, add the import for your chosen library below and bring the implicits into scope. Then you can pass any case class instance to doc and an Indexable will be derived automatically.

Library	Elastic4s Module	Import
Jackson	elastic4s-json-jackson	import ElasticJackson.Implicits._
Json4s	elastic4s-json-json4s	import ElasticJson4s.Implicits._
Circe	elastic4s-json-circe	import io.circe.generic.auto._ import com.sksamuel.elastic4s.circe._
PlayJson	elastic4s-json-play	import com.sksamuel.elastic4s.playjson._
Spray Json	elastic4s-json-spray	import com.sksamuel.elastic4s.sprayjson._
ZIO 1.0 Json	elastic4s-json-zio-1	import com.sksamuel.elastic4s.ziojson._
ZIO 2.0 Json	elastic4s-json-zio	import com.sksamuel.elastic4s.ziojson._

Searching

To execute a search in elastic4s, we need to pass an instance of SearchRequest to our client.

One way to do this is to invoke search and pass in the index name. From there, you can call query and pass in the type of query you want to perform.

For example, to perform a simple text search, where the query is parsed from a single string we can do:

client.execute {
  search("cities").query("London")
}

For full details on creating queries and other search capabilities such as source filtering and aggregations, please read this.

Multisearch

Multiple search requests can be executed in a single call using the multisearch request type. This is the search equivalent of the bulk request.
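A minimal sketch of building such a request, assuming the multi method from ElasticDsl and reusing index names from elsewhere in this document. MultiSearchSketch is an illustrative name only.

```scala
import com.sksamuel.elastic4s.ElasticDsl._

object MultiSearchSketch {

  // Two independent searches wrapped into one request, sent in one round trip.
  val request = multi(
    search("cities").query("London"),
    search("bands").query("coldplay")
  )
}
```

The request is passed to client.execute like any other request.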

HitReader Typeclass

By default Elasticsearch search responses contain an array of SearchHit instances which contain things like the id, index, type, version, etc as well as the document source as a string or map. Elastic4s provides a means to convert these back to meaningful domain types quite easily using the HitReader[T] typeclass.

Provide an implementation of this typeclass, as an in scope implicit, for whatever type you wish to marshall search responses into, and then you can call to[T] or safeTo[T] on the response. The difference between to and safeTo is that to will drop any errors and just return successful conversions, whereas safeTo returns a sequence of Either[Throwable, T].

A full example:

case class Character(name: String, location: String)

implicit object CharacterHitReader extends HitReader[Character] {
  override def read(hit: Hit): Either[Throwable, Character] = {
    val source = hit.sourceAsMap
    Right(Character(source("name").toString, source("location").toString))
  }
}

val resp = client.execute {
  search("gameofthrones").query("kings landing")
}.await // don't block in real code

// .to[Character] will look for an implicit HitReader[Character] in scope
// and then convert all the hits into Characters for us.
val characters: Seq[Character] = resp.result.to[Character]

This is basically the inverse of the Indexable typeclass. And just like Indexable, the json modules provide implementations out of the box for any types. The imports are the same as for the Indexable typeclasses.

As a bonus feature of the Jackson implementation, if your domain object has fields called _timestamp, _id, _type, _index, or _version then those special fields will be automatically populated as well.

Highlighting

Elasticsearch can annotate results to show which part of the results matched the queries by using highlighting. Think of the snippets you see underneath your results on Google; that's what highlighting does.

We can use this very easily: just add a highlighting definition to your search request, where you set the field or fields to be highlighted. Viz:

search("music").query("kate bush").highlighting (
  highlight("body").fragmentSize(20)
)

All very straightforward. There are many options you can use to tweak the results. In the example above I have simply set the snippets to be taken from the field called "body" and to have max length 20. You can set the number of fragments to return, separate queries to generate them and other things. See the elasticsearch page on highlighting for more info.

Get / Multiget

A get request allows us to retrieve a document directly by id.

client.execute {
  get("bands", "coldplay")
}

We can fetch multiple documents at once using the multiget request.
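A minimal sketch of a multiget request, assuming the multiget method from ElasticDsl; the second document id is made up for illustration. MultiGetSketch is an illustrative name only.

```scala
import com.sksamuel.elastic4s.ElasticDsl._

object MultiGetSketch {

  // Each get is an ordinary get request; multiget batches them into one call.
  val request = multiget(
    get("bands", "coldplay"),
    get("bands", "keane") // hypothetical second id
  )
}
```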

Deleting

In elasticsearch we can delete based on an id, or based on a query (which can match multiple documents).

See more about delete.
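Both forms can be sketched as follows. deleteById appears in the bulk example in this document; deleteByQuery taking an index and a query is assumed to be available in your elastic4s version. DeleteSketch is an illustrative name only.

```scala
import com.sksamuel.elastic4s.ElasticDsl._

object DeleteSketch {

  // Delete exactly one document, addressed by index and id.
  val byId = deleteById("bands", "123")

  // Delete every document in the index that matches the query.
  val byQuery = deleteByQuery("bands", matchQuery("name", "coldplay"))
}
```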

Updates

We can update existing documents without having to do a full index, by updating a partial set of fields. We can update-by-id or update-by-query.

For more details see the update page.
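A minimal sketch of a partial update by id, assuming the updateById method from ElasticDsl and a doc method that accepts field tuples in the same style as indexInto. The field value is made up for illustration, and UpdateSketch is an illustrative name only.

```scala
import com.sksamuel.elastic4s.ElasticDsl._

object UpdateSketch {

  // Only the listed fields are changed; other fields of document 123 are left untouched.
  val request = updateById("bands", "123").doc("best_album" -> "parachutes")
}
```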

More like this

If you want to return documents that are "similar" to a current document we can do that very easily with the more like this query.

client.execute {
  search("drinks").query {
    moreLikeThisQuery("name").likeTexts("coors", "beer", "molson").minTermFreq(1).minDocFreq(1)
  }
}

For all the options see here.

Count

A count request executes a query and returns a count of the number of matching documents for that query.
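A minimal sketch, assuming the count method from ElasticDsl and a query method on the resulting request; check the count documentation for the exact form in your version. CountSketch is an illustrative name only.

```scala
import com.sksamuel.elastic4s.ElasticDsl._

object CountSketch {

  // Count every document in the index.
  val all = count("bands")

  // Count only the documents that match a query.
  val matching = count("bands").query(matchQuery("name", "coldplay"))
}
```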

Bulk Operations

Elasticsearch is fast. Roundtrips are not. Sometimes we want to wrestle every last inch of performance and a useful way to do this is to batch up requests. We can do this in elasticsearch via the bulk API. A bulk request wraps index, delete and update requests in a single request.

client.execute {
  bulk(
    indexInto("bands").fields("name" -> "coldplay"), // one index request
    deleteById("bands", "123"), // a delete by id request
    indexInto("bands").fields( // second index request
      "name" -> "elton john",
      "best_album" -> "tumbleweed connection"
    )
  )
}

A single HTTP request is now needed for 3 operations. In addition Elasticsearch can now optimize the requests, by combining inserts or using aggressive caching.

For full details see the docs on bulk operations.

Show Query JSON

It can be useful to see the json output of requests in case you wish to tinker with the request in a REST client or your browser. It can be much easier to tweak a complicated query when you have the instant feedback of the HTTP interface.

Elastic4s makes it easy to get this json where possible. Simply invoke the show method on the client with a request to get back a json string. Eg:

val json = client.show {
  search("music").query("coldplay")
}
println(json)

Not all requests have a json body. For example get-by-id is modelled purely by http query parameters, so there is no json body to output. And some requests aren't supported by the show method; you will get an implicit not found error during compilation if that is the case.

Aliases

An index alias is a logical name used to reference one or more indices. Most Elasticsearch APIs accept an index alias in place of an index name.

For elastic4s syntax for aliases click here.

Explain

An explain request computes a score explanation for a query and a specific document. This can give useful feedback whether a document matches or didn’t match a specific query.

For elastic4s syntax for explain click here.

Validate Query

The validate query request type allows you to check a query is valid before executing it.

See the syntax here.

Force Merge

Merging reduces the number of segments in each shard by merging some of them together, and also frees up the space used by deleted documents. Merging normally happens automatically, but sometimes it is useful to trigger a merge manually.

See the syntax here.

Cluster APIs

Elasticsearch supports querying the state of the cluster itself, to find out information on nodes, shards, indices, tasks and so on. See the range of cluster APIs here.

Search Iterator

Sometimes you may wish to iterate over all the results in a search, without worrying too much about handling futures, and re-requesting via a scroll. The SearchIterator will do this for you, although it will block between requests. A search iterator is just an implementation of scala.collection.Iterator backed by elasticsearch queries.

To create one, use the iterate method on the companion object, passing in the http client, and a search request to execute. The search request must specify a keep alive value (which is used by elasticsearch for scrolling).

implicit val reader : HitReader[MyType] =  ...
val iterator = SearchIterator.iterate[MyType](client, search(index).matchAllQuery.keepAlive("1m").size(50))
iterator.foreach(println)

For instance, in the above we are bringing back all documents in the index, 50 results at a time, marshalled into instances of MyType using the implicit HitReader (see the section on HitReaders). If you want just the raw elasticsearch Hit object, then use SearchIterator.hits

Note: Whenever the results in a particular batch have been iterated on, the SearchIterator will then execute another query for the next batch and block waiting on that query. So if you are looking for a pure non blocking solution, consider the reactive streams implementation. However, if you just want a quick and simple way to iterate over some data without bringing back all the results at once SearchIterator is perfect.

Elastic Reactive Streams

Elastic4s has an implementation of the reactive streams api for both publishing and subscribing that is built using Akka. To use this, you need to add a dependency on the elastic4s-streams module.

There are two things you can do with the reactive streams implementation. You can create an elastic subscriber, and have that stream data from some publisher into elasticsearch. Or you can create an elastic publisher and have documents streamed out to subscribers.

For full details read the streams documentation

Using Elastic4s in your project

For gradle users, add (replace 2.12 with 2.13 for Scala 2.13):

compile 'com.sksamuel.elastic4s:elastic4s-core_2.12:x.x.x'

For SBT users simply add:

libraryDependencies += "com.sksamuel.elastic4s" %% "elastic4s-core" % "x.x.x"

For Maven users simply add (replace 2.12 with 2.13 for Scala 2.13):

<dependency>
    <groupId>com.sksamuel.elastic4s</groupId>
    <artifactId>elastic4s-core_2.12</artifactId>
    <version>x.x.x</version>
</dependency>

Check for the latest released versions on maven central

Building and Testing

This project is built with SBT. To build, run:

sbt compile

And to test:

sbt test

The project is currently cross-built against Scala 2.12 and 2.13. When preparing a pull request, the above commands should be run with the sbt + modifier to compile and test against both versions. For example: sbt +compile.

For the tests to work you will need to run a local elastic instance on port 39227, with security enabled. One easy way of doing this is to use docker (via docker-compose): docker-compose up

Used By

Barclays Bank
HSBC
Shazam
Lenses
Iterable
Graphflow
Hotel Urbano
Immobilien Scout
Deutsche Bank
Goldman Sachs
HMRC
Canal+
AOE
Starmind
ShopRunner
Soundcloud
Rostelecom-Solar
Twitter
bluerootlabs.io
mapp.com
Jusbrasil

Raise a PR to add your company here

YourKit supports open source projects with its full-featured Java Profiler. YourKit, LLC is the creator of YourKit Java Profiler and YourKit .NET Profiler, innovative and intelligent tools for profiling Java and .NET applications.

Contributions

Contributions to elastic4s are always welcome. Good ways to contribute include:

Raising bugs and feature requests
Fixing bugs and enhancing the DSL
Improving the performance of elastic4s
Adding to the documentation

License

This software is licensed under the Apache 2 license, quoted below.

Copyright 2013-2016 Stephen Samuel

Licensed under the Apache License, Version 2.0 (the "License"); you may not
use this file except in compliance with the License. You may obtain a copy of
the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
License for the specific language governing permissions and limitations under
the License.

elastic4s's Issues

Update id ... in "idx/type" should accept a map as well

Works:

update id 5 in "scifi/startrek"

Doesn't work:

update id 5 in "scifi" -> "startrek"

Filtering executed queries in percolate API

I'd like to add support to filter executed queries in percolate API as described here - http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-percolate.html#_filtering_executed_queries

Any suggestions / considerations before I code it?

Completion Type for Mapping Creation

Hi!
Thanks for the amazing work you're doing!
Would it be possible to specify the "completion" type used by the completion suggester (http://www.elasticsearch.org/blog/you-complete-me/)? Or is there another way to do it? I need to write a mapping that looks like this (see the field "name_autocomplete"):

{
    "products": {
        "properties": {
            "id": {
                "type": "string"
            },
            ....,
            "name_autocomplete": {
                "type": "completion",
                "payloads": true
            }
        }
    }
}

Thanks!

Make Jackson Dependency Optional

There is no need to have a hard dependency on Jackson. Also alternatively you can completely remove the Jackson dependency and use the Jackson that's embedded in ElasticSearch.

Documentation about bulk requests is a bit off

The documentation still refers to client.bulk whereas it seems that anything in the same client.execute block will be executed in bulk.

Non-blocking Future

Hi, I'm looking for an ES client and your DSL-based approach looks very promising. Thanks in advance :)

One thing makes me worry is about the current execute implementation.

https://github.com/sksamuel/elastic4s/blob/master/src/main/scala/com/sksamuel/elastic4s/Client.scala#L39-L41

It takes per-client ExecutionContext and returns a Future by wrapping blocking actionGet() with a future {} block. It doesn't block the current thread but actually blocks the dedicated one, so it might reduce overall throughput and be a bottleneck.

Suggestion is attaching an ActionListener[T] using this and returning a Future from our own Promise, for example (not tested, just a concept):

def execute[A <: ActionRequest, B <: ActionResponse, C <: ActionRequestBuilder[A, B, C]](action: Action[A, B, C], req: A): Future[B] = {
  val p = Promise[B]()
  client.execute(action, req, new ActionListener[B] {
    def onResponse(response: B) = p.success(response)
    def onFailure(e: Throwable) = p.failure(e)
  })
  p.future
}

It'll make the API require an ExecutionContext for every request. If you accept this change I'll open a PR when time allows.
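The idea in this suggestion can be shown in isolation with the standard library only. In the sketch below, Listener and callbackApi are hypothetical stand-ins for a callback-style interface such as ActionListener and for the client call that accepts it; they are not Elasticsearch or elastic4s types.

```scala
import scala.concurrent.{Future, Promise}

object PromiseSketch {

  // Hypothetical stand-in for a callback interface such as ActionListener[T].
  trait Listener[T] {
    def onResponse(response: T): Unit
    def onFailure(e: Throwable): Unit
  }

  // Hypothetical callback-based API: runs the computation and notifies the listener.
  def callbackApi[T](compute: () => T)(listener: Listener[T]): Unit =
    try listener.onResponse(compute())
    catch { case e: Exception => listener.onFailure(e) }

  // Adapt any "register a listener" function into a Future by completing a Promise
  // from the callbacks. No thread is parked waiting for the result.
  def toFuture[T](register: Listener[T] => Unit): Future[T] = {
    val p = Promise[T]()
    register(new Listener[T] {
      def onResponse(response: T): Unit = p.success(response)
      def onFailure(e: Throwable): Unit = p.failure(e)
    })
    p.future
  }

  // Usage: the Future completes when the callback fires.
  val answer: Future[Int] = toFuture[Int](listener => callbackApi(() => 42)(listener))
}
```

Because the Promise is completed from the callback, the adapter itself needs no ExecutionContext; one is only needed later, when the caller maps or flatMaps the returned Future.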

Documentation on how to create a client is a little off

In docs:

// single node
val client = ElasticClient.remote("host1", 9300)

But in fact this doesn't work because #remote expects parameters of another type. I tried to pass a tuple as follows: ElasticClient.remote(("host1", 9300)). It compiles but throws an error:

Exception in thread "main" java.lang.IllegalArgumentException: requirement failed
at scala.Predef$.require(Predef.scala:221)
at com.sksamuel.elastic4s.ElasticClient$.remote(Client.scala:228)
at com.sksamuel.elastic4s.ElasticClient$.remote(Client.scala:226)

I get the same error if I try to create a client like this: ElasticClient.remote("host1" -> 9300).

Towards ES 1.0

Just figured I'd drop this here since it'll probably affect you:

elastic/elasticsearch#4634

Type is missing when mapping nested fields

When creating a mapping with nested fields the following error is generated, i.e:

curl -XPUT 'http://localhost:9200/twitter/' -d '
{
    "settings": {
        "index": {
            "number_of_shards": 2,
            "number_of_replicas": 1
        }
    },
    "mappings": {
        "tweets": {
            "_source": {
                "enabled": true
            },
            "properties": {
                "_id": {
                    "type": "string",
                },
                "user": {
                    "name": {
                        "type": "string"
                    }
                }
            }
        }
    }
}
'

Fails with an error:

org.elasticsearch.index.mapper.MapperParsingException: mapping [trucks]
    at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService$2.execute(MetaDataCreateIndexService.java:311)
    at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:298)
    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:135)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
    at java.lang.Thread.run(Thread.java:680)
Caused by: org.elasticsearch.index.mapper.MapperParsingException: No type specified for property [menu]
    at org.elasticsearch.index.mapper.object.ObjectMapper$TypeParser.parseProperties(ObjectMapper.java:255)
    at org.elasticsearch.index.mapper.object.ObjectMapper$TypeParser.parse(ObjectMapper.java:219)
    at org.elasticsearch.index.mapper.DocumentMapperParser.parse(DocumentMapperParser.java:177)
    at org.elasticsearch.index.mapper.MapperService.parse(MapperService.java:387)
    at org.elasticsearch.index.mapper.MapperService.merge(MapperService.java:195)
    at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService$2.execute(MetaDataCreateIndexService.java:308)
    ... 5 more

Fix: after adding "type": "nested" to user, the request succeeds, i.e:

curl -XPUT 'http://localhost:9200/twitter/' -d '
{
    "settings": {
        "index": {
            "number_of_shards": 2,
            "number_of_replicas": 1
        }
    },
    "mappings": {
        "tweets": {
            "_source": {
                "enabled": true
            },          
            "properties": {
                "_id": {
                    "type": "string",
                },
                "user": {
                    "type": "nested",
                    "name": {
                        "type": "string"
                    }
                }
            }
        }
    }
}
'

Deprecate numeric_range

docasupsert support

It looks like elastic4s doesn't support docasupsert yet.

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-update.html

or at least the DSL doesn't mention it and there are no tests for it.

Looks like a nice DSL otherwise. Hopefully there'll be some other places where I can use it.

Provide OSGi friendly distribution

It'd be great to have an OSGi friendly distribution of elastic4s to allow deployment in an OSGi container.

Is there a public method for getting the builder or JSON string?

Quick question: you use req._builder in SearchDslTest to get the json string from the query. I'd like to do this in my tests. Is there another way to get the json that doesn't use a private method?

https://github.com/sksamuel/elastic4s/blob/6d4b9e906880dc7b2c479d2a2e64ab61a133af7d/src/test/scala/com/sksamuel/elastic4s/SearchDslTest.scala

Thanks-
-f

FuzzyDefinition does not expose setting prefix_length.

According to the Fuzzy Query documentation, prefix_length should be set to something other than its default of 0 to allow for better performance. Although FuzzyDefinition exposes max_expansions, prefix_length is not exposed. A workaround is to use the Elasticsearch builder directly, but exposing this through the query DSL would be preferable.

Prepare release for 0.90.4

I was at the London Elasticsearch meetup last night and Clinton said 0.90.4 is 2 weeks away. This issue is to prep a release for that version, with the new suggesters and function scorer (need to confirm these are 0.90.4 features and not 1.0.0).

Specify tokenfilter when creating indices

Is there any way of creating an index that uses an edgeNGram tokenfilter? I want to be able to search for users, which have username and name, both by full words and prefixes. So far I've managed only to create the index:

client.execute {
    create index "things" mappings (
      "users" as (
        field("username") typed StringType,
        field("name") typed StringType
      )
    )
}

I believe what I need to use is the Edge N-Gram token filter but I can't figure out how to inject that with the elastic4s DSL.

support mapping of nested objects

It does not seem possible to map nested objects[1] using the index creation DSL. If that's out-of-scope for the DSL, allowing the mapping to be specified in JSON would be handy.

[1] http://www.elasticsearch.org/guide/reference/mapping/object-type/

Specify Routing During Index Creation?

Hi,
Is there a way to specify the routing for a mapping during index creation? I looked through the source code and didn't see anything, but I could be missing something. I suppose that would go in the mapping definition if I wanted to add it?

Thanks!
-PW

Is it possible to use this client on Heroku?

I love the DSL of this client but I haven't been able to connect to a remote cluster over HTTP.

As far as I can determine this client does NOT work over HTTP but joins the cluster over TCP. This requires an extra listening port to be opened, which is not allowed on Heroku.

Is this assumption correct? If so, it would be nice if that was a bit more explicitly documented...

If this assumption is not correct, I'd value an example of how you can connect to a remote cluster that you have only an HTTP connection string for.
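For comparison, here is a minimal sketch of talking to the REST endpoint over plain HTTP using only the JDK and the Scala standard library. It bypasses the elastic4s DSL entirely and is not part of the library. It assumes the cluster exposes the REST API at the given base URL (Elasticsearch defaults to port 9200 for HTTP and 9300 for the TCP transport); the object and method names are made up for illustration.

```scala
import java.net.{HttpURLConnection, URL, URLEncoder}
import scala.io.Source

// Hypothetical helper, not an elastic4s API: queries Elasticsearch's REST
// interface directly instead of joining the cluster over the TCP transport.
object RestSearchSketch {
  // Builds a URI-search URL such as http://host:9200/bands/_search?q=Chris
  def searchUrl(base: String, index: String, q: String): String =
    s"${base.stripSuffix("/")}/$index/_search?q=${URLEncoder.encode(q, "UTF-8")}"

  // Performs the GET and returns the raw JSON response body.
  def search(base: String, index: String, q: String): String = {
    val conn = new URL(searchUrl(base, index, q))
      .openConnection()
      .asInstanceOf[HttpURLConnection]
    conn.setRequestMethod("GET")
    try {
      val src = Source.fromInputStream(conn.getInputStream, "UTF-8")
      try src.mkString finally src.close()
    } finally conn.disconnect()
  }
}
```

Calling RestSearchSketch.search("http://localhost:9200", "bands", "Chris") would return the response JSON as a string, which then has to be parsed by hand; that is the trade-off against the typed DSL.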

I got IndexMissingException when I specified type along with index name in "search in"

import com.sksamuel.elastic4s.ElasticClient
import com.sksamuel.elastic4s.ElasticDsl._
import scala.concurrent.Await
import scala.concurrent.duration._

val client = ElasticClient.remote("localhost", 9300)
val responseF = client execute { index into "bands/singers" fields "name" -> "chris martin" }
val response = Await.result(responseF, 3.seconds)
println(response.getId())
val singersF = client execute { search in "bands/singers" query "Chris" }
val singers = Await.result(singersF, 3.seconds)
println(singers)

It seems like it should work, but unfortunately I got an IndexMissingException. I used exactly the same syntax as it appears in README.md. However, if I omit the type name "singers" from the "search in", like the following:

 val singersF = client execute { search in "bands" query "Chris" }

The query will work just fine.

My environment is
OS = OSX 10.8.5
scala = 2.10.2
elastic4s = 0.90.5.1
elasticsearch = 0.90.5

Create a 1.0.0-SNAPSHOT release and update it frequently

Things are fixed/changed very frequently and we need to use/test those fixes in our projects. What do you think about doing a SNAPSHOT release and updating it frequently?

API change suggestions

The current implementation has nested wrapper classes in public traits, which prevents proper equality tests (even though they're seldom used) and implicit conversions in companion objects.

Before covering all ES requests, I'd like to suggest:

change object ElasticDsl to package dsl and bring inner classes to top-level
give an ActionRequest instance to the default constructors of RequestDefinitionLike subclasses, and move auxiliary constructors into companion objects
move implicit conversions to companion objects
provide implicit conversions wrapping raw ES request for more generic execute

Benefits:

provide a consistent rule for dealing with every request
some wrappers can be value classes, which decreases runtime overhead
implicit conversion without import tax
a few more...
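The "implicit conversion without import tax" point can be sketched in plain Scala. The types below are hypothetical stand-ins, not elastic4s classes: a top-level case class gets structural equality for free, and a conversion placed in its companion object is found by the compiler wherever that type is expected, with no wildcard import of a DSL object.

```scala
import scala.language.implicitConversions

// Hypothetical top-level wrapper; not an elastic4s class.
final case class IndexRequestDef(index: String, docType: String)

object IndexRequestDef {
  // Lives in the companion, so it is in implicit scope wherever an
  // IndexRequestDef is expected; callers need no extra import.
  implicit def fromPath(path: String): IndexRequestDef =
    path.split("/", 2) match {
      case Array(index, docType) => IndexRequestDef(index, docType)
      case Array(index)          => IndexRequestDef(index, "")
    }
}

object CompanionImplicitDemo {
  // Stand-in for a generic execute method.
  def execute(req: IndexRequestDef): String = s"${req.index}/${req.docType}"

  // The String is converted via IndexRequestDef.fromPath.
  val viaConversion: String = execute("bands/singers")

  // Top-level case classes compare by value.
  val equalWrappers: Boolean =
    IndexRequestDef("bands", "singers") == IndexRequestDef("bands", "singers")
}
```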

Add aggs support

http://www.elasticsearch.org/blog/1-0-0-beta2-released/#aggs

Would it be possible to have the "text" query?

Hi, would it be possible to have the "text" query? I have a case where I need to filter by certain phrases: the search should match any document that has one of those phrases in a certain field. I think I can only solve it with "text".

thanks

Upgrade to Elasticsearch 1.0.0

I can make the change in build.sbt but do I need to update the pom.xml as well?

compile errors in index creation example from README

I'm trying to follow through the examples in the README and I am running into trouble creating an index with mappings. The below code throws compile errors with Scala 2.10.2 and using version 0.90.2.8 of elastic4s.

import com.sksamuel.elastic4s.ElasticClient
import com.sksamuel.elastic4s.ElasticDsl._
object Hello extends App {
    val client = ElasticClient.local
    client.execute {
        create index "places" mappings (
            "cities" as (
                "id" typed IntegerType,
                "name" boost 4,
                "content" analyzer StopAnalyzer
            )
        )
    }
}

The errors I'm getting are:

[error] .../src/main/scala/example/Hello.scala:14: type mismatch;
[error]  found   : String("name")
[error]  required: ?{def boost(x$1: ? >: Int(4)): ?}
[error] Note that implicit conversions are not applicable because they are ambiguous:
[error]  both method string2query in trait QueryDsl of type (string: String)com.sksamuel.elastic4s.StringQueryDefinition
[error]  and method field in trait CreateIndexDsl of type (name: String)com.sksamuel.elastic4s.ElasticDsl.FieldDefinition
[error]  are possible conversion functions from String("name") to ?{def boost(x$1: ? >: Int(4)): ?}
[error]                 "name" boost 4,
[error]                 ^
[error] .../src/main/scala/example/Hello.scala:13: not found: value IntegerType
[error]                 "id" typed IntegerType,
[error]                            ^
[error] two errors found
[error] (compile:compile) Compilation failed
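The first error says that two implicit conversions from String both offer a boost method, so the compiler cannot choose between them. The sketch below reproduces that situation with hypothetical stand-in types (QueryDef and FieldDef are not the elastic4s classes) and shows that calling one conversion explicitly removes the ambiguity. It does not address the second error about IntegerType.

```scala
import scala.language.implicitConversions

// Hypothetical stand-ins for the two DSL types named in the error.
final case class QueryDef(text: String, boostBy: Double = 1.0) {
  def boost(b: Double): QueryDef = copy(boostBy = b)
}
final case class FieldDef(name: String, boostBy: Double = 1.0) {
  def boost(b: Double): FieldDef = copy(boostBy = b)
}

object AmbiguityDemo {
  // Two conversions from String, both yielding a type with `boost`.
  implicit def string2query(s: String): QueryDef = QueryDef(s)
  implicit def field(name: String): FieldDef = FieldDef(name)

  // "name" boost 4   // would not compile: both conversions apply

  // Naming the conversion explicitly tells the compiler which one is meant.
  val explicitField: FieldDef = field("name") boost 4
}
```

By analogy, writing field("name") boost 4 in the mapping may avoid the ambiguity in the README example, since the error message lists field as one of the two candidate conversions; that is an inference from the message, not something verified against this elastic4s version.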

Calling all users

With version 1 due out soon, I want to lock down the DSL for our own 1.0 release.
Does anyone have any suggestions for improvements, or things they'd like to change, even if that breaks backwards compatibility?

Missing Boolean Type

It's not possible to define a boolean type in mapping.

Fix for the first example in the README

I think it lacks this import:
import com.sksamuel.elastic4s.ElasticClient

and the client should be created with:
val client = ElasticClient.local

specify fields to be returned for queries

Hi,

Thanks for this fantastic work :)

I would like to ask for a new feature. In Elasticsearch, as part of a search query, the user can specify the fields to be returned. It appears this feature is not yet implemented: when a search is performed using elastic4s, all the fields of matching documents are returned.

Is there a way to drop in "raw JSON" for queries vs falling back to java APIs?

In some cases we'd like to pass through "raw" queries (for example, queries stored externally in their JSON form). I can do this via the java API, but haven't found a way to do it in the elastic4s DSL.

e.g. I'd like to do this but stay in the elastic4s DSL:

val q = """{ "match": { "drummer" : "will champion" } }"""
val r = client.java.prepareSearch("music").setQuery(q).execute().actionGet()
println(r)

Add ability to use a phrase_match query

Is it possible to add the functionality to do a phrase_match query as outlined at http://www.elasticsearch.org/guide/reference/query-dsl/match-query/?

Add support for boost_factor score function

Hi, I would like to see a BoostFactorScoreDefinition in addition to RandomScore, ScriptScore etc.

Here it is in the ES docs:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html#_boost_factor

Cheers,
-Fayvor

Tests are failing for the master branch

[info] *** 7 TESTS FAILED ***
[error] Failed tests:
[error]         com.sksamuel.elastic4s.DeleteTest
[error]         com.sksamuel.elastic4s.CountTest
[error]         com.sksamuel.elastic4s.SearchTest
[error]         com.sksamuel.elastic4s.SearchDslTest
[error] Error during tests:
[error]         com.sksamuel.elastic4s.PercolateTest

minimum_should_match in a list of shoulds

Hi, I haven't been able to find how to do this:

{
    "bool" : {
        "must" : {
            "term" : { "user" : "kimchy" }
        },
        "must_not" : {
            "range" : {
                "age" : { "from" : 10, "to" : 20 }
            }
        },
        "should" : [
            {
                "term" : { "tag" : "wow" }
            },
            {
                "term" : { "tag" : "elasticsearch" }
            }
        ],
        "minimum_should_match" : 1,
        "boost" : 1.0
    }
}

Any suggestion of how to achieve that request?

Upgrade to Elasticsearch 1.0.1

http://www.elasticsearch.org/blog/elasticsearch-0-90-12-1-0-1-released/

Scroll query

It appears there is no way to specify a scroll duration when executing ElasticClient.searchScroll(scrollId : String). This makes Elasticsearch omit the scroll id from the result, and that id is needed to fetch the next batch. I guess the most elegant solution would be extending the DSL to create a SearchScrollRequestBuilder, but a simple additional Scroll parameter would also suffice.

Bug in facets DSL

Apparently, it is possible to write this:

  search in "foo" query {
      matchall
  } facets {
      facet terms "facet1" field "field1"
      facet terms "facet2" field "field2"
  }

(notice the lack of comma between the two FacetDefinitions)

This will result in a query that only contains the last facet. Trying to write it with a comma between the two definitions won't work (I haven't quite figured out why, given there's an appropriate method for this notation in the code).

So this is a little confusing. Writing the above by wrapping the definitions in a Seq works.
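One plausible explanation for the missing-comma behaviour, not confirmed against the elastic4s source: in Scala, a brace block evaluates to its last expression, so earlier expressions in the block are evaluated and then discarded. The sketch below uses hypothetical stand-ins (Facet, facets, facetsAll) rather than the real DSL to show the effect, and why passing a Seq keeps both definitions.

```scala
// Hypothetical stand-in for a facet definition; not an elastic4s class.
final case class Facet(name: String, field: String)

object FacetBlockDemo {
  // Accepts a single facet, which may be written as a brace block.
  def facets(block: => Facet): Seq[Facet] = Seq(block)

  // Accepts an explicit collection of facets.
  def facetsAll(all: Iterable[Facet]): Seq[Facet] = all.toSeq

  // Two expressions, no comma: the block's value is the last one only.
  val onlyLast: Seq[Facet] = facets {
    Facet("facet1", "field1") // evaluated, then discarded
    Facet("facet2", "field2") // value of the whole block
  }

  // Wrapping the definitions in a Seq keeps both.
  val both: Seq[Facet] =
    facetsAll(Seq(Facet("facet1", "field1"), Facet("facet2", "field2")))
}
```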

Add doc types support

http://www.elasticsearch.org/guide/en/elasticsearch/reference/master/mapping-core-types.html#_doc_values_format

Add support for "enabled" field

Hi, I can't see support for the enabled field in the mappings definition. Have I missed it or is it NYI?

creating nested object in mappings causes error

I tried to create the nested object in mappings like the following

create index "myindex" mappings (
  "mytype" as (
      field("problem") nested (
          field("attr1") typed StringType,
          field("attr2") typed StringType
      )
  )
)

I got this error:
org.elasticsearch.index.mapper.MapperParsingException: No type specified for property [problem]

So I tried putting typed StringType on field("problem"). The request then succeeded, but the result was not what I wanted: StringType overrides the nested type, which is the expected behaviour. My problem is that the nested object mapping doesn't work.

Did I miss something?
Thank you.

Put type mapping

Add a way to set a type mapping without creating an index. http://www.elasticsearch.org/guide/reference/api/admin-indices-put-mapping/

No way to shutdown local node

I want to use this (fantastic) client in a Play app, and for simplicity I want to run an embedded (local) ES node. However, upon reloading (and soft restarting) my Play app, it throws an exception saying it could not get a lock on the Lucene files.

I think this is because the local node created is never closed. Only restarting my JVM (the sbt instance) fixes it.

Would it make sense to return a tuple containing the node and the client?

What is the suggested way of indexing nested documents?

I am trying to index a document which includes a completion type field named "ac". I am yet to index it in a convenient way using the index DSL. Any suggestions?

{
  "name": "Fyodor Dostoevsky",
  "ac": {
    "input": [
      "fyodor",
      "dostoevsky"
    ],
    "output": "Fyodor Dostoevsky",
    "payload": {
      "books": [
        "123456",
        "123457"
      ]
    }
  }
}

Update for redesigned percolator

http://www.elasticsearch.org/blog/percolator-redesign-blog-post/

Cannot connect to local node

I can't connect to a local node using val client = ElasticClient.remote("localhost", 9300). When I try, I get an org.elasticsearch.client.transport.NoNodeAvailableException: No node available error. I am not sure what is causing this, but the node does not even seem to see the connection attempt.

Add support for script sorting

Hello, I have only found the script option for score queries, but it would be nice to be able to do this:

"sort" : {
    "_script" : { 
        "script" : "Math.random()",
        "type" : "number",
        "params" : {},
        "order" : "asc"
    }
  }

Any idea of a workaround?

How do I delete an Index programmatically?

Thanks for this library!

How do I delete an index programmatically? I can't seem to find the API for this.

query_string with multiple fields

Hello, I haven't been able to do a query like this:

"query_string" : {
        "fields" : ["content", "name^5"],
        "query" : "this AND that OR thus",
        "use_dis_max" : true
    }

the closest thing I've found is:

          "multi_match": {
            "query": "pants blue",
            "fields": [
              "specs.color^12",
              "title^8",
              "description^7",
              "keywords^6",
              "brand_name^3",
              "_all^1"
            ],
            "analyzer": "whitespace"
          }

Is there a way to make a query_string with multiple fields?

thanks

Support for disabling dynamic creation of mappings for unmapped types

As described here (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-dynamic-mapping.html), it is possible to disable dynamic creation of mappings for unmapped types, i.e.:

curl -XPUT 'http://localhost:9200/twitter/tweet/_mapping' -d '
{
    "tweet" : {
        "dynamic": "strict",
        "properties" : {
            "message" : {"type" : "string", "store" : "yes"}
        }
    }
}
'

Proposed syntax:

create index "twitter" mappings(
  "tweet" as (
      //...
   ) dynamicDisabled true
)
