With version 1 due out soon, I want to lock down the DSL for our own 1.0 release.

As discussed in <a class="issue-link js-issue-link" data-error-text="Failed to load ti

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

Calling all users about elastic4s HOT 16 CLOSED

philippus commented on May 20, 2024

Calling all users

from elastic4s.

Comments (16)

sksamuel commented on May 20, 2024

One thing I'm not sure which I prefer is whether to have generic execute methods, or overloaded specifically named ones.

Eg,

client.execute {
      ...
    }

client.update {
    ....
    }

The latter is shorter, but the problem is structuring the DSL to support things where you only have an id. eg, in the former

client.execute {
      update(14).in("scifi/starwars").doc(anyOldObject).docAsUpsert
    }

the update is the keyword that is used to begin the update DSL. If it was just

client.update {
      14.in("scifi/starwars").doc(anyOldObject).docAsUpsert
    }

Then you have to overload an int with multiple implicits (for other DSL types) and then it becomes ambiguous.

from elastic4s.

caoilte commented on May 20, 2024

As discussed in #51 the current indexing DSL uses the "source" operator to allow you to index json (and other) documents directly but is a little confusing because in Elastic Search parlance "_source" is a field that stores a full copy of the underlying document (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-source-field.html).

It would be less confusing (for me anyway) if elastic4s renamed "source" to be "doc" as this is closer to the way elastic search talks about the process of indexing (which is covered in the "document APIs" section of the documentation).

On client.update vs client.execute the only points I would add is,

it would be good to keep the single op API consistent with the bulk api. "execute" does this as you can drop in replace it with "batch" I assume.
i think IDs are strings so you would actually need to overload that (or both). Another option is

client.update {
      id(14).in("scifi/starwars").doc(anyOldObject).docAsUpsert
    }

from elastic4s.

sksamuel commented on May 20, 2024

The problem with overloading the strings, or providing an id method is that you'd use the same id thing for delete, update, get, etc and the compiler won't know which one you want (it would need to support some king of crazy method call lookahead).

from elastic4s.

ppat commented on May 20, 2024

One thing I'd like to see is to be able to work with data structures that may or may not have values. Options, empty collections, etc.

If we have:

        val userId: Long = 111111111L
        val since: Option[DateTime] = None
        val exclusions: Array[Long] = Array()
        bool {
          must ( termFilter("user_id", userId) )
          must ( rangeFilter("date").gte(since)) )
          not ( termsFilter("record_id", exclusions: _*) )
        }

So if we have an empty array for exclusions, it won't add the not ( termsFilter ) as it doesn't make sense. Similary, the optional date is None so the must ( rangeFilter ) does not need to be added. Also it would be nice if we can automatically convert to ES standard datetime from JodaTime or java.util.Date (but that I'm not very interested in). Without this functionality, it clutters up an otherwise really nice and easy to read DSL with too many conditionals. Thoughts?

Also AND/OR filters should be nice as they have different performance characteristics where sometimes they make more sense than BOOL. I'd be happy to help out and submit a pull request for any/all of these, but want to hear your thoughts.

from elastic4s.

sksamuel commented on May 20, 2024

I think that's a great idea. I think the empty array stuff should be mostly handled by ES already, but I could be wrong on that. Definitely support for Option and Dates isn't supported. Similarly most Scala collections aren't supported, as at the moment we tend to force people to use var args through : _*

from elastic4s.

ppat commented on May 20, 2024

@sksamuel, I'd be happy to take a crack at this. Sorry it took a while to get around to it (mostly because we're only upgrading to 1.x now from 0.90).

from elastic4s.

sksamuel commented on May 20, 2024

Yeah that would be wicked thanks.

On 20 March 2014 18:13, ppat [email protected] wrote:

@sksamuel https://github.com/sksamuel, I'd be happy to take a crack at
this. Sorry it took a while to get around to it (mostly because we're only
upgrading to 1.x now from 0.90).

Reply to this email directly or view it on GitHubhttps://github.com//issues/44#issuecomment-38202060
.

from elastic4s.

ppat commented on May 20, 2024

There are some quirks we should discuss though. We should happen if an optional value is not present? Should that filter or query segment not be added? Does that make sense in all scenarios? What are the exceptions?

  must (
    termFilter("user_id", userId),
    termFilter("name", name)    
    termFilter("email", email)
  )

In this case, if userId is None, we can remove that termFilter. But what happens all three values are None, should the parent level must filter be also be not included? Does that even make sense at that point query wise?

I'll think more about it but if you or anyone else has ideas, feel free to share.

from elastic4s.

sksamuel commented on May 20, 2024

I think the java client will ignore it if you put in an empty filter. Eg if
all 3 terms were none we'd be setting an empty array, or shouldn't set
anything at all.

On 20 March 2014 18:17, ppat [email protected] wrote:

There are some quirks we should discuss though. We should happen if an
optional value is not present? Should that filter or query segment not be
added? Does that make sense in all scenarios? What are the exceptions?

must (
termFilter("user_id", userId),
termFilter("name", name)
termFilter("email", email)
)

In this case, if userId is None, we can remove that termFilter. But what
happens all three values are None, should the parent level must filter be
also be not included? Does that even make sense at that point query wise?

I'll think more about it but if you or anyone else has ideas, feel free to
share.

Reply to this email directly or view it on GitHubhttps://github.com//issues/44#issuecomment-38202586
.

from elastic4s.

k0ala commented on May 20, 2024

If the call for improvements is still open, the thing that I would like the most is more functionality on the result handling side of things. i.e., once my operation (written in beautiful dsl) has completed, it would be great to be able to inspect the results just as easily, without having to deal with the ES Java API too much.

from elastic4s.

sksamuel commented on May 20, 2024

The reason I never did responses in Scala originally was because I found
the Java ones to be fairly good already. Although that was back in release
0.19. I think it was release 1.0 when a lot of the names started to be
getXXX rather than just xxx like Scala. Which ones in particular are you
unhappy with ?

On 27 May 2014 10:44, k0ala [email protected] wrote:

If the call for improvements is still open, the thing that I would like
the most is more functionality on the result handling side of things. i.e.,
once my operation (written in beautiful dsl) has completed, it would be
great to be able to inspect the results just as easily, without having to
deal with the ES Java API too much.

—
Reply to this email directly or view it on GitHubhttps://github.com//issues/44#issuecomment-44254632
.

from elastic4s.

k0ala commented on May 20, 2024

For instance, I find the results from aggregation searches a bit cumbersome to inspect.

from elastic4s.

sksamuel commented on May 20, 2024

I don't use them personally in my projects so have not ran across your frustrations. Can you give me a concrete example of before and your ideal after?

from elastic4s.

k0ala commented on May 20, 2024

Frustrations is too big a word :) I just think it reduces the beauty of the dsl a bit.

For instance, consider the following nested aggregation, getting for each country the most frequent terms in city descriptions:

val freqWords = client.execute { 
    search in "places/cities" aggs { 
        aggregation terms "by_country" field "country" aggs {
            aggregation terms "frequent_words" field "content"
        }
    } 
}

Currently I typically have to find out about elasticsearch internals, and do some casting:

freqWords onComplete {
    case Success(value) => { 
        val byCountry = value.getAggregations.get("by_country").asInstanceOf[Terms] 
        byCountry.getBuckets.asScala foreach (country => {
            val countryWords = country.getAggregations.get("frequent_words").asInstanceOf[Terms] 
            print(country.getKey + ": ")
            countryWords.getBuckets.asScala foreach (countryWord => print(countryWord.getKey + " "))
            println("")
        })  
    }
    case Failure(excep) => { println("Query failed"); excep.printStackTrace }
}

The following would have a nicer feel:

freqWords onComplete {
    case Success(value) => { 
        value.aggs("by_country").buckets foreach (country => {
            print(country.key + ": ")
            country.aggs("frequent_words").buckets foreach (countryWord => print(" " + countryWord.key)
            println("")
        })  
    }
    case Failure(excep) => { println("Query failed"); excep.printStackTrace }
}

Although perhaps I am doing something wrong (relatively new to Scala and very new to elasticsearch).

from elastic4s.

sksamuel commented on May 20, 2024

I'll have a play.

from elastic4s.

sksamuel commented on May 20, 2024

Actually you can help by making a new ticket with details of what you'd like to see like this and then I'll add it.

from elastic4s.

Calling all users about elastic4s HOT 16 CLOSED

Comments (16)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent