foundatiofx / foundatio.parsers

A lucene style query parser that is extensible and allows modifying the query.

Home Page: https://www.nuget.org/packages/Foundatio.Parsers.LuceneQueries/

License: Apache License 2.0

C# 100.00%
lucene parsers parse elasticsearch aggregation query foundatio c-sharp pegasus peg aliases macros

foundatio.parsers's Introduction

A lucene style query parser that is extensible and allows additional syntax features. Also includes an Elasticsearch query_string query replacement that greatly enhances its capabilities for dynamic queries.

Getting Started (Development)

This package can be installed via the NuGet package manager. If you need help, please contact us via in-app support or open an issue. We’re always here to help if you have any questions!

  1. You will need to have Visual Studio Code installed.
  2. Open the Foundatio.Parsers.sln Visual Studio solution file.

Using LuceneQueryParser

Below is a small sampling of the things you can accomplish with LuceneQueryParser, so check it out! We use this library extensively in Exceptionless!

In the sample below we will parse a query, output its structure using the DebugQueryVisitor, and then regenerate the exact same query from the parse result.

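using System.Diagnostics;
using Foundatio.Parsers.LuceneQueries;
using Foundatio.Parsers.LuceneQueries.Visitors;

// Keep the original query text so it can be compared with the regenerated query later.
var query = "field:[1 TO 2]";
var parser = new LuceneQueryParser();
var result = parser.Parse(query);
Debug.WriteLine(DebugQueryVisitor.Run(result));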

Here is the parse result as shown by the DebugQueryVisitor:

Group:
  Left - Term: 
      TermMax: 2
      TermMin: 1
      MinInclusive: True
      MaxInclusive: True
      Field: 
          Name: field

Finally, let's translate the parse result back into the original query.

var generatedQuery = GenerateQueryVisitor.Run(result);
System.Diagnostics.Debug.Assert(query == generatedQuery);

Features

  • Lucene Query Syntax Parser
  • Field Aliases (static and dynamic)
  • Query Includes
    • Define stored queries that can be included inside other queries as macros that will be expanded
  • Validation
    • Validate query syntax
    • Restrict access to specific fields
    • Restrict the number of operations allowed
    • Restrict nesting depth
  • Elasticsearch
    • Elastic query string query replacement on steroids
    • Dynamic search and filter expressions
    • Dynamic aggregation expressions
      • Supported bucket aggregations: terms, geo grid, date histogram, numeric histogram
        • Bucket aggregations allow nesting other dynamic aggregations inside
      • Supported metric aggregations: min, max, avg, sum, stats, extended stats, cardinality, missing, percentiles
    • Dynamic sort expressions
    • Dynamic expressions can be exposed to end users to allow for custom searches, filters, sorting and aggregations
      • Enables allowing users to build custom views, charts and dashboards
      • Enables powerful APIs that allow users to do things you never thought of
    • Supports geo queries (proximity and radius)
      • mygeo:75044~75mi
        • Returns all documents that have a value in the mygeo field that is within a 75 mile radius of the 75044 zip code
    • Supports nested document mappings
    • Automatically resolves non-analyzed keyword sub-fields for sorting and aggregations
    • Aliases can be defined right on your NEST mappings
      • Supports both root and inner field name aliases

Thanks to all the people who have contributed

foundatio.parsers's People

Contributors

azure-pipelines[bot], dependabot-preview[bot], dependabot[bot], derekpitt, ejsmith, eugene-g, jimmysoares, lucidsage, niemyjski, randylsu, richardcockerill

foundatio.parsers's Issues

DefaultOperator not applied to all terms in GenerateQueryVisitor

If I set the default operator on the context on a call to GenerateQueryVisitor.Run then the operator is only being applied to the left side of the first GroupNode in the graph.

e.g.

IQueryNode parsedQuery = await parser.ParseAsync("term1 term2 term3");
var context = new QueryVisitorContext { DefaultOperator = GroupOperator.And };
string result = GenerateQueryVisitor.Run(parsedQuery, context); 

produces "term1 AND term2 term3" but should be "term1 AND term2 AND term3"

GenerateQueryVisitor needs to apply the operator to every GroupNode but this is not happening because GenerateQueryVisitor delegates to GroupNode.ToString(GroupOperator defaultOperator) which then just invokes ToString on the Left and Right.
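The behavior requested here can be sketched with a few hypothetical stand-in types. Note that `Node`, `Term`, `Group` and `QueryRenderer` below are illustrative assumptions and not the library's `GroupNode` API or the attached patch. The only point is that the default operator is passed down on every recursive call instead of being used once at the top level.

```csharp
using System;

// Hypothetical stand-ins for the library's query nodes; not the real GroupNode API.
public abstract class Node { }

public sealed class Term : Node
{
    public string Value { get; }
    public Term(string value) { Value = value; }
}

public sealed class Group : Node
{
    public Node Left { get; }
    public Node Right { get; }
    public string Operator { get; } // null means "use the default operator"

    public Group(Node left, Node right, string op = null)
    {
        Left = left;
        Right = right;
        Operator = op;
    }
}

public static class QueryRenderer
{
    // Renders the tree, handing the default operator to every nested group
    // rather than applying it only to the outermost one.
    public static string Render(Node node, string defaultOperator)
    {
        if (node is Term term)
            return term.Value;

        if (node is Group group)
        {
            string op = group.Operator ?? defaultOperator;
            return Render(group.Left, defaultOperator) + " " + op + " " + Render(group.Right, defaultOperator);
        }

        throw new ArgumentException("Unknown node type.", nameof(node));
    }
}
```

For a tree shaped like `new Group(new Term("term1"), new Group(new Term("term2"), new Term("term3")))`, `QueryRenderer.Render(tree, "AND")` returns `term1 AND term2 AND term3`.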

Tests and a fix attached

GenerateQueryVisitorDefaultOperator.patch

Wrong field displayed by DebugQueryVisitor after using FieldResolverQueryVisitor

Hello,

First of all, thanks for your amazing query parser!
I've noticed something strange in the DebugQueryVisitor with Lucene queries.
Consider the following code:

string query = "Paris AND SU:(Physique OR Musique)";
LuceneQueryParser luceneQueryParser = new LuceneQueryParser();

var parsedQuery = luceneQueryParser.Parse(query);
var dico = new Dictionary<string, string>()
{
	{ "SU", "Subject" }
};
var resolvedParsedQuery = FieldResolverQueryVisitor.Run(parsedQuery, dico);
Console.WriteLine(DebugQueryVisitor.Run(resolvedParsedQuery));

Now, resolvedParsedQuery contains a node that is pretty much the same as parsedQuery except that the "SU" field has been changed to "Subject". Still, running the DebugQueryVisitor on resolvedParsedQuery displays the following:

Group:
    Left - Term:
        IsQuoted: False
        Term: Paris
    Right - Group:
        IsNegated: False
        Field: SU
        Left - Term:
            IsQuoted: False
            Term: Physique
        Right - Term:
            IsQuoted: False
            Term: Musique
        Operator: Or
        Parens: true
        Data:
            @OriginalField: SU
    Operator: And

"Field" now has the value "Subject" in the resolvedParsedQuery graph (checked with Visual Studio Spy), which is as expected. However, the DebugQueryVisitor output still displays "SU" as the Field value, while "@OriginalField" is right.

Thanks for your attention,
Guillaume

Unable to parse query with escaped characters

ElasticSearch / Lucene allows escaping characters with a backslash:
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html

Feeding queries that contain backslash-escaped characters into an ElasticQueryParser fails with a parse exception. The same query string works when used in ElasticSearch directly. Test case below:

using System.Diagnostics;
using System.Threading.Tasks;
using Foundatio.Parsers.ElasticQueries;
using Foundatio.Parsers.ElasticQueries.Visitors;
using Foundatio.Parsers.LuceneQueries.Visitors;

namespace QueryEscapingTest
{
    class Program
    {
        static void Main(string[] args)
        {
            var simple = ParseAndRewrite("normal").Result;
            Debug.Assert(simple == "normal");

            var escaped = ParseAndRewrite("\\\"escaped").Result;
            Debug.Assert(escaped == "\\\"escaped");
        }

        static async Task<string> ParseAndRewrite(string query)
        {
            QueryFieldResolver resolver = (field) =>
            {
                return field;
            };

            ElasticQueryVisitorContext context = new ElasticQueryVisitorContext { QueryType = QueryType.Query };

            var parser = new ElasticQueryParser(conf => 
                conf.UseFieldResolver(resolver)
                    .UseValidation(async info => true)
            );

            var queryNode = await parser.ParseAsync(query, context);
            return queryNode.ToString();
        }

    }
}

Project file:
QueryEscapingTest.zip
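For reference, the backslash escaping this issue relies on can be sketched as below. This is a minimal, self-contained helper, not part of Foundatio.Parsers; the `LuceneEscaper` name is hypothetical. The reserved-character list follows the Elasticsearch query_string documentation, and the helper conservatively escapes single `&` and `|` characters even though only `&&` and `||` are reserved. `<` and `>` cannot be escaped at all, so this sketch drops them.

```csharp
using System.Text;

// Hypothetical helper for illustration; not part of Foundatio.Parsers.
public static class LuceneEscaper
{
    // Characters that get a backslash prefix: + - = & | ! ( ) { } [ ] ^ " ~ * ? : \ /
    private const string Reserved = "+-=&|!(){}[]^\"~*?:\\/";

    public static string Escape(string input)
    {
        var builder = new StringBuilder(input.Length * 2);
        foreach (char c in input)
        {
            // '<' and '>' cannot be escaped in query_string syntax, so they are removed.
            if (c == '<' || c == '>')
                continue;

            if (Reserved.IndexOf(c) >= 0)
                builder.Append('\\');

            builder.Append(c);
        }
        return builder.ToString();
    }
}
```

With this helper, `LuceneEscaper.Escape("\"escaped")` yields the same `\"escaped` text that the test case above feeds to the parser, and `LuceneEscaper.Escape("normal")` returns the input unchanged.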

Look into Date Range parsing issue

There is an issue in this lib or in the DateRangeExtensions project where the following can't be parsed:

processor.BuildQueryAsync("field5:[2017-01-01T00:00:00Z TO 2017-01-31} OR field1:value1", ctx)

We also cannot parse:

2019-12-01T00:00:00Z-2019-12-02T00:00:00Z

cc: @randylsu

How to construct filtered elasticsearch query with aggregation

Hi, I'm working on building out an interface to an elasticsearch cluster, and we wanted to allow our users to specify a query, filter, and aggregation all for a single request using the lucene syntax.

This library seems perfect, but I can't quite figure out how to take those three separate lucene queries and construct a single elasticsearch query that describes the desired result in an efficient way.

An example would be something like:

query = "title:dog OR title:cat";
filter = "websiteId:9 formatClassification:1";
agg = "terms:subjects~50 terms:bisacCodes~50 terms:maturityLevel terms:format";

Converted to elasticsearch (might have some slightly off syntax here):

{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "title": {
              "value": "dog"
            }
          }
        },
        {
          "term": {
            "title": {
              "value": "cat"
            }
          }
        }
      ],
      "filter": [
        {
          "term": {
            "websiteId": {
              "value": "9"
            }
          }
        },
        {
          "term": {
            "formatClassification": {
              "value": "1"
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "subjects": {
      "terms": {
        "field": "subjects"
      }
    },
    "bisac": {
      "terms": {
        "field": "bisacCodes"
      }
    },
    "maturityLevel": {
      "terms": {
        "field": "maturityLevel"
      }
    },
    "format": {
      "terms": {
        "field": "format"
      }
    }
  }
}

Build is failing when using version 7.17.2

When we try to build our application with the new version 7.17.2, the build fails because it can't find the package Exceptionless.DateTimeExtensions. It seems like version 3.4.1 is not on the NuGet feed.

error NU1102: Unable to find package Exceptionless.DateTimeExtensions with version (>= 3.4.1)

Invalid queries return as valid

Hi there, me again. We are having some more issues with queries that Elasticsearch does not like but that this library's parser says are valid. This time it has to do with unterminated regex sequences and invalid proximity operators.

The problem is that sending these queries to ES throws an error, when this parser says they are OK.

I've made a repo with some failing tests of the cases: https://github.com/Issung/Foundatio.Parsers
And a pull request here: #64

Thanks for your speedy help on the last issue, it is much appreciated!

Can't generate a proper multi_match query

I'm trying to send the following query to ES using Foundatio Parser:

properties.prop*:666

so that any of the following matches:

properties.prop1:666
properties.prop2:666
...

The QueryContainer built contains the following:

{"bool":{"filter":[{"term":{"properties.prop*":{"value":"666"}}}]}}

which won't work because a wildcard in a field name requires a multi_match query.

Trying to fiddle with the nodes in Visitors and setting the following:

node.Field = null;
node.Prefix = "properties.prop*";

The QueryContainer then contains:

{"bool":{"filter":[{"multi_match":{"query":"666"}}]}}

It's missing the "fields" property next to the query.

How can I build the proper query container which looks like:

{"bool":{"filter":[{"multi_match":{"query":"666", "fields": [ "properties.prop*" ]}}]}}

Thanks.

NEST 8 support

Thank you for the great library.

Do you have any plans on supporting Nest 8.*? 😃

Lucene negation symbol "!" unrecognized as special symbol

Per the docs for Lucene 2.9.4, the query expression "jakarta apache" !"Apache Lucene" should be equivalent to "jakarta apache" NOT "Apache Lucene". Instead, "!" is identified as a term rather than as the negation of "Apache Lucene".

LuceneQueryParser fails to parse empty quoted strings (since version 7.10.2)

Hello,

I've built a quick RoslynPad sample to show you the problem:
When I try to parse a lucene query that contains an empty quoted string, I get an error with LuceneQueryParser v7.10.2 (and above). The same code is OK with LuceneQueryParser v7.10.1.

The following code is OK (version 7.10.1)

#r "nuget:Foundatio.Parsers.LuceneQueries/7.10.1"

var query = @"Author:Smith AND Title_idx:""""";

var foundatioParser = new Foundatio.Parsers.LuceneQueries.LuceneQueryParser();
var node1 = foundatioParser.Parse(query);
Console.WriteLine(node1.ToString());

The following code is not OK (see exception below):

#r "nuget:Foundatio.Parsers.LuceneQueries/7.10.2"

var query = @"Author:Smith AND Title_idx:""""";

var foundatioParser = new Foundatio.Parsers.LuceneQueries.LuceneQueryParser();
var node1 = foundatioParser.Parse(query);
Console.WriteLine(node1.ToString());

image

Hope this helps!

dynamic mapping instead of typed class mapping

Does your solution support dynamic mapping and dynamic search? When I search against a dynamic mapping using the dynamic type, the search executes but doesn't return any hits.
Alternatively, could you provide a test case where you create a dynamic mapping, index a dynamic type, and search using dynamic?

200 POST http://localhost:9200/testmes/_search?pretty=true&typed_keys=false

something like
var actualResponse = client.Search<dynamic>(d => d.Index(index).Query(q => result));
on dynamic mapping,
instead of
var actualResponse = client.Search<MyType>(d => d.Index(index).Query(q => result));

ElasticQueryParser ParseAsync does not throw exception for query with odd number of quotation marks.

We are using ElasticQueryParser's ParseAsync to check whether a query string is valid: if it throws, we know it is not. However, the parser does not throw for a string with an odd number of quotation marks, e.g. "MyFile, and when that query is sent off to Elasticsearch a 500 error is returned.
This check does throw for other invalid input, such as the search string test + other, but it does not catch these odd quotation mark errors. For now we count the quotation marks and manually escape them if the count is odd, rather than relying on the parser to catch that invalid syntax for us.

ElasticQueryVisitorContext Context = new ElasticQueryVisitorContext { QueryType = QueryType.Query, DefaultOperator = GroupOperator.Default };

ElasticQueryParser parser = new ElasticQueryParser(conf =>
  conf.UseValidation(info => Task.FromResult(ValidateQueryInfo(info)))
);

try
{
  IQueryNode queryNode = await parser.ParseAsync(query, Context);
}
catch (Exception ex)
{
  return false;
}

Let me know if you need any more info!
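The manual workaround described in this issue might look roughly like the sketch below. It is an assumption about what "escaping them if odd" means: the helper counts unescaped quotation marks and, when the count is odd, escapes only the last one. The `QuoteBalancer` name is hypothetical and the code does not use any Foundatio.Parsers API.

```csharp
// Hypothetical pre-check for illustration; not part of Foundatio.Parsers.
public static class QuoteBalancer
{
    // Returns the query unchanged when its unescaped quotation marks are balanced;
    // otherwise escapes the last unescaped quotation mark with a backslash.
    public static string EscapeUnbalancedQuote(string query)
    {
        int count = 0;
        int lastIndex = -1;

        for (int i = 0; i < query.Length; i++)
        {
            if (query[i] == '\\')
            {
                i++; // skip the character that the backslash escapes
                continue;
            }

            if (query[i] == '"')
            {
                count++;
                lastIndex = i;
            }
        }

        return count % 2 == 0 ? query : query.Insert(lastIndex, "\\");
    }
}
```

For the example above, `QuoteBalancer.EscapeUnbalancedQuote("\"MyFile")` turns `"MyFile` into `\"MyFile`, while a balanced input such as `"My File"` is returned as-is.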

Why async?

What is the rationale behind making everything async? It's a string parser, after all.

compatible Elastic search 6.0 version?

Hi, I run complex search filters on a locally installed ES 6.0 and I cannot get the right results. Did I configure something wrong, or has this library not yet been upgraded for ES 6.0?
