MSFL

The Mattermark Semantic Filter Language (MSFL) is a language for filtering data. It allows users to construct filters on sets of data in a Venn-diagram-like fashion. More formally, it allows the user to define subsets of the data.

MSFL differs from many other languages in that it is not a query language. It has no notion of concepts like order, limit, offset, group_by, or having.
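To make the subset idea concrete, here is a minimal sketch in plain Ruby. It does not use the msfl gem: the `matches?` helper and the sample records are hypothetical, and the sketch handles only flat filters made of equality tests and comparison operators.

```ruby
# Hypothetical in-memory records; the field names mirror the car examples in this README.
cars = [
  { make: "Chevy",  year: 1998 },
  { make: "Chevy",  year: 2012 },
  { make: "Toyota", year: 2014 }
]

# Hypothetical helper (not part of the msfl gem): true when a record satisfies
# every clause of a flat filter.
def matches?(record, filter)
  filter.all? do |field, expected|
    actual = record[field]
    if expected.is_a?(Hash)
      # A hash value holds one or more comparison operators for the field.
      expected.all? do |op, value|
        case op
        when :gt  then actual > value
        when :gte then actual >= value
        when :lt  then actual < value
        when :lte then actual <= value
        when :eq  then actual == value
        else raise ArgumentError, "unsupported operator: #{op}"
        end
      end
    else
      # A scalar value is an equality test.
      actual == expected
    end
  end
end

subset = cars.select { |car| matches?(car, { make: "Chevy", year: { gt: 2000 } }) }
# subset holds only the 2012 Chevy
```

The filter only decides membership. Ordering, limiting, or grouping the resulting subset is left to whatever consumes it.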

It does support faceted filtering through a vocabulary added in 1.2, which allows the user to define filters on the hyperset of the data. (In the non-hyper case, all of the extra dimensions are simplified to constants.) If you don't know what faceted filtering is, or this sounds overwhelming, just ignore this bit. By the time you need faceted filtering in your application, it will seem like second nature.

Ruby Gem for the Mattermark Semantic Filter Language

Contains serializers, validators, and perhaps other MSFL goodies.

Converters

One aspect of the msfl gem is the conversion of standard MSFL into what we internally call NMSFL (normalized Mattermark Semantic Filter Language). NMSFL serves as an intermediate language in which some of the human-friendly conveniences are replaced with a syntax that is easier for machines to parse.

A noop conversion

require 'msfl'
# Require one of the test datasets
require 'msfl/datasets/car'

msfl      = { make: "Chevy" }
converter = MSFL::Converters::Operator.new
nmsfl     = converter.run_conversions msfl

=> {:make=>"Chevy"}

A conversion from an implicit AND to an explicit one

require 'msfl'
# Require one of the test datasets
require 'msfl/datasets/car'

msfl      = { make: "Chevy", year: { gt: 2000 } }
converter = MSFL::Converters::Operator.new
nmsfl     = converter.run_conversions msfl

=> {:and=>#<MSFL::Types::Set: {{:make=>"Chevy"}, {:year=>{:gt=>2000}}}>}

A conversion from between to gte / lte

require 'msfl'
# Require one of the test datasets
require 'msfl/datasets/car'

msfl      = { year: { between: { start: 2010, end: 2015 } } }
converter = MSFL::Converters::Operator.new
nmsfl     = converter.run_conversions msfl

=> {:and=>#<MSFL::Types::Set: {{:year=>{:gte=>2010}}, {:year=>{:lte=>2015}}}>}
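The three conversions above can be approximated in plain Ruby as follows. This is a sketch under stated assumptions, not the gem's implementation: `normalize` and `expand_between` are hypothetical names, the standard library `Set` stands in for `MSFL::Types::Set`, and only top-level fields are handled.

```ruby
require 'set'

# Hypothetical helper: rewrites a between range into a gte clause and an lte clause.
def expand_between(field, range)
  [{ field => { gte: range[:start] } }, { field => { lte: range[:end] } }]
end

# Hypothetical stand-in for the converter: returns a normalized copy of a flat filter.
def normalize(filter)
  clauses = filter.flat_map do |field, value|
    if value.is_a?(Hash) && value.key?(:between)
      expand_between(field, value[:between])
    else
      [{ field => value }]
    end
  end
  # A single clause needs no wrapper (the no-op case);
  # several clauses become an explicit AND over a set.
  clauses.size == 1 ? clauses.first : { and: Set.new(clauses) }
end

noop     = normalize({ make: "Chevy" })
explicit = normalize({ make: "Chevy", year: { gt: 2000 } })
ranged   = normalize({ year: { between: { start: 2010, end: 2015 } } })
```

Here `noop` is returned unchanged, while `explicit` and `ranged` each become a hash with a single `:and` key whose value is a set of single-field clauses.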

EBNF

MSFL is a context-free language. The context-free grammar is defined below.

# EXPRESSIONS

filter          =   lc , { filter_expr } , rc ;

filter_expr     =   range_expr
                |   binary_expr
                |   set_expr
                |   partial_expr
                |   foreign_expr ;

range_expr      =   between ;

binary_expr     =   comparisons
                |   containment ;

set_expr        =   and
                |   or ;

partial_expr    =   partial_op , colon , partial ;

foreign_expr    =   foreign_op , colon , foreign_filter ;

foreign_filter  =   lc , dataset_expr , comma , partial_filter , rc ;

dataset_expr    =   dataset_op , colon , word ;

partial         =   lc , given_expr , comma , partial_filter , rc ;

given_expr      =   given_op , colon , filter ;

partial_filter  =   filter_op , colon , filter ;

between         =   value , colon , start_end
                |   value , colon , between_body ;

comparisons     =   comparison , { comma , comparison } ;

containment     =   word , colon , in_expr ;

and             =   and_op , colon , filters ;

or              =   or_op , colon , filters ;

comparison      =   word , colon , value
                |   word , colon , lc , comparison_list , rc ;

comparison_list =   comparison_expr , { comma , comparison_expr } ;

comparison_expr =   comparison_op , colon , value ;

in_expr         =   lc , in_op , colon , values , rc ;

filters         =   ls , { filter } , rs ;

values          =   ls , { value } , rs ;

between_body    =   lc , between_op , colon , start_end , rc ;

start_end       =   lc , start_expr , comma , end_expr , rc ;

start_expr      =   start_op , colon , range_value ;

end_expr        =   end_op , colon , range_value ;



# OPERATORS

partial_op      =   dq , "partial" , dq ;

given_op        =   dq , "given" , dq ;

filter_op       =   dq , "filter" , dq ;

in_op           =   dq , "in" , dq ;

between_op      =   dq , "between" , dq ;

start_op        =   dq , "start" , dq ;

end_op          =   dq , "end" , dq ;

comparison_op   =   lt_op
                |   gt_op
                |   lte_op
                |   gte_op
                |   eq_op ;

lt_op           =   dq , "lt" , dq ;

gt_op           =   dq , "gt" , dq ;

lte_op          =   dq , "lte" , dq ;

gte_op          =   dq , "gte" , dq ;

eq_op           =   dq , "eq" , dq ;

and_op          =   dq , "and" , dq ;

or_op           =   dq , "or" , dq ;

foreign_op      =   dq , "foreign" , dq ;

dataset_op      =   dq , "dataset" , dq ;



# VALUES AND TYPES

range_value     =   number
                |   date
                |   datetime
                |   time ;

value           =   word
                |   range_value
                |   boolean ;

word            =   dq , character , { character } , dq ;

number          =   integer | decimal ;

integer         =   [ hyphen ] , digit , { digit } ;

decimal         =   integer
                |   [ integer ] , dot , { digit } ;

boolean         =   true | false ;

true            =   "true"
                |   dq , "true" , dq
                |   "1"
                |   dq , "1" , dq ;

false           =   "false"
                |   dq , "false" , dq
                |   "0"
                |   dq , "0" , dq ;

date            =   ? ISO 8601 date format http://en.wikipedia.org/wiki/ISO_8601 ? ;

datetime        =   ? ISO 8601 combined date and time format http://en.wikipedia.org/wiki/ISO_8601 ? ;

time            =   ? ISO 8601 time format http://en.wikipedia.org/wiki/ISO_8601 ? ;

character       =   letter
                |   digit
                |   symbol ;

letter          =   "A" | "B" | "C" | "D" | "E" | "F" | "G"
                |   "H" | "I" | "J" | "K" | "L" | "M" | "N"
                |   "O" | "P" | "Q" | "R" | "S" | "T" | "U"
                |   "V" | "W" | "X" | "Y" | "Z"
                |   "a" | "b" | "c" | "d" | "e" | "f" | "g"
                |   "h" | "i" | "j" | "k" | "l" | "m" | "n"
                |   "o" | "p" | "q" | "r" | "s" | "t" | "u"
                |   "v" | "w" | "x" | "y" | "z" ;

digit           =   "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" ;

symbol          =   "'" | "~" | "." | "_" | "-" | ":" | "?" | "/" | "=" | "@" | "&" ;

left_curly      =   "{" ;

lc              =   left_curly ;

right_curly     =   "}" ;

rc              =   right_curly ;

left_square     =   "[" ;

ls              =   left_square ;

right_square    =   "]" ;

rs              =   right_square ;

comma           =   "," ;

hyphen          =   "-" ;

colon           =   ":" ;

double_quote    =   '"' ;

dq              =   double_quote ;

dot             =   "." ;
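As an illustration of the concrete syntax, the sketch below writes a filter in the JSON encoding and reads it into a Ruby hash. It uses only Ruby's standard `json` library rather than the gem's parser, whose interface is not shown in this README, and the field names and values are illustrative.

```ruby
require 'json'

# A filter combining a containment expression ("in") and a range expression ("between").
json = '{ "make": { "in": ["Chevy", "Honda"] }, "year": { "between": { "start": 2010, "end": 2015 } } }'

# symbolize_names produces symbol keys, matching the Ruby examples earlier in this README.
filter = JSON.parse(json, symbolize_names: true)

allowed_makes = filter[:make][:in]
year_range    = filter[:year][:between]
```

After parsing, `allowed_makes` is an array of two strings and `year_range` is a hash with `:start` and `:end` keys.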

Configuration

All configuration options should be set in a block passed to MSFL.configure, for example MSFL.configure { |c| c.datasets = [] }

Naturally, you should provide an appropriate array of the datasets you support.

As additional configuration settings are added they will be set similarly.
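For readers unfamiliar with block-style configuration in Ruby, the pattern can be sketched as below. `ConfigSketch` is a hypothetical module written for illustration; it is not the gem's code, and the `"cars"` entry is a placeholder for a real dataset.

```ruby
# Hypothetical module showing the block-style configuration pattern.
module ConfigSketch
  # A struct with one writable setting; further settings would be added as members.
  Configuration = Struct.new(:datasets)

  # Lazily creates the single shared configuration object.
  def self.configuration
    @configuration ||= Configuration.new([])
  end

  # Yields the configuration so the caller's block can assign settings on it.
  def self.configure
    yield configuration
    configuration
  end
end

ConfigSketch.configure { |c| c.datasets = ["cars"] }
configured = ConfigSketch.configuration.datasets
```

Because the block receives the configuration object itself, a new setting only requires another member on the struct; callers keep using the same `configure` call.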

Converters

The MSFL converters provide convenience methods for transforming a parsed MSFL tree into a different structure that is logically equivalent. The intent is to let consumers of MSFL reshape parsed filters into whatever form adapts most easily or efficiently to the storage mechanism on which the filtering is performed.

Note that the order in which converters are run is controlled by the constant MSFL::Converters::CONVERSIONS and cannot be changed through configuration. This behavior is currently necessary for ease of implementation but is unlikely to remain the status quo.

Datasets

The consumer of the MSFL gem defines one or more datasets. The dataset definition enumerates the supported fields and their types.
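A dataset definition of this kind might look roughly like the sketch below. The class name, constant, and method names are hypothetical and are not taken from the gem; the sketch only shows the idea of enumerating fields together with their types.

```ruby
# Hypothetical dataset definition: a fixed list of fields, each with an expected Ruby type.
class CarDatasetSketch
  FIELD_TYPES = { make: String, year: Integer }.freeze

  # The fields a filter may reference.
  def fields
    FIELD_TYPES.keys
  end

  # True when the field is known and the value has the declared type.
  def valid_value?(field, value)
    FIELD_TYPES.key?(field) && value.is_a?(FIELD_TYPES[field])
  end
end

dataset = CarDatasetSketch.new
dataset.fields                        # the two declared fields
dataset.valid_value?(:year, 2012)     # an Integer for an Integer field
dataset.valid_value?(:year, "2012")   # a String for an Integer field is rejected
```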

Parsers

Currently there is only a parser for the JSON encoding of MSFL filters. Any additional parsers will also be placed under this directory.

Types

Because certain types carry behavioral limitations (currently Sets are the only example), there is a dedicated folder in which types are defined.

Validators

After an MSFL filter is parsed, it can be validated. Currently the validation is primitive. The intent is to enable semantic validation on a per-dataset basis. This will allow per-attribute validations to be set up by the consumer of this gem and run automatically during validation.

Validation works in the following order:

  1. Grammar validation

  2. Dataset configured validation

  3. Dataset semantic validation
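The ordering above can be sketched as a list of stages run one after another. Everything here is an assumption made for illustration: what each stage checks, the `first_errors` helper, the `known_fields` table, and the choice to stop at the first failing stage are not taken from the gem.

```ruby
# Hypothetical dataset definition used by the later stages.
known_fields = { make: String, year: Integer }

stages = [
  # 1. Grammar validation (here: the filter must be a non-empty hash).
  ->(filter) { filter.is_a?(Hash) && !filter.empty? ? [] : ["filter must be a non-empty hash"] },
  # 2. Dataset configured validation (here: every field must be known to the dataset).
  ->(filter) { (filter.keys - known_fields.keys).map { |field| "unknown field: #{field}" } },
  # 3. Dataset semantic validation (here: scalar values must match the declared type).
  lambda do |filter|
    mismatched = filter.reject { |field, value| value.is_a?(Hash) || value.is_a?(known_fields[field]) }
    mismatched.keys.map { |field| "wrong type for #{field}" }
  end
]

# Runs the stages in order and returns the errors from the first stage that fails,
# so later stages can rely on what earlier ones established.
def first_errors(stages, filter)
  stages.each do |stage|
    errors = stage.call(filter)
    return errors unless errors.empty?
  end
  []
end

ok_errors      = first_errors(stages, { make: "Chevy", year: { gt: 2000 } })
unknown_errors = first_errors(stages, { color: "red" })
type_errors    = first_errors(stages, { year: "2012" })
```

Running the cheaper structural checks first means the semantic stage never sees a field the dataset does not define.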

Frameworks

Sinatra

There are several helper methods for using this gem with Sinatra. You can register the helpers in your Sinatra app by adding the following inside your application's class.

# This should actually be Sinatra::MSFL but there are some namespacing issues with MSFL currently that prevented
# this from being the v0 implementation. This will change in the near future.
register MSFL::Sinatra

Contributors

caldwecr
