human_regex's Introduction

Hi, I'm Chris! (he/him)

I'm currently an Associate Professor in the Department of Mechanical Engineering at Carnegie Mellon University. Here are a few of the things I'm excited about:

  • 🤖 machine learning for engineering design
  • 🗣 sociotechnical systems
  • 🧑‍💻 human-AI collaboration
  • 🦀 Learning Rust

Let's talk! Send me an email to get in touch.

human_regex's People

Contributors

cmccomb, dylan-dpc, pacificbird

human_regex's Issues

Expand cookbook

The cookbook currently contains four examples, two of which are trivial (matching HTML tags and dates) and two of which are a little more interesting (capturing citation info, removing stopwords). What are other common cases to demonstrate?

Thank you for the inspiration

I would have preferred to post this in a discussion section but it is not open in this repo.

I'm totally regex illiterate, and I started a project thinking I'd learn a little while trying to build the library. I made a simplified version by adding the easy parts, and I left the credit in place.

I made this project in C#, but I took care to keep the syntax similar. Congratulations on your project.

https://github.com/GroophyLifefor/HumanRegex.NET

Refactor generics

Standardize the interface to minimize generic use. Currently captures and repetitions use a generic (T: Into<String> + fmt::Display) which allows str, String, and HumanRegex as inputs. However, standardizing towards HumanRegex inputs only would enable more consistent usage - all raw text inputs would have to pass through the text or direct_regex functions, which would ensure more consistent escaping of special characters.
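
A minimal sketch of what the standardized interface might look like, assuming a simplified stand-in for HumanRegex that only wraps a pattern string. The struct layout, the escaping set, and the one_or_more signature here are assumptions for illustration, not the crate's actual implementation:

```rust
/// Hypothetical stand-in for the crate's builder type; it only wraps a pattern string.
#[derive(Debug, Clone, PartialEq)]
pub struct HumanRegex(String);

impl HumanRegex {
    /// Borrow the underlying pattern.
    pub fn as_str(&self) -> &str {
        &self.0
    }
}

/// Treat the input as literal text: every regex metacharacter is escaped.
pub fn text(raw: &str) -> HumanRegex {
    let mut escaped = String::with_capacity(raw.len());
    for c in raw.chars() {
        // Assumed set of metacharacters; adjust to match the target engine.
        if r"\.+*?()|[]{}^$#&-~".contains(c) {
            escaped.push('\\');
        }
        escaped.push(c);
    }
    HumanRegex(escaped)
}

/// Treat the input as an already valid pattern: nothing is escaped.
pub fn direct_regex(raw: &str) -> HumanRegex {
    HumanRegex(raw.to_string())
}

/// Accepts only `HumanRegex`, so a bare `&str` cannot bypass `text` or `direct_regex`.
pub fn one_or_more(target: HumanRegex) -> HumanRegex {
    HumanRegex(format!("(?:{})+", target.0))
}
```

With this shape, one_or_more("a.b") would fail to compile, and the caller has to choose between one_or_more(text("a.b")) and one_or_more(direct_regex("a.b")), which makes the escaping decision explicit.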

Adding negation?

One way to do this would be to return a unit struct from some functions and overload the ! operator. This should not break the current API. You could also make the + operator convert the type to a string immediately, e.g.:

fn add<T: Into<String>>(_: T) -> Self {}

Intended usage:

let regex = hex() + !digit();
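
A rough sketch of how this proposal could fit together, assuming hypothetical Digit and NonDigit unit structs and a simplified HumanRegex that wraps a pattern string. None of these definitions are taken from the crate; they only show that the ! and + overloads can coexist:

```rust
use std::ops::{Add, Not};

/// Hypothetical stand-in for the crate's builder type.
#[derive(Debug, Clone, PartialEq)]
pub struct HumanRegex(String);

impl From<HumanRegex> for String {
    fn from(r: HumanRegex) -> String {
        r.0
    }
}

/// Unit structs for a character class and its negation (assumed names).
#[derive(Debug, Clone, Copy)]
pub struct Digit;
#[derive(Debug, Clone, Copy)]
pub struct NonDigit;

// `!digit()` yields the negated class, and `!!digit()` yields the original again.
impl Not for Digit {
    type Output = NonDigit;
    fn not(self) -> NonDigit {
        NonDigit
    }
}
impl Not for NonDigit {
    type Output = Digit;
    fn not(self) -> Digit {
        Digit
    }
}

impl From<Digit> for String {
    fn from(_: Digit) -> String {
        r"\d".to_string()
    }
}
impl From<NonDigit> for String {
    fn from(_: NonDigit) -> String {
        r"\D".to_string()
    }
}

pub fn digit() -> Digit {
    Digit
}

pub fn hex() -> HumanRegex {
    HumanRegex("[0-9A-Fa-f]".to_string())
}

/// `+` converts the right-hand side to a string immediately, as suggested above.
impl<T: Into<String>> Add<T> for HumanRegex {
    type Output = HumanRegex;
    fn add(self, rhs: T) -> HumanRegex {
        HumanRegex(format!("{}{}", self.0, rhs.into()))
    }
}
```

Because unary ! binds more tightly than +, hex() + !digit() negates the digit class first and then concatenates, giving the pattern [0-9A-Fa-f]\D in this sketch.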

Change lazy implementation

The current implementation of lazy() as a method of HumanRegex has potential for misuse: the lazy modifier can be added in places where it doesn't make sense, potentially leading to undefined behavior. This should be updated to include direct functions that do lazy repetitions (i.e., lazy_one_or_more(target)) to ensure defined behavior.
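
As a sketch of the direct-function approach, assuming a simplified HumanRegex that wraps a pattern string: only lazy_one_or_more is named in this issue, and the other function names and the non-capturing group wrapping are illustrative assumptions.

```rust
/// Hypothetical stand-in for the crate's builder type.
#[derive(Debug, Clone, PartialEq)]
pub struct HumanRegex(String);

impl HumanRegex {
    pub fn as_str(&self) -> &str {
        &self.0
    }
}

/// Example target: a single digit.
pub fn digit() -> HumanRegex {
    HumanRegex(r"\d".to_string())
}

/// Greedy repetition: match as many repetitions as possible.
pub fn one_or_more(target: HumanRegex) -> HumanRegex {
    HumanRegex(format!("(?:{})+", target.0))
}

/// Lazy repetition: the `?` is attached by the function itself, so it can
/// only ever follow a quantifier.
pub fn lazy_one_or_more(target: HumanRegex) -> HumanRegex {
    HumanRegex(format!("(?:{})+?", target.0))
}

/// Lazy variant of zero-or-more, built the same way.
pub fn lazy_zero_or_more(target: HumanRegex) -> HumanRegex {
    HumanRegex(format!("(?:{})*?", target.0))
}
```

Since there is no standalone lazy() method in this sketch, there is no way to append the lazy marker to something that is not a repetition.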

Add support for unicode character classes

\pN           One-letter name Unicode character class
\p{Greek}     Unicode character class (general category or script)
\PN           Negated one-letter name Unicode character class
\P{Greek}     Negated Unicode character class (general category or script)
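
One possible shape for this feature is a pair of helper functions that emit the four forms listed above. The names unicode_class and non_unicode_class are hypothetical, and the sketch returns plain strings rather than the crate's own type:

```rust
/// Build `\pN` for one-letter names and `\p{Name}` otherwise.
/// `prefix` is 'p' for the class itself and 'P' for its negation.
fn class(prefix: char, name: &str) -> String {
    if name.chars().count() == 1 {
        format!(r"\{}{}", prefix, name)
    } else {
        format!(r"\{}{{{}}}", prefix, name)
    }
}

/// Unicode character class (general category or script), e.g. "N" or "Greek".
/// The name is not validated here; that is left to the regex engine.
pub fn unicode_class(name: &str) -> String {
    class('p', name)
}

/// Negated Unicode character class.
pub fn non_unicode_class(name: &str) -> String {
    class('P', name)
}
```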

Add flag capabilities

i     case-insensitive: letters match both upper and lower case
m     multi-line mode: ^ and $ match begin/end of line
s     allow . to match \n
U     swap the meaning of x* and x*?
u     Unicode support (enabled by default)
x     ignore whitespace and allow line comments (starting with `#`)
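
A sketch of how the flags above might be exposed, assuming a hypothetical Flag enum and a with_flags helper that wraps a pattern in a scoped (?flags:...) group. Both names are illustrative, and the helper works on plain strings rather than the crate's own type:

```rust
/// One variant per flag letter in the table above (variant names are assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Flag {
    CaseInsensitive,
    MultiLine,
    DotMatchesNewLine,
    SwapGreed,
    Unicode,
    IgnoreWhitespace,
}

impl Flag {
    /// The single-letter form used inside the pattern.
    pub fn letter(self) -> char {
        match self {
            Flag::CaseInsensitive => 'i',
            Flag::MultiLine => 'm',
            Flag::DotMatchesNewLine => 's',
            Flag::SwapGreed => 'U',
            Flag::Unicode => 'u',
            Flag::IgnoreWhitespace => 'x',
        }
    }
}

/// Wrap `pattern` in a non-capturing group with the given flags enabled,
/// so the flags apply only inside that group.
pub fn with_flags(flags: &[Flag], pattern: &str) -> String {
    let letters: String = flags.iter().map(|f| f.letter()).collect();
    if letters.is_empty() {
        format!("(?:{})", pattern)
    } else {
        format!("(?{}:{})", letters, pattern)
    }
}
```

Scoping the flags to a group keeps them from leaking into the rest of a larger expression when patterns are concatenated.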

Implement negation on `HumanRegex<SymbolChain>`

Out of all the operations that can be done on regular languages, finding the complement is definitely the most difficult (or at least the least obvious). I think having one that doesn't rely on runtime lookarounds (which technically make regular expressions not regular, and aren't supported by the Regex crate anyways) would be an extremely unique feature that I'm not sure if any other Regex library out there has.
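
For reference, the textbook route to a lookaround-free complement goes through a complete deterministic automaton: flip which states are accepting, and the automaton accepts exactly the strings it used to reject. The sketch below shows only that step on a hand-built DFA. The Dfa type and the contains_ab example are hypothetical, and converting a HumanRegex to a DFA and back to a pattern (the hard part) is not shown:

```rust
/// A complete DFA over a fixed alphabet: every (state, symbol) pair has a transition.
pub struct Dfa {
    alphabet: Vec<char>,
    /// transitions[state][symbol_index] is the next state.
    transitions: Vec<Vec<usize>>,
    start: usize,
    accepting: Vec<bool>,
}

impl Dfa {
    /// Run the automaton; input containing symbols outside the alphabet is rejected.
    pub fn accepts(&self, input: &str) -> bool {
        let mut state = self.start;
        for c in input.chars() {
            match self.alphabet.iter().position(|&a| a == c) {
                Some(i) => state = self.transitions[state][i],
                None => return false,
            }
        }
        self.accepting[state]
    }

    /// Complement relative to strings over the same alphabet: flip every
    /// accepting flag. This is only correct because the DFA is complete.
    pub fn complement(&self) -> Dfa {
        Dfa {
            alphabet: self.alphabet.clone(),
            transitions: self.transitions.clone(),
            start: self.start,
            accepting: self.accepting.iter().map(|&a| !a).collect(),
        }
    }
}

/// Example DFA over {a, b} accepting strings that contain "ab".
/// State 0: no progress; state 1: last symbol was 'a'; state 2: "ab" seen.
pub fn contains_ab() -> Dfa {
    Dfa {
        alphabet: vec!['a', 'b'],
        transitions: vec![vec![1, 0], vec![1, 2], vec![2, 2]],
        start: 0,
        accepting: vec![false, false, true],
    }
}
```

Here contains_ab().complement() accepts exactly the strings over {a, b} that do not contain "ab", without any lookaround at match time.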
