sql-regex's Introduction

Regular Expressions from "Large Scale Analysis of GitHub and CVEs to Determine Prevalence of SQL Concatenations"

This repository contains the regular expressions used to identify SQL concatenation in Java, PHP, and C# projects posted on GitHub. The file cannot be run on its own; it is meant to be used as part of a larger project that builds the complex regular expressions needed for our analysis.

Broadly speaking, there are three "types" of regular expressions: language-specific regex (e.g., identifying Java variables), SQL regex (e.g., identifying SQL keywords such as SELECT), and dynamically generated language-specific regex that combine the other two categories into complex regex that identify concatenation.

For an example of a dynamically generated regex, take the following expression:

public static final String CONCAT_VAR = WHITESPACE + QUOTE + "(?=" + WHITESPACE + "%s" + WHITESPACE + "%s" + ")";

The two "%s" strings are format string placeholders that are replaced with language-specific regex to create the complex CONCAT_VAR regex, which recognizes one type of concatenation with a SQL string. In this example, the first %s is substituted with a regex recognizing the target language's concatenation symbol (e.g., "+" in Java or "." in PHP), and the second %s is substituted with a regex for the target language's variable names (e.g., alphanumeric strings starting with a $ in PHP).
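The substitution can be sketched with String.format as follows. This is a minimal illustration, not the repository's code: the definitions of WHITESPACE and QUOTE, the four language-specific fragments, and the helper findsConcat are all assumptions made for the example; only the CONCAT_VAR template is taken from the text above.

```java
import java.util.regex.Pattern;

class ConcatVarSketch {
    // Assumed definitions; the repository's actual WHITESPACE and QUOTE may differ.
    static final String WHITESPACE = "\\s*";
    static final String QUOTE = "[\"']";

    // Template as shown in this README.
    static final String CONCAT_VAR =
        WHITESPACE + QUOTE + "(?=" + WHITESPACE + "%s" + WHITESPACE + "%s" + ")";

    // Hypothetical language-specific fragments, simplified for illustration.
    static final String JAVA_CONCAT = "\\+";
    static final String JAVA_VAR = "[A-Za-z_][A-Za-z0-9_]*";
    static final String PHP_CONCAT = "\\.";
    static final String PHP_VAR = "\\$[A-Za-z_][A-Za-z0-9_]*";

    // Fills both placeholders, then reports whether the source text contains
    // a closing quote followed by a concatenation operator and a variable.
    static boolean findsConcat(String concatRegex, String varRegex, String source) {
        String regex = String.format(CONCAT_VAR, concatRegex, varRegex);
        return Pattern.compile(regex).matcher(source).find();
    }

    public static void main(String[] args) {
        String javaLine = "\"SELECT * FROM users WHERE id = \" + userId";
        String phpLine = "\"SELECT * FROM users WHERE id = \" . $id";
        String constant = "\"SELECT * FROM users\"";

        System.out.println(findsConcat(JAVA_CONCAT, JAVA_VAR, javaLine)); // true
        System.out.println(findsConcat(PHP_CONCAT, PHP_VAR, phpLine));    // true
        System.out.println(findsConcat(JAVA_CONCAT, JAVA_VAR, constant)); // false
    }
}
```

Because the operator and variable sit inside a lookahead, the match itself ends at the closing quote; the concatenated variable is checked but not consumed, which leaves it available to whatever regex is appended next.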

This CONCAT_VAR regex can then be combined with a multitude of SQL regex to recognize concatenation in various locations. For example, by combining it with the SELECT regex, we can recognize locations where a column identifier is being concatenated into a SELECT statement.
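One way such a combination might look is sketched below. The SELECT fragment here is a simplified stand-in (a case-insensitive keyword match), and WHITESPACE, QUOTE, the Java fragments, and the helper concatenatesColumn are assumptions for the example; the repository's actual SQL regex is likely more involved.

```java
import java.util.regex.Pattern;

class SelectConcatSketch {
    // Assumed building blocks; the repository's definitions may differ.
    static final String WHITESPACE = "\\s*";
    static final String QUOTE = "[\"']";
    static final String CONCAT_VAR =
        WHITESPACE + QUOTE + "(?=" + WHITESPACE + "%s" + WHITESPACE + "%s" + ")";

    // Hypothetical Java-specific fragments and a simplified SELECT keyword regex.
    static final String JAVA_CONCAT = "\\+";
    static final String JAVA_VAR = "[A-Za-z_][A-Za-z0-9_]*";
    static final String SELECT = "\\bSELECT\\b";

    // SELECT immediately followed by a closing quote and a concatenated variable,
    // i.e. the variable lands where the column list would normally be.
    static final Pattern SELECT_CONCAT_COLUMN = Pattern.compile(
        SELECT + String.format(CONCAT_VAR, JAVA_CONCAT, JAVA_VAR),
        Pattern.CASE_INSENSITIVE);

    static boolean concatenatesColumn(String source) {
        return SELECT_CONCAT_COLUMN.matcher(source).find();
    }

    public static void main(String[] args) {
        // Variable in the column position: flagged.
        System.out.println(concatenatesColumn(
            "String q = \"SELECT \" + column + \" FROM users\";"));
        // Variable only in the WHERE clause: not flagged by this particular regex.
        System.out.println(concatenatesColumn(
            "String q = \"SELECT name FROM users WHERE id = \" + id;"));
    }
}
```

The second call is not flagged because the string literal continues after SELECT; catching concatenation in the WHERE clause would mean pairing CONCAT_VAR with a different SQL regex.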

The regular expressions are also designed to account for other abnormalities, including unusual whitespace or string interpolation, by mixing simple regex into larger, more complex expressions.
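As a rough illustration of how small fragments can absorb such irregularities, the sketch below pairs a whitespace-tolerant SELECT with a PHP variable appearing inside a double-quoted string (string interpolation). Every fragment and the helper interpolatesIntoSelect are hypothetical and written for this example; they are not the expressions shipped in the repository.

```java
import java.util.regex.Pattern;

class InterpolationSketch {
    // Assumed fragments; simplified for illustration.
    static final String WHITESPACE = "\\s*";
    static final String PHP_VAR = "\\$[A-Za-z_][A-Za-z0-9_]*";

    // Opening double quote, optional whitespace (including newlines and tabs),
    // the SELECT keyword, then any non-quote text up to an interpolated variable.
    static final Pattern INTERPOLATED_SELECT = Pattern.compile(
        "\"" + WHITESPACE + "SELECT\\b[^\"]*" + PHP_VAR,
        Pattern.CASE_INSENSITIVE);

    static boolean interpolatesIntoSelect(String source) {
        return INTERPOLATED_SELECT.matcher(source).find();
    }

    public static void main(String[] args) {
        // Variable interpolated directly into the query string.
        System.out.println(interpolatesIntoSelect(
            "$q = \"SELECT * FROM users WHERE id = $id\";"));
        // Same query with a newline and tab before the keyword.
        System.out.println(interpolatesIntoSelect(
            "$q = \"\n\tSELECT * FROM users WHERE id = $id\";"));
        // No variable inside the string.
        System.out.println(interpolatesIntoSelect(
            "$q = \"SELECT * FROM users\";"));
    }
}
```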
