adrian-thurston / colm

The Colm Programming Language

License: MIT License

CMake 0.34% Makefile 1.82% C++ 69.02% C 12.96% Shell 0.33% M4 0.34% Ragel 14.93% Ruby 0.07% Python 0.02% Go 0.03% Vim Script 0.13%
ragel colm

colm's Introduction

Colm = COmputer Language Machinery

Colm is a programming language designed for the analysis and transformation of computer languages.
Colm is influenced primarily by TXL.

What is a transformation language?

A transformation language has a type system based on formal languages.
Rather than defining classes or data structures, one defines grammars.

A parser is constructed automatically from the grammar, and the parser is used for two purposes:

  • to parse the input language,
  • and to parse the structural patterns in the program that performs the analysis.

In this setting, grammar-based parsing is critical because it guarantees that both the input and the structural patterns are parsed into trees from the same set of types, allowing comparison.
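The point about shared types can be illustrated outside Colm. The following Python sketch is not Colm's implementation: `Node`, `parse_assignment` and `matches` are hypothetical names, and the `?` wildcard is an assumption made for this example. It parses both an input string and a pattern string with the same toy parser, so both results are trees of one type and can be compared node by node.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Node:
    kind: str                          # grammar type, e.g. "assignment", "id"
    text: str = ""                     # token text for leaves
    children: Tuple["Node", ...] = ()

def parse_assignment(src: str) -> Node:
    """Toy parser for `<name> = <value> ;`, used for input AND patterns."""
    name, eq, value, semi = src.split()
    if eq != "=" or semi != ";":
        raise ValueError("not an assignment: " + src)

    def leaf(tok: str) -> Node:
        if tok == "?":
            return Node("wildcard")    # hypothetical pattern variable
        return Node("number" if tok.isdigit() else "id", tok)

    return Node("assignment", children=(leaf(name), leaf(value)))

def matches(pattern: Node, tree: Node) -> bool:
    """Structural comparison; possible because both sides share one type."""
    if pattern.kind == "wildcard":
        return True
    if pattern.kind != tree.kind or pattern.text != tree.text:
        return False
    if len(pattern.children) != len(tree.children):
        return False
    return all(matches(p, t) for p, t in zip(pattern.children, tree.children))
```

For example, `matches(parse_assignment("x = ? ;"), parse_assignment("x = 42 ;"))` is true, while the same pattern with `y` on the left-hand side does not match.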

Colm's features

Colm is not-your-typical-scripting-language™:

  • Colm's main contribution lies in the parsing method.
    Colm's parsing engine is generalized, but it also allows for the construction of arbitrary global data structures that can be queried during parsing. In other generalized methods, construction of global data requires some very careful consideration because of inherent concurrency in the parsing method. It is such a tricky task that it is often avoided altogether and the problem is deferred to a post-parse disambiguation of the parse forest.
  • By default, Colm creates an ELF executable that can be used standalone to perform the actual transformations.
  • Colm is a statically and strongly typed scripting language.
  • Colm is very small and fast, and can easily be embedded in or linked with C/C++ programs.
  • Colm's runtime is a stack-based VM that starts with the bare minimum of the language and bootstraps itself.

Examples

This is how Colm greets the world (hello_world.lm):

print "hello world\n"

Here's a Colm program that implements a small assignment language (assign.lm), parses standard input with it, and then prints each assignment found in the parse tree.

lex
	token id / ('a' .. 'z' | 'A' .. 'Z' ) + /
	token number / ( '0' .. '9' )+ /
	literal `= `;
	ignore / [ \t\n]+ /
end

def value
	[id] | [number]

def assignment
	[id `= value `;]

def assignment_list
	[assignment assignment_list]
|	[assignment]
|	[]

parse Simple: assignment_list[ stdin ]

if ( ! Simple ) {
	print( "[error]\n" )
	exit( 1 )
}
else {
	for I:assignment in Simple {
		print( $I.id, "->", $I.value, "\n" )
	}
}

More real-world programs, parsing several languages implemented in Colm, can be found in the grammar/ folder.

Usage

To compile and immediately run a program, e.g. the hello_world.lm program from above, call

$ colm -r hello_world.lm
hello world

Run colm --help for help on further options.

$ colm --help
usage: colm [options] file
general:
   -h, -H, -?, --help   print this usage and exit
   -v --version         print version information and exit
   -b <ident>           use <ident> as name of C object encapulaing the program
   -o <file>            if -c given, write C parse object to <file>,
                        otherwise write binary to <file>
   -p <file>            write C parse object to <file>
   -e <file>            write C++ export header to <file>
   -x <file>            write C++ export code to <file>
   -m <file>            write C++ commit code to <file>
   -a <file>            additional code file to include in output program
   -E N=V               set a string value available in the program
   -I <path>            additional include path for the compiler
   -i                   activate branchpoint information
   -L <path>            additional library path for the linker
   -l                   activate logging
   -r                   run output program and replace process
   -c                   compile only (don't produce binary)
   -V                   print dot format (graphiz)
   -d                   print verbose debug information

Building

To build Colm on your own, see the following dependencies and build instructions.

Dependencies

This package has no external dependencies other than the usual autotools and C/C++ compiler programs.

For the program:

  • make
  • libtool
  • gcc
  • g++
  • autoconf
  • automake

For the documentation, install asciidoc and fig2dev as well.

Building

Colm is built in the usual autotools way:

$ ./autogen.sh
$ ./configure
$ make
$ make install

Run-time dependencies

The colm program depends on GCC at runtime. It produces a C program as output, then compiles and links it with a runtime library. The compiled program depends on the colm library.

To find the includes and the runtime library to pass to GCC, colm looks at argv[0] to decide if it is running out of the source tree. If it is, then the compile and link flags are derived from argv[0]. Otherwise, it uses the install location (prefix) to construct the flags.
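The choice between the two flag sources can be sketched as below. This is an illustration only: the README does not say how colm tests argv[0], so the decision is passed in as the hypothetical parameter `in_source_tree`, and the directory layouts and the `-lcolm` library name are assumptions of this sketch, not taken from the colm sources.

```python
import os

def compile_flags(argv0: str, prefix: str, in_source_tree: bool) -> list:
    """Return GCC include/link flags for the generated C program.

    `in_source_tree` stands in for whatever test colm applies to argv[0];
    the directory layouts below are assumptions of this sketch.
    """
    if in_source_tree:
        # Derive everything from the directory containing the binary.
        base = os.path.dirname(os.path.abspath(argv0))
        include_dir = os.path.join(base, "include")
        lib_dir = base
    else:
        # Installed: derive the paths from the configured prefix.
        include_dir = os.path.join(prefix, "include")
        lib_dir = os.path.join(prefix, "lib")
    return ["-I" + include_dir, "-L" + lib_dir, "-lcolm"]
```

On a POSIX system, `compile_flags("/usr/local/bin/colm", "/usr/local", False)` yields flags pointing at `/usr/local/include` and `/usr/local/lib`.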

Syntax highlighting

There is a vim syntax definition file colm.vim.

License

Colm is free software under the MIT license.
Please see the COPYING file for more details.

colm's People

Contributors

adrian-thurston, antage, bicycle1885, collosi, computerquip, flameeyes, grommish, halcanary, hengestone, jengelh, jungshik, kbrow1i, kramelec, mingodad, phorward, podsvirov, rreverser, salzmdan, srl295, vaporoid, verpeteren, viccie30, ygrek

colm's Issues

CLANG 9.01 warnings

Hello!
Trying to compile colm on an Android device with Termux, we get some errors and warnings:

gcc -DHAVE_CONFIG_H -I. -I../src  -I../aapl  -Iinclude  -Wall -DPREFIX='"/data/data/com.termux/files/home/local/colm"' -Wall -g -MT gen/bootstrap2-parse2.o -MD -MP -MF gen/.deps/bootstrap2-parse2.Tpo -c -o gen/bootstrap2-parse2.o `test -f 'gen/parse2.c' || echo './'`gen/parse2.c
mv -f gen/.deps/bootstrap2-parse2.Tpo gen/.deps/bootstrap2-parse2.Po
In file included from loadboot2.cc:2:
./loadfinal.cc:1682:8: warning: comparison of two values with different enumeration types in switch statement ('enum prod_name' and 'string_el::prod_name') [-Wenum-compare-switch]
                case string_el::Sq: {
                     ^~~~~~~~~~~~~
./loadfinal.cc:2712:8: warning: comparison of two values with different enumeration types in switch statement ('enum prod_name' and 'struct_item::prod_name') [-Wenum-compare-switch]
                case struct_item::InHost:
                     ^~~~~~~~~~~~~~~~~~~
./loadfinal.cc:2793:8: warning: comparison of two values with different enumeration types in switch statement ('enum prod_name' and 'struct_item::prod_name') [-Wenum-compare-switch]
                case struct_item::InHost:
                     ^~~~~~~~~~~~~~~~~~~
./loadfinal.cc:2769:8: warning: comparison of two values with different enumeration types in switch statement ('enum prod_name' and 'root_item::prod_name') [-Wenum-compare-switch]
                case root_item::IgnoreCollector:
                     ^~~~~~~~~~~~~~~~~~~~~~~~~~
mv -f gen/.deps/bootstrap2-if2.Tpo gen/.deps/bootstrap2-if2.Po
4 warnings generated.
/bin/sh ../libtool  --tag=CXX   --mode=compile g++ -DHAVE_CONFIG_H -I. -I../src -I../colm  -I../aapl -I../colm/include -DBINDIR='"/data/data/com.termux/files/home/local/colm/bin"'   -Wall -g -MT libragel_la-longest.lo -MD -MP -MF .deps/libragel_la-longest.Tpo -c -o libragel_la-longest.lo `test -f 'longest.cc' || echo './'`longest.cc
libtool: compile:  g++ -DHAVE_CONFIG_H -I. -I../src -I../colm -I../aapl -I../colm/include -DBINDIR=\"/data/data/com.termux/files/home/local/colm/bin\" -Wall -g -MT libragel_la-longest.lo -MD -MP -MF .deps/libragel_la-longest.Tpo -c longest.cc  -fPIC -DPIC -o .libs/libragel_la-longest.o
parsetree.cc:494:21: warning: variables 'm' and 'numMachines' used in loop condition not modified in loop body [-Wfor-loop-analysis]
                        for ( int m = 0; m < numMachines; )
                                         ^   ~~~~~~~~~~~
mv -f .deps/ragel-main.Tpo .deps/ragel-main.Po
/bin/sh ../libtool  --tag=CXX   --mode=compile g++ -DHAVE_CONFIG_H -I. -I../src -I../colm  -I../aapl   -Wall -g -MT libfsm_la-fsmcond.lo -MD -MP -MF .deps/libfsm_la-fsmcond.Tpo -c -o libfsm_la-fsmcond.lo `test -f 'fsmcond.cc' || echo './'`fsmcond.cc
libtool: compile:  g++ -DHAVE_CONFIG_H -I. -I../src -I../colm -I../aapl -Wall -g -MT libfsm_la-fsmcond.lo -MD -MP -MF .deps/libfsm_la-fsmcond.Tpo -c fsmcond.cc  -fPIC -DPIC -o .libs/libfsm_la-fsmcond.o
fsmap.cc:144:19: warning: 'this' pointer cannot be null in well-defined C++ code; pointer may be assumed to always convert to true [-Wundefined-bool-conversion]
        afterOpMinimize( this );
        ~~~~~~~~~~~~~~~  ^~~~
fsmap.cc:266:19: warning: 'this' pointer cannot be null in well-defined C++ code; pointer may be assumed to always convert to true [-Wundefined-bool-conversion]
        afterOpMinimize( this );
        ~~~~~~~~~~~~~~~  ^~~~
fsmap.cc:548:19: warning: 'this' pointer cannot be null in well-defined C++ code; pointer may be assumed to always convert to true [-Wundefined-bool-conversion]
        afterOpMinimize( this );
        ~~~~~~~~~~~~~~~  ^~~~
fsmap.cc:603:19: warning: 'this' pointer cannot be null in well-defined C++ code; pointer may be assumed to always convert to true [-Wundefined-bool-conversion]
        afterOpMinimize( this );
        ~~~~~~~~~~~~~~~  ^~~~
fsmap.cc:660:19: warning: 'this' pointer cannot be null in well-defined C++ code; pointer may be assumed to always convert to true [-Wundefined-bool-conversion]
        afterOpMinimize( this );
        ~~~~~~~~~~~~~~~  ^~~~
fsmap.cc:716:19: warning: 'this' pointer cannot be null in well-defined C++ code; pointer may be assumed to always convert to true [-Wundefined-bool-conversion]
        afterOpMinimize( this );
        ~~~~~~~~~~~~~~~  ^~~~
fsmap.cc:965:8: warning: comparison of two values with different enumeration types in switch statement ('ValPairIter<PiList<CondAp>, PiList<CondAp> >::UserState' and 'ValPairIter<CondAp, CondAp>::UserState') [-Wenum-compare-switch]
                case ValPairIter<CondAp>::RangeOverlap: {
                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
fsmap.cc:959:8: warning: comparison of two values with different enumeration types in switch statement ('ValPairIter<PiList<CondAp>, PiList<CondAp> >::UserState' and 'ValPairIter<CondAp, CondAp>::UserState') [-Wenum-compare-switch]
                case ValPairIter<CondAp>::RangeInS2: {
                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
fsmap.cc:953:8: warning: comparison of two values with different enumeration types in switch statement ('ValPairIter<PiList<CondAp>, PiList<CondAp> >::UserState' and 'ValPairIter<CondAp, CondAp>::UserState') [-Wenum-compare-switch]
                case ValPairIter<CondAp>::RangeInS1: {
                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
fsmap.cc:1010:9: warning: comparison of two values with different enumeration types in switch statement ('ValPairIter<PiList<CondAp>, PiList<CondAp> >::UserState' and 'ValPairIter<CondAp, CondAp>::UserState') [-Wenum-compare-switch]
                        case ValPairIter<CondAp>::RangeOverlap: {
                             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
fsmap.cc:1004:9: warning: comparison of two values with different enumeration types in switch statement ('ValPairIter<PiList<CondAp>, PiList<CondAp> >::UserState' and 'ValPairIter<CondAp, CondAp>::UserState') [-Wenum-compare-switch]
                        case ValPairIter<CondAp>::RangeInS2: {
                             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
fsmap.cc:998:9: warning: comparison of two values with different enumeration types in switch statement ('ValPairIter<PiList<CondAp>, PiList<CondAp> >::UserState' and 'ValPairIter<CondAp, CondAp>::UserState') [-Wenum-compare-switch]
                        case ValPairIter<CondAp>::RangeInS1: {
                             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
fsmap.cc:1178:19: warning: 'this' pointer cannot be null in well-defined C++ code; pointer may be assumed to always convert to true [-Wundefined-bool-conversion]
        afterOpMinimize( this );
        ~~~~~~~~~~~~~~~  ^~~~
13 warnings generated.
libtool: compile:  g++ -DHAVE_CONFIG_H -I. -I../src -I../colm -I../aapl -Wall -g -MT libfsm_la-gendata.lo -MD -MP -MF .deps/libfsm_la-gendata.Tpo -c gendata.cc -o libfsm_la-gendata.o >/dev/null 2>&1
In file included from allocgen.cc:33:
In file included from ./bingoto.h:26:
In file included from ./binary.h:27:
./codegen.h:156:1: warning: 'CodeGen' defined as a class here but previously declared as a struct; this is valid, but may result in linker errors under the Microsoft C++ ABI [-Wmismatched-tags]
class CodeGen : public CodeGenData
^
./codegen.h:84:1: note: did you mean class here?
struct CodeGen;
^~~~~~
class
./codegen.h:170:9: warning: class 'TableArray' was previously declared as a struct; this is valid, but may result in linker errors under the Microsoft C++ ABI [-Wmismatched-tags]
        friend class TableArray;
               ^
./codegen.h:86:8: note: previous use is here
struct TableArray
       ^
./codegen.h:170:9: note: did you mean struct here?
        friend class TableArray;
               ^~~~~
               struct
In file included from allocgen.cc:41:
./switchvar.h:23:9: warning: 'RAGEL_SWITCHVAR_H' is used as a header guard here, followed by #define of a different macro [-Wheader-guard]
#ifndef RAGEL_SWITCHVAR_H
        ^~~~~~~~~~~~~~~~~
./switchvar.h:24:9: note: 'RAGEL_BINVAR_H' is defined here; did you mean 'RAGEL_SWITCHVAR_H'?
#define RAGEL_BINVAR_H
        ^~~~~~~~~~~~~~
        RAGEL_SWITCHVAR_H
In file included from allocgen.cc:44:
./ipgoto.h:106:7: warning: 'IpGoto::NFA_PUSH' hides overloaded virtual function [-Woverloaded-virtual]
        void NFA_PUSH( RedStateAp *state );
             ^
./codegen.h:447:15: note: hidden overloaded virtual function 'CodeGen::NFA_PUSH' declared here: type mismatch at 1st parameter ('std::string' (aka 'basic_string<char, char_traits<char>, allocator<char> >') vs 'RedStateAp *')
        virtual void NFA_PUSH( std::string );
                     ^
In file included from allocgen.cc:45:
./asm.h:63:1: warning: 'AsmCodeGen' defined as a class here but previously declared as a struct; this is valid, but may result in linker errors under the Microsoft C++ ABI [-Wmismatched-tags]
class AsmCodeGen : public CodeGenData
^
./asm.h:53:1: note: did you mean class here?
struct AsmCodeGen;
^~~~~~
class
In file included from codegen.cc:23:
./codegen.h:156:1: warning: 'CodeGen' defined as a class here but previously declared as a struct; this is valid, but may result in linker errors under the Microsoft C++ ABI [-Wmismatched-tags]
class CodeGen : public CodeGenData
^
./codegen.h:84:1: note: did you mean class here?
struct CodeGen;
^~~~~~
class
./codegen.h:170:9: warning: class 'TableArray' was previously declared as a struct; this is valid, but may result in linker errors under the Microsoft C++ ABI [-Wmismatched-tags]
        friend class TableArray;
               ^
./codegen.h:86:8: note: previous use is here
struct TableArray
       ^
./codegen.h:170:9: note: did you mean struct here?
        friend class TableArray;
               ^~~~~
               struct
mv -f .deps/libfsm_la-redfsm.Tpo .deps/libfsm_la-redfsm.Plo
5 warnings generated.
libtool: compile:  g++ -DHAVE_CONFIG_H -I. -I../src -I../colm -I../aapl -Wall -g -MT libfsm_la-tabgoto.lo -MD -MP -MF .deps/libfsm_la-tabgoto.Tpo -c tabgoto.cc  -fPIC -DPIC -o .libs/libfsm_la-tabgoto.o
In file included from binvar.cc:23:
In file included from ./binvar.h:26:
In file included from ./binary.h:27:
./codegen.h:156:1: warning: 'CodeGen' defined as a class here but previously declared as a struct; this is valid, but may result in linker errors under the Microsoft C++ ABI [-Wmismatched-tags]
class CodeGen : public CodeGenData
^
./codegen.h:84:1: note: did you mean class here?
struct CodeGen;
^~~~~~
class
./codegen.h:170:9: warning: class 'TableArray' was previously declared as a struct; this is valid, but may result in linker errors under the Microsoft C++ ABI [-Wmismatched-tags]
        friend class TableArray;
               ^
./codegen.h:86:8: note: previous use is here
struct TableArray
       ^
./codegen.h:170:9: note: did you mean struct here?
        friend class TableArray;
               ^~~~~
               struct
binvar.cc:95:97: error: no viable conversion from 'Variable' to 'std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> >'
                        "       " << ckeys << " = " << OFFSET( ARR_REF( condKeys ), ARR_REF( transOffsets ) + "[" + string(trans) + "]" ) << ";\n"
                                                                                                                           ^~~~~
/data/data/com.termux/files/usr/bin/../include/c++/v1/string:800:5: note: candidate constructor not viable: no known conversion from 'Variable' to 'const std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > &' for 1st argument
    basic_string(const basic_string& __str);
    ^
/data/data/com.termux/files/usr/bin/../include/c++/v1/string:805:5: note: candidate constructor not viable: no known conversion from 'Variable' to 'std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > &&' for 1st argument
    basic_string(basic_string&& __str)
    ^
/data/data/com.termux/files/usr/bin/../include/c++/v1/string:820:5: note: candidate constructor not viable: no known conversion from 'Variable' to 'const char *' for 1st argument
    basic_string(const _CharT* __s) {
    ^
/data/data/com.termux/files/usr/bin/../include/c++/v1/string:874:5: note: candidate constructor not viable: no known conversion from 'Variable' to 'initializer_list<char>' for 1st argument
    basic_string(initializer_list<_CharT> __il);
    ^
./codegen.h:61:2: note: candidate function
        operator const std::string() { isReferenced = true; return name; }
        ^
/data/data/com.termux/files/usr/bin/../include/c++/v1/string:805:33: note: passing argument to parameter '__str' here
    basic_string(basic_string&& __str)
                                ^
binvar.cc:104:57: error: no viable conversion from 'Variable' to 'std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> >'
                        COND_EXEC( ARR_REF( transCondSpaces ) + "[" + string(trans) + "]" );
                                                                             ^~~~~
/data/data/com.termux/files/usr/bin/../include/c++/v1/string:800:5: note: candidate constructor not viable: no known conversion from 'Variable' to 'const std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > &' for 1st argument
    basic_string(const basic_string& __str);
    ^
/data/data/com.termux/files/usr/bin/../include/c++/v1/string:805:5: note: candidate constructor not viable: no known conversion from 'Variable' to 'std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > &&' for 1st argument
    basic_string(basic_string&& __str)
    ^
/data/data/com.termux/files/usr/bin/../include/c++/v1/string:820:5: note: candidate constructor not viable: no known conversion from 'Variable' to 'const char *' for 1st argument
    basic_string(const _CharT* __s) {
    ^
/data/data/com.termux/files/usr/bin/../include/c++/v1/string:874:5: note: candidate constructor not viable: no known conversion from 'Variable' to 'initializer_list<char>' for 1st argument
    basic_string(initializer_list<_CharT> __il);
    ^
./codegen.h:61:2: note: candidate function
        operator const std::string() { isReferenced = true; return name; }
        ^
/data/data/com.termux/files/usr/bin/../include/c++/v1/string:805:33: note: passing argument to parameter '__str' here
    basic_string(basic_string&& __str)
                                ^
2 warnings and 2 errors generated.
make[3]: Entering directory '/data/data/com.termux/files/home/dev/colm-suite-dad/ragel'
g++ -DHAVE_CONFIG_H -I. -I../src -I../colm  -I../aapl -I../colm/include   -Wall -g -MT ragel-main.o -MD -MP -MF .deps/ragel-main.Tpo -c -o ragel-main.o `test -f 'main.cc' || echo './'`main.cc
gcc -DHAVE_CONFIG_H -I. -I../src -I../colm  -I../aapl -I../colm/include   -Wall -g -MT ragel-parse.o -MD -MP -MF .deps/ragel-parse.Tpo -c -o ragel-parse.o `test -f 'parse.c' || echo './'`parse.c
g++ -DHAVE_CONFIG_H -I. -I../src -I../colm  -I../aapl -I../colm/include   -Wall -g -MT ragel-rlreduce.o -MD -MP -MF .deps/ragel-rlreduce.Tpo -c -o ragel-rlreduce.o `test -f 'rlreduce.cc' || echo './'`rlreduce.cc
/bin/sh ../libtool  --tag=CXX   --mode=compile g++ -DHAVE_CONFIG_H -I. -I../src -I../colm  -I../aapl -I../colm/include -DBINDIR='"/data/data/com.termux/files/home/local/colm/bin"'   -Wall -g -MT libragel_la-parsetree.lo -MD -MP -MF .deps/libragel_la-parsetree.Tpo -c -o libragel_la-parsetree.lo `test -f 'parsetree.cc' || echo './'`parsetree.cc
libtool: compile:  g++ -DHAVE_CONFIG_H -I. -I../src -I../colm -I../aapl -I../colm/include -DBINDIR=\"/data/data/com.termux/files/home/local/colm/bin\" -Wall -g -MT libragel_la-parsetree.lo -MD -MP -MF .deps/libragel_la-parsetree.Tpo -c parsetree.cc  -fPIC -DPIC -o .libs/libragel_la-parsetree.o
parse.c:7973:20: warning: result of comparison of constant -17 with expression of type 'char' is always false [-Wtautological-constant-out-of-range-compare]
        if ( (*pdaRun->p) == -17 )
             ~~~~~~~~~~~~ ^  ~~~
parse.c:7980:20: warning: result of comparison of constant -69 with expression of type 'char' is always false [-Wtautological-constant-out-of-range-compare]
        if ( (*pdaRun->p) == -69 )
             ~~~~~~~~~~~~ ^  ~~~
parse.c:7987:20: warning: result of comparison of constant -65 with expression of type 'char' is always false [-Wtautological-constant-out-of-range-compare]
        if ( (*pdaRun->p) == -65 )
             ~~~~~~~~~~~~ ^  ~~~
parse.c:8001:20: warning: result of comparison of constant -17 with expression of type 'char' is always false [-Wtautological-constant-out-of-range-compare]
        if ( (*pdaRun->p) == -17 )
             ~~~~~~~~~~~~ ^  ~~~
parse.c:8008:20: warning: result of comparison of constant -69 with expression of type 'char' is always false [-Wtautological-constant-out-of-range-compare]
        if ( (*pdaRun->p) == -69 )
             ~~~~~~~~~~~~ ^  ~~~
parse.c:8015:20: warning: result of comparison of constant -65 with expression of type 'char' is always false [-Wtautological-constant-out-of-range-compare]
        if ( (*pdaRun->p) == -65 )
             ~~~~~~~~~~~~ ^  ~~~
6 warnings generated.
libtool: compile:  g++ -DHAVE_CONFIG_H -I. -I../src -I../colm -I../aapl -Wall -g -MT libfsm_la-gendata.lo -MD -MP -MF .deps/libfsm_la-gendata.Tpo -c gendata.cc -o libfsm_la-gendata.o >/dev/null 2>&1
In file included from allocgen.cc:44:
./ipgoto.h:106:7: warning: 'IpGoto::NFA_PUSH' hides overloaded virtual function [-Woverloaded-virtual]
        void NFA_PUSH( RedStateAp *state );
             ^
./codegen.h:447:15: note: hidden overloaded virtual function 'CodeGen::NFA_PUSH' declared here: type mismatch at 1st parameter ('std::string' (aka 'basic_string<char, char_traits<char>, allocator<char> >') vs 'RedStateAp *')
        virtual void NFA_PUSH( std::string );
                     ^
1 warning generated.
libtool: compile:  g++ -DHAVE_CONFIG_H -I. -I../src -I../colm -I../aapl -Wall -g -MT libfsm_la-ipgoto.lo -MD -MP -MF .deps/libfsm_la-ipgoto.Tpo -c ipgoto.cc -o libfsm_la-ipgoto.o >/dev/null 2>&1
In file included from dot.cc:24:
./dot.h:39:15: warning: 'GraphvizDotGenOrig::writeStatement' hides overloaded virtual function [-Woverloaded-virtual]
        virtual void writeStatement( InputLoc &, int, std::string * );
                     ^
./gendata.h:425:15: note: hidden overloaded virtual function 'CodeGenData::writeStatement' declared here: different number of parameters (5 vs 3)
        virtual void writeStatement( InputLoc &loc, int nargs,
                     ^
1 warning generated.
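
One group of warnings in this log has a simple arithmetic explanation. The constants -17, -69 and -65 compared in parse.c are the bytes 0xEF 0xBB 0xBF (the UTF-8 byte order mark) read as signed 8-bit values. The warning text itself says the expression has type 'char' and the comparison is always false, which is what happens on targets where plain char is unsigned. The Python sketch below only illustrates the arithmetic and a sign-independent check. `as_signed_char` and `starts_with_bom` are hypothetical helpers, not code from colm.

```python
UTF8_BOM = bytes([0xEF, 0xBB, 0xBF])

def as_signed_char(byte: int) -> int:
    """Value a C `signed char` would hold for this byte (two's complement)."""
    return byte - 256 if byte >= 128 else byte

def starts_with_bom(data: bytes) -> bool:
    """Sign-independent check: compare unsigned byte values only."""
    return data[:3] == UTF8_BOM
```

Here `[as_signed_char(b) for b in UTF8_BOM]` gives `[-17, -69, -65]`, matching the constants in the warnings. In C, a comparable sign-independent form would presumably cast the byte to `unsigned char` before comparing it against `0xEF`, `0xBB` and `0xBF`.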

compilation question

Nothing urgent, but now that colm and ragel are in a single repo, I figured I'd try to build it.

Which branch and which tool should I be using?

master appears to be development activities.

The README states that it is autoconf-based, but after running autogen, configure and make, the build doesn't find the "version.h" file. I tried again with cmake instead (out-of-source build), but that fails because it cannot find <colm/map.h>.

[colm] bring back string concatenations in match/cons/string top level

These were removed because of the ambiguity with bare sends. However, it's quite annoying to no longer be able to concat strings. Any kind of multi-line pattern/concat/string will no longer work. Also, we have enabled concat for accumulation, but not for pattern/concat/string, which is an apparent inconsistency.

[colm] one output only per run of the program

Generating multiple outputs from multiple inputs makes writing makefiles more difficult. Move to a model where we run colm (or some other program) multiple times, once for each output file.

Could either

  1. roll up multiple outputs of colm into some pack (tarball?) then use separate rules to unpack them.
  2. invoke colm multiple times ... the most costly thing (generating a parser) should only happen once.

Option 1 above is probably the simplest approach. Colm runs once and packs the multiple outputs together. An external program, or a separate run of colm, unpacks them one by one.

colm scanner failure: longer pattern interferes with matching of shorter pattern

A test case showing the scanner failure has been added to the test suite. Removing the '//' comment pattern allows the test to pass; with it present, the single '/' token fails to match properly. The failure is somewhere in the code that reverts to a previously matched pattern when a longer pattern fails. In Ragel this is achieved by consulting the 'act' variable.

test/colm.d/scan1.lm
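
The fallback described in this issue can be sketched as follows. This is neither the Colm nor the Ragel scanner, just a minimal hand-written longest-match loop in Python. The names `act` and `tokend` mirror the variables mentioned in these issues, while the token set (SLASH, a COMMENT that must end in a newline, ID) is an assumption chosen so that a longer pattern can begin to match and then fail.

```python
def scan(text: str) -> list:
    """Longest-match scanner for three patterns:
    SLASH: '/', COMMENT: '//' up to and including a newline, ID: [a-z]+.

    Whitespace is skipped. `act` remembers the last pattern that matched
    and `tokend` where that match ended, so the scanner can fall back to
    it when a longer candidate fails.
    """
    def is_lower(c: str) -> bool:
        return "a" <= c <= "z"

    tokens = []
    p = 0
    while p < len(text):
        if text[p].isspace():
            p += 1
            continue
        act, tokend = None, p
        if text[p] == "/":
            act, tokend = "SLASH", p + 1           # shorter pattern matched
            if text.startswith("//", p):
                # Try the longer pattern; it needs a closing newline.
                nl = text.find("\n", p + 2)
                if nl != -1:
                    act, tokend = "COMMENT", nl + 1
                # Otherwise COMMENT failed: keep act/tokend from SLASH.
        elif is_lower(text[p]):
            q = p
            while q < len(text) and is_lower(text[q]):
                q += 1
            act, tokend = "ID", q
        if act is None:
            raise ValueError(f"no token matches at offset {p}")
        tokens.append((act, text[p:tokend]))
        p = tokend                                 # resume after last match
    return tokens
```

With the input `"//x"` (no newline) the comment pattern starts to match and then fails at end of input, so the sketch reverts to the recorded SLASH match and produces two SLASH tokens followed by an ID.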

merge ragel into colm repository

The first order of business here on github is to unify the Colm and Ragel repositories. This is a long-term strategic decision.

My next major goal with this suite of tools is to bring the language-agnostic features of Ragel to Colm. Colm is currently based on C++. It generates C++ and allows you to write a reducer for your languages in C++. I would like to extend this to additional languages in the same way Ragel 7.0 has been extended to support multiple languages. This will entail quite a bit of code sharing, particularly the intermediate language and tools.

Another reason to merge is that Ragel and Colm already share much state machine building and generation code, but the code has been cloned. I would like to eliminate these clones for the sake of code quality. This includes the scanner portion of Colm, but also the LALR(1) table construction. Both of these can use libfsm.

I'm probably going to merge Ragel into the Colm repository, considering that between Ragel and Colm, Colm is really the thing that comes first. Ragel now depends on Colm. I understand this could create confusion at first, considering that at this time, most people are interested in Ragel. However, the opposite doesn't make much sense to me.

[colm] the * and + operators should use left-recursion

Original motivation for right recursion was convenient lisp-like list deconstruction into head and tail. There is a place for right recursion, however, it makes more sense now to be consistent with the parsing algorithm and use left-recursion.
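
One well-known consequence for a shift-reduce (LR-style) parser is stack growth: with a left-recursive list rule each item can be reduced as soon as it is shifted, while a right-recursive rule forces every item onto the stack before the first reduction. The Python sketch below simulates only that effect. `max_stack_depth` is a hypothetical helper and does not model Colm's actual parser.

```python
def max_stack_depth(n_items: int, left_recursive: bool) -> int:
    """Simulate shift/reduce parsing of a list of n_items and return the
    deepest parse stack seen.

    Left-recursive:   list -> list item | item
    Right-recursive:  list -> item list | item
    """
    stack, deepest = [], 0
    for _ in range(n_items):
        stack.append("item")                       # shift
        deepest = max(deepest, len(stack))
        if left_recursive:
            # Reduce immediately: [item] or [list, item] becomes [list].
            if stack == ["item"]:
                stack[:] = ["list"]
            else:
                stack[-2:] = ["list"]
    if not left_recursive and stack:
        # All items are on the stack before the first reduction can happen.
        stack[-1:] = ["list"]                      # list -> item
        while len(stack) > 1:
            stack[-2:] = ["list"]                  # list -> item list
    return deepest
```

For a 1000-item list the left-recursive form never needs more than two stack entries in this simulation, while the right-recursive form needs 1000.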

[colm] eliminate the circular dependency between toklen and tokend

There are times we set tokend from toklen and vice-versa. Really toklen needs to exist to track the size of partial token matches across buffer blocks, but it ended up getting used as the token length indicator in pdarun. Really this is what tokend is for (as is done in ragel). Need to tease these two vars apart. One: rename toklen and use it only for tracking match length coming from previous buffers. Two: eliminate use of toklen outside of the scanner. Use tokend instead as indication of match length.

[colm] go grammar results in huge output file

There are two factors leading to huge output files for the Go grammar:

  1. using UTF-8 directly in the lexer (this has the larger effect)
  2. patterns for inserting semicolons

It would be nice to have an additional pass on the input, before it goes to the parser, that allows transformations. The same technique could be used both for transformation from UTF-8 to Unicode and for inserting semicolons. It would be a chain either before or after lexing.
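
A chain of this kind might look like the following Python sketch. It is an illustration under stated assumptions, not a proposal for Colm's internals: `decode_utf8`, `insert_semis` and `prepass` are hypothetical names, and the semicolon rule is deliberately simplified (Go's real rule depends on the line's final token).

```python
from typing import Iterable, Iterator

def decode_utf8(data: bytes) -> Iterator[int]:
    """Stage 1: turn UTF-8 bytes into Unicode code points."""
    for ch in data.decode("utf-8"):
        yield ord(ch)

def insert_semis(code_points: Iterable[int]) -> Iterator[int]:
    """Stage 2 (simplified): emit ';' before each newline that follows any
    character other than a newline or ';'."""
    prev = None
    for cp in code_points:
        if cp == ord("\n") and prev not in (None, ord("\n"), ord(";")):
            yield ord(";")
        yield cp
        prev = cp

def prepass(data: bytes) -> str:
    """Chain the stages in front of the lexer."""
    return "".join(chr(cp) for cp in insert_semis(decode_utf8(data)))
```

Because the lexer would then see code points instead of raw bytes, a non-ASCII letter becomes a single symbol and would no longer need a multi-byte pattern.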

[colm] iterating and modifying a struct field does not leave field changed

The following modification, taken from genf, does not work:

    Packet->PacketDef = PacketDef
                
    Offset: int = 0
    for FD: pkt_field_def in Packet->PacketDef {
        switch FD.pkt_field_type
        case [`bool] {
            FD.Offset = Offset
            Offset = Offset + 1
        }           
        case [`long] {
            FD.Offset = Offset
            Offset = Offset + 8
        }
        case [`string] {
            FD.Offset = Offset
            Offset = Offset + 4
        }   
        case [list_type] {
            FD.Offset = Offset
            Offset = Offset + 4
        }   
    }

[colm] need to disable commit in recursive structures

Running into a case in statement commit where a sub-expression is causing a full commit too early.

This is occurring in closure expressions in the Rust grammar:

fn from_str() {
    fello();
    parse_i64( |x| -> a { if b { c } } )
}

[colm] single literal strings cannot be concatenated

Single literal strings cannot be concatenated because they are not LitPat tokens. They are Literals, and parsed differently in productions and patterns. But they are available in code. This inconsistency is somewhat confusing.

[colm] eradicate the use of replItemList

The replacement, accumulator, and string concat sections of the grammar all use this global (to Parser). This is problematic in the face of recursion. It should be replaced with code that passes the list up through the productions.
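
Why a shared list is fragile under recursion, and what passing the list up looks like, can be sketched in Python as below. This is a generic illustration, not the Colm grammar code: `repl_item_list` borrows the name from this issue, the nested-list input stands in for nested productions, and the exact failure mode inside Colm is not described in the issue.

```python
repl_item_list = []          # global accumulator, named after the issue

def parse_global(tree) -> list:
    """Fragile style: each production resets and fills the shared global."""
    global repl_item_list
    repl_item_list = []
    for el in tree:
        if isinstance(el, list):
            parse_global(el)                 # nested call resets the global
        else:
            repl_item_list.append(el)
    return repl_item_list

def parse_passing(tree) -> list:
    """Recursion-safe style: each call builds and returns its own list."""
    items = []
    for el in tree:
        if isinstance(el, list):
            items.append(parse_passing(el))  # nested result is passed up
        else:
            items.append(el)
    return items
```

For the input `["a", ["b"], "c"]`, `parse_global` loses `"a"` because the nested call resets the shared list, whereas `parse_passing` returns the full structure.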

[colm] segfault while parsing a constructor

Segfault while parsing constructor. Removing the trailing " or removing the optional name both eliminate the segfault.

lex
    token DEF / 'def' /
    token id / ( 'a' .. 'z' ) + /
    token SQOPEN /'['/
    token SQCLOSE /']'/
    token COLON /':'/
    ignore / ( '\n' | ' ' )+ /
end

def opt_prod_el_name
    [id COLON] 
|   []

def prod_el
    [opt_prod_el_name id]

def prod
    [SQOPEN prod_el SQCLOSE]

def cfl_def
    [DEF id prod]

cons Def: cfl_def 
    "def x \[x\]"

String concat hello_world_ext.lm fail

Hello!
Just cloned colm-suite. While going through the documentation and testing "hello_world_ext.lm", we get this error:

colm "hello_world_ext.lm"
hello_world_ext.lm:1:26: hello_world_ext.lm: parse error: hello_world_ext.lm:1:26: parse error

Cheers !
