kymckay / sqwhiff
A C++ implementation of a preprocessor, lexer, parser and semantic analyzer for the Real Virtuality engine's SQF scripting language.
License: GNU General Public License v3.0
Format: ".+?"(?!") or '.+?'(?!')
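That format (where a doubled quote inside a literal is an escaped quote rather than the end of the string) could be scanned roughly like this minimal sketch; scan_string and its signature are hypothetical illustrations, not the project's actual lexer API:

```cpp
#include <cstddef>
#include <string>

// Hypothetical sketch: scan an SQF string literal starting at `pos`, which
// points at the opening quote (" or '). A doubled quote inside the literal
// is an escaped quote, matching the ".+?"(?!") format above. Returns the
// literal's contents and advances `pos` past the closing quote.
std::string scan_string(const std::string& src, std::size_t& pos) {
    const char quote = src[pos++]; // remember which quote opened the string
    std::string value;
    while (pos < src.size()) {
        char c = src[pos++];
        if (c == quote) {
            // A doubled quote is an escape; a lone quote ends the literal
            if (pos < src.size() && src[pos] == quote) {
                value += quote;
                ++pos;
            } else {
                return value;
            }
        } else {
            value += c;
        }
    }
    // Reached EOF without a closing quote: an unclosed-string error case
    return value;
}
```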
I've come to realise that using new standards is preferable for a project that's brand new. Though there may be some compiler support research to do, I think C++20 would be ideal and if not then at least C++17 (e.g. for make_unique). Will need to update docs to reflect any change.
Currently I just output to console and throw to stop execution. As a matter of good practice, I should really log errors in some way for later output and use as an analysis tool.
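One way to sketch that: collect errors into a log instead of throwing, so a run can report everything it finds. The Error and ErrorLog names here are hypothetical, not types from the repo:

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch of deferred error reporting: components report into a
// shared log instead of printing and throwing, and the log is dumped (or
// inspected programmatically, for analysis-tool use) after the run.
struct Error {
    int line;
    int column;
    std::string type;    // e.g. "SyntaxError", "PreprocessingError"
    std::string message;
};

class ErrorLog {
public:
    void report(Error e) { errors_.push_back(std::move(e)); }
    bool has_errors() const { return !errors_.empty(); }
    const std::vector<Error>& all() const { return errors_; }

private:
    std::vector<Error> errors_;
};
```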
Currently only files are handled; it would be good to allow targeting a whole directory structure. May want a parameter to disable recursive processing.
#define ADDON TEST
#define DOUBLES(var1, var2) var1##_##var2
#define GVAR(var1) DOUBLES(ADDON, var1)
GVAR(var)
This should resolve to TEST_var, but it is currently resolving to TEST_ var, which is then considered as 2 tokens and causes a syntax error. Presumably a bug in the way I handle preprocessing concatenation.
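The intended ## behaviour (fusing the neighbouring tokens with no intervening space, so DOUBLES(TEST, var) yields TEST_var rather than TEST_ var) could be sketched as a pass over the substituted macro body; the token representation here is a deliberately simplified assumption:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch: after macro parameters are substituted, resolve each
// ## operator by joining its left and right neighbours into one token.
// Tokens are plain strings here for illustration only.
std::vector<std::string> concatenate(const std::vector<std::string>& tokens) {
    std::vector<std::string> out;
    for (std::size_t i = 0; i < tokens.size(); ++i) {
        if (tokens[i] == "##" && !out.empty() && i + 1 < tokens.size()) {
            out.back() += tokens[++i]; // fuse previous token with the next one
        } else {
            out.push_back(tokens[i]);
        }
    }
    return out;
}
```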
Position appears in the error message
A nullary keyword followed by a binary keyword is parsed incorrectly into the AST structure
allunits select 0 should become ((allunits) select <Dec:0>), but it is currently parsed as a unary: (allunits (select <Dec:0>))
||, or, &&, and, ==, !=, >, <, >=, <=, >>, else, min, max, mod, %, atan2, ^, #, not, !
Some of these may implicitly work with the current rules of constructing the AST, but at least switch is definitely unsupported:
if ... else
switch
while
for
forEach
Instead of relying on VS Code's tasks.json file to tell g++ to compile, I should really set up CMake or Bazel for a better IDE-agnostic workflow that will allow more configurability.
Goes hand-in-hand with #1
Format: [expr (, expr)*]
Handle [ and ] characters when encountered.

Would be good to add in tests for some self-documentation and to avoid regression as things get more complex.
GoogleTest seems like a decent choice, just requires some better build setup using CMake or Bazel.
Similar to #7
Same reasoning, because both an assignment statement and an expression can start with an identifier token (or a keyword token for SQF's "private assignment" modifier).
Requires the same solution (token lookahead), but up to two tokens ahead since assignment can be: <keyword> <identifier> <assign>
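The two-token lookahead described above could look something like this; the TokenKind/Token/Parser names are illustrative stand-ins, not the repo's actual types:

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Simplified stand-ins for illustration only.
enum class TokenKind { Keyword, Identifier, Assign, Other };
struct Token { TokenKind kind; std::string text; };

// Sketch of up-to-two-token lookahead: an assignment is either
// <identifier> <assign>, or <keyword> <identifier> <assign> for SQF's
// "private assignment" modifier, so peeking two tokens ahead disambiguates
// it from an expression that merely starts with an identifier/keyword.
class Parser {
public:
    explicit Parser(std::deque<Token> tokens) : tokens_(std::move(tokens)) {}

    // Peek at the token `n` positions ahead without consuming it.
    const Token* peek(std::size_t n = 0) const {
        return n < tokens_.size() ? &tokens_[n] : nullptr;
    }

    bool at_assignment() const {
        const Token* t0 = peek(0);
        const Token* t1 = peek(1);
        const Token* t2 = peek(2);
        if (t0 && t1 && t0->kind == TokenKind::Identifier &&
            t1->kind == TokenKind::Assign)
            return true;
        return t0 && t1 && t2 && t0->kind == TokenKind::Keyword &&
               t1->kind == TokenKind::Identifier &&
               t2->kind == TokenKind::Assign;
    }

private:
    std::deque<Token> tokens_;
};
```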
Currently the analyzer is only checking arity and that is hardcoded into the AST traversal functions.
For future scalability, it would make sense to use some sort of strategy pattern where functions are defined elsewhere and the analyzer just makes calls out to those and captures the errors they report. This would make it easier to provide a method of disabling/enabling individual linting errors/warnings too.
Should probably be done before (or as part of) the second half of #20
Currently analysis rules are stored in a map of int -> rule in this file. This was done with the idea of allowing the user to specify rules to ignore/skip.
Thinking is:
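The int -> rule map idea could be sketched like this; the rule body, AnalysisContext and analyse names are placeholder assumptions rather than the project's real interfaces:

```cpp
#include <algorithm>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical sketch: each rule is a callable keyed by an integer ID, so a
// user-supplied skip list can disable individual lint errors/warnings.
struct AnalysisContext { /* AST node, scope info, etc. would live here */ };
using Rule = std::function<std::vector<std::string>(const AnalysisContext&)>;

std::map<int, Rule> rules = {
    {1, [](const AnalysisContext&) {
        // Placeholder body; a real rule (e.g. an arity check) would inspect
        // the context and return any errors it finds.
        return std::vector<std::string>{"example: arity mismatch"};
    }},
};

// Run every rule except those the user asked to skip, gathering errors.
std::vector<std::string> analyse(const AnalysisContext& ctx,
                                 const std::vector<int>& skip) {
    std::vector<std::string> errors;
    for (const auto& [id, rule] : rules) {
        if (std::find(skip.begin(), skip.end(), id) != skip.end()) continue;
        auto found = rule(ctx);
        errors.insert(errors.end(), found.begin(), found.end());
    }
    return errors;
}
```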
Having multiple semicolons in a row is valid syntax (a bunch of no-operations), but multiple commas produce an error. However, commas can still appear at the end of statements.
The implementation currently doesn't reflect this behaviour.
Useful for unit testing to be able to pass in fixed strings instead of a file stream
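Accepting a std::istream& rather than a file stream specifically would let tests feed fixed strings through std::istringstream; count_chars below is just a stand-in for any component that reads from the stream:

```cpp
#include <istream>
#include <sstream>

// If the preprocessor/lexer take std::istream& instead of std::ifstream,
// unit tests can construct input from a string. This stand-in function
// simply counts the characters it reads from the stream.
int count_chars(std::istream& in) {
    int n = 0;
    char c;
    while (in.get(c)) ++n;
    return n;
}
```

Usage in a test then needs no file on disk: `std::istringstream src("_x = 1;");` can be passed anywhere a `std::istream&` is expected.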
Part of #13 - a more thought-through method of expansion which should allow for easier handling of nested expansion and expansion within arguments, as well as parameter replacement. Sequential processing resolves some of the edge-case issues that can occur when attempting to do this via regex.
This should also improve the code structure and keep the responsibilities of each class more focused (at the moment there's a lot going on in the preprocessor)
Sub-task of #13
Loose design spec:
Relative paths (e.g. ../) should resolve against the including file's directory.
Support both inclusion forms (<file.txt> or "file.txt").
For absolute paths (starting with \), provide means of specifying a directory to act as this path.
Sub-task of #13
Should be relatively straightforward: in terms of implementation, could either run through and push all applicable tokens into the lookahead cache, or could save state so future calls act accordingly.
Parser errors seem to be working fine and output to console when running the main program, but there's no output when I expect lexer errors (encountering a ? char, or an unclosed string).
Currently errors are just spit out in the order they are encountered with no file information (relevant now that inclusion preprocessing is in). I think it would be nice to:
path/to/example.sqf
1:5 PreprocessingError - Recursive inclusion of file: ./example.sqf
Currently the parser implements peek logic to lookahead to upcoming tokens (for assignment identification).
For consistency with other interfaces this logic should really be in the lexer (istream allows peeking at the next get, preprocessor allows peeking at the next get).
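A lexer peek matching the istream-style "peek at next get" interface could cache tokens read ahead of the consumer; the Token/Lexer shapes here are simplified assumptions, and read_token is a placeholder for the real tokenisation:

```cpp
#include <deque>
#include <string>

struct Token { std::string text; }; // placeholder token type

// Hypothetical sketch: give the lexer the same peek-next-get interface as
// std::istream and the preprocessor, by caching tokens that were read
// ahead but not yet consumed.
class Lexer {
public:
    Token get() {
        if (!cache_.empty()) {
            Token t = cache_.front();
            cache_.pop_front();
            return t;
        }
        return read_token();
    }

    // Look at the next token without consuming it.
    const Token& peek() {
        if (cache_.empty()) cache_.push_back(read_token());
        return cache_.front();
    }

private:
    // Placeholder: real tokenisation would happen here; this stub only ever
    // produces an end-of-input marker.
    Token read_token() { return Token{"<eof>"}; }
    std::deque<Token> cache_;
};
```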
I probably need to introduce a lexical preprocessor to truly handle these since they'd be resolved before the sqf lexer takes over
#include
#define MACRO value
#define MACRO_FUNC(ARG1, ARG2, ...)
# (stringification)
## (token concatenation)
\ (exactly before a newline, multi-line definition)
#undef
#if
#ifdef
#ifndef
#else
#endif
Currently it isn't, need to make the lexer result lowercase before checking for keywords
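Lowercasing the lexeme before the keyword-table lookup could be as simple as the sketch below (the original spelling can still be kept on the token for output); to_lower is an illustrative helper, not an existing function in the repo:

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// SQF keywords are case-insensitive, so lowercase a copy of the lexeme
// before checking it against the keyword table.
std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return std::tolower(c); });
    return s;
}
```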
Currently the parser and grammar don't capture this, they enforce the opposite behaviour
Format: {statement_list}
Handle { and } tokens.

Now that binary and unary command data is captured, the most basic check to implement in the analyser is an arity check.
Has been hardcoded to read "test.txt" for a while now, this is a solid step to move towards being a command line tool
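Replacing the hardcoded "test.txt" with a command-line argument could start with something like this; target_file is a hypothetical helper, not the repo's actual entry point:

```cpp
#include <string>

// Minimal sketch of taking the target file from argv instead of a
// hardcoded "test.txt". Returns an empty string when no file was given,
// so the caller can print a usage message and exit.
std::string target_file(int argc, char* argv[]) {
    return argc >= 2 ? std::string(argv[1]) : std::string();
}
```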
Currently some objects are instantiated in the global namespace for use in various places (e.g. the SQF command data maps).
These can be put into appropriate namespaces to not pollute the global namespace and to improve code readability (clearer where they come from when used).
As a result of my decision to tokenise SQF commands as a single type of token and spot misuse errors semantically instead of syntactically (due to the parsing complication that would otherwise be required, since commands can have all or some of nullary, unary and binary forms), the parser is currently unable to distinguish nullary from unary commands (since they both start with a keyword token).
I need to add a way for the parser to see ahead to the next token in order to handle this.