slackhq / tree-sitter-hack
Hack grammar for tree-sitter
License: MIT License
We should run bin/test-corpus at least to start. We might also want a test that verifies src/grammar.json, src/node-types.json, and src/parser.c are up to date with every new PR.
Enum class labels which use the hash (#) character are not correctly parsed by tree-sitter-hack.
This means that # should also no longer be parsed as the start of a comment.
The following should parse as valid Hack:
E#B;
Instead, it parses as an unterminated statement:
(script [0, 0] - [1, 0]
(ERROR [0, 0] - [0, 1]
(identifier [0, 0] - [0, 1]))
(comment [0, 1] - [0, 4]))
The parser does not treat include, include_once, require and require_once as keywords, and thus gets confused when these are used as the name of a function called on a pipe expression.
For instance, the following code:
new Foo() |> $$->include();
new Foo() |> $$->include_once();
new Foo() |> $$->require();
new Foo() |> $$->require_once();
Is incorrectly parsed as follows:
(ERROR
(new_expression
(qualified_identifier
(identifier))
(arguments))
(pipe_variable)
(ERROR
(parameters))
(new_expression
(qualified_identifier
(identifier))
(arguments))
(pipe_variable)
(ERROR
(parameters))
(new_expression
(qualified_identifier
(identifier))
(arguments))
(pipe_variable)
(ERROR
(parameters))
(new_expression
(qualified_identifier
(identifier))
(arguments))
(pipe_variable)
(parameters))
Steps to reproduce: add a new test with the content
new Foo() |> $$->include();
then run tree-sitter test.
Expected behavior: no errors are reported in the syntax tree.
The parser does not recognise typed collection initializers.
This is valid Hack code to generate a vector of strings:
$x = Vector<string> {'foo', 'bar'};
Yet, the parser does not recognize the syntax and mistakes it for a function pointer:
(expression_statement
(function_pointer
(qualified_identifier
(identifier))
(type_arguments
(type_specifier)))
(MISSING ";"))
(compound_statement
(ERROR
(string)
(string)))
(empty_statement))
Steps to reproduce: add
Set<string> {'foo', 'bar'};
to the Collection test in test/corpus/expressions.txt, then run tree-sitter test.
Expected behavior: no errors are reported in the syntax tree.
The issue is more general than Set<type arguments> { ... }: it applies to Map(s) and Vector(s) as well, and possibly to more types.
See https://docs.hhvm.com/hack/contexts-and-capabilities/introduction
Contexts and capabilities are a new language feature and have been enabled by default since HHVM 4.93. They include a syntax extension which tree-sitter-hack currently cannot parse.
code sample
function empty_context()[]: void {}
parser output
function_declaration [0, 0] - [0, 35]
name: identifier [0, 9] - [0, 22]
parameters [0, 22] - [0, 24]
ERROR [0, 24] - [0, 26]
return_type: type_specifier [0, 28] - [0, 32]
body: compound_statement [0, 33] - [0, 35]
This is valid Hack code that declares a module-level attribute:
<?hh
<<file: SomeClass(SomeOtherClass::class)>>
Yet, the parser does not recognize the syntax:
(script
(ERROR
(attribute_modifier
(ERROR
(identifier))
(qualified_identifier
(identifier))
(arguments
(argument
(scoped_identifier
(qualified_identifier
(identifier))
(identifier)))))))
Steps to reproduce: add a new test with the content
<<file: SomeClass(SomeOtherClass::class)>>
in test/corpus/declarations.txt, then run tree-sitter test.
Expected behavior: no errors are reported in the syntax tree.
As far as I can tell, we do not have tests for xhp_attribute. We should add them. Something like this:
<frag info={get_str('info')} />;
Is it https://github.com/returntocorp/ocaml-tree-sitter-languages or https://github.com/returntocorp/ocaml-tree-sitter-semgrep that tests different Hack repos against their parser?
I wonder if we can leverage some of their work and run their tests as part of this repo's CI tests (first we need to add CI tests).
The parser does not recognise expression tree syntax.
The following is valid Hack:
$x = SomeVisitorClass`some expression`;
Yet, the parser does not parse it correctly:
(script
(expression_statement
(binary_expression
(variable)
(qualified_identifier
(identifier)))
(ERROR
(UNEXPECTED '`')
(identifier)
(identifier)
(UNEXPECTED '`'))))
Steps to reproduce: add a new test with the content
$x = SomeVisitorClass`some expression`;
then run tree-sitter test.
Expected behavior: no errors are reported in the syntax tree.
Hi! I read/write a lot of Hack code in my day-to-day. My primary editor is Neovim and I've been hoping to leverage tree-sitter to power syntax highlighting. However, the highlighting queries for Hack are pretty bare-bones: https://github.com/slackhq/tree-sitter-hack/blob/main/queries/highlights.scm
I was curious if you all would accept contributions that bring, at the very least, the highlight queries more in line with other tree-sitter grammars (here's PHP for comparison).
I'd be potentially interested in augmenting the grammar or the parser as well to allow for more fine-tuned highlight queries, but I'd of course defer to you all for anything backwards incompatible.
Thank you!
Clang returns a warning about returning a pointer to a stack variable:
CXX(target) Release/obj.target/tree_sitter_hack_binding/bindings/node/binding.o
CXX(target) Release/obj.target/tree_sitter_hack_binding/src/scanner.o
../src/scanner.cc:106:14: warning: address of stack memory associated with local variable 'str' returned [-Wreturn-stack-address]
return str.c_str();
^~~
1 warning generated.
The affected function returns a pointer into a local string that is destroyed at the end of the function scope. This is a use-after-free bug.
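To make the warning concrete, here is a minimal sketch of the pattern and one possible fix. The function at src/scanner.cc:106 is not reproduced in this report, so the name describe, its parameters, and its body are hypothetical stand-ins; only the return str.c_str() shape is taken from the compiler output above.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper; NOT the real code at src/scanner.cc:106.
//
// Buggy shape that triggers -Wreturn-stack-address:
//
//   const char *describe(const char *prefix, int c) {
//     std::string str(prefix);
//     str.push_back(static_cast<char>(c));
//     return str.c_str();  // str is destroyed here, so the pointer dangles
//   }

// One possible fix: return the std::string by value so the caller owns the
// character buffer for as long as it needs it.
std::string describe(const char *prefix, int c) {
  assert(prefix != nullptr);  // precondition assumed for this sketch
  std::string str(prefix);
  str.push_back(static_cast<char>(c));
  return str;
}
```

A caller that still needs a const char * can keep the returned std::string in a local variable and call c_str() on that, so the pointer stays valid for the variable's lifetime. Whether returning by value suits the real scanner depends on how the result is used there.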
Steps to reproduce the behavior:
N/A, not a runtime bug.
tree-sitter-hack version: dd7c1bb
Run tree-sitter generate with CC=clang-14
OS version(s): Ubuntu 21.10
N/A
Since HHVM 4.79, Hack has supported a function reference syntax. See https://docs.hhvm.com/hack/functions/function-references. Currently, this causes an error.
Case:
namespace\test_call();
Expected Output:
(expression_statement
(call_expression
function: (qualified_identifier
(identifier)
(identifier))
(arguments))))
Actual Output:
(script
(namespace_declaration)
(ERROR
(qualified_identifier
(identifier))
(parameters))
(empty_statement))
For example, we can use __func__ in the ret macro instead of passing in the function name as a string. There are lots of similar cleanups that need to be done.
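As a rough illustration of the __func__ cleanup, the sketch below shows a tracing macro that takes the function name from __func__ rather than from a string argument. The scanner's real ret macro is not shown in this issue, so the macro name RET, its bool result type, its output format, and the helper is_identifier_start are all assumptions made for the example.

```cpp
#include <cstdio>

// Hypothetical tracing macro for this sketch; the scanner's real `ret` macro
// is not shown in the issue and may differ.
//
// Assumed "before" shape: the function name is repeated by hand at each call
// site, e.g. ret("is_identifier_start", value).
//
// "After": __func__ (standard since C++11) names the enclosing function, so
// the string argument is no longer needed.
#define RET(value)                                        \
  do {                                                    \
    const bool ret_value_ = (value);                      \
    std::fprintf(stderr, "%s -> %s\n", __func__,          \
                 ret_value_ ? "true" : "false");          \
    return ret_value_;                                    \
  } while (0)

// Hypothetical scanner helper that uses the macro.
bool is_identifier_start(int c) {
  RET((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_');
}
```

Storing the value in a local first means the argument expression is evaluated only once, which matters if a call site passes an expression with side effects.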
Enum classes do not appear to be parsable.
Using the example from https://docs.hhvm.com/hack/built-in-types/enum-class
enum class Random : mixed {
int X = 42;
string S = 'foo';
}
fails to parse.
Parse tree:
enum_declaration [0, 0] - [3, 1]
ERROR [0, 5] - [0, 10]
identifier [0, 5] - [0, 10]
name: identifier [0, 11] - [0, 17]
type: type_specifier [0, 20] - [0, 25]
enumerator [1, 2] - [1, 13]
identifier [1, 2] - [1, 5]
ERROR [1, 6] - [1, 7]
identifier [1, 6] - [1, 7]
integer [1, 10] - [1, 12]
enumerator [2, 2] - [2, 19]
identifier [2, 2] - [2, 8]
ERROR [2, 9] - [2, 10]
identifier [2, 9] - [2, 10]
string [2, 13] - [2, 18]
Expected behavior: enum classes are correctly parsed.
OS version(s): Linux