Comments (9)
The issue here is that ?> in PHP comes with an implicit semicolon, so from the parser's perspective there is a semicolon-statement (;) after the docblock. As those are no-ops the parser doesn't generate nodes for them, so the doc block is dropped.
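To make the situation concrete, here is a small sketch that dumps the token names PHP's built-in tokenizer produces for a file consisting of a docblock, a closing tag and some inline HTML. The sample source string is made up for illustration. The tokenizer reports the closing tag as T_CLOSE_TAG; nothing statement-like sits between the doc comment and that tag, so the only "statement" the parser can see there is the implicit semicolon.

```php
<?php
// Hypothetical sample file: docblock, closing tag, then inline HTML.
$source = "<?php\n/** File docblock */\n?>\n<p>html</p>";

// Collect the token names in source order; single-character tokens
// (plain strings) are kept as-is.
$names = [];
foreach (token_get_all($source) as $token) {
    $names[] = is_array($token) ? token_name($token[0]) : $token;
}

echo implode(' ', $names), "\n";
```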
There are several ways this could be solved:
- Fetching the file doc block using the tokenizer. Code would look similar to this:

  <?php
  function getFileDocComment($sourceCode) {
      foreach (token_get_all($sourceCode) as $token) {
          if ($token[0] === T_DOC_COMMENT) {
              return $token[1];
          } elseif ($token[0] !== T_OPEN_TAG && $token[0] !== T_WHITESPACE) {
              // only allow an opening tag and whitespace before file doc comment
              return false;
          }
      }
      return false;
  }

  This has the advantage that it'll work even if there is nothing after the doc comment at all (i.e. the file contains only a doc comment and nothing else).
- Instead of just dropping semicolon-statements I could return a node for them. Those would capture the doc comment. (This wouldn't work if the doc comment is the only thing in the file and there isn't even a closing ?> PHP tag.)
- I could probably add the doc comment to the node following the semicolon (not yet sure how this would be done, but there probably is a way). (This wouldn't work if there is only a closing ?> tag and no inline HTML after it.)
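As a rough check of the tokenizer option, the sketch below repeats the helper so it runs on its own and feeds it three made-up inputs: a file with only a docblock, a file with a docblock followed by a closing tag and HTML, and a file where code comes first. The explicit is_array() guard and the sample strings are additions for illustration, not part of the original snippet.

```php
<?php
// Tokenizer-based helper from the first option, repeated so this runs standalone.
function getFileDocComment(string $sourceCode)
{
    foreach (token_get_all($sourceCode) as $token) {
        if (!is_array($token)) {
            // A single-character token such as ';' means real code has started.
            return false;
        }
        if ($token[0] === T_DOC_COMMENT) {
            return $token[1];
        }
        if ($token[0] !== T_OPEN_TAG && $token[0] !== T_WHITESPACE) {
            // Only an opening tag and whitespace may precede the file doc comment.
            return false;
        }
    }
    return false;
}

// Sample inputs (hypothetical).
$onlyDocblock = "<?php\n/** File summary */";
$closingTag   = "<?php\n/** File summary */\n?>\n<p>html</p>";
$codeFirst    = "<?php\n\$a = 1;\n/** Not a file docblock */";

var_dump(getFileDocComment($onlyDocblock)); // the docblock text
var_dump(getFileDocComment($closingTag));   // the docblock text
var_dump(getFileDocComment($codeFirst));    // false
```

Because this works purely on tokens, it does not depend on how the parser treats the implicit semicolon, which is why it also covers the docblock-only file.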
All variants seem somewhat ugly, so I'm not really sure what to do about this :/
from php-parser.
I agree with you that the above solutions are less than optimal. Let's see if another solution is possible.
For me to venture there I need some additional information about the current behaviour. I have read most of the Lexer and Parser to get an idea of how the data is used, but the parser in particular is too hard to study comprehensively in a short amount of time.
As such, is the following use case correct behaviour?
Code
<?php
// this is a test
/**
 * Short description
 */
/* this is a common multiline comment */
$a * $b == 1 + 2;
According to a var_dump of the nodes, the 'Expr_Equal' node, the 'Expr_Mul' node and the 'Expr_Variable' node for $a ALL have the comments 'this is a test', 'Short description' and 'this is a common multiline comment'.
I would have expected that 'this is a test' and 'this is a common multiline comment' would have belonged to the file (shouldn't the file actually be a top parent node?) and that the Short description would only have belonged to the Expr_Equal node.
Example clarifying my last statement:
<?php
/** @var \SimpleXMLElement $a */
$a * $b == 1 + 2;
The above indicates that the Expr_Equal contains a variable $a which is of type SimpleXMLElement in this specific context. It may be assumed that $a is the same type in subsequent contexts but only here we know for sure.
from php-parser.
Hey @nikic,
I'd love to hear your thoughts on the above if you have time,
Thanks in advance.
from php-parser.
Hey @mvriel. I must have missed the notification for your comment, sorry :/
Regarding the comments, no, this is not the intended behavior and I didn't notice that this happens until now. Though thinking about it again, it's quite logical that it happens: Just like all (also nested) nodes get assigned a line number, nested nodes also get assigned the comments.
Fixing this will be rather hard though, as it can't really be done in a generic way, at least I don't see one. Rather I'd have to assign comments only to statements (and expressions in statement use). But it's probably still worth doing for the memory savings.
from php-parser.
maybe a node / container / something with "lostandfound"-comments ? :) aka "not assignable" - comments?
from php-parser.
Ran into this today too.
All docblocks/comments get accumulated to the next node found or dropped if none.
This makes it really inconvenient to use the same visitor/parser concept for a mixed tree where some files have docblocks without comments, etc.
I wish there was a mode where docblocks/comments were just emitted as individual nodes like anything else, without being required to be attached somewhere.
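For illustration only, something close to that can be approximated outside the parser with PHP's tokenizer. The sketch below returns every comment as its own record with kind, line and text, using the example source from earlier in this thread. collectComments() is a hypothetical helper and not part of php-parser's API.

```php
<?php
/**
 * Return every comment in $sourceCode as a standalone record, in source order.
 * Hypothetical helper; it does not attach comments to any node.
 */
function collectComments(string $sourceCode): array
{
    $comments = [];
    foreach (token_get_all($sourceCode) as $token) {
        if (is_array($token) && ($token[0] === T_COMMENT || $token[0] === T_DOC_COMMENT)) {
            $comments[] = [
                'kind' => token_name($token[0]), // T_COMMENT or T_DOC_COMMENT
                'line' => $token[2],             // line on which the comment starts
                'text' => trim($token[1]),
            ];
        }
    }
    return $comments;
}

// The example file from earlier in the thread.
$source = "<?php\n// this is a test\n/**\n * Short description\n */\n"
        . "/* this is a common multiline comment */\n\$a * \$b == 1 + 2;\n";

foreach (collectComments($source) as $comment) {
    printf("%s (line %d): %s\n", $comment['kind'], $comment['line'], $comment['text']);
}
```

A visitor could then decide for itself which of these records belongs to the file and which to a following statement, at the cost of a second pass over the source.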
from php-parser.
Commit 7eac2cf adds a Nop statement if there are any trailing comments in a statement list. This should also cover the case of the original report.
I've created a separate issue for the duplicate comment assignment (#253), so we can close this issue at last :)
from php-parser.
Thanks @nikic! I had missed the addition of the NOP statement so that should enable me to introduce file-level docblocks. Thanks!
from php-parser.
@mvriel You didn't miss anything, I only added them yesterday ;) The date info on that commit is off, because I rebased it off an older branch.
from php-parser.