seattlerb / ruby_parser

ruby_parser is a ruby parser written in pure ruby. It outputs s-expressions which can be manipulated and converted back to ruby via the ruby2ruby gem.

Home Page: http://www.zenspider.com/projects/ruby_parser.html

Ruby 60.55% Yacc 38.13% REXX 1.32%

ruby_parser's Introduction

ruby_parser

home

github.com/seattlerb/ruby_parser

bugs

github.com/seattlerb/ruby_parser/issues

rdoc

docs.seattlerb.org/ruby_parser

DESCRIPTION:

ruby_parser (RP) is a ruby parser written in pure ruby (utilizing racc, which does by default use a C extension). It outputs s-expressions which can be manipulated and converted back to ruby via the ruby2ruby gem.

As an example:

def conditional1 arg1
  return 1 if arg1 == 0
  return 0
end

becomes:

s(:defn, :conditional1, s(:args, :arg1),
  s(:if,
    s(:call, s(:lvar, :arg1), :==, s(:lit, 0)),
    s(:return, s(:lit, 1)),
    nil),
  s(:return, s(:lit, 0)))

Tested against 801,039 files from the latest of all rubygems (as of 2013-05):

  • 1.8 parser is at 99.9739% accuracy, 3.651 sigma

  • 1.9 parser is at 99.9940% accuracy, 4.013 sigma

  • 2.0 parser is at 99.9939% accuracy, 4.008 sigma

  • 2.6 parser is at 99.9972% accuracy, 4.191 sigma

  • 3.0 parser has a 100% parse rate.

    • Tested against 2,672,412 unique ruby files across 167k gems.

    • As do all the others now, basically.
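The sigma figures above look like the two-sided normal-distribution equivalent of each accuracy. That reading is inferred from the numbers, not stated by the author. A minimal sketch under that assumption, solving erf(z / sqrt(2)) = accuracy by bisection (sigma_for is a name made up for this example):

```ruby
# Assumption: "sigma" is the z value whose two-sided normal coverage equals
# the accuracy, i.e. erf(z / sqrt(2)) == accuracy. Solved by bisection.
def sigma_for(accuracy)
  lo = 0.0
  hi = 10.0
  60.times do
    mid = (lo + hi) / 2
    if Math.erf(mid / Math.sqrt(2)) < accuracy
      lo = mid   # coverage too small, move right
    else
      hi = mid   # coverage large enough, move left
    end
  end
  (lo + hi) / 2
end

sigma_for(0.999739) # compare with the 3.651 quoted for the 1.8 parser
sigma_for(0.999940) # compare with the 4.013 quoted for the 1.9 parser
```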

FEATURES/PROBLEMS:

  • Pure ruby, no compiles.

  • Includes preceding comment data for defn/defs/class/module nodes!

  • Incredibly simple interface.

  • Output is 100% equivalent to ParseTree.

    • Can utilize PT’s SexpProcessor and UnifiedRuby for language processing.

  • Known Issue: Speed is now pretty good, but can always improve:

    • RP parses a corpus of 3702 files in 125s (avg 108 Kb/s)

    • MRI+PT parsed the same in 67.38s (avg 200.89 Kb/s)

  • Known Issue: Code is much better, but still has a long way to go.

  • Known Issue: Totally awesome.

  • Known Issue: line number values can be slightly off. Parsing LR sucks.

SYNOPSIS:

RubyParser.new.parse "1+1"
# => s(:call, s(:lit, 1), :+, s(:lit, 1))

You can also use Ruby19Parser, Ruby18Parser, or RubyParser.for_current_ruby:

RubyParser.for_current_ruby.parse "1+1"
# => s(:call, s(:lit, 1), :+, s(:lit, 1))

DEVELOPER NOTES:

To add a new version:

  • New parser should be generated from lib/ruby_parser.yy.

  • Extend lib/ruby_parser.yy with new class name.

  • Add new version number to V2/V3 in Rakefile for rule creation.

  • Add new ruby_parse "x.y.z" line to Rakefile for rake compare (line ~300).

  • Require generated parser in lib/ruby_parser.rb.

  • Add new V## = ::Ruby##Parser; end to ruby_parser.rb (bottom of file).

  • Add empty TestRubyParserShared##Plus module and TestRubyParserV## to test/test_ruby_parser.rb.

  • Extend Manifest.txt with generated file names.

  • Add new version number to sexp_processor’s pt_testcase.rb in all_versions.

Until all of these are done, you won’t have a clean test run.

REQUIREMENTS:

  • ruby. woot.

  • sexp_processor for Sexp and SexpProcessor classes, and testing.

  • racc full package for parser development (compiling .y to .rb).

INSTALL:

  • sudo gem install ruby_parser

LICENSE:

(The MIT License)

Copyright © Ryan Davis, seattle.rb

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the ‘Software’), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED ‘AS IS’, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

ruby_parser's People

Contributors

evanphx, zenspider

ruby_parser's Issues

Inconsistency in AST for classes and modules

If I use ruby_parser to parse a module with a single method defined, I get the following output:

s(:module, :M, s(:scope, s(:defn, :foo, s(:args), s(:scope, s(:block, s(:lit, 1))))))

If I define another method, the output changes to this:

s(:module, :M, s(:scope, s(:block, s(:defn, :foo, s(:args), s(:scope, s(:block, s(:lit, 1)))), s(:defn, :bar, s(:args), s(:scope, s(:block, s(:lit, 2)))))))

In addition to the second method definition, now the entire body of the module is wrapped in a s(:block). This means that when using ruby_parser to parse classes and modules, I need to guard against the case where I get back a single sexp. This seems like an inconsistency that could easily be fixed by always wrapping a class or module's body in a s(:block).
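Until the output is made consistent, the guard described above can live in one place on the caller's side. A minimal sketch, assuming the two sexp shapes shown in this issue and using plain arrays as stand-ins for Sexp objects (Sexp subclasses Array); body_statements is a hypothetical helper, not part of ruby_parser:

```ruby
# Hypothetical helper (not part of ruby_parser): returns the statements of a
# :module body as a list, whether or not they were wrapped in :block.
# Plain arrays stand in for Sexp objects here.
def body_statements(mod)
  scope = mod.last                 # [:scope, <body>]
  body  = scope[1]
  return [] if body.nil?           # assumed shape of an empty body
  body.first == :block ? body.drop(1) : [body]
end

one = [:module, :M,
       [:scope, [:defn, :foo, [:args], [:scope, [:block, [:lit, 1]]]]]]
two = [:module, :M,
       [:scope, [:block,
                 [:defn, :foo, [:args], [:scope, [:block, [:lit, 1]]]],
                 [:defn, :bar, [:args], [:scope, [:block, [:lit, 2]]]]]]]

body_statements(one).map { |defn| defn[1] } # => [:foo]
body_statements(two).map { |defn| defn[1] } # => [:foo, :bar]
```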

Broken nesting parsing in tokadd_string

Various parse issues are present in tokadd_string. As a simple example, this parses:

%Q(
    {
      #{yield `baz`};
    }
)

but this doesn't:

%Q{
    {
      #{yield `baz`};
    }
}

The resulting error is that it is getting to the EOF without closing the string:

SyntaxError for _opal/test2.rb: unterminated string meets end of file. near line 1: ""
done

Observing the tokens gathered in #advance it looks like the "`" following "baz" is being parsed as string content.

No way to distinct between NODE_FCALL and NODE_VCALL

Best described by this snippet:

> Ruby19Parser.new.parse('a')
=> s(:call, nil, :a)
> Ruby19Parser.new.parse('a()')
=> s(:call, nil, :a)

The first invocation could end up being treated as a local variable access in an eval context, whereas the second could not.
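Ruby itself resolves the two spellings differently once a local variable of the same name exists in the binding. A small illustration in plain Ruby, with no ruby_parser involved (probe and resolve are names made up for this sketch):

```ruby
# A method and, later, a local variable that share the name "probe".
def probe
  :from_method
end

# Evaluates source in a binding where a local named probe also exists.
def resolve(source)
  scope = binding
  scope.local_variable_set(:probe, :from_local)
  scope.eval(source)
end

resolve("probe")   # => :from_local  (bare name: local variable wins)
resolve("probe()") # => :from_method (parentheses force a method call)
```

A tool that re-evaluates parsed code cannot reproduce this difference if both forms arrive as the same s(:call, nil, :a) node.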

Does not complain about duplicate argument names

Demonstrated by this snippet:

[79] pry(main)> def a(x,x); end
SyntaxError: (eval):2: duplicated argument name
def a(x,x); end
          ^
[79] pry(main)> Ruby19Parser.new.parse('def a(x,x) end')
=> s(:defn, :a, s(:args, :x, :x), s(:nil))
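Callers that need the check today can run it over the :args node after parsing. A minimal sketch, assuming the flat s(:args, ...) shape shown above and using a plain array as a stand-in for the Sexp; duplicate_args is a hypothetical helper, not part of ruby_parser:

```ruby
# Hypothetical post-parse check (not part of ruby_parser): lists argument
# names that occur more than once in an :args node.
def duplicate_args(args_node)
  names = args_node.drop(1).grep(Symbol)  # skip the leading :args tag
  names.select { |name| names.count(name) > 1 }.uniq
end

duplicate_args([:args, :x, :x]) # => [:x]
duplicate_args([:args, :x, :y]) # => []
```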

IndexError

After upgrading to .a6 I get IndexErrors. e.g.

IndexError: index 18128 out of string

gems/ruby_parser-3.0.0.a6/lib/ruby_parser_extras.rb:60 • []=
gems/ruby_parser-3.0.0.a6/lib/ruby_parser_extras.rb:60 • unread_many
gems/ruby_parser-3.0.0.a6/lib/ruby_lexer.rb:137 • heredoc
gems/ruby_parser-3.0.0.a6/lib/ruby_lexer.rb:1442 • yylex_string
gems/ruby_parser-3.0.0.a6/lib/ruby_lexer.rb:671 • yylex
gems/ruby_parser-3.0.0.a6/lib/ruby_lexer.rb:92 • advance
gems/ruby_parser-3.0.0.a6/lib/ruby_parser_extras.rb:874 • next_token
(eval):3 • _racc_do_parse_c
(eval):3 • do_parse
gems/ruby_parser-3.0.0.a6/lib/ruby_parser_extras.rb:923 • process
lib/quality/ruby/parser.rb:20 • block in parse

here doc support?

looks like you're trying to deal with my here doc...
but not succeeding.

did I do something odd?

 def template_details
    <<-DOC
      Genomic: #{ranged_param_details(:genomic_template_ngul, "ng/µL")},
      Simple: #{ranged_param_details(:simple_template_ngul, "ng/µL")}
    DOC
  end
flog test.rb 

ERROR: parsing ruby file test.rb
ERROR! Aborting. You may want to run with --continue.
/Users/langhorst/.rvm/gems/ruby-1.9.2-p180/gems/ruby_parser-2.3.1/lib/ruby_lexer.rb:395:in `rb_compile_error': can't match /[ \t]*DOC(\r?\n|\z)/ anywhere in . near line 2: "" (SyntaxError)
    from /Users/langhorst/.rvm/gems/ruby-1.9.2-p180/gems/ruby_parser-2.3.1/lib/ruby_lexer.rb:146:in `heredoc'
    from /Users/langhorst/.rvm/gems/ruby-1.9.2-p180/gems/ruby_parser-2.3.1/lib/ruby_lexer.rb:1300:in `yylex_string'
    from /Users/langhorst/.rvm/gems/ruby-1.9.2-p180/gems/ruby_parser-2.3.1/lib/ruby_lexer.rb:642:in `yylex'
    from /Users/langhorst/.rvm/gems/ruby-1.9.2-p180/gems/ruby_parser-2.3.1/lib/ruby_lexer.rb:68:in `advance'
    from /Users/langhorst/.rvm/gems/ruby-1.9.2-p180/gems/ruby_parser-2.3.1/lib/ruby_parser_extras.rb:713:in `next_token'
    from /Users/langhorst/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/racc/parser.rb:99:in `_racc_do_parse_c'
    from /Users/langhorst/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/racc/parser.rb:99:in `do_parse'
    from /Users/langhorst/.rvm/gems/ruby-1.9.2-p180/gems/ruby_parser-2.3.1/lib/ruby_parser_extras.rb:750:in `process'
    from /Users/langhorst/.rvm/gems/ruby-1.9.2-p180/gems/flog-2.5.3/lib/flog.rb:241:in `block in flog'
    from /Users/langhorst/.rvm/gems/ruby-1.9.2-p180/gems/flog-2.5.3/lib/flog.rb:235:in `each'
    from /Users/langhorst/.rvm/gems/ruby-1.9.2-p180/gems/flog-2.5.3/lib/flog.rb:235:in `flog'
    from /Users/langhorst/.rvm/gems/ruby-1.9.2-p180/gems/flog-2.5.3/bin/flog:13:in `<top (required)>'
    from /Users/langhorst/.rvm/gems/ruby-1.9.2-p180/bin/flog:21:in `load'
    from /Users/langhorst/.rvm/gems/ruby-1.9.2-p180/bin/flog:21:in `<main>'

Incorrect :lasgn node for block argument in parens

Given this code:

foo do |(bar), baz|
end

Ruby Parser does this:

>> Ruby19Parser.new.parse "foo do |(bar), baz|\nend"
=> s(:iter, s(:call, nil, :foo), s(:masgn, s(:array, s(:lasgn, :bar, :baz))))

Note the :lasgn node, with two symbol literals. I (and Flog, I think) believe the expected form of :lasgn is that an optional Sexp should occupy the spot after :bar.

Line number error when call method without parentheses or parameters

There is a bug for method call:

examples:

code (1):
    def a
       p 'a'
    end
Parse result with line:

s(:defn, :a, s(:args), s(:call, nil, :p, s(:str, "a")))

code line
s(:defn, :a, 1
˙˙s(:args), 2
˙˙s(:call, nil, :p, 3(ERROR)
˙˙˙˙s(:str, :a), 2
code (2):
   def a
      p('a')
      b = 1
      p b
      c =1
  end
  a
Parse result with line:

s(:block, s(:defn, :a, s(:args), s(:call, nil, :p, s(:str, "a")), s(:lasgn, :b, s(:lit, 1)), s(:call, nil, :p, s(:lvar, :b)), s(:lasgn, :c, s(:lit, 1))), s(:call, nil, :a))

code line
s(:block, 1
˙˙s(:defn, :a, 1
˙˙˙˙s(:args), 2
˙˙˙˙s(:call, nil, :p, 2
˙˙˙˙˙˙s(:str, :a), 2
˙˙˙˙s(:lasgn, :b, 3
˙˙˙˙˙˙s(:lit, :1), 3
˙˙˙˙s(:call, nil, :p, 5(ERROR)
˙˙˙˙˙˙s(:lvar, :b), 5(ERROR)
˙˙˙˙s(:lasgn, :c, 5
˙˙˙˙˙˙s(:lit, :1), 5
˙˙s(:call, nil, :a), 8(ERROR)

Bad sexp on trailing comma in call with hash

# Bad (note :array literal)
>> Ruby19Parser.new.parse("foo(:bar, baz: nil,)")
=> s(:call, nil, :foo, s(:lit, :bar), :array, s(:lit, :baz), s(:nil))

# Good (wrapped in :hash node)
>> Ruby19Parser.new.parse("foo(:bar, baz: nil)")
=> s(:call, nil, :foo, s(:lit, :bar), s(:hash, s(:lit, :baz), s(:nil)))

Parsing method with dynamic number of attributes problem

Suppose you have the following method:

def test(*arr, value)
  puts value
end

Running ruby parser dies with the following backtrace:

# file = test.rb loc = 3
/usr/local/rvm/rubies/ruby-1.9.2-p290/lib/ruby/1.9.1/racc/parser.rb:349:in `on_error' # for test.rb
  /usr/local/rvm/rubies/ruby-1.9.2-p290/lib/ruby/1.9.1/racc/parser.rb:349:in `on_error'
  /usr/local/rvm/rubies/ruby-1.9.2-p290/lib/ruby/1.9.1/racc/parser.rb:99:in `_racc_do_parse_c'
  /usr/local/rvm/rubies/ruby-1.9.2-p290/lib/ruby/1.9.1/racc/parser.rb:99:in `do_parse'
  /usr/local/rvm/gems/ruby-1.9.2-p290/gems/ruby_parser-2.0.6/lib/ruby_parser_extras.rb:749:in `process'
  /usr/local/rvm/gems/ruby-1.9.2-p290/gems/ruby_parser-2.0.6/bin/ruby_parse:40:in `block in '
  /usr/local/rvm/gems/ruby-1.9.2-p290/gems/ruby_parser-2.0.6/bin/ruby_parse:21:in `each'
  /usr/local/rvm/gems/ruby-1.9.2-p290/gems/ruby_parser-2.0.6/bin/ruby_parse:21:in `'
  /usr/local/rvm/gems/ruby-1.9.2-p290/bin/ruby_parse:19:in `load'
  /usr/local/rvm/gems/ruby-1.9.2-p290/bin/ruby_parse:19:in `'
done

The backtrace says I'm running 2.0.6, but I also tested 2.3.1 with the same result.

No handling of 1.9 array decomposition in def arguments

Best described by this snippet:

> Ruby19Parser.new.parse('def a(a=1, (b, f), d, &e); end')
RuntimeError: unhandled sexp: s(:masgn, s(:array, s(:lasgn, :b), s(:lasgn, :f)), :d)
from /home/whitequark/.rbenv/versions/1.9.3-p194-perf/lib/ruby/gems/1.9.1/gems/ruby_parser-3.0.0.a8/lib/ruby_parser_extras.rb:176:in `block in args19'

Fails to handle array decomposition in -> arguments

Demonstrated by this snippet:

[37] pry(main)> Ruby19Parser.new.parse("->((x,y)){}")
RuntimeError: unhandled sexp: s(:masgn, s(:array, s(:lasgn, :x), s(:lasgn, :y)))
from /home/whitequark/.rbenv/versions/1.9.3-p194-perf/lib/ruby/gems/1.9.1/gems/ruby_parser-3.0.0.a8/lib/ruby_parser_extras.rb:176:in `block in args19'
[38] pry(main)> Ruby19Parser.new.parse("lambda{ |(x,y)| }")
=> s(:iter,
 s(:call, nil, :lambda),
 s(:masgn, s(:array, s(:lasgn, :x), s(:lasgn, :y))))

Ruby 1.9 hash syntax cause problems when used as method arguments

Using the new hash syntax causes an explosion:

# hash19.rb
foo(hash: :bar)

Results in this:

$ ruby_parse hash19.rb
# file = hash_test.rb loc = 0
/Users/matt/.rvm/rubies/ruby-1.9.3-p0/lib/ruby/1.9.1/racc/parser.rb:351:in `on_error' #<Racc::ParseError:  parse error on value ":" (tCOLON)> for hash_test.rb
  /Users/matt/.rvm/rubies/ruby-1.9.3-p0/lib/ruby/1.9.1/racc/parser.rb:351:in `on_error'
  (eval):3:in `_racc_do_parse_c'
  (eval):3:in `do_parse'
  /Users/matt/.rvm/gems/ruby-1.9.3-p0/gems/ruby_parser-2.3.1/lib/ruby_parser_extras.rb:750:in `process'
  /Users/matt/.rvm/gems/ruby-1.9.3-p0/gems/ruby_parser-2.3.1/bin/ruby_parse:40:in `block in <top (required)>'
  /Users/matt/.rvm/gems/ruby-1.9.3-p0/gems/ruby_parser-2.3.1/bin/ruby_parse:21:in `each'
  /Users/matt/.rvm/gems/ruby-1.9.3-p0/gems/ruby_parser-2.3.1/bin/ruby_parse:21:in `<top (required)>'
  /Users/matt/.rvm/gems/ruby-1.9.3-p0/bin/ruby_parse:19:in `load'
  /Users/matt/.rvm/gems/ruby-1.9.3-p0/bin/ruby_parse:19:in `<main>'
done

If you use the "old" syntax everything works fine:

# hash18.rb
foo(:hash => :bar)
$ ruby_parse hash18.rb
# file = hash_test.rb loc = 0
s(:call, nil, :foo, s(:arglist, s(:hash, s(:lit, :hash), s(:lit, :bar))))
done

It also works fine if you have curly braces around the hash:

# hash19_curly.rb
foo({hash: :bar})
$ ruby_parse hash_test.rb
# file = hash_test.rb loc = 0
nil
done

parser seems to blow up using 1.9 style keywords

Running "ruby_parse test.rb" gives:

/home/msquires/.rbenv/versions/1.9.3-p0/lib/ruby/1.9.1/racc/parser.rb:351:in `on_error' #<Racc::ParseError: parse error on value ":" (tCOLON)> for test.rb
  /home/msquires/.rbenv/versions/1.9.3-p0/lib/ruby/1.9.1/racc/parser.rb:351:in `on_error'
  (eval):3:in `_racc_do_parse_c'
  (eval):3:in `do_parse'
  /home/msquires/.rbenv/versions/1.9.3-p0/lib/ruby/gems/1.9.1/gems/ruby_parser-2.3.1/lib/ruby_parser_extras.rb:750:in `process'

Here is the test.rb file. Commenting out the "... json: @Brands" gets rid of the exception.

class FunkyController
  def index
    respond_to do |format|
      format.json { render json: @Brands }

      format.json { render (:json => @Brands) }
    end
  end
end

ParseError on 1.9 hashes with no space

Quick note from my testing tonight:

>> RubyParser.new.parse("{a:1}")
Racc::ParseError: 
parse error on value ":" (tSYMBEG)

Inserting a space between a: and 1 will work.

defn nodes don't contain scope and block nodes

With ParseTree:

>> ParseTree.new.process("def foo\nend")
=> s(:defn, :foo, s(:args), s(:scope, s(:block, s(:nil))))

With RubyParser master:

>> Ruby18Parser.new.parse("def foo\nend")
=> s(:defn, :foo, s(:args), s(:nil))
>> Ruby19Parser.new.parse("def foo\nend")
=> s(:defn, :foo, s(:args), s(:nil))

Is the omission of :scope and :block nodes intentional?

I was also depending on the inclusion of the :scope and :block nodes for line number information. (Previously, the :scope and :block #line would correspond to the last line of the scope. I'm not sure if that was intentional, but it allowed me to detect the end line of methods.)

-Bryan

ruby19 call syntax

In ruby19 you can omit the 'call' method name:

class A
  def call
    puts "done"
  end
end

a = A.new
a.()
# => "done"

In this case, lparen and rparen are required.

While this is valid ruby19 syntax, ruby_parse bails out with an error:

.../racc/parser.rb:351:in `on_error' #<Racc::ParseError:  parse error on value "(" (tLPAREN2)> for ...

  .../racc/parser.rb:351:in `on_error'
  (eval):3:in `_racc_do_parse_c'
  (eval):3:in `do_parse'
  .../ruby_parser-3.0.0.a1/lib/ruby_parser_extras.rb:874:in `process'
  .../ruby_parser-3.0.0.a1/bin/ruby_parse:48:in `block in <top (required)>'
  .../ruby_parser-3.0.0.a1/bin/ruby_parse:29:in `each'
  .../ruby_parser-3.0.0.a1/bin/ruby_parse:29:in `<top (required)>'
  ...bin/ruby_parse:19:in `load'
  .../bin/ruby_parse:19:in `<main>'
done

Racc::ParseError: parse error on value ["do", 1] (kDO_BLOCK)

Good:

>> Ruby19Parser.new.parse("foo -> { }")
=> s(:call, nil, :foo, s(:iter, s(:call, nil, :lambda), 0, nil))

Bad:

>> Ruby19Parser.new.parse("foo -> do\nend")
# ERROR: (string):1 :: parse error on value ["do", 1] (kDO_BLOCK)
Racc::ParseError: 
parse error on value ["do", 1] (kDO_BLOCK)

Equivalent of :strip_enclosure => true while migrating the code to Ruby19

Using ParseTree with Ruby 1.8.7, the call sexp = self.code.to_sexp(:strip_enclosure => true) returns the following:

s(:block,
 s(:call, nil, :NoOpV1, s(:arglist, s(:lvar, :tx))),
 s(:iter,
  s(:call,
   nil,
   :branch,
   s(:arglist, s(:call, s(:lvar, :tx), :aggregates, s(:arglist)))),
  s(:lasgn, :agg),
  s(:block,
   s(:call, nil, :label, s(:arglist, s(:lit, :APIContentV1))),
   s(:call, nil, :APIContentV1, s(:arglist, s(:lvar, :agg))),
   s(:call, nil, :label, s(:arglist, s(:lit, :APIReviewV1))),
   s(:call, nil, :APIReviewV1, s(:arglist, s(:lvar, :agg))),
   s(:call,
    nil,
    :ReputationBayes,
    s(:arglist,
     s(:lvar, :agg),
     s(:const, :APIContentV1),
     s(:const, :APIReviewV1))),
   s(:call, nil, :label, s(:arglist, s(:lit, :complete))))),
 s(:call, nil, :AdminCompleteV1, s(:arglist, s(:lvar, :tx))),
 s(:call, nil, :CallbackV3, s(:arglist, s(:lvar, :tx))))

But with Ruby 1.9.3 and this ruby_parser, RubyParser.new.parse(self.code) generates:

s(:iter,
 s(:call, nil, :proc, s(:arglist)),
 s(:lasgn, :tx),
 s(:block,
  s(:call, nil, :NoOpV1, s(:arglist, s(:lvar, :tx))),
  s(:iter,
   s(:call,
    nil,
    :branch,
    s(:arglist, s(:call, s(:lvar, :tx), :aggregates, s(:arglist)))),
   s(:lasgn, :agg),
   s(:block,
    s(:call, nil, :label, s(:arglist, s(:lit, :APIContentV1))),
    s(:call, nil, :APIContentV1, s(:arglist, s(:lvar, :agg))),
    s(:call, nil, :label, s(:arglist, s(:lit, :APIReviewV1))),
    s(:call, nil, :APIReviewV1, s(:arglist, s(:lvar, :agg))),
    s(:call,
     nil,
     :ReputationBayes,
     s(:arglist,
      s(:lvar, :agg),
      s(:const, :APIContentV1),
      s(:const, :APIReviewV1))),
    s(:call, nil, :label, s(:arglist, s(:lit, :complete))))),
  s(:call, nil, :AdminCompleteV1, s(:arglist, s(:lvar, :tx))),
  s(:call, nil, :CallbackV3, s(:arglist, s(:lvar, :tx)))))

The difference is the first three lines:

s(:iter,
 s(:call, nil, :proc, s(:arglist)),
 s(:lasgn, :tx),
....

The above 3 lines are omitted when passing the option :strip_enclosure => true.
So, what would be the equivalent of this in this ruby_parser library?
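One way to get the same effect is to unwrap the outer node after parsing. A minimal sketch, assuming the enclosing shape shown above (an :iter whose call is proc) and using plain arrays as stand-ins for Sexp objects; strip_enclosure here is a hypothetical helper, not a ruby_parser option:

```ruby
# Hypothetical helper: drops an enclosing proc wrapper of the form
#   [:iter, [:call, nil, :proc, ...], <args>, <body>]
# and returns <body>; any other sexp is returned unchanged.
def strip_enclosure(sexp)
  tag, call, _args, body = sexp
  wrapped = tag == :iter &&
            call.is_a?(Array) &&
            call[0, 3] == [:call, nil, :proc]
  wrapped ? body : sexp
end

wrapped = [:iter,
           [:call, nil, :proc, [:arglist]],
           [:lasgn, :tx],
           [:block, [:call, nil, :NoOpV1, [:arglist, [:lvar, :tx]]]]]

strip_enclosure(wrapped)
# => [:block, [:call, nil, :NoOpV1, [:arglist, [:lvar, :tx]]]]
```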

Ruby 1.9 Lambda issues

I noticed in PR #14 that 1.9 lambdas should (mostly?) work with the new ruby_parser 3 alpha. However, just now I got this error while trying the new version on some code in the office:

# ERROR: app/models/childminder.rb:43 :: parse error on value ["do", 43] (kDO_BLOCK)

  scope :of_franchisee, ->(f) do
    f.nil? ? nil : where(franchisee_agency_id: f.id)
  end

I'm not sure where the parser is choking. It talks about the do value, but I assume this has more to do with the lambda?

Does not properly handle 1-wide decomposition in block arguments

Demonstrated by this snippet:

[145] pry(main)> lambda { |a,(x), &b| p x }.call(1, [2, 3])
2
=> 2
[146] pry(main)> lambda { |a,x, &b| p x }.call(1, [2, 3])
[2, 3]
=> [2, 3]
[147] pry(main)> Ruby19Parser.new.parse('lambda { |a, (x), &b| y }')
=> s(:iter,
 s(:call, nil, :lambda),
 s(:masgn, s(:array, s(:lasgn, :a), s(:lasgn, :x), s(:lasgn, :"&b"))),
 s(:call, nil, :y))
[148] pry(main)> Ruby19Parser.new.parse('lambda { |a, x, &b| y }')
=> s(:iter,
 s(:call, nil, :lambda),
 s(:masgn, s(:array, s(:lasgn, :a), s(:lasgn, :x), s(:lasgn, :"&b"))),
 s(:call, nil, :y))

Add support for Timeouts

There are known bugs where certain Ruby source leads to significant parser slowdowns. People using RubyParser in operational settings need to wrap it in a timeout block to be safe.

My proposal is we add timeouts to RubyParser itself, with a default (configurable) of say 10 seconds. If the parse takes too long, raise a RubyParser::Timeout.

Thoughts?
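Until something like this exists in the library, the wrapping can be done by the caller with the standard Timeout module. A minimal sketch: ParseTimeout and parse_with_timeout are hypothetical names, and the 10 second default is taken from the proposal above, not from any ruby_parser setting:

```ruby
require "timeout"

# Hypothetical error class; ruby_parser does not define this.
class ParseTimeout < StandardError; end

# Runs the given block (which would call the parser) and raises ParseTimeout
# if it takes longer than limit seconds.
def parse_with_timeout(limit = 10)
  Timeout.timeout(limit) { yield }
rescue Timeout::Error
  raise ParseTimeout, "parse exceeded #{limit}s"
end

# The block stands in for a call such as RubyParser.new.parse(source).
parse_with_timeout(10) { :parsed } # => :parsed
```

Translating Timeout::Error into a dedicated class lets callers rescue slow parses separately from other timeouts in the same program.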

Fails to handle .() arguments at all

Demonstrated by this snippet:

[88] pry(main)> Ruby19Parser.new.parse('f.(())')
=> s(:call, s(:call, nil, :f), :call)
[89] pry(main)> Ruby19Parser.new.parse('f.()')
=> s(:call, s(:call, nil, :f), :call)
[90] pry(main)> Ruby19Parser.new.parse('f.(test)')
=> s(:call, s(:call, nil, :f), :call)

tLABEL addition breaks some method calls

For example:

attr_reader:foo

I think this is an issue in the lexer. Comparing the tLABEL code I added to our lexer to JRuby's, this is the diff:

-      if lex_state == :expr_beg || lex_state == :expr_arg || lex_state == :expr_cmdarg
+      if (lex_state == :expr_beg && !command_state) || lex_state == :expr_arg || lex_state == :expr_cmdarg

The above change will fix the parsing of attr_reader:foo but breaks the fcall_arglist_hash_colons test. I believe we want the !command_state check when lex_state == :expr_beg, however I haven't figured out why we can't keep it and still pass fcall_arglist_hash_colons.
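The diff changes the outcome for exactly one combination of inputs. A sketch that isolates the two conditions as standalone predicates; lex_state and command_state mirror the lexer variables named in the diff, while the method and constant names are made up for this example:

```ruby
# States that allow a label in both versions of the condition.
LABEL_STATES = [:expr_arg, :expr_cmdarg].freeze

# Condition before the change (command_state is ignored).
def label_ok_before?(lex_state, _command_state)
  lex_state == :expr_beg || LABEL_STATES.include?(lex_state)
end

# Condition after the change (the JRuby-style check).
def label_ok_after?(lex_state, command_state)
  (lex_state == :expr_beg && !command_state) ||
    LABEL_STATES.include?(lex_state)
end

# The only input where the two differ: :expr_beg with command_state set.
label_ok_before?(:expr_beg, true) # => true
label_ok_after?(:expr_beg, true)  # => false
```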

parse error on value "=>" (tASSOC)

x = [:foo, :foo => "bar", :xxx => "yyy"]
puts x.inspect

➜  ruby evil.rb 
[:foo, {:foo=>"bar", :xxx=>"yyy"}]

➜  ruby_parse evil.rb
# file = evil.rb loc = 3
/Users/benmurphy/.rvm/rubies/ruby-1.9.2-p0/lib/ruby/1.9.1/racc/parser.rb:349:in `on_error' #<Racc::ParseError:  parse error on value "=>" (tASSOC)> for evil.rb
  /Users/benmurphy/.rvm/rubies/ruby-1.9.2-p0/lib/ruby/1.9.1/racc/parser.rb:349:in `on_error'
  /Users/benmurphy/.rvm/rubies/ruby-1.9.2-p0/lib/ruby/1.9.1/racc/parser.rb:99:in `_racc_do_parse_c'
  /Users/benmurphy/.rvm/rubies/ruby-1.9.2-p0/lib/ruby/1.9.1/racc/parser.rb:99:in `do_parse'

Problems with non-ascii character

$ find app -name \*.rb | xargs flog
ruby_parser-2.3.1/lib/ruby_lexer.rb:395:in `rb_compile_error': Invalid char "ç" in expression. near line 34: "ão => \"Adicionou Publicação\"})" (SyntaxError)

Bad sexp on block calls with masgn

# Old RubyParser (seems right):
RubyParser.new.parse("f { |(a,b),c| }")
s(:iter,
  s(:call, nil, :f, s(:arglist)),
  s(:masgn,
    s(:array,
      s(:masgn,
        s(:array, s(:lasgn, :a), s(:lasgn, :b))),
      s(:lasgn, :c))))

#3.0.0.a4 without parens (seems wrong):
Ruby19Parser.new.parse("f { |(a,b),c| }")
s(:iter,
  s(:call, nil, :f),
  s(:masgn,
    s(:array, s(:lasgn, :a), s(:lasgn, :b)), :c))

#3.0.0.a4 with parens (seems right):
Ruby19Parser.new.parse("f { |((a,b),c)| }")
s(:iter,
  s(:call, nil, :f),
  s(:masgn,
    s(:array,
      s(:masgn,
        s(:array, s(:lasgn, :a), s(:lasgn, :b))),
      s(:lasgn, :c))))

Does not handle splat in array decomposition in block arguments

Demonstrated by this snippet:

[122] pry(main)> Ruby19Parser.new.parse('lambda { |(a, *x)| }')
RuntimeError: no9: [s(:array, s(:lasgn, :a)), ",", "*", :x]
from /home/whitequark/.rbenv/versions/1.9.3-p194-perf/lib/ruby/gems/1.9.1/gems/ruby_parser-3.0.0.a8/lib/ruby19_parser.rb:4890:in `_reduce_359'

some test cases are failing with ruby 1.9.1

I'm building a Debian package for ruby_parser and got test failures during the build: all tests pass with ruby 1.8, but the ruby 1.9.1 run reports 6 failures and 10 errors. Details are given below.

Running tests for ruby1.8 with test file list from debian/ruby-test-files.yaml ...
Run options: --seed 57626

Running tests:

................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................

Finished tests in 0.295260s, 2330.1497 tests/s, 12903.8813 assertions/s.

688 tests, 3810 assertions, 0 failures, 0 errors, 0 skips

/usr/bin/ruby1.9.1 -I/usr/lib/ruby/vendor_ruby /usr/lib/ruby/vendor_ruby/gem2deb/test_runner.rb
Running tests for ruby1.9.1 with test file list from debian/ruby-test-files.yaml ...
Run options: --seed 7076

Running tests:

.............................................................................................................................................................................................................................................................................................................................E...............E..............................................E....................E........E.........................................................................F..................................E.............................................E................................E................F......F...FF............E..............F....................E......................

Finished tests in 0.214555s, 3257.9026 tests/s, 17869.5258 assertions/s.

  1. Error:
    test_lambda_args_2__19(TestRubyParser):
    Racc::ParseError:
    parse error on value ">" (tGT)
    /usr/lib/ruby/1.9.1/racc/parser.rb:349:in `on_error'
    /usr/lib/ruby/1.9.1/racc/parser.rb:99:in `_racc_do_parse_c'
    /usr/lib/ruby/1.9.1/racc/parser.rb:99:in `do_parse'
    /media/forge/debian/diaspora/ruby-parser-2.2.0/debian/ruby-parser/usr/lib/ruby/vendor_ruby/ruby_parser_extras.rb:760:in `process'
    /media/forge/debian/diaspora/ruby-parser-2.2.0/test/test_ruby_parser.rb:15:in `process'

  2. Error:
    test_defn_args_splat_mand__19(TestRubyParser):
    Racc::ParseError:
    parse error on value "mand" (tIDENTIFIER)
    /usr/lib/ruby/1.9.1/racc/parser.rb:349:in `on_error'
    /usr/lib/ruby/1.9.1/racc/parser.rb:99:in `_racc_do_parse_c'
    /usr/lib/ruby/1.9.1/racc/parser.rb:99:in `do_parse'
    /media/forge/debian/diaspora/ruby-parser-2.2.0/debian/ruby-parser/usr/lib/ruby/vendor_ruby/ruby_parser_extras.rb:760:in `process'
    /media/forge/debian/diaspora/ruby-parser-2.2.0/test/test_ruby_parser.rb:15:in `process'

  3. Error:
    test_defn_args_splat_middle__19(TestRubyParser):
    Racc::ParseError:
    parse error on value "last" (tIDENTIFIER)
    /usr/lib/ruby/1.9.1/racc/parser.rb:349:in `on_error'
    /usr/lib/ruby/1.9.1/racc/parser.rb:99:in `_racc_do_parse_c'
    /usr/lib/ruby/1.9.1/racc/parser.rb:99:in `do_parse'
    /media/forge/debian/diaspora/ruby-parser-2.2.0/debian/ruby-parser/usr/lib/ruby/vendor_ruby/ruby_parser_extras.rb:760:in `process'
    /media/forge/debian/diaspora/ruby-parser-2.2.0/test/test_ruby_parser.rb:15:in `process'

  4. Error:
    test_lambda_args_no__19(TestRubyParser):
    Racc::ParseError:
    parse error on value ">" (tGT)
    /usr/lib/ruby/1.9.1/racc/parser.rb:349:in `on_error'
    /usr/lib/ruby/1.9.1/racc/parser.rb:99:in `_racc_do_parse_c'
    /usr/lib/ruby/1.9.1/racc/parser.rb:99:in `do_parse'
    /media/forge/debian/diaspora/ruby-parser-2.2.0/debian/ruby-parser/usr/lib/ruby/vendor_ruby/ruby_parser_extras.rb:760:in `process'
    /media/forge/debian/diaspora/ruby-parser-2.2.0/test/test_ruby_parser.rb:15:in `process'

  5. Error:
    test_fcall_arglist_hash_colons__19(TestRubyParser):
    Racc::ParseError:
    parse error on value ":" (tCOLON)
    /usr/lib/ruby/1.9.1/racc/parser.rb:349:in `on_error'
    /usr/lib/ruby/1.9.1/racc/parser.rb:99:in `_racc_do_parse_c'
    /usr/lib/ruby/1.9.1/racc/parser.rb:99:in `do_parse'
    /media/forge/debian/diaspora/ruby-parser-2.2.0/debian/ruby-parser/usr/lib/ruby/vendor_ruby/ruby_parser_extras.rb:760:in `process'
    /media/forge/debian/diaspora/ruby-parser-2.2.0/test/test_ruby_parser.rb:15:in `process'

  6. Failure:
    test_str_question_literal__19(TestRubyParser) [/usr/lib/ruby/vendor_ruby/pt_testcase.rb:136]:
    failed on input: "?a".
    Expected s(:str, "a"), not s(:lit, 97).line(1).

  7. Error:
    test_lambda_args_0__19(TestRubyParser):
    Racc::ParseError:
    parse error on value ">" (tGT)
    /usr/lib/ruby/1.9.1/racc/parser.rb:349:in `on_error'
    /usr/lib/ruby/1.9.1/racc/parser.rb:99:in `_racc_do_parse_c'
    /usr/lib/ruby/1.9.1/racc/parser.rb:99:in `do_parse'
    /media/forge/debian/diaspora/ruby-parser-2.2.0/debian/ruby-parser/usr/lib/ruby/vendor_ruby/ruby_parser_extras.rb:760:in `process'
    /media/forge/debian/diaspora/ruby-parser-2.2.0/test/test_ruby_parser.rb:15:in `process'

  8. Error:
    test_lambda_args_2_no_parens__19(TestRubyParser):
    Racc::ParseError:
    parse error on value ">" (tGT)
    /usr/lib/ruby/1.9.1/racc/parser.rb:349:in `on_error'
    /usr/lib/ruby/1.9.1/racc/parser.rb:99:in `_racc_do_parse_c'
    /usr/lib/ruby/1.9.1/racc/parser.rb:99:in `do_parse'
    /media/forge/debian/diaspora/ruby-parser-2.2.0/debian/ruby-parser/usr/lib/ruby/vendor_ruby/ruby_parser_extras.rb:760:in `process'
    /media/forge/debian/diaspora/ruby-parser-2.2.0/test/test_ruby_parser.rb:15:in `process'

  9. Error:
    test_call_arglist_norm_hash_colons__19(TestRubyParser):
    Racc::ParseError:
    parse error on value ":" (tCOLON)
    /usr/lib/ruby/1.9.1/racc/parser.rb:349:in `on_error'
    /usr/lib/ruby/1.9.1/racc/parser.rb:99:in `_racc_do_parse_c'
    /usr/lib/ruby/1.9.1/racc/parser.rb:99:in `do_parse'
    /media/forge/debian/diaspora/ruby-parser-2.2.0/debian/ruby-parser/usr/lib/ruby/vendor_ruby/ruby_parser_extras.rb:760:in `process'
    /media/forge/debian/diaspora/ruby-parser-2.2.0/test/test_ruby_parser.rb:15:in `process'

  10. Failure:
    test_call_not_equal__19(TestRubyParser) [/usr/lib/ruby/vendor_ruby/pt_testcase.rb:136]:
    failed on input: "a != b".
    Expected s(:call, s(:call, nil, :a, s(:arglist)), :!=, s(:arglist, s(:call, nil, :b, s(:arglist)))), not s(:not, s(:call, s(:call, nil, :a, s(:arglist).line(1)).line(1), :==, s(:arglist, s(:call, nil, :b, s(:arglist).line(1)).line(1)).line(1)).line(1)).line(1).

  11. Failure:
    test_hash_new__19(TestRubyParser) [/usr/lib/ruby/vendor_ruby/pt_testcase.rb:136]:
    failed on input: "{ a: 1, b: 2 }".
    Expected s(:hash, s(:lit, :a), s(:lit, 1), s(:lit, :b), s(:lit, 2)), not nil.

  12. Failure:
    test_str_question_escape__19(TestRubyParser) [/usr/lib/ruby/vendor_ruby/pt_testcase.rb:136]:
    failed on input: "?\n".
    Expected s(:str, "\n"), not s(:lit, 10).line(1).

  13. Failure:
    test_str_question_control__19(TestRubyParser) [/usr/lib/ruby/vendor_ruby/pt_testcase.rb:136]:
    failed on input: "?\M-\C-a".
    Expected s(:str, "\x81"), not s(:lit, 129).line(1).

  14. Error:
    test_lambda_args_1__19(TestRubyParser):
    Racc::ParseError:
    parse error on value ">" (tGT)
    /usr/lib/ruby/1.9.1/racc/parser.rb:349:in `on_error'
    /usr/lib/ruby/1.9.1/racc/parser.rb:99:in `_racc_do_parse_c'
    /usr/lib/ruby/1.9.1/racc/parser.rb:99:in `do_parse'
    /media/forge/debian/diaspora/ruby-parser-2.2.0/debian/ruby-parser/usr/lib/ruby/vendor_ruby/ruby_parser_extras.rb:760:in `process'
    /media/forge/debian/diaspora/ruby-parser-2.2.0/test/test_ruby_parser.rb:15:in `process'

  15. Failure:
    test_call_unary_not__19(TestRubyParser) [/usr/lib/ruby/vendor_ruby/pt_testcase.rb:136]:
    failed on input: "!a".
    Expected s(:call, s(:call, nil, :a, s(:arglist)), :"!@", s(:arglist)), not s(:not, s(:call, nil, :a, s(:arglist).line(1)).line(1)).line(1).

  16. Error:
    test_splat_fcall_middle__19(TestRubyParser):
    Racc::ParseError:
    parse error on value 3 (tINTEGER)
    /usr/lib/ruby/1.9.1/racc/parser.rb:349:in `on_error'
    /usr/lib/ruby/1.9.1/racc/parser.rb:99:in `_racc_do_parse_c'
    /usr/lib/ruby/1.9.1/racc/parser.rb:99:in `do_parse'
    /media/forge/debian/diaspora/ruby-parser-2.2.0/debian/ruby-parser/usr/lib/ruby/vendor_ruby/ruby_parser_extras.rb:760:in `process'
    /media/forge/debian/diaspora/ruby-parser-2.2.0/test/test_ruby_parser.rb:15:in `process'

699 tests, 3834 assertions, 6 failures, 10 errors, 0 skips

Test "ruby1.9.1" failed. Continue building the package? (Y/N) n

Regression in Ruby 1.8 parsing symbol arguments with poor spacing

In Ruby 1.8 this is okay:

$ ruby -v
ruby 1.8.7 (2012-02-08 MBARI 8/0x6770 on patchlevel 358) [i686-darwin11.3.0], MBARI 0x6770, Ruby Enterprise Edition 2012.02
$ ruby -c -e "x if blah:y"

With ruby_parser 2.3.1 this is okay:

$ irb
1.9.3p125 :001 > require 'ruby_parser'
 => true 
1.9.3p125 :002 > RubyParser.new.parse "x if blah:y"
 => s(:if, s(:call, nil, :blah, s(:arglist, s(:lit, :y))), s(:call, nil, :x, s(:arglist)), nil) 

But with the master version of ruby_parser it's not okay:

$ irb
1.9.3p125 :001 > require 'ruby_parser'
 => true 
1.9.3p125 :002 > Ruby18Parser.new.parse "x if blah:y"
Racc::ParseError: 
parse error on value ["blah", 1] (error)
    from /Users/collins/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/racc/parser.rb:351:in `on_error'
    from (eval):3:in `_racc_do_parse_c'
    from (eval):3:in `do_parse'
    from /Users/collins/.rvm/gems/ruby-1.9.3-p125@ruby_parser/gems/ruby_parser-3.0.0a1/lib/ruby_parser_extras.rb:779:in `process'
    from (irb):2
    from /Users/collins/.rvm/rubies/ruby-1.9.3-p125/bin/irb:16:in `<main>'

I know this is terrible, but hopefully since this syntax was already supported before it isn't too hard to make sure it continues to be?

Multiple issues on valid Ruby 1.9 code...

[
  Date::DATE_FORMATS,
  Time::DATE_FORMATS,
  DateTime::DATE_FORMATS
].each do |formats|
  formats.merge!(
    period: '%Y-%m',
  )
end

First issue: It's not liking the 1.9-style hash syntax using an implicit hash (complains on the colon). However, if I add the relevant braces, it complains about the trailing comma. If I chase through all the errors I wind up with something like this:

[
  Date::DATE_FORMATS,
  Time::DATE_FORMATS,
  DateTime::DATE_FORMATS
].each { |formats|
  formats.merge!({ period: '%Y-%m' })
}

However, the output from the parser for this is nil.

parse error on value "." (tDOT)

According to this article Ruby 1.9.1+ supports method invocation syntax formatted like jQuery when chaining method calls, e.g.

result = foo
          .bar(i)
          .baz

but I get this error when ruby_parser hits some of my code which is formatted in that style:

/Users/ryan/.rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/racc/parser.rb:351:in `on_error':  (Racc::ParseError)
parse error on value "." (tDOT)
    from (eval):3:in `_racc_do_parse_c'
    from (eval):3:in `do_parse'
    from /Users/ryan/.rvm/gems/ruby-1.9.3-p194/gems/ruby_parser-2.3.1/lib/ruby_parser_extras.rb:750:in `process'
    from /Users/ryan/.rvm/gems/ruby-1.9.3-p194/gems/umlify-1.2.6/lib/umlify/parser_sexp.rb:39:in `parse_file'
    from /Users/ryan/.rvm/gems/ruby-1.9.3-p194/gems/umlify-1.2.6/lib/umlify/parser_sexp.rb:28:in `block in parse_sources!'
    from /Users/ryan/.rvm/gems/ruby-1.9.3-p194/gems/umlify-1.2.6/lib/umlify/parser_sexp.rb:26:in `each'
    from /Users/ryan/.rvm/gems/ruby-1.9.3-p194/gems/umlify-1.2.6/lib/umlify/parser_sexp.rb:26:in `parse_sources!'
    from /Users/ryan/.rvm/gems/ruby-1.9.3-p194/gems/umlify-1.2.6/lib/umlify/runner.rb:30:in `run'
    from /Users/ryan/.rvm/gems/ruby-1.9.3-p194/gems/umlify-1.2.6/lib/umlify.rb:47:in `execute'
    from /Users/ryan/.rvm/gems/ruby-1.9.3-p194/gems/umlify-1.2.6/bin/umlify:5:in `<top (required)>'
    from /Users/ryan/.rvm/gems/ruby-1.9.3-p194/bin/umlify:19:in `load'
    from /Users/ryan/.rvm/gems/ruby-1.9.3-p194/bin/umlify:19:in `<main>'
    from /Users/ryan/.rvm/gems/ruby-1.9.3-p194/bin/ruby_noexec_wrapper:14:in `eval'
    from /Users/ryan/.rvm/gems/ruby-1.9.3-p194/bin/ruby_noexec_wrapper:14:in `<main>'
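
As a possible stopgap while the parser rejects leading-dot chains, the same chain can be written with trailing dots, a form that Ruby 1.8 also accepts. This is only a sketch: the `Chain` class and its `bar`/`baz` methods are hypothetical stand-ins for the `foo` object in the report.

```ruby
# Hypothetical stand-in for the `foo` receiver in the report.
class Chain
  attr_reader :calls

  def initialize
    @calls = []
  end

  def bar(i)
    @calls << [:bar, i]
    self
  end

  def baz
    @calls << [:baz]
    self
  end
end

i = 1

# Leading-dot style (Ruby 1.9.1+ only).
leading = Chain.new
  .bar(i)
  .baz

# Trailing-dot style: the dot at the end of the line tells the parser
# that the expression continues on the next line.
trailing = Chain.new.
  bar(i).
  baz
```

Both forms make the same calls in the same order, so switching style should not change behavior.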

Extremely slow parsing escape sequences followed by interpolation

Parsing this code:

"\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2#{}"

Results in this:

# file = test2.rb loc = 1
s(:dstr,
 "\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2",
 s(:evstr))
done


16.65s:     0.06 l/s:    0.01 Kb/s:    0 Kb:    1 loc:test2.rb

16.65s:     0.06 l/s:    0.01 Kb/s:    0 Kb:    1 loc:TOTAL

I doubled the length of the string:

"\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2\xE2#{}"

And let ruby_parse run for 11 minutes before killing it.

This is ruby_parser 2.3.1, Ruby 1.9.3.

Parse failure information sent to stderr instead of in the exception

When a parse error occurs, the most valuable output is sent to $stderr rather than being captured in the exception message. Specifically, the most useful line number information usually only appears on stderr. Let me know if I'm missing something here.

I'd propose that RubyParser, by contract, never print to $stderr and instead always return all of the parse failure text in the exception instances. Thoughts?

Practically, this is an issue for me because I want to capture and surface the parser failures for Code Climate, and I'm having difficulty redirecting stderr to get them. But it seems RubyParser would be more well behaved overall to not print to stderr.
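
Until something like the proposal above exists, one way a caller might collect the diagnostics is to swap `$stderr` for the duration of the parse. A minimal sketch, assuming the output is written through Ruby's `$stderr` object rather than directly to file descriptor 2. `capture_stderr` is a hypothetical helper, and the lambda below only imitates a parse call that prints and then raises.

```ruby
require 'stringio'

# Hypothetical helper: runs the block with $stderr swapped for a StringIO
# and returns [block_result, captured_text, exception_or_nil].
def capture_stderr
  original = $stderr
  $stderr = StringIO.new
  result = error = nil
  begin
    result = yield
  rescue StandardError, SyntaxError => e
    error = e
  end
  [result, $stderr.string, error]
ensure
  # Always restore the real stream, even if something unexpected escapes.
  $stderr = original
end

# Stand-in for a parse call that prints a diagnostic and then raises.
noisy_parse = lambda do
  $stderr.puts "parse error near line 12"
  raise ArgumentError, "parse failed"
end

result, diagnostics, error = capture_stderr(&noisy_parse)
```

Here `diagnostics` holds the text that would otherwise have gone to the terminal, and `error` holds the exception, so both can be surfaced together.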

Does not handle |args; vars| in block arguments

Demonstrated by this snippet:

[94] pry(main)> z = 2; lambda { |x, y; z| z = 1 }.call(nil, nil); z
=> 2
[95] pry(main)> z = 2; lambda { |x, y| z = 1 }.call(nil, nil); z
=> 1
[96] pry(main)> Ruby19Parser.new.parse('z = 2; lambda { |x, y; z| z = 1 }; z')
TypeError: can't convert String into Array
from /home/whitequark/.rbenv/versions/1.9.3-p194-perf/lib/ruby/gems/1.9.1/gems/ruby_parser-3.0.0.a8/lib/ruby19_parser.rb:5046:in `concat'

Encoding::CompatibilityError regression introduced in 3.0.0.a6

This file parses on 3.0.0.a5:

https://github.com/mxcl/homebrew/blob/435b6787815a5c6e122e955c15b2b5abd7356a09/Library/Homebrew/keg_fix_install_names.rb

But on 3.0.0.a6 and later, it fails with:

Encoding::CompatibilityError: incompatible encoding regexp match (UTF-8 regexp with ASCII-8BIT string)

From my data, this regression affects a non-trivial percentage of Ruby files in the wild. I may have to downgrade to lock to 3.0.0.a5.

-Bryan
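
For readers unfamiliar with this error, the sketch below reproduces the same exception class in plain Ruby, without ruby_parser. It assumes the trigger is a UTF-8 regexp being matched against bytes tagged as ASCII-8BIT. Relabeling the bytes is shown only as one possible caller-side workaround for files that really are UTF-8; it is not the library's fix.

```ruby
# A regexp containing a \u escape has a fixed UTF-8 encoding.
pattern = /caf\u00e9/

# Bytes as they might arrive from a binary-mode read (ASCII-8BIT).
raw = "caf\xC3\xA9".b

begin
  pattern =~ raw
  mismatch = nil
rescue Encoding::CompatibilityError => e
  mismatch = e
end

# Relabel a copy of the bytes as UTF-8; no bytes are changed.
utf8 = raw.dup.force_encoding(Encoding::UTF_8)
matched_at = (pattern =~ utf8)
```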

ruby_parse command line tool fails with NoMethodError

# file = config/application.rb loc = 68
/Users/bhelmkamp/.rbenv/versions/1.9.3-p194-perf/gemsets/codeclimate/gems/ruby_parser-3.0.0.a5/bin/ruby_parse:47:in `block in <top (required)>' #<NoMethodError: undefined method `reset' for #<RubyParser:0x007ff8939e00a0>> for config/application.rb
  /Users/bhelmkamp/.rbenv/versions/1.9.3-p194-perf/gemsets/codeclimate/gems/ruby_parser-3.0.0.a5/bin/ruby_parse:47:in `block in <top (required)>'
  /Users/bhelmkamp/.rbenv/versions/1.9.3-p194-perf/gemsets/codeclimate/gems/ruby_parser-3.0.0.a5/bin/ruby_parse:29:in `each'
  /Users/bhelmkamp/.rbenv/versions/1.9.3-p194-perf/gemsets/codeclimate/gems/ruby_parser-3.0.0.a5/bin/ruby_parse:29:in `<top (required)>'
  /Users/bhelmkamp/.rbenv/versions/1.9.3-p194-perf/gemsets/codeclimate/bin/ruby_parse:23:in `load'
  /Users/bhelmkamp/.rbenv/versions/1.9.3-p194-perf/gemsets/codeclimate/bin/ruby_parse:23:in `<main>'
done

Proposal: Introduce RubyParser::ParseFailure exception

Right now in Code Climate I have to rescue all of the following exceptions when I parse:

Racc::ParseError
ArgumentError
TypeError
RegexpError
RuntimeError
SyntaxError

What do you think of creating a contract where RubyParser raises a RubyParser::ParseFailure if it cannot parse?
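
A rough sketch of what such a contract could look like from the caller's side. Everything here is hypothetical: `ParseFailure` and `with_parse_failure` are illustrative names, not part of ruby_parser, and the block stands in for the real parse call.

```ruby
# Hypothetical error class; nothing in this sketch is part of ruby_parser.
class ParseFailure < StandardError; end

# The exception classes listed above. Racc::ParseError is added only when
# racc has already been loaded, so the sketch also runs without it.
PARSE_ERRORS = [ArgumentError, TypeError, RegexpError, RuntimeError, SyntaxError]
PARSE_ERRORS << Racc::ParseError if defined?(Racc::ParseError)

# Runs the block and converts any of the listed errors into ParseFailure.
def with_parse_failure
  yield
rescue *PARSE_ERRORS => e
  raise ParseFailure, "#{e.class}: #{e.message}"
end

begin
  with_parse_failure { raise TypeError, "can't convert String into Array" }
rescue ParseFailure => e
  failure = e
end
```

Ruby records the original exception as `failure.cause`, so the underlying error and its backtrace are still available. Rescuing classes as broad as ArgumentError and RuntimeError can also hide genuine bugs, which is one argument for having the library raise a dedicated class itself.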

Parsing issues with HAML

When trying to parse this file using rake gettext:find from the gettext gem:

https://github.com/zhdk/madek/blob/master/app/assets/javascripts/tmpl/media_resource/thumb_box/actions/menu.tmpl.haml

We get this from ruby_parser:

Can't change the value of true

And rake gettext:find then crashes. Where/who/what could help solve this? We've already removed all use of Ruby 1.9 hash syntax in all our files because ruby_parser can't seem to handle those in HAML, but we don't know what else to remove?

RuntimeError: no block_args19

This looks like it may be a known issue, but I didn't see a GitHub issue tracking it...

>> Ruby19Parser.new.parse("lambda {|x,y=false| puts x}")
RuntimeError: no block_args19 3 [s(:args, :x), ",", s(:lasgn, :y, s(:false)), nil]

RubyParser#parse sometimes mutates parameter

Example: note that the value of a after calling RubyParser#parse differs from its value before the call.

>> a = "puts <<-FOO\nhi\nFOO\n"
=> "puts <<-FOO\nhi\nFOO\n"
>> RubyParser.new.parse(a)
=> s(:call, nil, :puts, s(:arglist, s(:str, "hi\n")))
>> a
=> "puts <<-FOO\nhi\nFOO\nFOO\n\n"
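
A caller-side guard, sketched under the assumption that the parser modifies the string in place: pass a copy, or a frozen copy to turn the mutation into an error. The `mutating_parse` lambda is a hypothetical stand-in that only imitates the behavior shown above.

```ruby
# Hypothetical stand-in for a parser that appends to its input in place.
mutating_parse = lambda do |src|
  src << "FOO\n\n"
  :parsed
end

a = "puts <<-FOO\nhi\nFOO\n"
original = a.dup

# Pass a copy so the caller's string is left alone.
mutating_parse.call(a.dup)

# Passing a frozen copy makes any in-place mutation raise instead.
begin
  mutating_parse.call(a.dup.freeze)
  frozen_error = nil
rescue RuntimeError => e # FrozenError on Ruby 2.5+ is a RuntimeError subclass
  frozen_error = e
end
```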

Constant re-initialization warning w/ R2R 1.3.1

Test case:

require 'rubygems'
require 'ruby2ruby'
require 'ruby_parser'

puts "ruby2ruby: #{Ruby2Ruby::VERSION}"
puts "ruby_parser: #{RubyParser::VERSION}"

Outputs:

/Users/neilc/.rvm/gems/ruby-1.8.7-p352/gems/ruby_parser-2.3.1/lib/ruby_parser_extras.rb:10: warning: already initialized constant ENC_NONE
/Users/neilc/.rvm/gems/ruby-1.8.7-p352/gems/ruby_parser-2.3.1/lib/ruby_parser_extras.rb:11: warning: already initialized constant ENC_EUC
/Users/neilc/.rvm/gems/ruby-1.8.7-p352/gems/ruby_parser-2.3.1/lib/ruby_parser_extras.rb:12: warning: already initialized constant ENC_SJIS
/Users/neilc/.rvm/gems/ruby-1.8.7-p352/gems/ruby_parser-2.3.1/lib/ruby_parser_extras.rb:13: warning: already initialized constant ENC_UTF8
ruby2ruby: 1.3.1
ruby_parser: 2.3.1

No warning is emitted if you require ruby_parser before ruby2ruby or if you use Ruby2Ruby 1.3.0.

Incorrect error message for unknown %? strings

Best described by the following snippet:

> Ruby19Parser.new.parse('%F(a #{f} b)')
SyntaxError: Bad %string type. Expected [QqwxrW], found 'F'.. near line 1: "a \#{f} b)"
from /home/whitequark/.rbenv/versions/1.9.3-p194-perf/lib/ruby/gems/1.9.1/gems/ruby_parser-3.0.0.a8/lib/ruby_lexer.rb:417:in `rb_compile_error'
> Ruby19Parser.new.parse('%s(a #{f} b)')
=> s(:lit, :"a \#{f} b")

The enumeration does not include s though it is accepted.

Empty hash incorrectly stripped from method call

Test case:

require 'rubygems'
require 'ruby_parser'
require 'ruby2ruby'
require 'pp'

ruby_src=<<eos
def foo
  [1,2,3].reduce({}) do |memo, t|
    memo[t] ||= 0
    memo[t] += t
    memo
  end
end
eos
eos

parser    = RubyParser.new
ruby2ruby = Ruby2Ruby.new
sexp      = parser.process(ruby_src)

p ruby2ruby.process(sexp)

Output:

"def foo\n  [1, 2, 3].reduce do |memo, t|\n    memo[t] ||= 0\n    memo[t] += t\n    memo\n  end\nend"

This is wrong: reduce and reduce({}) are not equivalent. The correct output is produced with ruby2ruby 1.3.0.
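
To illustrate why the two calls are not equivalent, here is the same block run in plain Ruby with and without the initial value. Without a seed, `reduce` uses the first element (the Integer 1) as `memo`, so the hash-style assignment fails.

```ruby
# With the {} seed, memo is a Hash for every iteration.
with_seed = [1, 2, 3].reduce({}) do |memo, t|
  memo[t] ||= 0
  memo[t] += t
  memo
end

# Without the seed, memo starts as the Integer 1, which has no []= method.
begin
  [1, 2, 3].reduce do |memo, t|
    memo[t] ||= 0
    memo[t] += t
    memo
  end
  without_seed_error = nil
rescue NoMethodError => e
  without_seed_error = e
end
```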

Uncertainty when reading from file

Related to #47: when using ruby_parser on the contents of a newly read file, the shape of the result is unpredictable. A file with multiple statements yields multiple sexps wrapped in an s(:block), whereas a file with a single statement yields that statement's sexp alone. Fixing this in RubyParser#process would mean always wrapping the return value, which is horribly lame, so I propose adding a second interface for reading from files: something like RubyParser#parse_file, which reads the contents of the file and always wraps the returned AST in an s(:block). Reading from files seems to be a major use case, since heckle, flay and flog all do so. What do you think?
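
In the meantime, a caller could normalize the two shapes itself. The sketch below uses plain arrays as stand-ins for Sexp nodes (Sexp is an Array subclass). `statements` is a hypothetical helper, and treating a nil result as "no statements" is an assumption of this sketch.

```ruby
# Hypothetical helper: always returns a list of top-level statements,
# whether or not the parser wrapped them in a :block node.
def statements(sexp)
  return [] if sexp.nil?
  return sexp[1..-1] if sexp.first == :block
  [sexp]
end

# Stand-ins for parser output: one statement vs. several in a :block.
single = [:call, nil, :puts, [:str, "hi"]]
multi  = [:block, [:lasgn, :z, [:lit, 2]], [:lvar, :z]]

single_list = statements(single)
multi_list  = statements(multi)
```

With this, code that walks a file's top-level statements can iterate over the returned list in both cases.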

Encoding::CompatibilityError

After upgrading to a6, I get Encoding::CompatibilityErrors...

incompatible encoding regexp match (UTF-8 regexp with ASCII-8BIT string)

Error Backtrace:
/data/worker/shared/bundle/ruby/1.9.1/gems/ruby_parser-3.0.0.a6/lib/ruby_lexer.rb:1246:in `check'
/data/worker/shared/bundle/ruby/1.9.1/gems/ruby_parser-3.0.0.a6/lib/ruby_lexer.rb:1246:in `block in yylex'
/data/worker/shared/bundle/ruby/1.9.1/gems/ruby_parser-3.0.0.a6/lib/ruby_lexer.rb:678:in `loop'
/data/worker/shared/bundle/ruby/1.9.1/gems/ruby_parser-3.0.0.a6/lib/ruby_lexer.rb:678:in `yylex'
/data/worker/shared/bundle/ruby/1.9.1/gems/ruby_parser-3.0.0.a6/lib/ruby_lexer.rb:92:in `advance'
/data/worker/shared/bundle/ruby/1.9.1/gems/ruby_parser-3.0.0.a6/lib/ruby_parser_extras.rb:874:in `next_token'
(eval):3:in `_racc_do_parse_c'
(eval):3:in `do_parse'
/data/worker/shared/bundle/ruby/1.9.1/gems/ruby_parser-3.0.0.a6/lib/ruby_parser_extras.rb:923:in `process'
/data/worker/releases/20120824192219/lib/quality/ruby/parser.rb:20:in `block in parse'

Parse error on tRPAREN

I can't figure out how to solve this...

I have a simple form.erb file

<%= form_for do %>
<% end %>

Then I run

erb = File.read "form.erb"  #=> "<%= form_for do %>\n<% end %>"

ruby = ERB.new(erb).src  #=> "_erbout = ''; _erbout.concat(( form_for do ).to_s); _erbout.concat \"\\n\"\n;  end ; _erbout" 

RubyParser.new.process ruby, "form.erb"

And then I get:

parse error on value ")" (tRPAREN)
    from /.../.rvm/rubies/ruby-1.8.7-p357/lib/ruby/1.8/racc/parser.rb:350:in `on_error'
    from /.../.rvm/rubies/ruby-1.8.7-p357/lib/ruby/1.8/racc/parser.rb:99:in `_racc_do_parse_c'
    from /.../.rvm/rubies/ruby-1.8.7-p357/lib/ruby/1.8/racc/parser.rb:99:in `__send__'
    from /.../.rvm/rubies/ruby-1.8.7-p357/lib/ruby/1.8/racc/parser.rb:99:in `do_parse'
    from /.../.rvm/gems/ruby-1.8.7-p357@ruby_parser/gems/ruby_parser-2.3.1/lib/ruby_parser_extras.rb:750:in `process'
    from (irb):6

How can I fix that?
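
One way to check whether the problem lies in the generated source rather than in ruby_parser is to ask the Ruby interpreter itself to compile it. The sketch below assumes MRI, since `RubyVM::InstructionSequence` is MRI-specific, and uses the same template as above. If the interpreter also rejects the source, the stock ERB output for `<%= ... do %>` is not valid Ruby to begin with.

```ruby
require 'erb'

template = "<%= form_for do %>\n<% end %>"
src = ERB.new(template).src

begin
  # Compile without running; raises SyntaxError if src is not valid Ruby.
  RubyVM::InstructionSequence.compile(src)
  syntax_error = nil
rescue SyntaxError => e
  syntax_error = e
end
```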

Error parsing ternary that contains a function call without parens

The following line of valid Ruby fails to parse with the error: parse error on value "user" (tIDENTIFIER)

user.email ? h user.email : "N/A"

It's complaining that it wasn't expecting "user" right after h. My guess is it thinks h is an identifier, not a function call. Wrapping user.email in parens parses just fine:

user.email ? h(user.email) : "N/A"

Thanks!
