
sgfparser's Introduction

SGFParser


Intro

I'm hoping that this is and remains the fastest SGF parser in Ruby. On my desktop, loading the SGF library and parsing Kogo's Joseki dictionary takes a little under six seconds. It's a 3MB file and the average SGF is maybe 10k, so on average it's rather snappy.

Are you using this gem? Is there functionality you wish it had? Is something hard to do? Does the documentation not make sense, and you know how to make it more helpful? Let me know and I'll make it possible, or easier!

Supported versions

SGF: FF4. Earlier versions may work as well, but are untested.
Ruby: >= 2.1

Intro to SGF

According to the standard, an SGF file holds a Collection of one or more Gametree objects. Each of those is made of a tree of Node objects.

In other words: FILE (1 ↔ 1) Collection (1 ↔ ∞) Gametree (1 ↔ ∞) Node

Bringing in the code

Simplicity itself:

require 'sgf'

Basics of our data structure

In this implementation, when you parse a file, you get a Collection back. This object has a root Node used as the top-level node for all gametrees. The children of that node are the root nodes of the actual games.

Assuming a common SGF file with a single game, you could get to the game by doing this:

SGF.parse(filename).gametrees.first # => <SGF::Game:70180384181460>

If you have a string, instead, then:

SGF::Parser.new.parse sgf_string

Basics of properties

Some properties belong on the root node of a game only, such as the identity of the players. For convenience, some human-readable methods are defined on the gametree object itself to reach this information, for instance:

gametree.black_player # => "tartrate"

Calling a property that is not defined in the current tree will result in an error. For instance, a property that does not exist in the game of Go:

gametree.black_octisquares # => SGF::NoIdentityError

Basics of navigating

Since a game is a tree (each node can be the source of many variations), a convenience method is defined to help you traverse the main branch one node at a time.

gametree.current_node # => starts as root node, e.g. #<SGF::Node:70180384857820, Has a parent, 1 Children, 16 Properties>
gametree.next_node    # => #<SGF::Node:70180384839420, Has a parent, 1 Children, 4 Properties>
gametree.current_node # => #<SGF::Node:70180384839420, Has a parent, 1 Children, 4 Properties>

Since it's easy to get lost when you're looking at things one node at a time (or because sometimes you don't want to iterate with an index), we also provide a convenience depth method on a given node to tell you how far down the tree you are.

And since this is Ruby, all of the objects (Collection, Gametree and Node) provide iteration through each. Note that in this example, we are using a gametree, and iteration on a gametree starts from the gametree's root, so the depth is 1. Iteration on a collection starts from the collection's root, and that node's depth would be 0. Iteration on any node starts from that node and goes through all its children.

NOTE: iteration is done as preorder tree traversal. You shouldn't have to care about this, but you might.

gametree.each do |node|
  puts "Node at depth #{node.depth} has #{node.properties.count} properties"
end
=begin
Node at depth 1 has 16 properties
Node at depth 2 has 4 properties
Node at depth 3 has 3 properties
Node at depth 4 has 4 properties
Node at depth 5 has 3 properties
Node at depth 6 has 3 properties
Node at depth 7 has 3 properties
Node at depth 8 has 4 properties
... And so on
=end
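
To make the preorder remark concrete, here is a minimal, self-contained sketch of a tree with depth and each. SketchNode and add_child are hypothetical stand-ins, not the gem's SGF::Node, and this sketch counts depth from 0 at whichever node has no parent (in the gem, the collection root is 0 and a gametree root is 1).

```ruby
# Hypothetical stand-in for a tree node; NOT the gem's SGF::Node.
class SketchNode
  include Enumerable

  attr_reader :properties, :children
  attr_accessor :parent

  def initialize(properties = {})
    @properties = properties
    @children = []
    @parent = nil
  end

  # Attach a child and return self so calls can be chained.
  def add_child(child)
    child.parent = self
    @children << child
    self
  end

  # Number of ancestors above this node (0 for a node without a parent).
  def depth
    parent ? parent.depth + 1 : 0
  end

  # Preorder: yield the node itself, then each subtree from left to right.
  def each(&block)
    return enum_for(:each) unless block

    yield self
    children.each { |child| child.each(&block) }
    self
  end
end

root      = SketchNode.new('FF' => '4')
move      = SketchNode.new('B' => 'pd')
variation = SketchNode.new('B' => 'dd')
reply     = SketchNode.new('W' => 'dp')
root.add_child(move).add_child(variation)
move.add_child(reply)

order = root.map { |node| [node.depth, node.properties.keys.first] }
# order == [[0, "FF"], [1, "B"], [2, "W"], [1, "B"]]
```

The main line (move, then reply) is visited in full before the sibling variation, which is what preorder means here.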

Basics of saving

There is SGF::Writer, which you can use starting from any node. There is also a convenience method on collection:

collection.save(filename) # => Shiny new text file
SGF::Writer.new.stringify_tree_from(node) # => Shiny string
SGF::Writer.new.save(node, filename) # => File with tree starting at node

If you need a raw SGF version of your data, you can use to_s:

node.to_s
gametree.to_s
collection.to_s

SGF Parsing warning (À bon entendeur…)

WARNING: An implementation requirement is to make sure any closing bracket ']' inside a comment is escaped: '\]'. If this is not done, you will be one sad panda! This library will do this for you upon saving, but will most likely die horribly when parsing anything which does not follow this rule.
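
If you build comment text yourself before it reaches an SGF file, the escaping could look like the sketch below. escape_sgf_text is a hypothetical helper, not part of this library; it assumes the input is raw, not yet escaped text, and escapes backslashes as well so that an existing backslash is not mistaken for an escape.

```ruby
# Hypothetical helper, not part of the gem: make raw text safe to place
# between '[' and ']' by escaping backslashes and closing brackets.
def escape_sgf_text(raw)
  raw.gsub(/[\\\]]/) { |char| "\\#{char}" }
end

puts escape_sgf_text('see move [a1] here')
# prints: see move [a1\] here
```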

Addenda

Branch name

The branch used for publishing the gem is the congruence branch. We chose this word because it has strong connotations for proper integration. This branch is congruent. It means all changes brought into this branch are congruent.

sgfparser's People

Contributors

amikula, dependabot[bot], gitter-badger, mathieujobin, moss, trevoke

sgfparser's Issues

Merging gametrees

SGF allows for gametree merging. So, this gem should also allow for that to happen.

each should not stop processing if block returns nil

        sgf_raw = open(row["url"]).read
        puts row["url"]
        parser = SGF::Parser.new 
        tree = parser.parse sgf_raw

        tree.each do |x|
          puts x.inspect
        end

        tree.games[0].each do |x|
          puts x.inspect
        end

Output for both using http://files.gokgs.com/games/2011/11/29/andrew-ninecrane.sgf as source:

#<SGF::Node:2179137740, Children: 1, Parent: true, Properties:{"FF"=>"4", "ST"=>"2", "KM"=>"0.50", "RU"=>"Japanese", "TM"=>"900", "C"=>"andrew [3d]: hi\nninecrane [2d]: hi\n", "AP"=>"CGoban:3", "OT"=>"5x40 byo-yomi", "WR"=>"3d", "BR"=>"2d", "DT"=>"2011-11-29", "GM"=>"1", "SZ"=>"19", "PW"=>"andrew", "PB"=>"ninecrane", "CA"=>"UTF-8", "PC"=>"The KGS Go Server at http://www.gokgs.com/", "RE"=>"B+3.50"}>

Documentation

You have some comments in the source, but actual documentation is nil, and the stuff you do have in the README is actually wrong. For example, you pass no args to SGF::Parser.new, but the source calls for a raw string of the read data. Likewise, parser.parse has that argument, but in actuality, should not.

Generally any documentation at all would be nice.

Add errors/warning for SGF that doesn't conform to standard

I'd like to (optionally?) go through the tree once it is parsed and examine it for SGF defects, such as the first node of the game not having the FF identity.
Problems found that way would make their way into the errors array - or whatever that ends up being.
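
A rough sketch of what such a pass might look like, working on plain hashes rather than the gem's node objects. sgf_warnings and ROOT_ONLY are hypothetical names; the list of root-only identities (AP, CA, FF, GM, ST, SZ) is taken from the FF4 root properties, and the input is assumed to be the nodes of one game in main-line order.

```ruby
# Identities that FF4 classifies as root properties.
ROOT_ONLY = %w[AP CA FF GM ST SZ].freeze

# Hypothetical checker: takes an array of property hashes (first entry is
# the root node) and returns an array of human-readable warnings.
def sgf_warnings(nodes)
  warnings = []
  root, *rest = nodes
  warnings << 'first node has no FF property' unless root && root.key?('FF')

  rest.each_with_index do |node, index|
    misplaced = node.keys & ROOT_ONLY
    next if misplaced.empty?

    warnings << "node #{index + 1} has root-only properties: #{misplaced.join(', ')}"
  end
  warnings
end

clean = sgf_warnings([{ 'FF' => '4' }, { 'W' => 'qd' }])
dirty = sgf_warnings([{ 'GM' => '1' }, { 'SZ' => '19' }])
```

Returning an array keeps the check optional: callers who do not care can ignore it, and a strict mode could raise when it is not empty.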

The parser seems to not parse well

Uploading a broken test to illustrate the issue:
(;FF[4];W[qd]) should have two nodes, one with FF => 4 and one with W => qd ... The second node is empty.

Speed Increases

The SGF parser is slllloooow.

Specifically...

SGF::Parser#next_character:

!@stream.eof? && @stream.sysread(1) is really slow. The stream methods are poor choices because they don’t get cached at all. You’d find exponential improvements in speed by reading the whole thing to a buffer at once, and then just slicing it or iterating over it.

SGF::Parser#parse_property

is really slow, due to the while loops in parse_comment, parse_multi_property and parse_generic_property. Please use built-in string methods instead; they are much faster than your pure Ruby implementation. Since you should now be reading to a buffer, you don’t need to go over every character individually - expect huge speed ups.

SGF::Parser#still_inside_node?

Stream methods AND a while loop. Luckily this doesn’t get called much.

Generally, reworking the parser to work with built-ins on a buffered read should do two things:

  1. immense speed ups
  2. no more issues with ]
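
As an illustration of the buffered approach, here is a small sketch that reads from an in-memory string with StringScanner from Ruby's standard library. scan_nodes is a hypothetical helper, not the gem's parser: it only handles a flat sequence of nodes, ignores variations, and treats a backslash as escaping the next character, so only properly escaped ] characters are handled.

```ruby
require 'strscan'

# Hypothetical helper: split a flat SGF sequence into one property hash
# per node, scanning a buffer instead of reading a stream byte by byte.
def scan_nodes(buffer)
  scanner = StringScanner.new(buffer)
  nodes = []

  until scanner.eos?
    if scanner.scan(/;/)
      nodes << {}
    elsif (ident = scanner.scan(/[A-Z]+/))
      values = []
      # Each value runs up to the first unescaped ']'.
      values << scanner[1] while scanner.scan(/\s*\[((?:\\.|[^\\\]])*)\]/m)
      next if nodes.empty?

      nodes.last[ident] = values.length == 1 ? values.first : values
    else
      scanner.getch # skip parentheses, whitespace and anything unexpected
    end
  end

  nodes
end

nodes = scan_nodes('(;FF[4];W[qd])')
# nodes == [{"FF"=>"4"}, {"W"=>"qd"}]
```

The regular expression does the per-character work inside the regex engine, which is the kind of built-in the issue is asking for.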

Add errors/warning for malformed SGF

Right now, this parser will parse pretty much blindly.
I'd like to be able to add some kind of errors array, or something similar, to show the user what errors or problems occurred while parsing the file.

add_children modifies the children

This is a smell. When adding children to a node, that node modifies the children and sets itself as the parent.
Is this bad? Is there a more elegant way to do this, or is it fine?
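
One possible alternative, sketched under the assumption that the child should own the link: parent= on the child keeps both sides in sync, and add_children simply delegates to it. LinkedNode is a hypothetical class for illustration, not the gem's node.

```ruby
# Hypothetical node where the child is responsible for the parent link.
class LinkedNode
  attr_reader :parent, :children

  def initialize
    @parent = nil
    @children = []
  end

  # Reassigning the parent detaches the node from its previous parent
  # and registers it with the new one.
  def parent=(new_parent)
    @parent.children.delete(self) if @parent
    @parent = new_parent
    return unless new_parent

    new_parent.children << self unless new_parent.children.include?(self)
  end

  # The parent no longer reaches into the child's state directly.
  def add_children(*nodes)
    nodes.each { |node| node.parent = self }
    self
  end
end

first  = LinkedNode.new
second = LinkedNode.new
child  = LinkedNode.new
first.add_children(child)
second.add_children(child) # moves the child instead of duplicating it
```

Whether this is more elegant is a judgment call; it mainly guarantees that a node never appears under two parents.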

Move depth property

The SGF format is move independent, but human logic is not. A move depth property should be available for each node, specifying its depth from root. This is analogous to ‘move number’, as it is equivalent to the position of the node from the root (move 0).

Adding bonus properties like slices, or enumerations based on these would be delicious, delicious frosting on the cake of usability.

Examine naming : Tree/Game/Branch ?

The names are wrong. The Tree class really represents an entire game. The game class really represents a tree, or a branch of a tree. Naming is the issue here, and clarity thereby.

Use parser as an editor

Hello,
I tried to use the parser as an sgf editor.
I opened a sgf file, parsed into a tree, took the first game, current_node and then edited its properties.
But then when I tried to save back the tree as sgf, the current_node properties were back to old values.
Is there a way to use a parsed tree, modify some nodes and then save it back to a file?
Thanks in advance

Here is a sample code

require 'sgf'
parser = SGF::Parser.new
tree = parser.parse
tree.games.first.current_node[:EV] = "whatever"
tree.save <new_filename>
(does not work, new_filename does not contain the changed property)

Collection#gametrees returns new objects all the time

This might be quite frustrating for a user.
The only reason this is done, right now, is to provide an up-to-date node count on to_s.

There are certainly smarter ways to do this without potentially blowing away a user's changes and causing unexpected SGFs to be saved.

Addendum: also done because otherwise I'm currently creating Collections early on with no gametrees, which leads to parsing empty gametrees.

Option 1: create collection AFTER all the parsing is done and I have filled gametrees
Option 2: maybe the observer pattern? (sigh) so that each node can send an update when it changes, and relevant objects who care about the update can, well, you know... Change.
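
A third possibility, sketched here purely as an assumption about what could work: cache one wrapper per root node, so repeated calls hand back the same objects while roots added later still get picked up. SketchCollection and SketchGametree are hypothetical names, not the gem's classes.

```ruby
# Hypothetical wrapper around a game's root node.
SketchGametree = Struct.new(:root)

class SketchCollection
  def initialize
    @roots = []
    # Keyed by object identity so two equal-looking roots stay distinct.
    @wrappers = {}.compare_by_identity
  end

  def add_root(root)
    @roots << root
    self
  end

  # Builds each wrapper once and reuses it on every later call.
  def gametrees
    @roots.map { |root| @wrappers[root] ||= SketchGametree.new(root) }
  end
end

collection = SketchCollection.new.add_root({ 'FF' => '4' })
first_call = collection.gametrees.first
collection.add_root({ 'FF' => '4' })
second_call = collection.gametrees.first
# first_call and second_call are the same object
```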

Make indenter executable

While it could be useful to use the indenter inside a script to, say, reindent one's entire SGF collection, it would probably be nice to easily just do a one-off on the command line.

SGF::Parser - is @strict_parsing necessary?

This was added as I was TDDing the SGF parser. Is it an indication of an object trying to get out? Can it just be removed (and the tests that use this feature with it) ?

Parsing bad(?) sgf

Hi,

I'm using SgfParser (which is awesome, by the way) to parse tsumego from goproblems.com.

I guess this isn't a problem with SgfParser as such, but some of the sgf on goproblems has a format that doesn't parse correctly with SGF::Parser.new.parse.
(e.g. when it starts like "(;AB[ee]AB[ef]AB[ff]AW[ed]AW[dd]...")

In this case it just lists one AB/AW node (the last occurring), so in the example above AB=>'ff', AW=>'dd'.

For me I just validated it using some really ugly code (the source is on github, called gotasku), but maybe this case is a big enough deal to incorporate into the SgfParser.

Here is an example from goproblems:

goproblems.com/10000

(;AB[bq]AB[cq]AB[dp]AB[ep]AB[fp]AB[gp]AB[co]AB[hq]AB[iq]AB[jr]AB[gr]AW[gq]AW[fq]AW[eq]AW[dq]AW[cr]AW[hr]AW[hs]C[black to kill]AP[goproblems](;B[er](;W[dr]C[CHOICE](;B[fr];W[es](;B[br](;W[cs];B[bs]C[RIGHT])(;W[fs];B[cs]C[RIGHT]))(;B[is];W[fs])(;B[ir];W[fs])(;B[cs];W[br]))(;B[br];W[fr]))(;W[fr];B[dr]C[RIGHT]))(;B[fr];W[er](;B[es];W[dr](;B[br];W[fs];B[cs];W[gs])(;B[fs];W[gs])(;B[ds];W[br])(;B[cs];W[br]))(;B[ds];W[es](;B[cs];W[br])(;B[br];W[cs])))(;B[fs];W[es](;B[er];W[dr];B[ds];W[fr])(;B[is];W[fr])(;B[fr];W[er]))(;B[gs];W[fs])(;B[br];W[ds](;B[fr];W[er])(;B[er];W[fr]))(;B[es];W[er](;B[fs];W[gs];B[cs];W[fr])(;B[cs];W[fs])(;B[fr])))
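
For illustration, repeated identities within one node could be gathered into a list instead of overwriting each other. collect_properties is a hypothetical helper working on the text of a single node, not something SgfParser provides; it also accepts the standard AB[ee][ef] form by reusing the last identity seen.

```ruby
# Hypothetical helper: collect every value for each identity in one node,
# whether written as AB[ee]AB[ef] or as AB[ee][ef].
def collect_properties(node_text)
  properties = Hash.new { |hash, key| hash[key] = [] }
  current = nil

  node_text.scan(/([A-Z]+)?\s*\[((?:\\.|[^\\\]])*)\]/m) do |ident, value|
    current = ident || current
    properties[current] << value if current
  end

  properties
end

stones = collect_properties('AB[ee]AB[ef]AB[ff]AW[ed]AW[dd]')
# stones == {"AB"=>["ee", "ef", "ff"], "AW"=>["ed", "dd"]}
```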

Have properties belonging to the game accessible from tree

Properties like “RU” and “PW” and “TM” etc are only available on the root node, but they really belong to the tree class, no? They are comprehensive across the entire game, and should be accessible from the tree class (which is game specific).

Overloaded class initialization

SGF::Parser.new should be able to be created....

with a raw_string, like now
with a file handler
with a local path
(for bonus marks) from a url (or we could just use open-uri as a developer)

and of course...
without any arguments, with methods to supply one of the above later.
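
A sketch of how the first three cases could be normalised before parsing. read_sgf_source is a hypothetical helper, not the gem's API; it assumes that a string naming an existing file is a path and that any other string is raw SGF. URLs are left out, as suggested above.

```ruby
require 'stringio'

# Hypothetical helper: turn a raw string, an IO-like object, or a local
# path into the SGF text to parse. Returns nil when given nothing.
def read_sgf_source(source = nil)
  case source
  when nil
    nil
  when String
    File.file?(source) ? File.read(source) : source
  else
    unless source.respond_to?(:read)
      raise ArgumentError, "cannot read SGF from #{source.class}"
    end

    source.read
  end
end

from_string = read_sgf_source('(;FF[4])')
from_handle = read_sgf_source(StringIO.new('(;FF[4])'))
```

A parser could call this from both its constructor and a later setter, which would cover the no-argument case as well.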

Update Gem Repo

It’s old compared to your source, doesn’t even have any of the game class stuff.
