yakaz / yamerl
YAML 1.2 and JSON parser in pure Erlang
License: BSD 2-Clause "Simplified" License
Hello! Thanks for this great library! I'm using this from Elixir and came across this issue:
Interactive Elixir (1.11.2) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> :yamerl_constr.string("{a: 2}")
[[{'a', 2}]]
iex(2)> :yamerl_constr.string("{a: 2")
[]
iex(3)>
Normally parse errors throw an exception but no exception is thrown here.
yamerl version 0.8.1
Eg.
[
[
[
{"application", "kernel"},
{"version", "2.15.3"},
{"path", "/usr/local/lib/erlang/lib/kernel-2.15.3"}
], [
{"application", "stdlib"},
{"version", "1.18.3"},
{"path", "/usr/local/lib/erlang/lib/stdlib-1.18.3"}
], [
{"application", "sasl"},
{"version", "2.2.1"},
{"path", "/usr/local/lib/erlang/lib/sasl-2.2.1"}
]
]
]
# applications.yaml
- application: kernel
  version: 2.15.3
  path: /usr/local/lib/erlang/lib/kernel-2.15.3
- application: stdlib
  version: 1.18.3
  path: /usr/local/lib/erlang/lib/stdlib-1.18.3
- application: sasl
  version: 2.2.1
  path: /usr/local/lib/erlang/lib/sasl-2.2.1
Why is the project's main file named yamerl_constr? Why not simply "yamerl"?
Hi
I am not really sure that's valid YAML, but Ansible, for example, has no problem with this syntax.
test "timeout in yamerl" do
  task = Task.async(
    fn ->
      :yamerl_constr.string("")
    end
  )
  Task.await(task)
end

test "timeout in yamerl 2" do
  task = Task.async(
    fn ->
      :yamerl_constr.string("---")
    end
  )
  Task.await(task)
end
A file with just the document marker at the top (required by some editors to recognize the file type) somehow leads to an infinite loop. An empty document without the document marker works.
OTP 20
yamerl 0.6.0
We have a Ruby app that is serializing some RGBA hex strings with the Ruby YAML library. We get the following:
irb(main):001:0> require 'yaml'
=> true
irb(main):002:0> {rgba: "660000e0"}.to_yaml
=> "---\n:rgba: 660000e0\n"
And when we go over to yamerl it gets parsed as an exponential:
iex(1)> :application.start(:yamerl)
:ok
iex(2)> :yamerl_constr.string("---\n:rgba: 660000e0\n")
[[{':rgba', 6.6e5}]]
In the Ruby parser we would get a hex string back as expected:
irb(main):003:0> YAML.load("---\n:rgba: 660000e0\n")
=> {:rgba=>"660000e0"}
From what I can tell exponentials in YAML should take the form 1.23e+10
and the Ruby library conforms to that behavior:
irb(main):007:0> YAML.load("---\n:rgba: 6.6e+34\n")
=> {:rgba=>6.6e+34}
irb(main):008:0> {rgba: "6.6e+34"}.to_yaml
=> "---\n:rgba: '6.6e+34'\n"
Sorry if this isn't a yamerl bug - it's unclear to me in the spec how these are supposed to be handled
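To see why the two parsers might disagree, it can help to test the scalar against the float patterns of both spec versions. The sketch below transcribes the YAML 1.2 core schema float pattern and the YAML 1.1 float pattern by hand. It is an illustration under assumptions, not yamerl's or Ruby's actual implementation: special values such as .inf and .nan and the YAML 1.1 sexagesimal form are left out, and the constant and method names are made up.

```ruby
# Float patterns transcribed by hand from the YAML specs (assumption: the
# transcription is faithful; .inf/.nan and sexagesimal floats are omitted).
YAML_1_2_CORE_FLOAT = /\A[-+]?(\.[0-9]+|[0-9]+(\.[0-9]*)?)([eE][-+]?[0-9]+)?\z/
YAML_1_1_FLOAT      = /\A[-+]?([0-9][0-9_]*)?\.[0-9.]*([eE][-+][0-9]+)?\z/

# Hypothetical helper: reports whether a plain scalar would resolve to a
# float under each pattern.
def float_under(scalar)
  {
    yaml_1_1: YAML_1_1_FLOAT.match?(scalar),
    yaml_1_2_core: YAML_1_2_CORE_FLOAT.match?(scalar)
  }
end

RGBA_RESULT = float_under("660000e0") # the hex string from this issue
EXP_RESULT  = float_under("6.6e+34")  # the explicit exponent form
```

Under these patterns, 660000e0 counts as a float only in the 1.2 core schema, because the 1.1 pattern requires a decimal point and a signed exponent. That would be consistent with the outputs above if the Ruby library follows YAML 1.1 rules.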
While compiling v0.8.0, I noticed the following warnings:
include/internal/yamerl_constr.hrl:45: Warning: record unfinished_node has field(s) without type information
include/internal/yamerl_constr.hrl:52: Warning: record node_anchor has field(s) without type information
Should I worry? :)
I am trying to get yamerl into Elixir's hex through the community repository hexpm/community. But we have to provide a tag release to get it correctly in the repository. The v0.3.1-1 release still has the OTP 18.0 dict deprecation warnings.
Could you please create a new tag release from master?
I can't get this test string to parse:
test_string = """
%YAML 1.1
%TAG !u! tag:custom:
--- !u!100
Document1:
  one: 1
--- !u!101
Document2:
  two: 2
"""
Attempting to ignore the tags works for the first prefix, but not the second:
> :yamerl_constr.string(test_string, detailed_constr: true, ignore_unrecognized_tags: true)
** (throw) {:yamerl_exception, [{:yamerl_parsing_error, :error, ~c"Tag handle \"!u!\" never declared", 6, 11, :undeclared_tag_handle, {:yamerl_tag, 6, 5, ~c"!u!101"}, []}]}
Attempting to define the custom tags (forgive my use of YamlElixir here, but it's how I was able to move forward) also shows that the second document can't resolve the prefix, even when not ignoring unrecognized tags:
defmodule CustomNode do
  def tags, do: for(id <- 100..101, do: ~c"tag:custom:#{id}")

  def construct_token(constr, node, value) do
    :yamerl_node_map.construct_token(constr, node, value)
  end

  def node_pres(node), do: :yamerl_node_map.node_pres(node)
end
:yamerl_app.set_param(:node_mods, [CustomNode])
> YamlElixir.read_all_from_string!(test_string)
** (YamlElixir.ParsingError) Tag "!u!101" unrecognized by any module (line: 6, column: 5)
(yaml_elixir 2.9.0) lib/yaml_elixir.ex:32: YamlElixir.read_all_from_string!/2
The repository already has a package.exs file. Is there a reason it has not been published on http://hex.pm yet?
While debugging #25, I noticed that a document explicitly specifying YAML 1.1 is still parsed as YAML 1.2 with the core schema.
Here is an example document:
%YAML 1.1
---
660000e0
Currently, yamerl still parses that node as a YAML 1.2 float, even though it should be a string according to YAML 1.1.
2> yamerl:decode_file("yaml1.1-document.yaml").
[6.6e5]
Version: 0.4.0. This affects Hex packages and other packages such as yaml_elixir.
Impact: Observer shows the process consuming all CPU in an infinite loop, with memory growing without bound.
Vuln: For responsible disclosure, an email address with an associated GPG public key (maybe one of these) is needed to transfer the specifics. If there is no reply, the issue will be publicly disclosed 2 months from today.
Hey, I'm trying to parse YAML in Elixir using this code:
:yamerl_constr.file("./data.yml")
and I get this error:
** (throw) {:yamerl_exception, [{:yamerl_parsing_error, :error, 'Tag "!textFormat" unrecognized by any module', 5, 19, :unrecognized_node, {:yamerl_tag, 5, 19, '!textFormat'}, []}]}
I've tried parsing the same yaml using Ruby's yaml gem and it didn't have any issues.
The YAML looks like this:
IDE:
  editor:
    themes:
      base16-zenburn:
        built-in: !textFormat
          color: "#dc8cc3"
          background: "#3f3f3f"
          italic: false
        char: !textFormat
          color: "#5f7f5f"
          background: "#3f3f3f"
          italic: false
        class: !textFormat
          color: "#e0cf9f"
          background: "#3f3f3f"
          italic: false
        comment: !textFormat
          color: "#4f4f4f"
          background: "#3f3f3f"
          italic: false
        currentLine: !textFormat
          color: "#dcdccc"
          background: "#4f4f4f"
          italic: false
        env-var: !textFormat
          color: "#dca3a3"
          background: "#3f3f3f"
          italic: false
        evaluatedCode: !textFormat
          color: "#dcdccc"
          background: "#4f4f4f"
          italic: false
        keyword: !textFormat
          color: "#dc8cc3"
          background: "#3f3f3f"
          bold: true
          italic: false
        lineNumbers: !textFormat
          color: "#4f4f4f"
          background: "#3f3f3f"
          italic: false
        matchingBrackets: !textFormat
          color: "#dcdccc"
          background: "#606060"
          bold: true
          italic: false
        mismatchedBrackets: !textFormat
          color: "#dcdccc"
          background: "#dc8cc3"
          italic: false
        number: !textFormat
          color: "#dfaf8f"
          background: "#3f3f3f"
          italic: false
        postwindowemphasis: !textFormat
          color: "#dcdccc"
          background: "#3f3f3f"
          bold: true
          italic: false
        postwindowerror: !textFormat
          color: "#dca3a3"
          background: "#3f3f3f"
          italic: false
        postwindowsuccess: !textFormat
          color: "#5f7f5f"
          background: "#3f3f3f"
          italic: false
        postwindowtext: !textFormat
          color: "#dcdccc"
          background: "#3f3f3f"
          italic: false
        postwindowwarning: !textFormat
          color: "#dfaf8f"
          background: "#3f3f3f"
          italic: false
        primitive: !textFormat
          color: "#e0cf9f"
          background: "#3f3f3f"
          italic: false
        searchResult: !textFormat
          color: "#dcdccc"
          background: "#404040"
          italic: false
        selection: !textFormat
          color: "#dcdccc"
          background: "#606060"
          italic: false
        string: !textFormat
          color: "#5f7f5f"
          background: "#3f3f3f"
          italic: false
        symbol: !textFormat
          color: "#7cb8bb"
          background: "#3f3f3f"
          italic: false
        text: !textFormat
          color: "#dcdccc"
          background: "#3f3f3f"
          italic: false
        whitespace: !textFormat
          color: "#606060"
          background: "#3f3f3f"
          italic: false
This likely happens for things like lists as well, but I didn't check those. But onto the issue.
I found this while implementing a keyword list tag for Elixir so that specific fields, rather than all map fields as is the only current option in YamlElixir, could be changed to keyword lists. So I implemented a tag and a corresponding module to process said tag and return the proper results.
When creating two different documents that look like so:
foo:
  foo: bar
  bar: foo

foo: {foo: bar, bar: foo}
The AST/results for the two documents look exactly the same. However, if you try to tag the map under foo like so:
foo: !<custom tag here>
  foo: bar
  bar: foo

foo: !<custom tag here> {foo: bar, bar: foo}
The processing of these two documents takes very different forms. In the first case the behaviour is as expected and effectively matches the flow in yamerl_node_map.erl. You get collection start/end tokens and key/value tokens, along with the unfinished node record, to properly construct the data structure. In each of these cases the keys and values are tagged correctly so that the proper token can be constructed: {:yamerl_scalar, 1, 1, {:yamerl_tag, 1, 1, {:non_specific, '?'}}, :flow, :plain, 'foo'}
However, when using the second, flow-style format, the processing changes completely. Rather than receiving the above token to be constructed, we instead get this: {:yamerl_scalar, 1, 44, {:yamerl_tag, 1, 6, 'tag:yaml_elixir,2019:keyword_list'}, :flow, :plain, 'foo'}
Since that doesn't match the format expected by the yamerl_node_str module, it won't build that token correctly. The tag, rather than being applied to the structure as a whole, is being applied to the individual keys, values, and everything else within that flow-style map. This means that, in order to support the flow style, the module implementing the custom tag now has to modify the token and then do its own iteration over the yamerl modules to see if they want to actually construct the token. Now, I know there aren't exactly a lot of options to choose from when constructing the keys, but even so, that seems like a bad blurring of responsibilities and a leaking of internal logic.
What I would expect would be for the flow style map to behave exactly the same in processing as the block style map. With tags applied in the exact same way, and the processing done in the exact same order so that the same custom node module and logic would work for both styles.
According to the YAML spec (https://yaml.org/spec/1.2/spec.html#id2802432), one property of mappings is that their keys are unique.
I discovered a bug: yamerl allows duplicate keys when the map is a proplist and/or {detailed_constr, true} is set.
I have a PR to address the bug along with a test coming soon.
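A minimal sketch of such a uniqueness check, written in Ruby for illustration and not taken from the upcoming PR: given a mapping represented as a list of [key, value] pairs (the proplist shape), report every key that occurs more than once. The name duplicate_keys is hypothetical.

```ruby
# Hypothetical helper: returns the keys that occur more than once in a list
# of [key, value] pairs, in order of first appearance.
def duplicate_keys(pairs)
  pairs
    .group_by(&:first)                      # key => all pairs sharing that key
    .select { |_key, group| group.size > 1 } # keep keys seen at least twice
    .keys
end

PAIRS = [["foo", 1], ["bar", 2], ["foo", 3]].freeze
DUPLICATES = duplicate_keys(PAIRS)
```

A parser that enforces the spec would raise an error when this list is non-empty, while a lenient mode could keep all pairs as they are.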
Try to parse the following file:
https://github.com/ua-parser/uap-core/blob/master/tests/test_device.yaml
If you print user_agent_string for each row, you will hit:
SkyNet/1.5.1-0000(android:4.0.3;package:com.halfbrick.fruitninjafree;lang:zh_CN;app_version:null;channel:GF0S0N00000;device_brand:htccn_chs_cu;device_model:HTC T328w;resolution:480X800;udid:ffffffff-8d34-e60d-ffff-ffff92fede89;cpu_freq:1008000;google_account:null;phone_number:unknown;game_name:水果忍者;encoded:true;sdk_version:1.5.1;imei:353614053116514;location:unknown)
The Chinese characters are garbled, and list_to_binary fails as well.
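One possible reading of the list_to_binary failure, assuming the parser returned the string as a list of Unicode code points: a byte-oriented conversion rejects any value above 255, while a UTF-8 encoder accepts it. The Ruby sketch below only models that distinction, with Integer#chr (called without an encoding) standing in for the byte-oriented call. The method names are made up. In Erlang, unicode:characters_to_binary/1 is the standard-library function that encodes such a list as UTF-8.

```ruby
# Code points of the game name from the user agent string above.
CODE_POINTS = "水果忍者".codepoints

# Byte-oriented conversion: Integer#chr without an encoding raises
# RangeError for any code point above 255.
def to_bytes_naive(code_points)
  code_points.map(&:chr).join
rescue RangeError
  :badarg
end

# UTF-8 aware conversion: every code point is encoded as one or more bytes.
def to_utf8(code_points)
  code_points.pack("U*")
end

NAIVE_RESULT = to_bytes_naive(CODE_POINTS)
UTF8_RESULT  = to_utf8(CODE_POINTS)
```

If this is what happens here, the data itself is intact and only the conversion or printing step needs to be Unicode-aware.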
Silviu
There seems to be an upper limit on the number of entries in a list. Using this example
%YAML 1.2
role:
  ownership:
    root: /var/log/bc var/log/bd
    root: /var/log/bc1 var/log/bd
    root: /var/log/bc2 var/log/bd
    root: /var/log/bc3 var/log/bd
    root: /var/log/bc4 var/log/bd
    root: /var/log/bc5 var/log/bd
    root: /var/log/bc6 var/log/bd
    root: /var/log/bc7 var/log/bd
    root: /var/log/bc8 var/log/bd
    root: /var/log/bc9 var/log/bd
    root: /var/log/bc0 var/log/bd
    root: /var/log/bc11 var/log/bd
    root: /var/log/bc12 var/log/bd
    root: /var/log/bc13 var/log/bd
    root: /var/log/bc14 var/log/bd
    root: /var/log/bc15 var/log/bd
    root: /var/log/bc16 var/log/bd
    root: /var/log/bc17 var/log/bd
    roota: /var/log/bc18 var/log/bd
    rootb: /var/log/bc19 var/log/bd
results in
[[{"role",
[{"ownership",
[{"root","/var/log/bc var/log/bd"},
{"root","/var/log/bc1 var/log/bd"},
{"root","/var/log/bc2 var/log/bd"},
{"root","/var/log/bc3 var/log/bd"},
{"root","/var/log/bc4 var/log/bd"},
{"root","/var/log/bc5 var/log/bd"},
{"root","/var/log/bc6 var/log/bd"},
{"root","/var/log/bc7 var/log/bd"},
{"root","/var/log/bc8 var/log/bd"},
{"root","/var/log/bc9 var/log/bd"},
{"root","/var/log/bc0 var/log/bd"},
{"root","/var/log/bc11 var/log/bd"},
{"root","/var/log/bc12 var/log/bd"},
{"root","/var/log/bc13 var/log/bd"},
{"root","/var/log/bc14 var/log/bd"},
{"root","/var/log/bc15 var/log/bd"},
{"root","/var/log/bc16 var/log/bd"},
{"root","/var/log/bc17 var/log/bd"},
{"roota","/var/log/bc18 var/log/bd"},
{"rootb",[...]}]}]}]]
Removing just one entry from the above yaml file results in the correct production of the erl file. This is a trivial example. The real issue arose during the development of quite complex yaml files, which resulted in multiple ellipses occurring.
yamerl 0.9.0 fixed the situation where duplicate keys in mappings were not checked, and thus allowed. Some users relied on that behavior, even though it is invalid from a YAML specification point of view.
Both behaviors should be fine, and there should be an option to enable the old behavior, meaning that duplicate keys would be kept when the constructed structure permits it.
See #39 and @bobkocisko's comment.
Hello
Recently, most projects have been switching to binaries to represent strings. I believe yamerl should do the same. Here's an example of the ambiguity created by the "strings as lists of integers" approach:
7> yamerl:decode("before_install: some_command").
[[{"before_install","some_command"}]]
8> yamerl:decode("before_install:\n - some_command\n - another").
[[{"before_install",["some_command","another"]}]]
The problem here is that you can't distinguish between those two cases with a simple is_list call, since you have to inspect a term more closely. If binaries are used for strings, it would be a trivial case.
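The ambiguity can be modeled outside Erlang. In the Ruby sketch below, arrays stand in for Erlang lists and an array of integers stands in for an Erlang string. This is an analogy under that assumption, not yamerl code, and charlist? is a hypothetical helper.

```ruby
# An Erlang-style string: a list of integer code points.
SINGLE = "some_command".codepoints
# A sequence of two such strings: a list of lists.
MANY = ["some_command", "another"].map(&:codepoints)

# A plain "is it a list?" check cannot tell the two apart.
BOTH_ARE_LISTS = SINGLE.is_a?(Array) && MANY.is_a?(Array)

# Telling them apart requires inspecting every element.
def charlist?(term)
  term.is_a?(Array) && term.all? { |c| c.is_a?(Integer) }
end

# The empty list stays ambiguous: it is both "" and an empty sequence.
EMPTY_LOOKS_LIKE_STRING = charlist?([])
```

With a dedicated string type (binaries in Erlang), a single type check would be enough, and the empty string would no longer collide with the empty list.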
Hey,
I'm trying to compile yamerl from the Hex package provided (https://hex.pm/packages/yamerl) but there is an error when running mix deps.compile yamerl:
==> yamerl (compile)
ERROR: sh(awk '/^AC_INIT\(/ { ver=$0; sub(/AC_INIT\(\[[^]]+\], *\[/, "", ver); sub(/\].*/, "", ver); print ver; } { next; }' configure.ac)
failed with return code 2 and the following output:
awk: fatal: cannot open file `configure.ac' for reading (No such file or directory)
Looking at https://github.com/yakaz/yamerl/blob/master/package.exs#L22, it seems that configure.ac is missing.
Thanks!
Used from elixir (I don't think it matters), but when executing:
iex> :yamerl.decode("test:")
or
iex> :yamerl_constr.string("test:")
It blocks forever, using 100% CPU.
It should return some kind of error about invalid YAML, or an object with .
For comparison, Python's yaml module returns: {'test':None}
Hello
I get the following error while parsing multi-byte strings.
Erlang/OTP 17 [erts-6.0] [source] [64-bit] [smp:8:8] [async-threads:10] [hipe] [kernel-poll:false]
Eshell V6.0 (abort with ^G)
1> application:start(yamerl).
ok
2> yamerl_constr:string("いろは").
** exception error: bad argument
in function re:run/3
called as re:run([12356,12429,12399],
"^(\\.[0-9]+|[0-9]+(\\.[0-9]*)?)([eE][-+]?[0-9]+)?$",
[{capture,none}])
in call from yamerl_node_float:string_to_float2/1 (src/yamerl_node_float.erl, line 145)
in call from yamerl_node_float:construct_token/3 (src/yamerl_node_float.erl, line 64)
in call from yamerl_node_float:try_construct_token/3 (src/yamerl_node_float.erl, line 54)
in call from yamerl_constr:try_construct/3 (src/yamerl_constr.erl, line 371)
in call from yamerl_constr:construct/2 (src/yamerl_constr.erl, line 315)
in call from yamerl_constr:string/2 (src/yamerl_constr.erl, line 203)
Hi
I found an exception raised when parsing the block scalar style with indentation.
(yamerl 0.8.0, used from Elixir)
iex(1)> :yamerl.decode("--- |
...(1)> foo
...(1)> bar
...(1)> baz
...(1)> ---
...(1)> new document
...(1)> ")
** (throw) {:yamerl_exception, [{:yamerl_parsing_error, :error, 'Invalid block scalar indentation', 5, 1, :invalid_block_scalar_indentation, {:yamerl_scalar, 1, 5, {:yamerl_tag, 1, 5, {:non_specific, '!'}}, :block, :literal, 'foo\nbar\nbaz'}, []}]}
(yamerl 0.8.0) /home/hattori/exyaml/deps/yamerl/src/yamerl_errors.erl:59: :yamerl_errors.throw/1
(yamerl 0.8.0) /home/hattori/exyaml/deps/yamerl/src/yamerl_constr.erl:471: :yamerl_constr.string/2
I am escriptizing an Erlang project that depends on yamerl and distributing it to other users. Everything works fine but I get this info report:
=INFO REPORT==== 3-Nov-2014::18:42:12 ===
<HiPE (v 3.11)> Warning: not loading native code for module yamerl_parser: it was compiled for an incompatible runtime system; please regenerate native code for this runtime system
Do you know how I can use yamerl everywhere without having to configure it, so that I do not get that info report?
I'm using yaml for configuration and it would be awesome to be able to highlight the term that is incorrect in a listing of the user's yaml file.
This requires one of:
Is there an appetite for such a change? Would it be difficult to make? Would you be likely to accept a PR?
Cheers,
James
Use of autotools is overkill here in my opinion. Is there any specific reason to use it? Currently the lowest common denominator for Erlang projects is Rebar, let's all use it.
Includes in yamerl are kinda wonky and only work because rebar3 explicitly adds app/include to private include libraries.
When attempting to include them from another application, which is necessary in order to write a custom node module (no access to ext_options otherwise!), it explodes.
Possible solutions:
It's time to drop the autotools and use a common Erlang build system as the primary and only way of building yamerl.
The Debian package will also be dropped. It can be restored in a separate repository, it doesn't belong to the "upstream" source repository.
I got this issue when I try to compile another project (coil)
λ mix
Uncaught error in rebar_core: {'EXIT',
{function_clause,
[{code,which,
[{coveralls,
{git,
"https://github.com/markusn/coveralls-erl",
{branch,"master"}}}],
[{file,"code.erl"},{line,719}]},
{rebar_core,'-plugin_modules/3-lc$^0/1-0-',
1,
[{file,"src/rebar_core.erl"},{line,573}]},
{rebar_core,plugin_modules,3,
[{file,"src/rebar_core.erl"},{line,573}]},
{rebar_core,process_dir1,7,
[{file,"src/rebar_core.erl"},{line,244}]},
{rebar_core,process_commands,2,
[{file,"src/rebar_core.erl"},{line,93}]},
{rebar,main,1,
[{file,"src/rebar.erl"},{line,58}]},
{escript,run,2,
[{file,"escript.erl"},{line,757}]},
{escript,start,1,
[{file,"escript.erl"},{line,277}]}]}}
** (Mix) Could not compile dependency :yamerl, "escript.exe "c:/Users/rz/.mix/rebar" compile skip_deps=true deps_dir="d:/proj/coil/_build/dev/lib"" command failed.
You can recompile this dependency with "mix deps.compile yamerl",
update it with "mix deps.update yamerl" or clean it with "mix deps.clean yamerl"
Hi,
Is there any specific reason why the v0.6.0 tag is marked as "latest release" on GitHub when the 0.7.0 tag is present and apparently in good shape? I assume you just forgot to mark 0.7.0 as the latest release, but I wanted to check.
I see both 0.6.0 and 0.7.0 passed CI. I see 0.7.0 is on hex.pm .
Thanks for the useful library!