
mirage / ocaml-cow


Caml on the Web (COW) is a set of parsers and syntax extensions to let you manipulate HTML, CSS, XML, JSON and Markdown directly from OCaml code.

Home Page: http://www.openmirage.org/

License: Other

OCaml 99.87% Makefile 0.13%


ocaml-cow's People

Contributors

avsm, chris00, chrismamo1, craigfe, dsheets, emillon, hannesm, mor1, pgj, rgrinberg, samoht, talex5, tbrk, waldyrious


ocaml-cow's Issues

pa_css: doesn't handle empty fields

 <:css<
  pre.verbatim, pre.codepre { }
 >>

Error: While expanding quotation "css" in a position of "str_item": Camlp4: Uncaught exception: Parsing.Parse_error

Fix attrs antiquotation expander

The attrs antiquotation expander is presently very brittle as demonstrated in 0d72af0.

Specifically, attrs' whitespace handling is very broken. Attributes may be separated by many whitespace characters with different char codes. Attribute values may contain whitespace. Attribute key-values may have whitespace between their tokens and '='.
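
A minimal sketch of whitespace-tolerant attribute parsing, covering the three cases listed above. It is a standalone illustration, not the expander in the repository: `parse_attrs` and `is_space` are hypothetical names, values are assumed to be quoted, and no entity decoding is attempted.

```ocaml
(* Whitespace characters that may separate attributes or surround '='. *)
let is_space = function
  | ' ' | '\t' | '\n' | '\r' | '\012' -> true
  | _ -> false

(* Hypothetical helper: parse [key = "value"] pairs from a string.
   Values must be wrapped in single or double quotes and may contain
   whitespace. Raises [Failure] on malformed input. *)
let parse_attrs (s : string) : (string * string) list =
  let n = String.length s in
  let rec skip i = if i < n && is_space s.[i] then skip (i + 1) else i in
  let rec key_end i =
    if i < n && not (is_space s.[i]) && s.[i] <> '=' then key_end (i + 1)
    else i
  in
  let rec loop i acc =
    let i = skip i in
    if i >= n then List.rev acc
    else
      let j = key_end i in
      let key = String.sub s i (j - i) in
      let k = skip j in
      if key = "" || k >= n || s.[k] <> '=' then failwith "expected key="
      else
        let q = skip (k + 1) in
        if q >= n || (s.[q] <> '"' && s.[q] <> '\'') then
          failwith "expected quoted value"
        else
          (* Look for the matching closing quote. *)
          match String.index_from_opt s (q + 1) s.[q] with
          | None -> failwith "unterminated value"
          | Some e ->
            let value = String.sub s (q + 1) (e - q - 1) in
            loop (e + 1) ((key, value) :: acc)
  in
  loop 0 []
```

Scanning by character class rather than splitting on a single space is what lets tabs, newlines and padding around `=` pass through.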

Can't embed attributes with `:` in them

Specifically trying to add xml:base into Atom for relative links

<:xml<<a xml:foo="bar"></a>&>>;;
- : Cow.Xml.t = [`El ((("", "a"), [(("", "foo"), "bar")]), [])]     

The xml:foo turns into foo.
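
A small sketch of how a qualified attribute name could be split so that the prefix is kept rather than dropped. `split_qname` is a hypothetical helper, not a Cow function; mapping the prefix to a namespace URI is left out.

```ocaml
(* Hypothetical helper: split "prefix:local" into (prefix, local).
   Names without a colon get an empty prefix. *)
let split_qname (name : string) : string * string =
  match String.index_opt name ':' with
  | None -> ("", name)
  | Some i ->
    ( String.sub name 0 i,
      String.sub name (i + 1) (String.length name - i - 1) )
```

With this, `xml:foo` would yield `("xml", "foo")` instead of losing the `xml` part.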

Print json prettier

It would be nice to have some control over how json was converted to a string.

For example, it would be useful to be able to print the json with some newlines in it so that it is human readable.
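
One possible shape for such a printer, sketched over an ezjsonm-style polymorphic variant type rather than Cow's own representation. The names `pretty` and `escape` are hypothetical, and the string escaping is deliberately simplified (it handles only quotes, backslashes and newlines).

```ocaml
type json =
  [ `Null | `Bool of bool | `Float of float | `String of string
  | `A of json list | `O of (string * json) list ]

(* Simplified string escaping: quotes, backslashes and newlines only. *)
let escape (s : string) : string =
  let b = Buffer.create (String.length s + 2) in
  Buffer.add_char b '"';
  String.iter
    (function
      | '"' -> Buffer.add_string b "\\\""
      | '\\' -> Buffer.add_string b "\\\\"
      | '\n' -> Buffer.add_string b "\\n"
      | c -> Buffer.add_char b c)
    s;
  Buffer.add_char b '"';
  Buffer.contents b

(* Print one value per line, indenting nested values by two spaces. *)
let rec pretty ?(indent = 0) (j : json) : string =
  let pad n = String.make n ' ' in
  let inner = indent + 2 in
  match j with
  | `Null -> "null"
  | `Bool b -> string_of_bool b
  | `Float f -> Printf.sprintf "%g" f
  | `String s -> escape s
  | `A [] -> "[]"
  | `O [] -> "{}"
  | `A xs ->
    let items = List.map (fun x -> pad inner ^ pretty ~indent:inner x) xs in
    "[\n" ^ String.concat ",\n" items ^ "\n" ^ pad indent ^ "]"
  | `O kvs ->
    let item (k, v) = pad inner ^ escape k ^ ": " ^ pretty ~indent:inner v in
    "{\n" ^ String.concat ",\n" (List.map item kvs) ^ "\n" ^ pad indent ^ "}"
```

An optional `indent` argument keeps the compact and human-readable forms behind the same entry point, which is one way to give callers the control asked for here.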

Why is re a direct dependency?

I've had a look and it doesn't seem like re is being used anywhere. I see the syntax extension is using str which is fine but otherwise no uses of re.

@avsm @samoht please confirm/deny?

code output incompatible with Core

If Core is opened, then our List.flatten is unbound. Apparently Core exposes a module which can be opened locally to get the original stdlib back, but we'll still need a command line flag for full compatibility. Is there some ocamlfind predicate which might help with hiding this, I wonder...
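
The shadowing problem can be reproduced without Core. In the sketch below, `Shadowing` is a hypothetical stand-in for a library that replaces `List` when opened; referring to the standard library through `Stdlib` (available from OCaml 4.07) is one way generated code could stay independent of what the user has opened.

```ocaml
(* Hypothetical stand-in for a library whose [List] lacks [flatten]. *)
module Shadowing = struct
  module List = struct
    let length = Stdlib.List.length
  end
end

open Shadowing

(* [List.flatten] is unbound at this point, but the fully qualified
   path still resolves to the standard library. *)
let flat : int list = Stdlib.List.flatten [ [ 1; 2 ]; [ 3 ] ]

(* This call goes to the shadowing module. *)
let n = List.length flat
```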

`make tests` fails

ocamlfind ocamlc -package oUnit -linkpkg -I /home/dsheets/.opam/4.02.1/lib/dyntype dyntype.cma -I /home/dsheets/.opam/4.02.1/lib/ulex ulexing.cma -I /home/dsheets/.opam/4.02.1/lib/ocaml unix.cma -I /home/dsheets/.opam/4.02.1/lib/oUnit oUnitAdvanced.cma -I /home/dsheets/.opam/4.02.1/lib/oUnit oUnit.cma -I /home/dsheets/.opam/4.02.1/lib/re re.cma -I /home/dsheets/.opam/4.02.1/lib/re re_posix.cma -I /home/dsheets/.opam/4.02.1/lib/stringext stringext.cma -I /home/dsheets/.opam/4.02.1/lib/ocaml bigarray.cma -I /home/dsheets/.opam/4.02.1/lib/sexplib sexplib.cma -I /home/dsheets/.opam/4.02.1/lib/uri uri.cma -I /home/dsheets/.opam/4.02.1/lib/xmlm xmlm.cma -I /home/dsheets/.opam/4.02.1/lib/uutf uutf.cma -I /home/dsheets/.opam/4.02.1/lib/jsonm jsonm.cma -I /home/dsheets/.opam/4.02.1/lib/hex hex.cma -I /home/dsheets/.opam/4.02.1/lib/ezjsonm ezjsonm.cma -I /home/dsheets/.opam/4.02.1/lib/omd omd.cma -I ../_build/lib cow.cma render.cmo extension.cmo\
  test.ml -o test
File "/home/dsheets/.opam/4.02.1/lib/ocaml/unix.cma(Unix)", line 1:
Warning 31: files /home/dsheets/.opam/4.02.1/lib/ocaml/unix.cma(Unix) and /home/dsheets/.opam/4.02.1/lib/ocaml/unix.cma(Unix) both define a module named Unix
File "/home/dsheets/.opam/4.02.1/lib/ocaml/unix.cma(UnixLabels)", line 1:
Warning 31: files /home/dsheets/.opam/4.02.1/lib/ocaml/unix.cma(UnixLabels) and /home/dsheets/.opam/4.02.1/lib/ocaml/unix.cma(UnixLabels) both define a module named UnixLabels
make[1]: Leaving directory '/home/dsheets/Code/ocaml-cow/tests'
tests/render
make: tests/render: Command not found
Makefile:15: recipe for target 'tests' failed
make: *** [tests] Error 127

This issue was introduced after 1.2.0, in 8f271fd and 587b947, @samoht.

Json marshalling does not seem to work properly with type containing option

The following basic example:

open Cow
type data = {
output : string option;
errmsg : string option;
} with json

fails to compile with:

File "test.ml", line 3, characters 5-72:
Error: This expression has type 'a list
but an expression was expected of type string * Cow.Json.t

The error appears with cow-0.5.2 and ocaml-4.0.1. Thanks in advance.

Html parser is not robust enough

Example:

# let s = "<p><img src=\"http://typeocaml.com/content/images/2014/11/thunk.jpg#hero\" alt=\"thunk\">\nA thunk is simply a function with the <em>unit</em> parameter. For example:</p>\n\n<pre><code class=\"ocaml\">let f() = 1 + 2 * 3;;  \n</code></pre>\n\n<h1 id=\"features\">Features</h1>\n\n<p>It is indeed a function and seems simple enough. However, don't overlook it. Not having any parameters makes it so special that it even has such a particular name, <em>thunk</em>. Let's have a look at its features. </p>\n\n<h2 id=\"determinedresult\">Determined result</h2>\n\n<p>If a function has parameters, the result of its application may depend on its <a href=\"http://stackoverflow.com/questions/156767/whats-the-difference-between-an-argument-and-a-parameter\">arguments</a>. For example:</p>\n\n<pre><code class=\"ocaml\">let g x y = x + y;;  \n</code></pre>\n\n<p>Without giving values for x and y, we won't be able to know what concrete value that g would give back to us.</p>\n\n<p>Thunk, however, does not have any parameters, which means its result must be fixed. After defining it, the body will not be affected by anything any more. It is like a sealed box: we may not know what's inside yet, but we know that once it is there, its content won't change. For the <code>f</code> at the beginning, we know its value will definitely be 7. For <code>let f1() = h 2 (h 3 4)</code>, we may not know the value of its body but we are sure its body is two applications of <code>h</code> and it won't change.</p>\n\n<h2 id=\"laterevaluation\">Later evaluation</h2>\n\n<p>Binding also has determined result and it is immediately evaluated after we define it. For example, <code>let x = 1 + 2 * 3</code> will return you x with 7 at once. </p>\n\n<p>Thunk is different. When we hold a thunk, we must call it like <code>f()</code> to get its result. The evaluation is a kind of delayed, but we have the full control. 
Here is a demonstration:</p>\n\n<pre><code class=\"ocaml\">(*Will fail immediately after you hit enter*)\nlet y = 10 / 0;;\n\n(*Will not fail but return a thunk*)\nlet f_div0() = 10 / 0;;\n\n(*Now fail*)\nf_div0();;  \n</code></pre>\n\n<h2 id=\"perfectcomputationcapsule\">Perfect computation capsule</h2>\n\n<p>With the help of the above two features, thunk encapsulates computations without evaluation, in other words, thunk does not store the actual value; instead, it stores the way of how the value would be computed. And the actual compuations will be carried out only when you decide so. Here is a case that desires thunk:</p>\n\n<pre><code class=\"ocaml\">let print_prime prime x =  \n  if x then print_int prime\n  else print_int 0;; \n\n(*Assume find_nth_prime n is already somewhere*)\nprint_prime (find_nth_prime 1000000) false;;  \n</code></pre>\n\n<p><code>print_prime</code> prints the prime only when x is true. It sounds ok. But the next function application shows a flaw that it will not print the 1,000,000th prime, but anyway the prime will be computed. What a waste of cpu cycles!</p>\n\n<p>With thunk, it gets better:  </p>\n\n<pre><code class=\"ocaml\">let print_prime prime x =  \n  if x then print_int (prime()) (*called when needed*)\n  else print_int 0;;\n\n(*Create a thunk*)\nlet millionth_prime() = find_nth_prime 1000000;;\n\nprint_prime millionth_prime false;;  \n</code></pre>\n\n<p>Now, <code>print_prime</code> takes a thunk and we also create a capsule for the missionth prime. The thunk will be evaluated only before printing.</p>\n\n<h1 id=\"importantusages\">Important Usages</h1>\n\n<p>Thunk is the fundamental element for several classic usages:</p>\n\n<ul>\n<li><strong>stream_list</strong>: element is produced when being retrieved</li>\n<li><strong>lazy</strong>: evaluated only when needed and only once</li>\n<li><strong>async</strong>: concurrent computing, a kind of queue with lots scheduled computations. 
</li>\n</ul>\n\n<p>The above three and the important roles of thunks will be presented in details in future posts.</p>\n\n<hr>\n\n<p><em>To be continued</em></p>";;
              val s : string =
  "<p><img src=\"http://typeocaml.com/content/images/2014/11/thunk.jpg#hero\" alt=\"thunk\">\nA th…"
# let h = Cow.Html.of_string s;;
Exception: Parsing.Parse_error.

Rename fields when reading from json?

In my use case I have a JSON field whose name contains a - and I'd like cow to convert it to the corresponding record field, which has an underscore instead. meta_conv seems to do this with its as directive.
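
Until something like that exists, the renaming can be done on the key/value pairs before they reach the generated reader. A minimal sketch, with hypothetical helper names that are not part of Cow:

```ocaml
(* Hypothetical helper: turn a JSON key into an OCaml-friendly field
   name by replacing '-' with '_'. *)
let field_of_key (key : string) : string =
  String.map (fun c -> if c = '-' then '_' else c) key

(* Apply the renaming to every key of an object's association list. *)
let rename_keys (fields : (string * 'a) list) : (string * 'a) list =
  List.map (fun (k, v) -> (field_of_key k, v)) fields
```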

License?

The documentation does not indicate the license for this library. Most of the sources seem to use a MIT-like license, but others such as json.ml claim to be LGPL 2.1 (+LE). Is it possible to have a canonical license covering the entire library?

html quotation should eat the right-most space

We are currently forced to introduce a final space:

  <:html< <a class="$str:cl$" href="$str:href$">$str:text$</a> >>

In this case, we can eat the last space or newline, and if you really want a space at the end, then introduce a double space (suggestion from @lpw25).
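
The suggested rule, sketched as a plain string function: drop exactly one trailing whitespace character, so a double space still leaves one behind. `drop_last_space` is a hypothetical name and says nothing about where the quotation expander would apply it.

```ocaml
(* Hypothetical helper: remove at most one trailing whitespace
   character (space, tab, newline or carriage return). *)
let drop_last_space (s : string) : string =
  let n = String.length s in
  let is_space = function ' ' | '\t' | '\n' | '\r' -> true | _ -> false in
  if n > 0 && is_space s.[n - 1] then String.sub s 0 (n - 1) else s
```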

LICENSE missing?

I am interested in porting this software to Fan, but don't see any LICENSE yet. Thanks

De-indentation option

Serialization functions should offer the capability to strip consistent leading whitespace from every line.
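
A sketch of what such an option could do, written as a standalone function on strings. `dedent` is a hypothetical name; the sketch counts only spaces as indentation and replaces whitespace-only lines with empty ones.

```ocaml
(* Number of leading space characters on a line. *)
let leading_spaces (line : string) : int =
  let n = String.length line in
  let rec go i = if i < n && line.[i] = ' ' then go (i + 1) else i in
  go 0

let is_blank (line : string) : bool = String.trim line = ""

(* Hypothetical helper: strip the indentation shared by all non-blank
   lines. Blank lines do not count towards the common prefix. *)
let dedent (text : string) : string =
  let lines = String.split_on_char '\n' text in
  let common =
    List.fold_left
      (fun acc l -> if is_blank l then acc else min acc (leading_spaces l))
      max_int lines
  in
  let strip l =
    if is_blank l then ""
    else String.sub l common (String.length l - common)
  in
  String.concat "\n" (List.map strip lines)
```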

br

Should br just have the type val br : t?

Bug in parsing of integer literals

It seems to parse 300_000_000 as multiple units, so 300 gets highlighted as a different color than the rest of it when it's syntax-highlighted.
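
A sketch of the scanning rule that would keep such a literal in one piece: after the first digit, accept digits and underscores alike. `int_literal_length` is a hypothetical helper covering decimal literals only (no hex prefixes or `l`/`L`/`n` suffixes), not the highlighter's actual lexer.

```ocaml
let is_digit (c : char) : bool = c >= '0' && c <= '9'

(* Hypothetical helper: length of the decimal integer literal starting
   at [pos], or 0 if there is none. [pos] must be non-negative. *)
let int_literal_length (s : string) (pos : int) : int =
  let n = String.length s in
  if pos >= n || not (is_digit s.[pos]) then 0
  else
    let rec go i =
      if i < n && (is_digit s.[i] || s.[i] = '_') then go (i + 1) else i
    in
    go pos - pos
```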

alist and attrs antiquot expander wrongly typed

These antiquotation expanders are ... -> ((string * 'a) * 'b) list when they should be ... -> Xml.signal list in <syntax/xml/quotation.ml>. Documentation/expected output would also help. If you fix this, uncomment and correct tests in <tests/render.ml>.

HTML output should be Polyglot

Cow currently violates the Polyglot http://www.w3.org/TR/html-polyglot/ spec in a number of ways. Most importantly:

  • No doctype
  • No xmlns
  • Unnecessary inclusion of XML declaration
  • No case normalization
  • Collapses elements without EMPTY content model
  • Other minor DOM additions
  • No <script> handling
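
The "collapses elements" and "case normalization" points can be illustrated with a small serialization rule: only void elements are written in self-closing form, everything else gets an explicit end tag, and tag names are lower-cased. This is a sketch with hypothetical names; the void-element list is the commonly cited HTML one and may not match the Polyglot document entry for entry.

```ocaml
(* Elements that never have content and may be written as <tag/>. *)
let void_elements =
  [ "area"; "base"; "br"; "col"; "embed"; "hr"; "img"; "input";
    "link"; "meta"; "param"; "source"; "track"; "wbr" ]

(* Hypothetical helper: render an element that has no children and no
   attributes. *)
let render_empty (tag : string) : string =
  let tag = String.lowercase_ascii tag in
  if List.mem tag void_elements then Printf.sprintf "<%s/>" tag
  else Printf.sprintf "<%s></%s>" tag tag
```

Under this rule an empty `script` or `div` is never collapsed, which addresses the content-model point from the list above.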

Change Json representation/Or use ezjsonm

I'm mostly a user of jsonm/ezjsonm for my general json parsing needs and only like to use cow for convenience. Unfortunately the json representation that cow uses is incompatible with ezjsonm/jsonm. I would much rather have cow just reuse ezjsonm's parsing code or at least present the same representation using polymorphic variants:

type json = 
  [ `Null | `Bool of bool | `Float of float| `String of string
  | `A of json list | `O of (string * json) list ]

I'd attempt to hack at this myself but this is a very breaking change so I'm not sure how interested the maintainers are.

yojson is also rather popular, but you can't please everybody. Yojson also has atd, which serves the same purpose as cow.
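
A less invasive alternative to changing the representation would be a conversion function at the boundary. The sketch below uses a hypothetical constructor-based type `cow_json` as a stand-in, since the exact definition in Cow is not shown here; only the polymorphic variant type comes from the text above.

```ocaml
(* Hypothetical stand-in for a constructor-based JSON type. *)
type cow_json =
  | Null
  | Bool of bool
  | Int of int64
  | Float of float
  | String of string
  | Array of cow_json list
  | Object of (string * cow_json) list

type json =
  [ `Null | `Bool of bool | `Float of float | `String of string
  | `A of json list | `O of (string * json) list ]

(* Convert to the polymorphic variant form. Integers become floats, so
   values beyond 2^53 lose precision. *)
let rec to_variant : cow_json -> json = function
  | Null -> `Null
  | Bool b -> `Bool b
  | Int i -> `Float (Int64.to_float i)
  | Float f -> `Float f
  | String s -> `String s
  | Array xs -> `A (List.map to_variant xs)
  | Object kvs -> `O (List.map (fun (k, v) -> (k, to_variant v)) kvs)
```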

Html.link should be Html.a

link is a different tag in HTML from a. As this value was just added by @Chris00, it shouldn't cause any breakage.

The link type and value html_of_link should also be renamed, eventually. These renamings will cause breakage in at least opam2web.

xml/html parsing and embedded markdown (mirage-decks)

see the email thread; basically, inline markdown as used by mirage-decks doesn't work with cow because there's no way to pass through invalid uses of xml significant characters (& < > etc). from the thread:
mort

it works for some of the markdown, but there are some things it doesn't help. basically, it stops cow barfing if there are characters like & < > when it parses the input; but cow then escapes these which means the js in the page that parses the markdown can't recognise them.

eg., <&> in a markdown fragment either causes cow to barf (parse error), or if surrounded by CDATA cow turns it into &lt;&amp;&gt; in the html returned to the browser, so the js then renders it as the corresponding string literal (in html, "&lt;&amp;&gt;") in a monospaced font.

the best i can think of is some way to explicitly tell cow to not parse a chunk of input and leave it "raw". or else i'll have to revert to serving fragments out of files and not use cow at all.

anil

I can't think of anything else immediately that might help, short of an HTML parser (which is more permissive about such things than an XML parser). Could you create an issue on https://github.com/mirage/ocaml-cow/issues about this so we don't forget?

...reverting to serving files for now.
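
The "leave it raw" idea from the thread, sketched on a toy node type. `Raw` is a hypothetical constructor, not something Cow provides; the point is only that escaping happens per node, so one kind of node can opt out.

```ocaml
type node =
  | Text of string  (* escaped on output *)
  | Raw of string   (* hypothetical: emitted verbatim *)

(* Escape the XML-significant characters mentioned in the thread. *)
let escape (s : string) : string =
  let b = Buffer.create (String.length s) in
  String.iter
    (function
      | '<' -> Buffer.add_string b "&lt;"
      | '>' -> Buffer.add_string b "&gt;"
      | '&' -> Buffer.add_string b "&amp;"
      | c -> Buffer.add_char b c)
    s;
  Buffer.contents b

let render : node -> string = function
  | Text s -> escape s
  | Raw s -> s
```

Here `Text "<&>"` reproduces the escaped form described above, while `Raw "<&>"` would reach the browser unchanged for the JavaScript to parse.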

Markdown Interface Too Simplified

Since abf78d5 and the switch to omd, the cow markdown interface has been too simple to support existing library users, e.g. opam2web.

Markdown.t is aliased to Html.t which destroys the markdown structure and makes Markdown.to_string lie (or at least hire a lawyer). Because the md AST isn't available, one can't use a Markdown.t to generate a ToC (and the Cow ToC function has now disappeared). To get the basic old features of Cow.Markdown, one now has to add omd as a dependency.

The present design means that Cow can't easily offer multiple HTML serializations of the same markdown which seems rather odd.

Perhaps this was all intended. It just seems odd (and slightly pointless) to me.

Support doctype declarations

With the following html file, Html.of_string fails.

<!DOCTYPE HTML>
<html>
  <head>
    <title>Test</title>
  </head>
  <body>
    <p>Lorem ipsum.</p>
  </body>
</html>

The error is:

[XMLM:1-8] <xxx><!DOCTYPE HTML>
<html>
  <head>
    <title>Test</title>
  </head>

  <body>
    <p>Lorem ipsum.</p>
  </body>
</html>
</xxx>: character sequence illegal here ("<!D")

Any ideas? This prevents cow from being used to read/write valid HTML 5.
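
As a stopgap, the declaration can be removed before the input is handed to the parser. A minimal sketch with hypothetical helper names; it assumes the doctype has no internal subset containing `>` and returns the input unchanged when no doctype is found.

```ocaml
(* Case-insensitive prefix test. *)
let starts_with_ci ~(prefix : string) (s : string) : bool =
  let p = String.length prefix in
  String.length s >= p
  && String.lowercase_ascii (String.sub s 0 p) = String.lowercase_ascii prefix

(* Hypothetical helper: drop a leading <!DOCTYPE ...> declaration,
   along with any whitespace before it. *)
let strip_doctype (s : string) : string =
  let n = String.length s in
  let is_space = function ' ' | '\t' | '\n' | '\r' -> true | _ -> false in
  let rec skip i = if i < n && is_space s.[i] then skip (i + 1) else i in
  let start = skip 0 in
  let rest = String.sub s start (n - start) in
  if starts_with_ci ~prefix:"<!doctype" rest then
    match String.index_opt rest '>' with
    | Some i -> String.sub rest (i + 1) (String.length rest - i - 1)
    | None -> s
  else s
```

The stripped declaration would need to be re-emitted on output for a full read/write round trip.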

Parse_error expanding anti-quotation within quotes

Here is the failing test case:

utop #  let mk_html foo = <:html< <id name="#$str:foo$"> "Hello, world!" </id> >>;;
val mk_html : bytes -> Cow.Xml.t = <fun> 
utop # mk_html "test";;
Exception: Parsing.Parse_error.

The problem is with anti-quotation within quotes. The desired behavior can be achieved with:

utop #  let mk_html foo = <:html< <id name=$str:"#"^foo$> "Hello, world!" </id> >>;;

It would help if the error message was more precise.

is the behavior of the ~figcaption argument to Html.figure correct?

> opam list cow 
[...]
cow.2.4.0  2.4.0       Caml on the Web

I get the following:

open Cow;;

Uri.of_string "foo.jpg"
|> Html.img
|> Html.figure ~figcaption:(Html.string "caption")
|> Html.to_string;;
- : string =
"<!DOCTYPE html>\n<figure><figcaption>caption<img src=\"foo.jpg\"/></figcaption></figure>"

Which doesn't seem to be what I'd expect.

Well, I guess the oddness is that really, figcaption is not an attribute, but a node.

(For what it is worth, I solved this for my use-case by defining

let figcaption text =
  Html.string text
  |> Html.tag "figcaption"

and then doing something like

let my_image = Uri.of_string "foo.jpg"
                 |> Html.img
in
Html.list [my_image; figcaption "caption"]
|> Html.figure
|> Html.to_string;;
- : string =
"<!DOCTYPE html>\n<figure><img src=\"foo.jpg\"/><figcaption>caption</figcaption></figure>"

I just thought I'd report the issue).

Incompatible with core

The Json module uses the List.mem_assoc function which core provides under List.Assoc.mem.

I'm not sure what the correct way to handle this problem is. For now I'm just wrapping cow code generation with open Caml and re-opening Core.Std, but I'm sure I will run into more issues if I start using more type_conv extensions.

For example: see the sample at: https://github.com/rgrinberg/opium#example

Cannot use labeled argument assignment in html quotations

This code fails

let tcol_in_row ?(classes="col-xs-12 col-md-6") cols =
  let inrow x = <:html< <div class=$str:classes$>$x$</div> >> in
  <:html< <div class="row">$list:List.map inrow cols$</div> >>

let narrow_in_row cols =
  <:html< $tcol_in_row ~classes:"col-xs-1" cols$ >>

with the message

[ERROR] tcol_in_row ~classes is not a valid tag.
Allowed tags are [opt|int|flo|str|uri|list|alist|attrs] or the empty one.
File "htmlutil.ml", line 9, characters 9-26:
While expanding quotation "html" in a position of "expr":
  Camlp4: Uncaught exception: Parsing.Parse_error

I'm sorry that I can't submit a patch; I am completely new to Camlp4.

BTW, I think that ocaml-cow (and the libraries it relies on) is great. I just finished using it to generate our new team website (http://parkas.di.ens.fr/). The generator (~1300 LoOC) slurps up a bunch of markdown and csv files and spits out html.

Somewhat opaque build procedure

In working on a patch (PR#82) I have run into strange compilation problems involving this error message:

Circular build detected (lib/cow.cmi already seen in [ lib/html.cmo; lib/cow.cmi; cow.all ])

I tried to investigate further, but a cursory examination of the Makefile shows that all of the actual compilation for OCaml-COW is done by the mysterious cmd executable which is already included in the repository. Is this something worth fixing? Is it even broken?

pa_cow: json

The generated code for json uses things like Json.String. Shouldn't it use fully qualified names such as Cow.Json.String? Are users really expected to open Cow?
