mirage / ocaml-cow

Caml on the Web (COW) is a set of parsers and syntax extensions to let you manipulate HTML, CSS, XML, JSON and Markdown directly from OCaml code.

Home Page: http://www.openmirage.org/

License: Other

OCaml 99.87% Makefile 0.13%

ocaml-cow's Introduction

Writing web applications requires a lot of skills: HTML, XML, JSON and Markdown, to name but a few! This library provides OCaml combinators for these web formats.

See more explanation at: http://mirage.github.io/ocaml-cow

This library is in beta, and full documentation is still being written. Some repositories which use it include:

Mirage website: http://github.com/mirage/mirage-www
Opam2web: http://github.com/OCamlpro/opam2web

ocaml-cow's Issues

pa_css: doesn't handle empty fields

 <:css<
  pre.verbatim, pre.codepre { }
 >>

Error: While expanding quotation "css" in a position of "str_item": Camlp4: Uncaught exception: Parsing.Parse_error

Fix attrs antiquotation expander

The attrs antiquotation expander is presently very brittle as demonstrated in 0d72af0.

Specifically, attrs' whitespace handling is very broken. Attributes may be separated by many whitespace characters with different char codes. Attribute values may contain whitespace. Attribute key-values may have whitespace between their tokens and '='.

Can't embed attributes with `:` in them

Specifically trying to add xml:base into Atom for relative links

<:xml<<a xml:foo="bar"></a>&>>;;
- : Cow.Xml.t = [`El ((("", "a"), [(("", "foo"), "bar")]), [])]

The xml:foo turns into foo.
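
The behaviour the reporter expects can be illustrated with a small sketch that keeps the prefix instead of dropping it. `split_qname` is a hypothetical helper, not part of Cow, and it only separates the prefix from the local name; resolving the prefix to a namespace URI is left out.

```ocaml
(* Hypothetical helper, not part of Cow: split "xml:foo" into a
   (prefix, local) pair instead of discarding the prefix. *)
let split_qname (name : string) : string * string =
  match String.index_opt name ':' with
  | None -> ("", name)
  | Some i ->
    let prefix = String.sub name 0 i in
    let local = String.sub name (i + 1) (String.length name - i - 1) in
    (prefix, local)

let () =
  let prefix, local = split_qname "xml:foo" in
  Printf.printf "prefix=%S local=%S\n" prefix local
```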

Print json prettier

It would be nice to have some control over how json was converted to a string.

For example, it would be useful to be able to print the json with some newlines in it so that it is human readable.
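
A rough sketch of what such a printer could look like, assuming a stand-in `json` type defined here for illustration (Cow's own representation may differ). The `pretty` function and its `indent` parameter are hypothetical names.

```ocaml
(* Stand-in JSON type for illustration; Cow's own representation may differ. *)
type json =
  | Null
  | Bool of bool
  | Float of float
  | String of string
  | Array of json list
  | Object of (string * json) list

(* Render with one member per line, indenting nested values by two spaces.
   OCaml's %S escaping is used as an approximation of JSON string escaping. *)
let rec pretty ?(indent = 0) (j : json) : string =
  let pad n = String.make n ' ' in
  match j with
  | Null -> "null"
  | Bool b -> string_of_bool b
  | Float f -> Printf.sprintf "%g" f
  | String s -> Printf.sprintf "%S" s
  | Array [] -> "[]"
  | Object [] -> "{}"
  | Array items ->
    let item v = pad (indent + 2) ^ pretty ~indent:(indent + 2) v in
    "[\n" ^ String.concat ",\n" (List.map item items) ^ "\n" ^ pad indent ^ "]"
  | Object fields ->
    let field (k, v) =
      Printf.sprintf "%s%S: %s" (pad (indent + 2)) k
        (pretty ~indent:(indent + 2) v)
    in
    "{\n" ^ String.concat ",\n" (List.map field fields) ^ "\n" ^ pad indent ^ "}"

let () = print_endline (pretty (Object [ ("a", Array [ Float 1.; Null ]) ]))
```

Taking the indentation as an optional argument keeps the default call short while letting callers nest the output inside other text.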

Would be great to have sexp convertors for all the types

I know it's highly unlikely that we can convince some upstream packages to do this but it's very valuable to me (and I'm sure other users)

Why is re a direct dependency?

I've had a look and it doesn't seem like re is being used anywhere. I see the syntax extension is using str which is fine but otherwise no uses of re.

@avsm @samoht please confirm/deny?

code output incompatible with Core

If Core is opened, then our List.flatten is unbound. Apparently Core exposes a module which can be opened locally to get the original stdlib back, but we'll still need a command line flag for full compatibility. Is there some ocamlfind predicate which might help with hiding this, I wonder...
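
The shadowing problem can be reproduced without Core. The sketch below uses a made-up `Shadowing` module as a stand-in for a library that replaces `List` when opened, and assumes OCaml 4.07 or later, where the standard library is reachable as `Stdlib`. Whether generated code could rely on a fully qualified path in the compiler versions Cow targeted is not something the issue settles.

```ocaml
(* Stand-in for a library that shadows the standard List module when
   opened, as the issue describes for Core. *)
module Shadowing = struct
  module List = struct
    let concat = Stdlib.List.concat
  end
end

open Shadowing

(* The shadowed module is what an unqualified [List] now refers to. *)
let joined = List.concat [ [ 1 ]; [ 2 ] ]

(* [List.flatten] would be unbound here, but the fully qualified path
   still resolves. *)
let flat = Stdlib.List.flatten [ [ 1; 2 ]; [ 3 ] ]
```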

Using characters “ ” produces a faulty output.

To reproduce, just try to use “ or ” in a page.

missing html combinators

at least:

code
blockquote
q
s

`make tests` fails

ocamlfind ocamlc -package oUnit -linkpkg -I /home/dsheets/.opam/4.02.1/lib/dyntype dyntype.cma -I /home/dsheets/.opam/4.02.1/lib/ulex ulexing.cma -I /home/dsheets/.opam/4.02.1/lib/ocaml unix.cma -I /home/dsheets/.opam/4.02.1/lib/oUnit oUnitAdvanced.cma -I /home/dsheets/.opam/4.02.1/lib/oUnit oUnit.cma -I /home/dsheets/.opam/4.02.1/lib/re re.cma -I /home/dsheets/.opam/4.02.1/lib/re re_posix.cma -I /home/dsheets/.opam/4.02.1/lib/stringext stringext.cma -I /home/dsheets/.opam/4.02.1/lib/ocaml bigarray.cma -I /home/dsheets/.opam/4.02.1/lib/sexplib sexplib.cma -I /home/dsheets/.opam/4.02.1/lib/uri uri.cma -I /home/dsheets/.opam/4.02.1/lib/xmlm xmlm.cma -I /home/dsheets/.opam/4.02.1/lib/uutf uutf.cma -I /home/dsheets/.opam/4.02.1/lib/jsonm jsonm.cma -I /home/dsheets/.opam/4.02.1/lib/hex hex.cma -I /home/dsheets/.opam/4.02.1/lib/ezjsonm ezjsonm.cma -I /home/dsheets/.opam/4.02.1/lib/omd omd.cma -I ../_build/lib cow.cma render.cmo extension.cmo\
  test.ml -o test
File "/home/dsheets/.opam/4.02.1/lib/ocaml/unix.cma(Unix)", line 1:
Warning 31: files /home/dsheets/.opam/4.02.1/lib/ocaml/unix.cma(Unix) and /home/dsheets/.opam/4.02.1/lib/ocaml/unix.cma(Unix) both define a module named Unix
File "/home/dsheets/.opam/4.02.1/lib/ocaml/unix.cma(UnixLabels)", line 1:
Warning 31: files /home/dsheets/.opam/4.02.1/lib/ocaml/unix.cma(UnixLabels) and /home/dsheets/.opam/4.02.1/lib/ocaml/unix.cma(UnixLabels) both define a module named UnixLabels
make[1]: Leaving directory '/home/dsheets/Code/ocaml-cow/tests'
tests/render
make: tests/render: Command not found
Makefile:15: recipe for target 'tests' failed
make: *** [tests] Error 127

This issue was introduced after 1.2.0 in 8f271fd and 587b947, @samoht.

Json marshalling does not seem to work properly with type containing option

The following basic example:

open Cow
type data = {
output : string option;
errmsg : string option;
} with json

fails to compile with:

File "test.ml", line 3, characters 5-72:
Error: This expression has type 'a list
but an expression was expected of type string * Cow.Json.t

The error appears with cow-0.5.2 and ocaml-4.0.1. Thanks in advance.
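
For comparison, a hand-written encoder for this record might look like the sketch below. The `json` type is a stand-in defined here, not Cow's actual type, and omitting `None` fields is an assumption about the intended encoding. Each optional field contributes a list of zero or one members, which is the kind of list-versus-pair distinction the error message points at.

```ocaml
(* Stand-in JSON type; not Cow's actual representation. *)
type json =
  | Null
  | String of string
  | Object of (string * json) list

type data = {
  output : string option;
  errmsg : string option;
}

(* Encode the record, leaving out fields whose value is [None]. *)
let json_of_data (d : data) : json =
  let opt key = function
    | None -> []
    | Some s -> [ (key, String s) ]
  in
  Object (opt "output" d.output @ opt "errmsg" d.errmsg)
```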

Html parser is not robust enough

Example:

# let s = "<p><img src=\"http://typeocaml.com/content/images/2014/11/thunk.jpg#hero\" alt=\"thunk\">\nA thunk is simply a function with the <em>unit</em> parameter. For example:</p>\n\n<pre><code class=\"ocaml\">let f() = 1 + 2 * 3;;  \n</code></pre>\n\n<h1 id=\"features\">Features</h1>\n\n<p>It is indeed a function and seems simple enough. However, don't overlook it. Not having any parameters makes it so special that it even has such a particular name, <em>thunk</em>. Let's have a look at its features. </p>\n\n<h2 id=\"determinedresult\">Determined result</h2>\n\n<p>If a function has parameters, the result of its application may depend on its <a href=\"http://stackoverflow.com/questions/156767/whats-the-difference-between-an-argument-and-a-parameter\">arguments</a>. For example:</p>\n\n<pre><code class=\"ocaml\">let g x y = x + y;;  \n</code></pre>\n\n<p>Without giving values for x and y, we won't be able to know what concrete value that g would give back to us.</p>\n\n<p>Thunk, however, does not have any parameters, which means its result must be fixed. After defining it, the body will not be affected by anything any more. It is like a sealed box: we may not know what's inside yet, but we know that once it is there, its content won't change. For the <code>f</code> at the beginning, we know its value will definitely be 7. For <code>let f1() = h 2 (h 3 4)</code>, we may not know the value of its body but we are sure its body is two applications of <code>h</code> and it won't change.</p>\n\n<h2 id=\"laterevaluation\">Later evaluation</h2>\n\n<p>Binding also has determined result and it is immediately evaluated after we define it. For example, <code>let x = 1 + 2 * 3</code> will return you x with 7 at once. </p>\n\n<p>Thunk is different. When we hold a thunk, we must call it like <code>f()</code> to get its result. The evaluation is a kind of delayed, but we have the full control. 
Here is a demonstration:</p>\n\n<pre><code class=\"ocaml\">(*Will fail immediately after you hit enter*)\nlet y = 10 / 0;;\n\n(*Will not fail but return a thunk*)\nlet f_div0() = 10 / 0;;\n\n(*Now fail*)\nf_div0();;  \n</code></pre>\n\n<h2 id=\"perfectcomputationcapsule\">Perfect computation capsule</h2>\n\n<p>With the help of the above two features, thunk encapsulates computations without evaluation, in other words, thunk does not store the actual value; instead, it stores the way of how the value would be computed. And the actual compuations will be carried out only when you decide so. Here is a case that desires thunk:</p>\n\n<pre><code class=\"ocaml\">let print_prime prime x =  \n  if x then print_int prime\n  else print_int 0;; \n\n(*Assume find_nth_prime n is already somewhere*)\nprint_prime (find_nth_prime 1000000) false;;  \n</code></pre>\n\n<p><code>print_prime</code> prints the prime only when x is true. It sounds ok. But the next function application shows a flaw that it will not print the 1,000,000th prime, but anyway the prime will be computed. What a waste of cpu cycles!</p>\n\n<p>With thunk, it gets better:  </p>\n\n<pre><code class=\"ocaml\">let print_prime prime x =  \n  if x then print_int (prime()) (*called when needed*)\n  else print_int 0;;\n\n(*Create a thunk*)\nlet millionth_prime() = find_nth_prime 1000000;;\n\nprint_prime millionth_prime false;;  \n</code></pre>\n\n<p>Now, <code>print_prime</code> takes a thunk and we also create a capsule for the missionth prime. The thunk will be evaluated only before printing.</p>\n\n<h1 id=\"importantusages\">Important Usages</h1>\n\n<p>Thunk is the fundamental element for several classic usages:</p>\n\n<ul>\n<li><strong>stream_list</strong>: element is produced when being retrieved</li>\n<li><strong>lazy</strong>: evaluated only when needed and only once</li>\n<li><strong>async</strong>: concurrent computing, a kind of queue with lots scheduled computations. 
</li>\n</ul>\n\n<p>The above three and the important roles of thunks will be presented in details in future posts.</p>\n\n<hr>\n\n<p><em>To be continued</em></p>";;
              val s : string =
  "<p><img src=\"http://typeocaml.com/content/images/2014/11/thunk.jpg#hero\" alt=\"thunk\">\nA th…"
# let h = Cow.Html.of_string s;;
Exception: Parsing.Parse_error.

cow 2.2.0 to 2.3.0 made breaking API changes

Although they were simple to fix, it was not clear at first what changed.
Please consider bumping the major version when this is the case.

Rename fields when reading from json?

In my use case I have a json field that contains a - and I'd like cow to convert it to the appropriate field that has an underscore instead. meta_conv seems to do this with its as directive.
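
Until something like that exists, the mapping can be done by hand. A minimal sketch, with hypothetical helper names that are not part of Cow or meta_conv:

```ocaml
(* Hypothetical helper: map a JSON key such as "content-type" to a valid
   OCaml record field name by replacing '-' with '_'. *)
let field_of_key (key : string) : string =
  String.map (fun c -> if c = '-' then '_' else c) key

(* Look up an OCaml field name in an association list of JSON members,
   matching keys after normalisation. *)
let find_field (field : string) (members : (string * 'a) list) : 'a option =
  List.find_opt (fun (k, _) -> field_of_key k = field) members
  |> Option.map snd
```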

License?

The documentation does not indicate the license for this library. Most of the sources seem to use an MIT-like license, but others such as json.ml claim to be LGPL 2.1 (+LE). Is it possible to have a canonical license covering the entire library?

html quotation should eat the right-most space

We are currently forced to introduce a final space:

  <:html< <a class="$str:cl$" href="$str:href$">$str:text$</a> >>

In this case, we can eat the last space or newline, and if you really want a space at the end, then introduce a double space (suggestion from @lpw25).
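
The suggested rule is small enough to sketch as a string function. `eat_last_space` is a hypothetical name, and where such a step would sit inside the quotation expander is not addressed here.

```ocaml
(* Hypothetical post-processing step: drop exactly one trailing space or
   newline from the quotation body, leaving any others in place. *)
let eat_last_space (s : string) : string =
  let n = String.length s in
  if n > 0 && (s.[n - 1] = ' ' || s.[n - 1] = '\n')
  then String.sub s 0 (n - 1)
  else s
```

With this rule a body ending in two spaces keeps one of them, which matches the "double space" escape hatch described above.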

CSS Location and Camlp4 in 4.02.0 conflict

... but only on OS X?! Needs investigation before the 4.02 release. See ocaml/opam-repository#2466 for more info.

LICENSE missing?

I am interested in porting this software to Fan, but don't see any LICENSE yet. Thanks

De-indentation option

Serialization functions should offer the capability to strip consistent leading whitespace from every line.
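
One possible reading of "consistent leading whitespace" is the indentation shared by all non-blank lines. A sketch under that assumption, with hypothetical function names and each space or tab counted as one column:

```ocaml
(* Number of leading spaces or tabs on a line. *)
let leading_ws (line : string) : int =
  let n = String.length line in
  let rec go i =
    if i < n && (line.[i] = ' ' || line.[i] = '\t') then go (i + 1) else i
  in
  go 0

(* Strip the indentation common to all non-blank lines; blank lines are
   emitted as empty lines. *)
let deindent (s : string) : string =
  let lines = String.split_on_char '\n' s in
  let is_blank l = String.trim l = "" in
  let common =
    List.fold_left
      (fun acc l -> if is_blank l then acc else min acc (leading_ws l))
      max_int lines
  in
  let strip l =
    if is_blank l then ""
    else String.sub l common (String.length l - common)
  in
  String.concat "\n" (List.map strip lines)
```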

br

Should br just have the type val br : t?

remove runtime errors

Seems that you can have runtime errors when you are using Cow. This is certainly not expected and should be fixed.

See for instance ocamllabs/opam-doc@2b6d1f9

Bug in parsing of integer literals

It seems to parse 300_000_000 as multiple units, so 300 gets highlighted as a different color than the rest of it when it's syntax-highlighted.
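
A highlighter would need to treat the underscores as part of the literal. The sketch below shows the scanning rule in isolation; `int_literal_length` is a hypothetical function, not taken from Cow's highlighter, and it only covers decimal literals.

```ocaml
(* Length of the decimal integer literal starting at [pos], treating '_'
   as part of the literal once a digit has been seen. Returns 0 if no
   literal starts there. *)
let int_literal_length (s : string) (pos : int) : int =
  let n = String.length s in
  let is_digit c = c >= '0' && c <= '9' in
  if pos >= n || not (is_digit s.[pos]) then 0
  else
    let rec go i =
      if i < n && (is_digit s.[i] || s.[i] = '_') then go (i + 1) else i
    in
    go pos - pos
```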

alist and attrs antiquot expander wrongly typed

These antiquotation expanders are ... -> ((string * 'a) * 'b) list when they should be ... -> Xml.signal list in <syntax/xml/quotation.ml>. Documentation/expected output would also help. If you fix this, uncomment and correct tests in <tests/render.ml>.

HTML output should be Polyglot

Cow currently violates the Polyglot http://www.w3.org/TR/html-polyglot/ spec in a number of ways. Most importantly:

Release a version compatible with omd 1.0.0

Release a version compatible with omd 1.0.0.

Change Json representation/Or use ezjsonm

I'm mostly a user of jsonm/ezjsonm for my general json parsing needs and only like to use cow for convenience. Unfortunately the json representation that cow uses is incompatible with ezjsonm/jsonm. I would much rather have cow just reuse ezjsonm's parsing code or at least present the same representation using polymorphic variants:

type json = 
  [ `Null | `Bool of bool | `Float of float| `String of string
  | `A of json list | `O of (string * json) list ]

I'd attempt to hack at this myself but this is a very breaking change so I'm not sure how interested the maintainers are.

yojson is also rather popular, but you can't please everybody. Yojson also has atd, which serves the same purpose as cow.

Html.link should be Html.a

link is a different tag in HTML from a. As this value was just added by @Chris00, it shouldn't cause any breakage.

The link type and value html_of_link should also be renamed, eventually. These renamings will cause breakage in at least opam2web.

the META file contains reference to external archives

This line in META.in:

archive(syntax, preprocessor) = "xmlm.cma str.cma pa_cow.cma ezjsonm.cma"

is completely wrong. Why do we have xmlm.cma, str.cma and ezjsonm.cma there? This should be resolved by users of this library using the right combination of ocamlfind magic (see samoht/assemblage#119).

should install .mli files

Cloned from OCamlPro/opam-repository#637

xml/html parsing and embedded markdown (mirage-decks)

see the email thread; basically, inline markdown as used by mirage-decks doesn't work with cow because there's no way to pass through invalid uses of xml significant characters (& < > etc). from the thread:

mort

it works for some of the markdown, but there are some things it doesn't help. basically, it stops cow barfing if there are characters like & < > when it parses the input; but cow then escapes these which means the js in the page that parses the markdown can't recognise them.

eg., <&> in a markdown fragment either causes cow to barf (parse error), or if surrounded by CDATA cow turns it into &lt;&amp;&gt; in the html returned to the browser, so the js then renders it as the corresponding string literal (in html, "<&>") in a monospaced font.

the best i can think of is some way to explicitly tell cow to not parse a chunk of input and leave it "raw". or else i'll have to revert to serving fragments out of files and not use cow at all.

anil

I can't think of anything else immediately that might help, short of an HTML parser (which is more permissive about such things than an XML parser). Could you create an issue on https://github.com/mirage/ocaml-cow/issues about this so we don't forget?

...reverting to serving files for now.

Markdown Interface Too Simplified

Since abf78d5 and the switch to omd, the cow markdown interface has been too simple to support existing library users, e.g. opam2web.

Markdown.t is aliased to Html.t which destroys the markdown structure and makes Markdown.to_string lie (or at least hire a lawyer). Because the md AST isn't available, one can't use a Markdown.t to generate a ToC (and the Cow ToC function has now disappeared). To get the basic old features of Cow.Markdown, one now has to add omd as a dependency.

The present design means that Cow can't easily offer multiple HTML serializations of the same markdown which seems rather odd.

Perhaps this was all intended. It just seems odd (and slightly pointless) to me.

Separate of_json and json_of would be useful

It would be useful to have "of_json" and "json_of" as separate type-conv extensions.

type t = ...
with of_json

would only produce t_of_json (without json_of_t).
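
To make the request concrete, here is roughly what a reader-only extension would have to generate for a small record. Everything in this sketch is illustrative: the `json` type is a stand-in rather than Cow's representation, and the record and error messages are made up.

```ocaml
(* Stand-in JSON type; not Cow's actual representation. *)
type json =
  | Null
  | String of string
  | Object of (string * json) list

type t = { name : string; nick : string option }

(* A decoder with no matching encoder. Missing or null optional fields
   decode to [None]. *)
let t_of_json (j : json) : t =
  match j with
  | Object fields ->
    let name =
      match List.assoc_opt "name" fields with
      | Some (String s) -> s
      | _ -> failwith "t_of_json: expected string field \"name\""
    in
    let nick =
      match List.assoc_opt "nick" fields with
      | Some (String s) -> Some s
      | Some Null | None -> None
      | Some _ -> failwith "t_of_json: expected string or null field \"nick\""
    in
    { name; nick }
  | _ -> failwith "t_of_json: expected an object"
```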

Support doctype declarations

With the following html file, Html.of_string fails.

<!DOCTYPE HTML>
<html>
  <head>
    <title>Test</title>
  </head>
  <body>
    <p>Lorem ipsum.</p>
  </body>
</html>

The error is:

[XMLM:1-8] <xxx><!DOCTYPE HTML>
<html>
  <head>
    <title>Test</title>
  </head>

  <body>
    <p>Lorem ipsum.</p>
  </body>
</html>
</xxx>: character sequence illegal here ("<!D")

Any ideas? This prevents cow from being used to read/write valid HTML 5.
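
One workaround, sketched here as an assumption rather than a fix in Cow itself, is to peel off the doctype before handing the rest of the document to the parser. `split_doctype` is a hypothetical helper. It works on the trimmed input and does not handle a doctype with an internal subset containing `>`.

```ocaml
(* Hypothetical pre-processing step: if the trimmed input starts with a
   "<!DOCTYPE ...>" declaration (case-insensitive), return the declaration
   and the rest of the document separately. *)
let split_doctype (s : string) : string option * string =
  let t = String.trim s in
  let prefix = "<!doctype" in
  let plen = String.length prefix in
  let has_prefix =
    String.length t >= plen
    && String.lowercase_ascii (String.sub t 0 plen) = prefix
  in
  if not has_prefix then (None, t)
  else
    match String.index_opt t '>' with
    | None -> (None, t)
    | Some i ->
      let decl = String.sub t 0 (i + 1) in
      let rest = String.sub t (i + 1) (String.length t - i - 1) in
      (Some decl, String.trim rest)
```

The declaration is returned rather than discarded so that a serializer could write it back out unchanged.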

Parse_error expanding anti-quotation within quotes

Here is the failing test case:

utop #  let mk_html foo = <:html< <id name="#$str:foo$"> "Hello, world!" </id> >>;;
val mk_html : bytes -> Cow.Xml.t = <fun> 
utop # mk_html "test";;
Exception: Parsing.Parse_error.

The problem is with anti-quotation within quotes. The desired behavior can be achieved with:

utop #  let mk_html foo = <:html< <id name=$str:"#"^foo$> "Hello, world!" </id> >>;;

It would help if the error message was more precise.

URI expander

$uri:foo$

becomes

(Uri.to_string foo)

is the behavior of the ~figcaption argument to Html.figure correct?

> opam list cow 
[...]
cow.2.4.0  2.4.0       Caml on the Web

I get the following:

open Cow;;

Uri.of_string "foo.jpg"
|> Html.img
|> Html.figure ~figcaption:(Html.string "caption")
|> Html.to_string;;
- : string =
"<!DOCTYPE html>\n<figure><figcaption>caption<img src=\"foo.jpg\"/></figcaption></figure>"

Which doesn't seem to be what I'd expect.

Well, I guess the oddness is that really, figcaption is not an attribute, but a node.

(For what it is worth, I solved this for my use-case by defining

let figcaption text =
  Html.string text
  |> Html.tag "figcaption"

and then doing something like

let my_image = Uri.of_string "foo.jpg"
                 |> Html.img
in
Html.list [my_image; figcaption "caption"]
|> Html.figure
|> Html.to_string;;
- : string =
"<!DOCTYPE html>\n<figure><img src=\"foo.jpg\"/><figcaption>caption</figcaption></figure>"

I just thought I'd report the issue).

Better templating system

We should use the same templating system as in ohm: http://ohm-framework.com/tutorials/html

Incompatible with core

The Json module uses the List.mem_assoc function which core provides under List.Assoc.mem.

I'm not sure what's the correct way to handle this problem. For now I'm just wrapping cow code generation with open Caml and re-opening Core.Std but I'm sure I will run into more issues if I start using more type_conv extensions.

For example: see the sample at: https://github.com/rgrinberg/opium#example

Cannot use labeled argument assignment in html quotations

This code fails

let tcol_in_row ?(classes="col-xs-12 col-md-6") cols =
  let inrow x = <:html< <div class=$str:classes$>$x$</div> >> in
  <:html< <div class="row">$list:List.map inrow cols$</div> >>

let narrow_in_row cols =
  <:html< $tcol_in_row ~classes:"col-xs-1" cols$ >>

with the message

[ERROR] tcol_in_row ~classes is not a valid tag.
Allowed tags are [opt|int|flo|str|uri|list|alist|attrs] or the empty one.
File "htmlutil.ml", line 9, characters 9-26:
While expanding quotation "html" in a position of "expr":
  Camlp4: Uncaught exception: Parsing.Parse_error

I'm sorry that I can't submit a patch; I am completely new to Camlp4.

BTW, I think that ocaml-cow (and the libraries it relies on) is great. I just finished using it to generate our new team website (http://parkas.di.ens.fr/). The generator (~1300 LoOC) slurps up a bunch of markdown and csv files and spits out html.

Compatibility with the new ezjsonm

cc @samoht

Somewhat opaque build procedure

In working on a patch (PR#82) I have run into strange compilation problems involving this error message:

Circular build detected (lib/cow.cmi already seen in [ lib/html.cmo; lib/cow.cmi; cow.all ])

I tried to investigate further, but a cursory examination of the Makefile shows that all of the actual compilation for OCaml-COW is done by the mysterious cmd executable which is already included in the repository. Is this something worth fixing? Is it even broken?

'Xml.to_string' outputs two document declarations

e.g.:

# Cow.Xml.to_string <:xml<<root>a</root>&>>;;
- : string =
"<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n\n<root>a</root>
<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"

pa_cow: json

The generated code for json uses things like Json.String. Shouldn't it use fully qualified names such as Cow.Json.String? Are users really expected to open Cow?
