ceurws / ceur-make

A set of scripts to semi-automatically generate workshop proceedings for CEUR-WS.org

License: GNU General Public License v3.0

TeX 0.66% Makefile 12.79% Shell 8.70% XSLT 74.71% Perl 2.12% Dockerfile 1.03%

ceur-make's People

Contributors

arademaker, clange, csarven, wolfgangfahl

ceur-make's Issues

Adapt LaTeX sources to a uniform style

A request from user Pascal Fontaine:

I think the process can be further automated, by automatically compiling
(with the right page numbers, uniform title emphasis style,
letter format, CEUR footnotes, etc...), creating the zip, maybe checking
correspondence of titles and authors between TeX and Easychair (I know
and fix various little things). You could restrict the input style to a
few of them.

Script for validating submissions

Basic functionality to implement (sync with the latest version of the top mistakes list at http://ceur-ws.org/HOWTOSUBMIT.html#TOPERRORS):

  • index.html structure:
    • using outdated Vol-XXX template (check date stamp)
    • invalid HTML (feed through W3C service, or tidy)
    • encoding should be US-ASCII, with HTML entities for non-ASCII characters
    • authors not separated with comma (but, e.g., with “and”)
    • incomplete author names (comparing author/title with PDF would be too hard to automate)
    • inconsistent title capitalisation (e.g. compute ratio of capitalised letters for each title, warn about outliers)
    • only one submitting editor at the bottom
    • link to workshop must work
    • title must not be the Vol-XXX sample title
  • paper full-text:
    • copyright clauses (scan full-text PDF for the usual suspects: Springer, ACM, …)
  • file/directory structure of ZIP:
    • not in ZIP format
    • no metadata (.DS_Store, __MACOSX, .svn, .git)
    • PDF papers not in subdirectories but on top level

The script should output a command line for error-report: initially just with --error parameters for each error encountered, later also with arguments (e.g. the name of the erroneous paper file).
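
A minimal sketch of how a few of these checks could look in Python. It is not part of ceur-make: the error codes, the function names, the tolerance value and the exact argument format passed to error-report are all assumptions made for illustration. It also assumes that the ZIP rule means PDFs must sit in the same directory as index.html.

```python
import io
import shlex
import statistics
import zipfile

# Names taken from the checklist above; the error codes below are hypothetical.
METADATA_NAMES = {".DS_Store", "__MACOSX", ".svn", ".git"}


def check_zip(zip_bytes):
    """Return hypothetical error codes for the archive-level checks."""
    if not zipfile.is_zipfile(io.BytesIO(zip_bytes)):
        return ["not-zip"]
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        names = archive.namelist()
    errors = []
    if any(part in METADATA_NAMES for name in names for part in name.split("/")):
        errors.append("metadata-in-zip")
    # Assumption: PDFs must be in the same directory as index.html.
    index_dirs = {name.rpartition("/")[0] for name in names
                  if name.rpartition("/")[2] == "index.html"}
    pdf_dirs = {name.rpartition("/")[0] for name in names
                if name.lower().endswith(".pdf")}
    if index_dirs and not pdf_dirs <= index_dirs:
        errors.append("pdf-in-subdirectory")
    return errors


def is_us_ascii(raw_bytes):
    """True if the raw bytes of index.html decode as US-ASCII."""
    try:
        raw_bytes.decode("ascii")
    except UnicodeDecodeError:
        return False
    return True


def capitalisation_outliers(titles, tolerance=0.25):
    """Return titles whose share of capital letters is far from the median."""
    def ratio(title):
        letters = [c for c in title if c.isalpha()]
        return sum(c.isupper() for c in letters) / len(letters) if letters else 0.0

    ratios = [ratio(title) for title in titles]
    if not ratios:
        return []
    median = statistics.median(ratios)
    return [t for t, r in zip(titles, ratios) if abs(r - median) > tolerance]


def error_report_command(errors):
    """Build a command line; the '--error CODE' format is an assumption."""
    args = ["error-report"]
    for code in errors:
        args += ["--error", code]
    return shlex.join(args)
```

The tolerance of 0.25 is an arbitrary starting point and would need tuning against real volumes before the warning is useful.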

Consider having a copy of ceur-ws.css in the ceur-make directory

Jyrki Nummenmaa:

When I view the generated index.html file, it does not find the ceur-ws css file since the reference is relative. I do not think it would hurt to make the reference absolute in which case the file would automatically look ok with the right style.

Two possible solutions to make the relative link work:

  1. maintain a physical copy (or even the actual master version?) of ceur-ws.css in the ceur-make repository
  2. let the Makefile download the online ceur-ws.css.

Editors affiliation URL

Editors can have their own homepage URL. Should it also be possible to add the URLs of their affiliations (similar to workplaceHomepage)?

ceur-ws/paper-01.pdf should depend on creation of ceur-ws directory

The Makefile rule

ceur-ws/paper-01.pdf: ceur-ws ID

is re-executed whenever ceur-ws is newer than ceur-ws/paper-01.pdf. The timestamp of the directory ceur-ws gets updated whenever a directory entry is added/deleted/renamed.
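
The timestamp behaviour described here can be observed directly. The following is a small sketch that only demonstrates the file system effect and is not part of ceur-make. The function name is hypothetical, and the directory is backdated artificially so that the result does not depend on timestamp resolution.

```python
import os
import tempfile


def directory_mtime_changes_on_new_entry():
    """Return (mtime_before, mtime_after) around adding a file to a directory."""
    with tempfile.TemporaryDirectory() as workdir:
        target_dir = os.path.join(workdir, "ceur-ws")
        os.mkdir(target_dir)
        # Backdate the directory so the change is visible regardless of
        # the file system's timestamp resolution.
        os.utime(target_dir, (1_000_000_000, 1_000_000_000))
        before = os.path.getmtime(target_dir)
        # Adding a directory entry updates the directory's own mtime.
        with open(os.path.join(target_dir, "paper-01.pdf"), "wb") as handle:
            handle.write(b"%PDF-1.4\n")
        after = os.path.getmtime(target_dir)
    return before, after
```

Because the directory becomes newer each time a file is added to it, any target that lists ceur-ws as a normal prerequisite looks out of date to make as soon as another paper has been copied in.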

What we actually mean is that the directory ceur-ws should be created before ceur-ws/paper-01.pdf is.

This could be controlled by creating a hidden file ceur-ws/.directory in the same rule that creates the directory, and depending on that file instead. However, this file would then have to be excluded from the ZIP.

Support multi-session workshops

See @csarven's draft implementation in https://github.com/ceurws/ceur-make/blob/linked-research/toc2ceurindex.xsl#L188. This will require the toc.xml ad hoc schema to be extended (so maybe a task for @csarven and @clange to work on together).

From EasyChair we probably won't get session information. But we could extend the documentation of ceur-make as follows:

  1. use make toc.xml to generate toc.xml.
  2. manually add your session structure to toc.xml
  3. then use make to generate your index.html.

Document how to use EasyChair's frontmatter in the further workflow

Jyrki Nummenmaa:

The instructions related to toc.xml generated from EasyChair project do not particularly mention what to do with frontmatter. Maybe it is self-evident to the editors?

We could consider auto-generating a LaTeX preface.pdf and adding it to the table of contents.

Display name vs givenName and familyName

Currently we use the display name for each author. Should we move to givenName and familyName, or incorporate them in addition? This would mean that the toc needs fields for them. Another thing to investigate: does the EasyChair metadata make that distinction, or does it only provide a display name?

Parameterize make target names

Instead of ceur-ws/temp.bib, the Makefile should create a BibTeX file named after the workshop ID. Find out how the name of a make target can be parameterized, maybe using a second pass.

Let easychair2xml.pl write “command to create document” into toc.xml

make retex currently runs Perl on its own to find out the “command to create document” (actually just the main LaTeX source, not the command, but that's a separate issue) from each paper's README_EASYCHAIR file. This is something that easychair2xml.pl could easily do.

TODO fix this ticket to link to the relevant sources
TODO create ticket for the separate issue

index2main failure on Vol-2849

Michael Cochez reported:

I am publishing a CEUR volume, and wanted to use index2main. Now. It
appears the volume has been created using ceur-make and the script is
not able to extract the information from the index.html file. Is that
a known issue?

ceurws@mars:~/www/Vol-2849$ index2main index.html
line 12 column 7 - Error: is not recognized!
line 12 column 7 - Warning: discarding unexpected
line 31 column 7 - Warning: discarding unexpected
line 32 column 7 - Error: is not recognized!
line 32 column 7 - Warning: discarding unexpected
line 33 column 10 - Error: is not recognized!
line 33 column 10 - Warning: discarding unexpected
line 43 column 37 - Error: is not recognized!
line 43 column 37 - Warning: discarding unexpected
line 43 column 130 - Warning: discarding unexpected
line 43 column 141 - Error: is not recognized!
line 43 column 141 - Warning: discarding unexpected
line 43 column 229 - Warning: discarding unexpected
line 57 column 16 - Error: is not recognized!
-:1: parser error : Document is empty

I did now manually create the block for the homepage.

Makefile to validate RDFa against "unit tests"

ceur-make intends to output sane RDFa, i.e. RDFa that uses reasonable URIs for things and that is valid w.r.t. the vocabularies used.

However,

  • ceur-make might be wrong.
  • Some features (e.g. AUX papers) are not yet supported by ceur-make, and others (e.g. linking to machine-comprehensible FOAF user profiles of editors/authors who have them) require manual copy-editing of the RDFa if editors want them (those who do are usually technically experienced). Any such manual copy-editing might go wrong.
  • We won't be able to stop the bad practice of bypassing ceur-make and copy-editing existing volumes' index.html files.

An easy way of implementing this would be a combination of

  • some new rules in the Makefile
  • a shell script
  • pyRDFa (for obtaining RDF/XML)
  • SPARQL queries executed by the ARQ command-line tool

Custom page number offset in toc.xml

Depending on how one generates the proceedings volume from toc.xml (via toc.tex), the first paper may not start on page 1, because a title page, table of contents, etc. might precede it. As the page numbering in toc.xml (which is taken from EasyChair) propagates to ceur-ws/temp.bib as well as ceur-ws/index.html, it would make sense to be able to specify an offset (e.g. “5 pages”), which is respected when generating toc.xml.
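
The arithmetic involved is simple. As a sketch, assuming page ranges are available as (first, last) pairs (the function name and data shape are hypothetical and do not reflect the toc.xml schema):

```python
def shift_pages(page_ranges, offset):
    """Shift every (first, last) page range by a fixed number of pages."""
    if offset < 0:
        raise ValueError("offset must not be negative")
    return [(first + offset, last + offset) for first, last in page_ranges]


# With five pages of front matter, a paper on pages 1-8 moves to 6-13.
shifted = shift_pages([(1, 8), (9, 15)], 5)
```

Applying the offset once, at the point where toc.xml is generated, would keep temp.bib and index.html consistent with each other, since both derive their page numbers from toc.xml.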

Update XSL to current index file template (2020-07-09)

I believe the toc2ceurindex.xsl is at CEURVERSION=2015-12-02. The current version of Vol-XXX/index.html file is at CEURVERSION=2020-07-09.

Would it be possible to update the XSL file, please? Most new additions to CEUR-WS that I have seen follow the 2020 template, as requested by CEUR-WS ("Always use the latest template"). They were probably produced by manually editing the HTML template, so they do not benefit from your RDFa annotations, which is quite a pity.

Standardising editor's affiliation country format

The editor's affiliation country value is a free-form string, so it can hold anything, e.g. the full name of the country or an ISO 3166-1 alpha-2 code. It may be preferable to standardise the format. This would mean that values should be entered either as an ISO 3166-1 alpha-2 code, e.g. CA, or as a Wikipedia URL, e.g. http://en.wikipedia.org/wiki/Canada (which we can map to DBpedia during transformation, as we do for the location of the event). Both can be mapped in any case.
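
A rough sketch of such a mapping, relying on DBpedia resource names mirroring English Wikipedia article titles. The function name and the three-entry lookup table are illustrative assumptions. A real implementation would need the full ISO 3166-1 list and would live in the XSLT transformation rather than in Python.

```python
from urllib.parse import urlparse

# Tiny illustrative excerpt, not a complete ISO 3166-1 alpha-2 table.
ISO_TO_DBPEDIA_NAME = {"CA": "Canada", "DE": "Germany", "FI": "Finland"}

DBPEDIA_PREFIX = "http://dbpedia.org/resource/"


def country_to_dbpedia(value):
    """Map an ISO code or English Wikipedia URL to a DBpedia URI, else None."""
    value = value.strip()
    if len(value) == 2 and value.upper() in ISO_TO_DBPEDIA_NAME:
        return DBPEDIA_PREFIX + ISO_TO_DBPEDIA_NAME[value.upper()]
    parsed = urlparse(value)
    if parsed.netloc == "en.wikipedia.org" and parsed.path.startswith("/wiki/"):
        return DBPEDIA_PREFIX + parsed.path[len("/wiki/"):]
    # Anything else (e.g. a free-form country name) is left for manual review.
    return None
```

Returning None for unrecognised values, rather than guessing, would let a validation step flag entries that do not follow either accepted format.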

Remove RDFa instruction comments from index.html; put such documentation elsewhere

Requirements:

  • the generated index.html should be free from XML comments
  • but there should be clear documentation of how to write correct RDFa annotations.

Better alternatives than mere documentation:

  1. provide a script to strip such comments (see #12)
  2. add support for all RDFa annotations to toc.xml and workshop.xml.

Upon releasing this change, manually strip such comments from existing volumes.
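
Stripping the comments could be sketched as follows. This is a deliberately naive, regex-based illustration with a hypothetical function name, not the script referred to above. It would also remove comments inside script or style blocks, so real index.html files would need a closer look.

```python
import re

# Non-greedy match so that each comment is removed separately,
# including comments that span several lines.
COMMENT_PATTERN = re.compile(r"<!--.*?-->", re.DOTALL)


def strip_comments(html):
    """Remove all XML/HTML comments from the given markup."""
    return COMMENT_PATTERN.sub("", html)
```

The same function could be run over existing volumes' index.html files as a one-off clean-up, with the output diffed against the original before anything is replaced.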

$pdf left in index.html file

After running make, the link for each paper in index.html was, for some reason, "$pdf" instead of "paper-01.pdf" etc.

Reconsider use of bibo:presentedAt

We currently say that a proceedings volume was "presented at" a workshop event. However, wasn't it rather the individual papers that were presented at the workshop?
