
bibtex-xml's Introduction

BibTeX-XML

This project started as an exam project for my course "Dokumentbeschreibungssprachen" (in English: markup languages). The task was to develop an XML language for describing BibTeX databases, convert a given large database into that language, and display it in a web browser using CSS and/or XSL Transformations.

Now that the project has been submitted, parts of it serve as material for my own software development training. The original database from the exam task remains in the repository as test data, but the repository structure may have changed by the time you read this, because a submission folder no longer has to be built.

Because this is only for my own training, this repository will implement things others may have done before (e.g. there is a bibtex package available for Haskell which is most likely much further developed than my version).

bibtex-xml's Issues

Add XSD insertion feature

Scope: bibtoxml

Currently, bibtoxml only outputs BibTeX-XML documents without any DTD or XML-Schema. Command line options should be
provided to add a customizable XSD to the result document.

.gitignores shadow the second directory level

In a few .gitignore files, negated paths are specified only one level deep, using a single asterisk. This unignores only the first level. To unignore all levels, we have to use a double asterisk: !the/path/**

Add DTD insertion feature

Scope: bibtoxml

Currently, bibtoxml only outputs BibTeX-XML documents without any DTD or XML-Schema. Command line options should be
provided to add a customizable DTD to the result document.

data Value should be a Monoid

Scope: bibtoxml/src/BibTeX/Types.hs

In BibTeX, values can be concatenated to form a new value. BibTeX.Types supports this feature, but the implementation can be more precise: if you look closely, you will see that the set of all possible values forms a monoid together with the concatenation operation defined on it. Here's the proof:

Value is the set of all possible values. Each element v of Value has the data type Value and is constructed using one of the three constructors LiteralValue, ReferencedValue, and ComposedValue.

Also, we define the expansion operation e of a value:

e :: Value -> String

where e v is the expanded string representation of v (i.e. the meaning of v as ordinary text, with variable references replaced by their definition recursively)

Third, we define the concatenation operation <+> as follows:

(<+>) :: Value -> Value -> Value

where r = v1 <+> v2 is defined such that e r == e v1 ++ e v2.

The Value type in BibTeX.Types is not the actual value of a field. Semantically, a field's value is equal to the expansion of its Value element, so the internal representation has to simulate the semantics of that expansion. This means that concatenating Value elements is semantically equivalent to concatenating their expansions.
Because of that, Values inherit all properties of Strings that are based on String concatenation (++). Strings form a monoid under concatenation, with the empty String as the identity element, and therefore so do Values (up to expansion), q.e.d.

It would be a good idea to adapt the Value type according to Monoid laws and create a corresponding instance of Monoid. This will make the properties of Value easier to see and understand and therefore improve code quality.
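As a rough illustration of what such an instance could look like, here is a minimal sketch. Only the three constructor names come from the text above; the field types, the helpers parts and fromParts, the Definitions type, and the expand function (standing in for e) are assumptions made for this example and may not match BibTeX.Types.

```haskell
-- Assumed shape of Value: the constructor names are from the issue text,
-- the field types are guesses made for this sketch.
data Value
  = LiteralValue String     -- plain text
  | ReferencedValue String  -- name of a variable defined elsewhere
  | ComposedValue [Value]   -- concatenation of several parts
  deriving (Eq, Show)

-- Hypothetical lookup table from variable names to their definitions.
type Definitions = [(String, Value)]

-- Flatten a value into its non-composed parts.
parts :: Value -> [Value]
parts (ComposedValue vs) = concatMap parts vs
parts v = [v]

-- Rebuild a value from parts, avoiding a wrapper around a single part.
fromParts :: [Value] -> Value
fromParts [v] = v
fromParts vs = ComposedValue vs

-- (<>) plays the role of <+>: it concatenates the flattened parts.
instance Semigroup Value where
  a <> b = fromParts (parts a ++ parts b)

-- The identity element is the empty composition, which expands to "".
instance Monoid Value where
  mempty = ComposedValue []

-- The expansion e. Unknown references expand to "" (an assumption),
-- and cyclic definitions are not detected.
expand :: Definitions -> Value -> String
expand _ (LiteralValue s) = s
expand defs (ReferencedValue name) = maybe "" (expand defs) (lookup name defs)
expand defs (ComposedValue vs) = concatMap (expand defs) vs
```

With this definition, expand defs (a <> b) equals expand defs a ++ expand defs b for all values. The monoid laws hold with structural equality for values built from LiteralValue, ReferencedValue, mempty, and <>; for hand-written nested ComposedValue terms they hold only up to expansion.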

Simplify the value parser

Scope: File bibtoxml/BibTeX/Parser.hs

Currently, the values are parsed using elements of type ValueParser, which are tuples of

  1. a predicate function which determines if the parser can be applied to an input stream
  2. the actual parser function which returns the raw parsed value as a String
  3. a constructor function to turn the raw String value into something of type Value

We can simplify this by specifying a single parser which combines the functions of elements one and two from above. This parser has type String -> Maybe (String, String). It evaluates to Just the raw value and the remaining input stream if it can be applied to the input stream, and to Nothing otherwise. Lazy evaluation will stop each parser as soon as it has been determined that this parser is not applicable.
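The proposed shape could be sketched roughly as follows. The names RawParser, quoted, number, and parseWith are hypothetical and not taken from Parser.hs, and the two example parsers are deliberately simplified (no escape handling, no nested braces).

```haskell
import Data.Char (isDigit)
import Data.Maybe (listToMaybe)

-- Combined predicate and parser: Nothing means "not applicable",
-- Just (raw, rest) is the raw value plus the remaining input.
type RawParser = String -> Maybe (String, String)

-- A double-quoted value; fails if there is no closing quote.
quoted :: RawParser
quoted ('"' : rest) = case break (== '"') rest of
  (raw, '"' : remaining) -> Just (raw, remaining)
  _ -> Nothing
quoted _ = Nothing

-- A non-empty run of digits.
number :: RawParser
number input = case span isDigit input of
  ("", _) -> Nothing
  (raw, remaining) -> Just (raw, remaining)

-- Try each (parser, constructor) pair in order and use the first match.
-- Laziness ensures that later parsers are not run once one has matched.
parseWith :: [(RawParser, String -> a)] -> String -> Maybe (a, String)
parseWith parsers input =
  listToMaybe
    [ (construct raw, remaining)
    | (parse, construct) <- parsers
    , Just (raw, remaining) <- [parse input]
    ]
```

The constructor function from the third tuple element stays separate here, so a call such as parseWith [(quoted, LiteralValue), ...] would still decide which Value constructor wraps the raw string.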

Implement TeX-like token parsing in values to interpret TeX control sequences

BibTeX's natural environment is TeX, and because of that, the values of a BibTeX database are very likely to contain TeX control sequences. Simple examples would be:

  • \"o for the o umlaut
  • \bf for bold font text
  • ,, for German opening quote marks (U+201E)

Clearly, these control sequences or active characters should be expanded before the user sees the end result. The question is when the expansion should actually be done. There are three possibilities:

  1. During parsing of the database to the internal data structure of the BibTeX library. This means that the BibTeX library must implement formatting information inside the values.
  2. During output of the database as XML. Then the XML document type (DTD or XSD) must define a format to describe formatting information.
  3. During XSL Transformation of the BibTeX-XML file to the HTML website. The XSLT then has to do some string parsing.
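Whichever stage is chosen, the character-level part of the expansion could look roughly like the following sketch. It only covers the umlaut accent and the ,, quote from the examples above; the names expandTeX and umlauts are hypothetical, the braced form \"{o} is not handled, and font switches such as \bf are left untouched because they need structured output rather than plain text.

```haskell
-- Replace a few TeX character sequences by their Unicode equivalents.
expandTeX :: String -> String
expandTeX ('\\' : '"' : c : rest)
  | Just u <- lookup c umlauts = u : expandTeX rest  -- \"o becomes o umlaut
expandTeX (',' : ',' : rest) = '\x201E' : expandTeX rest  -- ,, becomes U+201E
expandTeX (c : rest) = c : expandTeX rest  -- everything else is copied
expandTeX [] = []

-- Base letters that have a precomposed umlaut form.
umlauts :: [(Char, Char)]
umlauts =
  [ ('a', '\xE4'), ('o', '\xF6'), ('u', '\xFC')
  , ('A', '\xC4'), ('O', '\xD6'), ('U', '\xDC')
  ]
```

A function like this fits the first option directly. For the second and third options, the same mapping would have to be expressed in the XML output format or in the XSLT instead.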
