Support citation key aliases (manubot issue, closed)

manubot commented on May 17, 2024
Support citation key aliases

from manubot.

Comments (6)

dhimmel commented on May 17, 2024

> you use the column names tag and citation but both are citation keys, the first being used as alias for the second.

Good point. All of the following are probably "citation keys":

  1. tag:avasthi-preprints (user input of tag / alias / citekey)
  2. doi:10.7554/eLife.38532 (resolved / detagged / de-aliased)
  3. doi:10.7554/elife.38532 (standardized)
  4. pqBLIXzp (shortened, output citation key that is used in the processed document)
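The four stages listed above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the function names `resolve_alias`, `standardize`, and `shorten` are hypothetical, only DOIs are standardized (by lowercasing), and the short key is a truncated SHA-256 digest chosen for the example, so it will not reproduce `pqBLIXzp` or Manubot's actual hashing scheme.

```python
import hashlib

# Hypothetical alias table: user-facing tag -> identifier it points to.
ALIASES = {"tag:avasthi-preprints": "doi:10.7554/eLife.38532"}


def resolve_alias(citekey, aliases):
    """Stage 1 -> 2: replace an alias with the key it points to, if any."""
    return aliases.get(citekey, citekey)


def standardize(citekey):
    """Stage 2 -> 3: normalize the identifier (here, only DOIs are lowercased)."""
    prefix, sep, identifier = citekey.partition(":")
    if sep and prefix.lower() == "doi":
        return f"doi:{identifier.lower()}"
    return citekey


def shorten(citekey, length=8):
    """Stage 3 -> 4: derive a short, syntax-safe key (illustrative hash only)."""
    return hashlib.sha256(citekey.encode("utf-8")).hexdigest()[:length]


resolved = resolve_alias("tag:avasthi-preprints", ALIASES)
standardized = standardize(resolved)
short_key = shorten(standardized)
```

Because the short key is derived from the standardized form, `doi:10.7554/eLife.38532` and `doi:10.7554/elife.38532` end up with the same output key in this sketch.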

So then I see that citekeys is an appropriate name for a mapping of alias-to-key. However, it is not super specific because it does not indicate that it deals with the aliasing step. I guess perhaps that's implied in the mapping / dictionary data structure?

Anyways, thanks for the feedback. It's important we adopt the best terminology while the project is still young... so will continue thinking about this issue. Interested in @agitter's thoughts as well.

dhimmel commented on May 17, 2024

Thanks @nichtich for the suggestion. Here's where we're at regarding citation keys.

We have a similar concept called citation tags: manuscripts can define a tabular mapping of citation keys (tags) to the corresponding standard identifiers. See, for example, citation-tags.tsv for the Manubot software paper.
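As a rough sketch of how such a tabular mapping could be read, the snippet below loads tab-separated rows into a dictionary. The column names `tag` and `citation` are the ones mentioned later in this thread; the sample rows, the `tag:` prefix on the dictionary keys, and the function name `read_citation_tags` are assumptions made for illustration, not Manubot's actual implementation.

```python
import csv
import io

# Sample contents standing in for a citation-tags.tsv file (illustrative rows).
SAMPLE_TSV = (
    "tag\tcitation\n"
    "avasthi-preprints\tdoi:10.7554/eLife.38532\n"
    "steem-post\turl:https://goo.gl/jGBrxE\n"
)


def read_citation_tags(handle):
    """Read tag/citation rows into a dict keyed by 'tag:<name>'."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {f"tag:{row['tag']}": row["citation"] for row in reader}


tag_to_citation = read_citation_tags(io.StringIO(SAMPLE_TSV))
```

With a real file, `open("citation-tags.tsv", newline="")` would replace the `io.StringIO` wrapper.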

So the functionality we don't currently have is the ability to define citation keys/tags in a YAML frontmatter. However, this feature is planned as part of #99, which creates a pandoc filter for this package's citation processing. This PR would enable your final example (or something close to it).

There is a separate question of whether we should enable embedded-YAML citation tags for users that are not using the Pandoc filter. There is also a question of whether we should make the pandoc filter the only supported workflow in the future. The current obstacle is that Pandoc's citation syntax is too restrictive for many standard identifiers (especially URLs and ugly DOIs). Citation tags/keys work around forbidden characters in a citation identifier, but only if users know to use them, which likely happens only after processing of a citation has already failed.
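To make the restriction concrete, here is a hedged sketch of a check for whether an identifier could be written as a bare Pandoc citation key. The pattern loosely follows the Pandoc manual's description (start with a letter, digit, or underscore; alphanumerics with single internal punctuation characters). It is an ASCII-only approximation and an assumption, not Pandoc's parser, and the rule in the Pandoc version current when this thread was written may have differed. The name `is_bare_pandoc_key` is hypothetical.

```python
import re

# Approximation of Pandoc's bare citation key rule (assumed, ASCII only):
# runs of word characters separated by single punctuation characters.
_PUNCT = r"[:.#$%&\-+?<>~/]"
_WORD = r"[A-Za-z0-9_]+"
BARE_KEY = re.compile(rf"{_WORD}(?:{_PUNCT}{_WORD})*")


def is_bare_pandoc_key(identifier):
    """Return True if the identifier fits the approximate bare-key pattern."""
    return BARE_KEY.fullmatch(identifier) is not None


simple_doi_ok = is_bare_pandoc_key("doi:10.7554/eLife.38532")
url_ok = is_bare_pandoc_key("url:https://goo.gl/jGBrxE")
ugly_doi_ok = is_bare_pandoc_key("doi:10.1016/S0022-2836(05)80360-2")
```

Under this approximation the simple DOI passes, while the URL (consecutive punctuation in `://`) and the DOI containing parentheses do not. These are the cases where an alias such as `tag:steem-post` helps.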

Happy for any advice on how you think we should proceed? I see the proposed pandoc filter as a way that you could most directly use Manubot in your current workflow to enable a wider range of citation sources.

dhimmel commented on May 17, 2024

Another consideration your issue brings up is whether we should use the term "key" rather than "tag". Seems like citekey is a familiar term to LaTeX users?

nichtich commented on May 17, 2024

> Another consideration your issue brings up is whether we should use the term "key" rather than "tag". Seems like citekey is a familiar term to LaTeX users?

Pandoc documentation uses the name "citation key". LaTeX tutorials and BibTeX documentation use "cite_key" or "citation-key", so why introduce the new word "tag"?

In citation-tags.tsv you use the column names tag and citation but both are citation keys, the first being used as alias for the second. If this data is expressed in tabular form, you could also ignore the header and just use the first and second columns. If this data is expressed in key-value format (YAML or JSON), we don't need to name both sides anyway.

> However, this feature is planned as part of #99, which creates a pandoc filter for this package's citation processing. This PR would enable your final example (or something close to it).

I'd prefer not to have a close-to-it variant (e.g. manubot pandoc filter using the field citation-keys) but the exact field name citekeys as exemplified above for compatibility between manubot and wcite. In your case:

citekeys:
  techblog-csl: url:http://blogs.nature.com/naturejobs/2017/05/03/techblog-create-the-perfect-bibliography-with-the-csl-editor/
  techblog-manubot: url:http://blogs.nature.com/naturejobs/2018/02/20/techblog-manubot-brown-predicting-the-paper-of-the-future
  avasthi-preprints: doi:10.7554/eLife.38532
  steem-post: url:https://goo.gl/jGBrxE
  # etc.

> There is a separate question of whether we should enable embedded-YAML citation tags for users that are not using the Pandoc filter.

Why not? The YAML header can easily be extracted from a Markdown document.
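A minimal sketch of that extraction, using only the standard library: it pulls the block between the opening `---` line and the closing `---` (or `...`) line, then reads the flat `citekeys` mapping from it. The hand-rolled parser handles only the simple one-level layout shown above and is an assumption made to keep the example dependency-free; a real implementation would presumably hand the header to a YAML library. The function names are hypothetical.

```python
def extract_front_matter(markdown):
    """Return the text between the opening '---' and the closing '---' or '...'."""
    lines = markdown.splitlines()
    if not lines or lines[0].strip() != "---":
        return ""
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() in ("---", "..."):
            return "\n".join(lines[1:index])
    return ""  # no closing delimiter found


def parse_citekeys(front_matter):
    """Read the indented 'alias: identifier' pairs under a top-level 'citekeys:'."""
    citekeys = {}
    inside = False
    for line in front_matter.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        if not line.startswith((" ", "\t")):
            inside = line.strip() == "citekeys:"  # a new top-level field begins
            continue
        if inside:
            # Split on the first ': ' so colons inside URLs and DOIs are kept.
            key, sep, value = line.strip().partition(": ")
            if sep:
                citekeys[key] = value.strip()
    return citekeys


DOCUMENT = """---
title: Example manuscript
citekeys:
  avasthi-preprints: doi:10.7554/eLife.38532
  steem-post: url:https://goo.gl/jGBrxE
  # etc.
---

Preprints are discussed in [@avasthi-preprints].
"""

citekeys = parse_citekeys(extract_front_matter(DOCUMENT))
```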

> There is also a question of whether we should make the pandoc filter the only supported workflow in the future.

This is independent from this issue. There can be multiple implementations as long as they agree on the input data format.

> Happy for any advice on how you think we should proceed? I see the proposed pandoc filter as a way that you could most directly use Manubot in your current workflow to enable a wider range of citation sources.

Have a look at the wcite Pandoc filter. Note that Pandoc metadata field nocite is supported as well.

nichtich commented on May 17, 2024

> it is not super specific because it does not indicate that it deals with the aliasing step. I guess perhaps that's implied in the mapping / dictionary data structure?

yes and yes. citekey-aliases would be more specific.

dhimmel commented on May 17, 2024

#129 makes extensive modifications to variable/function names to adopt the citation keys / citekeys nomenclature internally. Once this PR is complete we can:

  1. update manubot/rootstock docs to adopt the citation keys nomenclature
  2. consider whether to replace tag with alias for citation keys that are pointers to another user-specified key.
