
monarch-initiative / gpt-mapping-manuscript


Home Page: https://monarch-initiative.github.io/gpt-mapping-manuscript/

License: Creative Commons Attribution 4.0 International


gpt-mapping-manuscript's Introduction

Automated scholarly manuscripts on GitHub


Manuscript description

This repository is a template manuscript (a.k.a. rootstock). Actual manuscript instances will clone this repository (see SETUP.md) and replace this paragraph with a description of their manuscript.

Manubot

Manubot is a system for writing scholarly manuscripts via GitHub. Manubot automates citations and references, versions manuscripts using git, and enables collaborative writing via GitHub. An overview manuscript presents the benefits of collaborative writing with Manubot and its unique features. The rootstock repository is a general-purpose template for creating new Manubot instances, as detailed in SETUP.md. See USAGE.md for documentation on how to write a manuscript.

Please open an issue for questions related to Manubot usage, bug reports, or general inquiries.

Repository directories & files

The directories are as follows:

  • content contains the manuscript source, which includes markdown files as well as inputs for citations and references. See USAGE.md for more information.
  • output contains the outputs (generated files) from Manubot including the resulting manuscripts. You should not edit these files manually, because they will get overwritten.
  • webpage is a directory meant to be rendered as a static webpage for viewing the HTML manuscript.
  • build contains commands and tools for building the manuscript.
  • ci contains files necessary for deployment via continuous integration.

Local execution

The easiest way to run Manubot is to use continuous integration to rebuild the manuscript when the content changes. If you want to build a Manubot manuscript locally, install the conda environment as described in build. Then, you can build the manuscript on POSIX systems by running the following commands from this root directory.

# Activate the manubot conda environment (assumes conda version >= 4.4)
conda activate manubot

# Build the manuscript, saving outputs to the output directory
bash build/build.sh

# At this point, the HTML & PDF outputs will have been created. The remaining
# commands are for serving the webpage to view the HTML manuscript locally.
# This is required to view local images in the HTML output.

# Configure the webpage directory
manubot webpage

# You can now open the manuscript webpage/index.html in a web browser.
# Alternatively, open a local webserver at http://localhost:8000/ with the
# following commands.
cd webpage
python -m http.server

Sometimes it's helpful to monitor the content directory and automatically rebuild the manuscript when a change is detected. The following command, while running, will trigger both the build.sh script and manubot webpage command upon content changes:

bash build/autobuild.sh
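
For readers curious what such a watcher involves, the idea can be sketched as a simple polling loop. This is not the content of build/autobuild.sh, which this README does not show. The function names content_fingerprint and rebuild_if_changed, the cksum-based fingerprint, and the two-second interval are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of change detection; NOT the real build/autobuild.sh.

# Print a fingerprint of all files under the directory given as $1.
# The fingerprint changes when any file's content, size, or path changes.
content_fingerprint() {
  find "$1" -type f -exec cksum {} + | sort | cksum
}

# Usage: rebuild_if_changed DIR PREVIOUS_FINGERPRINT COMMAND [ARGS...]
# Runs COMMAND only if DIR's fingerprint differs from PREVIOUS_FINGERPRINT,
# then prints the current fingerprint on stdout.
rebuild_if_changed() {
  local dir=$1 previous=$2
  shift 2
  local current
  current=$(content_fingerprint "$dir")
  if [ "$current" != "$previous" ]; then
    # Send build output to stderr so stdout carries only the fingerprint.
    "$@" >&2
  fi
  printf '%s\n' "$current"
}

# Example loop (commented out so that sourcing this file has no side effects):
# last=""
# while true; do
#   last=$(rebuild_if_changed content "$last" bash build/build.sh)
#   sleep 2
# done
```

Polling is the simplest approach and needs no extra tools, at the cost of rereading every file on each pass. A real watcher may rely on file-system events instead.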

Continuous Integration

Whenever a pull request is opened, CI (continuous integration) will test whether the changes break the build process to generate a formatted manuscript. The build process aims to detect common errors, such as invalid citations. If your pull request build fails, see the CI logs for the cause of failure and revise your pull request accordingly.

When a commit to the main branch occurs (for example, when a pull request is merged), CI builds the manuscript and writes the results to the gh-pages and output branches. The gh-pages branch uses GitHub Pages to host the HTML manuscript at https://monarch-initiative.github.io/gpt-mapping-manuscript/.

For continuous integration configuration details, see .github/workflows/manubot.yaml.

License


Except when noted otherwise, the entirety of this repository is licensed under a CC BY 4.0 License (LICENSE.md), which allows reuse with attribution. Please attribute by linking to https://github.com/cmungall/gpt-mapping-manuscript.

Since CC BY is not ideal for code and data, certain repository components are also released under the CC0 1.0 public domain dedication (LICENSE-CC0.md). All files matched by the following glob patterns are dual licensed under CC BY 4.0 and CC0 1.0:

  • *.sh
  • *.py
  • *.yml / *.yaml
  • *.json
  • *.bib
  • *.tsv
  • .gitignore

All other files are only available under CC BY 4.0, including:

  • *.md
  • *.html
  • *.pdf
  • *.docx

Please open an issue for any question related to licensing.
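
The glob rules above can be expressed as a small shell helper. This is an illustrative sketch, not a script shipped with the repository: the name license_for is hypothetical, and matching on the file name alone (ignoring directories) is an assumption about how the patterns are meant to apply. It also cannot account for files that are "noted otherwise".

```shell
#!/usr/bin/env bash
# Illustrative helper (not part of the repository): print the license(s)
# that apply to a path, following the glob patterns listed in this section.
license_for() {
  # Assumption: patterns apply to the file name only, so strip directories.
  case "${1##*/}" in
    *.sh|*.py|*.yml|*.yaml|*.json|*.bib|*.tsv|.gitignore)
      echo "CC BY 4.0 and CC0 1.0" ;;
    *)
      echo "CC BY 4.0" ;;
  esac
}

license_for build/build.sh   # prints: CC BY 4.0 and CC0 1.0
license_for USAGE.md         # prints: CC BY 4.0
```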

Mapping GPT instructions

If you want to run the makefile:

  1. Create a new Python environment and activate it.
  2. Install the prerequisites. Run make help for details.
  3. Set the OpenAI key: runoak set-apikey -e openai sk-$(KEY)
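
The three steps above can be sketched as one shell function. Several details are assumptions, not something this README specifies: the environment directory .venv, the use of python3 -m venv to create it, the variable name OPENAI_API_KEY (taken to hold the full key, including any sk- prefix), and the helper names run and setup_mapping_env. Setting DRY_RUN=1 prints the commands without executing them.

```shell
#!/usr/bin/env bash
# Sketch of the setup steps; .venv and OPENAI_API_KEY are assumed names.

# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

setup_mapping_env() {
  # Stop early rather than storing an empty key.
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    echo "OPENAI_API_KEY is not set" >&2
    return 1
  fi
  run python3 -m venv .venv        # step 1: create the environment
  run . .venv/bin/activate         # step 1: activate it in this shell
  run make help                    # step 2: shows help on the prerequisites
  run runoak set-apikey -e openai "$OPENAI_API_KEY"   # step 3: store the key
}

# Preview the commands without running them:
# DRY_RUN=1 OPENAI_API_KEY=sk-example setup_mapping_env
```

Note that make help only displays guidance. Installing the prerequisites it describes remains a manual step.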

gpt-mapping-manuscript's People

Contributors

cmungall, hrshdhgd, matentzn, nlharris


gpt-mapping-manuscript's Issues

Move repo to Monarch?

Or do you want to keep it here? Just so I know which link to include in the paper.

Submit mappergpt to OAEI 2023

See sister issue for OAK with the details INCATools/ontology-access-kit#617

Info from email:

The OAEI Bio-ML track will have its 2023 version, for providing a platform for evaluating different ontology alignment systems, especially those using machine learning techniques. We sincerely invite all kinds of OM systems to participate in our Bio-ML track.

In comparison with the 2022 version, we will make the following changes in 2023:

We will set up a new evaluation dataset to test large language models (LLMs) for equivalence matching. The idea is to rank a set of candidate target concepts for a given source concept, by exploring LLMs like Flan-T5 and the GPT-series and prompts.

We will cancel the validation set for simplicity. It will be merged with the testing set in the unsupervised setting, and merged with the training set in the semi-supervised setting.

For subsumption matching, we will cancel the unsupervised setting, but keep the semi-supervised setting, for reducing the workload.

We will use modularization techniques to improve the to-be-matched ontologies that are extracted from the original large-scale ontologies according to the given ground truth mappings. This will keep more of the structure of the original ontology.

As in the last year, we encourage system submission via SEALS or HOBBIT, but can also allow direct result (mappings) submission, which will be marked in our results and reports.

Here are the temporary milestones:
27th July: dataset release
31st August: system submission
30th September: result release
14th October: report release

Organizers: Yuan He, Jiaoyan Chen, Hang Dong, Ernesto Jiménez-Ruiz and Ian Horrocks

@cmungall can we assign this to anyone?

Make sure manuscript deploys on github pages

Right now it does not deploy at all (I did it once manually). Something in this line keeps it from pushing the updated HTML manuscript to the gh-pages branch.

@hrshdhgd can you take a look at what is happening? If you can't find a solution within 15 minutes, say so here and abandon the task.
