cthoyt / cthoyt.github.io

My personal website, served at https://cthoyt.com

Home Page: https://cthoyt.com/

License: Creative Commons Attribution 4.0 International

Languages: HTML 88.00%, Python 12.00%

Topics: biological-expression-language, machine-learning, knowledge-graphs, knowledge-graph-embeddings, drug-discovery, drug-repurposing, target-prioritization, target-validation, biocuration, ontologies

cthoyt.github.io's Introduction

cthoyt.github.io

Serve Locally

docker run --rm --volume="$PWD:/srv/jekyll" -p 4000:4000 -it jekyll/jekyll:latest jekyll serve

License

CC BY 4.0

cthoyt.github.io's People

Contributors

bilalshaikh42, cthoyt, senecacreek

cthoyt.github.io's Issues

write post on the banana problem

A local unique identifier (LUID) is the value that identifies an entity within a semantic space. For example, MONDO has the local unique identifier 0005301 for "multiple sclerosis". If you want to make a URI, you take the MONDO URI prefix (http://purl.obolibrary.org/obo/MONDO_) and concatenate the local unique identifier on the end (i.e., http://purl.obolibrary.org/obo/MONDO_0005301). Similarly, if you want to make a compact URI (CURIE), you take the MONDO CURIE prefix (MONDO) and concatenate a colon (:) and then the local unique identifier (i.e., MONDO:0005301).
Unfortunately, there are a lot of places where people mistakenly write a whole CURIE where a local unique identifier should go. This means someone writes MONDO:0005301 where they should have written 0005301. We call this a redundant prefix in the local unique identifier. It is also colloquially called the "banana problem".
Wikidata is one place where this happens. Identifiers.org has also propagated this mistake to many places (though MONDO does not appear in Identifiers.org, the submitter of the Wikidata property may have been influenced by how other properties did it, which were in turn influenced by Identifiers.org).
TL;DR: Wikidata has a lot of wrong ways of writing LUIDs in its properties referring to ontologies, MONDO being one example.
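
The construction described above can be sketched in a few lines of Python. This is only an illustration: the function names `to_uri`, `to_curie`, and `strip_banana` are hypothetical, and the case-insensitive prefix match is an assumption. Only the MONDO prefixes and the identifier 0005301 come from the text.

```python
# URI and CURIE prefixes for MONDO, as given in the issue text
MONDO_URI_PREFIX = "http://purl.obolibrary.org/obo/MONDO_"
MONDO_CURIE_PREFIX = "MONDO"


def to_uri(uri_prefix: str, luid: str) -> str:
    """Concatenate a URI prefix and a local unique identifier."""
    return uri_prefix + luid


def to_curie(curie_prefix: str, luid: str) -> str:
    """Join a CURIE prefix and a local unique identifier with a colon."""
    return f"{curie_prefix}:{luid}"


def strip_banana(curie_prefix: str, value: str) -> str:
    """Remove a redundant prefix from a value that should be a bare LUID.

    Assumption: the redundant prefix is matched case-insensitively.
    """
    banana = curie_prefix + ":"
    if value.casefold().startswith(banana.casefold()):
        return value[len(banana):]
    return value


luid = strip_banana(MONDO_CURIE_PREFIX, "MONDO:0005301")
uri = to_uri(MONDO_URI_PREFIX, luid)
curie = to_curie(MONDO_CURIE_PREFIX, luid)
```

A value that is already a bare LUID passes through `strip_banana` unchanged, so the function can be applied to mixed data.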

Re-implementing the N2T ARK Resolver | Biopragmatics

Archival Resource Keys (ARKs) are a flavor of persistent identifier, like DOIs, URNs, and Handles, that have the benefit of being free, flexible with what metadata gets attached, and natively able to resolve to web pages. Name-to-Thing (N2T) implements a resolver for a variety of ARKs, so this blog post is about how that resolver can be re-implemented with the curies Python package.

https://cthoyt.com/2023/04/11/n2t-ark-resolver.html
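
The general idea of prefix-based ARK resolution can be sketched as follows. This is not the post's implementation and it does not use the curies package: it is a dependency-free illustration using a plain dictionary. The registry contents, the example.org URL, and the function names `parse_ark` and `resolve_ark` are all hypothetical.

```python
from typing import Optional, Tuple

# Hypothetical registry mapping a Name Assigning Authority Number (NAAN)
# to the URL prefix of that organization's own resolver.
NAAN_TO_URL_PREFIX = {
    "12345": "https://example.org/ark:/12345/",
}


def parse_ark(ark: str) -> Tuple[str, str]:
    """Split an ARK into (NAAN, name), accepting both ark:/ and ark: forms."""
    if not ark.startswith("ark:"):
        raise ValueError(f"not an ARK: {ark}")
    rest = ark[len("ark:"):].lstrip("/")
    naan, sep, name = rest.partition("/")
    if not sep or not naan or not name:
        raise ValueError(f"malformed ARK: {ark}")
    return naan, name


def resolve_ark(ark: str) -> Optional[str]:
    """Return a URL for the ARK, or None if its NAAN is not in the registry."""
    naan, name = parse_ark(ark)
    url_prefix = NAAN_TO_URL_PREFIX.get(naan)
    if url_prefix is None:
        return None
    return url_prefix + name
```

For example, `resolve_ark("ark:/12345/abc")` looks up the NAAN `12345` and appends the name `abc` to the registered URL prefix.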

Add missing events

update personal webpage

  • EOSC workshop
  • Bioregistry workshop
  • 2023 Ontology Summit
  • 2nd Mapping Commons workshop

You Should Use a Private Email on Publications | Biopragmatics

While we were recently preparing to submit a manuscript, the lead author said they looked at my last few papers and noticed I always used a private email address instead of an institutional email address. They asked, perplexed, if they should also use my private email address with our submission. The answer was a resounding yes; always use a private email address. Here’s why.

https://cthoyt.com/2022/02/06/use-your-personal-email.html

Making DrugBank Reproducible | Biopragmatics

If you’re reading my blog, there’s a pretty high chance you’ve used DrugBank, a database of drug-target interactions, drug-drug interactions, and other high-granularity information about clinically-studied chemicals. DrugBank has two major problems, though: its data are password-protected, and its license does not allow redistribution. Time to solve these problems once and for all.

https://cthoyt.com/2020/12/14/taming-drugbank.html

Pythagorean Mean Rank Metrics | Biopragmatics

The mean rank (MR) and mean reciprocal rank (MRR) are among the most popular metrics reported for the evaluation of knowledge graph embedding models in the link prediction task. While they are reported on very different intervals ($\text{MR} \in [1,\infty)$ and $\text{MRR} \in (0,1]$), their deep theoretical connection can be elegantly described through the lens of Pythagorean means. This blog post describes ideas Max Berrendorf shared with me that I recently implemented in PyKEEN and later wrote up as a full manuscript.

https://cthoyt.com/2021/04/19/pythagorean-mean-ranks.html
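
The connection can be made concrete with a small calculation: the MR is the arithmetic mean of the ranks, and the MRR is the reciprocal of their harmonic mean, with the geometric mean sitting between the two. The sketch below uses only the Python standard library rather than PyKEEN, and the ranks are made up for illustration.

```python
from statistics import fmean, geometric_mean, harmonic_mean

# Hypothetical ranks of the true entity across three link prediction queries
ranks = [1, 2, 4]

# Arithmetic mean of the ranks: the mean rank (MR)
mean_rank = fmean(ranks)

# Geometric and harmonic means of the same ranks
geometric_mean_rank = geometric_mean(ranks)
harmonic_mean_rank = harmonic_mean(ranks)

# Mean reciprocal rank (MRR), computed directly from its definition
mean_reciprocal_rank = fmean([1 / rank for rank in ranks])

# The MRR equals the reciprocal of the harmonic mean rank
inverse_harmonic_mean_rank = 1 / harmonic_mean_rank
```

Because the harmonic mean rank lives on the same interval as the MR, it can be compared with the MR directly, while its reciprocal recovers the familiar MRR.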

Discussions and Follow-ups from Biocuration 2024 | Biopragmatics

I’ve just returned from the 17th Annual International Biocuration Conference at the Indian Biological Data Centre (IBDC) in Faridabad, India. I wanted to highlight some of the interesting conversations I had while I was there, and ideas for follow-up. Most were centered around the Bioregistry and the Semantic Mapping Assembler and Reasoner (SeMRA), which I gave an oral presentation on.

https://cthoyt.com/2024/03/11/biocuration2024-discussions.html

Biosemantics vs. Biopragmatics | Biopragmatics

In language, semantics describes the names and meanings of words. The bioinformatics community has aptly adopted biosemantics as a concept that encompasses the issues with the names and meanings of biological entities, usually in natural language processing and data integration. However, semantics does not capture the context of words, and biosemantics fails to describe the biological context and complex relationships between biological entities.

https://cthoyt.com/2020/01/22/biosemantics-versus-biopragmatics.html

Connecting Preprints to Peer-reviewed Articles on Wikidata | Biopragmatics

After the BioCypher preprint went up on the arXiv, I checked in on the missing co-author items list on the Scholia page that reflects my Wikidata entry. In addition to the several co-authors of the BioCypher manuscript that I don’t know personally, I was curious to see which other papers of mine did not have fully complete co-author annotations. This post has a few SPARQL queries that I used to look into this as well as a few ongoing questions I have about the relationship between distinct entries for preprints and published articles.

https://cthoyt.com/2023/01/02/wikidata-preprints.html
