nhsx / ai-dictionary
Prototype AI Dictionary from the NHS AI Lab
Home Page: https://nhsx.github.io/ai-dictionary/
License: MIT License
Thank you for finding and reporting an inaccuracy in the term, please fill in the details below:
Term: Algorithm
Current content: A set of instructions that can be followed by a human or computer.
For example, the NHS algorithm for detecting Acute Kidney Injury is a set of instructions that can be repeated for multiple patients.
In AI, machine learning algorithms use data to make decisions.
Corrected content: A set of instructions that can be followed by a human or computer.
For example, the NHS algorithm for detecting Acute Kidney Injury is a set of instructions that can be repeated for multiple patients.
In AI, machine learning algorithms use data to make predictions or recommendations which can inform decision making.
Reason for correction including reference if appropriate: thanks to LE from the Lab
Thank you for finding and reporting an inaccuracy in the term, please fill in the details below:
Term: F1 score
Current content: An accuracy metric that combines the precision and recall values for a classification model into a single number, which ranges from 0 (poor accuracy) to 1 (high accuracy).
Corrected content: A metric which describes the accuracy of a classification model, by combining the precision and recall values into a single number, which ranges from 0 (poor accuracy) to 1 (high accuracy).
Reason for correction including reference if appropriate: Flagged by Vijay on the Hub as needing more explanation for generalists
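For generalists, a worked example may help show how the single number is formed. The sketch below assumes the usual definitions (precision = TP / (TP + FP), recall = TP / (TP + FN), F1 = their harmonic mean); the function name and the example counts are illustrative and not taken from the dictionary.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the F1 score: the harmonic mean of precision and recall."""
    predicted_positive = true_positives + false_positives
    actual_positive = true_positives + false_negatives
    if predicted_positive == 0 or actual_positive == 0:
        # Assumed convention: report 0 when precision or recall is undefined.
        return 0.0
    precision = true_positives / predicted_positive
    recall = true_positives / actual_positive
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative counts: 8 true positives, 2 false positives, 4 false negatives.
print(round(f1_score(8, 2, 4), 3))
```

Because the harmonic mean is pulled towards the smaller of the two values, a model cannot score well on F1 by having high precision alone or high recall alone.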
Thank you for suggesting a new term, please enter its details below:
Title: Data clean room
Description:
Related terms: data protection
Reason for including new term including references if appropriate: Suggestion from Alasdair R. (LinkedIn)
See #16 but for line breaks:
Thank you for finding and reporting an inaccuracy in the term, please fill in the details below:
Term: MLOps
Current content: Machine learning operations is the process of safely deploying, monitoring and updating machine learning models in production, or real-world, environments.
Because machine learning models are built on data, and data can change, it is important to build resilience into the system so they can adapt to a changing environment.
Corrected content: Machine learning operations is the process of safely deploying, monitoring and updating machine learning models in production, or real-world, environments.
Because machine learning models are built on data, and data can change, it is important to build robustness into the system so they can adapt to a changing environment without losing performance.
Reason for correction including reference if appropriate: resilience needs to be defined from Rubeta/Hub
Thank you for suggesting a new term, please enter its details below:
Title: Data augmentation
Description: The process of artificially increasing the amount of data used to train a model, to reduce overfitting and improve model performance.
Commonly used in imaging applications, this can include rotating, cropping, adding noise or random levels of blur to existing images.
Related terms: overfitting, model, machine-learning
Reason for including new term including references if appropriate:
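A minimal sketch of the image transformations mentioned in the description, assuming an image is represented as a plain list of rows of pixel intensities between 0 and 1. The function names and the noise scale are illustrative; real projects would normally use an imaging library instead.

```python
import random

Image = list[list[float]]  # rows of pixel intensities in [0, 1]


def flip_horizontal(image: Image) -> Image:
    """Mirror the image left to right."""
    return [list(reversed(row)) for row in image]


def rotate_90_clockwise(image: Image) -> Image:
    """Rotate the image a quarter turn clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def add_noise(image: Image, scale: float, rng: random.Random) -> Image:
    """Add uniform random noise to each pixel, clipped back into [0, 1]."""
    return [
        [min(1.0, max(0.0, pixel + rng.uniform(-scale, scale))) for pixel in row]
        for row in image
    ]


def augment(image: Image, seed: int = 0) -> list[Image]:
    """Return three extra training images derived from one original."""
    rng = random.Random(seed)
    return [
        flip_horizontal(image),
        rotate_90_clockwise(image),
        add_noise(image, scale=0.05, rng=rng),
    ]
```

Each original image yields several altered copies, which is how the amount of training data is increased without collecting new images.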
Steps to reproduce:
Thank you for finding and reporting an inaccuracy in the term, please fill in the details below:
Term: Validation data
Current content: Data that is not included in the training data, but is used to improve the performance of an AI model as it is trained.
This definition relates to the definition within AI, and not the regulatory aspects of medical devices.
Corrected content: Data that is not included in the training data, but is used to check the performance of the model as it is being trained. This is separate to the test data used to check the final performance of the model.
This definition relates to the definition within AI, and not the regulatory aspects of medical devices.
Reason for correction including reference if appropriate: I think it could be explained that 'validation data' is used as an initial check of the performance of the AI model but is strictly not used for training the model.
- Moyeen on the Hub
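To illustrate the distinction between training, validation and test data, here is a minimal sketch that splits one dataset into three non-overlapping parts. The function name, the default fractions and the fixed seed are assumptions made for the example.

```python
import random


def split_dataset(records, validation_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle the records and split them into training, validation and test sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    n_validation = int(len(shuffled) * validation_fraction)
    test = shuffled[:n_test]                              # final performance check only
    validation = shuffled[n_test:n_test + n_validation]   # checks during training
    training = shuffled[n_test + n_validation:]           # used to fit the model
    return training, validation, test


training, validation, test = split_dataset(range(10))
print(len(training), len(validation), len(test))
```

The validation set is consulted repeatedly while the model is being developed, whereas the test set is held back until the end.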
Currently related terms are read in from the array in json, and not ordered.
If we keep this behaviour, we should order the related terms in terms of relevance (by hand).
Thank you for suggesting a new term, please enter its details below:
Title: Linked data
Description:
Related terms:
Reason for including new term including references if appropriate: Suggestion from Pam/Hub
Thank you for suggesting a new term, please enter its details below:
Title: NLP
Description:
Related terms: transformer
Reason for including new term including references if appropriate: Suggestion from Pam/Hub
Thank you for suggesting a new term, please enter its details below:
Title: Multimodal AI
Description: An AI that brings different facets (smaller AI models or data verticals) of the target together to form a more complete predictor.
In healthcare, the term is also associated with specific 'modalities' of data, such as the sequences in mpMRI scanning. However, the concept of multimodality with respect to AI is more than just bringing different data verticals together - it is about how very different AI models effectively interoperate - even 'merge' - to create a whole that is greater than the sum of its parts.
Related terms: model
Reason for including new term including references if appropriate: Suggestion from @manishjiva #39
Thank you for finding and reporting an inaccuracy in the term, please fill in the details below:
Term: API
Current content: “Application Programming Interface”. A standardised way to share information. An API is much like a shipping container - it implements an agreed structure and mechanism for sharing data, and enables interoperability.
Corrected content: Application Programming Interface. A standardised way to share data. An API defines the mechanisms to receive and send data, which is agnostic to how the underlying data is stored.
For example, NHS Digital has a number of APIs available to help build modern healthcare technology.
Reason for correction including reference if appropriate: API – I’m not sure the analogy of it being like a shipping container helps particularly, Rubeta from Hub
To be checked with:
https://socialsharepreview.com/?url=https://nhsx.github.io/ai-dictionary
Suggestion from HR: migrate to https://schema.org/DefinedTerm
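A rough sketch of what a migrated entry could look like as JSON-LD. The properties used (`name`, `description`, `termCode`, `inDefinedTermSet`) are defined by schema.org for `DefinedTerm`; the input field names (`title`, `slug`, `description`) are assumptions about the dictionary's JSON and may not match the real file.

```python
import json


def to_defined_term(term: dict, term_set_url: str) -> dict:
    """Map one dictionary entry (hypothetical field names) to a schema.org DefinedTerm."""
    return {
        "@context": "https://schema.org",
        "@type": "DefinedTerm",
        "name": term["title"],
        "description": term["description"],
        "termCode": term["slug"],
        "inDefinedTermSet": term_set_url,
    }


example = {
    "title": "Algorithm",
    "slug": "algorithm",  # hypothetical slug
    "description": "A set of instructions that can be followed by a human or computer.",
}
print(json.dumps(to_defined_term(example, "https://nhsx.github.io/ai-dictionary/"), indent=2))
```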
Thank you for suggesting a new term, please enter its details below:
Title: Neural networks
Description: Neural networks are an approach to machine learning, loosely inspired by nature, that can describe complex relationships using a broader range of data than traditional approaches.
For example, neural networks can be trained on image data to describe features in medical images such as tumours. They can also be trained on free text such as clinical notes, allowing their use in clinical coding applications.
Neural networks can also be trained on tabulated, or structured data, such as a spreadsheet. Their ability to model complexity often comes at the cost of explainability: the more complex the model, the harder it becomes to explain.
Related terms: explainability, model, machine-learning, structured, unstructured, supervised, ai, deep-learning
Reason for including new term including references if appropriate: important term, suggested by Moyeen and Sharan from the Hub
Thank you for suggesting a new term, please enter its details below:
Title: Metadata
Description:
Related terms:
Reason for including new term including references if appropriate: Suggestion from Pam/Hub
Thank you for finding and reporting an inaccuracy in the term, please fill in the details below:
Term: Standard
Current content: An agreed set of definitions, guidelines and sometimes technical approaches for a specific area. Formal Standards may be mandated by the Government, whereas de facto standards are created and used by communities working in that space.
For example, ISO 13485 is a designated Standard mandated by the UK Government for the development of a medical device. It includes specific guidelines and processes for the safe development of a medical device, and organisations must demonstrate their adherence to the Standard to be allowed to place a medical device on the market.
Corrected content: An agreed set of definitions, guidelines and sometimes technical approaches for a specific area. Formal Standards may be mandated by the Government, whereas de facto standards are created and used by communities working in that space.
For example, ISO 13485 is a UK Designated Standard for quality management systems for medical devices; it can be used to demonstrate conformance with parts of the medical device regulations.
Reason for correction including reference if appropriate: Standards are entirely voluntary under the UK's device regulations.
Thank you for suggesting a new term, please enter its details below:
Title: Overfitting
Description: The process of building a model which is based too closely on the data. This results in a model which may be very accurate on the training data, but when tested on additional datasets such as the test data, unseen data or data from a new environment, performs badly.
Approaches to reduce overfitting include cross-validation, data augmentation and ensemble techniques (which combine different models).
Related terms: machine-learning, bias, model, training-data, test-data, cross-validation, underfitting, data-augmentation
Reason for including new term including references if appropriate: Important term, suggested by Moyeen on the Hub
Thank you for finding and reporting an inaccuracy in the term, please fill in the details below:
Term: Cross validation
Current content: An approach to generating different validation and training data sets from within a single data set, to improve the performance of a predictive model.
Popular techniques include k-fold cross validation.
Corrected content: An approach to reducing overfitting during model development, by iteratively selecting different portions of the data to train and validate a predictive (supervised) machine learning model.
Cross validation can increase the overall performance of a model, along with data augmentation techniques.
Reason for correction including reference if appropriate: by trying to be simple I wonder if the meaning has been lost here – the Wikipedia definition (below) could be simplified
Cross-validation is a resampling method that uses different portions of the data to test and train a model on different iterations. It is mainly used in settings where the goal is prediction, and one wants to estimate how accurately a predictive model will perform in practice, from Rubeta/Hub
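To make "iteratively selecting different portions of the data" concrete, here is a minimal sketch of k-fold splitting that works on row indices only. The function name is illustrative, and the sketch does not shuffle the data first, which a real workflow usually would.

```python
def k_fold_splits(n_samples: int, k: int) -> list[tuple[list[int], list[int]]]:
    """Return k (training_indices, validation_indices) pairs covering every sample once."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    splits = []
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        training = indices[:start] + indices[start + size:]
        splits.append((training, validation))
        start += size
    return splits


for training, validation in k_fold_splits(6, 3):
    print("train on", training, "validate on", validation)
```

Every sample is used for validation exactly once and for training in the other k - 1 rounds, and the model's scores across the rounds are then averaged.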
Thank you for suggesting a new term, please enter its details below:
Title: Underfitting
Description: The process of building a model which is not based closely enough on the data. This results in a model which performs badly and fails to capture the relationships you are looking for.
There is a balance to be made between underfitting and overfitting.
Related terms: machine-learning, model, training-data, overfitting
Reason for including new term including references if appropriate: Important term, suggested by Moyeen on the Hub
To implement #63 we need to format HTML tables.
Please see branch feature/table for example structure.
Example formatting (ignore colour, although differentiating between false and positive terms would be useful): https://miro.medium.com/max/1400/1*fxiTNIgOyvAombPJx5KGeA.png
Currently the search bar searches through the term.title.
As a user, I want to be able to search for a term by matching the slug or description of the term so that I can find the definition I'm interested in.
UAT: Searching for "natural" will return the current Machine Learning definition
UAT: Searching for "ai" will return the Artificial Intelligence definition
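A minimal sketch of the matching logic this story asks for, written in Python for illustration only; it is not the site's actual search code. The field names (`title`, `slug`, `description`) and the placeholder entries are assumptions.

```python
def search_terms(terms: list[dict], query: str) -> list[dict]:
    """Return terms whose title, slug or description contains the query (case-insensitive)."""
    needle = query.strip().lower()
    if not needle:
        return []
    fields = ("title", "slug", "description")
    return [
        term
        for term in terms
        if any(needle in str(term.get(field, "")).lower() for field in fields)
    ]


# Placeholder entries, not the real dictionary content.
terms = [
    {
        "title": "Machine Learning",
        "slug": "machine-learning",
        "description": "Placeholder text mentioning natural language processing.",
    },
    {
        "title": "Artificial Intelligence",
        "slug": "ai",
        "description": "Placeholder text for this entry.",
    },
]
print([term["title"] for term in search_terms(terms, "natural")])
print([term["title"] for term in search_terms(terms, "ai")])
```

One caveat with plain substring matching: a short query such as "ai" will also match any description that happens to contain those letters (for example "training"), so whole-word matching or ranking title and slug hits first may be needed.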
Thank you for suggesting a new term, please enter its details below:
Title: Deep Learning
Description: An approach to building models using neural networks with more than one "hidden" layer of artificial neurons. This is a common approach when working with image and text data.
Deep learning models are able to capture complex relationships, but it can be difficult to interpret what data leads to a particular outcome.
Related terms: machine-learning, supervised, ai, neural-network, explainability
Reason for including new term including references if appropriate: important term, suggested by Moyeen and Sharan from the Hub
Currently a related term can relate to itself
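A small check along these lines could flag the problem. This is a sketch that assumes each term has a `slug` and a `related` list of slugs; those field names are guesses about the JSON structure.

```python
def self_referencing_terms(terms: list[dict]) -> list[str]:
    """Return the slugs of terms that list themselves among their related terms."""
    return [
        term["slug"]
        for term in terms
        if term["slug"] in term.get("related", [])
    ]


# Placeholder entries for illustration.
terms = [
    {"slug": "overfitting", "related": ["model", "overfitting"]},
    {"slug": "model", "related": ["machine-learning"]},
]
print(self_referencing_terms(terms))
```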
e.g. issue templates, PR templates, contributing, license etc.
Add automation to check for dead (404, DNR) links in the terms list.
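One possible shape for such a check, using only the Python standard library. It assumes each term carries a `links` list of URLs and a `slug`, which may not match the real data, and it treats any request failure (HTTP error, DNS failure, timeout, malformed URL) as a dead link.

```python
import http.client
import urllib.request


def link_is_alive(url, timeout=10.0, opener=urllib.request.urlopen):
    """Return True if a HEAD request to `url` ends in a 2xx or 3xx response."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with opener(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, ValueError, http.client.HTTPException):
        # Covers HTTP errors such as 404, DNS failures, timeouts and malformed URLs.
        return False


def find_dead_links(terms, is_alive=link_is_alive):
    """Return (slug, url) pairs for every link that fails the liveness check."""
    dead = []
    for term in terms:
        for url in term.get("links", []):
            if not is_alive(url):
                dead.append((term["slug"], url))
    return dead
```

Calling `find_dead_links(terms)` on the loaded terms list would return the failing links, and a scheduled CI job could fail when that list is non-empty. Some servers reject HEAD requests, so a fallback to GET may be needed before reporting a link as dead.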
From MX, via Hub/email:
What would be helpful is to add extra references in each of the words to allow users to dig a bit more into each of the concepts. Perhaps wikipedia links might be useful.
Thank you for finding and reporting an inaccuracy in the term, please fill in the details below:
Term: Unsupervised machine learning
Current content: A type of machine learning where you do not know the outcome or definition of your data, and are looking for patterns. This includes clustering techniques such as k-nearest neighbours and principal component analysis (PCA).
For example, unsupervised machine learning can help identify different groups of hospital patients who use hospital services in different ways.
Corrected content: A type of machine learning where you do not know the outcome or definition of your data, and are looking for patterns. This includes clustering techniques such as k-means and principal component analysis (PCA).
For example, unsupervised machine learning can help identify different groups of hospital patients who use hospital services in different ways.
Reason for correction including reference if appropriate: k-means not knn; thanks LE from Lab
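To show why k-means is the clustering technique here (k-nearest neighbours needs labelled examples), below is a minimal one-dimensional sketch. The starting centroids are supplied by the caller to keep the example deterministic, and the visit counts are made up.

```python
def assign_to_clusters(values, centroids):
    """Put each value in the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for value in values:
        nearest = min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))
        clusters[nearest].append(value)
    return clusters


def k_means_1d(values, initial_centroids, max_iterations=100):
    """Alternate between assigning values and moving centroids until nothing changes."""
    centroids = list(initial_centroids)
    for _ in range(max_iterations):
        clusters = assign_to_clusters(values, centroids)
        updated = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign_to_clusters(values, centroids)


# Made-up hospital visits per patient per year; no labels are provided.
visits = [1, 2, 3, 20, 22, 25]
centroids, clusters = k_means_1d(visits, initial_centroids=[0, 10])
print(clusters)
```

The algorithm is never told which patients are "low" or "high" users; the two groups emerge from the data alone.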
Thank you for suggesting a new term, please enter its details below:
Title: Synthetic Data, Gradient descent, Binary, Sequential data
Description:
Related terms:
Reason for including new term including references if appropriate: Inbound over email from Dr. Z.A.
To validate JSON schema
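As a starting point, a standard-library-only structural check might look like the sketch below. The required fields and their types are assumptions about the terms file; a full solution would more likely define a JSON Schema document and run it through a dedicated validator.

```python
import json

# Hypothetical required fields and their expected Python types.
REQUIRED_FIELDS = {"title": str, "slug": str, "description": str, "related": list}


def validation_errors(raw_json: str) -> list[str]:
    """Return a list of human-readable problems found in the terms JSON."""
    try:
        terms = json.loads(raw_json)
    except json.JSONDecodeError as error:
        return [f"invalid JSON: {error}"]
    if not isinstance(terms, list):
        return ["top-level value must be a list of terms"]
    errors = []
    for position, term in enumerate(terms):
        if not isinstance(term, dict):
            errors.append(f"term {position}: expected an object")
            continue
        for field, expected_type in REQUIRED_FIELDS.items():
            if field not in term:
                errors.append(f"term {position}: missing '{field}'")
            elif not isinstance(term[field], expected_type):
                errors.append(f"term {position}: '{field}' should be {expected_type.__name__}")
    return errors
```

An empty list means the file passed; anything else can be printed in CI to fail the build.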
Thank you for finding and reporting an inaccuracy in the term, please fill in the details below:
Term: Validation data
Current content: Data that is not included in the training data, but is used to improve the performance of an AI model as it is trained.
This definition relates to the definition within AI, and not the regulatory aspects of medical devices.
Corrected content: Data that are not included in the training data, but are used to improve the performance of an AI model as it is trained.
This definition relates to the definition within AI, and not the regulatory aspects of medical devices.
Reason for correction including reference if appropriate: data is a plural word – therefore the statement should read: Data that are not…..but are used…….
Suggestion from HR: Have you got plans to tag/semver it so that one can refer to it from a QMS? I was thinking that you might want to separate the related terms so that the versioned term definitions are cleaner.
Thank you for suggesting a new term, please enter its details below:
Title:
Graph Neural Networks
Description:
Graph Neural Networks (GNNs) are a class of deep learning methods designed to perform inference on data described by graphs.
Graphs are a very powerful way of representing data, relationships and their complexity. Machine learning models can be trained to learn relationships in graphs and to predict their features as more data is integrated into the graph.
GNNs are neural networks that can be directly applied to graphs, and provide an easy way to do node-level, edge-level, and graph-level prediction tasks.
In recent years, variants of GNNs such as the graph convolutional network (GCN), graph attention network (GAT) and graph recurrent network (GRN) have demonstrated ground-breaking performance on many deep learning tasks.
An example of healthcare data that can be represented as a graph is a psychopathology network, which consists of aspects (e.g., symptoms) of mental disorders (nodes) and the connections between those aspects (edges). A GNN trained on this graph will be able to predict disorders based on the provided symptoms.
Related terms:
Neural Networks
Models
Deep Learning
Reason for including new term including references if appropriate:
Reference:
1- https://neptune.ai/blog/graph-neural-network-and-some-of-gnn-applications
2- https://arxiv.org/abs/1812.08434
Thank you for suggesting a new term, please enter its details below:
Title: Federated Learning
Description:
Related terms: Data protection, machine learning
Reason for including new term including references if appropriate: Suggestion from Alasdair R. (LinkedIn)
XAI
Typically, AI solutions adopt a "black box" approach in which it is impossible (or at the least very difficult) to explain how the model generated a specific answer. XAI, short for Explainable Artificial Intelligence, refers to AI where it is possible for humans to understand how the results of a model were obtained.
Thank you for suggesting a new term, please enter its details below:
Title: Generative Adversarial Network (GAN)
Description:
Related terms: Neural Network, Deep Learning, AI
Reason for including new term including references if appropriate: Suggestion from Alasdair R. (LinkedIn)
Thank you for finding and reporting an inaccuracy in the term, please fill in the details below:
Term: Data
Current content: Information stored in a digital way.
For example, this can be information on your physical state such as heart rate, blood pressure, or notes on your recent visit to your primary care physician. A picture can be data, as can audio but we are struggling to standardise how to represent smell (for the time being).
Corrected content: Information stored in a digital way.
For example, this can be information on your physical state such as heart rate, blood pressure, or notes on your recent visit to your primary care physician.
Imaging data is a common type of healthcare data, which includes data generated from X-ray machines, CT scanners, MRI scanners, OCT systems etc.
Reason for correction including reference if appropriate: I wasn't sure why smell was mentioned – and actually smell essentially detects volatile compounds in the skin so that would be the data to collect https://pubmed.ncbi.nlm.nih.gov/21079799/ from Rubeta/Hub
Thank you for suggesting a new term, please enter its details below:
Title: Confusion Matrix
Description: A tool that is used to describe the performance of a classification model (or "classifier") on a set of test data for which the true values are known.
Related terms: False Positive, False Negative, True Positive, True Negative
Reason for including new term including references if appropriate:
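A minimal sketch of how the four cells are counted for a binary classifier, assuming the true values and the predictions are given as lists of booleans. The function name and example labels are illustrative.

```python
def confusion_matrix(actual: list[bool], predicted: list[bool]) -> dict[str, int]:
    """Count true/false positives and negatives for a binary classifier."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    counts = {"true_positive": 0, "false_positive": 0, "false_negative": 0, "true_negative": 0}
    for truth, guess in zip(actual, predicted):
        if truth and guess:
            counts["true_positive"] += 1
        elif not truth and guess:
            counts["false_positive"] += 1
        elif truth and not guess:
            counts["false_negative"] += 1
        else:
            counts["true_negative"] += 1
    return counts


# Made-up test data: whether a condition is present, and what the model said.
actual = [True, True, False, False, True]
predicted = [True, False, False, True, True]
print(confusion_matrix(actual, predicted))
```

Metrics such as precision, recall and the F1 score are all calculated from these four counts.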