idigitopia / idigitopia.github.io Goto Github PK

idigitopia.github.io's Issues

Errudite: Scalable, Reproducible, and Testable Error Analysis

paper

Brief Thoughts:

Motivation: Errors should be analyzed methodically rather than in an ad hoc way, as errors are the only training signal for understanding and improving ML systems.
Notion:
- Errors should be grouped, and the groups should be precisely defined for reproducibility.
- Large subsets of instances should be analyzed before making any conclusions.
- Hypotheses regarding the cause of errors should be explicitly tested.
At its core, Errudite is an expressive domain-specific language (DSL) for precisely querying instances based on linguistic features.
- The same notion can be defined for other domains such as images or handcrafted features as well, although this may need some more work.
- The work however is mostly focused on NLP error analysis.
The context presented by the authors is especially challenging as it plays with unstructured inputs and outputs.
Interactive user interface is a welcome addition to the framework.
Discussion: How can the DSL definition process be automated, and how do we verify that these approaches help without the need for extensive user studies?
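
The grouping idea in these notes can be sketched in plain Python. This is only an illustration of "precisely defined, reusable groups", not Errudite's actual DSL: the `Instance` fields, the `group_stats` helper, the token-count predicate, and the toy data are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Instance:
    question: str    # model input (hypothetical field)
    prediction: str  # model output
    answer: str      # gold answer

    @property
    def is_error(self) -> bool:
        return self.prediction != self.answer


def group_stats(instances: List[Instance],
                predicate: Callable[[Instance], bool]) -> dict:
    """Size and error rate of the group selected by `predicate`."""
    group = [x for x in instances if predicate(x)]
    errors = sum(x.is_error for x in group)
    return {
        "size": len(group),
        "errors": errors,
        "error_rate": errors / len(group) if group else 0.0,
    }


data = [
    Instance("Who wrote Hamlet?", "Shakespeare", "Shakespeare"),
    Instance("Who painted the Mona Lisa and in which century was it painted?",
             "da Vinci", "da Vinci, 16th century"),
    Instance("When did WW2 end?", "1945", "1945"),
    Instance("Who discovered penicillin and what year was that discovery made?",
             "1928", "Fleming, 1928"),
]


def long_question(x: Instance) -> bool:
    """A precisely defined group: questions longer than 8 whitespace tokens."""
    return len(x.question.split()) > 8


# Compare the group against its complement before drawing any conclusion.
stats_long = group_stats(data, long_question)
stats_rest = group_stats(data, lambda x: not long_question(x))
print(stats_long)
print(stats_rest)
```

Because the group is defined by a predicate instead of by hand-picked examples, the same definition can be re-run on another model or dataset, which is the reproducibility point made above.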

Visualizing Data using t-SNE

paper, demo1, demo2

Brief Thoughts:

Motivation: Map high-dimensional data to a low dimension (2 or 3) for visualization.
Can also be viewed as a clustering approach, in that low-dimensional representations of very similar data points are kept close together.
Preserves the local structure of the data while revealing global structure.
Uses a non-linear mapping of the underlying high-dimensional dataset instead of traditional linear approaches.
Tunable parameters include perplexity and learning rate.
- perplexity: 2 raised to the Shannon entropy of the conditional probability distribution induced by a Gaussian kernel centered on each point; roughly, the effective number of neighbors.
- intuitively it serves as a way to balance attention between local and global aspects of the data.
It is good for visualizing the data, but not directly usable to better optimize the model or compare two representations.
Cluster sizes and distances in the plot carry relatively little meaning and are not grounded in the dataset.
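
To make the perplexity note concrete, here is a minimal sketch that computes the Gaussian-kernel conditional distribution around one point and its perplexity. The function names and the toy points are illustrative assumptions. In t-SNE itself the user fixes the perplexity and the kernel width for each point is found by search; here the width `sigma` is set by hand to show the effect.

```python
import math


def conditional_probs(points, i, sigma):
    """P(j|i) under a Gaussian kernel of width sigma centered on points[i]."""
    weights = []
    for j, p in enumerate(points):
        if j == i:
            weights.append(0.0)  # a point is not its own neighbor
            continue
        sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], p))
        weights.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    total = sum(weights)
    return [w / total for w in weights]


def perplexity(probs):
    """2 ** (Shannon entropy in bits) of a probability distribution."""
    entropy = -sum(p * math.log2(p) for p in probs if p > 0.0)
    return 2.0 ** entropy


# Two close neighbors of point 0, plus a distant pair.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]

narrow = perplexity(conditional_probs(points, 0, sigma=0.5))
wide = perplexity(conditional_probs(points, 0, sigma=50.0))
print(narrow, wide)
```

With the narrow kernel only the two nearby points matter, so the perplexity is close to 2; with the wide kernel all four other points count almost equally, so it approaches 4. This is the local-versus-global balance described above.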

Neuroscience Chapter - RLBook 2020

book

Dendrite, Axons.
RL has not inspired Neuroscience or vice versa as of yet.
It is primarily an investigation into how we map findings and draw corollaries from Neuroscience to RL.
A fundamental question is: does the brain learn or adapt in a way similar to an RL agent?
Dopamine neurons have a massive connection pool to other neurons. The behavior of dopamine looks a lot like TD errors. So if reward is being passed by dopamine, it would be very similar to TD learning. (some evidence shown)
If a neuron fires, has a way of calculating a reward or penalty for that particular firing, and is also able to track which previous neuron firings affected it, then we can see the learning mechanism as an eligibility trace. (just a hypothesis)
The brain could also have actor critic mechanism with different parts acting as an actor or a critic. (just a hypothesis)
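
For reference, the two RL quantities these notes map onto the brain, the TD error and the eligibility trace, can be sketched with tabular TD(lambda). This is a generic textbook-style sketch on a hypothetical three-state chain, not a model of any neural circuit; the function name and parameters are illustrative.

```python
def td_lambda_episode(values, episode, alpha=0.1, gamma=0.9, lam=0.8):
    """One pass of tabular TD(lambda) with accumulating traces.

    `episode` is a list of (state, reward, next_state) transitions, with
    next_state set to None at termination. Updates `values` in place and
    returns the TD error observed at each step.
    """
    traces = {s: 0.0 for s in values}
    td_errors = []
    for state, reward, next_state in episode:
        next_value = 0.0 if next_state is None else values[next_state]
        # TD error: the signal the notes compare to dopamine activity.
        delta = reward + gamma * next_value - values[state]
        td_errors.append(delta)
        # Eligibility trace: decaying memory of recently visited states.
        for s in traces:
            traces[s] *= gamma * lam
        traces[state] += 1.0
        # Every recently visited state shares credit for the current error.
        for s in values:
            values[s] += alpha * delta * traces[s]
    return td_errors


values = {"A": 0.0, "B": 0.0, "C": 0.0}
episode = [("A", 0.0, "B"), ("B", 0.0, "C"), ("C", 1.0, None)]
errors = td_lambda_episode(values, episode)
print(errors, values)
```

The reward arrives only at the last step, yet states A and B are also updated because their traces are still non-zero. That is the "track which previous firings affected this one" idea from the hypothesis above.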

The What-If Tool: Interactive Probing of Machine Learning Models

paper

Brief Thoughts:

From: Google PAIR (People + AI Research Initiative)
Motivation: Code-free and visualization based probing of machine learning models.
Notions:
- Global measures = Classification Accuracy, precision-recall curve, logarithmic loss, ROC, KL divergence among others
- Local measures = subgroups in test data, cross-slices in test data, individual data point, perturbed data point, a different threshold/ optimization procedure of model, neighborhood of data points.
- moving from global measures to more local measures
Identify the limitations of the model and make it more explainable and fairer.
About the Tool:
- Runs in the browser. (easy jupyter notebook plugin)
- for multiple stakeholders. ML experts and non-experts.
- model agnostic.
- allows the comparison of models.
- allows slicing test datasets by features.
- allows counterfactual analysis (find the nearest point that is classified differently.)
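
The counterfactual idea in the last bullet can be sketched as a nearest-neighbor search restricted to points with a different predicted label. This is a minimal sketch and not the tool's API: the `nearest_counterfactual` helper, the L1 distance, the stand-in `predict` classifier, and the data are all assumptions made for illustration.

```python
def nearest_counterfactual(query, dataset, predict):
    """Closest dataset point (L1 distance) whose predicted label differs
    from the prediction for `query`; None if no such point exists."""
    target = predict(query)
    best, best_dist = None, float("inf")
    for point in dataset:
        if predict(point) == target:
            continue  # same label, so not a counterfactual
        dist = sum(abs(a - b) for a, b in zip(query, point))
        if dist < best_dist:
            best, best_dist = point, dist
    return best


def predict(x):
    """Hypothetical stand-in classifier over (income, debt) pairs."""
    income, debt = x
    return "approve" if income - debt > 10 else "deny"


dataset = [(30, 5), (12, 4), (25, 20), (18, 6), (40, 35)]
query = (15, 6)  # predicted "deny"

cf = nearest_counterfactual(query, dataset, predict)
print(query, predict(query), "->", cf, predict(cf))
```

Comparing the query with its counterfactual shows which feature changes are enough to flip the prediction, which works for any model that exposes a predict function.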

Rule Matrix: Visualizing and Understanding Classifiers with Rules.

paper, video

Brief Thoughts:

Motivation: Help solve the explainability and interpretability crisis.
Notion: Rule-based explanatory interface.
Generate a rule list with a rule induction algorithm, then filter the rule list by support and confidence.
Visualize at the end.
A rule has an antecedent and a consequent.
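
The support and confidence filtering step can be sketched as follows. The definitions used here are the common ones and are an assumption about what the notes intend: support is the fraction of rows matching the antecedent, and confidence is the fraction of those rows whose label matches the rule's output. The rules, data, and thresholds are hypothetical.

```python
def rule_stats(rows, antecedent, consequent):
    """Return (support, confidence) of the rule `antecedent -> consequent`."""
    covered = [r for r in rows if antecedent(r)]
    support = len(covered) / len(rows)
    correct = sum(1 for r in covered if r["label"] == consequent)
    confidence = correct / len(covered) if covered else 0.0
    return support, confidence


rows = [
    {"age": 25, "income": 20, "label": "no"},
    {"age": 45, "income": 80, "label": "yes"},
    {"age": 52, "income": 90, "label": "yes"},
    {"age": 33, "income": 60, "label": "no"},
    {"age": 61, "income": 30, "label": "no"},
    {"age": 48, "income": 75, "label": "yes"},
    {"age": 38, "income": 85, "label": "yes"},
    {"age": 29, "income": 55, "label": "no"},
]

# Each rule: (description, antecedent predicate, predicted label).
rules = [
    ("income > 70", lambda r: r["income"] > 70, "yes"),
    ("age > 40", lambda r: r["age"] > 40, "yes"),
    ("age > 60", lambda r: r["age"] > 60, "no"),
]

min_support, min_confidence = 0.25, 0.8
kept = []
for name, antecedent, consequent in rules:
    support, confidence = rule_stats(rows, antecedent, consequent)
    if support >= min_support and confidence >= min_confidence:
        kept.append(name)
print(kept)
```

Here "age > 40" is dropped for low confidence and "age > 60" for low support, leaving a shorter rule list to visualize.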
4 main components
- Control Panel
- The rule matrix (Data Flow, Rule Matrix, Support View)
- Data Filters
- Data Table.
Some Opinions:
- The interface seems to provide too much data at once, and the user is left not knowing what they are looking at until they know how the visualization works. (just a lengthy way of saying, I did not find it "intuitive")
- The rule matrix could use some labeling, rather than me having to remember that the x-axis is the feature and the y-axis is the rule.
- All in all, it is a fantastic example of information compression, but I do not see how this is going to make it easier for someone to understand the model.
- There should be someplace where I can see global metrics, such as the accuracy of the model.
- The tool, not the user, should be able to filter the data where the agent's accuracy is low.

Invited Talk: Offline RL by Nando de Freitas.

Offline RL Workshop, NeurIPS2020

Is Pessimism Provably Efficient for Offline RL?

paper

Overview

Question: Is it possible to design a provably efficient algorithm for offline RL under minimal assumptions on the dataset?

Offline RL suffers from insufficient coverage of the dataset, which eludes most existing theoretical analysis.
The authors propose a pessimistic variant of the value iteration algorithm (PEVI).
Its penalty function simply flips the sign of the bonus function used to promote exploration in online RL.
The paper establishes a data-dependent upper bound on the suboptimality of PEVI for general Markov decision processes (MDPs).
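
A tabular sketch may help show what "flipping the sign of the bonus" means in practice. This is not the paper's algorithm as stated: it assumes a small finite MDP with stationary dynamics, rewards in [0, 1], and a simple count-based penalty `beta / sqrt(n)` swapped in for the paper's uncertainty quantifier. The function name, `beta`, and the toy dataset are hypothetical.

```python
import math
from collections import defaultdict


def pessimistic_value_iteration(dataset, states, actions, horizon, beta=1.0):
    """Finite-horizon value iteration on the empirical model, with a
    count-based penalty subtracted from every Q estimate.

    `dataset` is a list of (s, a, r, s_next) tuples with r in [0, 1].
    Returns (policy, value): one {state: action} dict per step, and the
    pessimistic value of each state at the first step.
    """
    counts = defaultdict(int)
    reward_sum = defaultdict(float)
    next_counts = defaultdict(lambda: defaultdict(int))
    for s, a, r, s_next in dataset:
        counts[(s, a)] += 1
        reward_sum[(s, a)] += r
        next_counts[(s, a)][s_next] += 1

    value = {s: 0.0 for s in states}  # value after the final step is zero
    policy = []
    for h in range(horizon, 0, -1):
        q = {}
        for s in states:
            for a in actions:
                n = counts[(s, a)]
                if n == 0:
                    q[(s, a)] = 0.0  # never observed: assume the worst
                    continue
                r_hat = reward_sum[(s, a)] / n
                pv_hat = sum(c / n * value[s2]
                             for s2, c in next_counts[(s, a)].items())
                penalty = beta / math.sqrt(n)  # online bonus, sign flipped
                q[(s, a)] = min(max(r_hat + pv_hat - penalty, 0.0),
                                horizon - h + 1)
        step_policy = {s: max(actions, key=lambda a: q[(s, a)])
                       for s in states}
        value = {s: q[(s, step_policy[s])] for s in states}
        policy.insert(0, step_policy)
    return policy, value


# "risky" looks best on its empirical mean, but was observed only once.
dataset = [("s", "safe", 0.6, "s")] * 100 + [("s", "risky", 1.0, "s")]
policy, value = pessimistic_value_iteration(
    dataset, states=["s"], actions=["safe", "risky"], horizon=1)
print(policy[0]["s"], value["s"])
```

A greedy planner on the empirical means would pick "risky" from a single lucky sample. The penalty removes that advantage, so the well-covered "safe" action is chosen instead.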

Intro:

Online RL requires massive data, hence difficult to apply in the real world.
Offline RL starts with a fixed dataset, but remains less understood in theory.
The suboptimality of any algorithm in RL comes from 3 sources:
- Intrinsic uncertainty: the dataset fails to carry trajectories that can be extrapolated to get the optimal policy.
- Spurious correlation: the dataset carries trajectories that, when extrapolated, misguide the agent and result in a bad policy.
- Optimization error: the dataset has everything, but you cannot optimize the function approximator.
Pessimism allows PEVI to eliminate the spurious correlation.
PEVI is minimax optimal for linear MDPs, up to multiplicative factors of the dimension and horizon.

Guidelines for Human-AI Interaction

paper, demo

Context:
- AI is probabilistic and prone to errors. (For example, weather reports are probabilistic in nature and not always accurate: "40% chance of rainfall".)
- AI learns and changes over time. (For example, my YouTube recommendations are never of the same quality.)
- AI can violate established principles of traditional UI design. (For example, conditions, consistency.)
Notion: "Issues caused by AI frequently can't and don't need to be solved by more fancy algorithms, but by good design that mitigates them."
Motivation: We wish to go from Principia to pragmatism, converting an extensive body of literature on designing for human-AI interaction into 18 actionable guidelines.
Guidelines:
- Initially: When the UI is being designed.
- During Interaction: When the user interacts with the UI.
- When wrong: When the output of the AI is wrong.
- Over time: Human - AI interaction over a period of time.
"When wrong" is probably the most important section for explainability and interpretability as it is more crucial to be able to interact with the AI better when it is wrong or harder to understand.
In summary, these guidelines help one focus on the Human needs of AI design.

ReBeL - Combining Deep Reinforcement Learning and Search for Imperfect-Information Games

[link], [video]

Brief Thoughts:

Motivation: simple heuristic search with bootstrapping values (Monte Carlo Tree Search) may not recover the optimal policy for imperfect information games.
