asappresearch / flambe
An ML framework to accelerate research and its path to production.
Home Page: https://flambe.ai
License: MIT License
Is your feature request related to a problem? Please describe.
Currently, you can't pass in positional arguments when running a script.
Describe the solution you'd like
Let's add a new argument to Script called pos_args, which is a list of strings. So it'd look like:
my_project: /path/to/my_pip_installable
---
!Experiment
pipeline:
stage_0: !Script
script: my_project.train # my_project is the name of the module
args:
pos_args:
- pos1
- pos2
arg1: 'foo'
arg2: 'bar'
which would result in the call: script pos1 pos2 --arg1 foo --arg2 bar
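As a rough illustration of the requested behavior, the sketch below assembles a command line from a list of positional arguments followed by keyword arguments. The function name build_script_command and its parameters are hypothetical; this is not how flambe's Script is implemented.

```python
from typing import Dict, List, Optional, Sequence


def build_script_command(script: str,
                         pos_args: Sequence[str] = (),
                         kwargs: Optional[Dict[str, object]] = None) -> List[str]:
    """Return an argv list: script, positional arguments, then --key value pairs."""
    command = [script]
    # Positional arguments keep the order given in the config
    command.extend(str(arg) for arg in pos_args)
    # Keyword arguments become "--name value" pairs (dicts preserve insertion order)
    for name, value in (kwargs or {}).items():
        command.extend([f"--{name}", str(value)])
    return command


argv = build_script_command("script", ["pos1", "pos2"], {"arg1": "foo", "arg2": "bar"})
print(" ".join(argv))  # script pos1 pos2 --arg1 foo --arg2 bar
```

Keeping the positional arguments in their own list avoids any ambiguity about their order relative to the keyword arguments.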
Describe alternatives you've considered
Making my script only take keyword arguments. This is possible, but I think it'd help flambe to support positional arguments.
Additional context
N/A
Is your feature request related to a problem? Please describe.
I forgot to set up __init__.py properly in my custom extension, and the error message is not quite obvious about what went wrong.
Describe the solution you'd like
It'd be great to have an example of an __init__.py in the doc just like the one for setup.py.
Describe alternatives you've considered
Asking @nmatthews-asapp, who helped me out!
Additional context
The current implementation where the encoder is part of the embedder prevents "one embedder, two encoder" implementations. As the embedder is oftentimes the largest single matrix in an NLP model, this can lead to an unnecessary increase in memory usage.
Describe the bug
An error pops up when calling the forward method of AvgPooling.
To Reproduce
Steps to reproduce the behavior:
Use the layer with any matching dimensions.
return data / padding_mask.sum(dim=1)
RuntimeError: The size of tensor a (300) must match the size of tensor b (64) at non-singleton dimension 1
Expected behavior
The result should be the average, given the padding_mask.
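To make the expected behavior concrete, here is a minimal pure-Python sketch of a masked average over the time axis. It uses nested lists instead of tensors so it runs without torch, and the function name masked_average is hypothetical. The shapes in the error message (300 vs 64) suggest, as an assumption, that the per-example count of shape [batch] is being broadcast against summed data of shape [batch, hidden]; in tensor code the count would need a trailing dimension of size 1 for the division to broadcast.

```python
from typing import List


def masked_average(data: List[List[List[float]]],
                   padding_mask: List[List[int]]) -> List[List[float]]:
    """Average over time, counting only the positions where the mask is 1.

    data has shape [batch][time][hidden]; padding_mask has shape [batch][time].
    """
    averaged = []
    for sequence, mask in zip(data, padding_mask):
        totals = [0.0] * len(sequence[0])
        for vector, keep in zip(sequence, mask):
            if keep:
                for i, value in enumerate(vector):
                    totals[i] += value
        count = max(sum(mask), 1)  # avoid dividing by zero for all-padding rows
        # One count per example, applied to every hidden dimension of that example
        averaged.append([total / count for total in totals])
    return averaged


batch = [[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]]
mask = [[1, 1, 0]]  # the last time step is padding
print(masked_average(batch, mask))  # [[2.0, 3.0]]
```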
Software Versions (please complete the following information):
Hi,
if I try to load a Component registered through make_component, I obtain a TaggedScalar instead of a Schema instance. For example:
test.yaml:
!ray.HyperBandScheduler
test.py:
schema = yaml.load(open("test.yaml"))
print(schema)
# Output: <ruamel.yaml.comments.TaggedScalar at 0x7fe8db40b310>
# Expected output: flambe.compile.component.Schema(...)
How can I obtain a Schema instance?
Currently, the tests for a couple of datasets are disabled (see #126).
Once we make Circle CI work with the request for the zip files, we should add them back.
Is your feature request related to a problem? Please describe.
Currently, customizing a model sometimes requires overriding the Trainer. It would be best if the Trainer was an object that users didn't have to override. Furthermore, it would be good to be able to set more default values across the board. The Trainer is very generic, which is great, but relying on the model more would simplify the interfaces and improve user experience. The Trainer would be in charge of executing training, fp16, model parallelization, some logging, etc.
Describe the solution you'd like
The solution is to create a Model class with a set of methods (some of them optional) to be called by the Trainer. This has the added benefit that the new model objects can implement defaults for common parameters such as the loss function or sampler.
Here are a set of potential methods for the model class:
Some questions to answer:
The debug mode is already very useful in its current state. However, it would be great if it was extended to include the following features:
config.yaml that will only load the first k% of the data (i.e., only load a small part from disk). This will make debugging much faster, as the model's forward call will be triggered much earlier
config.yaml, such as debugging-specific dimensionalities that could be much smaller, allowing for faster load time and the use of smaller GPUs.
cluster.yaml is specified, it would make sense to issue an early warning, but otherwise disable debug mode and proceed normally. This would override this issue.
Possible solutions:
First solution:
--debug parameter to the runner/run.py / flambe scripts that triggers debug mode instead of setting the mode via the config
config.yaml that can override each parameter
--debug-file parameter that enables debug mode and takes the overrides from a 2nd file
Second solution:
Add a special character combination (like !g) that allows you to set two values in the config.yaml, where the first is for non-debug mode, and the second is for debug mode. E.g.:
model: !RNN
  hidden_dim: !d [600, 100]
  num_layers: !d [4, 1]
would set the hidden dim to 100 and the num layers to 1 in debug mode.
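One way to picture the second solution is a small resolver that walks the loaded config and picks the first or second value of every tagged pair depending on the mode. The sketch below assumes the !d tag has already been parsed into a hypothetical DebugChoice object; neither DebugChoice nor resolve_debug exists in flambe.

```python
from typing import Any, NamedTuple


class DebugChoice(NamedTuple):
    """Hypothetical in-memory form of `!d [normal, debug]`."""
    normal: Any
    debug: Any


def resolve_debug(config: Any, debug: bool) -> Any:
    """Recursively replace every DebugChoice with the value for the active mode."""
    # DebugChoice must be checked first: a NamedTuple is also a tuple
    if isinstance(config, DebugChoice):
        return config.debug if debug else config.normal
    if isinstance(config, dict):
        return {key: resolve_debug(value, debug) for key, value in config.items()}
    if isinstance(config, list):
        return [resolve_debug(value, debug) for value in config]
    return config


config = {"model": {"hidden_dim": DebugChoice(600, 100), "num_layers": DebugChoice(4, 1)}}
print(resolve_debug(config, debug=True))   # {'model': {'hidden_dim': 100, 'num_layers': 1}}
print(resolve_debug(config, debug=False))  # {'model': {'hidden_dim': 600, 'num_layers': 4}}
```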
Describe the bug
As one example, Flambe passes attention_mask into the forward method of GPT2Model.
This parameter is only available in the version of this model as found in the transformers library.
Flambe imports from pytorch_transformers, and so does not interact with this model (and potentially others?).
To Reproduce
!Experiment
name: sst-text-classification
pipeline:
  # stage 0 - Load the Stanford Sentiment Treebank dataset and run preprocessing
  dataset: !SSTDataset
    transform:
      text: !GPT2TextField
        alias: gpt2
      label: !LabelField
  # Stage 1 - Define a model
  model: !TextClassifier
    embedder: !GPT2Embedder
      alias: gpt2
    output_layer: !SoftmaxLayer
      input_size: !@ model[embedder].hidden_size
      output_size: !@ dataset.label.vocab_size
  # Stage 2 - Train the model on the dataset
  train: !Trainer
    dataset: !@ dataset
    model: !@ model
    train_sampler: !BaseSampler
      batch_size: 3
      # drop_last: true
    val_sampler: !BaseSampler
      batch_size: 3
      # drop_last: true
    loss_fn: !torch.NLLLoss
    metric_fn: !Accuracy
    optimizer: !torch.Adam
      params: !@ train[model].trainable_params
    max_steps: 1
    iter_per_step: 1
Error
...
    pred, target = self.model(*batch)
  File "/persist/conda/envs/flambe/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/persist/git/flambe/flambe/nlp/classification/model.py", line 71, in forward
    encoding = self.embedder(data)
  File "/persist/conda/envs/flambe/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/persist/git/flambe/flambe/nlp/transformers/utils.py", line 162, in forward
    head_mask=head_mask)
  File "/persist/conda/envs/flambe/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
TypeError: forward() got an unexpected keyword argument 'attention_mask'
I was just thinking, maybe it would be convenient to have a standard location where flambe will look for the cluster.yaml, so then you wouldn't need to mention it in each flambe command.
Describe the bug
When I specify activation layers for the MLPEncoder, I see the following error:
2019-09-03 11:51:52,138 ERROR trial_runner.py:487 -- Error processing event.
Traceback (most recent call last):
File "/home/peter/code/flambe/venv/lib/python3.6/site-packages/ray/tune/trial_runner.py", line 436, in _process_trial
result = self.trial_executor.fetch_result(trial)
File "/home/peter/code/flambe/venv/lib/python3.6/site-packages/ray/tune/ray_trial_executor.py", line 323, in fetch_result
result = ray.get(trial_future[0])
File "/home/peter/code/flambe/venv/lib/python3.6/site-packages/ray/worker.py", line 2195, in get
raise value
ray.exceptions.RayTaskError: ray_worker (pid=9937, host=peter-MS-7758)
File "/home/peter/code/flambe/venv/lib/python3.6/site-packages/ray/tune/trainable.py", line 87, in __init__
self._setup(copy.deepcopy(self.config))
File "/home/peter/code/flambe/flambe/experiment/tune_adapter.py", line 76, in _setup
block.load_state(state)
File "/home/peter/code/flambe/flambe/compile/component.py", line 1093, in load_state
load(self)
File "/home/peter/code/flambe/flambe/compile/component.py", line 1092, in load
load(child, prefix + name + STATE_DICT_DELIMETER)
File "/home/peter/code/flambe/flambe/compile/component.py", line 1092, in load
load(child, prefix + name + STATE_DICT_DELIMETER)
File "/home/peter/code/flambe/flambe/compile/component.py", line 1092, in load
load(child, prefix + name + STATE_DICT_DELIMETER)
[Previous line repeated 1 more time]
File "/home/peter/code/flambe/flambe/compile/component.py", line 1083, in load
unexpected_keys, error_msgs)
File "/home/peter/code/flambe/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 685, in _load_from_state_dict
hook(state_dict, prefix, local_metadata, strict, missing_keys, unexpected_keys, error_msgs)
File "/home/peter/code/flambe/flambe/compile/component.py", line 977, in _load_state_dict_hook
version = local_metadata[VERSION_KEY].split('.')
KeyError: '_flambe_version'
To Reproduce
Steps to reproduce the behavior:
Use the following yaml file:
!Experiment
name: sst-text-classification
pipeline:
  # stage 0 - Load the Stanford Sentiment Treebank dataset and run preprocessing
  dataset: !SSTDataset
    transform:
      text: !TextField
      label: !LabelField
  # Stage 1 - Define a model
  model: !TextClassifier
    embedder: !Embedder
      embedding: !torch.Embedding # automatically use pytorch classes
        num_embeddings: !@ dataset.text.vocab_size
        embedding_dim: 300
      embedding_dropout: 0.3
      encoder: !MLPEncoder
        input_size: 300
        output_size: 128
        output_activation: !torch.ReLU
        n_layers: 2
        hidden_size: 256
        hidden_activation: !torch.ReLU
    output_layer: !SoftmaxLayer
      input_size: 128
      output_size: !@ dataset.label.vocab_size
  # Stage 2 - Train the model on the dataset
  train: !Trainer
    dataset: !@ dataset
    model: !@ model
    train_sampler: !BaseSampler
    val_sampler: !BaseSampler
    loss_fn: !torch.NLLLoss
    metric_fn: !Accuracy
    optimizer: !torch.Adam
      params: !@ train.model.trainable_params
    max_steps: 10
    iter_per_step: 100
  # Stage 3 - Eval on the test set
  eval: !Evaluator
    dataset: !@ dataset
    model: !@ train.model
    metric_fn: !Accuracy
    eval_sampler: !BaseSampler
# Define how to schedule variants
schedulers:
  train: !tune.HyperBandScheduler
Expected behavior
No error should be raised.
Screenshots
Software Versions (please complete the following information):
Additional context
When running on a cluster, clicking both the "download" buttons on the report site can lead to a crash of parts of the webapp.
steps to reproduce:
Oops! Results not available
Sometimes our custom save format doesn't work well for certain architectures, some users would simply prefer pickle, and because the save format isn't as mature as pickle, it can have bugs. For all these reasons, we should enable pickle checkpointing in Experiment.
Describe the bug
If you try to follow this example, you get an error Unexpected key(s) in state_dict. This is b/c of a bug in this if clause:
flambe/flambe/compile/component.py
Lines 1264 to 1269 in aad135b
The else clause is never hit if module is flambe.nn.Module b/c it inherits from torch.nn.Module, and therefore any custom attribute is treated as an unexpected key.
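The actual lines from component.py are not reproduced here, so the following is only a sketch of the general pitfall: when a subclass and its parent are told apart with isinstance, testing the parent first shadows the subclass branch. TorchModule and FlambeModule are stand-ins for torch.nn.Module and flambe.nn.Module, and both functions are hypothetical.

```python
class TorchModule:
    """Stand-in for torch.nn.Module."""


class FlambeModule(TorchModule):
    """Stand-in for flambe.nn.Module, which inherits from the torch class."""


def branch_buggy(module: object) -> str:
    # A FlambeModule is also a TorchModule, so the else branch is never reached
    if isinstance(module, TorchModule):
        return "torch"
    else:
        return "flambe"


def branch_fixed(module: object) -> str:
    # Test the more specific class first, then fall back to the parent
    if isinstance(module, FlambeModule):
        return "flambe"
    if isinstance(module, TorchModule):
        return "torch"
    return "other"


print(branch_buggy(FlambeModule()))  # torch
print(branch_fixed(FlambeModule()))  # flambe
```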
To Reproduce
Follow the example in the doc.
Expected behavior
The custom attribute should be loaded correctly.
I'm often seeing the following ValueError:
File "/persist/git/flambe/flambe/experiment/progress.py", line 66, in refresh
k, v = h.split('=')
ValueError: not enough values to unpack (expected 2, got 1)
This happens when viewing TensorBoard in browser, and refreshing the page; the above error is logged to console by the server. This doesn't break anything, but it seems like a bug all the same.
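The traceback shows that split('=') returned a single element, meaning some item contained no '='. What h holds in progress.py is not shown here, so the sketch below is only a generic, defensive way to parse such key=value items; parse_pairs is a hypothetical helper.

```python
from typing import Dict, Iterable


def parse_pairs(items: Iterable[str]) -> Dict[str, str]:
    """Parse 'key=value' strings, skipping items that have no '='."""
    pairs = {}
    for item in items:
        # partition never raises: it returns (head, separator, tail)
        key, separator, value = item.partition("=")
        if not separator:
            continue  # no '=' in this item, so there is nothing to unpack
        pairs[key] = value
    return pairs


print(parse_pairs(["a=1", "flag", "b=x=y"]))  # {'a': '1', 'b': 'x=y'}
```

Unlike split('='), partition also copes with values that contain a further '=' sign.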
Is your feature request related to a problem? Please describe.
Use-case: I use flambé to both debug models then grid-search over the stuff I'm happy with.
To debug, I often need to see the predictions the model is making. This includes (in a classification problem) the predicted index and a map from the index to its label.
Describe the solution you'd like
Some option in Trainer (re: predicting on the val set) and Evaluator (re: predicting on the test set) that logs predictions for me in a thorough manner: all the things I'd want to inspect offline, in other words. This would include the inputs, the full predicted output, and the target.
Thereafter, I would load this data into (say) a notebook, and start to inspect what's going on.
Describe the bug
Because of a new safety check introduced in #136, checkpointing may fail when overwriting a previous save file. It should be possible to overwrite a file when saving. Whether this should be opt-in or opt-out is open for discussion.
Describe the bug
If you use relative paths for local resources, flambe components can no longer find those files because the cwd is the artifact directory.
To Reproduce
Use relative paths when specifying local resources. Components that use them would raise a file not found error.
Expected behavior
Components that use these local resources should be able to find them.
Screenshots
N/A
Software Versions (please complete the following information):
Additional context
N/A
Is your feature request related to a problem? Please describe.
The BaseSampler will handle arrays of tensors elegantly. However, these arrays only come to be if you manually construct them yourself.
Describe the solution you'd like
To be able to have an array-as-one-observation, and for the TextField to handle that gracefully (then pass it along to the BaseSampler).
Additional context
Models that embed a large sequence of context utterances, where simply joining those utterances into one string makes for one really long string.
Describe the bug
If you run the reporting server without tensorboard, you see Flask logs like so:
* Serving Flask app "flambe.experiment.webapp.app" (lazy loading)
* Environment: production
WARNING: This is a development server. Do not use it in a production deployment.
Use a production WSGI server instead.
* Debug mode: off
* Running on http://localhost:12345/ (Press CTRL+C to quit)
127.0.0.1 - - [23/Nov/2019 04:50:33] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [23/Nov/2019 04:50:34] "GET /state HTTP/1.1" 200 -
However, if you run the reporting server with tensorboard installed, for some reason, the reporting server stops showing Flask logs.
* Serving Flask app "flambe.experiment.webapp.app" (lazy loading)
* Environment: production
WARNING: This is a development server. Do not use it in a production deployment.
Use a production WSGI server instead.
* Debug mode: off
The reporting server and tensorboard function fine otherwise. However, this is a bit annoying b/c I can't see which port the reporting server runs on if I don't specify it.
To Reproduce
Expected behavior
Flask should log even with tensorboard.
Screenshots
N/A
Software Versions (please complete the following information):
Additional context
N/A
Is your feature request related to a problem? Please describe.
Currently the experiment always progresses to later stages even if there was a trial error. This is sometimes okay, but there should be some kind of opt-in feature to stop as soon as a trial fails to avoid wasted computation.
Describe the solution you'd like
Opt-in flag on Experiment that can be set in either the config or via the CLI
Describe the bug
When running a Builder, the extensions are not getting registered as in: https://github.com/Open-ASAPP/flambe/blob/ccba9762d2f27e8898d17e26bf3a14eae0b55355/flambe/experiment/experiment.py#L279
This causes the config.yaml file not to contain the extension section and therefore, the output folder fails when being used with flambe.load.
To Reproduce
Execute any Builder where the component is coming from an extension
Software Versions (please complete the following information):
Is your feature request related to a problem? Please describe.
Currently, the flambe report webpage displays a dependency graph of all the stages in the pipeline. While it looks cool, once you have more than three stages, it's pretty much impossible to read it.
Describe the solution you'd like
I think a simpler tree, like the ones you see on CircleCI and other CI/CD websites, is a lot easier to read and more useful.
Describe alternatives you've considered
I don't pay attention to the graph b/c it's too complex to understand anyway.
Additional context
Describe the bug
This line
https://github.com/asappresearch/flambe/blob/master/flambe/compile/serialization.py#L369
has a potential risk of removing undesired stuff from the file system.
Solution
We need to uncompress the gz in a temp folder and let the OS do the cleanup.
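A minimal sketch of that idea, assuming the save file is a gzip-compressed tar archive: extract into a tempfile.TemporaryDirectory so that no hand-written delete call ever touches a user-supplied path. The helper read_from_archive is hypothetical, and in this variant cleanup happens when the context manager exits rather than being left to the OS.

```python
import tarfile
import tempfile
from pathlib import Path


def read_from_archive(archive_path: str, member: str) -> bytes:
    """Extract a .tar.gz into a throwaway directory and return one member's bytes."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        with tarfile.open(archive_path, "r:gz") as archive:
            # Only do this for archives you trust; extractall writes whatever is inside
            archive.extractall(tmp_dir)
        # Read while the temporary directory still exists
        return (Path(tmp_dir) / member).read_bytes()
    # On exit the temporary directory is removed; nothing else on disk is deleted
```

Because the only directory ever removed is one that the function created itself, a wrong or unexpected path can no longer lead to deleting unrelated files.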
Is your feature request related to a problem? Please describe.
Currently it's not very explicit how links are resolved, i.e. whether they are resolved against the config (nested dictionary structure) or against the attributes of the objects once they are initialized. The answer is both; it depends on what you're linking to in the config and where that link is.
Describe the solution you'd like
Make links more powerful and more intuitive by supporting pointing to a specific object in the config and then accessing attributes on that object. This will look like:
!@ train_stage[model][embedder][encoder].rnn.hidden_size
for example. The brackets work similarly to brackets in normal Python: they access a specific object in the nested dictionary structure created from the YAML config. The dot notation is then used to access attributes on the initialized object.
Describe alternatives you've considered
We've considered leaving it implicit, or dropping the attribute access altogether, but these limitations are confusing and, well, limiting. We also considered other syntax for exactly the same logic, but this familiar syntax should be intuitive to new users as it mirrors how things are done in Python.
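The proposed syntax can be illustrated with a small parser that splits a link into its bracketed config keys and its dotted attribute path, then resolves both in order. This is only a sketch of the semantics described above; parse_link and resolve_link are hypothetical names, and plain dictionaries and SimpleNamespace objects stand in for the compiled config.

```python
import re
from types import SimpleNamespace
from typing import Any, List, Tuple

# root, then zero or more [key] parts, then zero or more .attr parts
_LINK = re.compile(r"^(?P<root>\w+)(?P<keys>(?:\[\w+\])*)(?P<attrs>(?:\.\w+)*)$")


def parse_link(link: str) -> Tuple[List[str], List[str]]:
    """Split a link into (config keys, attribute names)."""
    match = _LINK.match(link)
    if match is None:
        raise ValueError(f"Malformed link: {link!r}")
    keys = [match.group("root")] + re.findall(r"\[(\w+)\]", match.group("keys"))
    attrs = re.findall(r"\.(\w+)", match.group("attrs"))
    return keys, attrs


def resolve_link(link: str, config: dict) -> Any:
    """Follow the bracket keys through the config, then the dotted attributes."""
    keys, attrs = parse_link(link)
    target: Any = config
    for key in keys:
        target = target[key]            # nested dictionary access
    for attr in attrs:
        target = getattr(target, attr)  # attribute access on the initialized object
    return target


encoder = SimpleNamespace(rnn=SimpleNamespace(hidden_size=256))
config = {"train_stage": {"model": {"embedder": {"encoder": encoder}}}}
print(resolve_link("train_stage[model][embedder][encoder].rnn.hidden_size", config))  # 256
```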
It would be great if we could extract torchscript models from flambe models, after training. That requires all flambe base models to be compatible with torchscript. After a very superficial attempt to make this work, I found that (at least) some typing instructions in flambe.component are torchscript incompatible.
Is your feature request related to a problem? Please describe.
No
Describe the solution you'd like
When using sru as rnn type it could be useful to expose an activation parameter:
https://github.asapp.dev/ASAPPinc/prodml/blob/062c2d10f0475acd9f2fca9659bebf1450951038/asapp/model/sru/sru_functional.py#L444
Describe the bug
You can't quite use relative paths with TabularDataset.from_path b/c the current working directory is the output directory.
os.getcwd()
'/path/to/your/project/flambe-output/output__experiment/dataset/0_2019-08-06_20-56-50cyx6metq'
If you use a relative path like ../../../../data, it'd work, but this is not really intuitive.
To Reproduce
Steps to reproduce the behavior:
!Experiment
name: experiment
pipeline:
  dataset: !TabularDataset.from_path
    train_path: data/train.csv # any local path
    val_path: data/val.csv # any local path
    test_path: data/test.csv # any local path
(pid=15596) Warning: failed to load file {file_path}
(pid=15596) [Errno 2] File b'data/train.csv' does not exist: b'data/train.csv'
Expected behavior
The expectation is that the relative path would start from the directory where the yaml file is.
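The expected behavior could be sketched as follows: relative resource paths are anchored at the directory of the YAML file rather than at the current working directory. The helper resolve_resource is hypothetical and would have to run before the working directory is switched to the output folder.

```python
from pathlib import Path


def resolve_resource(path: str, config_file: str) -> Path:
    """Interpret a relative resource path relative to the config file's directory."""
    candidate = Path(path)
    if candidate.is_absolute():
        return candidate  # absolute paths are left untouched
    # absolute() pins a relative config path to the current cwd at load time
    return Path(config_file).absolute().parent / candidate


print(resolve_resource("data/train.csv", "/path/to/your/project/experiment.yaml"))
# /path/to/your/project/data/train.csv
```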
This is a feature request bordering on a bug. Right now, flambe does not allow tracking metrics during training. This, however, is essential to monitor learning.
One problem that I see is that it does not make sense to compute the train metrics after an entire train epoch, as flambe does for test/eval metrics. Given the size of some datasets, this is not really feasible.
Consequently, the interface to the metrics needs to accommodate incremental computation of the metrics. That, in turn, requires a decision as to how this should be implemented, partly because not every metric supports incremental computation (think: AUC).
Unfortunately, incremental computation requires keeping track of previous computations, i.e., we need a state that we update incrementally.
Off the top of my head, these are the choices we have:
First option: make the metrics stateful.
incremental method, added to the metric, could be used to update the metric
Second option: add a metric-state object.
incremental call of the metric (and any other, possibly, to have a uniform interface)
Third option: add local tracking for each metric
(I don't think this is a good option, but I wanted to mention it for completeness)
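To make the first option more tangible, here is a minimal sketch of a stateful metric with an incremental method. RunningAccuracy and its method names are hypothetical and not part of flambe's metric interface; accuracy is used because it decomposes cleanly over batches, which is not true of every metric (AUC being the example mentioned above).

```python
from typing import Sequence


class RunningAccuracy:
    """Accuracy that is updated batch by batch instead of over a full epoch."""

    def __init__(self) -> None:
        self.correct = 0
        self.total = 0

    def incremental(self, predictions: Sequence[int], targets: Sequence[int]) -> None:
        """Fold one batch into the running state."""
        self.correct += sum(int(p == t) for p, t in zip(predictions, targets))
        self.total += len(targets)

    def finalize(self) -> float:
        """Return the metric over everything seen since the last reset."""
        return self.correct / self.total if self.total else 0.0

    def reset(self) -> None:
        self.correct = 0
        self.total = 0


metric = RunningAccuracy()
metric.incremental([1, 0], [1, 1])  # 1 of 2 correct
metric.incremental([2, 2], [2, 2])  # 2 of 2 correct
print(metric.finalize())  # 0.75
```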
It is frustrating that the debug: True flag is only tested once the cluster has started up. It would make more sense to check for debug: True (and other cluster-relevant flags) before starting the cluster.
I think that the same could be said for any error that results from a problem with the experiment yaml. Another example is passing a wrong argument name.
A possible solution would be a "dry run" of the config locally first, maybe.
Describe the bug
Line 146 in 703c343
raises this error with torch 1.3:
RuntimeError: Negation, the `-` operator, on a bool tensor is not supported. If you are trying to invert a mask, use the `~` or `logical_not()` operator instead.
Software Versions (please complete the following information):
Describe the bug
I just did a clean, fresh install of flambe 0.4.6 and I see this error message:
botocore 1.13.13 has requirement python-dateutil<2.8.1,>=2.1; python_version >= "2.7", but you'll have python-dateutil 2.8.1 which is incompatible.
flambe itself works fine.
To Reproduce
Steps to reproduce the behavior:
pip install flambe==0.4.6
Expected behavior
We shouldn't see any error message.
Software Versions (please complete the following information):
Describe the bug
Currently links inside of lists don't get their targets updated properly before compilation. This should be easily fixable by having the traverse method used in this utility function also recurse on sequences.
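The suggested fix can be pictured with a generic traversal that descends into sequences as well as mappings. This is only a sketch on plain Python containers; the function below shares the name traverse for readability but is not flambe's implementation.

```python
from typing import Any, Iterator, Tuple


def traverse(node: Any, path: Tuple = ()) -> Iterator[Tuple[Tuple, Any]]:
    """Yield (path, node) for every node, recursing into dicts and sequences."""
    yield path, node
    if isinstance(node, dict):
        for key, value in node.items():
            yield from traverse(value, path + (key,))
    elif isinstance(node, (list, tuple)):
        # Without this branch, anything nested inside a list is never visited
        for index, value in enumerate(node):
            yield from traverse(value, path + (index,))


config = {"metrics": [{"link": "dataset.label.vocab_size"}, "accuracy"]}
for path, node in traverse(config):
    if isinstance(node, str):
        print(path, node)
```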
tmux is currently used to launch the processes in the clusters (flambe and flambe-site), and on some occasions the process returns a non-success exit code even though it was actually launched correctly.
There is a current fix that checks if the process is running (regardless of the exit code), but this may not be the best solution for the problem.
Describe the bug
The base sampler produces the following warning:
py.warnings [block_train] /home/ubuntu/.local/lib/python3.6/site-packages/flambe/sampler/base.py:167: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
tensors = [torch.tensor(example) for example in column]
To Reproduce
Steps to reproduce the behavior:
Simply run the base sampler with verbose logging.
Expected behavior
There should be no warning.
Screenshots
N/A
Software Versions (please complete the following information):
Additional context
N/A
Is your feature request related to a problem? Please describe.
When a config contains a duplicate key, flambe fails with:
ruamel.yaml.constructor.ConstructorError: could not determine a constructor for the tag xxxx
I expect this happens with configs that have a Runnable from an extension as the top-level object.
Describe the solution you'd like
The error message should be clear on what's happening:
The provided configuration file contains duplicated keys 'xxxx'
Is your feature request related to a problem? Please describe.
Currently registering multiple classes of the same name with YAML will not work properly. We should allow multiple distinct classes of the same name to be registered in different namespaces, e.g. "NLLLoss" and "torch.NLLLoss"
Describe the solution you'd like
The registry should be a separate entity (we should not rely on ruamel.yaml to maintain the registry) so that we can easily manage these cases. The registry will be implemented to organize namespaces as a mapping for tags and classes. Whenever ruamel.yaml is needed, the registry can be synced with yaml to ensure it's up to date.
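A registry of this kind could be sketched as a mapping from namespace to tag to class. The Registry class below and its method names are hypothetical; syncing with ruamel.yaml is left out and would be a separate step that iterates over the registered entries.

```python
from typing import Dict, Optional


class Registry:
    """Maps namespace -> tag -> class, so equal names can coexist."""

    def __init__(self) -> None:
        self._namespaces: Dict[str, Dict[str, type]] = {}

    def register(self, cls: type, tag: Optional[str] = None, namespace: str = "") -> None:
        tag = tag or cls.__name__
        classes = self._namespaces.setdefault(namespace, {})
        if tag in classes and classes[tag] is not cls:
            raise ValueError(f"Tag {tag!r} is already registered in namespace {namespace!r}")
        classes[tag] = cls

    def lookup(self, qualified_tag: str) -> type:
        # "torch.NLLLoss" -> ("torch", "NLLLoss"); "NLLLoss" -> ("", "NLLLoss")
        namespace, _, tag = qualified_tag.rpartition(".")
        return self._namespaces[namespace][tag]


class NLLLoss:
    """A user-defined loss registered in the default namespace."""


TorchNLLLoss = type("NLLLoss", (), {})  # a distinct class with the same name

registry = Registry()
registry.register(NLLLoss)
registry.register(TorchNLLLoss, namespace="torch")
print(registry.lookup("NLLLoss") is NLLLoss)             # True
print(registry.lookup("torch.NLLLoss") is TorchNLLLoss)  # True
```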
Is your feature request related to a problem? Please describe.
For most research, new models/algorithms/etc are tested on specific "standard" datasets.
Describe the solution you'd like
flambe-integrated datasets that could be used by, e.g., having the following config.yaml entry: dataset: !flambe.MultiNLI
Additionally, it would be great to have this functionality for entire suites of datasets, such as: dataset: !flambe.GLUE_ExperimentSuite
Is your feature request related to a problem? Please describe.
Each metric in extra_validation_metrics (provided to the default Trainer) must be of a different class, because each is written to Tensorboard using its class name. This prohibits logging different metrics of the same class with different configurations, because they'll be overwritten.
Describe the solution you'd like
Make extra_validation_metrics a key-value mapping rather than a list.
Describe alternatives you've considered
This could be handled by a custom Trainer, or each metric with different parameters could be implemented as a separate class.
Additional context
Is your feature request related to a problem? Please describe.
There is no clean unified interface for datasets.
Describe the solution you'd like
A unified interface should include the following properties:
Is your feature request related to a problem? Please describe.
The flambe-output folders take up a lot of space.
Describe the solution you'd like
An interactive flambe clean tool; upon invocation, it prints the locations of all flambe-output folders in a specified directory, their respective sizes, and a Y/n option to delete them en masse.
Describe alternatives you've considered
Writing a script to do this myself.
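Such a script might look roughly like the sketch below. The function names find_output_dirs and clean are hypothetical, the folder name is taken from the request above, and the confirmation callback is a parameter so the prompt can be replaced in non-interactive use.

```python
import os
import shutil
from pathlib import Path
from typing import Callable, List, Tuple


def find_output_dirs(root, name: str = "flambe-output") -> List[Tuple[Path, int]]:
    """Return (path, size in bytes) for every folder called `name` under root."""
    results = []
    for dirpath, dirnames, _ in os.walk(root):
        if name in dirnames:
            target = Path(dirpath) / name
            size = sum(f.stat().st_size for f in target.rglob("*") if f.is_file())
            results.append((target, size))
            dirnames.remove(name)  # do not descend into the output folder itself
    return results


def clean(root, confirm: Callable[[str], str] = input) -> int:
    """List all output folders, ask once, and delete them en masse if confirmed."""
    found = find_output_dirs(root)
    for path, size in found:
        print(f"{size:>12} bytes  {path}")
    if found and confirm("Delete all of the above? [Y/n] ").strip().lower() in ("", "y"):
        for path, _ in found:
            shutil.rmtree(path)
        return len(found)
    return 0
```

Calling clean(".") would scan the current directory and prompt once before deleting anything.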
Is your feature request related to a problem? Please describe.
No.
Describe the solution you'd like
I'd like a tb_plots_save_dir option in the Flambe config; if given, the runner saves all TensorBoard plots to that dir.
Describe alternatives you've considered
Writing a tool to do this myself.
Additional context
Erm, no.
Describe the bug
I run the example, and it doesn't work.
To Reproduce
Steps to reproduce the behavior:
Paste the following into a file called foo.yaml:
!Experiment
name: foo
pipeline:
  dataset: !SSTDataset
    transform:
      text: !BERTTextField.from_alias
        alias: 'bert-base-uncased'
        lower: true
      label: !LabelField
  teacher: !TextClassifier
    embedder: !Embedder
      embedding: !BERTEmbeddings.from_alias
        path: 'bert-base-uncased'
        embedding_freeze: True
      encoder: !BERTEncoder.from_alias
        path: 'bert-base-uncased'
        pool_last: True
    output_layer: !SoftmaxLayer
      input_size: !@ teacher.embedder.encoder.config.hidden_size
      output_size: !@ dataset.label.vocab_size # We link to the size of the label space
  student: !TextClassifier
    embedder: !Embedder
      embedding: !BERTEmbeddings.from_alias
        path: 'bert-base-uncased'
        embedding_freeze: True
      encoder: !PooledRNNEncoder
        rnn_type: sru
        n_layers: 2
        hidden_size: 256
        pooling: last
    output_layer: !SoftmaxLayer
      input_size: !@ student.embedder.encoder.hidden_size
      output_size: !@ dataset.label.vocab_size
  finetune: !Trainer
    dataset: !@ dataset
    train_sampler: !BaseSampler
      batch_size: 16
    val_sampler: !BaseSampler
      batch_size: 16
    model: !@ teacher
    loss_fn: !torch.NLLLoss
    metric_fn: !Accuracy
    optimizer: !AdamW
      params: !@ finetune.model.trainable_params
      lr: 0.00005
  distill: !DistillationTrainer
    dataset: !@ dataset
    train_sampler: !BaseSampler
      batch_size: 16
    val_sampler: !BaseSampler
      batch_size: 16
    teacher_model: !@ finetune.model
    student_model: !@ student
    loss_fn: !torch.NLLLoss
    metric_fn: !Accuracy
    optimizer: !torch.Adam
      params: !@ distill.student_model.trainable_params
      lr: 0.00005
    alpha_kl: 0.5
    temperature: 1
Then, run flambe foo.yaml.
NB: I changed the line input_size: !@ model.embedder.encoder.config.hidden_size to input_size: !@ teacher.embedder.encoder.config.hidden_size.
Expected behavior
Experiment works.
Screenshots
Error: AttributeError: 'BERTTextField' object has no attribute 'embeddings'
Software Versions (please complete the following information):
Is your feature request related to a problem? Please describe.
Yes. I'm trying to dynamically inject pathnames into a config, then run an experiment using that config. Out of the box, there is no clean way to do this.
Describe the solution you'd like
A function that accepts a path to a Jinja2-templated flambe config, an output path, and key:val pairs to inject.
Describe alternatives you've considered
Loading the config into memory with YAML or flambe-YAML tools, editing in memory, then writing a new config to disk.
Additional context
Here's what I've come up with:
import os
import re

import jinja2


def generate_config_from_template(template_path, config_path, remove_comments=False, **template_kwargs):
    # Load the template from the directory that contains it
    dirname = os.path.dirname(template_path)
    basename = os.path.basename(template_path)
    loader = jinja2.FileSystemLoader(searchpath=dirname)
    env = jinja2.Environment(loader=loader)
    template = env.get_template(basename)
    # Render with the injected values and write the non-empty lines to disk
    with open(config_path, 'w') as f:
        for line in template.render(**template_kwargs).split('\n'):
            if remove_comments:
                # Drop trailing "# ..." comments
                line = re.sub('# .*', '', line).rstrip()
            if line:
                f.write(line + '\n')
Where a config template might look like:
post_process_preds: 'post_process_preds_ext'
---
!Experiment
name: ada-text-classification
pipeline:
  # stage 0 - Load the dataset object SSTDataset and run preprocessing
  0_dataset: !Foo
    train_path: {{ train_path }}
    test_path: {{ test_path }}
    transform:
      text: !TextField
      label: !LabelField
  # stage 1 - train the text classifier on the SSTDataset
  1_train: !Trainer
    dataset: !@ 0_dataset # link back to the existing dataset
    train_sampler: !BaseSampler # define a way of sampling dataset
    val_sampler: !BaseSampler
    model: !TextClassifier
      embedder: !Embedder
        embedding: !torch.Embedding # automatically use pytorch classes
          num_embeddings: !@ 0_dataset.text.vocab_size # reference vocab size
          embedding_dim: 300
        encoder: !PooledRNNEncoder
          input_size: 300
          rnn_type: lstm
          n_layers: !g [2]
          hidden_size: 256
      output_layer: !SoftmaxLayer
        input_size: !@ 1_train.model.embedder.encoder.rnn.hidden_size
        output_size: !@ 0_dataset.label.vocab_size
        take_log: false
    loss_fn: !torch.NLLLoss # Use existing PyTorch negative log likelihood
    metric_fn: !torch.NLLLoss # Used for validation set evaluation
    optimizer: !torch.Adam
      params: !@ 1_train.model.trainable_params # Link to model parameters
    max_steps: 2 # Each step runs `iter_per_step` iterations
    iter_per_step: 2 # Eval and checkpoint every 50 iterations
  2_eval: !Evaluator
    dataset: !@ 0_dataset
    model: !@ 1_train.model
    metric_fn: !torch.NLLLoss
    output_path: {{ preds_path }}
    eval_sampler: !BaseSampler
    eval_data: test
  3_post_process_preds: !post_process_preds.PostProcessPreds
    preds_path: !@ 2_eval.output_path
    preds_id_path: {{ preds_id_path }}
    post_processed_preds_path: {{ post_processed_preds_path }}
    label_vocab: !@ 0_dataset.label.vocab
Describe the bug
When you run a runnable in which one of the components uses an extension that hasn't been installed (pip install or flambe -i), flambe throws the following error, which does not tell the user what is actually going on:
11:35:30 | AttributeError("'CommentedMap' object has no attribute 'add_extensions_metadata'",)
Traceback (most recent call last):
File "/home/peter/code/sci-summary/venv/lib/python3.6/site-packages/flambe/runner/run.py", line 87, in main
runnable.run(force=args.force, verbose=args.verbose)
File "/home/peter/code/sci-summary/venv/lib/python3.6/site-packages/flambe/experiment/experiment.py", line 279, in run
schema_block.add_extensions_metadata(self.extensions)
AttributeError: 'CommentedMap' object has no attribute 'add_extensions_metadata'
To Reproduce
Steps to reproduce the behavior:
Expected behavior
A better error message that tells the user to install the extension.
Is your feature request related to a problem? Please describe.
Given that an experiment can take a long time to get to the latest object initialization step, any incompatibilities of arguments to the constructors should be caught as early as possible.
Describe the solution you'd like
I think that flambe.component should have a classmethod check_constructor_args(*args, **kwargs) that's automatically called very early after starting an experiment. This would offer the option to override that method, so that any kind of exception could be raised very, very early in the experiment's pipeline.
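A default implementation of such a check could rely on inspect.signature to bind the given arguments against the constructor without actually building the object. The Component and Encoder classes below are stand-ins written for this sketch, not flambe's real classes.

```python
import inspect


class Component:
    """Hypothetical stand-in for flambe's component base class."""

    @classmethod
    def check_constructor_args(cls, *args, **kwargs) -> None:
        """Raise TypeError if the arguments do not fit the constructor signature."""
        try:
            # bind() validates names and arity but never calls __init__
            inspect.signature(cls).bind(*args, **kwargs)
        except TypeError as error:
            raise TypeError(f"Invalid arguments for {cls.__name__}: {error}") from error


class Encoder(Component):
    def __init__(self, input_size: int, hidden_size: int = 256) -> None:
        self.input_size = input_size
        self.hidden_size = hidden_size


Encoder.check_constructor_args(input_size=300)  # passes silently
try:
    Encoder.check_constructor_args(input_sizes=300)  # misspelled argument name
except TypeError as error:
    print(error)
```

Subclasses could override the classmethod to add checks that a signature alone cannot express, such as constraints between two arguments.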
PR #110 added a new flag that needs unit testing.