Comments (6)
So combining curves in TensorBoard is possible, albeit inelegant. I found some Stack Overflow questions that describe how to do it, such as this one: https://stackoverflow.com/questions/48951136/plot-multiple-graphs-in-one-plot-using-tensorboard
I was able to achieve this in fuse by adding a function like this to fuse/dl/lightning/pl_funcs.py:
# Tensorboard ONLY
def tensorboard_epoch_end_compute_and_log_losses_combined(pl: pl.LightningModule, mode: str, batch_losses: Sequence[Dict]) -> None:
    """
    On epoch end average out the batch losses and log the averaged losses
    :param pl: LightningModule. Used for logging.
    :param mode: prefix to add to each loss name (when logging), typically validation/train/test
    :param batch_losses: list of batch_dict["losses"] as added by 'epoch_losses'
    :return: None
    """
    keys = batch_losses[0].keys()
    for key in keys:
        losses = []
        for elem in batch_losses:
            if isinstance(elem[key], torch.Tensor):
                # per-sample loss tensor: collect every element
                losses.extend(elem[key].detach().cpu().tolist())
            else:
                losses.append(elem[key])
        loss = mean(losses)
        # logging a dict groups the curves of all modes under a single plot
        pl.log(f"combined.losses.{key}", {mode: loss}, on_epoch=True)
You can then call it in the model's training_epoch_end the same way we do other logging:
# Log Combined (tensorboard ONLY)
fuse_pl.tensorboard_epoch_end_compute_and_log_losses_combined(self, "train", [e["losses"] for e in step_outputs])
This produces the desired behavior, but it also creates a separate entry for every additional metric plotted by this method (resulting in multiple "runs" for something that is actually just one run).
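For anyone reading along, here is a rough, framework-free sketch of what the function above does with its input. It is not the fuse implementation: `average_losses_by_key` and `combined_payload` are hypothetical names, plain Python lists stand in for per-sample loss tensors, and the loss names and values are made up for illustration. As far as I understand, the extra "runs" appear because dict values end up in TensorBoard's `add_scalars`, which writes each entry of the dict to its own sub-directory.

```python
from statistics import mean
from typing import Dict, List, Sequence, Union

# A loss is either a scalar or a list of per-sample values
# (the list stands in for a tensor converted with .tolist()).
LossValue = Union[float, List[float]]


def average_losses_by_key(batch_losses: Sequence[Dict[str, LossValue]]) -> Dict[str, float]:
    """Average every loss over all batches (hypothetical helper)."""
    averaged: Dict[str, float] = {}
    for key in batch_losses[0].keys():
        values: List[float] = []
        for elem in batch_losses:
            if isinstance(elem[key], list):
                values.extend(elem[key])  # per-sample losses
            else:
                values.append(elem[key])  # already a scalar
        averaged[key] = mean(values)
    return averaged


def combined_payload(mode: str, averaged: Dict[str, float]) -> Dict[str, Dict[str, float]]:
    """Build the name -> {mode: value} pairs that would be handed to log()."""
    return {f"combined.losses.{key}": {mode: value} for key, value in averaged.items()}


# Illustrative, made-up batch outputs.
batches = [
    {"cls_loss": [0.5, 1.5], "total_loss": 2.0},
    {"cls_loss": [1.0, 1.0], "total_loss": 4.0},
]
train_avg = average_losses_by_key(batches)
payload = combined_payload("train", train_avg)
print(payload)
```

Calling the same helpers with "validation" would yield the same names with a different inner key, which is what lets TensorBoard draw both curves in one plot.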
from fuse-med-ml.
Also, it seems that more advanced logging frameworks such as wandb already support mixing plots elegantly, so I suggest we don't do this.
Regarding CSV logging, PyTorch Lightning's CSVLogger seems fully compatible with fuse's implementation. You can just import it, point its logging directory at the same one the fuse logger uses, and pass it to the Lightning trainer:
import pytorch_lightning as pl
from pytorch_lightning.loggers import CSVLogger

# write the CSV log under the same directory as the other model outputs
pl_logger_csv = CSVLogger(paths["model_dir"], name="my_model4")
pl_trainer = pl.Trainer(
    default_root_dir=paths["model_dir"],
    max_epochs=train_params["trainer.num_epochs"],
    accelerator=train_params["trainer.accelerator"],
    strategy=train_params["trainer.strategy"],
    devices=train_params["trainer.num_devices"],
    auto_select_gpus=True,
    # assumes pl_logger_tensorboard was created earlier in the script
    logger=[pl_logger_tensorboard, pl_logger_csv],
)
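Once a run has been logged this way, the CSV can be read back with the standard library. The sketch below assumes the usual CSVLogger layout, a `metrics.csv` file with `epoch` and `step` columns plus one column per logged metric, where a cell is left empty if that metric was not logged on that row. The metric names and values in the sample are made up for illustration, and `metrics_by_epoch` is a hypothetical helper, not part of fuse or Lightning.

```python
import csv
import io
from collections import defaultdict
from typing import Dict

# Made-up sample in the shape CSVLogger is assumed to write.
SAMPLE_METRICS_CSV = """epoch,step,train.losses.cls_loss,validation.losses.cls_loss
0,9,0.9,
0,9,,1.1
1,19,0.6,
1,19,,0.8
"""


def metrics_by_epoch(csv_text: str) -> Dict[int, Dict[str, float]]:
    """Merge all rows of an epoch into a single {metric_name: value} dict."""
    merged: Dict[int, Dict[str, float]] = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(csv_text)):
        epoch = int(row["epoch"])
        for name, value in row.items():
            # skip bookkeeping columns and cells with no value on this row
            if name in ("epoch", "step") or value in ("", None):
                continue
            merged[epoch][name] = float(value)
    return dict(merged)


per_epoch = metrics_by_epoch(SAMPLE_METRICS_CSV)
for epoch, metrics in sorted(per_epoch.items()):
    print(epoch, metrics)
```

To use it on a real run, replace the sample string with the contents of the `metrics.csv` file found under the logger's directory (with the settings above, somewhere below `paths["model_dir"]/my_model4`).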
Since no changes are necessary for this issue, I'll close it :)
Actually, I'll reopen this and add CSVLogger to a few examples before closing 😅
Added them in #204.