Comments (15)
Hi @cifkao,
Thanks for raising these issues and for helping us improve the codebase.
For LRA, we use pathfinder32_hard for the Pathfinder experiments and pathfinder128_hard for Path-X.
I'll send a PR to add this to the readme for the task.
Also, regarding _PATHFINER_TFDS_PATH, it should be set to the directory where you have the unzipped data from https://storage.googleapis.com/long-range-arena/lra_release.gz, i.e. where the following data for all the pathfinder tasks live:
pathfinder128/
pathfinder256/
pathfinder32/
pathfinder64/
I'll address this issue in my PR as well.
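As a rough illustration of how the task names above relate to the listed directories, here is a minimal sketch. The helper `parse_task_name` and the `KNOWN_DATASETS` set are hypothetical and not part of the LRA codebase; the only assumption is that a task name is a dataset directory name followed by `_` and a difficulty split.

```python
import re
from typing import Tuple

# Directory names listed in the comment above.
KNOWN_DATASETS = {"pathfinder32", "pathfinder64", "pathfinder128", "pathfinder256"}


def parse_task_name(task_name: str) -> Tuple[str, str]:
    """Split e.g. 'pathfinder32_hard' into ('pathfinder32', 'hard').

    Hypothetical helper for illustration only; not an LRA function.
    """
    match = re.fullmatch(r"(pathfinder\d+)_(\w+)", task_name)
    if match is None or match.group(1) not in KNOWN_DATASETS:
        raise ValueError(f"unrecognised task name: {task_name!r}")
    return match.group(1), match.group(2)


if __name__ == "__main__":
    for name in ("pathfinder32_hard", "pathfinder128_hard"):
        dataset, split = parse_task_name(name)
        print(f"{name}: dataset directory {dataset}/, split {split}")
```

Under that assumption, pathfinder32_hard points at the pathfinder32/ directory and pathfinder128_hard (Path-X) at pathfinder128/.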
from long-range-arena.
Thank you @cifkao for checking this and sorry for the trouble.
We checked, and it turned out that we released only the raw images for the pathfinder datasets, so you need to build the TFDS files yourself, which you can do using this code:
https://github.com/google-research/long-range-arena/blob/main/lra_benchmarks/data/pathfinder.py
However, we now also have the generated TFDS files available to make it convenient for people to use LRA. Here you can download the TFDS files for pathfinder: https://storage.cloud.google.com/long-range-arena/pathfinder_tfds.gz
and then set _PATHFINER_TFDS_PATH to the unzipped directory. Let us know if you hit any other issue.
This version of the link does not require you to log in to a Google account: https://storage.googleapis.com/long-range-arena/pathfinder_tfds.gz
I can't reproduce performer's result on the pathfinder32_hard task either. I get just 50.47% as the best eval result.
My training shell script is as follows:
PYTHONPATH="$(pwd)":"$PYTHON_PATH" python lra_benchmarks/image/train.py \
--config=lra_benchmarks/image/configs/pathfinder32/performer_base.py \
--model_dir=./tmp/pathfinder_F \
--task_name=pathfinder32_hard
Me neither. Furthermore, I've taken a look at the model config and it doesn't make sense -- the QKV dim is set to 16, while MLP and hidden are 32. I've skimmed through the code; these are the actual dimensions, not the per-head ones after the split.
The hyper-parameters used for the xformer model are as follows: 4 layers, 8 heads, 128 as the hidden dimensions of FFN blocks, 128 as the query/key/value hidden dimensions, and the learning rate of 0.01.
Similar problems exist with other tasks.
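To make the mismatch described above concrete, here is a minimal sketch that compares the two sets of numbers. The `Hparams` class and its field names are hypothetical labels chosen for illustration, not the names used in the LRA config files; the values are only those quoted in this comment.

```python
from dataclasses import dataclass, fields
from typing import Dict, Tuple


@dataclass(frozen=True)
class Hparams:
    """Hypothetical container; field names are illustrative, not LRA's."""
    qkv_dim: int  # query/key/value hidden dimension
    mlp_dim: int  # hidden dimension of the FFN blocks


# Values from the quoted hyper-parameter description.
REPORTED = Hparams(qkv_dim=128, mlp_dim=128)
# Values the commenter read from the pathfinder32 model config.
IN_CONFIG = Hparams(qkv_dim=16, mlp_dim=32)


def mismatches(a: Hparams, b: Hparams) -> Dict[str, Tuple[int, int]]:
    """Return {field: (value_in_a, value_in_b)} for every differing field."""
    return {
        f.name: (getattr(a, f.name), getattr(b, f.name))
        for f in fields(a)
        if getattr(a, f.name) != getattr(b, f.name)
    }


if __name__ == "__main__":
    for name, (reported, configured) in mismatches(REPORTED, IN_CONFIG).items():
        print(f"{name}: reported {reported}, config {configured}")
```

Both fields differ, which is consistent with the commenter's observation that the released config does not match the quoted description.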
@MostafaDehghani
Thanks, these make sense, although I'm still struggling to reproduce the results after re-implementation.
Thanks. I set _PATHFINER_TFDS_PATH as you advised, and now I'm getting this error:
I1208 11:46:39.121981 140014513174336 dataset_builder.py:529] Constructing tf.data.Dataset for split hard[:80%], from /mnt/beegfs/projects/tpt-s2a-4/data/lra_release/pathfinder32/1.0.0
Traceback (most recent call last):
  File "lra_benchmarks/image/train.py", line 420, in <module>
    app.run(main)
  File "/mnt/beegfs/home/cifka/venv/lra/lib/python3.7/site-packages/absl/app.py", line 300, in run
    _run_main(main, args)
  File "/mnt/beegfs/home/cifka/venv/lra/lib/python3.7/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "lra_benchmarks/image/train.py", line 337, in main
    normalize=normalize)
  File "/mnt/beegfs/home/cifka/d/projects/long-range-arena/lra_benchmarks/image/input_pipeline.py", line 182, in get_pathfinder_base_datasets
    train_dataset = get_split(f'{split}[:80%]')
  File "/mnt/beegfs/home/cifka/d/projects/long-range-arena/lra_benchmarks/image/input_pipeline.py", line 175, in get_split
    split=split, decoders={'image': tfds.decode.SkipDecoding()})
  File "/mnt/beegfs/home/cifka/venv/lra/lib/python3.7/site-packages/tensorflow_datasets/core/dataset_builder.py", line 535, in as_dataset
    ) % (self.name, self._data_dir_root))
AssertionError: Dataset pathfinder32: could not find data in /mnt/beegfs/projects/tpt-s2a-4/data/lra_release/. Please make sure to call dataset_builder.download_and_prepare(), or pass download=True to tfds.load() before trying to access the tf.data.Dataset object.
I think the problem is that my pathfinder32 doesn't contain a directory called 1.0.0:
$ ls /mnt/beegfs/projects/tpt-s2a-4/data/lra_release/pathfinder32/
curv_baseline curv_contour_length_14 curv_contour_length_9
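The log above shows the loader looking under pathfinder32/1.0.0, while the directory only holds raw image folders. A quick way to tell the two layouts apart before launching training might look like the sketch below. The function `find_prepared_versions` is hypothetical, and it assumes prepared data sits in a version-named subdirectory such as 1.0.0 (as the log suggests); it uses only the standard library and does not call TFDS.

```python
import re
from pathlib import Path
from typing import List

# Matches version-style directory names such as "1.0.0".
VERSION_DIR = re.compile(r"\d+\.\d+\.\d+")


def find_prepared_versions(data_dir: str, dataset_name: str) -> List[str]:
    """List version subdirectories under <data_dir>/<dataset_name>/.

    Hypothetical helper: an empty result suggests the directory holds raw
    files (or nothing) rather than prepared TFDS data.
    """
    dataset_dir = Path(data_dir) / dataset_name
    if not dataset_dir.is_dir():
        return []
    return sorted(
        entry.name
        for entry in dataset_dir.iterdir()
        if entry.is_dir() and VERSION_DIR.fullmatch(entry.name)
    )


if __name__ == "__main__":
    import sys

    # Usage: python check_layout.py <data_dir> [dataset_name]
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    name = sys.argv[2] if len(sys.argv) > 2 else "pathfinder32"
    versions = find_prepared_versions(root, name)
    print(f"{name}: prepared versions found: {versions or 'none'}")
```

For the listing above, which contains only curv_baseline and the curv_contour_length_* folders, this check would return an empty list.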
There is a small problem with the zip file we released. Some extra files slipped into the zip file while archiving. We are fixing that and will upload a new zip file with a better directory structure and no unnecessary files.
In the meantime, can you use this path /mnt/beegfs/projects/tpt-s2a-4/data/lra_release/lra_release and see if it works?
I had in fact deleted everything except for lra_release/lra_release, because it seemed to be a subset of what is under lra_release/lra_release. So actually /mnt/beegfs/projects/tpt-s2a-4/data/lra_release/ should already have all the files (I do have the pathfinder32 directory there).
Either way, I checked the archive and I cannot find anything called 1.0.0 (which is where it's trying to load the data from).
Seems to work, thanks!
Perfect! Let's keep this issue open until I send the PR that adds this information to the Readme :)
Is anyone able to reproduce the paper's results using performer on pathfinder? Accuracy is much worse (62% vs. 77%). I was able to approximately reproduce with transformer and bigbird.
@renebidart
Same here, although my results were much worse (52% for performer).
bigbird is reproducible (73.48%).
My training shell script is as follows:
export _PATHFINER_TFDS_PATH=./TFDS
difficulty=hard
PYTHONPATH="$(pwd)":"$PYTHON_PATH" python lra_benchmarks/image/train.py \
--config=lra_benchmarks/image/configs/pathfinder32/bigbird_base.py \
--model_dir=./results/pathfinder32_${difficulty} \
--task_name pathfinder32_${difficulty}
@EternalSorrrow,
Please take a look at my comment here: #37 (comment)
I'll soon send a fix for the issue with the configs of pathfinder.
Thank you @cifkao for checking this and sorry for the trouble.
We checked, and it turned out that we released only the raw images for the pathfinder datasets, so you need to build the TFDS files yourself, which you can do using this code: https://github.com/google-research/long-range-arena/blob/main/lra_benchmarks/data/pathfinder.py
However, we now also have the generated TFDS files available to make it convenient for people to use LRA. Here you can download the TFDS files for pathfinder: https://storage.cloud.google.com/long-range-arena/pathfinder_tfds.gz
and then set _PATHFINER_TFDS_PATH to the unzipped directory. Let us know if you hit any other issue.
Hello,
I followed these steps and used this file https://storage.cloud.google.com/long-range-arena/pathfinder_tfds.gz, but I still got the following issue:
"AssertionError: Dataset pathfinder32: could not find data in /Users/user/tensorflow_datasets"
even though "tensorflow_datasets" is the directory where I extracted the .gz file, and I set _PATHFINER_TFDS_PATH="/Users/user/tensorflow_datasets".
Did you have the chance to fix these previous issues?
Thanks