Comments (4)
We added a deprecation warning in PyTorch
https://github.com/pytorch/pytorch/blob/e3af4be96338d0645310e7cab47ce78509a6603e/torch/utils/data/datapipes/iter/tararchivereader.py#L37
We will remove the duplicated DataPipes from PyTorch core in the next release after 1.10. Until then, any user who imports a DataPipe from torchdata should get the torchdata implementation rather than the one in PyTorch core.
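For illustration only, here is a minimal sketch of what a construction-time deprecation warning can look like in Python. `LegacyTarReader` and its message are hypothetical stand-ins, not the actual code behind the link above.

```python
import warnings


class LegacyTarReader:
    """Hypothetical stand-in for a duplicated DataPipe kept in the core package."""

    def __init__(self, source):
        # Warn at construction time so that merely importing the module stays silent.
        warnings.warn(
            "LegacyTarReader is deprecated here and will be removed in a future "
            "release; use the implementation from the separate data package instead.",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller, not to this file
        )
        self.source = source

    def __iter__(self):
        # Assumption for the sketch: simply pass items through unchanged.
        return iter(self.source)
```

Note that Python hides `DeprecationWarning` by default unless it is triggered from `__main__` or warnings are enabled (for example with `python -W default`), so users may only see it in tests or with warnings turned on.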
For the duplicated DataPipes, torchdata should not re-export the versions from PyTorch core but use its own implementations directly.
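One way to picture this is the rough sketch below. It assumes a package builds its public namespace from two name-to-object mappings. `merge_exports` and the placeholder strings are hypothetical and do not reflect how torchdata is actually wired up.

```python
def merge_exports(core_exports, own_exports):
    """Combine two name -> object mappings, preferring `own_exports` on conflicts.

    Returns the merged mapping and the sorted list of duplicated names.
    """
    duplicated = sorted(set(core_exports) & set(own_exports))
    # Re-export only the core names that the package does not implement itself.
    merged = {
        name: obj for name, obj in core_exports.items() if name not in own_exports
    }
    merged.update(own_exports)
    return merged, duplicated


# Hypothetical placeholders standing in for real DataPipe classes.
core = {"Mapper": "core.Mapper", "TarArchiveReader": "core.TarArchiveReader"}
own = {"TarArchiveReader": "data.TarArchiveReader", "HttpReader": "data.HttpReader"}

exports, duplicated = merge_exports(core, own)
```

Returning the duplicated names makes the overlap explicit, so the list can be reviewed instead of being resolved silently by import order.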
I'll send a PR.
It's a good idea to have a single source of truth. We also need to handle duplicated DataPipes like TarArchiveReader carefully, without overriding them from the data repo.
Feel free to send a PR.
> That's a good idea to have a single source of Truth. And, we need to carefully take care of duplicated DataPipes like TarArchiveReader without overriding them from data repo.
The second part makes it sound as if there is no single source of truth if torchdata.datapipes diverges from torch.utils.data.datapipes. Looking at TarArchiveReader, I found this:
data/torchdata/datapipes/iter/util/tararchivereader.py
Lines 11 to 12 in 77c07bf
So what is the plan here? Remove everything from torch.utils.data.datapipes that is also in torchdata.datapipes, and import and expose the rest?