Comments (15)
@albanD I think this is a little tangential; the .devcontainer stuff is here: https://github.com/pytorch/pytorch/tree/main/.devcontainer
Currently this doesn't use a distributed sccache. In fact, one of the main reasons it is hard to develop on Codespaces is that the default machines are so weak that compiling C++/CUDA takes forever. I think if we were able to hook sccache into the devcontainer, it would be a silver-bullet feature!
from pytorch.
I don't know whether the env changes are going to invalidate cache hits. In my past experience with Bazel on other projects, the cache was very fragile without exactly the same reproducible env. But sccache is probably more robust.
/cc @ezyang in case you have feedback on this.
If you are willing to do your builds in a Docker environment that exactly matches the CI env, you can technically use the sccache bucket. But IME it's more trouble than it's worth.
I suppose that with a little effort in the CI we could make a build (and the related sccache) closely aligned with the nightly container we publish every day for users/contributors, so that the cache would be valid. The PyTorch team would also need to check whether the bucket has public read-only access.
cc @drisspg would the Codespaces env work for this? Where is the doc on how to use it? I can't find it.
The cache bucket is currently open as read-only, and we would need to consider the risk of exposing write permission to the public for local reproducibility. Sccache supposedly has a read-only mode (https://github.com/mozilla/sccache/blob/main/docs/Local.md#read-only-cache-mode), but we haven't been able to make it work with our S3 bucket IIRC (it keeps asking for AWS credentials).
> but we haven't been able to make it work with our S3 bucket IIRC (it keeps asking for AWS credentials).
Is there any news on this? I saw this comment yesterday: mozilla/sccache#1753 (comment)
So if that comment is correct we could use SCCACHE_S3_NO_CREDENTIALS=true. The main point is to populate the bucket from a regularly distributed image like nightly-devel.
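A minimal sketch of what that setting could look like in practice, assuming the bucket really does allow anonymous reads. `SCCACHE_BUCKET`, `SCCACHE_REGION`, and `SCCACHE_S3_NO_CREDENTIALS` are sccache's documented S3 variables; the helper name, the bucket name, and the region below are placeholders, not the real PyTorch values.

```python
import os


def readonly_sccache_env(bucket, region, base_env=None):
    """Return a copy of the environment set up for anonymous S3 cache reads.

    `bucket` and `region` are supplied by the caller; the values used in the
    example below are hypothetical placeholders.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["SCCACHE_BUCKET"] = bucket
    env["SCCACHE_REGION"] = region
    # Ask sccache to access the bucket without AWS credentials.
    env["SCCACHE_S3_NO_CREDENTIALS"] = "true"
    # Choice made for this sketch: drop any AWS credentials so the build
    # cannot silently fall back to authenticated (and possibly write) access.
    for key in ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN"):
        env.pop(key, None)
    return env


# Hypothetical bucket name and region, for illustration only.
example_env = readonly_sccache_env(
    "example-public-cache-bucket",
    "us-east-1",
    base_env={"PATH": "/usr/bin", "AWS_ACCESS_KEY_ID": "dummy"},
)
print(example_env["SCCACHE_S3_NO_CREDENTIALS"])
```

The returned dictionary can be passed as `env=` to `subprocess.run` when launching the build, so the read-only settings apply to that build only and not to the whole shell session.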
I have some doubts about the current devcontainer case, as we are building the images on demand. It would work if we referenced nightly images in the devcontainer.
Does the image matter? I thought that what matters is the state of the PyTorch C++ files when attempting to compile. If someone is developing off a recent enough main branch, then presumably there would be a lot more cache hits?
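One way to see why the image can matter: as far as I understand, a compiler cache such as sccache keys each entry on more than the source file, including a digest of the compiler binary, the command-line arguments, and the preprocessed output (which pulls in system headers). The toy function below is only an illustration of that idea under those assumptions, not sccache's actual implementation; the digests and flags are made up.

```python
import hashlib


def toy_cache_key(compiler_digest, args, preprocessed_source):
    """Toy cache key: hash the compiler identity, the flags, and the
    preprocessed translation unit together (illustrative only)."""
    h = hashlib.sha256()
    h.update(compiler_digest.encode())
    for arg in args:
        h.update(b"\0")
        h.update(arg.encode())
    h.update(b"\0")
    h.update(preprocessed_source)
    return h.hexdigest()


# Same source and flags, but a different compiler in the local image.
source = b"int add(int a, int b) { return a + b; }\n"
ci_key = toy_cache_key("hypothetical-ci-compiler-digest", ["-O2", "-c"], source)
local_key = toy_cache_key("hypothetical-local-compiler-digest", ["-O2", "-c"], source)

print(ci_key == local_key)
```

Under this model, an identical checkout of main still misses the cache whenever the compiler, the flags, or the headers seen by the preprocessor differ from the environment that populated it.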
I think it is the same here, but we can reuse the CI Docker images for the devcontainer. That way we can put you in exactly the same setting, to make sure you properly hit the cache.
> So if that comment is correct we could use SCCACHE_S3_NO_CREDENTIALS=true. The main point is to populate the bucket from a regularly distributed image like nightly-devel.
This might not be as easy as it sounds. We explicitly don't use sccache for nightly and release builds for security reasons, to avoid cache-poisoning attacks. Theoretically, after building nightly, we could upload the build cache in a one-way direction to S3, but it does sound complicated.
> Theoretically, after building nightly, we could upload the build cache in a one-way direction to S3, but it does sound complicated.
Since we compile nightlies from scratch, couldn't it be relatively easy to push the cache to a new nightly bucket, totally distinct from the current CI bucket?
But in any case, if you think the CI image is clean enough to be distributed, it amounts to the same thing.
I was talking about the nightly Docker images because I had the impression that the nightly-dev image is more curated for distribution to users/developers, but I haven't checked all the details to compare it with the CI one.