Comments (13)
Oh sorry for leaving the PR #88 for months..
I'll be work on the PR today
from continuity.
Looks like a decent proposal:
- let's follow the right to left convention, ie
Copy(dst, src string)
, since we've followed that here in other contexts. - I'm not sure if a full manifest build is necessary, but the extra work is negligible and can be hidden behind the function call.
- How will this integrate with drivers? I don't
Resource
needs a copyable method, but we need to push this down to the driver, so they can make efficient decisions. Ideally, we'd have aCopyTree
function (or some clever naming) at the root that can be used in common cases and then one can fall back to drivers for more complex uses (I am copying from VM remote to some specialized thing).
from continuity.
Ack to your first two bullets.
Not sure I quite follow the third. By "I don't Resource
needs a copyable method" did you mean "I don't think Resource
needs a Copy
method" or "I don't think Resource
needs a Copyable
interface"?
I think perhaps you are saying both, since Resource
doesn't have visibility onto a Driver
. That implies we should add the copying for each resource type to the toplevel context.CopyTree
you propose which would then know how to use the drivers to duplicate each kind of Resource
?
Hrm, might we want different drivers for source (implied in the context) and destination (would need to be passed?) Perhaps destination should also be a(n empty) Context
?
from continuity.
Not sure I quite follow the third. By "I don't Resource needs a copyable method" did you mean "I don't think Resource needs a Copy method" or "I don't think Resource needs a Copyable interface"?
"I don't think Resource needs a Copyable interface"
Resource
is not supposed to be bound to an environment, although it may draw on resources, such as disk storage.
I'm not sure how tightly we should integrate with drivers. Perhaps, only as much is needed to avoid repeating logic in getting resources from a given environment.
I guess that would mean there is a version that takes paths and instantiates the environments (contexts) from paths and another that works solely from environments. Does that make sense?
from continuity.
I take it your use of "environment" assume #78 has happened and will try to interpret with that in mind and use that term throughout.
So I think you are proposing two functions. First a
func Duplicate(dstroot, srcroot string) error
That would build an Environment
(nÊe Context
) for srcroot
and dstroot
and then call our other new function:
func DuplicateEnvironment(dst, src, Environment) error
Or should this be:
func (src *Environment) Duplicate(dst Environment) error
In either case this will then iterate over the Resource
s in src
and call their Copy
methods in turn.
My original proposal had those Copy
methods having this signature:
Copy(srcroot, dstroot string) error
Which makes it impossible to integrate with drivers at all. Maybe this is signposting me towards the idea that the duplication should be happening in DupicateEnvironment
(with a big type switch) rather than pushing down via a new Copy
method.
To do this you would need to be able to "readish" operations on src.driver
and "writeish" operations on dst.driver
. Since driver
is private in Environment
either one or both of those is going to be inaccessible in any particular method. The first option I see there is to add Driver
method to Environment
to expose it (or to make the member public). WDYT? I'm not immediately enamoured with that idea.
An alternative model just occurred to me:
func (src *Environment) Duplicate(dst string) (*Environment, error)
func (src *Environment) DuplicateWithOptions(dst string, options EnvironmentOptions) (*Environment, error)
i.e. take the dst
as a string and return a new Environment
referencing the newly duplicated tree. Now we have access to src.driver
since it is a method on src
and we can have access to the dst
driver because we are creating it and can hold onto it.
from continuity.
@ijc Yes, I am using "environment" because "context" is confusing in the context of the existence of the context package.
That proposal looks good. As far as driver access goes, I think it is reasonable to expose something on environment to allow implementations to access platforms directly for performance optimizations.
One other thing to consider is the behavior: do this "sync" (match the set) or "copy" (destination is now a superset of both src and dst). Are they destructive?
from continuity.
That proposal looks good.
There was at least 3 in #95 (comment) ;-) Do you have a favourite?
One other thing to consider is the behavior: do this "sync" (match the set) or "copy" (destination is now a superset of both src and dst). Are they destructive?
In the name of simplicity (and because I'm in inherently lazy) I'd prefer to mandate that the destination not exist at all upon entry for now. If we come up with a need to "overlay" src onto dst that should be doable in a backwards compatible way I think.
from continuity.
There was at least 3 in #95 (comment) ;-) Do you have a favourite?
There all my favo[u]rite! I guess I was saying the general direction looks good.
I think the focus should be on the top-level, path-based function Duplicate
(maybe CopyTree
or SyncTree
, we can bike shed it later). Pushing down some functionality to the environment seems like a good implementation detail.
I think the main implementation would be something like this:
func Duplicate(dst, src string) error {
return DuplicateEnvironment(local.New(dst), local.New(src))
}
Basically, its a helper for doing the easy thing and then you can use the other parts when something special is required.
However, I do see the benefit of making the "environment" one-sided, in that you sync to a local directory, but this may have limitations if you are syncing two environments that are only accessible from their drivers.
Stepping back, I may have incorrectly rejected one of the original ideas to have this on the manifest, because something like this makes a lot of sense:
func (m *Manifest) Apply(env Environment, provider Provider)
The above would let you apply the manifest to an environment, retrieving the content from the given provider, which might let you get direct access to fds for using things like CopyFileRange
. We may already have a concept similar to the above, however, it would be nice to avoid the creation of the manifest, as hashing the whole filesystem is expensive when your syncing to an empty directory tree.
@dmcgowan Any thoughts here?
from continuity.
it would be nice to avoid the creation of the manifest, as hashing the whole filesystem is expensive when your syncing to an empty directory tree.
I'm unclear if hashing is a necessary part of the hardlink stuff. The hardlinkManager
doesn't hash but the hlm.Merge
calls the package-level Merge
which does seem to pay some attention to the digests.
So it's not the Manifest
as such which causes all the hashing but rather the fact that it creates all the Resources
. While we might be able to avoid building the Manifest
I'm not so sure we can avoid creating Resource
objects, can we?
I think we certainly want to DTRT with hardlinks in this functionality.
from continuity.
I finally started looking into implementing this week (before I got distracted again)... It looks like continuity.ApplyManifest
already covers most of the use case here, in my copious free time I'll look into massaging it to suite my needs, likely by coercing ./bin/continuity apply
to be able to dup a manifest
+source tree
in the first instance.
One thing which seems to be missing is timestamps, adding that would require reving the protobuf descriptions I think. Need to think a bit harder about whether that is a requirement for my use case (copy up from underlying image into a volumes for cri-containerd) at first glance I think it probably is...
from continuity.
One thing which seems to be missing is timestamps
I think we need to add timestamsp, regardless. cc @AkihiroSuda
I know there was a PR for this.
from continuity.
@AkihiroSuda I think we were undecided on #88 but I think it is clearly a necessity now.
from continuity.
My use case for this was to achieve the same thing which containerd/cri#535 did much more simply.
Given that I no longer have a pressing need for this functionality. Probably this issue can be closed, unless you would prefer to keep it around for the design discussion should this need arise again in a different context?
from continuity.
Related Issues (20)
- Empty files can diff as "modified" even when they're not HOT 1
- Errophone auto-generated constants, for Go standard library (gollvm) HOT 2
- Broken build on arm64 HOT 3
- TestUsage broken on ZFS HOT 2
- Should the default pathDriver walk though volume mount points on Windows? HOT 9
- fs.CopyFile should support copying metadata
- copyFileContent broken on zfs (in some cases)
- context.Resource generates unavoidable log-spam on Windows
- fs.CopyFile should use clonefile(2) on macOS when VOL_CAP_INT_CLONE is available
- Support GOOS=netbsd
- Fails to build with newer versions of golang-goprotobuf HOT 1
- Not connected... I guess? HOT 1
- Security vulnerabilities with spf13/cobra dependency HOT 7
- go.mod: update dependencies / CI failing HOT 1
- Continuity is failing to ignore unsupported xattrs on Windows
- Add CI coverage for more filesystems
- clonefile() should fall back to copyfile() for cross-device copying
- [Darwin] clonefile() fails if the target already exists HOT 1
- `continuity.context.Walk` assumes incorrect behavior for `os.Readlink`
- nerdctl commit taking long time possibly because of fs.Changes walking through all files HOT 2
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
đ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. đđđ
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google â¤ī¸ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from continuity.