Giter Site home page Giter Site logo

Polishing the API about cgofuse HOT 23 CLOSED

winfsp avatar winfsp commented on May 21, 2024
Polishing the API

from cgofuse.

Comments (23)

billziss-gh avatar billziss-gh commented on May 21, 2024 1

Thanks. I will see if I can get that change in tonight.

from cgofuse.

billziss-gh avatar billziss-gh commented on May 21, 2024

Commit f053f04 adds SetCapCaseInsensitive and SetCapReaddirPlus methods to FileSystemHost. The same mechanism can be used to add additional fuse_operations::init capabilities if desired.

from cgofuse.

ncw avatar ncw commented on May 21, 2024

As far as my use in rclone is concerned I don't think you've left anything out of importance. I can see CapCaseInsensitive being useful - thanks.

I like the fact that you've left the API very close the the C API and that kind of means that you've done your job!

Some gophers might be upset by the errc being integers and not a error type. However I think you've made the right choice - you can't return a general purpose go error back to FUSE so why pretend that you can? You'd just end up translating it to EIO and then having nowhere to log the error message.

However I think you should write a bit of docs explaining that errors are returned in the first int returned by the fs methods. You've explained that these should be -fuse.Exxx errors which is good. Not all gophers will be familiar with that style of C/kernel error return.

How do you implement ReaddirPlus in the fuse filesystem? I couldn't see any callback methods in f053f04 ?

I haven't finished implementing cgofuse in rclone yet, but I can report the basic functionality is working (listing files, read and write) so 👍 from me!

from cgofuse.

billziss-gh avatar billziss-gh commented on May 21, 2024

How do you implement ReaddirPlus in the fuse filesystem? I couldn't see any callback methods in f053f04 ?

Normally FUSE only cares for the st_mode and st_ino fields of struct stat during a readdir. This is not documented in the high-level FUSE API, but it is documented in the low-level one. [In fact in the high-level API the st_ino field is also ignored unless you use -ouse_ino from the command line.]

In FUSE 3.0 they added a "readdir-plus" capability. Unfortunately the way they added it is incompatible with the FUSE 2.x series which are in widespread use. Cgofuse is currently based on FUSE 2.8, so I could not just use the FUSE 3.0 "readdir-plus".

Readdir-plus is nice to have, but not essential on UNIX. After all readdir does not have any portable way of accessing information beyond d_name and d_ino (see dirent.h). Most OS'es also provide d_type for the file's type, but not full stat information.

On Windows the directory API's provide full stat information. So WinFsp was forced to do a readdir followed by a series of getattr calls to get the stat information to return to the Windows kernel. This was very inefficient. So a WinFsp specific capability was added: FSP_FUSE_CAP_READDIR_PLUS that instructs the WinFsp-FUSE layer that the stat information passed by the file system during readdir is actually full and valid (i.e. that the file system has not just filled the st_ino and st_mode fields). WinFsp-FUSE knows to avoid making the extra getattr calls in this case.

So the SetCapReaddirPlus function is currently just an optimization for Windows, but an important one IMO. It is fully implemented within WinFsp-FUSE. To use it from Cgofuse, simply pass a full Stat_t during Readdir (i.e. not one that is half-filled with only the Ino and Mode fields valid). You are still allowed to pass nil if you do not have a Stat_t for some files and FUSE will know to issue any extra Getattr calls required.

from cgofuse.

advanderveer avatar advanderveer commented on May 21, 2024

I have yet to work with the actual file system interface but it is unclear to me why FileSystemHost.Mount() mount has to receive the os.Args of the calling program. Maybe it just requires documentation or maybe a type that specifically describes mount arguments instead of an opaque []string?

from cgofuse.

billziss-gh avatar billziss-gh commented on May 21, 2024

@advanderveer wrote:

I have yet to work with the actual file system interface but it is unclear to me why FileSystemHost.Mount() mount has to receive the os.Args of the calling program. Maybe it just requires documentation or maybe a type that specifically describes mount arguments instead of an opaque []string?

These are the actual command line arguments passed to the file system (perhaps augmented by the Mount caller). See this link for some quick documentation.

But I agree with you that this needs to be better documented or perhaps even improve the API itself.

from cgofuse.

ncw avatar ncw commented on May 21, 2024

Thanks for explaining that - that makes sense.

from cgofuse.

advanderveer avatar advanderveer commented on May 21, 2024

Thanks for the link! If I understand it correctly the Mount() delegates to the FUSE reference implementation on OSX/Linux which happens to have (according to the documentations) a "lazy" way of initializing the file system which we use: fuse_main.

The question would be if we want to present the reference C implementation to Go via this "lazy/high level" function? It makes a lot of assumptions about the environment and arguments, e.g the first argument must be the program name, and --options will output some help text here, It installs signal handlers etc.

A lower-level init function of the FUSE reference implementation may be more suitable but I'm not able to find it from the docs alone so maybe it doesn't exist.

NOTE: the signal handling as being "added" by libfuse under fuse_main() probably doesn't work for Go applications as one would expect: https://godoc.org/os/signal#hdr-Go_programs_that_use_cgo_or_SWIG

from cgofuse.

billziss-gh avatar billziss-gh commented on May 21, 2024

@advanderveer wrote:

The question would be if we want to present the reference C implementation to Go via this "lazy/high level" function? It makes a lot of assumptions about the environment and arguments, e.g the first argument must be the program name, and --options will output some help text here, It installs signal handlers etc.

I agree with your assessment. Having a command line interface to Mount is not the best idea. I did struggle with that decision as well. Let me explain the reasons I chose this approach. I am happy to consider alternative interfaces.

The FUSE 2.x series does not have a clean interface into the FUSE loop. It is either fuse_main or you have to rewrite fuse_main yourself by parsing command line arguments, setting up the environment and finally calling fuse_loop or fuse_loop_mt. For maximum flexibility one may want to rewrite fuse_loop, I have had to do this with a single-threaded but fully asynchronous file system that uses the lowlevel FUSE API.

Unfortunately rewriting fuse_main has certain drawbacks the most important of which is that it is not straight forward to do in a cross-platform way (and even harder to do from cgo as I found out when I briefly attempted this experiment).

So the easiest and most stable way to enter the FUSE loop is fuse_main (real symbol name fuse_main_real). Fuse_main accepts command line arguments:

$ $GOPATH/bin/hellofs -h
usage: /home/billziss/Projects/go/bin/hellofs mountpoint [options]

general options:
    -o opt,[opt...]        mount options
    -h   --help            print help
    -V   --version         print version

FUSE options:
    -d   -o debug          enable debug output (implies -f)
    -f                     foreground operation
    -s                     disable multi-threaded operation
...

So I considered providing an interface such as:

func (*FileSystemHost) Mount(mountpoint string, opts []string) bool

Or even:

type FileSystemConfig struct {
    // ...
}
func (*FileSystemHost) Mount(mountpoint string, config *FileSystemConfig) bool

In the end I chose not to do this, because of the following reasons:

  • Most file systems actually benefit from the default command line handling in FUSE. It is quite powerful and allows setting a variety of options. I find it especially useful as a file system developer, because I can change the behavior of my file system dramatically just by adding/removing a few options (e.g. attr_timeout) on the command line.

  • The file system would have to do its own command line parsing simply to fill the FileSystemConfig struct. Mount would then take the FileSystemConfig and create new FUSE specific command line arguments out of them. For most file systems this sounds like a lot of unnecessary work simply to create the illusion of a nicer interface into Mount. For the few that would benefit from hiding the default FUSE option handling, they can still do their own command line parsing and feed an unrelated command line to Mount.

A lower-level init function of the FUSE reference implementation may be more suitable but I'm not able to find it from the docs alone so maybe it doesn't exist.

Unfortunately it does not exist. We would have to write our own fuse_main_real or simulate it by rewriting command line arguments as discussed above.

NOTE: the signal handling as being "added" by libfuse under fuse_main() probably doesn't work for Go applications as one would expect: https://godoc.org/os/signal#hdr-Go_programs_that_use_cgo_or_SWIG

Good catch.

  • Windows: we have already discussed the implications in a different thread.
  • OSX/Linux: luckily the FUSE layer is smart enough to not set a signal handler if one already exists. So it should not touch any of the signals that Go cares about: fuse_signals.c

from cgofuse.

billziss-gh avatar billziss-gh commented on May 21, 2024

I wrote:

func (*FileSystemHost) Mount(mountpoint string, opts []string) bool

Actually this interface would simplify the Mount code, which currently has to extract the mount point in order to support Unmount. It would also resolve an issue with some of the help being printed twice when using the -h option.[FIXED]

Is it an interesting compromise or does it just feel artificial?


EDIT: I note that it would still be problematic for some file systems that want the default FUSE command line handling. For example, both of these command lines do the same thing:

$ bin/hellofs -o allow_other mnt
$ bin/hellofs mnt -o allow_other

But now the file system has no easy way to extract the mount point to give to Mount.

from cgofuse.

advanderveer avatar advanderveer commented on May 21, 2024

I think that no abstraction is better than a leaky abstraction. As you adequately wrote down, there doesn't seem to be a suitable set of arguments we can offer in this layer that makes thing more clear. That is Ok, we will simply need to make sure what the Mount() method has suitable documentation that outlines the issue and delegates the dicision to the library user:

//Mount will ask the platform specific FUSE implementation to mount the file system. On Windows arguments are passed to WinFSP <Link to arg docs here>. On other platforms (Linux/OSX) the arguments are passed to the libfuse reference implementations: <link to arg docs here>. 
func (*FileSystemHost) Mount(mountpoint string, opts []string) bool {...}

from cgofuse.

billziss-gh avatar billziss-gh commented on May 21, 2024

@advanderveer wrote:

//Mount will ask the platform specific FUSE implementation to mount the file system. On Windows > arguments are passed to WinFSP <Link to docs here>. On other platforms (Linux/OSX) the arguments are passed to the libfuse reference implementations: <link to docs here>. 
func (*FileSystemHost) Mount(mountpoint string, opts []string) bool {...}

Agree on clarifying the documentation re: Mount.

Just to be clear: The Mount interface is currently:

func (*FileSystemHost) Mount(args []string) bool

Would you also prefer to have it changed to?

func (*FileSystemHost) Mount(mountpoint string, opts []string) bool

I am happy with such a change. @ncw since you have rclone cmount already working based on cgofuse, do you have an opinion?

from cgofuse.

ncw avatar ncw commented on May 21, 2024

Yes that sounds fine to me.

from cgofuse.

advanderveer avatar advanderveer commented on May 21, 2024

I like func (*FileSystemHost) Mount(mountpoint string, opts []string) bool as well

from cgofuse.

billziss-gh avatar billziss-gh commented on May 21, 2024

Commit 47171f3 changes Mount as per our discussion. This is currently in the Mount branch, but I will merge into master tomorrow.

The Mount documentation follows.


func (*FileSystemHost) Mount

func (host *FileSystemHost) Mount(mountpoint string, opts []string) bool

Mount mounts a file system on the given mountpoint with the mount options in opts.

Many of the mount options in opts are specific to the underlying FUSE implementation. Some of the common options include:

-h   --help            print help
-V   --version         print FUSE version
-d   -o debug          enable FUSE debug output
-s                     disable multi-threaded operation

Please refer to the individual FUSE implementation documentation for additional options.

It is allowed for the mountpoint to be the empty string ("") in which case opts is assumed to contain the mountpoint. It is also allowed for opts to be nil, although in this case the mountpoint must be non-empty.

The file system is considered mounted only after its Init() method has been called and before its Destroy() method has been called.

from cgofuse.

billziss-gh avatar billziss-gh commented on May 21, 2024

BTW, the last wart that I would like to tackle is the fuse_operations::lock operation. I have punted on this, because most file systems are happy with the default kernel support and because WinFsp does not send kernel file lock requests to user mode (it handles them entirely in kernel mode).

I think it is still worthwhile to fix this last wart in cgofuse. I will look at this later tonight and come back with a proposal. I am happy to hear yours of course too.

from cgofuse.

billziss-gh avatar billziss-gh commented on May 21, 2024

I spent some time looking at fuse_operations::lock and how to add it to cgofuse. I am going to punt on this at this time. The reasons are:

  • The fuse_operations::lock operation is currently only supported on Linux FUSE. OSXFUSE and WinFsp-FUSE do not support passing file locking operations to user mode (they satisfy them in kernel mode). [This means that file locking will work for processes on the same machine, but it will not work for processes on different machines.]

  • Recent versions of FUSE have also added fuse_operations::flock for BSD style locking. This further complicates matters.

  • Implementing file locking support correctly in cgofuse means that we would have to cater for the lock_owner field in fuse_file_info. This needs to be passed as an argument in a number of operations. The FUSE docs suggest that only fuse_operations::lock and fuse_operations::release need to worry about this field. From my experience with implementing locking support for WinFsp, I would expect that read and write operations also need to worry about locking. [NOTE: Linux FUSE may only support advisory locking which may explain why read, write are not mentioned.] Because of this uncertainty I am disinclined to change the current API in any significant way to accommodate functionality that only works in one platform.

  • Although I understand file locking and have implemented it at the kernel level in WinFsp I do not understand all the nuances about how it works on Linux FUSE. I therefore prefer to punt on this until someone more knowledgable on the subject comes along with information or a pull request.

So I am ready to freeze the cgofuse API. However I want to ensure that introducing locking into the API at a later stage would not be problematic. I believe it should not be for the following reasons:

  • Although I am new to Go it looks to me that the versioning issues with interfaces that exist on other platforms do not exist in Go, primarily because of the recompilation and static linking requirements. Therefore it should be easy to add a Lock method when the need arises.

  • If we need to pass the lock_owner field to operations like Release or Read or Write we can introduce a Getlockowner function that uses thread local storage to communicate this information to those file systems that care.

from cgofuse.

ncw avatar ncw commented on May 21, 2024

I think punting on lock is fine. I can't think of a FUSE fs which uses that feature.

Although I am new to Go it looks to me that the versioning issues with interfaces that exist on other platforms do not exist in Go, primarily because of the recompilation and static linking requirements. Therefore it should be easy to add a Lock method when the need arises.

That is correct yes.

If we need to pass the lock_owner field to operations like Release or Read or Write we can introduce a Getlockowner function that uses thread local storage to communicate this information to those file systems that care.

Not sure how well Go deals with thread local storage. It is probably OK in this specific case where you are calling back from a C thread and but in general go routines move around between threads in arbitrary ways. There is always runtime.LockOSThread.

My understanding of how threading works with cgo is as follows

  • go calls C function - this must obviously stay on the same thread
    • the runtime marks this thread as locked
    • if it runs for too long the runtime starts a new thread
  • some time later C calls back to go
    • this must be in the same thread
  • I don't know whether if you did a time.Sleep() for instance in the goroutine called from C whether your goroutine could be rescheduled on a different thread - I doubt it as that would mess up the return to C

I think basically because everything is callbacks with cgofuse I think we don't have to worry about the thread local storage not working or call runtime.LockOSThread.

from cgofuse.

advanderveer avatar advanderveer commented on May 21, 2024

File locking on linux leaves a bitter taste for many (as you probably read) so I think it is alright to skip this for now. As you indicate, one could always add something like a LockableFilesystem interface in the future, e.g:

type LockableFilesystemInterface interface{
       FileSystemInterface
       Lock(...)
       Release(...)
}

from cgofuse.

billziss-gh avatar billziss-gh commented on May 21, 2024

Thank you both for the feedback. I will tag cgofuse as v1.0 in a few minutes.

@ncw wrote:

Not sure how well Go deals with thread local storage. It is probably OK in this specific case where you are calling back from a C thread and but in general go routines move around between threads in arbitrary ways. There is always runtime.LockOSThread.

Thanks for this write up. It confirms my own understanding. I note that the current API already uses thread local storage (implicitly) in Getcontext (which uses fuse_get_context).

@advanderveer wrote:

File locking on linux leaves a bitter taste for many (as you probably read) so I think it is alright to skip this for now. As you indicate, one could always add something like a LockableFilesystem interface in the future.

Thanks for the Linux locking pointers and the LockableFilesystem suggestion. I do agree that POSIX locking is a mess and believe that Windows got this one right.

from cgofuse.

billziss-gh avatar billziss-gh commented on May 21, 2024

Cgofuse is tagged as v1.0 and all tests pass.

The API is now FROZEN. Breaking API changes will receive a major version update (2.0). Incremental API changes will receive a minor version update (1.x).

from cgofuse.

ncw avatar ncw commented on May 21, 2024

Hooray! :-) Congratulations on a really good package.

from cgofuse.

billziss-gh avatar billziss-gh commented on May 21, 2024

@ncw thanks. Thanks for also showing me the Go way on multiple occasions :)

from cgofuse.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.