Problem The current implementation of jupyter-server-fileid is har

Support any contents manager about jupyter_server_fileid HOT 7 CLOSED

jupyter-server commented on June 9, 2024

Support any contents manager

from jupyter_server_fileid.

Comments (7)

dlqqq commented on June 9, 2024

@ellisonbg and I were able to discuss this issue at length. Our proposal is that we can simply have file_id_manager_class be a trait on FileIdExtension, defined as follows:

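    from traitlets import Type, default
    from jupyter_server.services.contents.filemanager import FileContentsManager
    
    # in FileIdExtension
    # config=True so the class can be set from a Jupyter config file
    file_id_manager_class = Type(klass=AbstractFileIdManager, config=True)
    
    @default("file_id_manager_class")
    def _default_file_id_manager_class(self):
        # Local contents manager => LocalFileIdManager; anything else => event-only manager
        if isinstance(self.settings["contents_manager"], FileContentsManager):
            return LocalFileIdManager
        return ArbitraryFileIdManager
    
    def initialize_settings(self):
        ...
        self.settings["file_id_manager"] = self.file_id_manager_class(...)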

With this proposal, the following user flows are covered out-of-the-box:

User uses default FileContentsManager => automatically default to using LocalFileIdManager because we know their contents manager only interacts with local filesystem.
User installs custom contents manager with no file ID manager specified in Jupyter config => automatically default to ArbitraryFileIdManager which only listens to emitted events and makes no assumptions about the local filesystem
User installs custom contents manager with custom file ID manager specified in Jupyter config => FileIdExtension inherits that trait and uses their custom file ID manager
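The three flows above can be traced with a small stand-alone sketch. It uses plain Python classes as stand-ins for the real contents managers and file ID managers (no traitlets or jupyter_server imports), and the helper name `resolve_file_id_manager_class` as well as `RemoteContentsManager` and `CustomFileIdManager` are hypothetical names chosen for illustration only.

```python
class FileContentsManager:
    """Stand-in for the default local-filesystem contents manager."""


class RemoteContentsManager:
    """Hypothetical custom contents manager backed by a non-local store."""


class LocalFileIdManager:
    """Stand-in: assumes files live on the local filesystem."""


class ArbitraryFileIdManager:
    """Stand-in: only listens to contents manager events."""


class CustomFileIdManager:
    """Hypothetical manager shipped alongside RemoteContentsManager."""


def resolve_file_id_manager_class(contents_manager, configured=None):
    """Mimic the proposed default: explicit config wins, else pick by type."""
    if configured is not None:
        # Flow 3: the class named in the Jupyter config is used as-is.
        return configured
    if isinstance(contents_manager, FileContentsManager):
        # Flow 1: default contents manager => local file ID manager.
        return LocalFileIdManager
    # Flow 2: unknown contents manager => make no filesystem assumptions.
    return ArbitraryFileIdManager
```

In the actual proposal the "configured" branch is handled by traitlets itself, since a configured value takes precedence over the dynamic default.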

Packages that provide a custom contents manager would specify their custom file ID manager in their Jupyter config file, or leave it blank. For example, to specify a custom file ID manager YYY for contents manager XXX:

{
  "ServerApp": {
    "contents_manager_class": "XXX"
  },
  "FileIdExtension": {
    "file_id_manager_class": "YYY"
  }
}

The only shortcoming is that if the user has multiple contents managers installed locally, passing --ServerApp.contents_manager_class=XXX alone is insufficient, because the user should also pass the corresponding file ID manager. But our rationale is that, to guarantee that XXX works, the user should pass the Jupyter config file it ships with anyway, since the package providing XXX may require additional configuration beyond setting FileIdExtension.file_id_manager_class.

The benefits to this approach are several:

No changes are necessary to Jupyter server
Custom contents managers can specify the file ID manager they need out-of-the-box
Jupyter server does not require a dependency on jupyter_server_fileid

dlqqq commented on June 9, 2024

Hmm, if there's no way to retrieve some inode number equivalent from a filesystem (any immutable attribute that's preserved on moves), then the FileIdManager is essentially useless for anything out-of-band and 80% of the logic can be eliminated.

My proposed series of changes to help address this:

add an abstract class AbstractFileIdManager (see #1)
rename FileIdManager => LocalFileIdManager
write a custom implementation ArbitraryFileIdManager that works on arbitrary filesystems, and just listens to contents manager filesystem events exclusively to track changes. No effort is made to track out-of-band filesystem ops.
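A minimal sketch of what an event-only manager could look like, assuming an in-memory mapping and a small method set (`index`, `get_path`, `move`, `delete`) picked for illustration. These method names and the use of UUIDs as file IDs are assumptions, not the package's actual interface.

```python
import uuid
from abc import ABC, abstractmethod


class AbstractFileIdManager(ABC):
    """Hypothetical minimal interface; the real abstract class may differ."""

    @abstractmethod
    def index(self, path): ...

    @abstractmethod
    def get_path(self, file_id): ...

    @abstractmethod
    def move(self, old_path, new_path): ...

    @abstractmethod
    def delete(self, path): ...


class ArbitraryFileIdManager(AbstractFileIdManager):
    """Tracks IDs purely from contents manager events; no filesystem access."""

    def __init__(self):
        self._id_by_path = {}
        self._path_by_id = {}

    def index(self, path):
        # Assign an ID the first time a path is seen; reuse it afterwards.
        if path not in self._id_by_path:
            file_id = str(uuid.uuid4())
            self._id_by_path[path] = file_id
            self._path_by_id[file_id] = path
        return self._id_by_path[path]

    def get_path(self, file_id):
        return self._path_by_id.get(file_id)

    def move(self, old_path, new_path):
        # Called on a rename event: the ID follows the file to its new path.
        file_id = self._id_by_path.pop(old_path, None)
        if file_id is None:
            return self.index(new_path)
        self.delete(new_path)  # drop any record previously stored at the target
        self._id_by_path[new_path] = file_id
        self._path_by_id[file_id] = new_path
        return file_id

    def delete(self, path):
        file_id = self._id_by_path.pop(path, None)
        if file_id is not None:
            del self._path_by_id[file_id]
```

Because nothing here inspects the filesystem, a move performed outside the contents manager would leave the old path's record stale, which matches the stated limitation about out-of-band operations.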

One problem with this is that manually specifying a custom contents manager would also require manually specifying the matching file ID manager class. Not sure if this is an issue we want to solve. We could either:

Have the file ID extension check whether the contents manager is local simply by checking self.settings["contents_manager"].__class__.__name__, and then pick the right file ID manager class to instantiate.
Add an exclusively_local trait on the abstract contents manager class that informs other extensions of whether it deals exclusively with local filesystems. File ID extension checks for the truthiness of this trait and picks the right file ID manager class to instantiate.
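The two options can be compared with a short sketch. The classes below are plain stand-ins, `PatchedFileContentsManager` and `RemoteContentsManager` are hypothetical, and `exclusively_local` is modeled as an ordinary class attribute rather than a real traitlets trait.

```python
class FileContentsManager:
    """Stand-in for the default local contents manager."""
    exclusively_local = True


class PatchedFileContentsManager(FileContentsManager):
    """Hypothetical subclass that is still purely local."""


class RemoteContentsManager:
    """Hypothetical contents manager for a non-local store."""
    exclusively_local = False


def is_local_by_class_name(contents_manager):
    # Option 1: compare the class name against the known local manager.
    return contents_manager.__class__.__name__ == "FileContentsManager"


def is_local_by_trait(contents_manager):
    # Option 2: ask the contents manager; a missing attribute counts as non-local.
    return bool(getattr(contents_manager, "exclusively_local", False))
```

One design consequence this makes visible: the name check treats a local subclass with a different class name as non-local, while the attribute is inherited by subclasses.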

ellisonbg commented on June 9, 2024

@dlqqq thanks for writing this up. I have thought about the design a bit more since we talked and I think this approach looks good. This will enable us to evolve the file id stuff separately from Jupyter Server, which I think will help at this point.

ellisonbg commented on June 9, 2024

Thinking a bit more overnight. If there are N RTC clients, those clients would need to poll get_paths() on a regular interval, which causes a lot of issues. I think a better approach is as follows:

The file id manager should call sync_all in a subprocess on a regular interval.
Each time there is a path change, it should publish an event on the event bus.

This way, each client won't have to poll the server to get all these updates.
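A rough sketch of the push-based idea, under several assumptions: `scan` is a hypothetical callable standing in for sync_all that returns a `{file_id: path}` snapshot, `publish` stands in for the event bus, and a background thread is used here instead of the subprocess mentioned above to keep the example short.

```python
import threading


def diff_paths(before, after):
    """Return (file_id, old_path, new_path) for every ID whose path changed."""
    changes = []
    for file_id, new_path in after.items():
        old_path = before.get(file_id)
        if old_path != new_path:
            changes.append((file_id, old_path, new_path))
    for file_id, old_path in before.items():
        if file_id not in after:
            # The file disappeared; report it with no new path.
            changes.append((file_id, old_path, None))
    return changes


class PathChangePublisher:
    """Periodically rescans and publishes one event per detected path change."""

    def __init__(self, scan, publish, interval=5.0):
        self._scan = scan
        self._publish = publish
        self._interval = interval
        self._snapshot = scan()
        self._stop = threading.Event()

    def sync_once(self):
        current = self._scan()
        for change in diff_paths(self._snapshot, current):
            self._publish(change)
        self._snapshot = current

    def run_forever(self):
        # wait() returns False on timeout, so this syncs once per interval
        # until stop() is called.
        while not self._stop.wait(self._interval):
            self.sync_once()

    def stop(self):
        self._stop.set()
```

The loop could be started with `threading.Thread(target=publisher.run_forever, daemon=True).start()`; clients would then subscribe to the published events instead of polling get_paths().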

kevin-bates commented on June 9, 2024

Hi @ellisonbg - I agree with your last comment (background sync). Was this intended for the conversation on #20?

ellisonbg commented on June 9, 2024

LOL, yes, still waking up. I will move that comment over to #20, thanks.

davidbrochart commented on June 9, 2024

I opened #24.
