Comments (7)

brimoor commented on May 21, 2024

Or, in lieu of actually supporting this, we need to raise a fatal error when multiple connections are detected (since this can result in data loss and other catastrophic things).

from fiftyone.

lethosor commented on May 21, 2024

That depends a lot on how you're defining "multiple connections" - and in any case, I don't think connections themselves are the issue here. There are several questions to answer:

  • Should a user be able to spin up multiple completely isolated instances of FiftyOne, using a separate database, storage location, and set of ports+connections? This would require separate database directories for each instance.
    • If we allow this, how do persistent datasets work? Which instance's data persists? Do all of them persist? If so, how do you specify which database to read persistent datasets from in the future?
      • Update: #65 was assuming that data could live under the virtual environment root (if one exists). That could be an option (and another option could be storing data in the current directory), but similar questions would come up - should a user be able to spin up isolated instances in the same virtual environment or directory? I don't think we should limit users to one instance per virtual environment, because then each instance would need its own copy of FiftyOne's dependencies (a virtual environment that I created yesterday is 530 MB).
  • Should a user be able to spin up multiple separate instances of FiftyOne using the same database? This would require reworking the database service to wait for all FiftyOne sessions using it to close before it exits.
    • If we allow this, how do we prevent multiple sessions from breaking each other? As long as they're working with separate datasets, I think they should be fine, but I have no idea how to prevent them from working on the same dataset at the same time.
      • Would there be a situation where a user would want to work on the same dataset from multiple sessions at the same time?
  • Should a user be able to connect to multiple remote sessions at the same time? What about a remote and a local session? Most of the problems in this case, except for local port conflicts, would be the same as above but on the remote machine.
  • Should different users on the same machine be able to use FiftyOne at the same time? Currently, they will likely run into port conflicts (assuming they're not using separate network namespaces), but I think this is important to support as well. Their data directories are already separated (since they have different home directories), so this is just a matter of avoiding port conflicts.

All of these would require changes to enable the app, database, and server to use arbitrary/random ports. #243 (specifically _wait_for_child_port) made some progress in this direction, and the rest should be achievable.
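As a rough illustration of the "arbitrary/random ports" idea, here is a minimal sketch that asks the OS for an unused TCP port by binding to port 0. The helper name `find_free_port` is hypothetical and is not part of FiftyOne or of #243; it only shows the general technique.

```python
import socket


def find_free_port(host="127.0.0.1"):
    """Return a TCP port that was unused at the time of the call.

    Hypothetical helper for illustration only. Binding to port 0 lets the
    OS pick any free port; getsockname() then reports which one it chose.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]


if __name__ == "__main__":
    port = find_free_port()
    print(f"a service could be started on port {port}")
```

One caveat with this approach: the port is released when the socket closes, so another process could grab it before the service binds to it. Having the child process bind to port 0 itself and report the chosen port back to the parent avoids that race.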

brimoor commented on May 21, 2024

Should a user be able to spin up multiple separate instances of FiftyOne using the same database? This would require reworking the database service to wait for all FiftyOne sessions using it to close before it exits.

Yes, the design is that a user always has a single database, and they can interact with it from multiple sessions at the same time. If they modify a dataset in one session, then they'll see that the next time they query the DB from another session. There's no caching right now, so such things are all automatic. The user could cause themselves issues by editing a dataset concurrently in multiple sessions, but that's their prerogative.

In the future, we'll support a cloud-backed database, where, like any other web tool, a user can fire up multiple web-based FiftyOne App tabs and interact with their datasets. If they modify a dataset somewhere, that's reflected the next time they refresh their other tabs.

brimoor commented on May 21, 2024

Should different users on the same machine be able to use FiftyOne at the same time? Currently, they will likely run into port conflicts (assuming they're not using separate network namespaces), but I think this is important to support as well. Their data directories are already separated (since they have different home directories), so this is just a matter of avoiding port conflicts.

Yep, agreed that this is important and achievable via port improvements.

brimoor commented on May 21, 2024

Should a user be able to connect to multiple remote sessions at the same time? What about a remote and a local session? Most of the problems in this case, except for local port conflicts, would be the same as above but on the remote machine.

Yes, I think users should be allowed to connect to any combination of multiple remote/local sessions.

Connecting to multiple remote sessions is lower priority though, as the only utility there is having two apps open at the same time. While it should be allowed (since multiple web tabs will be allowed in the future), I don't think we have enough app features yet to compel someone to want to do this.

One clear use case would be a remote user firing up a remote app and then continuing to mess with their dataset in their remote shell, or a different (or multiple) shells. So 1 remote + 1 local is important.

lethosor commented on May 21, 2024

Yes, the design is that a user always has a single database, and they can interact with it from multiple sessions at the same time. If they modify a dataset in one session, then they'll see that the next time they query the DB from another session. There's no caching right now, so such things are all automatic.

Ok, that's effectively how it works now, as long as the first FiftyOne session you open is the last one that you close, and as long as you don't try to use the app (since there would be port conflicts with existing instances of the app/server). Reworking the database service should resolve those issues and make it behave more consistently.
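To make the "wait for all sessions to close" idea concrete, here is a minimal sketch of the bookkeeping, assuming a simple reference count. The `SharedService` class is hypothetical and runs in a single process; the real database service spans several processes and would need cross-process coordination (for example a lock file or a small control socket), which this sketch does not attempt.

```python
import threading


class SharedService:
    """Hypothetical stand-in for a service shared by several sessions."""

    def __init__(self):
        self._lock = threading.Lock()
        self._clients = 0
        self.running = False

    def acquire(self):
        """Register a session, starting the service if it is the first."""
        with self._lock:
            if self._clients == 0:
                self.running = True  # a real service would be started here
            self._clients += 1

    def release(self):
        """Unregister a session, stopping the service if it was the last."""
        with self._lock:
            if self._clients == 0:
                raise RuntimeError("release() called with no active sessions")
            self._clients -= 1
            if self._clients == 0:
                self.running = False  # a real service would be stopped here


if __name__ == "__main__":
    service = SharedService()
    service.acquire()  # first session opens
    service.acquire()  # second session opens
    service.release()  # first session closes; service must stay up
    print(service.running)
    service.release()  # last session closes; service can stop
    print(service.running)
```

With this kind of counting, the order in which sessions close no longer matters: the service stops only when the last remaining session releases it.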

The user could cause themselves issues by editing a dataset concurrently in multiple sessions, but that's their prerogative.

Was this what you were thinking about when you mentioned "data loss and other catastrophic things"? This was my primary concern with a single database too. If we leave it up to the user to not do this (which I think is the approach we take in some other places already), that would make things considerably easier on our end. My biggest concern at that point would be what happens if someone forgets they have an idle session running and opens another one using the same dataset. As long as their changes in the new session are saved as expected, I don't think this would be much of a problem.

brimoor commented on May 21, 2024

I said "data loss and other catastrophic things" because I don't know the details of the current design and what bad things may happen if you open multiple DB connections.

But once we update so that multiple connections are allowed, then yes, it's fine if a user has a long-standing shell open that they forgot about. Any data that it has in memory may be outdated, but the next time they query the DB they'll get updated info, or maybe an error if they are, for example, working with a dataset that has since been deleted in another session.

Putting the appropriate error checks around interacting with a stale dataset is an important part of this implementation.
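A minimal sketch of what such a check could look like, assuming every operation on a dataset handle first confirms that the dataset still exists. The names `DatasetHandle` and `DatasetDeletedError` are hypothetical, and a plain dict stands in for the shared database; none of this is FiftyOne's actual API.

```python
class DatasetDeletedError(Exception):
    """Raised when a handle refers to a dataset that no longer exists."""


class DatasetHandle:
    """Hypothetical handle that re-checks the backing store on every use."""

    def __init__(self, store, name):
        self._store = store  # stand-in for the shared DB: name -> samples
        self.name = name

    def _require_exists(self):
        if self.name not in self._store:
            raise DatasetDeletedError(
                f"Dataset '{self.name}' no longer exists; it may have been "
                "deleted in another session"
            )

    def count(self):
        """Return the number of samples, always read fresh from the store."""
        self._require_exists()
        return len(self._store[self.name])


if __name__ == "__main__":
    store = {"demo": ["a.jpg", "b.jpg"]}
    handle = DatasetHandle(store, "demo")
    print(handle.count())

    del store["demo"]  # simulates another session deleting the dataset
    try:
        handle.count()
    except DatasetDeletedError as err:
        print(err)
```

Because nothing is cached on the handle, a forgotten session either sees current data or fails with a clear error instead of silently operating on stale state.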
