Comments (7)
Or, in lieu of actually supporting this, we need to raise a fatal error when multiple connections are detected (since this can result in data loss and other catastrophic things)
from fiftyone.
That depends a lot on how you're defining "multiple connections" - and in any case, I don't think connections themselves are the issue here. There are several questions to answer:
- Should a user be able to spin up multiple completely isolated instances of FiftyOne, using a separate database, storage location, and set of ports+connections? This would require separate database directories for each instance.
  - If we allow this, how do persistent datasets work? Which instance's data persists? Do all of them persist? If so, how do you specify which database to read persistent datasets from in the future?
  - Update: #65 was assuming that data could live under the virtual environment root (if one exists). That could be an option (and another option could be storing data in the current directory), but similar questions would come up - should a user be able to spin up isolated instances in the same virtual environment or directory? I don't think we should limit users to one instance per virtual environment, because then each instance would need its own copy of FiftyOne's dependencies (a virtual environment that I created yesterday is 530 MB).
- Should a user be able to spin up multiple separate instances of FiftyOne using the same database? This would require reworking the database service to wait for all FiftyOne sessions using it to close before it exits.
  - If we allow this, how do we prevent multiple sessions from breaking each other? As long as they're working with separate datasets, I think they should be fine, but I have no idea how to prevent them from working on the same dataset at the same time.
  - Would there be a situation where a user would want to work on the same dataset from multiple sessions at the same time?
- Should a user be able to connect to multiple remote sessions at the same time? What about a remote and a local session? Most of the problems in this case, except for local port conflicts, would be the same as above but on the remote machine.
- Should different users on the same machine be able to use FiftyOne at the same time? Currently, they will likely run into port conflicts (assuming they're not using separate network namespaces), but I think this is important to support as well. Their data directories are already separated (since they have different home directories), so this is just a matter of avoiding port conflicts.
All of these would require changes to enable the app, database, and server to use arbitrary/random ports. #243 (specifically `_wait_for_child_port`) made some progress in this direction, and the rest should be achievable.
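For illustration, here is a minimal sketch of one common way to get an arbitrary free port: bind to port 0 and let the OS choose. `find_free_port` is a hypothetical helper, not something from FiftyOne or #243, and it has a known race (another process could grab the port between this call and the service actually binding to it).

```python
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Binding to port 0 lets the OS pick any free ephemeral port.
        sock.bind((host, 0))
        # getsockname() returns (host, port) for an AF_INET socket.
        return sock.getsockname()[1]


if __name__ == "__main__":
    print(find_free_port())
```

A more robust variant would have each child service bind to port 0 itself and report the chosen port back to the parent, which avoids the race.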
> Should a user be able to spin up multiple separate instances of FiftyOne using the same database? This would require reworking the database service to wait for all FiftyOne sessions using it to close before it exits.
Yes, the design is that a user always has a single database, and they can interact with it from multiple sessions at the same time. If they modify a dataset in one session, then they'll see that the next time they query the DB from another session. There's no caching right now, so such things are all automatic. The user could cause themselves issues by editing a dataset concurrently in multiple sessions, but that's their prerogative.
In the future, we'll support a cloud-backed database, where, like any other web tool, a user can fire up multiple web-based FiftyOne App tabs and interact with their datasets. If they modify a dataset somewhere, that's reflected the next time they refresh their other tabs.
> Should different users on the same machine be able to use FiftyOne at the same time? Currently, they will likely run into port conflicts (assuming they're not using separate network namespaces), but I think this is important to support as well. Their data directories are already separated (since they have different home directories), so this is just a matter of avoiding port conflicts.
yep, agree that this is important and achievable via port improvements
> Should a user be able to connect to multiple remote sessions at the same time? What about a remote and a local session? Most of the problems in this case, except for local port conflicts, would be the same as above but on the remote machine.
Yes, I think users should be allowed to connect to any combination of multiple remote/local sessions.
Connecting to multiple remote sessions is lower priority though, as the only utility there is having two apps open at the same time. While it should be allowed (since multiple web tabs will be allowed in the future), I don't think we have enough app features yet to compel someone to want to do this.
One clear use case would be a remote user firing up a remote app and then continuing to mess with their dataset in their remote shell, or a different (or multiple) shells. So 1 remote + 1 local is important.
> Yes, the design is that a user always has a single database, and they can interact with it from multiple sessions at the same time. If they modify a dataset in one session, then they'll see that the next time they query the DB from another session. There's no caching right now, so such things are all automatic.
Ok, that's effectively how it works now, as long as the first FiftyOne session you open is the last one that you close, and as long as you don't try to use the app (since there would be port conflicts with existing instances of the app/server). Reworking the database service should resolve those issues and make it behave more consistently.
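As a rough sketch of what "wait for all sessions to close" could look like, the example below tracks live sessions as marker files in a shared directory and reports when the last one has gone. `SessionRegistry` and its methods are hypothetical names for illustration only, not FiftyOne's database service, and the sketch ignores crashed sessions (which would leave stale markers) and the race between the final check and a new session registering.

```python
import os
import tempfile
import uuid


class SessionRegistry:
    """Tracks live sessions as marker files in a shared directory (hypothetical)."""

    def __init__(self, registry_dir):
        self.registry_dir = registry_dir
        os.makedirs(registry_dir, exist_ok=True)

    def register(self):
        """Record a new session and return its ID."""
        session_id = uuid.uuid4().hex
        # Mode "x" creates the marker and fails if it already exists.
        with open(os.path.join(self.registry_dir, session_id), "x"):
            pass
        return session_id

    def unregister(self, session_id):
        """Remove a session; return True if no sessions remain."""
        os.remove(os.path.join(self.registry_dir, session_id))
        return not os.listdir(self.registry_dir)


if __name__ == "__main__":
    registry = SessionRegistry(tempfile.mkdtemp())
    first = registry.register()
    second = registry.register()
    # Closing the first session should not stop the database...
    print(registry.unregister(first))   # False
    # ...but closing the last remaining one may.
    print(registry.unregister(second))  # True
```

With something like this, the order in which sessions open and close would no longer matter: whichever session closes last is the one that lets the database exit.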
> The user could cause themselves issues by editing a dataset concurrently in multiple sessions, but that's their prerogative.
Was this what you were thinking about when you mentioned "data loss and other catastrophic things"? This was my primary concern with a single database too. If we leave it up to the user to not do this (which I think is the approach we take in some other places already), that would make things considerably easier on our end. My biggest concern at that point would be what happens if someone forgets they have an idle session running and opens another one using the same dataset. As long as their changes in the new session are saved as expected, I don't think this would be much of a problem.
I said "data loss and other catastrophic things" bc I don't know the details of the current design and what bad things may happen if you open multiple DB connections.
But, once we update so that multiple connections are allowed, then, yes it's fine if a user has a long-standing shell open that they forgot about. Any data that it has in-memory may be outdated but the next time they query the DB they'll get updated info. Or maybe an error if they are, for example, working with a dataset that has since been deleted in another session.
Putting the appropriate error checks around interacting with a stale dataset is an important part of this implementation.
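To make the stale-dataset case concrete, here is a minimal sketch in which a handle re-checks the shared store on every access and raises a clear error if the dataset has been deleted elsewhere. The dict stands in for the shared database, and `DatasetHandle` and `StaleDatasetError` are hypothetical names, not FiftyOne's actual classes.

```python
class StaleDatasetError(Exception):
    """Raised when a handle refers to a dataset that no longer exists."""


class DatasetHandle:
    """In-memory handle to a named dataset in a shared store (hypothetical)."""

    def __init__(self, store, name):
        self._store = store  # a dict standing in for the shared database
        self.name = name

    def _check_exists(self):
        if self.name not in self._store:
            raise StaleDatasetError(
                f"Dataset '{self.name}' was deleted in another session"
            )

    def count(self):
        # Re-validate against the store on every access instead of
        # trusting state cached when the handle was created.
        self._check_exists()
        return len(self._store[self.name])


if __name__ == "__main__":
    store = {"my-dataset": ["sample1", "sample2"]}
    handle = DatasetHandle(store, "my-dataset")
    print(handle.count())  # 2

    del store["my-dataset"]  # simulates deletion from another session
    try:
        handle.count()
    except StaleDatasetError as err:
        print(err)
```

The point is only that a forgotten shell fails loudly with a specific error rather than silently operating on data that is gone.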