Comments (2)
From a recent discussion with NL, this would likely work for them, but we would need a way to communicate the latest good commit hash to them so that they can roll back lib_store as well.
from irmin.
I have another design to propose to solve crash (in)consistency problems.
Step 1. Change the "index" from "Index" to an append-only file
Since irmin-pack supports minimal indexing, the index grows by 41 bytes per block, which is about 43 MB per year. All the Index machinery is no longer useful when using the minimal indexing mode.
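As a back-of-the-envelope check of that figure, here is a small calculation. It assumes one block every 30 seconds, a block time that is not stated in this issue and is chosen only because it reproduces the quoted number.

```python
# Rough check of the yearly index growth under minimal indexing.
ENTRY_SIZE_BYTES = 41      # from the comment above: 41 bytes per block
BLOCK_TIME_SECONDS = 30    # assumption, not stated in this issue

SECONDS_PER_YEAR = 365 * 24 * 60 * 60
blocks_per_year = SECONDS_PER_YEAR // BLOCK_TIME_SECONDS
growth_bytes = blocks_per_year * ENTRY_SIZE_BYTES

# prints "1051200 blocks/year, 43.1 MB/year"
print(f"{blocks_per_year} blocks/year, {growth_bytes / 1e6:.1f} MB/year")
```

At that rate, a file that is fully loaded in memory stays small for many years.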
Up to now we wanted to keep Index in case minimal indexing failed, so that Tezos could fall back on the non-minimal indexing strategy. Minimal indexing has proven itself, so there is no need to keep that failsafe.
During our initial discussions on implementing irmin-pack's lower layer, it seemed to us that dropping support for non-minimal indexing would simplify the implementation a lot (we would still support stores that used non-minimal indexing in the past).
At open time, the control file now allows us to detect the case where the suffix is ahead of the dict. However, we are still not able to detect the cases where the index is ahead of the suffix (we either raise Pack_store.Invalid_read or worse).
All in all, we can now consider migrating away from Index.
For the index, we could use a storage scheme similar to the dict's. It would be an append-only file that is fully loaded into memory when opening the store, that could be garbage collected, and whose end offset could be remembered by the control file.
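To make the idea concrete, here is a minimal Python sketch of such a file. It is not irmin-pack code: the 41-byte layout (32-byte hash, 8-byte offset, 1-byte kind), the file name and the function names are all assumptions made for illustration.

```python
import os
import struct
import tempfile

# Hypothetical 41-byte entry: 32-byte hash, 8-byte suffix offset, 1-byte kind.
ENTRY = struct.Struct(">32sQB")
assert ENTRY.size == 41

def append_entry(path, key_hash, offset, kind):
    """Append one entry and return the new end offset, to be stored in the control file."""
    with open(path, "ab") as f:
        f.write(ENTRY.pack(key_hash, offset, kind))
        f.flush()
        os.fsync(f.fileno())
        return os.fstat(f.fileno()).st_size

def load_index(path, end_offset):
    """Load all entries up to end_offset; bytes past it are ignored as unacknowledged."""
    if end_offset % ENTRY.size != 0:
        raise ValueError("end offset is not a multiple of the entry size")
    with open(path, "rb") as f:
        data = f.read(end_offset)
    if len(data) < end_offset:
        raise ValueError("index file is shorter than the control file claims")
    return {h: (off, kind) for h, off, kind in ENTRY.iter_unpack(data)}

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "index")  # hypothetical file name
    end = append_entry(path, b"\x01" * 32, 0, 1)
    # Second entry is on disk, but its end offset was never recorded.
    append_entry(path, b"\x02" * 32, 100, 1)
    loaded = load_index(path, end)
```

Here `loaded` only contains the first entry: anything written after the recorded end offset is treated as if it had never been appended, which is the property that makes a crash between the append and the control file update harmless.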
For the GC, we would include a "generation" integer in the index filename. We would GC using the "surgery" technique. We would have to handle newies the same way as for the Irmin 3.4 suffix.
Step 2. Raise Recovery_needed when opening a deeply corrupted store
Currently, with irmin 3.4, when opening a store where the control file is ahead of the dict or the suffix, we raise Inconsistent_store. We currently provide no way to recover these stores.
Following step 1, we would also be able to detect these cases for the index.
For all three of dict, index and suffix, we could then raise Recovery_needed and implement a recovery method. See the next step.
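The open-time check could look roughly like the Python sketch below. The `ControlFile` fields, the file names and the `RecoveryNeeded` class are stand-ins invented for this example, not the actual irmin-pack types.

```python
import os
import tempfile
from dataclasses import dataclass

class RecoveryNeeded(Exception):
    """Stand-in for the proposed Recovery_needed exception."""

@dataclass
class ControlFile:
    # Hypothetical payload: the end offset recorded for each append-only file.
    dict_end: int
    index_end: int
    suffix_end: int

def check_consistency(root, control):
    """Raise RecoveryNeeded if the control file claims more bytes than a file holds."""
    expected = {"dict": control.dict_end,
                "index": control.index_end,
                "suffix": control.suffix_end}
    behind = [name for name, end in expected.items()
              if os.path.getsize(os.path.join(root, name)) < end]
    if behind:
        raise RecoveryNeeded(f"control file is ahead of: {', '.join(behind)}")

with tempfile.TemporaryDirectory() as root:
    for name, size in {"dict": 10, "index": 82, "suffix": 500}.items():
        with open(os.path.join(root, name), "wb") as f:
            f.write(b"\0" * size)
    # Offsets that match the files on disk pass the check.
    check_consistency(root, ControlFile(dict_end=10, index_end=82, suffix_end=500))
    try:
        # The control file claims 123 index bytes but only 82 were persisted.
        check_consistency(root, ControlFile(dict_end=10, index_end=123, suffix_end=500))
        outcome = "opened"
    except RecoveryNeeded as exc:
        outcome = str(exc)
```

A file that is longer than its recorded end offset is not an error in this sketch; the extra bytes are simply unacknowledged.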
Step 3. A new recovery method
Following the 2 previous steps we could implement a recovery method that:
- Decides a new end offset for the index file
- Decides a new end offset for the suffix file
- Decides a new end offset for the dict file
- Overwrites the old control file
The algorithm would search the index for the valid entry with the highest offset. An entry in the index is valid if:
- it points to a valid offset in the suffix/prefix/lower,
- all the objects preceding that object have valid pointers in the dict.
We would also be able to drop the existing "reconstruct index" recovery method.
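The search described in this step can be sketched as follows, in Python and over a deliberately simplified model: objects live only in the suffix, each object lists the dict ids it refers to, and a flag says whether the index has an entry for it. The `Obj` type and all names are assumptions made for illustration, and the sketch also checks the dict pointers of the candidate object itself.

```python
from dataclasses import dataclass

@dataclass
class Obj:
    offset: int       # start offset in the suffix
    length: int       # encoded length in bytes
    dict_ids: tuple   # dict entries this object refers to
    indexed: bool     # True if the index has an entry for this object

def find_last_valid_entry(objects, suffix_size, dict_count):
    """Return the indexed object with the highest offset that is still valid, or None."""
    last_valid = None
    for obj in sorted(objects, key=lambda o: o.offset):
        in_suffix = obj.offset + obj.length <= suffix_size
        dict_ok = all(i < dict_count for i in obj.dict_ids)
        if not (in_suffix and dict_ok):
            break  # everything from this object onwards is considered lost
        if obj.indexed:
            last_valid = obj
    return last_valid

objects = [
    Obj(0, 10, (0,), True),
    Obj(10, 10, (1,), False),
    Obj(20, 10, (1,), True),
    Obj(30, 10, (2,), False),  # refers to dict id 2, which was never persisted
    Obj(40, 10, (0,), True),
]
last = find_last_valid_entry(objects, suffix_size=50, dict_count=2)
new_suffix_end = last.offset + last.length
```

In this example the entry at offset 40 is rejected because an object before it has a dangling dict pointer, so the entry at offset 20 wins and the suffix would be cut at offset 30. The new index and dict end offsets would then be derived from that choice before the control file is overwritten.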
Step 0. a. Migrating stores that only knew minimal indexing
The simplest solution would be a migration that happens at open_rw time in the file manager. It would traverse Index and convert it to the very first index file. A crash during that migration would not be destructive.
A second solution would be to make the existing Index readonly and use the new index file scheme for the new index entries. The migrated irmin-pack stores would forever keep the Index directory. GC would work normally for the new entries.
A third solution, on top of the second solution, would be to migrate the data out of Index during the first GC. We could then discard Index once that first GC has been finalised.
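One way the first option above could stay non-destructive is to write the new file under a temporary name and rename it into place only once it is complete, leaving Index untouched throughout. The Python sketch below illustrates that pattern under stated assumptions: the old Index is modelled as an in-memory dict, and the entry layout and file names are hypothetical.

```python
import os
import struct
import tempfile

ENTRY = struct.Struct(">32sQB")  # hypothetical 41-byte entry layout

def migrate(old_entries, new_path):
    """Write every old entry to new_path, making the result visible atomically."""
    if os.path.exists(new_path):
        return  # migration already completed during a previous open
    tmp_path = new_path + ".tmp"
    # A leftover .tmp file from a crashed attempt is simply overwritten.
    with open(tmp_path, "wb") as f:
        by_offset = sorted(old_entries.items(), key=lambda kv: kv[1][0])
        for key_hash, (offset, kind) in by_offset:
            f.write(ENTRY.pack(key_hash, offset, kind))
        f.flush()
        os.fsync(f.fileno())
    # Atomic rename: a reader sees either no new file or the complete one.
    os.replace(tmp_path, new_path)

old_index = {b"\x0a" * 32: (200, 1), b"\x0b" * 32: (0, 1)}
with tempfile.TemporaryDirectory() as d:
    new_path = os.path.join(d, "index")  # hypothetical file name
    migrate(old_index, new_path)
    migrate(old_index, new_path)         # second call is a no-op
    migrated_size = os.path.getsize(new_path)
    with open(new_path, "rb") as f:
        first_hash, first_offset, _ = ENTRY.unpack(f.read(ENTRY.size))
```

If the process crashes before the rename, the next open finds no new index file and restarts the migration from the intact Index.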
Step 0. b. Migrating stores that knew non-minimal indexing
We would stick to the "second solution" of the previous section.
Discriminating between cases a. and b. would be possible by looking at the existing control file. We already store this information in its current form.