moby / datakit

Connect processes into powerful data pipelines with a simple git-like filesystem interface

License: Apache License 2.0

Makefile 0.25% OCaml 94.38% Shell 0.25% Go 4.45% CSS 0.39% JavaScript 0.06% Dockerfile 0.23%
data-flow database datakit docker filesystem-api pipeline

datakit's Introduction

The Moby Project

Moby is an open-source project created by Docker to enable and accelerate software containerization.

It provides a "Lego set" of toolkit components, the framework for assembling them into custom container-based systems, and a place for all container enthusiasts and professionals to experiment and exchange ideas. Components include container build tools, a container registry, orchestration tools, a runtime and more, and these can be used as building blocks in conjunction with other tools and projects.

Principles

Moby is an open project guided by strong principles, aiming to be modular, flexible and without too strong an opinion on user experience. It is open to the community to help set its direction.

  • Modular: the project includes lots of components that have well-defined functions and APIs that work together.
  • Batteries included but swappable: Moby includes enough components to build fully featured container systems, but its modular architecture ensures that most of the components can be swapped by different implementations.
  • Usable security: Moby provides secure defaults without compromising usability.
  • Developer focused: The APIs are intended to be functional and useful for building powerful tools. They are not necessarily intended as end-user tools but as components aimed at developers. Documentation and UX are aimed at developers, not end users.

Audience

The Moby Project is intended for engineers, integrators and enthusiasts looking to modify, hack, fix, experiment, invent and build systems based on containers. It is not for people looking for a commercially supported system, but for people who want to work and learn with open source code.

Relationship with Docker

The components and tools in the Moby Project are initially the open source components that Docker and the community have built for the Docker Project. New projects can be added if they fit with the community goals. Docker is committed to using Moby as the upstream for the Docker Product. However, other projects are also encouraged to use Moby as an upstream, and to reuse the components in diverse ways, and all these uses will be treated in the same way. External maintainers and contributors are welcomed.

The Moby project is not intended as a location for support or feature requests for Docker products, but as a place for contributors to work on open source code, fix bugs, and make the code more useful. The releases are supported by the maintainers, community and users, on a best efforts basis only, and are not intended for customers who want enterprise or commercial support; Docker EE is the appropriate product for these use cases.


Legal

Brought to you courtesy of our legal counsel. For more context, please see the NOTICE document in this repo.

Use and transfer of Moby may be subject to certain restrictions by the United States and other governments.

It is your responsibility to ensure that your use and/or transfer does not violate applicable laws.

For more information, please see https://www.bis.doc.gov

Licensing

Moby is licensed under the Apache License, Version 2.0. See LICENSE for the full license text.

datakit's People

Contributors

avsm, dave-tucker, djs55, ebriney, hannesm, ijc, kit-ty-kate, mor1, philipdexter, rgrinberg, samoht, simonferquel, talex5, thajeztah, thegaram, tonistiigi, yallop, yomimono

datakit's Issues

Add a way to express and control Git remotes

Currently, everything is stored in one local repository and there is no way to pull/push from other repos. We would like to add a way to allow that, for instance:

mkdir /remotes/origin
echo https://github.com/docker/datakit > /remotes/origin/url
echo master > /remotes/origin/pull
89bfdb2c4f920bc043c044c1945bb2e2d836238d

Go GitHub hooks leaking fids?

The Go hooks often start to fail after a few hours. Logs show:

INFO[58782] Received event: status 30f87880-3917-11e6-99eb-2033dc3be24e  host=... name=... user=docker
2016/06/23 07:50:45 Dialling tcp 127.0.0.1:5640
DEBU[58785] user=docker, repo=...
2016/06/23 07:51:02 I don't know how to flush: leaking 1
2016/06/23 07:51:02 fatal error reading msg: read tcp 127.0.0.1:5640: use of closed network connection

At the DataKit end, the client always seems to die at around fid 127. Maybe this leaking message is related?

Customizable merge functions

See https://github.com/docker/pinata/issues/52

It'd be great if DB users could register their own merge functions to be used when a transaction is committed, (possibly) written in different languages (Lua, Go, OCaml, etc.) or using interesting combinators that we will provide later (such as CRDTs).

Currently the merge functions are statically decided and the one being used is the simplest one: if a file modified in the transaction is also modified in the main branch, then the transaction returns a conflict.
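
The simple rule described above can be sketched as follows. This is a minimal illustration in Python, not Datakit's implementation: trees are modelled as plain dicts from path to contents, and the names simple_merge, base, branch and txn are hypothetical.

```python
def simple_merge(base, branch, txn):
    """Merge a transaction into a branch using the simplest possible rule.

    base   -- tree the transaction started from (dict: path -> contents)
    branch -- current tree of the main branch
    txn    -- tree produced by the transaction

    Returns (merged_tree, []) on success or (None, conflicting_paths).
    A missing key means the file does not exist in that tree.
    """
    def changed(tree):
        # Paths whose contents (or existence) differ from the base tree.
        paths = set(base) | set(tree)
        return {p for p in paths if base.get(p) != tree.get(p)}

    # A path touched on both sides is a conflict, whatever the contents.
    conflicts = sorted(changed(txn) & changed(branch))
    if conflicts:
        return None, conflicts

    # No overlap: replay the transaction's changes on top of the branch.
    merged = dict(branch)
    for path in changed(txn):
        if path in txn:
            merged[path] = txn[path]
        else:
            merged.pop(path, None)  # file was deleted in the transaction
    return merged, []
```

A user-supplied merge function would replace the conflict check with its own per-path resolution, for example by consulting both versions before deciding.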

irmin-io polling continues after client kills cat command

In running through the tutorial, I try the following command to watch a branch.

cat /db/branch/4.03/watch/tree.live

After an initial

irmin.watch [DEBUG] watch [1: 0k/0g]: id=0

the server starts to echo

irmin-io [DEBUG] polling /data/.git/refs/heads: no changes!

every second. When I Ctrl-C the client cat command, the irmin-io logs continue, with no irmin.watch log line signifying that the watch has ended.

Docker version 1.11.2, build b9f10c9

I am running the datakit server as

docker run -it --net datakit-net --name datakit -v $(realpath .):/data docker/datakit

and the client as

docker run -it --privileged --net datakit-net docker/datakit:client

Datakit exits if github token is missing

Looking inside the /github.com directory causes Datakit to quit.
GitHub support should probably be optional, and the server should continue running if the token is missing or doesn't work.

Problem seeing Git branches

  1. Init bare repository (git init --bare .git)
  2. git push to master branch.
  3. Irmin9p doesn't show master. git branch does show master.
  4. Create a foo branch and make a commit, using irmin9p.
  5. Irmin9p shows only foo. git branch shows only foo (master has gone!).
  6. git push master again. Now everything works.

Backend for S3

This would allow big blobs to be downloaded from and uploaded to S3 directly through the filesystem interface.

Pinata CI

This is the tracking issue for the Pinata CI requirements. /cc @MagnusS

The short-term new features that are needed to deploy a local Datakit instance for CI purposes are:

  • a pipeline syntax (#24)
  • a way to express Git remotes (#21)
  • a way to interact with Github API (#19) and repositories (#20)
  • a way to interact with S3 (#20)

Seeking is not allowed in streams

It seems that the terminal likes to seek on the stdout of commands put into the background (not sure why; this needs investigating):

+321731243us fs9p       [DEBUG] S tag 1 Read(len(data): 8)
+321731281us fs9p       [DEBUG] S tag 2 Read(len(data): 8)
+321762447us fs9p       [DEBUG] C ((tag (1))
                                    (payload
                                     (Read
                                      ((fid 9) (offset 16) (count 8168)))))
+321762511us vfs        [DEBUG] offset:16 current-offset:16
+321762530us fs9p       [DEBUG] S tag 1 Read(len(data): 0)
+321762808us fs9p       [DEBUG] C ((tag (2))
                                    (payload
                                     (Read
                                      ((fid 10) (offset 8) (count 8168)))))
+321762977us vfs        [DEBUG] offset:8 count:16
+321763229us fs9p       [DEBUG] S ((tag (2))
                                    (payload
                                     (Err
                                      ((ename "Attempt to seek in stream")
                                       (errno ())))))
+321763377us fs9p       [DEBUG] C ((tag (1))
                                    (payload
                                     (Read
                                      ((fid 9) (offset 16) (count 8168)))))
+321763559us vfs        [DEBUG] offset:16 current-offset:16
+321763578us vgithub    [DEBUG] wait
+321803168us fs9p       [DEBUG] C ((tag (2))
                                    (payload
                                     (Read
                                      ((fid 10) (offset 8) (count 8168)))))
+321803252us vfs        [DEBUG] offset:8 current-offset:16
+321803278us fs9p       [DEBUG] S ((tag (2))
                                    (payload
                                     (Err
                                      ((ename "Attempt to seek in stream")
                                       (errno ())))))
+321803514us fs9p       [DEBUG] C ((tag (2))
                                    (payload
                                     (Read
                                      ((fid 10) (offset 8) (count 8168)))))
+321803670us vfs        [DEBUG] offset:8 current-offset:16
+321803702us fs9p       [DEBUG] S ((tag (2))
                                    (payload
                                     (Err
                                      ((ename "Attempt to seek in stream")
                                       (errno ())))))
+321803967us fs9p       [DEBUG] C ((tag (2))
                                    (payload
                                     (Read
                                      ((fid 10) (offset 8) (count 8168)))))
+321804020us vfs        [DEBUG] offset:8 current-offset:16
+321804039us fs9p       [DEBUG] S ((tag (2))
                                    (payload
                                     (Err
                                      ((ename "Attempt to seek in stream")
                                       (errno ())))))
+321804771us fs9p       [DEBUG] C ((tag (2)) (payload (Clunk ((fid 10)))))
+321804959us fs9p       [DEBUG] S ((tag (2)) (payload (Clunk ())))

To repro:

$ cat /db/github.com/samoht/test/pr/8/updates
< PRESS CTL-G >
$ bg
$ echo pending db/github.com/samoht/test/pr/8/state
pending
pending
cat: read error: No error information
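
The check visible in the log (a read whose offset differs from the current offset is rejected) can be modelled roughly like this. It is only a sketch under assumptions: the Stream class, its methods and SeekError are hypothetical names, and a real server would block when no data is pending instead of returning an empty result.

```python
class SeekError(Exception):
    """Raised when a client reads a stream at an unexpected offset."""


class Stream:
    """Toy model of a stream file: reads must be strictly sequential."""

    def __init__(self):
        self.current_offset = 0     # bytes already handed out to readers
        self.pending = bytearray()  # data produced but not yet read

    def publish(self, data):
        # Producer side: queue new data for the reader.
        self.pending += data

    def read(self, offset, count):
        # A stream has no stored history, so any offset other than the
        # current one is treated as a seek and refused.
        if offset != self.current_offset:
            raise SeekError("Attempt to seek in stream")
        data = bytes(self.pending[:count])
        del self.pending[:count]
        self.current_offset += len(data)
        return data
```

In this model, a retry of an earlier read (the same offset sent again after the stream has advanced) fails in the same way as the offset 8 reads in the log above.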

/db failed: No such device

I'm on CentOS 7 using datakit 51bcded and ran in one terminal:

$ ./scripts/start-datakit.sh

After building finished, in another term I ran:

 datakit-mount
mount: mounting 172.17.0.2 on /db failed: No such device

Is this expected behavior?

Pipeline syntax

We need a way to describe pipeline syntax. Initially, shell scripts are fine, but we might want something nicer. @talex5 can you describe what you have in mind and what is missing?

Git metadata (symlink and exec) are not exposed by Irmin

See https://github.com/docker/pinata/issues/45

Currently Irmin conflates all Git metadata kinds that are not directories.

The full list of possible Git metadata is here:

type perm =
  [ `Normal
  | `Exec
  | `Link
  | `Dir
  | `Commit ]

As a result, symlinks and executables are treated as normal files (and I'm not sure what to do with the Commit bit). A solution would be to (i) write an Irmin backend which can read/write Git metadata; and (ii) when importing a Git repository into the DB, import both the contents and the metadata. Then, when serialising the DB into the container filesystem, use both the imported contents and metadata.
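
For reference, the five kinds above line up with the file modes Git stores in tree entries. The sketch below is illustrative Python rather than Irmin code; the names GIT_MODE_TO_PERM and perm_of_mode are hypothetical, and it assumes that Commit corresponds to Git's gitlink (submodule) entries.

```python
# Octal modes as they appear in Git tree entries, mapped to the perm
# constructors listed above.
GIT_MODE_TO_PERM = {
    0o100644: "Normal",  # regular file
    0o100755: "Exec",    # executable file
    0o120000: "Link",    # symbolic link
    0o040000: "Dir",     # sub-tree (directory)
    0o160000: "Commit",  # gitlink, assumed to be what `Commit` means here
}


def perm_of_mode(mode):
    """Translate a tree-entry mode string such as "100755" into a perm name."""
    return GIT_MODE_TO_PERM[int(mode, 8)]
```

Preserving this value on import, instead of collapsing every non-directory entry to Normal, is the information the proposed backend would need to carry.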

Note: we might need to have a more general system based on continuity at one point.

Linux doesn't handle "Buffer too small" error

Linux reported:

Error checking context: 'readdirent: errno 526'.

The Datakit logs show:

+22823959us fs9p       [DEBUG] C ((tag (1))
                                   (payload
                                    (Walk ((fid 6) (newfid 7) (wnames ())))))
+22824016us fs9p       [DEBUG] S ((tag (1)) (payload (Walk ((wqids ())))))
+22824176us fs9p       [DEBUG] C ((tag (1))
                                   (payload
                                    (Open
                                     ((fid 7)
                                      (mode
                                       ((io Read) (truncate false)
                                        (rclose false) (append false)))))))
+22824287us fs9p       [DEBUG] S ((tag (1))
                                   (payload
                                    (Open
                                     ((qid
                                       ((flags (Directory)) (version 0)
                                        (id 92)))
                                      (iounit 0)))))
+22824465us fs9p       [DEBUG] C ((tag (1))
                                   (payload
                                    (Read ((fid 7) (offset 0) (count 8168)))))
+22824611us fs9p       [DEBUG] S tag 1 Read(len(data): 8129)
+22824847us fs9p       [DEBUG] C ((tag (1))
                                   (payload
                                    (Read ((fid 7) (offset 8129) (count 39)))))
+22824866us fs9p       [DEBUG] S ((tag (1))
                                   (payload
                                    (Err
                                     ((ename "Buffer too small") (errno ())))))

So, it tried to read into a buffer, got a short read and retried with the remaining space in the buffer, which was too small for even a single entry.

The spec says:

For directories, read returns an integral number of directory entries
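
To make the failure mode concrete, here is a rough Python model of a directory read that only ever returns whole entries. It is a sketch, not Datakit's code: read_dir and BufferTooSmall are hypothetical names, and each entry is assumed to be an already-encoded stat record.

```python
class BufferTooSmall(Exception):
    """The requested count cannot hold even one directory entry."""


def read_dir(entries, offset, count):
    """Return as many whole entries as fit in `count` bytes.

    entries -- list of bytes objects, one encoded record per directory member
    offset  -- must fall exactly on an entry boundary
    """
    # Locate the entry that starts at `offset`.
    pos = 0
    for index, entry in enumerate(entries):
        if pos == offset:
            break
        pos += len(entry)
    else:
        index = len(entries)  # offset can only be the end of the directory
    if pos != offset:
        raise ValueError("offset does not fall on an entry boundary")

    out = bytearray()
    for entry in entries[index:]:
        if len(out) + len(entry) > count:
            break  # never split an entry across two reads
        out += entry

    # Entries remain but none fits: this is the case hit in the log above.
    if not out and index < len(entries):
        raise BufferTooSmall("Buffer too small")
    return bytes(out)
```

With entries of 4, 3 and 2 bytes, a read of 8 bytes at offset 0 returns the first two entries (7 bytes). A follow-up read at offset 7 with a count of only 1 then raises BufferTooSmall, which mirrors the retry with count 39 in the trace.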

Create directory in database root fails with "Directory is read-only"

Expected Result

When attempting to create a directory in the database root, Datakit should respond without error.

Actual Result

Datakit responds with Rerror, Directory is read-only

Information

Discovered this while writing C# bindings for Datakit.
When you attempt to create a directory in the database root (branch in this example), Datakit will respond with an Rerror message claiming that the directory is read-only.

+4968us Datakit     Starting com.docker.db...
+5041us Datakit    [DEBUG] Using Git-format store "/data"
+5516us irmin-io   [DEBUG] mkdir /data/.git
+5727us irmin-io   [DEBUG] mkdir /data/.git/tmp
+6009us irmin-io   [INFO] Writing /data/.git/HEAD (/data/.git/tmp/HEADd54231write)
+6405us vgithub    [DEBUG] Datakit does not use the Github bindings
+6580us Datakit    [DEBUG] Waiting for connections on socket "tcp://0.0.0.0:5640"
+101477671us Datakit    [DEBUG] New unix client
+101477709us ivfs-remote [DEBUG] create
+101485531us fs9p       [DEBUG] C ((tag (0))
                                    (payload
                                     (Version
                                      ((msize 16384) (version TwoThousand)))))
+101485591us fs9p       [DEBUG] S ((tag (0))
                                    (payload
                                     (Version
                                      ((msize 16384) (version TwoThousand)))))
+101485685us fs9p       [INFO] Using protocol TwoThousand msize 16384
+101490338us fs9p       [DEBUG] C ((tag (1))
                                    (payload
                                     (Attach
                                      ((fid 1) (afid 0) (uname Dave)
                                       (aname /) (n_uname ())))))
+101490396us fs9p       [DEBUG] S ((tag (1))
                                    (payload
                                     (Attach
                                      ((qid
                                        ((flags (Directory)) (version 0)
                                         (id 5)))))))
+101494532us fs9p       [DEBUG] C ((tag (2))
                                    (payload
                                     (Walk ((fid 1) (newfid 2) (wnames ())))))
+101494581us fs9p       [DEBUG] S ((tag (2)) (payload (Walk ((wqids ())))))
+101495497us fs9p       [DEBUG] C ((tag (3))
                                    (payload
                                     (Walk ((fid 2) (newfid 3) (wnames ())))))
+101495527us fs9p       [DEBUG] S ((tag (3)) (payload (Walk ((wqids ())))))
+101497487us fs9p       [DEBUG] C ((tag (4))
                                    (payload
                                     (Create
                                      ((fid 3) (name branch)
                                       (perm
                                        ((owner (Read Write Execute))
                                         (group (Read Execute))
                                         (other (Read)) (is_directory true)
                                         (append_only false)
                                         (exclusive false) (is_mount false)
                                         (is_auth false) (temporary false)
                                         (is_device false) (is_symlink false)
                                         (is_hardlink false)
                                         (is_namedpipe false)
                                         (is_socket false) (is_setuid false)
                                         (is_setgid false) (is_any false)))
                                       (mode
                                        ((io Read) (truncate false)
                                         (rclose false) (append false)))
                                       (extension ())))))
+101497586us fs9p       [DEBUG] S ((tag (4))
                                    (payload
                                     (Err
                                      ((ename "Directory is read-only")
                                       (errno ())))))

As you can see, subsequent Walk and Create operations within the branch directory DO succeed...

+101515630us fs9p       [DEBUG] C ((tag (5)) (payload (Clunk ((fid 3)))))
+101515671us fs9p       [DEBUG] S ((tag (5)) (payload (Clunk ())))
+101516094us fs9p       [DEBUG] C ((tag (6))
                                    (payload
                                     (Walk
                                      ((fid 2) (newfid 2) (wnames (branch))))))
+101516133us fs9p       [DEBUG] S ((tag (6))
                                    (payload
                                     (Walk
                                      ((wqids
                                        (((flags (Directory)) (version 0)
                                          (id 4))))))))
+101516351us fs9p       [DEBUG] C ((tag (7))
                                    (payload
                                     (Walk ((fid 2) (newfid 4) (wnames ())))))
+101516386us fs9p       [DEBUG] S ((tag (7)) (payload (Walk ((wqids ())))))
+101516688us fs9p       [DEBUG] C ((tag (8))
                                    (payload
                                     (Create
                                      ((fid 4) (name master)
                                       (perm
                                        ((owner (Read Write Execute))
                                         (group (Read Execute))
                                         (other (Read)) (is_directory true)
                                         (append_only false)
                                         (exclusive false) (is_mount false)
                                         (is_auth false) (temporary false)
                                         (is_device false) (is_symlink false)
                                         (is_hardlink false)
                                         (is_namedpipe false)
                                         (is_socket false) (is_setuid false)
                                         (is_setgid false) (is_any false)))
                                       (mode
                                        ((io Read) (truncate false)
                                         (rclose false) (append false)))
                                       (extension ())))))
+101516792us fs9p       [DEBUG] S ((tag (8))
                                    (payload
                                     (Create
                                      ((qid
                                        ((flags (Directory)) (version 0)
                                         (id 13)))
                                       (iounit 0)))))

I've verified the same behaviour with @djs55's Go bindings so I think this is likely a problem with Datakit...

This is the same whether running over a named pipe (on Windows) or over TCP to Datakit running in a Linux container.

Split modules on irmin9p/9p lines

The current structure is a bit odd:

  • The sdk Dockerfile builds the i9p library, which is a combination of a 9p pseudo-filesystem support library and most of irmin9p. It does not appear to contain an SDK.
  • The db Dockerfile builds the rest of irmin9p (the command-line interface and transports) on top of sdk.

I suggest:

  • Move everything to a single Dockerfile.
  • Move the Irmin-specific bits of i9p in with the rest of the irmin9p code.
  • Leave the rest of i9p as a generic 9p support package.

Any objection to this @samoht?

"Out of memory" with large files

Is there a maximum supported size for files? I get the following error when trying to ls a file with a size of 1.5 GB:

fs9p [DEBUG] S ((tag (1))
                               (payload
                                (Err
                                 ((ename "Out of memory") (errno ())))))

With a slightly smaller file (700 MB) I get the following error:

fs9p [DEBUG] S ((tag (1))
                                     (payload
                                      (Err
                                       ((ename
                                         "Unix.Unix_error(Unix.EACCES, \"open\", \"/data/.git/objects/pack/pack-5b5629393a4b074d789894c51cc237f0cc1232f2.idx\")")
                                        (errno ())))))

Smaller files I can successfully ls and cat

First class windows support

I need to make a datakit server which

  • has logging customised for Windows (probably through a custom log reporter)
  • listens on a named pipe
  • listens on Hyper-V sockets
  • is built on Windows in CI (e.g. on Appveyor)

What do you think is the best approach?

head.live doesn't work on Linux 3.16.0 (Debian 8)

Moby (Linux 4.1.18) requires an EOF to push data to the application, with two EOFs in a row meaning a real end-of-file. Perhaps older versions treat a single EOF as end-of-file, because I see:

user@nuc1:~$ uname -a
Linux nuc1 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt20-1+deb8u3 (2016-01-17) x86_64 GNU/Linux
user@nuc1:~$ cat /mnt/datakit/branch/master/head.live
89bfdb2c4f920bc043c044c1945bb2e2d836238d
user@nuc1:~$ 
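
The convention described above (one zero-length read flushes data to the application, two in a row mean a real end-of-file) can be modelled as follows. This is only a sketch of that convention as stated in this issue; live_updates and its read argument are hypothetical, and it says nothing about what the 3.16 kernel actually does.

```python
def live_updates(read):
    """Yield each update from a .live file.

    read -- callable returning the next chunk of bytes; b"" stands for a
            zero-length read, i.e. what the application sees as an EOF.
    """
    pending = bytearray()
    last_was_empty = False
    while True:
        chunk = read()
        if chunk:
            pending += chunk
            last_was_empty = False
            continue
        if last_was_empty:
            return  # two empty reads in a row: real end-of-file
        last_was_empty = True
        if pending:
            # A single empty read only marks the end of one update.
            yield bytes(pending)
            pending.clear()
```

A reader that stops at the first empty read, as cat appears to do in the transcript above, would print the first commit hash and exit.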

Removing empty directories doesn't work

Because Git doesn't store empty directories, mkdir in a transaction's rw directory doesn't change the tree. However, because we do want the new directory to appear in ls listings we add it to extra_dirs.

However, when we remove a directory we don't delete it from its parent's extra_dirs and it still appears in the ls output.
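
The bookkeeping involved can be sketched as below. This is a toy Python model, not the Datakit source: only the name extra_dirs comes from the description above, while TxnDir and its methods are hypothetical.

```python
class TxnDir:
    """Toy model of a directory inside a transaction's rw tree."""

    def __init__(self):
        self.tree = {}           # name -> contents: what Git can store
        self.extra_dirs = set()  # empty directories Git cannot represent

    def mkdir(self, name):
        if name in self.tree or name in self.extra_dirs:
            raise FileExistsError(name)
        # The tree itself is unchanged; remember the directory on the side.
        self.extra_dirs.add(name)

    def rmdir(self, name):
        if name in self.extra_dirs:
            # The step reported as missing: forget the empty directory too.
            self.extra_dirs.remove(name)
        elif name in self.tree:
            del self.tree[name]
        else:
            raise FileNotFoundError(name)

    def ls(self):
        return sorted(set(self.tree) | self.extra_dirs)
```

Without the removal from extra_dirs in rmdir, ls would keep listing the directory after it has been deleted, which matches the behaviour reported here.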

Dangling commits

On a fresh install of pinata, I got lots of dangling commits:

~/Library/Containers/com.docker.docker/Data/database [master L|✔] $ git fsck
Checking object directories: 100% (256/256), done.
dangling commit fe6900771b5713a058474bc5d1ef01382f387180
dangling commit fea2689b2773f602cfc769b5f7178269996debc4
dangling commit 0e922051e4e107552bae0297c398e9de88c7f6a9
dangling commit 11e6239f98cef8b560b0e73e7544bcc88631751c
dangling commit 2598545173b2d5189ef930b3699f14169b24f334
dangling commit 36990a0af1f1abf26cd33151bb0a96c4e6aa2395
dangling commit 3afdc07295054a9f43306f3e5a51d9af8e92f6ee
dangling commit 4708d2968b9859bd714f209997715ad70d220d7c
dangling commit 58fbf2460f7c8efc51ca5d512ac677cab437f9ca
dangling commit 674be8a6e54bbcdeca960261192804ddc4ecb566
dangling commit 6795e43af803f90e2d2013acc0b8db39bf5b3c0c
dangling commit 81c25fa23c220548b27a2e17dc0f95c7bbb3ba4a
dangling commit 89d69dd1aa43042abb7ed4278b304655ebb8e5b6
dangling commit 8e807b65b736cb63b351f79b50d0b1c2d800a5ac
dangling commit b82e074b08d9a2c3dd6d16466909d5ca80fe97fa
dangling commit f1256ee9e5cd87f3b78ab57bf4dfe1e09db2e03d
dangling commit f564753f8d22aac948591057e280170621feb6ce
dangling commit fb21948c079c0160fcbf0e747d4cf7f9d0bdb947

Again I'm not sure if the issue is with the client code or with i9p. Note: this doesn't seem to affect the repo sanity.

Host docs in datakit

Currently we host the API docs on our local server, and everything is very manual. It would be nice if we could integrate this into the rest of the CI flow (e.g. re-publish automatically on every PR, in a location available to all).

Note: GitHub Pages is usually great for that, but it doesn't work well for private repos (i.e. the pages are public).

Better username in commit metadata

Currently, the Git history in pinata looks like this:

commit 7be99481c37abb6a4d92a3686d72491c793af810
Merge: 390f6e1 390f6e1
Author: irmin9p <[email protected]>
Date:   Mon Feb 29 10:56:34 2016 +0000

    (no commit message)

commit 390f6e1b55ab3e7f0b5aaaff9b1d33c779d5d933
Merge: 4209536 4209536
Author: irmin9p <[email protected]>
Date:   Mon Feb 29 10:56:31 2016 +0000

    (no commit message)

Would be great to make this a bit more useful...

Kernel panic with 4.3.0-0.bpo.1-amd64 #1 SMP Debian 4.3.5-1~bpo8+1 (2016-02-23)

On:

root@ucp:~# uname -a
Linux ucp 4.3.0-0.bpo.1-amd64 #1 SMP Debian 4.3.5-1~bpo8+1 (2016-02-23) x86_64 GNU/Linux

Moby build under Datakit failed, with dmesg reporting:

[259075.913336] ------------[ cut here ]------------
[259075.913347] WARNING: CPU: 2 PID: 6525 at /home/zumbi/linux-4.3.5/fs/inode.c:273 drop_nlink+0x39/0x40()
[259075.913353] Modules linked in: 9p 9pnet fscache xt_nat xt_tcpudp veth xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype iptable_filter ip_tables x_tables nf_nat nf_conntrack br_netfilter bridge stp llc dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c loop dm_mod intel_rapl iosf_mbi x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul jitterentropy_rng sha256_ssse3 sha256_generic hmac drbg ansi_cprng aesni_intel evdev aes_x86_64 lrw pcspkr gf128mul glue_helper ablk_helper cryptd autofs4 ext4 crc16 mbcache jbd2 xen_netfront xen_blkfront crc32c_intel
[259075.913404] CPU: 2 PID: 6525 Comm: git Tainted: G        W       4.3.0-0.bpo.1-amd64 #1 Debian 4.3.5-1~bpo8+1
[259075.913409]  0000000000000000 0000000073b84b68 ffffffff812e1ac9 0000000000000000
[259075.913415]  ffffffff81074451 ffff88001064ba80 ffff88006cc23060 0000000000000200
[259075.913421]  ffff8800102a6c18 ffff8800e975b000 ffffffff811ea029 ffff88001064ba80
[259075.913426] Call Trace:
[259075.913431]  [<ffffffff812e1ac9>] ? dump_stack+0x40/0x57
[259075.913435]  [<ffffffff81074451>] ? warn_slowpath_common+0x81/0xb0
[259075.913438]  [<ffffffff811ea029>] ? drop_nlink+0x39/0x40
[259075.913443]  [<ffffffffa02c6791>] ? v9fs_remove+0x71/0xb0 [9p]
[259075.913447]  [<ffffffff811dc6dc>] ? vfs_rmdir+0xac/0x120
[259075.913451]  [<ffffffff811dfbd0>] ? do_rmdir+0x1e0/0x200
[259075.913455]  [<ffffffff8158a9f6>] ? system_call_fast_compare_end+0xc/0x6b
[259075.913458] ---[ end trace ce481105300b8298 ]---

Duplicated parents

In the pinata database, every time a key is changed the commit seems to have a duplicate parent:

$ git cat-file d52cdb50bd829f13c7c6aaacfa0fe8ecc3092660 -p
tree 145d2e6a4ca8f9c665947085dd2b147bd09ca019
parent 910c3dc127eb6e35a88d8316eb1d96232acb39b2
parent 910c3dc127eb6e35a88d8316eb1d96232acb39b2
author irmin9p <[email protected]> 1456743397 +0000
committer irmin9p <[email protected]> 1456743397 +0000

(no commit message)
$ git show d52cdb50bd829f13c7c6aaacfa0fe8ecc3092660
commit d52cdb50bd829f13c7c6aaacfa0fe8ecc3092660
Merge: 910c3dc 910c3dc
Author: irmin9p <[email protected]>
Date:   Mon Feb 29 10:56:37 2016 +0000

    (no commit message)

diff --cc com.docker.driver.amd64-linux/on-sleep
index 316fcb7,316fcb7..280e4ae
--- a/com.docker.driver.amd64-linux/on-sleep
+++ b/com.docker.driver.amd64-linux/on-sleep
@@@ -1,1 -1,1 +1,1 @@@
--shutdown
++freeze
diff --cc com.docker.driver.amd64-linux/schema-version
index d8263ee,d8263ee..e440e5c
--- a/com.docker.driver.amd64-linux/schema-version
+++ b/com.docker.driver.amd64-linux/schema-version
@@@ -1,1 -1,1 +1,1 @@@
--2
++3

Is this an issue with the client library or with i9p?
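
Whichever layer is responsible, the parent list could be normalised before the commit is written. A minimal Python sketch of that idea follows; dedup_parents is a hypothetical helper, not a function of the client library or of i9p.

```python
def dedup_parents(parents):
    """Drop repeated parent hashes while keeping their original order."""
    # dict preserves insertion order, so the first occurrence wins.
    return list(dict.fromkeys(parents))
```

Applied to the commit shown above, the two identical 910c3dc... parents would collapse to one, so Git would no longer display the commit as a merge.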

No rule to make target '../../VERSION'

Following instructions in README:

tal@Thomass-MBP ~/d/db> make
docker build -t datakit-db .
Sending build context to Docker daemon 233.5 kB
Step 1 : FROM datakit-sdk
 ---> 9ddeae81a768
Step 2 : COPY src /home/opam/src/db
 ---> Using cache
 ---> 497f2b16c136
Step 3 : RUN opam pin add db.dev /home/opam/src/db
 ---> Using cache
 ---> c0a41ad7457a
Step 4 : RUN sudo chown -R opam.nogroup /home/opam/src
 ---> Using cache
 ---> 0735a3e9d444
Step 5 : WORKDIR /home/opam/src/db
 ---> Using cache
 ---> 5c6e5fbdb27b
Step 6 : RUN opam config exec make
 ---> Running in c169cb14edcc
make: *** No rule to make target '../../VERSION', needed by 'version.ml'.  Stop.
The command '/bin/sh -c opam config exec make' returned a non-zero code: 2
make: *** [sdk] Error 1

GitHub sync: Please tell me who you are

With auto-push mode, DataKit takes a long time to start. It prints lots of messages like this (one per branch I think), with a pause after each one:

*** Please tell me who you are.

Run

  git config --global user.email "[email protected]"
  git config --global user.name "Your Name"

to set your account's default identity.
Omit --global to set the identity only in this repository.

Can we speed this up?

Use Datakit as CI for Moby

Current status:

  • Builds alpine/kernel OK
  • Builds lots of packages
  • Fails cloning docker repository
Cloning into 'docker.git'...
error: Unable to create /database/branch/initrd-of-5da84d3c468a6ee03d550e74b58bba9439e362a8/transactions/log/rw/alpine/packages/docker/docker.git/.git/HEAD

The immediate issue is that we don't support rename in transactions. However, running Git inside a transaction is probably a bad idea anyway (will need cross-directory rename sooner or later, which 9p doesn't support).

Ideally, I think we want to handle cloning of repositories inside Datakit itself (which would be more efficient than downloading the whole thing on every build). However, a short-term fix would be to clone the repository into /tmp or similar.

Copy fails in transactions

Looks like the inode number changes:

root@ucp:/srv/moby/branch/test/transactions/a/rw# strace -v -y cp foo bar
[...]
stat("bar", 0x7ffcd0cb1740)             = -1 ENOENT (No such file or directory)
stat("foo", {st_dev=makedev(0, 153), st_ino=18456, st_mode=S_IFREG|0664, st_nlink=1, st_uid=4294967294, st_gid=4294967294, st_blksize=8192, st_blocks=0, st_size=0, st_atime=0, st_mtime=0, st_ctime=0}) = 0
stat("bar", 0x7ffcd0cb14d0)             = -1 ENOENT (No such file or directory)
open("foo", O_RDONLY)                   = 3</srv/moby/branch/test/transactions/a/rw/foo>
fstat(3</srv/moby/branch/test/transactions/a/rw/foo>, {st_dev=makedev(0, 153), st_ino=18457, st_mode=S_IFREG|0664, st_nlink=1, st_uid=4294967294, st_gid=4294967294, st_blksize=8192, st_blocks=0, st_size=0, st_atime=0, st_mtime=0, st_ctime=0}) = 0
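
A minimal sketch of why a changing inode number can break the copy. As far as I know, GNU cp compares the device and inode numbers from its earlier stat with those from fstat on the opened file, and skips the file if they differ. The record type and function names below are hypothetical; only the numbers come from the trace.

```ocaml
(* Only the two stat fields that identify a file. *)
type file_id = { st_dev : int; st_ino : int }

(* Both the device and the inode number must match for the two results
   to be treated as the same file. *)
let same_inode a b = a.st_dev = b.st_dev && a.st_ino = b.st_ino

(* Values copied from the strace output above. The device is shown there
   as makedev(0, 153); it is encoded here simply as 153 (an assumption
   made for illustration). *)
let from_stat = { st_dev = 153; st_ino = 18456 }
let from_fstat = { st_dev = 153; st_ino = 18457 }

let () =
  if same_inode from_stat from_fstat then
    print_endline "same file: the copy proceeds"
  else
    print_endline "different file: the copy is skipped"
```

If this reading is right, the server would need to report the same inode number for a given file across calls.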

Make appending efficient

The Linux 9p client writes large files (e.g. the Moby ISO) 8KB at a time. Datakit handles each write by allocating a new buffer, copying the old version into it, and adding the new 8KB, so the total amount of data copied grows quadratically with the file size. This gets slow quickly ;-)

Change the type for files in Ivfs_tree from

  | Blob of Cstruct.t

to

  | Blob of Blob.t

and make a suitable Blob module for efficient non-destructive updates, e.g. copy-on-write, or just a list of blocks in reverse order (concatenated together at the end).
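
A minimal sketch of the "list of blocks in reverse order" option. It is not the Datakit implementation: the module signature is hypothetical, and it uses plain strings instead of Cstruct.t so that it runs with only the standard library. It only covers appending at the end of the file, not writes at arbitrary offsets.

```ocaml
(* Hypothetical Blob module: an immutable, append-friendly byte sequence.
   Chunks are stored newest-first, so append is O(1) and never copies
   the data that is already there. *)
module Blob : sig
  type t
  val empty : t
  val append : t -> string -> t  (* non-destructive: the old value stays valid *)
  val length : t -> int
  val to_string : t -> string    (* one concatenation over all chunks *)
end = struct
  type t = { rev_chunks : string list; length : int }

  let empty = { rev_chunks = []; length = 0 }

  let append t chunk =
    if chunk = "" then t
    else
      { rev_chunks = chunk :: t.rev_chunks;
        length = t.length + String.length chunk }

  let length t = t.length

  (* Restore write order, then concatenate once. *)
  let to_string t = String.concat "" (List.rev t.rev_chunks)
end

let () =
  let a = Blob.append Blob.empty "hello, " in
  let b = Blob.append a "world" in
  Printf.printf "%s (%d bytes); a is still %S\n"
    (Blob.to_string b) (Blob.length b) (Blob.to_string a)
```

With this layout each 8KB write costs one list cons, and the full copy happens only once, when the contents are read back.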

Backend for Github API

This would allow watching for GitHub events directly in the 9p filesystem and interacting with PRs (for instance, toggling the green tick on successful builds).

git.sync: Side_band: 65 is not a valid message type

After running echo master > fetch in a remote dir on a client, I get the following trace on the datakit server:

+158506053us   git.sync [DEBUG] RECEIVED: b5a3d612f75867c7dfd00c48ecf13f325f1ea21c HEAD\x00multi_ack thin-pack side-band side-band-64k ofs-de[..] (197)                                                             
+158506147us   git.sync [DEBUG] PacketLine.input                                                                                                                                                                    
+158506174us   git.sync [DEBUG] PacketLine.input_raw                                                                                                                                                                
+158507520us   git.sync [DEBUG] RECEIVED: b5a3d612f75867c7dfd00c48ecf13f325f1ea21c refs/heads/master\x0A (59)                                                                                                       
+158507560us   git.sync [DEBUG] PacketLine.input                                                                                                                                                                    
+158507588us   git.sync [DEBUG] PacketLine.input_raw                                                                                                                                                                
+158507619us   git.sync [DEBUG] RECEIVED: a95c7d33869a7e485a135b0b5d6887e0ec75ebcb refs/remotes/origin/master\x0A (68)                                                                                              
+158507656us   git.sync [DEBUG] PacketLine.input                                                                                                                                                                    
+158507683us   git.sync [DEBUG] PacketLine.input_raw                                                                                                                                                                
+158509422us   git.sync [DEBUG] RECEIVED: FLUSH                                                                                                                                                                     
+158509460us   git.sync [DEBUG] listing:                                                                                                                                                                            
 CAPABILITIES:                                                                                                                                                                                                      
multi_ack, thin-pack, side-band, side-band-64k, ofs-delta, shallow, no-progress, include-tag, multi_ack_detailed, symref=HEAD:refs/heads/master, agent=git/2.8.3                                                    

REFERENCES:                                                                                                                                                                                                         
a95c7d33869a7e485a135b0b5d6887e0ec75ebcb refs/remotes/origin/master                                                                                                                                                 
b5a3d612f75867c7dfd00c48ecf13f325f1ea21c refs/heads/master                                                                                                                                                          
b5a3d612f75867c7dfd00c48ecf13f325f1ea21c HEAD  

+158509535us   git.sync [DEBUG] Sync.fetch_commits b5a3d612f75867c7dfd00c48ecf13f325f1ea21c                                                                                                                         
+158509577us     git.fs [DEBUG] mem: cache miss!                                                                                                                                                                    
+158509624us     git.fs [DEBUG] mem: cache miss!                                                                                                                                                                    
+158509652us git.fs-packed [DEBUG] list /data/.git                                                                                                                                                                  
+158509897us git.fs-packed [DEBUG] mem_in_pack 5b5629393a4b074d789894c51cc237f0cc1232f2:b5a3d612f75867c7dfd00c48ecf13f325f1ea21c                                                                                    
+158509926us git.fs-packed [DEBUG] read_pack_index 5b5629393a4b074d789894c51cc237f0cc1232f2                                                                                                                         
+158509968us   irmin-io [INFO] Reading /data/.git/objects/pack/pack-5b5629393a4b074d789894c51cc237f0cc1232f2.idx                                                                                                    
+158510013us   git.sync [DEBUG] PHASE1                                                                                                                                                                              
+158510036us   git.sync [DEBUG] Upload.phase1                                                                                                                                                                       
+158510064us   git.sync [DEBUG] Upload.output                                                                                                                                                                       
+158510099us   git.sync [INFO] SENDING: "0069want b5a3d612f75867c7dfd00c48ecf13f325f1ea21c agent=git/ogit.1.8.0 side-band-64k ofs-delta thin-pack\n"                                                                
+158510195us   git.sync [INFO] SENDING: "0000"                                                                                                                                                                      
+158510233us   git.sync [DEBUG] PHASE2                                                                                                                                                                              
+158510380us git.pack-index [DEBUG] create: entering fanout table (ofs=8)                                                                                                                                           
+158510513us git.pack-index [DEBUG] create: n_hashs:3                                                                                                                                                               
+158510561us git.pack-index [DEBUG] create: entering hash listing (ofs=1032)                                                                                                                                        
+158510596us git.pack-index [DEBUG] create: entering crc checksums (ofs=1092)                                                                                                                                       
+158510628us git.pack-index [DEBUG] create: entering packfile offsets (ofs=1104)                                                                                                                                    
+158510658us git.pack-index [DEBUG] create: entering large packfile offsets (ofs=1116)                                                                                                                              
+158510711us git.pack-index [DEBUG] mem: b5a3d612f75867c7dfd00c48ecf13f325f1ea21c                                                                                                                                   
+158510741us git.pack-index [DEBUG] find_offset: b5a3d612f75867c7dfd00c48ecf13f325f1ea21c                                                                                                                           
+158510762us git.pack-index [DEBUG] get_hash_idx: b5a3d612f75867c7dfd00c48ecf13f325f1ea21c                                                                                                                          
+158510790us git.pack-index [DEBUG] find_offset: found:12                                                                                                                                                           
+158510809us git.pack-index [DEBUG] mem: true                                                                                                                                                                       
+158510830us  git.graph [DEBUG] closure                                                                                                                                                                             
+158510871us     git.fs [DEBUG] mem: cache miss!                                                                                                                                                                    
+158510901us     git.fs [DEBUG] mem: cache miss!                                                                                                                                                                    
+158510923us git.fs-packed [DEBUG] list /data/.git                                                                                                                                                                  
+158510982us git.fs-packed [DEBUG] mem_in_pack 5b5629393a4b074d789894c51cc237f0cc1232f2:b5a3d612f75867c7dfd00c48ecf13f325f1ea21c                                                                                    
+158511003us git.fs-packed [DEBUG] read_pack_index 5b5629393a4b074d789894c51cc237f0cc1232f2                                                                                                                         
+158511028us git.fs-packed [DEBUG] read_pack_index cache hit!                                                                                                                                                       
+158511048us git.pack-index [DEBUG] mem: b5a3d612f75867c7dfd00c48ecf13f325f1ea21c                                                                                                                                   
+158511070us git.pack-index [DEBUG] find_offset: b5a3d612f75867c7dfd00c48ecf13f325f1ea21c                                                                                                                           
+158511090us git.pack-index [DEBUG] find_offset: cache hit!                                                                                                                                                         
+158511113us git.pack-index [DEBUG] mem: true                                                                                                                                                                       
+158511182us   git.sync [DEBUG] Upload.output                                                                                                                                                                       
+158511216us   git.sync [INFO] SENDING: "0032have b5a3d612f75867c7dfd00c48ecf13f325f1ea21c\n"                                                                                                                       
+158511259us   git.sync [INFO] SENDING: "0032have a95c7d33869a7e485a135b0b5d6887e0ec75ebcb\n"                                                                                                                       
+158511303us   git.sync [INFO] SENDING: "0009done\n"   
+158511588us   git.sync [DEBUG] Ack.input                                                                                                                                                                           
+158511612us   git.sync [DEBUG] PacketLine.input                                                                                                                                                                    
+158511631us   git.sync [DEBUG] PacketLine.input_raw                                                                                                                                                                
+158511655us   git.sync [DEBUG] RECEIVED: ACK b5a3d612f75867c7dfd00c48ecf13f325f1ea21c\x0A (45)                                                                                                                     
+158511679us   git.sync [DEBUG] PHASE3                                                                                                                                                                              
+158511697us   git.sync [DEBUG] Side_band.input                                                                                                                                                                     
+158511718us   git.sync [DEBUG] PacketLine.input_raw                                                                                                                                                                
+158511831us   git.sync [DEBUG] RECEIVED: ACK a95c7d33869a7e485a135b0b5d6887e0ec75ebcb\x0A (45)                                                                                                                     
+158511870us   git.sync [ERROR] Side_band: 65 is not a valid message type                                                                                                                                           
+158511976us       fs9p [DEBUG] S ((tag (1))                                                                                                                                                                        
                                    (payload                                                                                                                                                                        
                                     (Err                                                                                                                                                                           
                                      ((ename "Sync.Make(IO)(Store).Error")                                                                                                                                         
                                       (errno ())))))  

The url is [email protected]:/data.

I can successfully ssh into the same URL and git clone from it.
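
One possible reading of the trace, sketched below: 65 is the character code of 'A', and the last packet read before the error is a second "ACK ..." line that arrives after the client has already entered PHASE3 and started parsing side-band data. In Git's side-band framing the first payload byte selects the band (1 = pack data, 2 = progress, 3 = error), so the 'A' of "ACK" is rejected. The function names below are hypothetical and are not the ocaml-git API; the error string is copied from the log.

```ocaml
(* pkt-line framing: 4 hex digits giving the total length (the prefix
   itself included), followed by the payload. *)
let pkt_line payload =
  Printf.sprintf "%04x%s" (String.length payload + 4) payload

(* Side-band: the first payload byte of each packet selects the band. *)
type band = Pack_data | Progress | Error_message

let band_of_byte c =
  match Char.code c with
  | 1 -> Ok Pack_data
  | 2 -> Ok Progress
  | 3 -> Ok Error_message
  | n -> Error (Printf.sprintf "Side_band: %d is not a valid message type" n)

(* Interpret the payload of one packet as a side-band message. *)
let classify payload =
  if payload = "" then Error "empty packet" else band_of_byte payload.[0]

let () =
  (* Re-encode one of the "have" lines from the trace. *)
  print_string (pkt_line "have b5a3d612f75867c7dfd00c48ecf13f325f1ea21c\n");
  (* Feed the second ACK line to the side-band parser. *)
  match classify "ACK a95c7d33869a7e485a135b0b5d6887e0ec75ebcb\n" with
  | Ok _ -> print_endline "valid band"
  | Error msg -> print_endline msg
```

The encoder reproduces the length prefixes in the SENDING lines above (0032 for a "have" line, 0009 for "done"), which suggests the outgoing framing is fine and the problem lies in how the responses are consumed.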
