doozerd's Introduction

Doozer

What Is It?

Doozer is a highly-available, completely consistent store for small amounts of extremely important data. When the data changes, it can notify connected clients immediately (no polling), making it ideal for infrequently-updated data for which clients want real-time updates. Doozer is good for name service, database master elections, and configuration data shared between several machines. See When Should I Use It?, below, for details.

See the mailing list to discuss doozer with other users and developers.

Quick Start

  1. Download doozerd

  2. Unpack the archive and put doozerd in your PATH

  3. Repeat for doozer

  4. Start a doozerd with a WebView listening on :8080

     $ doozerd -w ":8080"
    
  5. Set a key and read it back

     $ echo "hello, world" | doozer add /message
     $ doozer get /message
     hello, world
    
  6. Open http://localhost:8080 and see your message

How Does It Work?

Doozer is a network service. A handful of machines (usually three, five, or seven) each run one doozer server process. These processes communicate with each other using a standard fully-consistent distributed consensus algorithm. Clients dial in to one or more of the doozer servers, issue commands, such as GET, SET, and WATCH, and receive responses.

(insert network diagram here)

Each doozerd process has a complete copy of the datastore and serves both read and write requests; there is no distinguished "master" or "leader". Doozer is designed to store data that fits entirely in memory; it never writes data to permanent files. A separate tool provides durable storage for backup and recovery.
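The cluster sizes mentioned above follow from majority-quorum arithmetic. The sketch below assumes the consensus algorithm needs a strict majority of nodes to agree (the text only says "standard" consensus, so treat that as an assumption); `quorum` and `tolerated` are illustrative names, not part of doozer.

```go
package main

import "fmt"

// quorum returns the smallest strict majority of n nodes.
func quorum(n int) int { return n/2 + 1 }

// tolerated returns how many nodes can fail while a majority remains.
func tolerated(n int) int { return n - quorum(n) }

func main() {
	for _, n := range []int{1, 2, 3, 5, 7} {
		fmt.Printf("nodes=%d quorum=%d tolerated failures=%d\n",
			n, quorum(n), tolerated(n))
	}
}
```

Under this assumption a two-node cluster tolerates no failures at all, which is one reason odd sizes are the usual choice.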

When Should I Use It?

Here are some example scenarios:

  1. Name Service

    You have a set of machines that serve incoming HTTP requests. Due to hardware failure, occasionally one of these machines will fail and you replace it with a new machine at a new network address. A change to DNS data would take time to reach all clients, because the TTL of the old DNS record would cause it to remain in client caches for some time.

    Instead of DNS, you could use Doozer. Clients can subscribe to the names they are interested in, and they will get notified when any of those names’ addresses change.

  2. Database Master Election

    You are deploying a MySQL system. You want it to have high availability, so you add slaves on separate physical machines. When the master fails, you might promote one slave to become the new master. At any given time, clients need to know which machine is the master, and the slaves must coordinate with each other during failover.

    You can use doozer to store the address of the current master and all information necessary to coordinate failover.

  3. Configuration

    You have processes on several different machines, and you want them all to use the same config file, which you must occasionally update. It is important that they all use the same configuration.

    Store the config file in doozer, and have the processes read their configuration directly from doozer.
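Scenario 2 depends on every client agreeing on exactly one master. One way to picture that is a compare-and-set on a revisioned key. What follows is a toy in-memory model, not doozer's client API: the `store` type, its `set` method, the sample addresses, and the rule that a write succeeds only when the caller's revision matches the current one are all assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errRevMismatch = errors.New("rev mismatch")

type entry struct {
	rev  int64
	body string
}

// store is a toy revisioned key-value map (hypothetical, not doozer's store).
type store struct {
	mu    sync.Mutex
	rev   int64
	files map[string]entry
}

func newStore() *store { return &store{files: make(map[string]entry)} }

// set writes body to path only if oldRev equals the path's current revision
// (0 means "the path must not exist yet"). It returns the new revision.
func (s *store) set(path string, oldRev int64, body string) (int64, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.files[path].rev != oldRev {
		return 0, errRevMismatch
	}
	s.rev++
	s.files[path] = entry{rev: s.rev, body: body}
	return s.rev, nil
}

func main() {
	s := newStore()
	// Two candidates race to claim /master; both believe it is unset (rev 0).
	for _, addr := range []string{"10.0.0.1:3306", "10.0.0.2:3306"} {
		if _, err := s.set("/master", 0, addr); err != nil {
			fmt.Printf("%s lost the election: %v\n", addr, err)
			continue
		}
		fmt.Printf("%s is now master\n", addr)
	}
}
```

Whichever candidate writes first wins; the other sees the mismatch and can fall back to following the recorded master.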

What can I do with it?

We have a detailed description of the data model.

For ways to manipulate or read the data, see the protocol spec.

Try out doozer's fault-tolerance with some fire drills.

Similar Projects

Doozer is similar to the following pieces of software:

Hacking on Doozer

License and Authors

Doozer is distributed under the terms of the MIT License. See LICENSE for details.

Doozer was created by Blake Mizerany and Keith Rarick. Type git shortlog -s for a full list of contributors.

doozerd's People

Contributors

4ad, antifuchs, bernerdschaefer, bketelsen, bmizerany, chrismoos, dgrijalva, geetarista, kr, mreiferson, varadharajan

doozerd's Issues

0.8.0-darwin-amd64 web process closed

I followed the main README to get a feel for doozerd. The web process returns just: closed retrying in 2s [Try now]

I can get and set messages successfully:

$ doozerd -w ":8080"
$ echo "hello, world" | doozer add /message
81
$ doozer get /message
hello, world

bin/doozer_init talks about "my_cal" (based on two digits at end of hostname) but /ctl/cal seems to be first come first serve

I don't see anything that would make it reliably "my" cal.

doozer_init happily deletes the cal entry, thus making whatever node had that CAL entry hang (it doesn't even realize it got kicked out; other nodes remove it from /ctl/node).

Even if this were enough to kick "others" out of "my_cal" (which it isn't; there's a huge risk of more and more nodes hanging), nothing guarantees that the new doozerd started by doozer_init gets "my_cal". It's racy.

Can't compile - missing pretty.go

I get this error when running make.sh:

package github.com/ha/doozer
    imports code.google.com/p/goprotobuf/proto
    imports github.com/kr/pretty.go
    imports github.com/kr/pretty.go
    imports github.com/kr/pretty.go: no Go source files in .go/src/github.com/kr/pretty.go

Web Interface

Doozerd needs a full admin interface with the ability to view node information, view data, see topology, and more. Add feature requests to this issue.

Test fails on peer_test

it_passes_pkg_peer: [FAIL]
+ cd pkg/peer
+ gotest
rm -f _test/doozer/peer.a
6g -o gotest.6 peer.go liveness.go version.go bench_test.go liveness_test.go misc_test.go peer_test.go
peer_test.go:346: undefined: exec.Run
make: *** [gotest.6] Error 1

gotest: "/local/go/bin/gomake testpackage GOTESTFILES=bench_test.go liveness_test.go misc_test.go peer_test.go" failed: exit status 2

Tests: 9 | Passed: 8 | Failed: 1

latest Go weekly replaces exec.Run() with exec.Command + command.Run

Here are diffs for a (partial) fix to peer_test.go:

rnm-macpro(1063)packages: diff doozerd/src/pkg/peer/peer_test.go doozerd.fixed/src/pkg/peer/peer_test.go
344,354c344,347
< func runDoozer(a ...string) *exec.Cmd {
< path := "/home/kr/src/go/bin/doozerd"
< p, err := exec.Run(
< path,
< append([]string{path}, a...),
< nil,
< "/",
< 0,
< 0,
< 0,
< )
---
> func runDoozer(args ...string) {
> path := "/local/bin/doozerd"
> p := exec.Command(path, args...)
> err := p.Run()
358d350
< return p
rnm-macpro(1064)packages:
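For readers unfamiliar with the API change this diff reacts to, the snippet below shows the `os/exec` form that replaced `exec.Run`. It is a sketch, not the project's test helper: `doozerCmd` is a hypothetical name, and the binary name and flags are placeholders.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// doozerCmd builds (but does not start) a command for the given binary.
// exec.Command takes the program path followed by its arguments and puts
// the program name into cmd.Args[0] itself.
func doozerCmd(path string, args ...string) *exec.Cmd {
	cmd := exec.Command(path, args...)
	cmd.Dir = "/"          // mirrors the "/" directory argument of the removed call
	cmd.Stderr = os.Stderr // surface the child's log output
	return cmd
}

func main() {
	cmd := doozerCmd("doozerd", "-l", "127.0.0.1:8046")
	fmt.Println(cmd.Args)
	// cmd.Run() would start the process and wait for it to exit;
	// cmd.Start() returns immediately so a test can keep going.
}
```

Because `Run` blocks until the child exits, a test that needs a long-lived server would more likely call `Start` and keep the `*exec.Cmd` around to stop the process later.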

doozer / doozerd / go versions

hey,
I am currently trying to compile a working version of doozer and doozerd and have a lot of problems. Which versions play well together?
With go r59, I can compile both projects after adjusting the deprecated StringSort, IntSort and Split functions but there is a protobuffer problem: whenever I try to set a value, I get

  proto: no encoder for Tag *int32 [GetProperties]
  proto: no encoder for Verb *doozer.request_Verb [GetProperties]
  proto: no encoder for Path *string [GetProperties]
  proto: no encoder for Value []uint8 [GetProperties]
  proto: no encoder for OtherTag *int32 [GetProperties]
  proto: no encoder for Offset *int32 [GetProperties]
  proto: no encoder for Rev *int64 [GetProperties]
  proto: no encoder for Tag *int32 [GetProperties]
  proto: no encoder for Flags *int32 [GetProperties]
  proto: no encoder for Rev *int64 [GetProperties]
  proto: no encoder for Path *string [GetProperties]
  proto: no encoder for Value []uint8 [GetProperties]
  proto: no encoder for Len *int32 [GetProperties]
  proto: no encoder for ErrCode *doozer.response_Err [GetProperties]
  proto: no encoder for ErrDetail *string [GetProperties]

if I download the compiled version I get segfaults in golibc ... probably because I have the wrong version.

So: which go version do I need to have installed in order to run the binary version, or -- even better -- which go version will compile doozer and doozerd versions 0.8 fine?

Or (even much better), how can I compile this with protobuffers working with version r59?

Thanks a lot for any hint

service bind port traffic

My Node.js app runs on the same machine as doozerd. The app dynamically allocates a port to bind to, but after it picks a free port, the doozerd service (here, account_service) binds to that port first, and the Node.js app's bind then fails.

Configuration Management

Configuration management needs to be improved. Placeholder issue. Specifics should be added as separate GH issues.

make 4ad/doozerd the primary repository

Currently, 4ad/doozerd compiles against go1 and is more actively maintained. A support ticket to Github made by @ha could "flip" the fork status of the ha/doozerd and 4ad/doozerd.

Similarly, with ha/doozer and 4ad/doozer, I believe.

This would make it more clear to new users which repo is more available (and might help clear up the strong google juice ha/doozerd has over 4ad's).

dependency on code.google.com/p/goprotobuf/proto

I pulled down code... running ./all.sh I get:
package github.com/ha/doozer
imports code.google.com/p/goprotobuf/proto: unable to detect version control system for code.google.com/ path

I changed the following files, replacing the code.google.com imports with the proper GitHub imports:
modified: .travis.yml
modified: consensus/m.pb.go
modified: consensus/m_test.go
modified: consensus/manager.go
modified: consensus/manager_test.go
modified: consensus/run.go
modified: consensus/run_test.go
modified: doc/hacking.md
modified: doc/proto.md
modified: server/conn.go
modified: server/msg.pb.go
modified: server/server_test.go
modified: server/txn.go
modified: web/web.go

I can't find any more references, but somewhere the code is still looking for code.google.com, which prevents the build.

Temporary packet loss causes permanent node hang

With the firedrill 3-node setup, dropping packets for >5 seconds:

sudo iptables -I INPUT --proto udp --dport 8047 -j DROP; sleep 7; sudo iptables -D INPUT --proto udp --dport 8047 -j DROP

(whether I block packets in one direction or both doesn't seem to affect behavior)

causes one or more of the nodes to get kicked out of the cluster, but the victim doesn't realize this happened and just hangs. This is true even after network connectivity is restored.

Interestingly, temporarily blocking the node on port 8047 often causes a different node to get kicked. My latest run actually kicked the nodes on ports 8046 and 8048, thus turning a temporary single-node outage into a cluster failure (as the mailing list has told me, doozer doesn't recover from loss of quorum).

The kicked node never recovers, unless the process is restarted as a whole, but this might be related to #44.

unable to detect version control system for code.google.com

package code.google.com/p/goprotobuf/proto: unable to detect version control system for code.google.com/ path
package code.google.com/p/go.net/websocket: unable to detect version control system for code.google.com/ path

These two libraries are no longer hosted on code.google.com:
code.google.com/p/goprotobuf/proto ==> github.com/golang/protobuf/proto
code.google.com/p/go.net/websocket ==> golang.org/x/net/websocket

So the imports need to change in all files:

import "code.google.com/p/goprotobuf/proto"  to  import "github.com/golang/protobuf/proto"
import "code.google.com/p/go.net/websocket"  to  import "golang.org/x/net/websocket"

Also:
Because the code imports ha/doozerd and ha/doozer themselves, the same change is needed in every file under ${GOPATH}/src/github.com/ha/doozer and ${GOPATH}/src/github.com/ha/doozerd.
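The two substitutions above could be applied mechanically. The following is a minimal sketch that rewrites import paths in a source string; `rewriteImports` is a hypothetical helper, and walking a checkout and writing files back is deliberately left out.

```go
package main

import (
	"fmt"
	"strings"
)

// importFixes maps the retired code.google.com paths to the replacements
// named in the issue text.
var importFixes = strings.NewReplacer(
	`"code.google.com/p/goprotobuf/proto"`, `"github.com/golang/protobuf/proto"`,
	`"code.google.com/p/go.net/websocket"`, `"golang.org/x/net/websocket"`,
)

// rewriteImports returns src with the old import paths replaced.
func rewriteImports(src string) string { return importFixes.Replace(src) }

func main() {
	src := "import (\n" +
		"\t\"code.google.com/p/goprotobuf/proto\"\n" +
		"\t\"code.google.com/p/go.net/websocket\"\n" +
		")\n"
	fmt.Print(rewriteImports(src))
}
```

Changing the import paths alone may not be enough: the compile errors reported elsewhere on this page (missing ProtoMessage method, undefined proto.GetString) suggest the checked-in .pb.go files were generated for an older protobuf API.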

Unable to build doozerd with ./all.sh, errors related protobuf

# github.com/ha/doozer
../doozer/conn.go:182: cannot use &t.req (type *request) as type proto.Message in function argument:
    *request does not implement proto.Message (missing ProtoMessage method)
../doozer/conn.go:196: cannot use &r (type *response) as type proto.Message in function argument:
    *response does not implement proto.Message (missing ProtoMessage method)
../doozer/conn.go:292: undefined: proto.GetInt64
../doozer/conn.go:324: undefined: proto.GetInt64
../doozer/conn.go:410: undefined: proto.GetInt32
../doozer/conn.go:410: undefined: proto.GetInt64
../doozer/err.go:33: undefined: proto.GetString
../doozer/msg.pb.go:10: undefined: proto.GetString
../doozer/msg.pb.go:127: cannot use this (type *request) as type proto.Message in function argument:
    *request does not implement proto.Message (missing ProtoMessage method)
../doozer/msg.pb.go:142: cannot use this (type *response) as type proto.Message in function argument:
    *response does not implement proto.Message (missing ProtoMessage method)
../doozer/msg.pb.go:142: too many errors
# github.com/ha/doozerd/consensus
consensus/m.pb.go:10: undefined: proto.GetString
consensus/m.pb.go:65: cannot use this (type *msg) as type proto.Message in function argument:
    *msg does not implement proto.Message (missing ProtoMessage method)
consensus/manager.go:230: cannot use &m (type *msg) as type proto.Message in function argument:
    *msg does not implement proto.Message (missing ProtoMessage method)
consensus/manager.go:240: cannot use &p.msg (type *msg) as type proto.Message in function argument:
    *msg does not implement proto.Message (missing ProtoMessage method)
consensus/run.go:65: cannot use m (type *msg) as type proto.Message in function argument:
    *msg does not implement proto.Message (missing ProtoMessage method)

building from the go:master and doozer:master

$ ./all.sh 

--- cd pkg/quiet
6g  -o _go_.6 quiet.go 
rm -f _obj/doozer/quiet.a
gopack grc _obj/doozer/quiet.a _go_.6 
cp _obj/doozer/quiet.a "/home/me/go/pkg/linux_amd64/doozer/quiet.a"

--- cd pkg/store
6g  -o _go_.6 event.go getter.go glob.go node.go store.go 
getter.go:59: undefined: sort.SortStrings
store.go:113: too many arguments in call to strings.Split
store.go:173: too many arguments in call to strings.Split
store.go:185: too many arguments in call to strings.Split
make: *** [_go_.6] Error 1
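The errors above come from standard-library API changes. A short sketch of the current forms follows, on the assumption that the old code passed a count to `strings.Split` and called `sort.SortStrings`; the helper names and the sample path are made up for the example.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// splitPath splits a slash-separated path into all of its pieces.
// strings.Split no longer takes a count argument.
func splitPath(p string) []string { return strings.Split(p, "/") }

// splitHead splits p into at most n pieces; strings.SplitN keeps the
// count argument for callers that need a limit.
func splitHead(p string, n int) []string { return strings.SplitN(p, "/", n) }

// sortNames sorts names in place; sort.Strings is the current name.
func sortNames(names []string) { sort.Strings(names) }

func main() {
	path := "/ctl/node/addr"
	fmt.Printf("%q\n", splitPath(path))
	fmt.Printf("%q\n", splitHead(path, 3))

	names := []string{"node", "cal", "ns"}
	sortNames(names)
	fmt.Println(names)
}
```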

Node bounce causes healthy node to be dropped

Given 3 doozerd nodes listening on 8046, 8047, and 8048 (as in the fire drill), when 8048 is killed and restarted, after the timeout window one of the other nodes is evicted from the cluster.

This script will consistently produce these results for me: https://gist.github.com/bernerdschaefer/5714419

Hitting the web UI after the cluster state has stabilized shows something like this:

/
    ctl/
        cal/
            0     (5)       D4HVNXRRANR4YRGQ
            1   (532)       
            2   (477)       73O2WLRB3DVRPG5V
        node/
            4MMJNJ76M5IDQSBQ/
                applied     (573)       572
            73O2WLRB3DVRPG5V/
                addr        (273)       127.0.0.1:8048
                applied     (574)       573
                hostname    (276)       precise64
                version     (279)       0.8+53+g985ed10
                writable    (538)       true
            D4HVNXRRANR4YRGQ/
                addr          (2)       127.0.0.1:8046
                applied     (575)       574
                hostname      (3)       precise64
                version       (4)       0.9.0-alpha
                writable     (57)       true
        ns/
            test/
                4MMJNJ76M5IDQSBQ    (123)       127.0.0.1:8047
                6D32P3ZOQDJIMVEV    (131)       127.0.0.1:8048
                73O2WLRB3DVRPG5V    (533)       127.0.0.1:8048
                D4HVNXRRANR4YRGQ      (6)       127.0.0.1:8046
        err     (541)       rev mismatch
        name      (1)       test

In this case, node 8047 has been (partially) evicted from the cluster: it has been removed from /ctl/cal, and everything except "applied" has been removed from its node info.

At this point, node 8047 is still running but produces no messages in the log. If node 8047 is killed and restarted, the cluster returns to normal operation.

create a service definition in proto?

Maybe this is just my newbishness to protocol buffers, but I'm completely lost.

Maybe it would help if the proto definition defined a service that one could then initialize and call? Because right now I can't seem to do jack with doozerd and protocol buffers by hand. I tried building a request and sending it over a TCP socket to localhost on port 8043, but running doozer from the command line for the path gives me nothing.

Find depth or list at path

We have quite a few groups or roles for machines in our infrastructure, and I would like to use doozer to interact with those roles. A couple of examples would be listing all services for a specific role in DNS lookups, or building a list of nodes to send a command to from another service. Currently, if I were to run doozer find /services/production/east/web, the output would look like the following:

/services/production/east/web
/services/production/east/web/001
/services/production/east/web/001/ip
/services/production/east/web/001/status
/services/production/east/web/001/instance-id
/services/production/east/web/002
/services/production/east/web/002/ip
/services/production/east/web/002/status
/services/production/east/web/002/instance-id
/services/production/east/web/003
/services/production/east/web/003/ip
/services/production/east/web/003/status
/services/production/east/web/003/instance-id

What I would love to be able to do is provide a depth option for find or to add a new command list that would only list the items I want for a given directory. It would be the equivalent of get, but for a directory and not a file. With this I would be able to see this instead:

/services/production/east/web/001
/services/production/east/web/002
/services/production/east/web/003

Thoughts? Am I going about this the wrong way? Or would this be a welcomed feature?
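The same result can be had on the client side by filtering the output of find. The sketch below keeps only the direct children of a directory; `children` is a hypothetical helper, not a doozer command or API.

```go
package main

import (
	"fmt"
	"strings"
)

// children returns the paths that sit exactly one level below dir,
// preserving input order.
func children(dir string, paths []string) []string {
	prefix := strings.TrimSuffix(dir, "/") + "/"
	var out []string
	for _, p := range paths {
		if !strings.HasPrefix(p, prefix) {
			continue
		}
		rest := p[len(prefix):]
		if rest != "" && !strings.Contains(rest, "/") {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	found := []string{
		"/services/production/east/web",
		"/services/production/east/web/001",
		"/services/production/east/web/001/ip",
		"/services/production/east/web/002",
		"/services/production/east/web/002/status",
	}
	for _, p := range children("/services/production/east/web", found) {
		fmt.Println(p)
	}
}
```

This still transfers the whole subtree to the client, so a server-side depth option would remain the cheaper solution for large trees.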

Web ui doesn't seem to work v0.8

Downloaded the prepacked binary (v0.8 amd64)

executed the following commands:

doozerd
ps aux | grep doozerd
doozerd 9660 james    6u  0000     0,9        0     6343 anon_inode
doozerd 9660 james    7u  IPv6 4737813      0t0      UDP localhost:8046 
doozerd 9660 james    8u  IPv6 4737814      0t0      TCP localhost:8000 (LISTEN)
doozerd 9660 james    9r   CHR     1,9      0t0     1034 /dev/urandom

Browsed to localhost:8000/; the page just constantly tries to reconnect to the events web socket, and the socket closes immediately, before the handshake.

Attempted to add a piece of data:

echo "hello, world" | doozer add /message

same deal. Also tried starting with doozerd -w ":8080" and browsing to 8080, same problem.

what am I missing?

Note: I'm encountering the same issue on v0.7. I'm pretty sure I'm doing something wrong but have no clue what;
the lack of build instructions/requirements also makes it extremely difficult to build and debug manually.

Clean up logging

Logging is too verbose. Fix logging to make it consistent, possibly add log levels for more verbosity when necessary.

Option: Change log level of running doozerd instances with signals.
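The signal option could look roughly like the sketch below, which toggles a debug flag on SIGUSR1. It is Unix-only, the choice of SIGUSR1 is an assumption, and `toggleVerbose` and `debugf` are hypothetical names rather than anything in doozerd.

```go
package main

import (
	"log"
	"os"
	"os/signal"
	"sync/atomic"
	"syscall"
)

// verbose is 1 while debug logging is enabled, 0 otherwise.
var verbose int32

// toggleVerbose flips the debug flag and reports the new state.
func toggleVerbose() bool {
	for {
		old := atomic.LoadInt32(&verbose)
		if atomic.CompareAndSwapInt32(&verbose, old, 1-old) {
			return old == 0
		}
	}
}

// debugf logs only while verbose logging is on.
func debugf(format string, args ...interface{}) {
	if atomic.LoadInt32(&verbose) == 1 {
		log.Printf(format, args...)
	}
}

func main() {
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGUSR1)

	// A real daemon would loop over sigs for its whole lifetime. To keep the
	// example finite, the process signals itself once and handles that.
	if err := syscall.Kill(os.Getpid(), syscall.SIGUSR1); err != nil {
		log.Fatal(err)
	}
	<-sigs
	log.Printf("debug logging enabled: %v", toggleVerbose())
	debugf("this line appears only in verbose mode")
}
```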

single instance in cluster left standing

it seems that if a cluster of N nodes is stood up and all but 1 node is taken down, that single node is no longer responsive as demonstrated by the log output below (note: it contains additional debugging statements added as I've been poking around):

reproducing this case should be relatively simple:

$ doozerd &
$ doozerd -l 127.0.0.1:8047 -w false -a 127.0.0.1:8046 &
$ export SECONDPID=$!
$ echo -n | doozer add /ctl/cal/1
$ kill -9 $SECONDPID

Now the remaining instance is in a loop where it is unable to meet the quorum and therefore unable to continue (nor will it respond to any state mutations).

I'm still thinking through what I would expect the daemon to do in this case. At the very least I would expect it to respond (even if those responses were errors for most operations), allowing one to continue to administer the cluster and ideally bring it back into operation.

DOOZER 2012/02/27 23:14:48.799495 applying 443 TICK
DOOZER 2012/02/27 23:14:48.799504 applied m.tick 1
DOOZER 2012/02/27 23:14:48.799509 p.seqn=443 m.next=494
DOOZER 2012/02/27 23:14:49.278193 p.seqn=444 m.next=494
DOOZER 2012/02/27 23:14:49.278264 __DEBUG: acceptor.update cmd:INVITE seqn:444 crnd:2 
DOOZER 2012/02/27 23:14:49.278339 __DEBUG: learner.update &{1 2 2 map[] [false false]  false} cmd:INVITE seqn:444 crnd:2  0
DOOZER 2012/02/27 23:14:49.278485 p.seqn=444 m.next=494
DOOZER 2012/02/27 23:14:49.278558 __DEBUG: acceptor.update cmd:RSVP seqn:444 crnd:2 vrnd:0 value:"" 
DOOZER 2012/02/27 23:14:49.278642 __DEBUG: learner.update &{1 2 2 map[] [false false]  false} cmd:RSVP seqn:444 crnd:2 vrnd:0 value:""  1
DOOZER 2012/02/27 23:14:49.278669 p.seqn=444 m.next=494
DOOZER 2012/02/27 23:14:49.278715 __DEBUG: acceptor.update cmd:RSVP seqn:444 crnd:2 vrnd:0 value:"" 
DOOZER 2012/02/27 23:14:49.278778 __DEBUG: learner.update &{1 2 2 map[] [false false]  false} cmd:RSVP seqn:444 crnd:2 vrnd:0 value:""  0
DOOZER 2012/02/27 23:14:49.278897 p.seqn=444 m.next=494
DOOZER 2012/02/27 23:14:49.278986 __DEBUG: acceptor.update cmd:NOMINATE seqn:444 crnd:2 value:"-1:/ctl/node/T6JCULE4RZN2CE6L/applied=443" 
DOOZER 2012/02/27 23:14:49.279087 __DEBUG: learner.update &{1 2 2 map[] [false false]  false} cmd:NOMINATE seqn:444 crnd:2 value:"-1:/ctl/node/T6JCULE4RZN2CE6L/applied=443"  0
DOOZER 2012/02/27 23:14:49.279221 p.seqn=444 m.next=494
DOOZER 2012/02/27 23:14:49.279341 __DEBUG: acceptor.update cmd:VOTE seqn:444 vrnd:2 value:"-1:/ctl/node/T6JCULE4RZN2CE6L/applied=443" 
DOOZER 2012/02/27 23:14:49.279448 __DEBUG: learner.update &{1 2 2 map[] [false false]  false} cmd:VOTE seqn:444 vrnd:2 value:"-1:/ctl/node/T6JCULE4RZN2CE6L/applied=443"  0
DOOZER 2012/02/27 23:14:49.279480 p.seqn=444 m.next=494
DOOZER 2012/02/27 23:14:49.279534 __DEBUG: acceptor.update cmd:VOTE seqn:444 vrnd:2 value:"-1:/ctl/node/T6JCULE4RZN2CE6L/applied=443" 
DOOZER 2012/02/27 23:14:49.279647 __DEBUG: learner.update &{2 2 2 map[-1:/ctl/node/T6JCULE4RZN2CE6L/applied=443:1] [true false]  false} cmd:VOTE seqn:444 vrnd:2 value:"-1:/ctl/node/T6JCULE4RZN2CE6L/applied=443"  1
DOOZER 2012/02/27 23:14:49.279667 learn seqn=444
DOOZER 2012/02/27 23:14:49.279873 event {444 /ctl/node/T6JCULE4RZN2CE6L/applied 443 444 -1:/ctl/node/T6JCULE4RZN2CE6L/applied=443 <nil> <node>}
DOOZER 2012/02/27 23:14:49.279956 del run 444
DOOZER 2012/02/27 23:14:49.280175 __DEBUG: isLeader T6JCULE4RZN2CE6L VBY7NDUVYVUBIJNF 0 0
DOOZER 2012/02/27 23:14:49.280219 __DEBUG: isLeader VBY7NDUVYVUBIJNF VBY7NDUVYVUBIJNF 0 1
DOOZER 2012/02/27 23:14:49.280236 add run 494
DOOZER 2012/02/27 23:14:49.280287 runs: ..................................................
DOOZER 2012/02/27 23:14:49.280296 avg tick delay: -1
DOOZER 2012/02/27 23:14:49.280304 avg fill delay: -1
DOOZER 2012/02/27 23:14:49.280320 p.seqn=444 m.next=495
DOOZER 2012/02/27 23:14:49.280335 p.seqn=444 m.next=495
DOOZER 2012/02/27 23:14:49.800186 prop &{445 [45 49 58 47 99 116 108 47 110 111 100 101 47 86 66 89 55 78 68 85 86 89 86 85 66 73 74 78 70 47 97 112 112 108 105 101 100 61 52 52 52]}
DOOZER 2012/02/27 23:14:49.800214 p.seqn=445 m.next=495
DOOZER 2012/02/27 23:14:49.800234 sched tick=1 seqn=445 t=197605
DOOZER 2012/02/27 23:14:49.800333 __DEBUG: acceptor.update cmd:PROPOSE seqn:445 value:"-1:/ctl/node/VBY7NDUVYVUBIJNF/applied=444" 
DOOZER 2012/02/27 23:14:49.800464 __DEBUG: learner.update &{1 2 2 map[] [false false]  false} cmd:PROPOSE seqn:445 value:"-1:/ctl/node/VBY7NDUVYVUBIJNF/applied=444"  -1
DOOZER 2012/02/27 23:14:49.800548 p.seqn=445 m.next=495
DOOZER 2012/02/27 23:14:49.800643 __DEBUG: acceptor.update cmd:INVITE seqn:445 crnd:3 
DOOZER 2012/02/27 23:14:49.800715 __DEBUG: learner.update &{1 2 2 map[] [false false]  false} cmd:INVITE seqn:445 crnd:3  1
DOOZER 2012/02/27 23:14:49.800845 p.seqn=445 m.next=495
DOOZER 2012/02/27 23:14:49.800905 __DEBUG: acceptor.update cmd:RSVP seqn:445 crnd:3 vrnd:0 value:"" 
DOOZER 2012/02/27 23:14:49.800981 __DEBUG: learner.update &{1 2 2 map[] [false false]  false} cmd:RSVP seqn:445 crnd:3 vrnd:0 value:""  1
DOOZER 2012/02/27 23:14:49.810202 applying 445 TICK
DOOZER 2012/02/27 23:14:49.810221 applied m.tick 1
DOOZER 2012/02/27 23:14:49.810231 p.seqn=445 m.next=495
DOOZER 2012/02/27 23:14:49.810242 tick wasteful=false
DOOZER 2012/02/27 23:14:49.810261 sched tick=2 seqn=445 t=3929262
DOOZER 2012/02/27 23:14:49.810332 __DEBUG: acceptor.update cmd:TICK seqn:445 
DOOZER 2012/02/27 23:14:49.810443 __DEBUG: learner.update &{1 2 2 map[] [false false]  false} cmd:TICK seqn:445  -1
DOOZER 2012/02/27 23:14:49.810513 p.seqn=445 m.next=495
DOOZER 2012/02/27 23:14:49.810588 __DEBUG: acceptor.update cmd:INVITE seqn:445 crnd:5 
DOOZER 2012/02/27 23:14:49.810637 __DEBUG: learner.update &{1 2 2 map[] [false false]  false} cmd:INVITE seqn:445 crnd:5  1
DOOZER 2012/02/27 23:14:49.810749 p.seqn=445 m.next=495
DOOZER 2012/02/27 23:14:49.810814 __DEBUG: acceptor.update cmd:RSVP seqn:445 crnd:5 vrnd:0 value:"" 
DOOZER 2012/02/27 23:14:49.810865 __DEBUG: learner.update &{1 2 2 map[] [false false]  false} cmd:RSVP seqn:445 crnd:5 vrnd:0 value:""  1
DOOZER 2012/02/27 23:14:49.820312 applying 445 TICK

update docs w/r/t builds

Is anyone using this project in production? Builds are failing. Any idea what is wrong and when it will be fixed? Thanks.

garbage input crashes doozerd

Steps to Reproduce

$ ./doozerd &
$ echo foo | nc localhost 8046

Result

doozerd crashes here:

throw: runtime: out of memory
goroutine 26 [running]:
github.com/ha/doozerd/server.(*conn).read(0xf8400dfa00, 0xf840142008, 0xf800000000, 0x7f8ef9422f70, 0x51d0ee, ...)
    /home/ubuntu/doozer_compile/src/github.com/ha/doozerd/server/conn.go:50 +0x220
github.com/ha/doozerd/server.(*conn).serve(0xf8400dfa00, 0x67cb94)
    /home/ubuntu/doozer_compile/src/github.com/ha/doozerd/server/conn.go:31 +0x4a
github.com/ha/doozerd/server.serve(0xf84007a3c0, 0xf84008c590, 0xf840000690, 0xf8400d5300, 0xf8400bc6a0, ...)
    /home/ubuntu/doozer_compile/src/github.com/ha/doozerd/server/server.go:52 +0x191
created by github.com/ha/doozerd/server.ListenAndServe
    /home/ubuntu/doozer_compile/src/github.com/ha/doozerd/server/server.go:35 +0x342

The reason is that this:

err := binary.Read(c.c, binary.BigEndian, &size)
buf := make([]byte, size)

returns a very big size from Read in this case (hence the oom)
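A common guard against this class of crash is to validate the length prefix before allocating. The sketch below assumes a 4-byte big-endian signed length prefix and an arbitrary 1 MiB cap; `readFrame`, `parseFrame`, and `maxFrame` are hypothetical names, not code from doozerd.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

const maxFrame = 1 << 20 // arbitrary cap chosen for this sketch

var errFrameSize = errors.New("frame size out of range")

// readFrame reads one length-prefixed frame, rejecting implausible sizes
// before any buffer is allocated.
func readFrame(r io.Reader) ([]byte, error) {
	var size int32
	if err := binary.Read(r, binary.BigEndian, &size); err != nil {
		return nil, err
	}
	if size < 0 || size > maxFrame {
		return nil, errFrameSize
	}
	buf := make([]byte, size)
	if _, err := io.ReadFull(r, buf); err != nil {
		return nil, err
	}
	return buf, nil
}

// parseFrame is a convenience wrapper for in-memory input.
func parseFrame(b []byte) ([]byte, error) {
	return readFrame(bytes.NewReader(b))
}

func main() {
	// "foo\n" is what `echo foo | nc` sends; read as a length it is huge.
	_, err := parseFrame([]byte("foo\n"))
	fmt.Println("garbage:", err)

	body, err := parseFrame(append([]byte{0, 0, 0, 5}, "hello"...))
	fmt.Printf("valid: %q %v\n", body, err)
}
```

With that prefix assumption, the four bytes of "foo\n" decode to a length of roughly 1.7 GB, which is consistent with the out-of-memory crash in the report.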

Security : Encrypted communication between doozerd instances

If I want to run doozerd on 3 machines, how do I make sure that no one can access them except those machines (probably just iptables? password support?)? Also, I want the communication between the doozerd instances to go over SSL. Can this be done, and if so, how?

doozerd leaks memory?

I left doozerd running for less than two days, and it ended up consuming surprising amounts of RAM:

$ date
Sat Apr 23 11:17:45 PDT 2011
$ ps uww 27644
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
tv       27644  0.7 10.5 454924 400816 pts/172 Sl+  Apr21  17:01 doozerd

The sequence numbers had reached ~80k, but there was nothing else interesting in the logs:

DOOZER 2011/04/23 11:17:55.065502 learn seqn=79952

For comparison, here's doozerd at seqn=100, which seems like an adequate warmup (though it probably won't read max size before gc):

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
tv       19181  1.4  0.1  42252  4220 pts/172  Sl   11:24   0:00 doozerd

This is on an amd64 ubuntu 10.10 machine, with 6g from go.git f1f3e6292aecd165692c41cd2dd2f5886b5958eb (2011-04-15).

I don't know yet if this is an issue with Go's garbage collection, but given how nice (really nice!) and simple doozerd is, I didn't expect the 10x growth, and no other Go app has made me really wonder about the size.

wish: automatic bootstrapping

Having a special node that needs to be started without -a, but only once, makes automation painful. Every doozerd should be able to always be started with a -a listing all the other doozerd nodes, and they should be able to figure out bootstrapping by themselves. (Though this may require persistency.)

In Ceph, an earlier project that uses Paxos for cluster state, we solved this by letting the admin list "seed nodes"; the cluster bootstraps once a quorum of those seed nodes is willing to agree on bootstrapping: http://ceph.com/docs/master/rados/configuration/mon-config-ref/#initial-members . This lets Ceph do, for example, automatic deploys with 7 monitors: they can come up at any time, and when they're ready, they reach quorum. No human intervention!

Persistence

Add (maybe optional?) persistence to Doozerd. Needs much thought and discussion.

0.8 (everything) and current master (web view) not working on Mac OS 10.8

When compiling from today's master, I get the same issue originally reported as #32: the web view doesn't work (can't connect). When using the 0.8 packages in downloads, both 32 and 64 bit binaries give the same result on my machine:

$ doozerd
libcgo: thread-local storage 0x108 not at %gs:0x8a0 - x=0 y=0
Abort trap: 6

Almost the same result as root:

$ sudo doozerd
libcgo: thread-local storage 0x108 not at %gs:0x8a0 - x=0 y=0

And it's the same trying to use the help command:

$ doozerd --help
libcgo: thread-local storage 0x108 not at %gs:0x8a0 - x=0 y=0
Abort trap: 6

Do you need more information?

Distinguish between create and update during change notifications

When Doozerd sends out notifications about changes that have happened to data it includes a flag that indicates whether it was a modification or a delete.

Can Doozerd send back a flag to distinguish between updating existing data and the creation of new data?

This would allow clients to implement code based on whether data is new or being modified.

Are there consequences to storing additional information in each of the ctl/node entries?

I've played around with storing a node's public ssh key (among other things) directly within its ctl/node entry. It seems to work nicely, avoiding the need to duplicate the same data at a user-defined path. But I'm concerned about the consequences of using a system-defined path for such things. Is this safe? If not currently, can you devise a standard for allowing it so that the system will never step on user-defined files?

Remove DZNS

DZNS is confusing, and not used in practice. Remove documentation for dzns, and relics of dzns code.
