
murder's Introduction

Murder by Larry Gadea and Matt Freels

Copyright 2010-2012 Twitter Inc.

WARNING: This project, though still functional, is no longer maintained

DESCRIPTION

Murder is a method of using BitTorrent to distribute files to a large number of servers within a production environment. This allows for scalable and fast deploys in environments of hundreds to tens of thousands of servers, where centralized distribution systems wouldn't otherwise function. A "murder" normally refers to a flock of crows, which in this case applies to a bunch of servers doing something.

For an intro video, see: Twitter - Murder Bittorrent Deploy System

QUICK START

For the impatient, gem install murder and add these lines to your Capfile:

require 'murder'

set :deploy_via, :murder
after 'deploy:setup', 'murder:distribute_files'
before 'murder:start_seeding', 'murder:start_tracker'
after 'murder:stop_seeding', 'murder:stop_tracker'

HOW IT WORKS

A Murder transfer requires several components to be set up beforehand, many of them a result of the BitTorrent nature of the system. Murder is based on BitTornado.

  • A torrent tracker. This tracker, started by running the 'murder_tracker.py' script, runs a self-contained server on one machine. Although technically this is still a centralized system (every host relies on this tracker), the communication between this server and the rest is minimal and normally acceptable. To keep things simple, tracker-less distribution (DHT) is currently not supported. The tracker is actually just a mini-httpd that hosts an /announce path to which the BitTorrent clients report their state.

  • A seeder. This is the server which has the files that you'd like to deploy onto all other servers. The files are placed into a directory that a torrent gets created from. Murder will tgz up the directory and create a .torrent file (a very small file containing basic hash information about the tgz file). This .torrent file lets the peers know what they're downloading. The tracker keeps track of which .torrent files are currently being distributed. Once a Murder transfer is started, the seeder will be the first server many machines contact to get pieces. These pieces will then spread through the rest of the network in a tree-like fashion, so most peers do not necessarily get their parts from the seeder.

  • Peers. This is the group of servers (hundreds to tens of thousands) which will be receiving the files and distributing the pieces amongst themselves. Once a peer is done downloading the entire tgz file, it will continue seeding for a while to prevent a hotspot effect on the seeder.
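The "basic hash information" mentioned for the .torrent file can be pictured with a minimal sketch. BitTorrent identifies each piece of the payload by a SHA-1 hash; the piece size and the helper name `piece_hashes` below are assumptions made for illustration, and hex digests are used only for readability. This is not murder's or BitTornado's actual code.

```ruby
require 'digest/sha1'

# Illustrative only: split a payload into fixed-size pieces and hash each one.
# The default piece length (256 KiB) is a hypothetical value, not murder's setting.
def piece_hashes(data, piece_length = 256 * 1024)
  hashes = []
  offset = 0
  while offset < data.bytesize
    piece = data.byteslice(offset, piece_length)
    hashes << Digest::SHA1.hexdigest(piece)
    offset += piece_length
  end
  hashes
end

payload = 'x' * (2 * 256 * 1024 + 10) # stand-in for the tgz contents
puts "#{piece_hashes(payload).length} pieces"
```

A peer that receives a piece can recompute its hash and compare it with the list, which is what lets pieces come safely from any other peer rather than only from the seeder.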

CONFIGURATION AND USAGE

Murder integrates with Capistrano. The simplest way to use it is as a deploy strategy, by setting :deploy_via to :murder. By default, murder makes the same assumptions that cap makes. All servers without :no_release => true will act as peers. Additionally, murder will automatically use the first peer as both tracker and seeder. You may redefine the tracker, seeder and peer roles yourself to change these defaults, for instance if you want to set up a dedicated tracker.

All involved servers must have Python installed, along with the related murder support files (BitTornado, etc.). To upload the support files to the tracker, seeder, and peers, run:

cap murder:distribute_files

By default, these will go in shared/murder in your app's deploy directory. Override this by setting the variable remote_murder_path. For convenience, you can add an after hook to run this on deploy:setup:

after 'deploy:setup', 'murder:distribute_files'

Before deploying, you must start the tracker:

cap murder:start_tracker

To have this happen automatically during a deploy, add the following hooks:

before 'murder:start_seeding', 'murder:start_tracker'
after 'murder:stop_seeding', 'murder:stop_tracker'

At this point you should be able to deploy normally:

cap deploy

MANUAL USAGE (murder without a deploy strategy)

Murder can also be used as a general mechanism to distribute files across a generic set of servers. To do so, create a Capfile, require murder, and manually define roles:

require 'rubygems'
require 'murder'

set :remote_murder_path, '/opt/local/murder' # or some other directory

role :peer, 'host1', 'host2', 'host3', 'host4', 'host5', 'host6', 'host7'
role :seeder, 'host1'
role :tracker, 'host1'

To distribute a directory of files, first make sure that murder is set up on all hosts, then manually run the murder cap tasks:

  1. Start the tracker:

     cap murder:start_tracker
    
  2. Create a torrent from a directory of files on the seeder, and start seeding:

     scp -r ./files host1:~/files
     cap murder:create_torrent tag="Deploy1" files_path="~/files"
     cap murder:start_seeding tag="Deploy1"
    
  3. Distribute the torrent to all peers:

     cap murder:peer tag="Deploy1" destination_path="/tmp"
    
  4. Stop the seeder and tracker:

     cap murder:stop_seeding tag="Deploy1"
     cap murder:stop_tracker
    

When this finishes, all peers will have the files in /tmp/Deploy1
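The four steps above can be collected into one ordered command list for a given tag. The sketch below is plain Ruby that only builds the command strings shown in this section; the helper name `manual_transfer_commands` is hypothetical and is not part of murder.

```ruby
# Builds the ordered cap invocations for one manual murder transfer.
# It only formats strings; nothing is executed.
def manual_transfer_commands(tag, files_path, destination_path)
  [
    'cap murder:start_tracker',
    %(cap murder:create_torrent tag="#{tag}" files_path="#{files_path}"),
    %(cap murder:start_seeding tag="#{tag}"),
    %(cap murder:peer tag="#{tag}" destination_path="#{destination_path}"),
    %(cap murder:stop_seeding tag="#{tag}"),
    'cap murder:stop_tracker'
  ]
end

manual_transfer_commands('Deploy1', '~/files', '/tmp').each { |cmd| puts cmd }
```

Keeping the tag in one place avoids the easy mistake of passing different 'tag' values to create_torrent, start_seeding and peer.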

TASK REFERENCE

distribute_files: SCPs a compressed version of all files from ./dist (the Python BitTorrent library and custom scripts) to all servers. The entire directory is sent, regardless of the role of each individual server. The path on the server is specified by remote_murder_path and will be cleared prior to transferring files over.

start_tracker: Starts the BitTorrent tracker (essentially a mini-web-server) listening on port 8998.
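As a small illustration of where clients would report their state, the sketch below combines the port given here with the /announce path described under HOW IT WORKS. The helper name `announce_url` and the host name are assumptions made for the example; the URL murder actually writes into a .torrent may be built differently.

```ruby
# Port 8998 and the /announce path come from this README; the host is made up.
def announce_url(tracker_host, port = 8998)
  "http://#{tracker_host}:#{port}/announce"
end

puts announce_url('tracker1.example.com')
```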

stop_tracker: If the BitTorrent tracker is running, this will kill the process. Note that if it is not running, you will receive an error.

create_torrent: Compresses the directory specified by the passed-in argument 'files_path' and creates a .torrent file identified by the 'tag' argument. Be sure to use the same 'tag' value with any following commands. Any .git directories will be skipped. Once completed, the .torrent will be downloaded to your local /tmp/TAG.tgz.torrent.

download_torrent: Although not necessary to run, if the file from create_torrent was lost, you can redownload it from the seeder using this task. You must specify a valid 'tag' argument.

start_seeding: Will cause the seeder machine to connect to the tracker and start seeding. The IP address returned by the 'host' bash command will be announced to the tracker. The server will not stop seeding until the stop_seeding task is called. You must specify a valid 'tag' argument (which identifies the .torrent in /tmp to use).
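The address lookup can be sketched in plain Ruby. The command lines quoted in the issue reports further down show the pipeline `host ... | awk '/has address/ {print $4}' | head -n 1`; the function below mirrors that filter on a captured string. The helper name `first_address` and the sample output are assumptions for illustration, not murder's code.

```ruby
# Returns the fourth whitespace-separated field of the first line containing
# "has address", mirroring: awk '/has address/ {print $4}' | head -n 1
# Returns nil when no such line exists.
def first_address(host_output)
  host_output.each_line do |line|
    return line.split[3] if line.include?('has address')
  end
  nil
end

# Hypothetical output of the `host` command for a seeder.
sample = "seeder1.example.com has address 10.0.0.104\n" \
         "seeder1.example.com mail is handled by 10 mx.example.com.\n"
puts first_address(sample)
```

If the lookup yields no "has address" line, the result is empty, which is worth checking first when the client complains about its arguments.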

stop_seeding: If the seeder is currently seeding, this will kill the process. Note that if it is not running, you will receive an error. If a peer was downloading from this seed, the peer will find another host to receive any remaining data. You must specify a valid 'tag' argument.

stop_all_seeding: Identical to stop_seeding, except this will kill all seeding processes. No 'tag' argument is needed.

peer: Instructs all the peer servers to connect to the tracker and start downloading and spreading pieces and files amongst themselves. You must specify a valid 'tag' argument. Once the download is complete on a server, that server will fork the download process and seed for 30 seconds while returning control to Capistrano. Cap will then extract the files into the directory given by the 'destination_path' argument, under destination_path/TAG/*. To not create this tag-named directory, pass in the 'no_tag_directory=1' argument. If the directory is not empty, this command will fail. To clean it, pass in the 'unsafe_please_delete=1' argument. The compressed tgz in /tmp is never removed. When this task completes, all files have been transferred and moved into the requested directory.
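The effect of 'no_tag_directory' on the final location can be shown with a short sketch. The helper name `extraction_dir` is hypothetical; it only restates the rule described above and is not taken from murder's source.

```ruby
# Where the extracted files end up: destination_path/TAG by default,
# or destination_path itself when no_tag_directory is requested.
def extraction_dir(destination_path, tag, no_tag_directory = false)
  no_tag_directory ? destination_path : File.join(destination_path, tag)
end

puts extraction_dir('/tmp', 'Deploy1')
puts extraction_dir('/opt/app', 'Deploy1', true)
```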

stop_all_peering: Sometimes peers can go on forever (usually because of an error). This command will forcibly kill all "murder_client.py peer" commands that are running.

CONFIG REFERENCE

Variables

default_tag: A tag name to use by default such that a tag parameter doesn't need to be manually entered on every task. Not recommended to be used since files will be overwritten.

default_seeder_files_path: A path on the seeder's file system where the files to be distributed are stored.

default_destination_path: A path on the peers' file system where the files that were distributed should be decompressed into.

remote_murder_path: A path where murder will look for its support files on each host. cap murder:distribute_files will upload murder support files here.

default_temp_path: The base path to use for temporary files; the default is /tmp. This can be overridden with the temp_path environment variable.
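A minimal sketch of how the temporary path and the per-tag file names fit together, assuming the TAG.tgz and TAG.tgz.torrent naming seen elsewhere in this document. The helper names `temp_path` and `torrent_paths` are hypothetical, and treating an empty temp_path value as unset is an assumption of the sketch.

```ruby
# Resolves the base temp directory: the temp_path environment variable wins,
# otherwise the default is used. An empty value is treated as unset (assumption).
def temp_path(env = ENV, default_temp_path = '/tmp')
  override = env['temp_path']
  (override.nil? || override.empty?) ? default_temp_path : override
end

# File names for a tag, following the TAG.tgz / TAG.tgz.torrent pattern.
def torrent_paths(tag, base)
  tgz = File.join(base, "#{tag}.tgz")
  { tgz: tgz, torrent: "#{tgz}.torrent" }
end

paths = torrent_paths('Deploy1', temp_path)
puts paths[:tgz]
puts paths[:torrent]
```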

Roles

tracker: Host on which to run the BitTorrent tracker

seeder: Host which will be the source of the files to be distributed via BitTorrent

peer: All hosts to which files should be distributed

murder's Issues

Transfer monitoring

Is there any way to see the percentage of a transfer completed?
I understood that this wasn't possible with "pv" because BitTorrent preallocates space on disk.

Source location for disabling DHT, UPnP, Encryption?

In the introduction video, disabling DHT, UPnP, and encryption was mentioned. Could you please point me to the files that do this? I cannot find them in the code; I went through the murder code and also did a diff on the bittornado directory. Thanks.

call failed on task murder:stop_peering

Hello,

I get this error while trying to deploy:

call failed #<Capistrano::CommandError: failed: "sh -c 'pkill -f \"murder_client.py peer.*/tmp/20110927221958.tar.gz.tgz\"'" on server1,server2...>
*** [deploy:update_code] rolling back

I can't figure out why it is failing.

  • I tried modifying the commands to append a "; true"
  • using sudo
  • re-writing it using ps|grep|awk|kill

I ended up ignoring the issue by using the following code:

namespace :murder do

 task :stop_peering, :roles=>:peer do
  logger.info 'override'
 end

 task :stop_seeding, :roles=>:peer do
  logger.info 'override'
 end

 task :stop_tracker, :roles=>:peer do
  logger.info 'override'
 end

end

Then after each deploy, I have to go and kill murder (lol) processes.

ENV:

Thanks,
-Dave

Options for screen cmd

Hi,
I am a little confused about "screen -dms". Below is what I get from man screen on Ubuntu 14.04.

-s program
sets the default shell to the program specified, instead of the value in the environment variable $SHELL (or "/bin/sh" if not defined). This can also be defined through the "shell" .screenrc command.

So I guess what the code wants is "screen -dmS".
-S sessionname
When creating a new session, this option can be used to specify a meaningful name for the session. This name identifies the session for "screen -list" and "screen -r" actions. It substitutes the default [tty.host] suffix.

Please correct me if I'm wrong. If necessary, I can help provide a patch.
Thanks a lot.

Peer is throwing an error after completing the download

Environment:
OS: CentOS 5.5 (Final)
Python 2.4.3


Traceback (most recent call last):
File "/home/rmartinez/murder-deploy/murder/dist/BitTornado/RawServer.py", line 142, in listen_forever
self.sockethandler.handle_events(events)
File "/home/rmartinez/murder-deploy/murder/dist/BitTornado/SocketHandler.py", line 319, in handle_events
s.handler.data_came_in(s, data)
File "/home/rmartinez/murder-deploy/murder/dist/BitTornado/BT1/Encrypter.py", line 190, in data_came_in
x = self.next_func(m)
File "/home/rmartinez/murder-deploy/murder/dist/BitTornado/BT1/Encrypter.py", line 148, in read_message
self.connecter.got_message(self, s)
File "/home/rmartinez/murder-deploy/murder/dist/BitTornado/BT1/Connecter.py", line 285, in got_message
if c.download.got_piece(i, toint(message[5:9]), message[9:]):
File "/home/rmartinez/murder-deploy/murder/dist/BitTornado/BT1/Downloader.py", line 182, in got_piece
self._request_more()
File "/home/rmartinez/murder-deploy/murder/dist/BitTornado/BT1/Downloader.py", line 189, in _request_more
self.fix_download_endgame(new_unchoke)
File "/home/rmartinez/murder-deploy/murder/dist/BitTornado/BT1/Downloader.py", line 256, in fix_download_endgame
del want[self.backlog - len(self.active_requests):]
TypeError: slice indices must be integers or None

Traceback (most recent call last):
File "/home/rmartinez/murder-deploy/murder/dist/BitTornado/RawServer.py", line 142, in listen_forever
self.sockethandler.handle_events(events)
File "/home/rmartinez/murder-deploy/murder/dist/BitTornado/SocketHandler.py", line 319, in handle_events
s.handler.data_came_in(s, data)
File "/home/rmartinez/murder-deploy/murder/dist/BitTornado/BT1/Encrypter.py", line 190, in data_came_in
x = self.next_func(m)
File "/home/rmartinez/murder-deploy/murder/dist/BitTornado/BT1/Encrypter.py", line 148, in read_message
self.connecter.got_message(self, s)
File "/home/rmartinez/murder-deploy/murder/dist/BitTornado/BT1/Connecter.py", line 285, in got_message
if c.download.got_piece(i, toint(message[5:9]), message[9:]):
File "/home/rmartinez/murder-deploy/murder/dist/BitTornado/BT1/Downloader.py", line 182, in got_piece
self._request_more()
File "/home/rmartinez/murder-deploy/murder/dist/BitTornado/BT1/Downloader.py", line 189, in _request_more
self.fix_download_endgame(new_unchoke)
File "/home/rmartinez/murder-deploy/murder/dist/BitTornado/BT1/Downloader.py", line 256, in fix_download_endgame
del want[self.backlog - len(self.active_requests):]
TypeError: slice indices must be integers or None

Traceback (most recent call last):
File "/home/rmartinez/murder-deploy/murder/dist/BitTornado/RawServer.py", line 142, in listen_forever
self.sockethandler.handle_events(events)
File "/home/rmartinez/murder-deploy/murder/dist/BitTornado/SocketHandler.py", line 319, in handle_events
s.handler.data_came_in(s, data)
File "/home/rmartinez/murder-deploy/murder/dist/BitTornado/BT1/Encrypter.py", line 190, in data_came_in
x = self.next_func(m)
File "/home/rmartinez/murder-deploy/murder/dist/BitTornado/BT1/Encrypter.py", line 148, in read_message
self.connecter.got_message(self, s)
File "/home/rmartinez/murder-deploy/murder/dist/BitTornado/BT1/Connecter.py", line 285, in got_message
if c.download.got_piece(i, toint(message[5:9]), message[9:]):
File "/home/rmartinez/murder-deploy/murder/dist/BitTornado/BT1/Downloader.py", line 182, in got_piece
self._request_more()
File "/home/rmartinez/murder-deploy/murder/dist/BitTornado/BT1/Downloader.py", line 189, in _request_more
self.fix_download_endgame(new_unchoke)
File "/home/rmartinez/murder-deploy/murder/dist/BitTornado/BT1/Downloader.py", line 256, in fix_download_endgame
del want[self.backlog - len(self.active_requests):]
TypeError: slice indices must be integers or None

Traceback (most recent call last):
File "/home/rmartinez/murder-deploy/murder/dist/BitTornado/RawServer.py", line 142, in listen_forever
self.sockethandler.handle_events(events)
File "/home/rmartinez/murder-deploy/murder/dist/BitTornado/SocketHandler.py", line 319, in handle_events
s.handler.data_came_in(s, data)
File "/home/rmartinez/murder-deploy/murder/dist/BitTornado/BT1/Encrypter.py", line 195, in data_came_in
self.close()
File "/home/rmartinez/murder-deploy/murder/dist/BitTornado/BT1/Encrypter.py", line 161, in close
self.sever()
File "/home/rmartinez/murder-deploy/murder/dist/BitTornado/BT1/Encrypter.py", line 167, in sever
self.connecter.connection_lost(self)
File "/home/rmartinez/murder-deploy/murder/dist/BitTornado/BT1/Connecter.py", line 206, in connection_lost
c.download.disconnected()
File "/home/rmartinez/murder-deploy/murder/dist/BitTornado/BT1/Downloader.py", line 88, in disconnected
self._letgo()
File "/home/rmartinez/murder-deploy/murder/dist/BitTornado/BT1/Downloader.py", line 101, in _letgo
self.downloader.storage.request_lost(index, begin, length)
File "/home/rmartinez/murder-deploy/murder/dist/BitTornado/BT1/StorageWrapper.py", line 738, in request_lost
assert not (begin, length) in self.inactive_requests[index]
AssertionError

done and done

cap murder:peer errors

When I run "cap murder:peer", the following error occurs:

** [out :: LABO03] /tmp/Deploy0000002.tgz.torrent is not a valid responsefile
** [out :: LABO02] Incorrect number of arguments
** [out :: LABO02]
** [out :: LABO02] Usage:
** [out :: LABO02] python murder_client.py peer/seed out.torrent OUT.OUT 127.0.0.1
** [out :: LABO02]
** [out :: LABO02] The last parameter is the local ip address, normally 10.x.x.x
** [out :: LABO02]
** [out :: LABO01] /tmp/Deploy0000002.tgz.torrent is not a valid responsefile

Any idea?

Regards

Can't start peering (everything before that works fine)

After I enter the peer command, it shows the following error:

** [out :: 172.23.99.7] Incorrect number of arguments
** [out :: 172.23.99.7]
** [out :: 172.23.99.7] Usage:
** [out :: 172.23.99.7] python murder_client.py peer/seed out.torrent OUT.OUT 127.0.0.1
** [out :: 172.23.99.7]
** [out :: 172.23.99.7] The last parameter is the local ip address, normally 10.x.x.x
** [out :: 172.23.99.7]
** [out :: 172.23.99.5] Incorrect number of arguments
** [out :: 172.23.99.5]
** [out :: 172.23.99.5] Usage:
** [out :: 172.23.99.5] python murder_client.py peer/seed out.torrent OUT.OUT 127.0.0.1
** [out :: 172.23.99.5]
** [out :: 172.23.99.5] The last parameter is the local ip address, normally 10.x.x.x
** [out :: 172.23.99.5]
** [out :: 172.23.98.78] Incorrect number of arguments
** [out :: 172.23.98.78]
** [out :: 172.23.98.78] Usage:
** [out :: 172.23.98.78] python murder_client.py peer/seed out.torrent OUT.OUT 127.0.0.1
** [out :: 172.23.98.78]
** [out :: 172.23.98.78] The last parameter is the local ip address, normally 10.x.x.x
** [out :: 172.23.98.78]
command finished
failed: "sh -c 'python /u/apps/example-app/shared/murder/murder_client.py peer '''/tmp/Test_2.tgz.torrent''' '''/tmp/Test_2.tgz''' LC_ALL=C host 172.23.99.7 | awk '\\''/has address/ {print $4}'\\'' | head -n 1'" on 172.23.99.7; "sh -c 'python /u/apps/example-app/shared/murder/murder_client.py peer '''/tmp/Test_2.tgz.torrent''' '''/tmp/Test_2.tgz''' LC_ALL=C host 172.23.98.78 | awk '\\''/has address/ {print $4}'\\'' | head -n 1'" on 172.23.98.78; "sh -c 'python /u/apps/example-app/shared/murder/murder_client.py peer '''/tmp/Test_2.tgz.torrent''' '''/tmp/Test_2.tgz''' LC_ALL=C host 172.23.99.5 | awk '\\''/has address/ {print $4}'\\'' | head -n 1'" on 172.23.99.5

As you can see, it throws an "Incorrect number of arguments" error on every peer.

Big file support?

Hi,

I successfully deployed a 2 MB image file with only the Python scripts.
I am now trying with a 3.9 GB tar.xz; the .torrent creation takes a bit longer.
When I start seeding, I don't get the same "done and done" message that I saw with the small file.
Likewise, when I start the peers, the file is created but du keeps showing 0 bytes.
Any ideas?
Thanks for the help.

Error on start seeding

lrl@Ubuntu4-8LW3:cap murder:start_seeding tag="Deploy1"

  • 2014-01-23 08:57:59 executing `murder:start_seeding'
  • executing "screen -dms 'seeder-Deploy1' python /home/lrl/Documents/murder/dist/murder_client.py seeder '/tmp/Deploy1.tgz.torrent' '/tmp/Deploy1.tgz' LC_ALL=C host $HOSTNAME | awk '/has address/ {print $4}' | head -n 1"
    servers: ["10.0.0.104"]
    Password:
    [10.0.0.104] executing command
    *** [err :: 10.0.0.104] Usage: host [-aCdlriTwv] [-c class] [-N ndots] [-t type] [-W time]
    *** [err :: 10.0.0.104] [-R number] [-m flag] hostname [server]
    *** [err :: 10.0.0.104] -a is equivalent to -v -t ANY
    *** [err :: 10.0.0.104] -c specifies query class for non-IN data
    *** [err :: 10.0.0.104] -C compares SOA records on authoritative nameservers
    *** [err :: 10.0.0.104] -d is equivalent to -v
    *** [err :: 10.0.0.104] -l lists all hosts in a domain, using AXFR
    *** [err :: 10.0.0.104] -i IP6.INT reverse lookups
    *** [err :: 10.0.0.104] -N changes the number of dots allowed before root lookup is done
    *** [err :: 10.0.0.104] -r disables recursive processing
    *** [err :: 10.0.0.104] -R specifies number of retries for UDP packets
    *** [err :: 10.0.0.104] -s a SERVFAIL response should stop query
    *** [err :: 10.0.0.104] -t specifies the query type
    *** [err :: 10.0.0.104] -T enables TCP/IP mode
    *** [err :: 10.0.0.104] -v enables verbose output
    *** [err :: 10.0.0.104] -w specifies to wait forever for a reply
    *** [err :: 10.0.0.104] -W specifies how long to wait for a reply
    *** [err :: 10.0.0.104] -4 use IPv4 query transport only
    *** [err :: 10.0.0.104] -6 use IPv6 query transport only
    *** [err :: 10.0.0.104] -m set memory debugging flag (trace|record|usage)
    command finished in 116ms
