seaweedfs / seaweedfs

SeaweedFS is a fast distributed storage system for blobs, objects, files, and data lake, for billions of files! Blob store has O(1) disk seek, cloud tiering. Filer supports Cloud Drive, cross-DC active-active replication, Kubernetes, POSIX FUSE mount, S3 API, S3 Gateway, Hadoop, WebDAV, encryption, Erasure Coding.

License: Apache License 2.0

Go 88.46% Makefile 0.27% Shell 0.17% Java 10.24% Smarty 0.14% HTML 0.67% Dockerfile 0.02% Lua 0.04%
distributed-storage distributed-systems s3 hdfs fuse distributed-file-system hadoop-hdfs posix tiered-file-system kubernetes replication object-storage s3-storage seaweedfs erasure-coding blob-storage cloud-drive

seaweedfs's Issues

Wrong HTTP verb semantics around file creation

According to the README, the /assign request is a "GET", but performing this "GET" allocates space for a file. HTTP GET requests are supposed to be used only for actions that do not modify state (i.e. they can be cached, and a spider re-requesting them thousands of times a second won't have any effect other than a transient increase in server load).

The /assign request should be a POST, as it creates something. This avoids intermediate components breaking things by making assumptions about the behavior of GET requests (e.g. a caching layer not forwarding the GET request to the server).

(Reading the further API documentation, this also applies to /vol/vacuum, /vol/grow, and /admin/assign_volume. GET is an appropriate verb for /dir/lookup, /cluster/status, /dir/status, and /status.)

Similarly, the requests to upload file content are "POST" requests, but they should be "PUT" requests, as they are replayable to restore state.

Weed "minimal" docker image

I think the weed Docker image built from the given Dockerfile is too large.

update: I made a pull request

Seems like "-log_dir" not parsed

OS: Ubuntu 14.04.1 LTS
Weed version: 0.68 (downloaded from pre-compiled deb package)

Run Command:

/usr/bin/weed -log_dir="/var/log/weed" -alsologtostderr=false -v=4 master -mdir="/tmp" -port=9333 -volumeSizeLimitMB=30000 -ip="10.130.197.44"

Viewing Log Command:

root@static:/var/log# tail /var/log/weed
tail: cannot open ‘/var/log/weed’ for reading: No such file or directory

Are the export command-line flags inconsistent with the implementation?

//export.go

exportVolumeId   = cmdExport.Flag.Int("volumeId", -1, "a volume id. The volume should already exist in the dir. The volume index file should not exist.")

The flag description says: "The volume index file should not exist"

indexFile, err := os.OpenFile(path.Join(*exportVolumePath, fileName+".idx"), os.O_RDONLY, 0644)

But here the index file is opened with the O_RDONLY flag, which only works if the file already exists.

Is the flag description a mistake?

Reading Multi part error

Getting tons of these errors, unable to upload. Any ideas? Restarted the volume and things are working now.

I0115 15:02:57 00001 needle.go:65] Reading Multi part [ERROR] EOF

(Edit: restarting didn't fix things)

Can I hide port info from volume publicUrl?

We would like to proxy requests through nginx, so even though the volume server accepts requests on 8080, our clients will use port 80. I was able to change the volume publicUrl with the -publicIp= option, but it still returns the same port (8080). Is it possible to hide the port in publicUrl?

-filer.redirectOnRead should imply -filer

If I start a 'weed server -filer.redirectOnRead' without specifying -filer, it gives a puzzling error message because the filer does not start. If the user sets -filer.redirectOnRead, make it start the filer too.

Docker build not working

Seems like it can't opkg-install curl

$ docker build -t weedfs .
Sending build context to Docker daemon 50.78 MB
Sending build context to Docker daemon 
Step 0 : FROM progrium/busybox
Pulling repository progrium/busybox
6f114d3139e3: Download complete 
511136ea3c5a: Download complete 
e2fb46397934: Download complete 
015fb409be0d: Download complete 
21082221cb6e: Download complete 
bbd692fe2ca1: Download complete 
8db8f013bfca: Download complete 
7fff0c6f0b8d: Download complete 
7acf13620725: Download complete 
Status: Downloaded newer image for progrium/busybox:latest
 ---> 6f114d3139e3
Step 1 : WORKDIR /opt/weed
 ---> Running in 028351e07e71
 ---> d1d57d22ef4d
Removing intermediate container 028351e07e71
Step 2 : RUN opkg-install curl
 ---> Running in 2fd9079c8d42
wget: bad address 'downloads.openwrt.org'
wget: bad address 'downloads.openwrt.org'
Downloading http://downloads.openwrt.org/snapshots/trunk/x86_64/packages/base/Packages.gz.
Downloading http://downloads.openwrt.org/snapshots/trunk/x86_64/packages/packages/Packages.gz.
Collected errors:
 * opkg_download: Failed to download http://downloads.openwrt.org/snapshots/trunk/x86_64/packages/base/Packages.gz, wget returned 1.
 * opkg_download: Failed to download http://downloads.openwrt.org/snapshots/trunk/x86_64/packages/packages/Packages.gz, wget returned 1.
Unknown package 'curl'.
Collected errors:
 * opkg_install_cmd: Cannot install package curl.
 ---> 958aa12283bc
Removing intermediate container 2fd9079c8d42
Step 3 : RUN echo insecure >> ~/.curlrc
 ---> Running in 576c9f7a23d3
 ---> c39d8579e570
Removing intermediate container 576c9f7a23d3
Step 4 : RUN curl -Lks https://bintray.com$(curl -Lk http://bintray.com/chrislusf/Weed-FS/seaweed/_latestVersion | grep linux_amd64.tar.gz | sed -n "/href/ s/.*href=['\"]\([^'\"]*\)['\"].*/\1/gp") | gunzip | tar -xf - -C /opt/weed/ &&   mv weed_* bin &&   chmod +x ./bin/weed
 ---> Running in 5d84f61c0e11
/bin/sh: curl: not found
/bin/sh: curl: not found
gunzip: invalid magic
tar: short read
INFO[0095] The command [/bin/sh -c curl -Lks https://bintray.com$(curl -Lk http://bintray.com/chrislusf/Weed-FS/seaweed/_latestVersion | grep linux_amd64.tar.gz | sed -n "/href/ s/.*href=['\"]\([^'\"]*\)['\"].*/\1/gp") | gunzip | tar -xf - -C /opt/weed/ &&   mv weed_* bin &&   chmod +x ./bin/weed] returned a non-zero code: 1 
$ docker --version
Docker version 1.4.1, build 5bc2ff8

Missing build instructions for Mac OSX 10.9.4

I can't find the precompiled binary release for my OS platform (Mac OSX 10.9.4) on your linked page.

When I git clone this repo, I find a bunch of *.go files in some subfolders. Is there a Makefile to compile all of these into a weed binary myself?

Stability of master servers

I'm using weed-fs in production now. Sometimes the master server randomly responds with a blank line or a "no more volumes" error (even when /dir/status shows the correct topology).
I used 3 master servers, and this actually decreased the overall stability of the system: sometimes the master servers lost sync with each other or were unable to promote a leader, and the only way to fix it was to remove all master server files, shut down all master servers and data nodes, and then start the master servers and nodes again.

I'm not quite sure how to investigate this problem, because reproducibility is low: I have no idea how to force the master server(s) into the unavailable state.

Maybe some hardcore unit/integration tests would help?
This issue prevents me from using weed-fs in production for my new projects, and I'm currently migrating to cloud storage, while leaving open the possibility of migrating back.

image id format change?

It seems like image ids have dramatically increased in size (I'm assuming since I've updated versions)... is this expected?

old style: 4,92569a439e91
new style: 4,fee4fdc8838a1ec4f305f7fe
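
For comparing the two styles, here is a hedged Go sketch that splits a file id into its parts. It assumes the layout as I understand it from the project documentation: a decimal volume id, a comma, then a hex string whose last 8 digits are a 32-bit cookie and whose leading digits are the file key (up to 64 bits). `fileID` and `parseFileID` are hypothetical names, not the weed API, and the sketch says nothing about why the key became longer.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// fileID holds the three parts of an id such as "4,92569a439e91".
type fileID struct {
	Volume uint32
	Key    uint64
	Cookie uint32
}

// parseFileID assumes "<volume>,<key hex><8 hex digits of cookie>".
func parseFileID(fid string) (fileID, error) {
	parts := strings.SplitN(fid, ",", 2)
	if len(parts) != 2 {
		return fileID{}, fmt.Errorf("missing comma in %q", fid)
	}
	rest := parts[1]
	if len(rest) <= 8 || len(rest) > 24 {
		return fileID{}, fmt.Errorf("unexpected key/cookie length in %q", fid)
	}
	vol, err := strconv.ParseUint(parts[0], 10, 32)
	if err != nil {
		return fileID{}, fmt.Errorf("volume id: %w", err)
	}
	key, err := strconv.ParseUint(rest[:len(rest)-8], 16, 64)
	if err != nil {
		return fileID{}, fmt.Errorf("file key: %w", err)
	}
	cookie, err := strconv.ParseUint(rest[len(rest)-8:], 16, 32)
	if err != nil {
		return fileID{}, fmt.Errorf("cookie: %w", err)
	}
	return fileID{Volume: uint32(vol), Key: key, Cookie: uint32(cookie)}, nil
}

func main() {
	for _, s := range []string{"4,92569a439e91", "4,fee4fdc8838a1ec4f305f7fe"} {
		id, err := parseFileID(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> volume=%d key=%x cookie=%x\n", s, id.Volume, id.Key, id.Cookie)
	}
}
```

Under that assumption, both ids carry an 8-digit cookie, and the difference in length comes entirely from the key portion (4 hex digits in the old id, 16 in the new one).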

I think weed-fs needs a web console

It would let ops see weed-fs runtime information and run some helpful commands.

For example, yesterday I ran into a problem where two master servers each thought it was the leader,
so clients could not upload any files (via /submit).
With a web console, anyone could easily look up the cluster information at any time to track down the problem.

In the end, we restarted the two master servers to solve this problem.

what about this logo?

(image: proposed logo)

My girlfriend made it.
Maybe it can be used as this project's logo if you like it?

Is replication in weed strongly consistent?

I looked at the code in ReplicatedWrite (store_replicate.go).
When a file is uploaded, ReplicatedWrite first replicates it (to another volume server) and returns the result only once the replication is done, so I think replication in weed is strongly consistent. Is that right?
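
The behavior described in the question can be sketched in Go as a synchronous write that succeeds only if every copy was written. This is an illustration of the pattern, not the actual ReplicatedWrite code: the `replica` interface, `memReplica`, and `replicatedWrite` are hypothetical, and the sketch does not roll back copies that were already written when a later one fails.

```go
package main

import (
	"errors"
	"fmt"
)

// replica is a hypothetical stand-in for a volume server holding one copy.
type replica interface {
	Write(key string, data []byte) error
}

// memReplica keeps data in memory and can be told to fail, for demonstration.
type memReplica struct {
	data map[string][]byte
	fail bool
}

func newMemReplica(fail bool) *memReplica {
	return &memReplica{data: map[string][]byte{}, fail: fail}
}

func (m *memReplica) Write(key string, data []byte) error {
	if m.fail {
		return errors.New("replica unavailable")
	}
	m.data[key] = append([]byte(nil), data...)
	return nil
}

// replicatedWrite writes locally, then to each replica in turn, and reports
// success to the caller only after all writes have succeeded.
func replicatedWrite(local replica, others []replica, key string, data []byte) error {
	if err := local.Write(key, data); err != nil {
		return fmt.Errorf("local write: %w", err)
	}
	for i, r := range others {
		if err := r.Write(key, data); err != nil {
			return fmt.Errorf("replica %d: %w", i, err)
		}
	}
	return nil
}

func main() {
	local, good, bad := newMemReplica(false), newMemReplica(false), newMemReplica(true)
	fmt.Println(replicatedWrite(local, []replica{good}, "3,01", []byte("a")))
	fmt.Println(replicatedWrite(local, []replica{bad}, "3,02", []byte("b")))
}
```

In this pattern a client never receives a success response for data that is missing from a replica, which is the property the question is asking about.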

Restoring from idx and dat only?

I'm migrating volumes from one machine to another.

It seems like volumes start up fine with only idx and dat files. (I see the volumes being loaded in the logs.)

When the volumes are connected to a fresh master and I do /dir/assign, the master grows new volumes instead of allocating into the existing volumes. Is this expected behavior?

How to back up data?

How can I back up weed data? Suppose I can't stop writes. Can you explain it and add it to the docs?

File Listing or Finding

Is there a way to find or list the files? Is there a simple plan so I can code it myself? I do like the simplicity of this.

single point of failure

3 master servers: mA mB mC
2 volume servers: vA vB

mA is the leader of the (mA,mB,mC) cluster,
and vA and vB are connected to mA (replication is "001").

If mA goes down, mB becomes the leader of (mB,mC), as I expected.
But why do vA and vB still look up mA when writing replicas?
When will vA and vB realize they should look up mB instead?

In this case, when mA is down and I submit a file to mB,
it always returns e.g. {"error":"Failed to write to replicas for volume 3"}

How can this SPOF problem be solved?

Backups / Migrations

Hi,

I'd like to migrate data from a running weed-fs to another one (from a staging server to a production one for instance), or even just make a backup of the files (that I could re-import if needed, not just a big dump)... is there some export/import/migration possibility in place ?

Thanks!

content type from volume post upload always text/plain?

I'm assuming the return content-type should be application/json?

$ curl -v -F "file=@/tmp/test.json;type=application/json" localhost:8080/7,07a27ee211
> POST /7,07a27ee211 HTTP/1.1
> User-Agent: curl/7.30.0
> Host: localhost:8080
> Accept: */*
> Content-Length: 223
> Expect: 100-continue
> Content-Type: multipart/form-data; boundary=----------------------------2f1cf300fd2b
> 
< HTTP/1.1 100 Continue
< HTTP/1.1 201 Created
< Date: Mon, 29 Sep 2014 20:08:45 GMT
< Content-Length: 30
< Content-Type: text/plain; charset=utf-8
< 
* Connection #0 to host localhost left intact
{"name":"test.json","size":74}

Volume/ Master not linking

I have the following server configuration in DigitalOcean:

  • 2x Ubuntu 14.04.1 LTS
  • Local network IP enabled
  • Weedfs installed from "weed_0.68_amd64.deb" package

Server 1 (Master) IP: 10.130.197.44
Server 2 (Volume) IP: 10.130.161.166

Server 1 run command:

root@static:~# weed master -mdir="/tmp" -port=9333 -volumeSizeLimitMB=30000 -ip="10.130.197.44" -whiteList="127.0.0.1,10.130.161.166"
I0121 21:30:10 05111 file_util.go:20] Folder /tmp Permission: -rwxrwxrwx
I0121 21:30:10 05111 topology.go:86] Using default configurations.
I0121 21:30:10 05111 master_server.go:58] Volume Size Limit is 30000 MB
I0121 21:30:10 05111 master.go:70] Start Seaweed Master 0.68 at 0.0.0.0:9333
I0121 21:30:10 05111 raft_server.go:103] Old conf,log,snapshot should have been removed.

Server 2 run command:

root@vol1:~# weed volume -dir="/var/datastore" -ip="10.130.161.166" -max=7 -port=8080 -mserver="10.130.197.44:9333"
I0121 21:31:23 01110 file_util.go:20] Folder /var/datastore Permission: -rwxr-xr-x
I0121 21:31:23 01110 store.go:232] Store started on dir: /var/datastore with 0 volumes max 1
I0121 21:31:23 01110 volume.go:94] Start Seaweed volume server 0.68 at 0.0.0.0:8080

I've been waiting for 10 minutes and both consoles remain the same; they seem to be stuck.

Add document into github

Please add/copy the documentation from Google Code to the GitHub page or wiki, so we no longer need to go look at Google Code.

Thank you

Thanks for adding this to Github, even if it's just a mirror. I suggest moving over development here as it will help you gain help and possibly start a community.

I plan to give a helping hand at some point in the future, if time allows, and play with it on our servers. There's still a lot of work to be done to catch up with the big guys (gluster, ceph, etc). I really dig the motivation of the project, though (simple, no bullshit, key-value file storage). I hope you're successful in finding a wider user base. 👍

How to migrate volume from server `A` to `B`?

Suppose we've got 3 volume servers (A, B, C) with 001 replication type. We want to remove volume server C, so we have to migrate all volumes located on volume server C to A and B.

Is it possible to migrate volume with id X from server A to server B?

How can I append file

I have already uploaded a file and want to append to its content. How can I do that?

Fix travis-ci badge

You will need to sign up at
https://travis-ci.org/
with your GitHub account.

Go to your profile.
Sync your repos.
Switch "on" chrislusf/weed-fs.

If you have questions or difficulties, feel free to ask.

P.S. The build is currently failing due to a compile-time error.

Add a tool to change a volume's replication setting.

Hello. We are going to try weed-fs in our pretty big project to store 3m+ images. There are a few questions about replication that are not clear from the docs:

  1. Weed replicates volumes. Is that correct?
  2. Is it possible to change the replication type for volume x? Suppose we added one more server.
  3. Is it possible to control volume location? We've got 3 servers and 001 replication type. Volume 1 is located on servers A and B. How can I move it from B to C?

expose http GET only

Is there currently a way to expose the GET resource separately from the other GETs & POSTs?

ie:

I would like to (publicly) provide access to

GET http://server1.example.com:8080/3,238ceaa9a7

but not to

POST http://server1.example.com:8080/3,238ceaa9a7

nor

GET http://server1.example.com:8080/status (etc...)

Right now my alternatives seem to be:

  • iptable string matching (which doesn't seem very fun)
  • proxy server (extra hop + defeats the purpose of fully concurrent reads)

I think a reasonable solution would be to provide some alternative ip:port binding for just the resource GETs.

Public port & url

Use case:
I have two weed servers:

s1.hostname.com:8080
s2.hostname.com:8080

I can place them behind a reverse proxy (nginx), and they will be accessible as

hostname.com/media/s1/:80
hostname.com/media/s2/:80

or even

hostname.com/media:80

or even

hostname.com/media:443

Which is

http://hostname.com/media
https://hostname.com/media

This is for the case where 100 replication is enabled on the master server.

Right now I can achieve this only with manual rewrite rules for s1 and s2.

It could look like

weed volume -port 8080 -publicPort 80 -publicPort 443 -publicURL "hostname.com/media"

Or something like this.

redirectOnRead goes to localhost

I've got it running as a server:

./weed server -filer.redirectOnRead -filer

Here's the curl. Note the hostname in the Location header:

bash-3.2$ curl -vI http://weed1:8888/path/to/sources/README.md > /dev/null

> HEAD /path/to/sources/README.md HTTP/1.1
> User-Agent: curl/7.30.0
> Host: weed1:8888
> Accept: */*
>
< HTTP/1.1 302 Found
< Location: http://localhost:8080/5,01afbd8c36
< Date: Sat, 20 Dec 2014 17:15:34 GMT
< Content-Type: text/plain; charset=utf-8
<
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
* Connection #0 to host weed1 left intact
bash-3.2$

go fmt & golint

I think it would be awesome to address some golint warnings/suggestions and run go fmt on all the code.

benchmark not working properly

On my Mac, I ran ./weed benchmark -server=localhost:9333

and got the following. It seems stuck at "Completed 1048181 of 1048576 requests, 100.0% 0.0/s 0.0MB/s" and does not exit.

Completed 964638 of 1048576 requests, 92.0% 5197.2/s 5.2MB/s
Completed 970089 of 1048576 requests, 92.5% 5452.4/s 5.5MB/s
Completed 976989 of 1048576 requests, 93.2% 6889.6/s 6.9MB/s
Completed 984025 of 1048576 requests, 93.8% 7034.8/s 7.1MB/s
Completed 991143 of 1048576 requests, 94.5% 7118.4/s 7.2MB/s
Completed 998137 of 1048576 requests, 95.2% 6979.6/s 7.0MB/s
Completed 1005152 of 1048576 requests, 95.9% 7023.3/s 7.1MB/s
Completed 1012222 of 1048576 requests, 96.5% 7069.1/s 7.1MB/s
Completed 1019163 of 1048576 requests, 97.2% 6938.4/s 7.0MB/s
Completed 1026164 of 1048576 requests, 97.9% 6989.8/s 7.0MB/s
Completed 1033175 of 1048576 requests, 98.5% 7026.3/s 7.1MB/s
Completed 1039418 of 1048576 requests, 99.1% 6240.1/s 6.3MB/s
Completed 1046381 of 1048576 requests, 99.8% 6942.4/s 7.0MB/s
Completed 1048181 of 1048576 requests, 100.0% 1803.5/s 1.8MB/s
Completed 1048181 of 1048576 requests, 100.0% 0.0/s 0.0MB/s
Completed 1048181 of 1048576 requests, 100.0% 0.0/s 0.0MB/s
Completed 1048181 of 1048576 requests, 100.0% 0.0/s 0.0MB/s
Completed 1048181 of 1048576 requests, 100.0% 0.0/s 0.0MB/s
Completed 1048181 of 1048576 requests, 100.0% 0.0/s 0.0MB/s
Completed 1048181 of 1048576 requests, 100.0% 0.0/s 0.0MB/s
Completed 1048181 of 1048576 requests, 100.0% 0.0/s 0.0MB/s
Completed 1048181 of 1048576 requests, 100.0% 0.0/s 0.0MB/s
Completed 1048181 of 1048576 requests, 100.0% 0.0/s 0.0MB/s
Completed 1048181 of 1048576 requests, 100.0% 0.0/s 0.0MB/s
Completed 1048181 of 1048576 requests, 100.0% 0.0/s 0.0MB/s
Completed 1048181 of 1048576 requests, 100.0% 0.0/s 0.0MB/s
Completed 1048181 of 1048576 requests, 100.0% 0.0/s 0.0MB/s
Completed 1048181 of 1048576 requests, 100.0% 0.0/s 0.0MB/s
Completed 1048181 of 1048576 requests, 100.0% 0.0/s 0.0MB/s
Completed 1048181 of 1048576 requests, 100.0% 0.0/s 0.0MB/s
Completed 1048181 of 1048576 requests, 100.0% 0.0/s 0.0MB/s
Completed 1048181 of 1048576 requests, 100.0% 0.0/s 0.0MB/s
Completed 1048181 of 1048576 requests, 100.0% 0.0/s 0.0MB/s
Completed 1048181 of 1048576 requests, 100.0% 0.0/s 0.0MB/s
Completed 1048181 of 1048576 requests, 100.0% 0.0/s 0.0MB/s
Completed 1048181 of 1048576 requests, 100.0% 0.0/s 0.0MB/s
Completed 1048181 of 1048576 requests, 100.0% 0.0/s 0.0MB/s
Completed 1048181 of 1048576 requests, 100.0% 0.0/s 0.0MB/s
Completed 1048181 of 1048576 requests, 100.0% 0.0/s 0.0MB/s
Completed 1048181 of 1048576 requests, 100.0% 0.0/s 0.0MB/s
Completed 1048181 of 1048576 requests, 100.0% 0.0/s 0.0MB/s
Completed 1048181 of 1048576 requests, 100.0% 0.0/s 0.0MB/s

weed listens on 0.0.0.0 regardless of the -ip= setting

Is this a bug? Found in versions 0.67-0.68.

[root@backgroup ~]# ps aux | grep weed
root 15526 0.0 0.0 103252 840 pts/0 R+ 17:23 0:00 grep weed
weedfs 32554 0.0 0.0 777940 24076 ? Sl 2014 21:58 /usr/local/weed/weed master -mdir=/opt/weedfs -ip=10.168.241.178
[root@backgroup ~]# netstat -tnlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:6379 0.0.0.0:* LISTEN 11159/redis-server
tcp 0 0 0.0.0.0:11211 0.0.0.0:* LISTEN 4755/memcached
tcp 0 0 0.0.0.0:4369 0.0.0.0:* LISTEN 4621/epmd
tcp 0 0 0.0.0.0:9333 0.0.0.0:* LISTEN 32554/weed

Volume server keeps wanting to talk only to localhost:9333

I am trying to get volume servers to talk to a master, but no matter what I pass on the command line for mserver, it gets stuck trying to connect to localhost:

Starting volume server(on IP 10.150.14.80):
/usr/local/bin/weed/weed -log_dir=/var/log/weed -v=4 -alsologtostderr=true volume -dir=/data -mserver="10.150.14.89:9333"

Starting master server(ON IP 10.150.14.89)
/usr/local/bin/weed/weed -v=4 master -mdir=/data

The log on the volume server keeps saying:
I1106 22:00:14 04343 file_util.go:20] Folder /data Permission: -rwxr-xr-x
I1106 22:00:14 04343 store.go:220] Store started on dir: /data with 0 volumes max 7
I1106 22:00:14 04343 volume.go:91] Start Seaweed volume server 0.65 at 0.0.0.0:8080
I1106 22:00:14 04343 list_masters.go:18] list masters result :{"IsLeader":true,"Leader":"localhost:9333"}
I1106 22:00:14 04343 store.go:57] current master node is :localhost:9333
I1106 22:00:14 04343 volume_server.go:65] Volume Server Failed to talk with master: Post http://localhost:9333/dir/join: dial tcp 127.0.0.1:9333: connection refused

Any ideas ?

ReadTheDocs documentation

Move docs to RTD and extend them with docker examples

(This issue is for me, if you want - add me to collaborators and assign it to me)

get content type error

Hi,

I started using weed-fs a year ago, with an old version (0.45) of weed.
When I posted an image to weedfs and then fetched it in a browser, it displayed correctly.

curl -F file=@/home/chris/myphoto.jpg http://localhost:9333/submit

With weed 0.67 I do the same post, but the same image no longer displays correctly in the browser. Instead of displaying it, Chrome downloads it.

What changed between these weed versions? How can I configure the headers returned on GET?

Please help.

-filer.dir perms

I'm trying to start the server without the filer by adding "-filer=false", but it still checks the -filer.dir permissions. Is that a logic issue?

 Check Mapping Meta Folder (-filer.dir="") Writable: stat : no such file or directory
goroutine 16 [running]:
