
swift-storlets's People

Contributors

cdoron, gilv, hroumani, jacquesperrault, ymoatti


swift-storlets's Issues

Some loglines lost with the introduction of host logging for docker container

Aug 5 14:30:28 UNKNOWN_LOCALHOST [main] StorletDaemon_com.ibm.storlet.test.test1 TRACE com.ibm.storlet.test.test1: SBus listen() returned
Aug 5 14:30:28 UNKNOWN_LOCALHOST [main] StorletDaemon_com.ibm.storlet.test.test1 TRACE com.ibm.storlet.test.test1: Calling receive
Aug 5 14:30:28 UNKNOWN_LOCALHOST [main] StorletDaemon_com.ibm.storlet.test.test1 TRACE com.ibm.storlet.test.test1: Receive returned
Aug 5 14:30:28 UNKNOWN_LOCALHOST [main] StorletDaemon_com.ibm.storlet.test.test1 TRACE com.ibm.storlet.test.test1: Calling createStorletTask with com.ibm.storlet.sbus.SBusDatagram@2c9596e5
Aug 5 14:30:28 UNKNOWN_LOCALHOST [main] StorletDaemon_com.ibm.storlet.test.test1 TRACE createStorletTask: received EXECUTE command
Aug 5 14:30:28 UNKNOWN_LOCALHOST [main] StorletDaemon_com.ibm.storlet.test.test1 TRACE StorletTask: Got 5 fds
Aug 5 14:30:28 UNKNOWN_LOCALHOST [main] StorletDaemon_com.ibm.storlet.test.test1 TRACE createStorletTask: fd 0 is of type SBUS_FD_INPUT_OBJECT

The storlet handler in the pipeline seems to cause problems with the replicators

Before installing Storlets, I looked at the logs of the storage nodes to check for these errors, and they didn't appear (the replicator works fine without errors). Then, after installing Storlets, they appear again, and the more files in Swift, the more "invalid paths" errors come from the storlet handler.

Now I have run a simple test to verify this: on one storage node, I modified object-server.conf to delete storlet-handler from the pipeline and restarted Swift with "swift-init all restart"; the "invalid paths" errors from the storlet handler disappear.
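For illustration only, the change amounts to removing the storlet filter from the object-server pipeline. A sketch of the pipeline line before and after (the filter name follows the storlets sample configs; your conf may differ):

    [pipeline:main]
    # before, with storlets installed:
    # pipeline = storlet_handler object-server
    # after removing the storlet handler:
    pipeline = object-server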

Ubuntu 15.04 support

The ansible script relied on the upstart init service manager, but upstart was removed completely in Ubuntu 15.04, at least by default. This process is responsible for initializing services, in our case docker. There is a new standard replacement (which was around earlier), namely systemd, the new init system. I found various articles describing how to switch back to upstart on 15.04, but decided just to bail out for now and try the install on 14.04; I can come back to this later:
http://www.pcworld.com/article/2895517/ubuntu-just-switched-to-systemd-the-project-sparking-controversy-throughout-linux.html
http://linuxg.net/replace-systemd-with-upstart-on-ubuntu-15-04-or-an-official-flavor/

Add support for /etc/swift/proxy-server/proxy-server.conf.d 'format'

This will cause issues down the road, due to the generated ... it will only include the ONE proxy file you chose to include (unlike the object conf files, we don't support multiple proxy files), and that will lead to an error like:
Jul 10 05:37:21 saio-roumani object-6040: No section 'proxy-server' (prefixed by 'app' or 'application' or 'composite' or 'composit' or 'pipeline' or 'filter-app') found in config /etc/swift/storlet-proxy-server.conf

This happens when we try to use the internal client to fetch the storlet and dependencies.

I suggest documenting this, and maybe saying that to fix it you need to manually merge all the conf.d files into the storlet proxy conf afterwards; a sketch of such a merge is below.
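For reference, a minimal merge sketch (Python 2 to match the codebase; the paths are illustrative, and a real merge would need to respect paste-deploy section semantics):

    import glob
    from ConfigParser import ConfigParser

    # Read every snippet in the conf.d directory; later files override
    # earlier ones, section by section.
    merged = ConfigParser()
    for snippet in sorted(glob.glob(
            '/etc/swift/proxy-server/proxy-server.conf.d/*.conf')):
        merged.read(snippet)
    with open('/etc/swift/storlet-proxy-server.conf', 'w') as out:
        merged.write(out)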

503 Service Unavailable in storlets.

Hi everyone, I have installed Storlets using the s2aio method from the latest commit, and now I am getting a 503 Service Unavailable error when retrieving an object using a storlet jar. I can get the object without the storlet.
http://paste.openstack.org/show/497005/ shows what I did: I have just uploaded the Identity storlet with its dependency and source.txt into the myobjects container.
The username and password for the Swift user are the defaults from s2aio, i.e. swift and passw0rd, under tenant service and region 1.
Any help would be appreciated.

P.S. I used the https://github.com/openstack/storlets/ repo and I couldn't find any way to report an issue there.

The mount points and drives are as follows.

root@swift:~/storlets/StorletSamples/IdentityStorlet/bin# lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0    30G  0 disk 
├─sda1   8:1    0    28G  0 part /
├─sda2   8:2    0     1K  0 part 
└─sda5   8:5    0     2G  0 part [SWAP]
sr0     11:0    1  59.5M  0 rom  
loop0    7:0    0   9.3G  0 loop /srv/node/loop0

root@swift:~/storlets/StorletSamples/IdentityStorlet/bin# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        28G  7.9G   19G  31% /
none            4.0K     0  4.0K   0% /sys/fs/cgroup
udev            987M  8.0K  987M   1% /dev
tmpfs           200M  1.6M  198M   1% /run
none            5.0M     0  5.0M   0% /run/lock
none            997M  164K  997M   1% /run/shm
none            100M   84K  100M   1% /run/user
/dev/loop0      9.4G  849M  8.5G   9% /srv/node/loop0

root@swift:~/storlets/StorletSamples/IdentityStorlet/bin# blkid
/dev/loop0: UUID="5d0e5a2f-d935-478c-a86e-5bcc71a7a150" TYPE="xfs" 
/dev/sda1: UUID="dc55150c-c2f6-4309-b991-371a4e15afaa" TYPE="ext4" 
/dev/sda5: UUID="98881fcb-9625-4ddc-b7e4-cbe5222bef45" TYPE="swap" 

Generate log seems to be broken

We need to rethink the feature; at any rate, it currently seems broken.
Hamdi ran into the following issue in storlet_docker_gateway:

    def _upload_storlet_logs(self, slog_path):
        if (config_true_value(self.idata['generate_log'])):
            logfile = open(slog_path, 'r')

I ran into an issue (namely, an exception is thrown) when this path was a directory:
/home/docker_device/logs/scopes/b2c9f064ec324/com.ibm.storlet.test.test1/

As opposed to the file:
/home/docker_device/logs/scopes/b2c9f064ec324/com.ibm.storlet.test.test1/storlet_invoke.log
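A minimal sketch of a possible guard (config_true_value comes from swift.common.utils, as in the snippet above), assuming the log file inside the scope directory is always named storlet_invoke.log as in the example:

    import os

    def _upload_storlet_logs(self, slog_path):
        if config_true_value(self.idata['generate_log']):
            # slog_path may point at the scope directory rather than the
            # log file itself; append the file name before opening.
            if os.path.isdir(slog_path):
                slog_path = os.path.join(slog_path, 'storlet_invoke.log')
            logfile = open(slog_path, 'r')
            # ... rest of the upload logic unchanged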

Storlets do not work on recent versions of swift

Storlets no longer work on recent versions of swift (the last working version is 2.3.0). As a quick summary, storlets send a content length of 0 (as they cannot tell what the storlet will do with the output stream), but swift does not like this.

Details:
So I dug into this a little more just to see the code details, and when exactly this was changed:

This takes place while we're trying to build an iterator over the response:

... GETorHEAD_base -> get_working_response -> _make_app_iter

    parts_iter = self._get_response_parts_iter(req, node, source)

    def add_content_type(response_part):
        response_part["content_type"] = \
            HeaderKeyDict(response_part["headers"]).get("Content-Type")
        return response_part


    return document_iters_to_http_response_body(
        (add_content_type(pi) for pi in parts_iter),            
        boundary, is_multipart, self.app.logger)

So in the return you can see we call document_iters_to_http_response_body; the path in question is the one for non-multipart requests (things are handled differently for multipart):

document_iters_to_http_response_body:
    if multipart:
        return document_iters_to_multipart_byteranges(ranges_iter, boundary)
    else:
        try:
            response_body_iter = next(ranges_iter)['part_iter']   <<<<<<<<

Calling 'next' on the iterator above is where we run into trouble. We created the iterator earlier through parts_iter = self._get_response_parts_iter(req, node, source), which triggers this code:

_get_response_parts_iter:
    # This is safe; it sets up a generator but does not call next()
    # on it, so no IO is performed.
    parts_iter = [
        http_response_to_document_iters(
            source[0], read_chunk_size=self.app.object_chunk_size)]

Finally, the call to http_response_to_document_iters uses the content length (for non-multipart requests) to create the iterator:

http_response_to_document_iters:
    if response.status == 200:
        # Single "range" that's the whole object
        content_length = int(response.getheader('Content-Length'))
        return iter([(0, content_length - 1, content_length,
                      response.getheaders(), response)])
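To see why the content length of 0 sent by storlets breaks this path, note what the single "range" tuple degenerates to (a standalone illustration, not Swift code):

    def single_range(content_length):
        # Mirrors the (first_byte, last_byte, length) tuple built above.
        return (0, content_length - 1, content_length)

    print(single_range(1024))  # (0, 1023, 1024) -- a normal object
    print(single_range(0))     # (0, -1, 0) -- an empty, invalid range,
                               # though the storlet will stream real bytes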

The code in 2.3.0 is quite different: creating the iterator (i.e. _make_app_iter) doesn't call http_response_to_document_iters at all; rather, all the logic existed directly in that function, and it seems they created a generator using yield and 'read'.

I've tracked down the change to:
openstack/swift@4f2ed8b
https://review.openstack.org/#/c/173497/

EC: support multiple ranges for GET requests
GetOrHeadHandler got a base class extracted from it that treats an
HTTP response as a sequence of byte-range responses. This way, it can
continue to yield whole fragments, not just N-byte pieces of the raw
HTTP response, since an N-byte piece of a multipart/byteranges
response is pretty much useless.
...

Also, the MIME response for replicated objects got tightened up a
little. Before, it had some leading and trailing CRLFs which, while
allowed by RFC 7233, provide no benefit. Now, both replicated and EC
multipart/byteranges avoid extraneous bytes. This let me re-use the
Content-Length calculation in swob instead of having to either hack
around it or add extraneous whitespace to match.

Improve the download of dependencies

I found that, for whatever reason, I couldn't download from this site (http://www.slf4j.org/dist/) at various times, not just through the script but even from my browser.

  • Probably not much we can do about that, but we could improve the logic that decides when we have to download: we pull every time the ansible script is run, even if we already have the final JAR files (i.e. previously downloaded and copied). This is pretty minor; a sketch of the check appears below.
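A minimal sketch of such a check (Python 2 to match the codebase; fetch_if_missing is a hypothetical helper, and Ansible's get_url module already behaves this way by default when force is not set):

    import os
    import urllib

    def fetch_if_missing(url, dest):
        # Reuse a previously downloaded copy instead of pulling the
        # JAR on every run of the ansible script.
        if os.path.exists(dest):
            return dest
        urllib.urlretrieve(url, dest)
        return dest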

egg-info suffix differs between setuptools versions.

If you use setuptools 3.3 to compile this package there is no problem, but with a newer version (in my case setuptools v15), setuptools changes the name of the egg-info dir, so I had to change the line:
RUN ["chmod", "-R" ,"0755", "/usr/local/lib/python2.7/dist-packages/SBusPythonFacade-1.0.egg-info"]
to this:
RUN ["chmod", "-R" ,"0755", "/usr/local/lib/python2.7/dist-packages/SBusPythonFacade-1.0-py2.7.egg-info"]

and this: 
RUN ["chmod", "-R" ,"0755", "/usr/local/lib/python2.7/dist-packages/storlet_daemon_factory-1.0.egg-info"]
to this:
RUN ["chmod", "-R" ,"0755", "/usr/local/lib/python2.7/dist-packages/storlet_daemon_factory-1.0-py2.7.egg-info"]

Resolve conflicting dependencies from source Swift and docker-registry

  1. Swift installed from source: the docker-registry dependency causes overwrites.
    This was another tricky one to debug; essentially, halfway through the script the Swift client stopped working altogether. After a lot of digging (and some help from Eran), the root cause was one of the dependencies, namely docker-registry, which has the following requirements file:
    boto==2.19.0
    backports.lzma==0.0.2
    Flask==0.9
    PyYAML==3.10
    simplejson==3.1.3
    requests==1.2.0
    gunicorn==18.0
    gevent==0.13.8
    newrelic==1.13.1.31
    blinker==1.3
    python-glanceclient==0.10.0
    python-keystoneclient==0.3.1
    python-swiftclient==1.8.0
    redis==2.8.0
    rsa==3.1.2
    These are all pretty old, but the main issue is that for an install from source we'll still go ahead and grab the package above, which is quite a bit older, and many of its dependencies will not exist, implying it can't be executed.
    I believe docker-registry has actually been deprecated; the newer variant doesn't seem to have these dependencies, and essentially you pull in whatever client you want. Anyway, this is something for us to look at further.

Fix the documentation regarding usage by root / sudoer

  • The docs suggest that if you are not using root you can drive the ansible script using (ansible-playbook -s -i storlet.yml). Generally speaking this will not work, because sudo prompts for a password; what the docs should say is to use the combination of -s -K, the latter of which prompts for the sudo password.
  • With that said, the above is going to cause problems, namely due to the following invocation (here's just one example):
    python /opt/ibm/add_new_tenant.py service swift
  • The ansible script calls a python program that starts a process which drives another ansible script; of course, if you weren't using root (and -K), it will attempt to prompt, implying it will simply hang forever waiting for input.
  • Given this, I think we need to specify that the user used for SSH is either root, or has passwordless sudo along with key-based SSH (e.g. RSA keys between all the hosts).

Call out in doc to make sure DNS is set right

  1. Docker DNS issues should be documented.
    This isn't a huge bug per se, but it took quite a bit of time to resolve, so I'll call it out. What I saw during the building of the docker image was:
    openjdk-7-jre : Depends: openjdk-7-jre-headless (= 7u51-2.4.6-1ubuntu4) but it is not going to be installed
    E: Unable to correct problems, you have held broken packages.

This led to a long goose chase; at first I thought my variant of Ubuntu had a newer version of this particular library, and I even tried to downgrade my variant to match it.
The real problem is that, given the way we build the docker image, we see next to no other output (error or not). It took a while, but eventually I just pulled down a container and ran the commands by hand. The first thing we do is apt-get update, to pull in all the packages; it turns out this was doing nothing at all, because it couldn't connect to the server. Why? DNS issues in the Toronto IBM lab. The key was to add the DNS server at the very start of the docker build file (i.e. update /etc/resolv.conf); a sketch is below.
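For example, a sketch of the workaround in the Dockerfile (the nameserver address is illustrative; the echo and the update must share one RUN step, since docker gives each build step a fresh /etc/resolv.conf):

    RUN echo "nameserver 9.0.0.2" > /etc/resolv.conf && apt-get update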

I think what we need to do here is simply call this out clearly in the docs so customers know this may be required.

Fix split in Deploy/playbook/roles/common_files/swift_config.py

  1. You need to be careful with split and os.path.exists in Python :-)
    There's the following code in Deploy/playbook/roles/common_files/swift_config.py:

        def install(conf):
            object_server_conf_files = conf.get('object-confs', 'object_server_conf_files').split(',')
            for f in object_server_conf_files:
                if os.path.exists(f):
                    patch_swift_config_file(conf, f, 'object')

The config file on disk handles the scenario where there are multiple object conf files; as an example:
object_server_conf_files = /etc/swift/object-server/1.conf, /etc/swift/object-server/2.conf, /etc/swift/object-server/3.conf, /etc/swift/object-server/4.conf

So this all looks harmless enough ... but much later down the line I realised my object files were not all patched. There were two reasons for this (one I'll cover in another section), but the code bug is as follows.
Calling split in Python as done above will return:

    "file1"
    " file2"
    " file3"
    ...

Notice the whitespace; it looks harmless, but it causes os.path.exists to fail (ouch), and we skip the files!
The fix is to first call f = f.strip(); the corrected loop is sketched below.
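The corrected loop, mirroring the snippet above with the strip applied before the existence check:

    def install(conf):
        object_server_conf_files = conf.get('object-confs', 'object_server_conf_files').split(',')
        for f in object_server_conf_files:
            f = f.strip()  # drop the whitespace left by the comma split
            if os.path.exists(f):
                patch_swift_config_file(conf, f, 'object')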

Find way to fix the docker version being taken by the install scripts

The docker lxc dependency seems to be incorrectly specified:

  • The ansible script does list lxc-docker as a dependency, and tries to install a specific version using lxc-docker=1.6.2, which should work.
  • For whatever reason this doesn't, but I've found that manually running apt-get install lxc-docker-1.6.2 does (this is the actual package name, odd I know). I've also found that using a later version of lxc-docker works just fine (i.e. 1.7).

Installation missing dependencies

Various packages are missing and need to be manually installed:

  • This is fairly minor, but there are various packages we should add dependencies for:
    sudo apt-get install ansible
    sudo apt-get install openjdk-7-jdk
    sudo apt-get install sshpass
    sudo apt-get install ant

We need to decide whether to fix this in the documentation or by adding some scripts.
The point to make is that these are out of 'Ansible scope' (they include ansible itself), and we probably need a small 'bootstrapping' script that installs them before invoking the Ansible scripts; a sketch is below.
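A minimal bootstrapping sketch (untested), covering just the packages listed above:

    #!/bin/bash
    set -e
    apt-get update
    apt-get install -y ansible openjdk-7-jdk sshpass ant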
