open-i-beam / swift-storlets

This repository has moved to
Home Page: https://github.com/openstack/storlets
License: Apache License 2.0
Aug 5 14:30:28 UNKNOWN_LOCALHOST [main] StorletDaemon_com.ibm.storlet.test.test1 TRACE com.ibm.storlet.test.test1: SBus listen() returned
Aug 5 14:30:28 UNKNOWN_LOCALHOST [main] StorletDaemon_com.ibm.storlet.test.test1 TRACE com.ibm.storlet.test.test1: Calling receive
Aug 5 14:30:28 UNKNOWN_LOCALHOST [main] StorletDaemon_com.ibm.storlet.test.test1 TRACE com.ibm.storlet.test.test1: Receive returned
Aug 5 14:30:28 UNKNOWN_LOCALHOST [main] StorletDaemon_com.ibm.storlet.test.test1 TRACE com.ibm.storlet.test.test1: Calling createStorletTask with com.ibm.storlet.sbus.SBusDatagram@2c9596e5
Aug 5 14:30:28 UNKNOWN_LOCALHOST [main] StorletDaemon_com.ibm.storlet.test.test1 TRACE createStorletTask: received EXECUTE command
Aug 5 14:30:28 UNKNOWN_LOCALHOST [main] StorletDaemon_com.ibm.storlet.test.test1 TRACE StorletTask: Got 5 fds
Aug 5 14:30:28 UNKNOWN_LOCALHOST [main] StorletDaemon_com.ibm.storlet.test.test1 TRACE createStorletTask: fd 0 is of type SBUS_FD_INPUT_OBJECT
Either change the entry accordingly, or check which port is actually used in the Swift installation.
Before installing Storlets, I looked at the logs of the storage nodes to check for these errors, and they did not appear (the replicator works fine, without errors). Then, after installing Storlets, they appear again, and the more files in Swift, the more "invalid paths" errors come from the storlet handler.
Now I have run a simple test to verify this: on one storage node, I modified object-server.conf, removing the storlet handler from the pipeline, and restarted Swift with "swift-init all restart"; the "invalid paths" errors from the storlet handler disappear.
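For reference, the kind of pipeline change involved looks roughly like the following in object-server.conf; this is a sketch only, and the exact filter list and the filter's name vary per installation:

[pipeline:main]
# with the storlet handler enabled:
# pipeline = healthcheck recon storlet_handler object-server
# for the test, with the storlet handler removed:
pipeline = healthcheck recon object-server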
The Ansible script relied on the upstart init service manager, but upstart was removed (at least by default) in Ubuntu 15.04. This process is responsible for initializing services, in our case Docker. There is a new standard replacement (which had been around for a while): systemd, the new init system. I found various articles describing how to switch back to upstart on 15.04, but decided to bail out for now and try the install on 14.04; I can come back to this later:
http://www.pcworld.com/article/2895517/ubuntu-just-switched-to-systemd-the-project-sparking-controversy-throughout-linux.html
http://linuxg.net/replace-systemd-with-upstart-on-ubuntu-15-04-or-an-official-flavor/
This will cause issues down the road, because the generated ... will only include the ONE proxy file you chose to include (unlike the object conf files, we do not support multiple proxy files), and that will lead to an error like:
Jul 10 05:37:21 saio-roumani object-6040: No section 'proxy-server' (prefixed by 'app' or 'application' or 'composite' or 'composit' or 'pipeline' or 'filter-app') found in config /etc/swift/storlet-proxy-server.conf
This happens when we try to use the internal client to fetch the storlet and its dependencies.
I suggest documenting this, and perhaps noting that to fix it you need to manually merge all of the files into the storlet proxy conf afterwards!
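For reference, a minimal sketch of the section the internal client is complaining about in /etc/swift/storlet-proxy-server.conf; the section names follow the error message above, but the concrete pipeline and filter entries of a real deployment will differ:

[pipeline:main]
pipeline = proxy-server

[app:proxy-server]
use = egg:swift#proxy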
Hi everyone, I have installed Storlets using the s2aio method from the latest commit, and now I am getting a 503 Service Unavailable error when retrieving an object using the storlet jar. I can get the object without the storlet.
http://paste.openstack.org/show/497005/ shows what I did: I have just uploaded the Identity storlet with its dependency, and source.txt into the myobjects container.
The username and password for the swift user are the s2aio defaults, i.e. swift and passw0rd, under tenant service and region 1.
Any help would be appreciated.
P.S. I used the https://github.com/openstack/storlets/ repo and I couldn't find any way to report an issue there.
The mount points and drives are as follows.
root@swift:~/storlets/StorletSamples/IdentityStorlet/bin# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 30G 0 disk
├─sda1 8:1 0 28G 0 part /
├─sda2 8:2 0 1K 0 part
└─sda5 8:5 0 2G 0 part [SWAP]
sr0 11:0 1 59.5M 0 rom
loop0 7:0 0 9.3G 0 loop /srv/node/loop0
root@swift:~/storlets/StorletSamples/IdentityStorlet/bin# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 28G 7.9G 19G 31% /
none 4.0K 0 4.0K 0% /sys/fs/cgroup
udev 987M 8.0K 987M 1% /dev
tmpfs 200M 1.6M 198M 1% /run
none 5.0M 0 5.0M 0% /run/lock
none 997M 164K 997M 1% /run/shm
none 100M 84K 100M 1% /run/user
/dev/loop0 9.4G 849M 8.5G 9% /srv/node/loop0
root@swift:~/storlets/StorletSamples/IdentityStorlet/bin# blkid
/dev/loop0: UUID="5d0e5a2f-d935-478c-a86e-5bcc71a7a150" TYPE="xfs"
/dev/sda1: UUID="dc55150c-c2f6-4309-b991-371a4e15afaa" TYPE="ext4"
/dev/sda5: UUID="98881fcb-9625-4ddc-b7e4-cbe5222bef45" TYPE="swap"
We need to rethink this feature. At any rate, it currently seems broken.
Hamdi ran into the following issue:
In storlet_docker_gateway:

def _upload_storlet_logs(self, slog_path):
    if (config_true_value(self.idata['generate_log'])):
        logfile = open(slog_path, 'r')
I ran into issues (namely, an exception is thrown) when this path was a directory, namely:
/home/docker_device/logs/scopes/b2c9f064ec324/com.ibm.storlet.test.test1/
As opposed to the file:
/home/docker_device/logs/scopes/b2c9f064ec324/com.ibm.storlet.test.test1/storlet_invoke.log
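A minimal defensive sketch of how the caller could guard against this; the helper name is made up, the file name storlet_invoke.log is taken from the path above, and appending it when given a directory is an assumption rather than the project's actual fix:

import os

def _resolve_storlet_log_path(slog_path):
    # If we were handed the scope directory rather than the log file
    # itself, fall back to the expected file inside it.
    if os.path.isdir(slog_path):
        slog_path = os.path.join(slog_path, 'storlet_invoke.log')
    return slog_path

# usage inside _upload_storlet_logs (sketch):
# logfile = open(_resolve_storlet_log_path(slog_path), 'r')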
This YAML task does the following:
Deploy/playbook/fetch_proxy_conf.yml
src=/tmp/proxy-server.conf dest=/etc/swift/proxy-server.conf
But this may not be where the conf file is; in a vagrant-swift setup it definitely is not!
We should have this as a configuration option (one of the earlier points, namely point 9, touches on this as well); a possible sketch follows.
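A sketch of a parameterized version of the task, assuming a variable such as swift_proxy_server_conf is defined in a common vars file; the variable name and the copy module are illustrative, not necessarily what the playbook uses:

# Deploy/playbook/fetch_proxy_conf.yml (sketch)
- name: install the proxy server configuration
  copy:
    src: /tmp/proxy-server.conf
    dest: "{{ swift_proxy_server_conf }}"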
Storlets no longer work on recent versions of Swift (the last working version is 2.3.0). As a quick summary, storlets send a content length of 0 (as they cannot tell what the storlet will do with the output stream), but Swift does not like this.
Details:
So I dug into this a little more just to see the code details, and when exactly this was changed:
This takes place while we're trying to build an iterator over the response:
... GETorHEAD_base -> get_working_response -> _make_app_iter
parts_iter = self._get_response_parts_iter(req, node, source)

def add_content_type(response_part):
    response_part["content_type"] = \
        HeaderKeyDict(response_part["headers"]).get("Content-Type")
    return response_part

return document_iters_to_http_response_body(
    (add_content_type(pi) for pi in parts_iter),
    boundary, is_multipart, self.app.logger)
So in the return you can see we call document_iters_to_http_response_body, which does the following (the path in question is the one for non-multipart requests, as multipart is handled differently):
document_iters_to_http_response_body:

if multipart:
    return document_iters_to_multipart_byteranges(ranges_iter, boundary)
else:
    try:
        response_body_iter = next(ranges_iter)['part_iter']   <<<<<<<<
When we call 'next' above on the iterator is where we run into trouble. We created the iterator earlier through parts_iter = self._get_response_parts_iter(req, node, source), which will trigger this code:
_get_response_parts_iter:

# This is safe; it sets up a generator but does not call next()
# on it, so no IO is performed.
parts_iter = [
    http_response_to_document_iters(
        source[0], read_chunk_size=self.app.object_chunk_size)]
Finally, the call to http_response_to_document_iters uses the Content-Length (for non-multipart requests) to create the iterator:
http_response_to_document_iters:

if response.status == 200:
    # Single "range" that's the whole object
    content_length = int(response.getheader('Content-Length'))
    return iter([(0, content_length - 1, content_length,
                  response.getheaders(), response)])
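To make the failure mode concrete, here is a small illustrative sketch of the tuple this builds when the response reports Content-Length: 0, as a storlet response does; the arithmetic below is derived purely from the excerpt above and is not code from Swift or Storlets:

# Sketch: the single "range" tuple for a 200 response with Content-Length: 0
content_length = int('0')        # header value reported by the storlet
first_byte = 0
last_byte = content_length - 1   # i.e. -1, an empty range
print((first_byte, last_byte, content_length))   # (0, -1, 0)
# Downstream code then treats this as the whole object, even though the
# storlet will actually stream a non-empty body, and things break.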
The code in 2.3.0 is quite different: building the iterator (i.e. _make_app_iter) does not call http_response_to_document_iters at all; all of that logic simply lived in the function directly, and it appears they created a generator with yield around 'read'.
I've tracked down the change to:
openstack/swift@4f2ed8b
https://review.openstack.org/#/c/173497/
EC: support multiple ranges for GET requests
GetOrHeadHandler got a base class extracted from it that treats an
HTTP response as a sequence of byte-range responses. This way, it can
continue to yield whole fragments, not just N-byte pieces of the raw
HTTP response, since an N-byte piece of a multipart/byteranges
response is pretty much useless.
...
Also, the MIME response for replicated objects got tightened up a
little. Before, it had some leading and trailing CRLFs which, while
allowed by RFC 7233, provide no benefit. Now, both replicated and EC
multipart/byteranges avoid extraneous bytes. This let me re-use the
Content-Length calculation in swob instead of having to either hack
around it or add extraneous whitespace to match.
I found that, for whatever reason, I couldn't download from this site (http://www.slf4j.org/dist/) at various times, not just through the script but even from my browser.
Updating a storlet via a Swift PUT to the storlet container does not trigger its update in a multi-cluster environment.
Could this be related to time differences between the machines?
If you are using setuptools 3.3 to build this package, there is no problem, but when a newer version is used (in my case setuptools v15), setuptools changes the name of the egg-info dir, so I had to change the line:
RUN ["chmod", "-R" ,"0755", "/usr/local/lib/python2.7/dist-packages/SBusPythonFacade-1.0.egg-info"]
to this:
RUN ["chmod", "-R" ,"0755", "/usr/local/lib/python2.7/dist-packages/SBusPythonFacade-1.0-py2.7.egg-info"]
and this:
RUN ["chmod", "-R" ,"0755", "/usr/local/lib/python2.7/dist-packages/storlet_daemon_factory-1.0.egg-info"]
to this:
RUN ["chmod", "-R" ,"0755", "/usr/local/lib/python2.7/dist-packages/storlet_daemon_factory-1.0-py2.7.egg-info"]
This file contains:
[object-confs]
object_server_conf_files = /etc/swift/object-server.conf
We need to have this as a var in common.yml and reference the var from here.
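A sketch of what that could look like, assuming a variable named object_server_conf_files in common.yml and a template for the generated file (both names are illustrative):

# common.yml (sketch)
object_server_conf_files: /etc/swift/object-server.conf

# template for the generated file (sketch)
# [object-confs]
# object_server_conf_files = {{ object_server_conf_files }}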
This led to a long goose chase; at first I thought my variant of Ubuntu had a newer version of this particular library, and I even tried to downgrade it to match.
The real problem here was that, given the way we build the Docker image, we see essentially no other output (error or not). It took a while, but eventually I just pulled down a container and ran the command ... the first thing we do is apt-get update, to pull in all the packages. It turns out this was doing nothing at all: it couldn't connect to the server. Why? DNS issues in the Toronto IBM lab. The key was to add the DNS server at the very start of the Docker requirements file (i.e. update /etc/resolv.conf).
I think what we need to do here is simply call this out clearly in the docs so customers know this may be required.
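For illustration only, the kind of entry involved; the nameserver address below is a placeholder for the local lab's DNS server, and depending on the Docker version, configuring DNS through the daemon's --dns option may be the more durable approach:

# /etc/resolv.conf inside the build environment (sketch)
nameserver 192.0.2.53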
This is where the config file on disk handles the scenario of multiple object conf files; as an example:
object_server_conf_files = /etc/swift/object-server/1.conf, /etc/swift/object-server/2.conf, /etc/swift/object-server/3.conf, /etc/swift/object-server/4.conf
So this all looks harmless enough ... but much later down the line I realised my object conf files were not all patched. There were two reasons for this (one I'll cover in another section), but the code bug is as follows.
Calling split in Python on that comma-separated value, as the code does, will return:
"file1"
" file2"
" file2"
...
Notice the leading whitespace; it looks harmless, but it causes os.path.exists to fail (ouch), and we skip those files!
The first fix is to call, on each entry:
f = f.strip()
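A fuller sketch of parsing the option robustly; the raw value mirrors the example above, and the surrounding variable names are made up:

import os

raw = '/etc/swift/object-server/1.conf, /etc/swift/object-server/2.conf'

# Strip the whitespace around each entry and drop empties so that
# os.path.exists sees the real paths.
conf_files = [f.strip() for f in raw.split(',') if f.strip()]
existing = [f for f in conf_files if os.path.exists(f)]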
The Docker lxc dependency seems to be incorrectly specified.
There are various packages that are missing and need to be manually installed:
We need to decide whether to fix this in the documentation or by adding some scripts.
The point to make is that these are out of 'Ansible scope' (they include Ansible itself), and we probably need a small 'bootstrapping' script that installs them before invoking the Ansible scripts; a sketch follows.
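A rough sketch of such a bootstrap script; the package list is illustrative, since the exact missing packages are not listed above and only Ansible itself is named:

#!/bin/bash
# bootstrap.sh (sketch): install the prerequisites that the Ansible
# scripts cannot install for themselves.
set -e
sudo apt-get update
sudo apt-get install -y ansible   # plus whatever other packages turn out to be missing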