jackfengji / test_pro
Doc for dpark which is a python rewrite for spark
https://github.com/jackfengji/test_pro/wiki/Examples
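An example in the DPark docs is wrong: "Upgraded version 2"
Original example:
word_count = ctx.accumulator(0)
It should be corrected to:
word_count = ctx.accumulator.Accumulator(0)
Otherwise it fails with:
TypeError: 'module' object is not callable
Please fix this soon so that later learners are not confused.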
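The error message suggests that `ctx.accumulator` resolved to a module rather than a callable. The sketch below reproduces that mechanism with a stand-in package built from the standard library. It does not import dpark, and `fake_pkg` plus the minimal `Accumulator` class are hypothetical names used only for illustration. Binding `ctx` to the package itself is also an assumption, since the report does not show how `ctx` was created.

```python
import types

# Hypothetical stand-in for a package whose `accumulator` attribute is a
# submodule that contains an `Accumulator` class. This is NOT the real dpark
# package; the names only mirror the ones used in the report.
fake_pkg = types.ModuleType("fake_pkg")
fake_pkg.accumulator = types.ModuleType("fake_pkg.accumulator")


class Accumulator(object):
    """Minimal counter used only for this illustration."""

    def __init__(self, initial=0):
        self.value = initial

    def add(self, amount):
        self.value += amount


fake_pkg.accumulator.Accumulator = Accumulator

# Assumption: `ctx` is bound to the package itself, not to a context object.
ctx = fake_pkg

# Calling the submodule raises TypeError, because modules are not callable.
try:
    ctx.accumulator(0)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Calling the class inside the submodule works.
word_count = ctx.accumulator.Accumulator(0)
word_count.add(3)
```

If `ctx` were instead an object that exposes an `accumulator` method, the original call would not hit this error, so how `ctx` is created in the example is worth checking.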
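Why was
self.shouldCache = True
commented out?
Is there a problem with the cache feature? With that line commented out, doesn't that mean cache can no longer be enabled for persistence?
Thanks~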
Did the author refer to any English documentation when writing the wiki? If so, could you share it?
I am translating the docs into English.
DPark-0.4.2-py2.7-linux-x86_64.egg/dpark/crc32c.so: undefined symbol: __builtin_cpu_init
I recently wanted to try writing distributed programs with dpark. After installing mesos, I found that dpark fails at runtime:
Traceback (most recent call last):
  File "calc-pi.py", line 13, in <module>
    dpark.parallelize(range(0, N), 5).foreach(random_once)
  File "/home/work/local/lib/python2.7/site-packages/Dpark-0.1-py2.7.egg/dpark/rdd.py", line 146, in foreach
    return self.ctx.runJob(self, mf)
  File "/home/work/local/lib/python2.7/site-packages/Dpark-0.1-py2.7.egg/dpark/context.py", line 128, in runJob
    self.start()
  File "/home/work/local/lib/python2.7/site-packages/Dpark-0.1-py2.7.egg/dpark/context.py", line 114, in start
    self.scheduler.start()
  File "/home/work/local/lib/python2.7/site-packages/Dpark-0.1-py2.7.egg/dpark/schedule.py", line 436, in start
    self.getExecutorInfo(), self.master)
  File "/home/work/local/lib/python2.7/site-packages/Dpark-0.1-py2.7.egg/dpark/schedule.py", line 402, in _
    return f(self, *a, **kw)
  File "/home/work/local/lib/python2.7/site-packages/Dpark-0.1-py2.7.egg/dpark/schedule.py", line 473, in getExecutorInfo
    info.uri = os.path.abspath(os.path.join(dir, 'executor'))
AttributeError: 'ExecutorInfo' object has no attribute 'uri'
I looked at the code in mesos_pb2.py and found that ExecutorInfo is defined like this:
_EXECUTORINFO = descriptor.Descriptor(
  name='ExecutorInfo',
  full_name='mesos.ExecutorInfo',
  filename=None,
  file=DESCRIPTOR,
  containing_type=None,
  fields=[
    descriptor.FieldDescriptor(
      name='executor_id', full_name='mesos.ExecutorInfo.executor_id', index=0,
      number=1, type=11, cpp_type=10, label=2,
      has_default_value=False, default_value=None,
      message_type=None, enum_type=None, containing_type=None,
      is_extension=False, extension_scope=None,
      options=None),
    descriptor.FieldDescriptor(
      name='command', full_name='mesos.ExecutorInfo.command', index=1,
      number=7, type=11, cpp_type=10, label=2,
      has_default_value=False, default_value=None,
      message_type=None, enum_type=None, containing_type=None,
      is_extension=False, extension_scope=None,
      options=None),
    descriptor.FieldDescriptor(
      name='resources', full_name='mesos.ExecutorInfo.resources', index=2,
      number=5, type=11, cpp_type=10, label=3,
      has_default_value=False, default_value=[],
      message_type=None, enum_type=None, containing_type=None,
      is_extension=False, extension_scope=None,
      options=None),
    descriptor.FieldDescriptor(
      name='data', full_name='mesos.ExecutorInfo.data', index=3,
      number=4, type=12, cpp_type=9, label=1,
      has_default_value=False, default_value="",
      message_type=None, enum_type=None, containing_type=None,
      is_extension=False, extension_scope=None,
      options=None),
  ],
  extensions=[
  ],
  nested_types=[],
  enum_types=[
  ],
  options=None,
  is_extendable=False,
  extension_ranges=[],
  serialized_start=417,
  serialized_end=558,
)
There is no uri field in it...
Does installing dpark require a specific, older version of mesos?
My mesos was checked out with svn from https://svn.apache.org/repos/asf/incubator/mesos/trunk/:
Path: .
URL: https://svn.apache.org/repos/asf/incubator/mesos/trunk
Repository Root: https://svn.apache.org/repos/asf
Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
Revision: 1330115
Node Kind: directory
Schedule: normal
Last Changed Author: benh
Last Changed Rev: 1330079
Last Changed Date: 2012-04-25 09:23:38 +0800 (Wed, 25 Apr 2012)
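The AttributeError in the traceback is consistent with the descriptor quoted above, which declares only executor_id, command, resources, and data. Generated protobuf messages reject assignment to names that are not declared fields. The sketch below imitates that behavior with a hypothetical stand-in class built on `__slots__`, so it runs without mesos installed. `ExecutorInfoStandIn`, `declared_fields`, and the placeholder path are assumptions made for illustration. For a real message, the declared names can be read from `message.DESCRIPTOR.fields_by_name`.

```python
class ExecutorInfoStandIn(object):
    """Hypothetical stand-in for mesos_pb2.ExecutorInfo.

    Like a generated protobuf message, it only accepts the attributes it
    declares. The field names are copied from the descriptor quoted above.
    """

    __slots__ = ("executor_id", "command", "resources", "data")


def declared_fields(message):
    """Return the set of field names a message accepts.

    Uses DESCRIPTOR.fields_by_name when the object is a real protobuf
    message, and falls back to __slots__ for the stand-in above.
    """
    descriptor = getattr(message, "DESCRIPTOR", None)
    if descriptor is not None:
        return set(descriptor.fields_by_name)
    return set(type(message).__slots__)


info = ExecutorInfoStandIn()
fields = declared_fields(info)

# Assigning an undeclared field fails, as it does in the traceback.
try:
    info.uri = "/path/to/executor"  # placeholder path, not from the report
    assignment_error = None
except AttributeError as exc:
    assignment_error = exc

# A caller can check for the field first instead of assuming it exists.
uri_supported = "uri" in fields
```

A check like this would report a mismatch between the dpark code and the installed mesos bindings up front, rather than failing inside getExecutorInfo.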