
Comments (4)

JerryLead commented on August 15, 2024

@about17ka

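"When resources are sufficient, all tasks in stage1 and stage2 can run concurrently, as long as they have no dependencies on each other." Correct.
"If resources are limited, execution proceeds in order starting from stage1; I am not sure whether this is right." All tasks in stage1 and stage2 are ready-to-run. Which task actually runs first depends on which task the scheduler picks first.

"Do we have to write a parallel Scala program to get this concurrency, or do we just write an ordinary sequential program and let Spark schedule it to run concurrently?" The Scala program we write only specifies the relationships between RDDs (that is, the logical plan). Spark is responsible for converting the logical plan into the physical plan shown above and then running the tasks concurrently. We can only control the number of tasks (that is, the number of partitions); the rest of the concurrency is handled by the framework.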
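
To make the point above concrete, here is a minimal sketch of an ordinary sequential Scala program that only declares RDD relationships. It assumes Spark (spark-core) is on the classpath and runs in local mode; the object name, the sample data, and the partition counts (3, 2, 2) are made up for illustration. The only concurrency knobs the program touches are the partition counts.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object IndependentLineages {
  def run(): Array[(Int, (String, Int))] = {
    // Local mode with 4 worker threads; an assumption for this sketch.
    val conf = new SparkConf().setMaster("local[4]").setAppName("independent-lineages")
    val sc = new SparkContext(conf)
    try {
      // First lineage: 3 partitions, so its stage has 3 tasks.
      val left = sc.parallelize(Seq(1 -> "a", 2 -> "b", 3 -> "c"), 3)
        .mapValues(_.toUpperCase)

      // Second lineage: 2 partitions, so its stage has 2 tasks.
      // It does not depend on `left` in any way.
      val right = sc.parallelize(Seq(1 -> 10, 2 -> 20, 3 -> 30), 2)
        .mapValues(_ + 1)

      // The join introduces shuffle dependencies on both lineages;
      // the result has 2 partitions, so the final stage has 2 tasks.
      val joined = left.join(right, 2)

      // Nothing runs until an action such as collect() is called.
      joined.collect().sortBy(_._1)
    } finally {
      sc.stop()
    }
  }

  def main(args: Array[String]): Unit =
    run().foreach(println)
}
```

Nothing in this program creates threads or says what runs in parallel. The code reads top to bottom like a sequential program, and the framework decides how the tasks of the two independent lineages are scheduled.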

from sparkinternals.

JerryLead commented on August 15, 2024

@about17ka
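stage1 and stage2 can run concurrently because there is no dependency between them. Whether the actual implementation runs them in parallel, I have partly forgotten; I need to review the code.

The two lineages in stage2 also run concurrently. Together they generate 4 tasks, and each task executes one of the black lines.

stage1 and stage2 are both parent stages of stage0. stage0 can run only after both stage1 and stage2 have finished their tasks; until then, stage0 stays in the pending state. Triggering it is simple: it is just a stack, with stage0 at the bottom and stage1 and stage2 on top.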
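
The stack idea described above can be sketched as a toy model in plain Scala. This is not Spark's actual scheduler code; `Stage`, `StageOrder`, and `executionOrder` are hypothetical names introduced only for this illustration, and the sketch finishes stages one at a time, whereas independent stages such as stage1 and stage2 could be submitted together.

```scala
import scala.collection.mutable

// A stage and the parent stages it must wait for.
final case class Stage(id: Int, parents: List[Stage])

object StageOrder {
  /** Returns stage ids in the order in which they finish. */
  def executionOrder(finalStage: Stage): List[Int] = {
    val stack = mutable.Stack[Stage](finalStage)   // final stage sits at the bottom
    val finished = mutable.LinkedHashSet[Int]()

    while (stack.nonEmpty) {
      val top = stack.top
      val missing = top.parents.filterNot(p => finished(p.id))
      if (missing.isEmpty) {
        // All parents are done, so this stage can run now.
        stack.pop()
        finished += top.id
      } else {
        // Stage stays pending; its unfinished parents go on top of it.
        missing.foreach(p => stack.push(p))
      }
    }
    finished.toList
  }
}

object StageOrderDemo {
  def main(args: Array[String]): Unit = {
    val stage1 = Stage(1, Nil)
    val stage2 = Stage(2, Nil)
    val stage0 = Stage(0, List(stage1, stage2))
    println(StageOrder.executionOrder(stage0)) // prints List(2, 1, 0)
  }
}
```

In this model stage0 stays on the stack, pending, until both parents have been popped. The relative order of stage1 and stage2 is only an artifact of the push order.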

about17ka commented on August 15, 2024

@JerryLead
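First of all, thank you very much for your answer. My understanding of it is this: when resources are sufficient, all tasks in stage1 and stage2 can run concurrently, as long as they have no dependencies on each other; if resources are limited, execution proceeds in order starting from stage1. I am not sure whether this is right.
One more question: do we have to write a parallel Scala program to get this concurrency, or do we just write an ordinary sequential program and let Spark schedule it to run concurrently?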

about17ka commented on August 15, 2024

Thank you very much for your answer.
