Comments (7)
Do you have a solution to the problem?
from incubator-uniffle.
Yes, it is being tested in our production environment. I will watch it for a while. If it's OK, I will create a PR.
Could you share your solution? We can discuss first.
Could you share your solution? We can discuss first.
- On the server side, if the `requireBufferId` is not found when data is sent, throw an exception.
- On the client side, if sending data fails, require a buffer again.
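The two steps above can be sketched roughly as follows. This is a minimal illustration, not Uniffle's actual code: `SketchServer`, `SketchClient`, `expire`, and `sendWithRetry` are hypothetical names, and the exception type and the attempt limit are assumptions made for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the shuffle server (not Uniffle's real classes).
class SketchServer {
  private final AtomicLong nextId = new AtomicLong();
  // requireBufferId -> number of bytes reserved for it
  private final Map<Long, Integer> reserved = new ConcurrentHashMap<>();

  // Reserve space and hand back a requireBufferId.
  long requireBuffer(int size) {
    long id = nextId.incrementAndGet();
    reserved.put(id, size);
    return id;
  }

  // Simulates the server dropping a reservation, e.g. after a timeout.
  void expire(long id) {
    reserved.remove(id);
  }

  // Server side: reject data whose requireBufferId is not found.
  void sendData(long id, byte[] data) {
    if (reserved.remove(id) == null) {
      throw new IllegalStateException("requireBufferId not found: " + id);
    }
    // The data would be cached here; omitted in this sketch.
  }
}

// Hypothetical stand-in for the client.
class SketchClient {
  // Client side: if sending fails, require a buffer again and resend.
  // Returns the number of attempts that were needed.
  static int sendWithRetry(SketchServer server, byte[] data, long firstId, int maxAttempts) {
    long id = firstId;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        server.sendData(id, data);
        return attempt;
      } catch (IllegalStateException e) {
        if (attempt == maxAttempts) {
          throw e; // give up without reserving another buffer
        }
        id = server.requireBuffer(data.length);
      }
    }
    throw new IllegalArgumentException("maxAttempts must be at least 1");
  }
}
```

With a reservation that has already expired, the first send is rejected and the second send, made with a freshly required buffer, goes through.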
cc @colinmjj. We don't seem to have seen such cases in our production environment, but I think the analysis is correct. What do you think?
I think @xianjingfeng is right. With the current implementation, OOM will happen if the `requireBufferId` has already expired in the Shuffle Server; this may be caused by GC, network problems, high workload on the shuffle server, etc.
It's better to restrict the server to accepting data only with a `requireBufferId`, to avoid this problem.
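One way to picture the failure is with a toy memory-accounting model. This is a sketch under an assumption the thread does not confirm: that when a `requireBufferId` expires, the server releases its reserved bytes so other requests can reserve them. `BufferAccounting` and all of its methods are hypothetical names, not Uniffle's API.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of server-side buffer accounting (hypothetical, for illustration only).
class BufferAccounting {
  private final long capacity;
  private long used;   // bytes currently reserved or cached
  private long nextId;
  private final Map<Long, Long> reservations = new HashMap<>();

  BufferAccounting(long capacity) {
    this.capacity = capacity;
  }

  // Returns a requireBufferId, or -1 if the reservation does not fit.
  long require(long size) {
    if (used + size > capacity) {
      return -1;
    }
    used += size;
    reservations.put(++nextId, size);
    return nextId;
  }

  // Assumption: expiry hands the reserved bytes back to the pool.
  void expire(long id) {
    Long size = reservations.remove(id);
    if (size != null) {
      used -= size;
    }
  }

  // Lenient: data is accepted even when its reservation is gone,
  // so bytes that nobody reserved end up in memory.
  void acceptLenient(long id, long size) {
    if (reservations.remove(id) == null) {
      used += size;
    }
  }

  // Strict: data is accepted only with a live reservation.
  boolean acceptStrict(long id) {
    return reservations.remove(id) != null;
  }

  long used() {
    return used;
  }
}
```

In this model, if a first 80-byte reservation out of a capacity of 100 expires and a second 80-byte reservation is granted, lenient acceptance of the late data pushes usage past the capacity, while strict acceptance rejects it and usage stays within the limit.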
closed by #157