
blog's Introduction

Hi there 👋

🔭 Want to know more about me? 😄 Please visit My Blog

blog's Issues

News

[01/06/2022] I received my B.Eng. in E.E. from Zhejiang University with the Outstanding Thesis Award.

[CS Research] Research and Publication

Dr. Mu Li said, "An excellent Ph.D. student should not pursue the quantity of publications, but the monotonically increasing quality of each paper." Sounds quite challenging! I hope my first publication appears at a good systems conference.

[CS Research] When theory meets practice

I really like this idea: "the problems that arise when addressing systems' 'pain points' can serve as a compass to guide us to exciting new theory", which comes from a distributed-systems lab at Cornell University. I have read a couple of papers from this lab on distributed systems (blockchains), and their papers really do what they say.

[CS Research] Find the meaningful problems!

After reading a number of papers from top systems conferences such as OSDI, SOSP, VLDB, and SIGMOD, I came to realize that finding real and meaningful problems in a system is difficult but necessary for systems research.

About

Brief Bio

I received my B.Eng. in E.E. from Zhejiang University in July 2022. I am currently pursuing a CS Ph.D. at EPFL on a fellowship.

Under the supervision of Professor Zeke Wang, I was honored to receive the Outstanding Graduation Thesis Award for my research on hardware acceleration of graph neural networks (GNNs).

My passion lies in computer science, particularly in distributed systems for X, where X could be machine learning, blockchains, databases, etc.

I am seeking professors who can support my research interests after my first year of study at EPFL.

News

What happened recently:

[22/02/2023] I started sharing paper-reading notes on distributed systems (mainly machine learning and blockchains) on Zhihu (知乎).

[01/06/2022] I received my B.Eng. in E.E. from Zhejiang University.

[31/05/2022] My thesis won the Outstanding Thesis Award. Thanks to Prof. Zeke Wang for his supervision!

You can find my CV here.

Publications

Under construction...

Contact Me

Email: jinwei.yao @ epfl.ch

[OSDI'22] Ekko

Ekko: A Large-Scale Deep Learning Recommender System with Low-Latency Model Update

Notes in Chinese: on Zhihu (知乎)

Notes in English: in my Notion

How to read a paper:

Step 1: Keep in mind

- What problem does this paper try to solve?
- Why is this an important and hard problem?
- Why can't previous work solve this problem?
- What is novel in this paper?
- Does it show good results?

Step 2: Summarize

  • Summary of high-level ideas
    • Bypass long-latency update steps by allowing model updates to be disseminated immediately to all inference clusters.
  • Problems/Motivations: what problem does this paper solve?
    • Deep Learning Recommender Systems (DLRSs) need to update models at low latency (to achieve latency-related Service-Level Objectives (SLOs)), while existing techniques compromise either update latency or SLO performance.
  • Challenges: why is this problem hard to solve?
    • WANs, with their limited bandwidth and heterogeneous network paths, make it difficult to disseminate massive model updates efficiently.
    • Network congestion and biased model updates may harm SLOs by delaying critical updates and reducing accuracy, respectively.
  • Methods: what are the key techniques in the paper?
    • An efficient peer-to-peer model update dissemination algorithm (a toy sketch follows this list):
      • 1) Shard versions → similar to version control in distributed databases;
      • 2) Parameter update cache → takes advantage of the sparsity of the updates;
      • 3) WAN optimisation → lets parameter servers prioritise bandwidth-affluent intra-DC network paths over the bandwidth-limited inter-DC WAN.
    • SLO-aware Model Update Scheduler → handles network congestion by prioritising critical updates.
    • Inference Model State Manager → protects SLOs from biased updates: it monitors the SLOs of inference models and rolls back degraded models.
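
To make the dissemination algorithm concrete, here is a minimal single-machine sketch of shard-version sync over a sparse update log. Everything in it (Shard, sync, update_log) is my own simplification for illustration, not code from the paper, whose system runs across parameter servers over intra-DC and inter-DC networks.

```python
from dataclasses import dataclass, field

@dataclass
class Shard:
    """One parameter shard on a peer (hypothetical simplification of Ekko)."""
    version: int = 0
    params: dict = field(default_factory=dict)      # key -> parameter value
    update_log: dict = field(default_factory=dict)  # version -> sparse update

    def apply_local_update(self, sparse_update: dict) -> None:
        """Record only the touched keys, exploiting update sparsity."""
        self.version += 1
        self.params.update(sparse_update)
        self.update_log[self.version] = dict(sparse_update)

def sync(dst: Shard, src: Shard) -> None:
    """Pull-based sync: compare shard versions, transfer only missing updates."""
    if src.version <= dst.version:
        return  # dst is already up to date; nothing is transferred
    for v in range(dst.version + 1, src.version + 1):
        dst.params.update(src.update_log[v])
    dst.version = src.version

# Peer A produces sparse updates during training; peer B catches up cheaply.
a, b = Shard(), Shard()
a.apply_local_update({"emb[42]": 0.1})
a.apply_local_update({"emb[7]": -0.3})
sync(b, a)
assert b.params == a.params and b.version == a.version
```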

[OSDI'23] AdaEmbed

AdaEmbed: Adaptive Embedding for Large-Scale Recommendation Models

Notes in Chinese: on Zhihu (知乎)

Notes in English: in my Notion

How to read a paper:

Step 1: Keep in mind

- What problem does this paper try to solve?
- Why is this an important and hard problem?
- Why can't previous work solve this problem?
- What is novel in this paper?
- Does it show good results?

Step 2: Summarize

  • Summary of high-level ideas
    • Reduce the size of the embeddings needed for the same DLRM accuracy via in-training embedding pruning.
    • Equivalently: for a given embedding size, AdaEmbed scalably identifies and retains the embeddings that matter most to model accuracy at particular times during training.
  • Problems/Motivations: what problem does this paper solve?
    • While more embedding rows typically enable better model accuracy by covering more feature instances, they lead to large deployment cost and slow model execution.
    • Key insight: the access patterns and weights of different embeddings are heterogeneous across embedding rows and change dynamically over the training process, implying varying embedding importance with respect to model accuracy.
  • Challenges: why is this problem hard to solve?
    • DLRMs often have stringent throughput and latency requirements for (online) training and inference, but gigantic embeddings make computation, communication, and memory optimizations challenging.
      • To achieve the desired model throughput, practical deployments often have to use hundreds of GPUs just to hold the embeddings.
    • Designing better embeddings (e.g., the number of per-feature embedding rows and which embedding weights to retain) remains challenging, because the exploration space grows with larger embeddings and demands intensive manual effort.
  • Methods: what are the key techniques in the paper?
    • AdaEmbed considers embeddings with higher runtime access frequencies and larger training gradients to be more important, and it dynamically prunes less important embeddings at scale to automatically determine per-feature embeddings (a toy sketch follows this list).
      • Challenge 1: identifying important embeddings out of billions is non-trivial.
        • Embedding Monitor: identifies important embeddings (by access frequency and the L2 norm of gradients).
      • Challenge 2: enforcing in-training pruning after identifying important embeddings is not straightforward either.
        • AdaEmbed Coordinator: prunes at the right time (trading off pruning overhead against pruning quality).
        • Memory Manager: prunes weights at scale (Virtually Hashed Physically Indexed storage is used to avoid memory reallocation).
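
As a toy illustration of the Embedding Monitor, the sketch below assumes a row's importance is simply access frequency times gradient L2 norm; the names (importance, prune_mask, keep_ratio) are mine, and the paper's actual scoring and pruning machinery is more involved.

```python
import numpy as np

def importance(freq: np.ndarray, grad_l2: np.ndarray) -> np.ndarray:
    """Toy score: rows that are accessed often AND have large gradients matter."""
    return freq * grad_l2

def prune_mask(scores: np.ndarray, keep_ratio: float) -> np.ndarray:
    """Keep only the top `keep_ratio` fraction of embedding rows."""
    k = max(1, int(len(scores) * keep_ratio))
    threshold = np.partition(scores, -k)[-k]
    return scores >= threshold

freq = np.array([120.0, 3.0, 45.0, 0.0, 9.0])  # per-row access counts
grad = np.array([0.5, 0.9, 0.2, 0.0, 0.7])     # per-row gradient L2 norms
mask = prune_mask(importance(freq, grad), keep_ratio=0.4)
print(mask)  # [ True False  True False False]: rows kept at this snapshot
```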

[Make an important decision]

I just made an important decision.

From now on, I will be cautious in choosing my life, and I will no longer allow myself to be easily swayed by various temptations.

I have heard the call from afar in my heart, and I no longer need to look back and concern myself with the various controversies and discussions behind me.

I have no time to consider the past; I'm moving forward.

— Milan Kundera, "The Unbearable Lightness of Being"

Links

ZJU

link: //www.zju.edu.cn/english/
cover: //cdn.jsdelivr.net/gh/Monstertail/BlogAssets/zjulogo.png
avatar: //cdn.jsdelivr.net/gh/Monstertail/BlogAssets/zjuminilogo.png

EPFL

link: //www.epfl.ch/en/
cover: //cdn.jsdelivr.net/gh/Monstertail/BlogAssets/EPFL_cover.png
avatar: //cdn.jsdelivr.net/gh/Monstertail/BlogAssets/epflLogo.jpg

[OSDI'20] KungFu

KungFu: Making Training in Distributed Machine Learning Adaptive

Notes in Chinese: on Zhihu (知乎)

Notes in English: in my Notion

How to read a paper:

Step 1: Keep in mind

- What problem does this paper try to solve?
- Why is this an important and hard problem?
- Why can't previous work solve this problem?
- What is novel in this paper?
- Does it show good results?

Step 2: Summarize

  • Summary of high-level ideas
    • Design a distributed ML system that supports adaptation.
  • Problems/Motivations: what problem does this paper solve?
    • Empirical parameter tuning is dataset-specific, model-specific, and cluster-specific.
    • Adapting parameters is hard to realise. Problems in previous systems:
      • 1) No built-in mechanisms for adaptation.
      • 2) High monitoring overhead.
      • 3) Expensive state management under change.
  • Challenges: why is this problem hard to solve?
    • How to support different types of adaptation? e.g., autoscaling supports only one type of adaptation.
    • How to adapt based on a large volume of monitoring data? e.g., MLflow computes statistical metrics over this amount of data, which consumes substantial compute resources and network bandwidth.
    • How to change the parameters of stateful workers? In existing systems, users typically must checkpoint and restore all state when changing configuration parameters, which can take hundreds of seconds.
  • Methods: what are the key techniques in the paper?
    • Expressing adaptation policies → adapt configuration parameters based on monitored metrics (a toy sketch follows this list).
    • Embedding monitoring operators inside the dataflow → an asynchronous collective communication layer, with its functions embedded as monitoring operators in the dataflow graph, and NCCL to accelerate the communication layer.
    • Distributed mechanisms for parameter adaptation → decouple system parameters from dataflow state.
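
As a hedged sketch of the first technique, the loop below reads a monitored metric and adjusts a configuration parameter online, with no checkpoint-and-restart; the function, thresholds, and metric are illustrative assumptions of mine, not KungFu's actual policy API.

```python
def adapt_batch_size(batch_size: int, grad_noise_scale: float,
                     low: float = 0.1, high: float = 10.0) -> int:
    """Grow the batch when gradients are noisy; shrink it when they are stable."""
    if grad_noise_scale > high:
        return batch_size * 2
    if grad_noise_scale < low and batch_size > 32:
        return batch_size // 2
    return batch_size

# In KungFu, monitoring operators embedded in the dataflow would feed real
# metrics here; this stand-in loop just shows the shape of a policy.
batch = 256
for noise in [0.05, 12.0, 5.0]:  # fake monitored gradient-noise values
    batch = adapt_batch_size(batch, noise)
    print(noise, "->", batch)    # 0.05 -> 128, 12.0 -> 256, 5.0 -> 256
```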

Publications

I hope my first paper gets accepted at a good conference soon...

Book_template

ES6 标准入门

author: 阮一峰 (Ruan Yifeng)
published: 2017-09-01
progress: Currently reading...
rating: 5
postTitle: ES6 标准入门
postLink: //chanshiyu.com/#/post/12
cover: //chanshiyu.com/yoi/2019/ES6-标准入门.jpg
link: //www.duokan.com/book/169714
description: The order has come from Berlin: the schools of Alsace and Lorraine may now teach only ES6... He turned to the blackboard, picked up a piece of chalk, and, with all his strength, wrote in large letters: "ES6 **!" (after "The Last Lesson").

Biography

About Me

Greetings to all! My name is Jinwei Yao and I go by the nickname 菁葳 (Gin-Way) in Chinese.

I am currently pursuing a CS Ph.D. degree at EPFL on a fellowship.

I completed my undergraduate studies at Zhejiang University in July 2022.

Under the supervision of Professor Zeke Wang, I was honored to receive the Outstanding Graduation Thesis Award for my research on hardware acceleration of graph neural networks (GNNs).

My passion lies in computer science, particularly in distributed systems for X, where X could be machine learning, blockchains, databases, etc.

I am seeking professors who can support my research interests after my first year of study at EPFL.
