
kevin-shihello-world's Projects

-starttransformers

🌱 StartTransformer_1 is a new transformer architecture built with time-wise normalization and a new scheme for allocating FFN parameters, so that a transformer-style model can be trained stably with far fewer parameters. The underlying idea can also be applied when developing many other structures.
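The description does not spell out what "time-wise normalization" means; a minimal sketch of one plausible interpretation, normalizing over the time axis of a (batch, time, dim) activation instead of the feature axis used by standard LayerNorm, might look like this (NumPy, with a hypothetical function name `timewise_norm`):

```python
import numpy as np

def timewise_norm(x, eps=1e-5):
    """Normalize over the time axis (axis=1) of a (batch, time, dim) array.

    Standard LayerNorm normalizes over the last (feature) axis; this
    hypothetical variant normalizes each feature channel across time.
    """
    mean = x.mean(axis=1, keepdims=True)
    std = x.std(axis=1, keepdims=True)
    return (x - mean) / (std + eps)

x = np.random.randn(2, 8, 16)   # (batch, time, dim)
y = timewise_norm(x)
# each (batch, feature) slice now has ~zero mean and ~unit scale over time
```

This is only a sketch of the general idea under stated assumptions, not the project's actual implementation.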

chineseglue

Language Understanding Evaluation benchmark for Chinese: datasets, baselines, pre-trained models, corpus, and leaderboard

cut-shortcut

Cut-shortcut: I suspected that GNNs and other models may exploit spurious shortcuts instead of learning a genuinely good representation, so I add randomly projected target information to a projection matrix in the model and minimize the similarity between its output and the output of the model without target information. My experiments show that this works and that the approach is practical to adopt.
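The similarity penalty described above can be sketched in NumPy. All names here (`cosine_sim`, the projection shapes) are hypothetical illustrations of the stated idea, not the project's real code:

```python
import numpy as np

def cosine_sim(a, b, eps=1e-8):
    """Row-wise cosine similarity between two (batch, dim) arrays."""
    a = a / (np.linalg.norm(a, axis=-1, keepdims=True) + eps)
    b = b / (np.linalg.norm(b, axis=-1, keepdims=True) + eps)
    return (a * b).sum(axis=-1)

rng = np.random.default_rng(0)
h = rng.normal(size=(4, 32))           # output without target information
targets = rng.normal(size=(4, 8))      # target information
proj = rng.normal(size=(8, 32))        # random projection of the target
h_with_target = h + targets @ proj     # output with injected target info

# Regularizer: penalize similarity between the two outputs, discouraging
# the model from leaning on the injected target "shortcut".
penalty = cosine_sim(h_with_target, h).mean()
```

In training, `penalty` would be added to the loss so that gradient descent pushes the two outputs apart.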

deepspeed-compress-comm

DeepSpeed-Compress-comm uses an inverse FFT and a new kind of diffusion training to compress the tensors exchanged in all_reduce during multi-GPU inference, accelerating it.
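The diffusion-training half of this is not described in enough detail to illustrate, but the FFT half of the idea, transmitting only low-frequency coefficients and reconstructing with an inverse FFT, can be sketched in NumPy (function names are hypothetical):

```python
import numpy as np

def fft_compress(x, keep_ratio=0.25):
    """Keep only the lowest-frequency rfft coefficients of a 1-D tensor."""
    coeffs = np.fft.rfft(x)
    k = max(1, int(len(coeffs) * keep_ratio))
    return coeffs[:k], len(x)

def fft_decompress(coeffs, n):
    """Zero-pad the missing high frequencies and invert the FFT."""
    full = np.zeros(n // 2 + 1, dtype=complex)
    full[:len(coeffs)] = coeffs
    return np.fft.irfft(full, n=n)

# A smooth signal compresses well: its energy sits in low-frequency bins.
t = np.linspace(0, 1, 256, endpoint=False)
x = np.sin(2 * np.pi * 3 * t) + 0.5 * np.cos(2 * np.pi * 5 * t)
comp, n = fft_compress(x, keep_ratio=0.25)   # 4x fewer coefficients sent
x_hat = fft_decompress(comp, n)
err = np.abs(x - x_hat).max()
```

In an all_reduce setting, each GPU would send `comp` instead of the full tensor; the reconstruction error depends on how much of the tensor's energy is concentrated in low frequencies.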

gfnet-pytorch

A general framework for inferring CNNs efficiently. Reduce the inference latency of MobileNet-V3 by 20% on an iPhone XS Max without sacrificing accuracy.

starttransformer_0

🌱 StartTransformer is a new transformer architecture built with time-wise normalization and a new scheme for allocating FFN parameters, so that a transformer-style model can be trained stably with far fewer parameters. The underlying idea can also be applied when developing many other structures.

up-downformer

Up-DownFormer: this transformer architecture is essentially a newly designed GNN introduced in this work. I tested the GNN variant on standard GNN benchmarks, where it achieved superior results, and the full new transformer architecture on an NLP task, where it matched conventional all-self-attention models at much lower computational cost.

vllm-compress-comm

vllm-compress-comm uses an inverse FFT and a new training strategy for a new kind of diffusion model to compress the tensors transferred among GPUs, accelerating multi-GPU inference.
