paper lists
inference
train
PS (parameter server)
parallel training
- PipeDream: Generalized Pipeline Parallelism for DNN Training
- https://insujang.github.io/2022-06-11/parallelism-in-distributed-deep-learning/
communication
- Compressed Communication for Distributed Deep Learning: Survey and Quantitative Evaluation
- Efficient Sparse Collective Communication and its application to Accelerate Distributed Deep Learning
quantization
- Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
- https://blog.csdn.net/qq_19784349/article/details/82883271
$r=S(q-Z)$ => $q=round(\frac{r}{S}+Z)$; S - scale, Z - zero-point
$\Large{S=\frac{val_{max}-val_{min}}{2^{bit\_length}-1}}$
$\Large{Z=round(-\frac{val_{min}}{S})}$
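A minimal NumPy sketch of these formulas, i.e. asymmetric (affine) quantization in the style of the Jacob et al. paper above. The helper names, the clamping to the representable range, and the widening of `[val_min, val_max]` to include 0 (so that r = 0 maps exactly to an integer) are my own assumptions, not from the paper:

```python
import numpy as np

def quantization_params(val_min: float, val_max: float, bit_length: int = 8):
    """Compute scale S and zero-point Z for r = S * (q - Z)."""
    # Widen the real range to include 0 so r = 0 is exactly representable.
    val_min, val_max = min(val_min, 0.0), max(val_max, 0.0)
    q_max = 2 ** bit_length - 1            # e.g. 255 for 8-bit
    S = (val_max - val_min) / q_max        # scale
    Z = round(-val_min / S)                # zero-point: the q that encodes r = 0
    return S, Z

def quantize(r: np.ndarray, S: float, Z: int, bit_length: int = 8) -> np.ndarray:
    """q = round(r / S + Z), clipped to the representable integer range."""
    q = np.round(r / S + Z)
    return np.clip(q, 0, 2 ** bit_length - 1).astype(np.uint8)

def dequantize(q: np.ndarray, S: float, Z: int) -> np.ndarray:
    """r = S * (q - Z)."""
    return S * (q.astype(np.float32) - Z)

# usage: round-trip a small tensor
r = np.array([-1.0, 0.0, 0.5, 2.0], dtype=np.float32)
S, Z = quantization_params(r.min(), r.max())
q = quantize(r, S, Z)
print(q)                    # e.g. [  0  85 128 255]
print(dequantize(q, S, Z))  # approximately recovers r; r = 0 is exact
```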
llm
vla
title | summary | link |
---|---|---|
Galaxea Open-World Dataset and G0 Dual-System VLA Model | | |
rl
title | summary | link |
---|---|---|