Fast Block Sparse Matrices for PyTorch
Enables networks that are both smaller and faster, letting anybody use neural networks in production at low cost, and improving the experience for the end ...
sparsity pytorch gpu efficiency
Movement Pruning: Adaptive Sparsity by Fine-Tuning
We propose the use of movement pruning, a simple, deterministic first-order weight pruning method that is more adaptive to pretrained model fine-tuning.
pruning movement-pruning sparsity adaptive-sparsity
