Compressing Neural Networks with the Hashing Trick (HashedNets)


As deep nets are increasingly used in applications suited for mobile devices, a fundamental dilemma becomes apparent: the trend in deep learning is to grow models to absorb ever-increasing data set sizes; however, mobile devices are designed with very little memory and cannot store such large models.

We propose HashedNets, a novel network architecture to reduce and limit the memory overhead of neural networks. Our approach is compellingly simple: we use a hash function to group network connections into hash buckets uniformly at random, such that all connections grouped into the i-th hash bucket share the same weight value w_i. Our parameter hashing is akin to prior work in feature hashing and requires no additional memory overhead. The backpropagation algorithm naturally tunes the hash bucket parameters while taking the random weight sharing within the network architecture into account.
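To make the idea concrete, below is a minimal NumPy sketch of a hashed fully connected layer; it is not the released Torch code, and the hash function (MD5-based `bucket`), the layer name `HashedLinear`, and all parameter names are illustrative assumptions. The sketch omits the sign hash used in the paper and precomputes the bucket indices for readability, whereas in practice the hash is evaluated on the fly to preserve the memory savings.

```python
import hashlib
import numpy as np

def bucket(i, j, K, seed=0):
    """Illustrative hash: map a (row, column) connection to one of K buckets."""
    key = f"{seed}:{i}:{j}".encode()
    return int(hashlib.md5(key).hexdigest(), 16) % K

class HashedLinear:
    """Fully connected layer whose virtual weight matrix is backed by K real weights."""
    def __init__(self, n_in, n_out, K, seed=0):
        self.K = K
        self.w = 0.1 * np.random.randn(K)          # the only stored weights
        # Bucket index of every virtual connection (precomputed here for clarity).
        self.idx = np.array([[bucket(i, j, K, seed) for j in range(n_in)]
                             for i in range(n_out)])

    def virtual_weights(self):
        # Expand the K shared parameters into the full (n_out, n_in) matrix.
        return self.w[self.idx]

    def forward(self, x):
        return x @ self.virtual_weights().T

    def backward(self, x, grad_out):
        # Gradient w.r.t. the virtual matrix, then accumulated per bucket:
        # standard backprop tunes the shared hash-bucket parameters directly.
        grad_W = grad_out.T @ x
        grad_w = np.zeros(self.K)
        np.add.at(grad_w, self.idx, grad_W)
        return grad_w

layer = HashedLinear(n_in=256, n_out=64, K=512)    # 512 stored weights instead of 16,384
y = layer.forward(np.random.randn(8, 256))
```

Because the hash is deterministic, no index structure needs to be stored alongside the K shared weights, which is what keeps the memory overhead at zero beyond the weights themselves.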

Torch Code Download

For any questions or bug reports, please contact Yixin Chen.

Reference

W. Chen, J. Wilson, S. Tyree, K. Weinberger and Y. Chen, Compressing Neural Networks with the Hashing Trick, Proc. International Conference on Machine Learning (ICML-15), Lille, France, 2015. (PDF)

Acknowledgement

Wenlin and Yixin are supported by NSF grants CCF-1215302, IIS-1343896, DBI-1356669, CNS-1320921, and a Microsoft Research New Faculty Fellowship. Kilian is supported by NSF grants IIA-1355406, IIS-1149882, and EFRI-1137211. The authors thank Wenlin Wang for many helpful discussions.