Compressing Neural Networks with the Hashing Trick (HashedNets)


As deep nets are increasingly used in applications suited for mobile devices, a fundamental dilemma becomes apparent: the trend in deep learning is to grow models to absorb ever-increasing dataset sizes; however, mobile devices are designed with very little memory and cannot store such large models.

We propose HashedNets, a novel network architecture that reduces and limits the memory overhead of neural networks. Our approach is compellingly simple: we use a hash function to group network connections into hash buckets uniformly at random, such that all connections grouped into the i-th hash bucket share the same weight value w_i. Our parameter hashing is akin to prior work on feature hashing and requires no additional memory overhead. The backpropagation algorithm can naturally tune the hash bucket parameters and takes the random weight sharing within the network architecture into account.
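The idea above can be sketched in a few lines of NumPy. This is an illustrative toy, not the authors' Torch implementation: the class name `HashedLinear`, the `bucket` helper, and the choice of hash function are all assumptions for exposition. A "virtual" weight matrix of size m × n is backed by only K real parameters; each virtual entry is mapped to one of the K buckets by a hash of its indices, and gradients for all entries in a bucket accumulate into that bucket's single shared weight.

```python
import numpy as np

def bucket(i, j, K, seed=0):
    # Hypothetical hash function: any uniform hash of the index pair (i, j)
    # works; here we hash an integer tuple and reduce modulo K.
    return hash((seed, i, j)) % K

class HashedLinear:
    """Toy linear layer whose m*n virtual weights share only K parameters."""

    def __init__(self, m, n, K, seed=0):
        self.m, self.n, self.K = m, n, K
        # Precompute the bucket index of every virtual weight V[i, j].
        self.idx = np.array([[bucket(i, j, K, seed) for j in range(n)]
                             for i in range(m)])
        self.w = np.random.randn(K) * 0.01  # the K shared parameters

    def weight(self):
        # Expand the shared parameters into the virtual weight matrix:
        # V[i, j] = w[h(i, j)].
        return self.w[self.idx]             # shape (m, n)

    def forward(self, x):
        # Standard linear layer using the virtual weights; x has shape (n,).
        return self.weight() @ x

    def backward(self, x, grad_out):
        # Backprop respects the weight sharing: the gradient of shared
        # parameter w_k is the sum of gradients of all virtual weights
        # hashed into bucket k.
        grad_V = np.outer(grad_out, x)      # gradient w.r.t. V, shape (m, n)
        grad_w = np.zeros(self.K)
        np.add.at(grad_w, self.idx, grad_V)
        return grad_w
```

Note that the bucket indices are never stored in a trained model's parameters; they can be recomputed from the hash function on the fly, which is why the scheme adds no memory overhead beyond the K shared weights.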

Torch Code Download

For any questions or bug reports, please contact Wenlin Chen (wenlinchen AT


W. Chen, J. Wilson, S. Tyree, K. Weinberger and Y. Chen, Compressing Neural Networks with the Hashing Trick, Proc. International Conference on Machine Learning (ICML-15), Lille, France. (PDF)


Wenlin and Yixin are supported by NSF grants CCF-1215302, IIS-1343896, DBI-1356669, CNS-1320921, and a Microsoft Research New Faculty Fellowship. Kilian is supported by NSF grants IIA-1355406, IIS-1149882, EFRI-1137211. The authors thank Wenlin Wang for many helpful discussions.