Grouped Pointwise Convolutions Reduce Parameters in Convolutional Neural Networks
Abstract
In deep convolutional neural networks (DCNNs), the number of parameters in a pointwise convolution grows rapidly because it is the product of the number of filters and the number of input channels coming from the previous layer. Our proposal makes pointwise convolutions parameter-efficient by grouping filters into parallel branches (groups), where each branch processes only a fraction of the input channels. Grouping alone, however, degrades the learning capability of the DCNN. To avoid this effect, we suggest interleaving the outputs of filters from different branches at the intermediate layers of consecutive pointwise convolutions. We applied this improvement to the EfficientNet, DenseNet-BC L100, MobileNet and MobileNet V3 Large architectures, and trained them on the CIFAR-10, CIFAR-100, Cropped-PlantDoc and Oxford-IIIT Pet datasets. When training from scratch, we obtained test accuracies similar to those of the original EfficientNet and MobileNet V3 Large architectures while saving up to 90% of the parameters and 63% of the FLOPs.
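The idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the tensor layout (height, width, channels) and the specific interleaving order are assumptions made for illustration, and biases are ignored. The sketch counts the weights of a pointwise convolution with and without groups, applies a grouped pointwise convolution, and reorders channels so that each branch of a following grouped layer receives channels from every previous branch.

```python
import numpy as np

def pointwise_params(in_channels, filters, groups=1):
    """Weight count (bias ignored) of a 1x1 convolution split into `groups` branches."""
    if in_channels % groups or filters % groups:
        raise ValueError("groups must divide both in_channels and filters")
    # Each branch maps in_channels/groups inputs to filters/groups outputs.
    return (in_channels // groups) * (filters // groups) * groups

def grouped_pointwise(x, weights):
    """Apply a grouped 1x1 convolution.

    x:       array of shape (H, W, C)
    weights: array of shape (groups, C // groups, F // groups)
    """
    groups = weights.shape[0]
    branches = np.split(x, groups, axis=-1)            # one channel slice per branch
    outputs = [b @ w for b, w in zip(branches, weights)]
    return np.concatenate(outputs, axis=-1)            # shape (H, W, F)

def interleave(x, groups):
    """Reorder channels so consecutive channels come from different branches.

    This is one possible interleaving (hypothetical choice for this sketch).
    """
    h, w, c = x.shape
    return x.reshape(h, w, groups, c // groups).swapaxes(-1, -2).reshape(h, w, c)

if __name__ == "__main__":
    print(pointwise_params(256, 256))            # ungrouped
    print(pointwise_params(256, 256, groups=4))  # four branches
    x = np.arange(8, dtype=float).reshape(1, 1, 8)
    print(interleave(x, groups=2).ravel())
```

Under these assumptions, a layer with 256 input channels and 256 filters has 65,536 weights ungrouped and 16,384 with four groups. With two groups and eight channels, the interleaving above produces the channel order 0, 4, 1, 5, 2, 6, 3, 7, so both halves seen by a following two-group layer contain channels from both earlier branches.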
Copyright (c) 2022 MENDEL
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
MENDEL open access articles are normally published under a Creative Commons Attribution-NonCommercial-ShareAlike (CC BY-NC-SA 4.0) license: https://creativecommons.org/licenses/by-nc-sa/4.0/ . Under this license, third-party reuse is permitted only for non-commercial purposes. Users may share, copy, and redistribute the material in any medium or format, and adapt, remix, transform, and build upon it, for non-commercial purposes. Reuse requires appropriate attribution to the source of the material, a link to the license, and an indication of any changes made to the original material.