Grouped Pointwise Convolutions Reduce Parameters in Convolutional Neural Networks

Joao Paulo Schwarz Schuler; Santiago Romani; Mohamed Abdel-Nasser; Hatem Rashwan; Domenec Puig

doi:10.13164/mendel.2022.1.023

Joao Paulo Schwarz Schuler Universitat Rovira i Virgili, Spain https://orcid.org/0000-0002-7582-0711
Santiago Romani Universitat Rovira i Virgili, Spain
Mohamed Abdel-Nasser Universitat Rovira i Virgili, Spain
Hatem Rashwan Universitat Rovira i Virgili, Spain
Domenec Puig Universitat Rovira i Virgili, Spain

Keywords: EfficientNet, Deep Learning, Computer Vision, CNN, DCNN

Abstract

In deep convolutional neural networks (DCNNs), the number of parameters in pointwise convolutions grows rapidly because it is the product of the number of filters and the number of input channels coming from the previous layer. Our proposal makes pointwise convolutions parameter-efficient by grouping filters into parallel branches (groups), where each branch processes a fraction of the input channels. However, doing so degrades the learning capability of the DCNN. To avoid this effect, we suggest interleaving the output of filters from different branches at intermediate layers of consecutive pointwise convolutions. We applied our improvement to the EfficientNet, DenseNet-BC L100, MobileNet and MobileNet V3 Large architectures, and trained them on the CIFAR-10, CIFAR-100, Cropped-PlantDoc and Oxford-IIIT Pet datasets. When training from scratch, we obtained test accuracies similar to those of the original EfficientNet and MobileNet V3 Large architectures while saving up to 90% of the parameters and 63% of the FLOPs.
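The parameter argument in the abstract can be illustrated with a small sketch. The function names `pointwise_params` and `interleave` and the layer sizes are hypothetical and chosen only for illustration. The permutation shown is one simple channel-interleaving scheme, assumed here for clarity, and is not necessarily the exact layout used in the paper. Bias terms are ignored.

```python
def pointwise_params(in_channels, filters, groups=1):
    """Weight count of a 1x1 (pointwise) convolution, bias ignored.

    With groups > 1, each group connects in_channels/groups inputs
    to filters/groups outputs, so the total shrinks by a factor of groups.
    """
    if in_channels % groups or filters % groups:
        raise ValueError("groups must divide both in_channels and filters")
    return groups * (in_channels // groups) * (filters // groups)


def interleave(channels, groups):
    """Reorder channels so neighbouring positions come from different groups.

    Assumed scheme (illustrative only): channel i of group g moves to
    position i * groups + g, so a following grouped layer sees channels
    from every branch of the previous one.
    """
    if len(channels) % groups:
        raise ValueError("groups must divide the number of channels")
    per_group = len(channels) // groups
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


# Hypothetical layer: 256 input channels, 512 filters.
dense = pointwise_params(256, 512)              # standard pointwise convolution
grouped = pointwise_params(256, 512, groups=4)  # four parallel branches
mixed = interleave(list(range(8)), groups=2)    # channels 0-3 and 4-7 alternate
```

In this sketch the grouped layer holds one quarter of the weights of the ungrouped one, which is the per-layer saving that grouping provides. The interleaving step costs no parameters, since it only permutes channels between two consecutive grouped pointwise convolutions.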
