Heuristic Approach to Multivariate Inverse Prediction Problem using Data Reconciliation
Abstract
Some engineering tasks in waste management require complete data sets on waste production. In most cases, however, such data sets are not available, either because the data are not archived at all or because they are withheld as sensitive. This article deals with incomplete data sets at the micro-regional level. The estimates draw on data from higher territorial units and on additional information from the micro-region. The techniques are illustrated by an example from waste management: estimating the amount of waste produced in individual municipalities. The estimate is based on waste production recorded at the district level and on total waste management costs, which are available at the municipal level. To estimate waste production, combinations of linear regression models and random forest models were used, followed by a correction using quadratic and nonlinear optimization models. Such a task can be seen as a multivariate version of the inverse prediction (or calibration) problem, which is not solvable analytically. The approach was tested on data for 2010–2015 measured in the Czech Republic.
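The correction step can be illustrated with a minimal sketch of data reconciliation. It is not the authors' model: it assumes that preliminary municipal estimates (for example from a regression model) are already available, and it covers only the simplest quadratic case with one equality constraint, namely that the municipal values must sum to the recorded district total. The function name, the variance weights, and all numbers are hypothetical, and non-negativity of the results is not enforced.

```python
import numpy as np

def reconcile_to_total(estimates, district_total, variances=None):
    """Adjust municipal estimates so that they sum to the district total.

    Solves  min sum_i (x_i - estimates_i)^2 / variances_i
            s.t. sum_i x_i = district_total
    using the closed-form Lagrange-multiplier solution. Estimates with a
    larger assumed variance absorb a larger share of the correction.
    """
    estimates = np.asarray(estimates, dtype=float)
    if variances is None:
        # Assumption: equal uncertainty for every municipality.
        variances = np.ones_like(estimates)
    variances = np.asarray(variances, dtype=float)
    residual = district_total - estimates.sum()
    return estimates + residual * variances / variances.sum()

# Hypothetical preliminary estimates for three municipalities (tonnes/year)
municipal_estimates = [120.0, 80.0, 40.0]
# Hypothetical recorded production of the whole district (tonnes/year)
district_total = 252.0

reconciled = reconcile_to_total(
    municipal_estimates, district_total, variances=[4.0, 1.0, 1.0]
)
print(reconciled)        # reconciled municipal values
print(reconciled.sum())  # matches district_total up to rounding
```

In this sketch the first municipality, which has the largest assumed variance, receives the largest adjustment. Adding bounds or nonlinear relations would require a numerical solver in place of the closed-form expression.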
MENDEL open access articles are normally published under a Creative Commons Attribution-NonCommercial-ShareAlike license (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/ . Under this license, third-party reuse is permitted only for non-commercial purposes. Users may share, copy, and redistribute the material in any medium or format, and adapt, remix, transform, and build upon it, for non-commercial purposes. Reuse requires appropriate attribution to the source of the material, a link to the license, and an indication of any changes made to the original material.