Sparse regression interaction models for spatial prediction of soil properties in 3D
2018
Authors
Pejović, Milutin
Nikolić, Mladen
Heuvelink, Gerard B. M.
Hengl, Tomislav
Kilibarda, Milan
Bajat, Branislav

Article (Published version)

Abstract
An approach for using lasso (Least Absolute Shrinkage and Selection Operator) regression in creating sparse 3D models of soil properties for spatial prediction at multiple depths is presented. Modeling soil properties in 3D benefits from interactions of spatial predictors with soil depth and its polynomial expansion, which yields a large number of model variables (and corresponding model parameters). Lasso is able to perform variable selection, hence reducing the number of model parameters and making the model more easily interpretable. This also prevents overfitting, which makes the model more accurate. The presented approach was tested using four variable selection approaches (none, stepwise, lasso and hierarchical lasso) on four kinds of models: standard linear model, linear model with polynomial expansion of depth, linear model with interactions of covariates with depth, and linear model with interactions of covariates with depth and its polynomial expansion. This framework was used to predict Soil Organic Carbon (SOC) in three contrasting study areas: Bor (Serbia), Edgeroi (Australia) and the Netherlands. Results show that lasso yields substantial improvements in accuracy over standard and stepwise regression, up to 50 % of total variance. It yields models which contain up to five times fewer nonzero parameters than the full models and which are usually sparser than models obtained by stepwise regression, by up to a factor of three. Extension of the standard linear model by including interactions typically improves the accuracy of models produced by lasso, but is detrimental to standard and stepwise regression. Regarding computation time, it was demonstrated that lasso is several orders of magnitude more efficient than stepwise regression for models with tens or hundreds of variables (including interactions). Proper model evaluation is emphasized. Because lasso requires meta-parameter tuning, standard cross-validation does not suffice for adequate model evaluation, hence a nested cross-validation was employed. The presented approach is implemented as the publicly available sparsereg3D R package.
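To illustrate why interactions with depth and its polynomial expansion inflate the number of model variables, the following is a minimal Python sketch, not the sparsereg3D implementation. The function name `expand_features`, the covariate names, and the polynomial degree of 3 are assumptions chosen for illustration only.

```python
from itertools import product


def expand_features(covariates, depth, degree=3):
    """Build one row of a 3D design matrix (illustrative sketch).

    covariates: dict mapping a covariate name to its value at one location
    depth:      sampling depth for this observation
    degree:     polynomial degree of the depth expansion (assumed value)
    """
    # Main effects of the spatial covariates.
    row = dict(covariates)

    # Polynomial expansion of depth: depth^1 ... depth^degree.
    depth_terms = {f"depth^{k}": depth ** k for k in range(1, degree + 1)}
    row.update(depth_terms)

    # Interaction of every covariate with every depth term.
    for (cname, cval), (dname, dval) in product(
        covariates.items(), depth_terms.items()
    ):
        row[f"{cname}:{dname}"] = cval * dval
    return row


# Hypothetical covariates at a single location, sampled at depth 0.3.
row = expand_features({"elevation": 2.0, "slope": 0.5}, depth=0.3, degree=3)
print(len(row))  # 2 covariates + 3 depth terms + 2 * 3 interactions = 11
```

With p covariates and a depth polynomial of degree d, this expansion produces p + d + p·d columns, so the variable count grows quickly with the number of covariates. This growth is what makes a variable selection method such as lasso attractive for these models.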
Keywords:
Spatial prediction / Lasso / Interactions / Nested cross-validation / Soil organic carbon / 3D
Source:
Computers & Geosciences, 2018, 118, 1-13
Publisher:
- Elsevier Ltd
Funding / projects:
- The role and implementation of the national spatial plan and regional development documents in renewal of strategic research, thinking and governance in Serbia (RS-47014)
- Spatial, environmental, energy and social aspects of developing settlements and climate change - mutual impacts (RS-36035)
- The application of GNSS and LIDAR technology for infrastructure facilities and terrain stability monitoring (RS-36009)
- Automated Reasoning and Data Mining (RS-174021)
DOI: 10.1016/j.cageo.2018.05.008
ISSN: 0098-3004
WoS: 000441857000001
Scopus: 2-s2.0-85047098366
Institution/Community
GraFar
URL: https://grafar.grf.bg.ac.rs/handle/123456789/943
Pejović, M., Nikolić, M., Heuvelink, G. B. M., Hengl, T., Kilibarda, M., & Bajat, B. (2018). Sparse regression interaction models for spatial prediction of soil properties in 3D. Computers & Geosciences, 118, 1-13. https://doi.org/10.1016/j.cageo.2018.05.008
Pejović M, Nikolić M, Heuvelink GBM, Hengl T, Kilibarda M, Bajat B. Sparse regression interaction models for spatial prediction of soil properties in 3D. Computers & Geosciences. 2018;118:1-13. doi:10.1016/j.cageo.2018.05.008
Pejović, Milutin, Nikolić, Mladen, Heuvelink, Gerard B. M., Hengl, Tomislav, Kilibarda, Milan, Bajat, Branislav, "Sparse regression interaction models for spatial prediction of soil properties in 3D" in Computers & Geosciences, 118 (2018): 1-13, https://doi.org/10.1016/j.cageo.2018.05.008