Abstract |
Machine learning has been widely used for mapping and modeling tasks that take remote sensing data as input variables. For high-dimensional data, various feature selection methods have been developed and incorporated into machine learning frameworks to keep computation efficient. This research assesses the accuracy of several combinations of feature selection and machine learning methods for estimating forest height from AISA (airborne imaging spectrometer for applications) hyperspectral bands (479 bands) and airborne light detection and ranging (Lidar) height metrics (36 metrics). Feature selection and dimensionality reduction using Boruta (BO), principal component analysis (PCA), simulated annealing (SA), and genetic algorithm (GA) were evaluated in combination with the following machine learning algorithms: multivariate adaptive regression splines (MARS), random forests (RF), support vector regression (SVR) with a radial basis function kernel, and extreme gradient boosting (XGB) with tree (XGBtree and XGBdart) and linear (XGBlin) boosters. The results demonstrated that the BO-XGBdart and BO-SVR combinations delivered the best model performance for estimating forest height from integrated Lidar and hyperspectral data, with R² = 0.53 and RMSE = 1.7 m for BO-XGBdart and R² = 0.51 and RMSE = 1.8 m for BO-SVR. Our study also demonstrated the effectiveness of BO for variable selection: it reduced the number of variables by about 95%, selecting 29 important variables from the initial 516 variables derived from Lidar metrics and hyperspectral data. |