Evaluating the Predictive Power of Random Forest Regression on Economic Growth
Keywords:
Random Forest Regression, Logarithm of Gross Domestic Product, Mean Absolute Error, R-squared, Economic Growth, Ordinary Least Squares
Abstract
This paper examines the potential of the Random Forest regression model to predict the Logarithm of Gross Domestic Product (LGDP) as a tool for economic forecasting. The model achieved high predictive accuracy, with a Mean Squared Error (MSE) of 0.0045, a Root Mean Squared Error (RMSE) of 0.0674, and a Mean Absolute Error (MAE) of 0.0565. Its R-squared value of 0.9430 indicates that the model explains 94.30 per cent of the variation in LGDP. To provide a benchmark, the Random Forest model is compared with traditional Ordinary Least Squares (OLS) regression, which shows higher errors (MSE: 0.0089, RMSE: 0.0943, MAE: 0.0782) and a lower R-squared of 0.8975, indicating that the Random Forest approach predicts LGDP more accurately on this data. The Labour Growth Rate (LAGR) emerges as the most influential predictor of economic growth, followed by the Service Sector Index (LSER) and the Industrial Output Index (LIND). Agriculture has also long been a major determinant of economic performance, which underlines the significance of the agricultural sector in national economic outcomes. These findings suggest that policymakers should consider targeted investments and supportive policies in agriculture to promote environmentally sustainable economic growth. The Random Forest regression model thus proves to be an effective and widely applicable tool for economic forecasting that can inform policy planning and decision-making. By leveraging the method’s capacity to handle complex, non-linear relationships in high-dimensional data, the study offers decision-makers a framework for managing economic uncertainty and fostering long-term growth.
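For readers less familiar with the four evaluation metrics quoted above, the following is a minimal sketch of how they are conventionally defined, written in plain Python with no external libraries. The function name `regression_metrics` and the sample values in `y_true` and `y_pred` are hypothetical and chosen only for illustration; they are not the study's data or code. The final lines check only that the RMSE figures reported in the abstract are consistent with the reported MSE figures, since RMSE is by definition the square root of MSE.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R-squared for two equal-length sequences."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(r * r for r in residuals) / n      # mean of squared residuals
    mae = sum(abs(r) for r in residuals) / n     # mean of absolute residuals

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    ss_res = mse * n                                    # residual sum of squares
    r2 = 1.0 - ss_res / ss_tot  # undefined when y_true is constant (ss_tot == 0)

    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "R2": r2}


# Hypothetical LGDP observations and predictions, for illustration only.
y_true = [10.0, 10.2, 10.4, 10.6, 10.8]
y_pred = [10.1, 10.1, 10.4, 10.7, 10.8]
metrics = regression_metrics(y_true, y_pred)

# Consistency check on the (MSE, RMSE) pairs quoted in the abstract:
# the gap between sqrt(MSE) and the reported RMSE should be at rounding level.
reported = {"Random Forest": (0.0045, 0.0674), "OLS": (0.0089, 0.0943)}
rmse_gap = {
    name: abs(math.sqrt(mse) - rmse) for name, (mse, rmse) in reported.items()
}
```

Under these definitions, the small gaps in `rmse_gap` are what one would expect if the MSE values were rounded to four decimal places before being reported. An R-squared of 0.9430 corresponds to a residual sum of squares equal to 5.70 per cent of the total sum of squares.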
