Document Type: Research Article

Author

Faculty of Literature and Humanities, Hakim Sabzevari University, Sabzevar, Iran

Abstract

Prediction and explanation are the two fundamental objectives of statistical modeling. When the primary goal is forecasting, the uncertainty inherent in estimating unknown outcomes must be accounted for. Traditionally, confidence intervals built from the standard deviation have served as the formal means of quantifying this uncertainty and of assessing how close predicted values lie to their true counterparts. This approach implicitly aims to capture the behavioral similarity between observed and estimated values. Recent advances in similarity-based approaches offer promising alternatives to conventional variance-based techniques, particularly for large datasets or a large number of explanatory variables. This study investigates which methods, traditional or similarity-based, produce narrower confidence intervals under comparable conditions and therefore yield more precise and informative intervals. The dataset covers 42 U.S. mega-cap companies. Because the number of features is high, the predictors are frequently interdependent (multicollinear), so Ridge Regression is applied to address this issue. The findings indicate that the σ-based method and LCSS (Longest Common Subsequence) achieve the highest coverage among the methods analyzed, but at the cost of wider intervals. Conversely, DTW (Dynamic Time Warping), Hausdorff, and TWED (Time Warp Edit Distance) deliver narrower intervals, making them the most precise methods, although their coverage rates are moderate. This trade-off between interval width and coverage underscores the need for context-aware decisions when selecting similarity-based methods for confidence interval estimation in time series analysis.
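
The two quantities compared in the abstract, coverage and interval width, can be illustrated with a minimal sketch. It assumes a closed-form ridge fit without an intercept and a symmetric σ-based interval of the form prediction ± z·σ, where σ is the standard deviation of the training residuals. The function names (`ridge_fit`, `sigma_interval`, `coverage_and_width`), the penalty `lam`, the multiplier `z = 1.96`, and the synthetic data are all hypothetical choices made for illustration; they are not taken from the study. The similarity-based intervals are not sketched, because the abstract does not state how they are constructed.

```python
import numpy as np


def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge coefficients (X'X + lam*I)^-1 X'y, no intercept."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)


def sigma_interval(y_pred, residuals, z=1.96):
    """Symmetric interval y_pred +/- z * sample std of the residuals."""
    half_width = z * np.std(residuals, ddof=1)
    return y_pred - half_width, y_pred + half_width


def coverage_and_width(y_true, lower, upper):
    """Share of true values inside the interval, and the mean interval width."""
    inside = (y_true >= lower) & (y_true <= upper)
    return inside.mean(), np.mean(upper - lower)


# Synthetic data (an assumption): five predictors, two of them nearly collinear.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 4] = X[:, 3] + 0.01 * rng.normal(size=200)
y = X @ np.array([1.0, -2.0, 0.5, 1.0, 1.0]) + rng.normal(scale=0.5, size=200)

# Fit on the first 150 rows, evaluate the interval on the remaining 50.
X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

beta = ridge_fit(X_train, y_train, lam=1.0)
residuals = y_train - X_train @ beta
lower, upper = sigma_interval(X_test @ beta, residuals)
coverage, width = coverage_and_width(y_test, lower, upper)

print(f"coverage = {coverage:.2f}, mean width = {width:.2f}")
```

In this sketch a larger `z` widens the interval and can only raise coverage, which is the same width-versus-coverage trade-off the abstract describes across methods.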

Keywords
