Analysis of Contributing Factors influencing Pedestrian Crash Severity: a case study in Rome, Italy

Giuseppe Cappelli; Sofia Nardoianni; Mauro D'Apuzzo
2025-01-01

Abstract

Pedestrian crashes are a serious public health and economic issue, and analyzing the main factors that contribute to a fatal outcome can support a proactive approach to safety. In the current literature, crash severity is predicted primarily with econometric and machine learning methods. Using pedestrian crash data from the city of Rome, Italy, this study trains and tests five models: Logistic Regression (LR), K-Nearest Neighbour (KNN), Decision Tree (DT), Random Forest (RF), and Support Vector Machine (SVM) with a radial kernel. To check model stability and generalization, thirty (30) random samples are created, and prediction performance is evaluated by comparing the mean F1-score. Gini Variable Importance and SHAP analysis are then performed on the model with the best F1-score, the Random Forest (0.9816). Pedestrian age, gender, and behaviour, hour of the day, season of the year, vehicle type, and location emerge as the most important contributing factors.
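The evaluation protocol described in the abstract (thirty random samples, compared by mean F1-score) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the synthetic two-feature data, the 70/30 train/test split, the reading of "random samples" as repeated random splits, and the small hand-written k-nearest-neighbour classifier standing in for the five models. The helper names (`make_toy_data`, `knn_predict`, `repeated_f1`) are hypothetical.

```python
import random
from statistics import mean


def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def make_toy_data(n, rng):
    """Synthetic stand-in data (assumption): label is 1 when x0 + x1 > 1."""
    rows = []
    for _ in range(n):
        x = (rng.random(), rng.random())
        rows.append((x, 1 if x[0] + x[1] > 1.0 else 0))
    return rows


def knn_predict(train, x, k=3):
    """Majority vote among the k training points closest to x."""
    def dist2(row):
        return (row[0][0] - x[0]) ** 2 + (row[0][1] - x[1]) ** 2
    nearest = sorted(train, key=dist2)[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0


def repeated_f1(data, n_repeats=30, test_fraction=0.3, seed=0):
    """Score the model on n_repeats independent random train/test splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_repeats):
        shuffled = data[:]
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * (1 - test_fraction))
        train, test = shuffled[:cut], shuffled[cut:]
        y_true = [label for _, label in test]
        y_pred = [knn_predict(train, x) for x, _ in test]
        scores.append(f1_score(y_true, y_pred))
    return scores


data = make_toy_data(200, random.Random(42))
scores = repeated_f1(data)
print(f"mean F1 over {len(scores)} random samples: {mean(scores):.4f}")
```

Averaging over many random samples, rather than relying on a single split, reduces the chance that one favourable or unfavourable partition drives the model ranking. In a real analysis the same loop would be run once per candidate model and the mean scores compared.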
Use this identifier to cite or link to this document: https://hdl.handle.net/11580/123334