- Research
- Open access
Interpretable machine learning model for outcome prediction in patients with aneurysmatic subarachnoid hemorrhage
Critical Care volume 29, Article number: 36 (2025)
Abstract
Background
Aneurysmatic subarachnoid hemorrhage (aSAH) is a critical condition associated with significant mortality rates and complex rehabilitation challenges. Early prediction of functional outcomes is essential for optimizing treatment strategies.
Methods
A multicenter study was conducted using data collected from 718 patients with aSAH who were treated at five hospitals in Japan. A deep learning model was developed to predict outcomes based on modified Rankin Scale scores using pretherapy clinical data collected from admission to the initiation of physical therapy. The model’s performance was assessed using the area under the curve, and interpretability was enhanced using SHapley Additive exPlanations (SHAP). Logistic regression analysis was also performed for further validation.
Results
The area under the receiver operating characteristic curve of the model was 0.90, with age, World Federation of Neurosurgical Societies grade, and higher brain dysfunction identified as key predictors. SHAP analysis supported the importance of these features in the prediction model, and logistic regression analysis further confirmed the model’s robustness.
Conclusions
The novel deep learning model demonstrated strong predictive performance in determining functional outcomes in patients with aSAH, making it a valuable tool for guiding early rehabilitation strategies.
Background
Aneurysmal subarachnoid hemorrhage (aSAH) remains one of the most severe types of stroke. It has a high mortality rate, with one-quarter of patients dying before reaching the hospital or in the emergency room [1]. Advancements in stroke management have improved outcomes in some populations. However, aSAH continues to pose significant challenges, particularly in Japan, which has one of the most aged populations worldwide [2, 3]. Despite these challenges, a subset of patients, including those with severe initial presentations such as grade V aSAH, can achieve favorable outcomes with appropriate treatment [4]. Early initiation of physical therapy improves recovery in patients with aSAH [5,6,7]. However, treatment strategies must be tailored to individual patient needs, as emphasized in existing guidelines and studies [8,9,10]. Accurate outcome prediction in patients with aSAH is essential for guiding personalized care and optimizing resource allocation. Existing tools, such as the SAFIRE classification [11] and SAHIT predictive model [12], have significantly advanced outcome prediction in patients with aSAH. However, these models primarily focus on long-term outcomes, such as 2–3-month functional recovery, or are tailored to specific patient populations. To address the unique needs of patients transitioning from acute care to rehabilitation, standardized, accurate models tailored to this phase of recovery are still required.
Machine learning has advanced significantly in recent years and has been useful for accurately predicting outcomes in several conditions, including stroke [13, 14]. Several studies have applied machine learning to aSAH [15, 16]; however, most models are based on data from single institutions and lack reproducibility. Furthermore, the black-box nature of several machine learning algorithms poses challenges in clinical settings, where the interpretability of complex models is essential for informed decision-making. SHapley Additive exPlanations (SHAP), which is inspired by game theory, offers a promising approach to elucidate the contributions of individual variables in machine learning models. It enables evaluation of how each variable affects predictions [17]. A study of a large patient dataset examined associations between salvage therapies for post-SAH vasospasm, such as balloon angioplasty, intraarterial infusion of vasodilators, and induction of hypertension, and improved outcomes after 3 months [18]. However, no reports have assessed the association between early treatment data and rehabilitation outcomes, which underscores the importance of setting goals at the intervention stage.
The present study aimed to predict discharge outcomes in patients with aSAH using data obtained before the initiation of physical therapy. It focused on short-term recovery during the transition from the ICU to the rehabilitation center. Unlike models that target long-term outcomes, such as 2–3-month functional recovery, this study emphasized immediate outcomes at discharge to guide early clinical decision-making. A deep learning model, specifically a deep neural network, was used to investigate complex, nonlinear associations between baseline patient characteristics and functional outcomes, as measured using the modified Rankin Scale (mRS) [19]. SHAP values were used to interpret the effects and interactions of predictors at both the cohort (global) and individual (local) levels, thereby enhancing the model’s transparency and clinical relevance. By targeting discharge outcomes, this study aimed to support early rehabilitation planning, thereby offering robust, data-driven evidence for critical decision-making during the intervention stage.
Methods
Dataset
This multicenter collaborative study used data from the medical records of five hospitals in Japan and was part of the Safety and Efficacy of Acute Rehabilitation in Subarachnoid Hemorrhage study [20]. This study was performed in accordance with the principles of the Declaration of Helsinki, and it followed the Strengthening the Reporting of Observational Studies in Epidemiology guidelines.
Data were collected from 718 patients diagnosed with subarachnoid hemorrhage who had undergone surgical treatment and physical therapy between April 2014 and March 2019. The exclusion criteria were as follows: (1) patients who died before starting physical therapy, (2) those with recurrent aSAH, (3) those who had undergone reoperation, (4) those whose aneurysm treatment occurred > 72 h after aSAH onset, (5) those aged < 20 years, and (6) those with an mRS score of 2–5 before aSAH onset. For excluded patients, follow-up data were not systematically recorded. Information on the withdrawal of life-sustaining therapy (WLST) was also not systematically recorded and was therefore not available for analysis in this study. Missing data were handled according to variable type. For continuous variables, mean imputation was applied only to the stress index (SI), which was missing in 29 entries (approximately 5.2% of the dataset). This approach ensured that key predictive variables were retained in the analysis. Categorical variables were one-hot encoded, with missing values represented as zeros in all of the corresponding one-hot encoded columns. This approach was applied to variables such as the World Federation of Neurosurgical Societies (WFNS) grade and modified Fisher scale (mFS), thereby addressing missing data in 15 categorical variables, which represented approximately 2.7% of the dataset.
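The two missing-data strategies described above can be illustrated with a minimal sketch. The function names and the toy values are hypothetical and are not study data; the text does not state whether the imputation mean was computed on the full dataset or on the training portion only, so the sketch simply uses all observed values.

```python
from statistics import mean

def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def one_hot(values, categories):
    """One-hot encode; a missing value (None) yields an all-zero row."""
    return [[1 if v == c else 0 for c in categories] for v in values]

# Hypothetical toy records, not study data.
stress_index = [30.0, None, 50.0, 40.0]
wfns_grade = [1, 5, None, 2]

si_imputed = impute_mean(stress_index)
wfns_encoded = one_hot(wfns_grade, categories=[1, 2, 3, 4, 5])
```

With this encoding, a patient with a missing WFNS grade is not assigned to any grade column, which lets the model treat "grade unknown" as its own implicit state.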
Definition of variables (pretherapy clinical data)
To ensure clarity and reproducibility, the variables used in this study were defined as follows: SI was defined as the ratio of blood glucose level to serum potassium level (SI = glucose level/potassium level). This index reflected physiological stress response in patients with aSAH and was associated with catecholamine release. In this study, SI was calculated at admission to quantify acute stress response and evaluate its association with disease severity and outcomes. Symptomatic cerebral vasospasm was defined as the presence of new-onset focal neurological deficits, confirmed based on radiological evidence of vasospasm on cerebral angiography, computed tomography angiography (CTA), or magnetic resonance angiography (MRA), in the absence of other identifiable causes such as hydrocephalus, rebleeding, and cerebral infarction. Intraventricular hemorrhage was defined as the presence of blood within the ventricular system of the brain, detected via CT scan or MRI during the acute phase of aSAH. All complications were adjudicated by attending physicians or a multidisciplinary team based on standardized clinical criteria. Pulmonary disease was defined as the presence of pneumonia confirmed on chest radiography, clinical signs (e.g., fever and productive cough), and elevated inflammatory markers. Perioperative complications were defined as any adverse events occurring during or immediately after the surgical procedure, including surgical site infections, cardiovascular instability, and significant bleeding. Higher brain dysfunction was evaluated by a multidisciplinary team, including physicians, nurses, and speech-language pathologists. The key features assessed included attention deficit, aphasia, and agnosia. The evaluations adhered to the established clinical practice guidelines. Data on hydrocephalus and mechanical ventilation duration were also collected as part of this study.
Hydrocephalus was defined based on radiological findings, such as ventricular enlargement on CT scan or MRI, combined with clinical indications for cerebrospinal fluid drainage. Mechanical ventilation duration was documented in days, as recorded in the medical records. However, both variables were excluded from the analysis. Hydrocephalus typically develops during the subacute phase, whereas this study focused on the hyperacute phase, i.e., from onset to the initiation of rehabilitation. Mechanical ventilation duration was recorded in days rather than hours, which did not provide the precision required for a more detailed analysis. These exclusions were made to maintain alignment with the study’s primary focus and ensure the robustness and consistency of the analysis.
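As a small illustration of the SI definition given above (SI = glucose level/potassium level), the following sketch computes the index for one hypothetical patient. The text does not state measurement units; the units named in the comment are an assumption, and the input values are invented for illustration.

```python
def stress_index(glucose, potassium):
    """SI = blood glucose level / serum potassium level, as defined in the text."""
    if potassium <= 0:
        raise ValueError("potassium level must be positive")
    return glucose / potassium

# Hypothetical admission values; units are assumed (mg/dL and mEq/L).
si = stress_index(glucose=160.0, potassium=4.0)
```

Because the index is a ratio, it rises with either higher glucose or lower potassium, which is consistent with its use as a composite marker of acute stress.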
Outcome
The primary outcome of this study was the mRS score at the time of hospital discharge. This outcome was selected due to its clinical relevance, as discharge mRS serves as a critical indicator for planning post-discharge rehabilitation strategies. mRS scores were assessed by trained evaluators, including attending physicians and rehabilitation specialists, using standardized criteria. Evaluations were conducted within 24 h prior to hospital discharge to identify the patients’ functional status at this key transition point. For analysis, the mRS scores were dichotomized into good outcome (mRS score of 0–2) and poor outcome (mRS score of 3–6).
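The dichotomization rule stated above can be written as a short function. This is only a sketch of the stated thresholds; the function name and label strings are hypothetical.

```python
def dichotomize_mrs(score):
    """Map a discharge mRS score (0-6) to the binary outcome used for analysis."""
    if not 0 <= score <= 6:
        raise ValueError("mRS score must be between 0 and 6")
    return "good" if score <= 2 else "poor"

# Labels for every possible mRS score, 0 through 6.
labels = [dichotomize_mrs(s) for s in range(7)]
```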
Machine learning model development and testing, SHAP analysis, and statistical analysis
Descriptive statistics were computed for the extracted items, and the two groups defined by mRS score were compared using the chi-square test and the Mann–Whitney U test. A supervised machine learning classification method was introduced to predict good/poor prognosis based on mRS. For the deep learning model, the input features examined were the items that were significant in the descriptive statistical analysis and those used in the SAFIRE classification. In our study, the feature of aneurysm size used in the SAFIRE classification was not available. Therefore, other features were examined. A feedforward neural network with four hidden layers and one output layer was constructed. The output layer had two channels, which performed a two-class classification task to estimate good and poor prognoses. The model was built using the PyTorch library.
The dataset was randomly divided into 450 training and 113 evaluation samples. The model was trained for 50 epochs with a batch size of 32 and a learning rate of 0.001, using the Adam optimizer. Performance on the evaluation data was assessed after each epoch, and the model from the best-performing epoch was used for subsequent model evaluation. The area under the curve (AUC) was calculated as a scoring metric. We conducted additional internal validation using k-fold cross-validation (k = 5). This approach divided the dataset into five folds, with each fold serving in turn as the validation set and the remaining four folds as the training set. This method allowed us to evaluate the model’s performance across multiple subsets of the data and mitigate the risk of overfitting. The performance metrics for each fold, which included the AUC, accuracy, recall, precision, and F1 score, were calculated.
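The cross-validation scheme and the per-fold metrics described above can be sketched as follows. This is a generic illustration rather than the authors' code: the shuffling seed, the choice of label 1 as the positive class, and the toy labels are all assumptions.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle indices once, split into k near-equal folds, and yield
    (train, validation) index lists with each fold used once for validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]

def classification_metrics(y_true, y_pred):
    """Accuracy, recall, precision, and F1, treating label 1 as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(pairs), "recall": recall,
            "precision": precision, "f1": f1}

# 563 analyzed patients split into five validation folds.
splits = list(k_fold_indices(563, k=5))
fold_sizes = [len(val) for _, val in splits]

# Hypothetical labels and predictions for one small fold.
metrics = classification_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 1])
```

Each validation fold holds 112 or 113 patients, which matches the size of the single evaluation split reported in the text.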
To analyze how the constructed model arrived at its predictions, SHAP was used to quantify the effect of each input variable on the final prediction of good or poor prognosis. A SHAP value represents how much, and in which direction, a given input variable shifts the prediction for an individual patient relative to a baseline prediction.
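To make the idea behind SHAP values concrete, the sketch below computes exact Shapley values for a tiny hand-written scoring function by enumerating all feature coalitions. It is an illustration of the underlying game-theoretic definition only: the toy model, its coefficients, and the baseline patient are hypothetical, and the text does not state which SHAP explainer was applied to the neural network.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values: features outside a coalition take baseline values."""
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

def toy_risk(z):
    """Hypothetical risk score with one interaction term (not the study model)."""
    age, wfns5, vasospasm = z
    return 0.01 * age + 0.3 * wfns5 + 0.2 * vasospasm + 0.1 * wfns5 * vasospasm

x = [80, 1, 1]          # hypothetical patient
baseline = [60, 0, 0]   # hypothetical reference patient
phi = shapley_values(toy_risk, x, baseline)
```

The values sum to the difference between the patient's score and the baseline score, and the interaction term is shared equally between the two features involved. This additive decomposition is what allows per-patient (local) contributions to be aggregated into cohort-level (global) importance rankings.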
Finally, a logistic regression analysis was performed with mRS score as the dependent variable and the features used in the ML model as the independent variables. A correlation matrix was created before the features were entered, and no pair of independent variables was strongly correlated (defined as r > 0.80). A multiple logistic regression analysis was performed using forward stepwise selection based on the likelihood ratio. The goodness of fit of the regression equation was evaluated using the Hosmer–Lemeshow test. The accuracy of the model was also evaluated by comparing the estimated values with the observed values and using the AUC from the receiver operating characteristic (ROC) curve. Statistical analysis was performed using GraphPad Prism versions 8 and 9 (GraphPad Software Inc.) and the Statistical Package for the Social Sciences software version 20 (IBM). All tests were two-sided, and a p value of < 0.05 was considered statistically significant. Statistical planning and analysis were performed after consulting a biostatistician.
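The collinearity screen described above can be sketched as a pairwise Pearson correlation check against the 0.80 threshold. The feature names and values below are hypothetical, and the use of the absolute value of r (so that strong negative correlations are also flagged) is an assumption that goes slightly beyond the stated criterion.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def strongly_correlated_pairs(features, threshold=0.80):
    """Return (name_a, name_b, r) for every pair with |r| above the threshold."""
    names = list(features)
    flagged = []
    for idx, a in enumerate(names):
        for b in names[idx + 1:]:
            r = pearson_r(features[a], features[b])
            if abs(r) > threshold:
                flagged.append((a, b, round(r, 3)))
    return flagged

# Hypothetical toy columns; "age_copy" is a near-duplicate of "age".
features = {
    "age": [45, 60, 72, 81, 55],
    "stress_index": [30.0, 42.0, 35.0, 38.0, 50.0],
    "age_copy": [46, 61, 70, 83, 54],
}
flagged = strongly_correlated_pairs(features)
```

In this toy example only the near-duplicate pair is flagged; in practice, one variable from each flagged pair would be dropped before fitting the regression.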
Results
In total, 563 patients who met the inclusion criteria were analyzed. Among them, 307 (54.5%) patients had a good prognosis (mRS score of 0–2 upon discharge), whereas 256 (45.5%) had a poor prognosis (mRS score of 3–6) (Fig. 1). Table 1 shows the results of the univariate analysis of the two patient groups classified based on their prognosis at discharge. Compared with the poor prognosis group, the good prognosis group was significantly younger and had a higher WFNS grade. They also had lower mRS scores and a lower proportion of patients with aneurysms located in the anterior cerebral artery, middle cerebral artery, and vertebral artery. Conversely, the good prognosis group had a higher proportion of patients with aneurysms in the anterior communicating artery, basilar artery, internal carotid artery, internal carotid-posterior communicating artery, and posterior cerebral artery. In addition, the good prognosis group had a significantly lower SI and lower incidence rates of cerebral hemorrhage and symptomatic cerebral vasospasm than the poor prognosis group. The good prognosis group experienced fewer complications (e.g., pneumonia and perioperative issues) and had a lower proportion of patients with higher brain dysfunction (agnosia, aphasia) than the poor prognosis group. Furthermore, the good prognosis group had shorter durations from onset to the start of physical therapy, mobilization, and walking, as well as a shorter length of hospital stay than the poor prognosis group.
Flowchart of patient selection and modified Rankin scale outcomes in aSAH cohort. This flowchart shows patient selection from 718 aSAH cases, resulting in 563 included cases. Exclusions were based on criteria such as treatment delays, recurrence, and pre-existing disability. Patients were classified by mRS into favorable (0–2, n = 307) and unfavorable (3–6, n = 256) outcomes
Machine learning model performance
The performance of the machine learning models was evaluated using AUC values across multiple feature sets and validation approaches. For the SEASAH1 model, which only included statistically significant items from the descriptive statistics, the average AUC across k-fold cross-validation (k = 5) was 0.88 ± 0.02. The SEASAH2 model, which included all available features, achieved an average AUC of 0.88 ± 0.03. The SEASAH3 model, which was developed using optimized feature selection and hyperparameter tuning, had the best performance with an average AUC of 0.89 ± 0.02 across folds. For comparison, the SAFIRE study data (modified SAFIRE study) were examined using features that excluded aneurysm size (not available in the database) and achieved an average AUC of 0.84 ± 0.02 (Fig. 2).
The performance of each model. Figure 2 compares the ROC curves for the SEASAH models (SEASAH1, SEASAH2, SEASAH3) and the modified SAFIRE model, evaluated using k-fold cross-validation (k = 5). Each solid line represents the mean ROC curve calculated across the five folds, while the shaded areas indicate the standard deviation (SD). The x-axis represents the False Positive Rate, while the y-axis shows the True Positive Rate. Each curve illustrates the trade-off between sensitivity and specificity for each model
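The AUC values reported above have a direct probabilistic reading: the AUC equals the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. A minimal sketch of that pairwise definition follows; the labels and scores are hypothetical, and treating label 1 as the positive class is an assumption.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly
    (a tie counts as half a correctly ranked pair)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels and predicted probabilities for six patients.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(y_true, scores)
```

Here eight of the nine positive/negative pairs are ranked correctly, so the AUC is 8/9. Computing this quantity on each validation fold and then averaging gives a mean and spread of the kind reported for the SEASAH models.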
In addition to AUC, SEASAH3 was further evaluated using a confusion matrix and additional performance metrics to conduct a comprehensive assessment of its predictive capabilities. The internal test split for SEASAH3 produced a confusion matrix with 54.4 true positives, 36 true negatives, 7 false positives, and 15.2 false negatives. From this matrix, the key performance metrics were calculated: accuracy, 0.83 ± 0.01; recall, 0.89 ± 0.04; precision, 0.78 ± 0.07; and F1 score, 0.83 ± 0.04. These metrics showed the robustness and reliability of SEASAH3 in predicting patient outcomes. The confusion matrices and performance metrics for SEASAH1, SEASAH2, and the modified SAFIRE model are provided in the Supplementary Material for reference (Table S1). This comprehensive evaluation emphasized that SEASAH3 was the most reliable model in this study, with balanced predictive performance and clinical applicability.
Overall impact of extracted items on model output and distribution of extracted items for each patient (SHAP values)
SHAP analysis was used for ranking the contribution of each feature. As shown in Fig. 3, age was the most significant predictor, followed by whether or not the WFNS grade is 1, presence or absence of symptomatic cerebral vasospasm, presence or absence of intracerebral hemorrhage, and whether or not the WFNS grade is 5. Therefore, age had a more substantial impact on mRS scores than other factors, while WFNS grade proved more predictive of prognosis in less severe cases (WFNS grade of 1) than in more severe cases (WFNS grade of 5).
Mean absolute SHAP values for the impact of each feature on model predictions. The y-axis lists the features, such as age, vasospasm, intraventricular hemorrhage, and various WFNS and modified Fisher scale scores, while the x-axis represents the mean SHAP values, indicating the average impact of each feature on the model output. Higher SHAP values denote greater importance in the model’s decision-making process
Figure 4 (left and right) shows the individual distribution of SHAP values for single variables in the good and poor mRS groups, respectively. The good prognosis group was young, presented with symptomatic cerebral vasospasm, intracerebral hemorrhage, perioperative complications, and pneumonia, and had WFNS grades of 1, 4, and 5 and a Fisher classification score of 4. Meanwhile, the poor prognosis group was older and had a WFNS grade of 1, 4, and 5 and a Fisher classification score of 4. Furthermore, they presented with cerebral vasospasm, intracerebral hemorrhage, pneumonia, perioperative complications, and aphasia-related higher brain dysfunction. In both groups, the presence or absence of attention disorder and the SI were high-ranking factors affecting output. However, good prognosis could not be clearly distinguished from poor prognosis. Notably, age, which is the most important factor shown in Fig. 3, was mixed in the positive SHAP values in Fig. 4 (left) and negative SHAP values in Fig. 4 (right), even in older participants.
Feature importance and influence across mRS groups using SHAP values. Figure 4 presents the SHAP analysis of feature importance and influence on model outputs for two groups based on the mRS: (A) good mRS group and (B) poor mRS group. The x-axis displays SHAP values, indicating the impact of each feature on the model’s predictions. Features with high SHAP values contribute significantly to the model’s output
Multivariate logistic regression analysis of prognostic predictors
In the multivariate logistic regression analysis (Table 2), age was significantly associated with the outcome (β = 0.070, odds ratio [OR] = 1.073, 95% CI: 1.052–1.094, p < 0.001). WFNS1 (β = 0.920, OR = 2.508, 95% CI: 1.376–4.575, p = 0.003) and WFNS5 (β = 3.446, OR = 31.364, 95% CI: 9.513–103.405, p < 0.001) were positively associated with increased odds of the outcome. Conversely, WFNS4 was negatively associated with the outcome (β = − 0.915, OR = 0.401, 95% CI: 0.234–0.685, p = 0.001). Symptomatic cerebral vasospasm (β = 2.096, OR = 8.135, 95% CI: 4.216–15.697, p < 0.001), pneumonia (β = 1.734, OR = 5.662, 95% CI: 2.360–13.584, p < 0.001), and perioperative complications (β = 0.909, OR = 2.482, 95% CI: 1.041–5.919, p = 0.040) were significant predictors. In addition, the presence of intracerebral hemorrhage was marginally significant (β = 0.544, OR = 1.723, 95% CI: 1.000–2.970, p = 0.050). The model showed good calibration, as evidenced by the Hosmer–Lemeshow test (χ2 = 9.148, df = 8, p = 0.330). Hence, there was no significant discrepancy between the observed and predicted values.
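The relationship between the reported coefficients and odds ratios can be checked with a few lines of arithmetic, since OR = exp(β). The sketch below uses the reported values for age; it assumes that the confidence interval is a Wald-type interval (exp(β ± 1.96 × SE)) and that age was entered in years, neither of which is stated explicitly in the text. The function names are hypothetical.

```python
from math import exp, log

def odds_ratio(beta):
    """Odds ratio for a one-unit increase in the predictor."""
    return exp(beta)

def se_from_ci(lower, upper, z=1.96):
    """Recover the standard error of beta from a Wald-type 95% CI on the OR."""
    return (log(upper) - log(lower)) / (2 * z)

def wald_ci(beta, se, z=1.96):
    """Wald-type confidence interval on the odds-ratio scale."""
    return exp(beta - z * se), exp(beta + z * se)

beta_age = 0.070                      # reported coefficient for age
or_age = odds_ratio(beta_age)         # close to the reported OR of 1.073
se_age = se_from_ci(1.052, 1.094)     # implied by the reported 95% CI
ci_age = wald_ci(beta_age, se_age)    # round-trips to the reported interval
or_per_decade = odds_ratio(10 * beta_age)
```

Under these assumptions, the per-year odds ratio of about 1.07 compounds to roughly a doubling of the odds for every additional 10 years of age, which helps convey the clinical size of the age effect.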
The predictive performance of the logistic regression model using mRS score as a dependent variable was evaluated (Fig. 5). The model had an excellent discriminative ability, as indicated by the AUC, which was 0.896 (95% CI, 0.869–0.922). This AUC value indicated that the model could effectively distinguish patients with different mRS outcomes. Model calibration was assessed using the Hosmer–Lemeshow test, which showed no significant difference between the observed and predicted values (χ2 = 9.148, df = 8, and p = 0.330), thereby confirming the model’s good fit.
ROC curve for logistic regression model performance. Figure 5 illustrates the ROC curve for the logistic regression model used in the study. The x-axis represents specificity, while the y-axis indicates sensitivity. The AUC is a measure of the model’s overall performance, with higher values indicating better discriminative ability
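The Hosmer–Lemeshow p value reported above can be reproduced from the test statistic, because the statistic is referred to a chi-square distribution (here with 8 degrees of freedom). For an even number of degrees of freedom, the upper-tail probability has a closed form, which the following sketch implements with the standard library only. The function name is hypothetical.

```python
from math import exp, factorial

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with even df,
    using the closed form exp(-x/2) * sum_{i < df/2} (x/2)^i / i!."""
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    return exp(-half) * sum(half ** i / factorial(i) for i in range(df // 2))

# Reported Hosmer-Lemeshow statistic and degrees of freedom.
p_value = chi2_sf_even_df(9.148, 8)
```

The result agrees with the reported p = 0.330. A p value well above 0.05 means the test found no significant gap between predicted and observed event rates across risk groups, which is why it is read as evidence of adequate calibration.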
Discussion
Development of the mRS prediction model
This study developed and validated a machine learning model to predict functional outcomes at discharge for patients with aSAH, using data available from the time of admission to the initiation of physical therapy. Unlike models such as SAFIRE and SAHIT, which focus on long-term outcomes (e.g., 2–3-month functional recovery), the SEASAH study targets the transition phase from acute care to rehabilitation, thereby providing actionable insights for early intervention. The model demonstrated excellent predictive accuracy from initial findings alone, both when restricted to statistically significant features (AUC, 0.88) and when using all extracted features (AUC, 0.88). The model’s optimal performance was achieved using eight key features: (1) age, (2) WFNS grade, (3) mFS score, (4) SI, (5) presence of intracerebral hemorrhage, (6) presence of symptomatic cerebral vasospasm, (7) presence of higher brain dysfunction (including attention deficits and aphasia), and (8) presence of complications (e.g., pneumonia and perioperative issues), yielding an AUC of 0.89. By incorporating novel features such as higher brain dysfunction and SI, which are often overlooked in predictive models, SEASAH offers enhanced clinical applicability. Furthermore, the use of SHAP values enhances interpretability, which allows clinicians to understand the contribution of each variable to patient outcomes. This transparency supports better informed decision-making, distinguishing SEASAH as a valuable complement to existing models such as SAHIT and SAFIRE.
Comparison with previously reported predictive factors
At the cohort level, age was the most significant factor influencing aSAH outcomes. Previous studies have shown that patients aged 75 years are significantly more likely to experience poor outcomes [21]. Park et al. classified patients aged ≥ 75 years as high risk [22]. In the SAFIRE classification, individuals aged < 50 years received 0 points, whereas those in their 50s received 1 point, those in their 60s received 2 points, and those aged ≥ 70 years received 5 points, indicating a greater weighting for older age groups [11]. Thus, beyond age as a continuous measure, whether a patient falls into the elderly category is itself an important factor. Age is the most important predictive factor in various models. However, according to the SHAP score distribution at the individual patient level, some elderly patients achieve good outcomes, underscoring the need for detailed analyses focused on elderly patients with aSAH in the future.
The prognosis of aSAH has long been determined based on the severity at onset. Previous studies using the Hunt and Hess grade have shown that the survival rate at discharge or within 30 days is < 95% for grades I and II. It then decreases to approximately 90% for grade III, 68%–76% for grade IV, and 30%–49% or lower for grade V [23, 24]. In some studies using the WFNS grade, > 70% of patients with grades I and II aSAH were able to return home. Meanwhile, approximately half of those with grades III and IV aSAH could not, and almost none with grade V were able to do so [25, 26]. In our study, the WFNS grade was adopted as an indicator of severity, based on the SAFIRE classification. Our study confirmed that severity at onset is a determinant of prognosis, with WFNS grades 1 and 5 showing a particularly strong impact on outcome determination.
In the present study, a robust model was constructed by incorporating factors related to executive dysfunction and higher brain dysfunction, which have not been the focus of previous studies on the intensive care period but are significant in the mid- to long-term [27]. The presence of attention deficits is a significant predictor at the cohort level according to the SHAP score. However, individual patient distribution varies. In contrast, the presence of aphasia in our study data clearly separated the good and poor outcome groups. Additionally, prior research suggests that aphasia causes greater functional impairment than hemiplegia [28]. Furthermore, aphasia is a predictor of poor outcomes even in patients with mild ischemic stroke [29]. Therefore, careful evaluation is recommended to detect higher brain dysfunction, including attention deficits and aphasia, at the start of physical therapy, taking into account the presence of consciousness disorders.
In addition to higher brain dysfunction, SI was included as a predictive variable in this study, reflecting acute physiological stress response in patients with aSAH. SI, calculated as the ratio of blood glucose level to serum potassium level, serves as an integrated measure of metabolic and catecholaminergic activities, which are essential in the acute phase of aSAH. Elevated blood glucose levels and electrolyte imbalances, particularly hypokalemia, have been associated with poor outcomes in critical care settings, as shown in previous studies on hyperglycemia and potassium levels in patients admitted to the ICU [30, 31]. The inclusion of SI as a composite variable emphasizes its potential utility in capturing these interrelated physiological stress responses. While SI is widely used as a prognostic indicator in Japan, its application in international studies remains limited. However, its simplicity and cost-effectiveness make it a promising tool for evaluating disease severity and guiding clinical decision-making during transition from ICU care to rehabilitation. Recent studies, such as those by Yang et al. [32], further support the prognostic significance of hyperglycemia in critically ill patients, suggesting that SI can complement existing metrics to provide a more nuanced understanding of patient trajectories.
Limitations
First, the model was trained and evaluated using datasets extracted from the same population, without independent validation using an external dataset. This lack of external validation restricted our ability to completely assess the generalizability and robustness of the model. Hence, it is considered a significant limitation. External validation is essential to confirm the model’s broader applicability, reduce the risk of overfitting, and benchmark its performance against existing tools, such as SAFIRE, in diverse clinical settings. To partially address this limitation, additional internal validation was performed using k-fold cross-validation (k = 5). This approach allowed us to evaluate the model’s performance across multiple subsets of the dataset and mitigated the risk of overfitting. However, it is still challenging to obtain suitable external datasets with comparable variables and outcomes. The key variables in our model, such as the stress index and specific rehabilitation-related factors, are not widely used in other studies, thereby limiting the availability of compatible datasets. To overcome these challenges, we are planning to conduct a nationwide registry study in Japan. This initiative aims to provide a larger and more diverse dataset while addressing regional characteristics and hospital-specific evaluation standards, which will facilitate future external validation and broader applicability of the model. Nevertheless, it is important to recognize the unique contributions of the SEASAH model, which represents a significant step forward in outcome prediction for patients with aSAH. By focusing on discharge outcomes, SEASAH addresses a critical transition phase in patient care that is not the primary focus of other models, such as SAHIT and SAFIRE. This enables clinicians to make informed decisions early in the recovery process, particularly in individualizing rehabilitation strategies and optimizing resource allocation.
Second, this study primarily focused on discharge outcomes rather than long-term outcomes, such as 90-day or 2–3-month mRS scores. Discharge outcomes are clinically relevant for guiding early rehabilitation strategies and resource allocation. However, they only represent the initial stage of recovery. Long-term outcomes are essential for providing a more comprehensive understanding of recovery trajectories and assessing the sustained impact of early interventions. Thus, future studies should aim to incorporate long-term outcomes to complement the findings of this study and further validate the predictive value of the model across different recovery phases.
Third, the SEASAH study was a multicenter study conducted at five facilities in Japan. Although the analysis was performed using a relatively large dataset, a larger cohort could be beneficial. To further validate the robustness of the model, a dataset that considers regional characteristics while establishing standards for each hospital (fixed evaluation date) is important. In the future, these tools should be prospectively verified, and this approach must be further optimized with a large multicenter dataset. Because the current model is tailored specifically to Japanese patients with aSAH, its applicability to other countries may be limited due to differences in medical systems and acute care practices. However, this study was made possible because it was conducted in Japan, a country with one of the most aged populations worldwide. The insights gained from this study provide a unique perspective on managing elderly patients with aSAH and their rehabilitation needs. Over the coming years, these findings may serve as a valuable resource for other countries as they face similar demographic transitions and seek to optimize care for aging populations.
Fourth, this study excluded specific patient groups to maintain focus on a homogeneous cohort for predictive modeling. These exclusions included patients who died before starting physical therapy, those with recurrent aSAH, those who had received aneurysm treatment > 72 h after onset, those aged < 20 years, and those with an mRS score of 2–5 before aSAH onset. These criteria were essential to refine the study’s focus and improve the model’s generalizability to typical aSAH cases. However, they may have limited its applicability to more diverse clinical scenarios. In addition, follow-up data on excluded patients were incomplete, which prevented us from conducting a systematic comparison between the included and excluded groups. This limitation emphasizes the need for cautious interpretation when generalizing these findings to broader patient populations. Future prospective studies should aim to collect comprehensive data on the excluded groups to better evaluate the impact of these exclusions on predictive modeling and outcomes.
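The exclusion criteria listed above can be written as an explicit filter, which also records why each patient was excluded so that included and excluded groups can later be compared. This is a minimal sketch under stated assumptions: the `Patient` record, its field names, and the example values are hypothetical and do not reflect the actual SEASAH data schema.

```python
from dataclasses import dataclass

@dataclass
class Patient:
    # All field names are hypothetical, chosen for illustration only.
    age: int
    premorbid_mrs: int          # mRS score before aSAH onset
    hours_to_treatment: float   # time from onset to aneurysm treatment
    recurrent_asah: bool
    died_before_pt: bool        # died before physical therapy started

def exclusion_reasons(p):
    """Return every exclusion criterion the patient meets (empty list = included)."""
    reasons = []
    if p.died_before_pt:
        reasons.append("died before physical therapy")
    if p.recurrent_asah:
        reasons.append("recurrent aSAH")
    if p.hours_to_treatment > 72:
        reasons.append("aneurysm treatment > 72 h after onset")
    if p.age < 20:
        reasons.append("age < 20 years")
    if 2 <= p.premorbid_mrs <= 5:
        reasons.append("pre-onset mRS 2-5")
    return reasons

def split_cohort(patients):
    """Separate included patients from excluded ones, keeping the reasons."""
    included, excluded = [], []
    for p in patients:
        reasons = exclusion_reasons(p)
        if reasons:
            excluded.append((p, reasons))
        else:
            included.append(p)
    return included, excluded

# Made-up example records, not study data.
patients = [
    Patient(67, 0, 12.0, False, False),
    Patient(81, 3, 20.0, False, False),
    Patient(19, 0, 96.0, False, False),
    Patient(55, 1, 72.0, False, False),
]
included, excluded = split_cohort(patients)
print(len(included), [reasons for _, reasons in excluded])
```

Keeping the reasons per patient, rather than silently dropping rows, is one way a prospective study could support the systematic comparison between included and excluded groups that was not possible here.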
Fifth, this study did not account for WLST, which could influence outcomes and potentially introduce bias, including self-fulfilling prophecy bias. The lack of systematic data collection on WLST is a common limitation of retrospective studies. Therefore, future studies should aim to systematically capture and analyze WLST decisions to better understand their impact on outcomes and improve the robustness of predictive modeling in SAH research.
Finally, the model was constructed using features selected based on descriptive statistics and the SAFIRE classification, and other important features may exist. In particular, features that were not included in the dataset, such as troponin T levels upon admission [33] and hemodynamic response during endotracheal suctioning [34], may play an important role in the prediction accuracy and interpretation of the model and should be measured and collected in future work. Therefore, further investigation is warranted.
Conclusions
This study developed a machine learning model that can accurately predict functional outcomes in patients with aSAH. The model demonstrated excellent performance using key clinical features. Our findings emphasize the importance of early identification of predictors such as age, WFNS grade, and higher brain dysfunction (including aphasia) for guiding rehabilitation strategies. Future efforts should focus on refining these models to enhance clinical decision-making and improve patient care.
Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
Abbreviations
- aSAH: Aneurysmal subarachnoid hemorrhage
- WFNS: World Federation of Neurosurgical Societies
- mRS: Modified Rankin Scale
- SHAP: SHapley Additive exPlanations
- SEASAH: Safety and Efficacy of Acute Rehabilitation in Subarachnoid Hemorrhage
References
Korja M, Lehto H, Juvela S, Kaprio J. Incidence of subarachnoid hemorrhage is decreasing together with decreasing smoking rates. Neurology. 2016;87:1118–23. https://doi.org/10.1212/WNL.0000000000003091.
Feigin VL, Stark BA, Johnson CO, Roth GA, Bisignano C, Abady GG, et al. Global, regional, and national burden of stroke and its risk factors, 1990–2019: a systematic analysis for the Global Burden of Disease Study 2019. Lancet Neurol. 2021;20:795. https://doi.org/10.1016/S1474-4422(21)00252-0.
Yoshikawa S, Kamide T, Kikkawa Y, Suzuki K, Ikeda T, Kohyama S, et al. Long-term outcomes of elderly patients with poor-grade aneurysmal subarachnoid hemorrhage. World Neurosurg. 2020;144:e743–9. https://doi.org/10.1016/j.wneu.2020.09.061.
Rosengart AJ, Schultheiss KE, Tolentino J, Macdonald RL. Prognostic factors for outcome in patients with aneurysmal subarachnoid hemorrhage. Stroke. 2007;38:2315–21. https://doi.org/10.1161/STROKEAHA.107.484360.
Olkowski BF, Devine MA, Slotnick LE, Veznedaroglu E, Liebman KM, Arcaro ML, et al. Safety and feasibility of an early mobilization program for patients with aneurysmal subarachnoid hemorrhage. Phys Ther. 2013;93:208–15. https://doi.org/10.2522/ptj.20110334.
Karic T, Roe C, Nordenmark TH, Becker F, Sorteberg A. Impact of early mobilization and rehabilitation on global functional outcome one year after aneurysmal subarachnoid haemorrhage. J Rehabil Med. 2016;48:676–82. https://doi.org/10.2340/16501977-2121.
Yokobatake K, Ohta T, Kitaoka H, Nishimura S, Kashima K, Yasuoka M, et al. Safety of early rehabilitation in patients with aneurysmal subarachnoid hemorrhage: a retrospective cohort study. J Stroke Cerebrovasc Dis. 2022;31:106751. https://doi.org/10.1016/j.jstrokecerebrovasdis.2022.106751.
Winstein CJ, Stein J, Arena R, Bates B, Cherney LR, Cramer SC, et al. Guidelines for adult stroke rehabilitation and recovery: a guideline for healthcare professionals from the American Heart Association/American Stroke Association. Stroke. 2016;47:e98–169. https://doi.org/10.1161/STR.0000000000000098.
Teasell R, Salbach NM, Foley N, Mountain A, Cameron JI, de Jong A, et al. Canadian stroke best practice recommendations: rehabilitation, recovery, and community participation following stroke. part one: rehabilitation and recovery following stroke; 6th edition update 2019. Int J Stroke. 2020;15:763–88. https://doi.org/10.1177/1747493019897843.
Bernhardt J, Langhorne P, Lindley RI, Thrift AG, Ellery F, Collier J, et al. Efficacy and safety of very early mobilisation within 24 h of stroke onset (AVERT): a randomised controlled trial. Lancet. 2015;386:46–55. https://doi.org/10.1016/S0140-6736(15)60690-0.
van Donkelaar CE, Bakker NA, Birks J, Veeger NJGM, Metzemaekers JDM, Molyneux AJ, et al. Prediction of outcome after aneurysmal subarachnoid hemorrhage. Stroke. 2019;50:837–44. https://doi.org/10.1161/STROKEAHA.118.023902.
Jaja BNR, Saposnik G, Lingsma HF, Macdonald E, Thorpe KE, Mamdani M, et al. Development and validation of outcome prediction models for aneurysmal subarachnoid haemorrhage: the SAHIT multinational cohort study. BMJ (Online). 2018. https://doi.org/10.1136/bmj.j5745.
Heo JN, Yoon JG, Park H, Kim YD, Nam HS, Heo JH. Machine learning-based model for prediction of outcomes in acute stroke. Stroke. 2019;50:1263–5. https://doi.org/10.1161/STROKEAHA.118.024293.
Brugnara G, Neuberger U, Mahmutoglu MA, Foltyn M, Herweh C, Nagel S, et al. Multimodal predictive modeling of endovascular treatment outcome for acute ischemic stroke using machine-learning. Stroke. 2020;51:3541–51. https://doi.org/10.1161/STROKEAHA.120.030287.
Savarraj JPJ, Hergenroeder GW, Zhu L, Chang T, Park S, Megjhani M, et al. Machine learning to predict delayed cerebral ischemia and outcomes in subarachnoid hemorrhage. Neurology. 2021;96:e553–62. https://doi.org/10.1212/WNL.0000000000011211.
Gaastra B, Barron P, Newitt L, Chhugani S, Turner C, Kirkpatrick P, et al. CRP (C-Reactive Protein) in outcome prediction after subarachnoid hemorrhage and the role of machine learning. Stroke. 2021;52:3276–85. https://doi.org/10.1161/STROKEAHA.120.030950.
Rodríguez-Pérez R, Bajorath J. Interpretation of compound activity predictions from complex machine learning models using local approximations and Shapley values. J Med Chem. 2020;63:8761–77. https://doi.org/10.1021/acs.jmedchem.9b01101.
Martini ML, Neifert SN, Shuman WH, Chapman EK, Schüpper AJ, Oermann EK, et al. Rescue therapy for vasospasm following aneurysmal subarachnoid hemorrhage: a propensity score-matched analysis with machine learning. J Neurosurg. 2022;136:134–47. https://doi.org/10.3171/2020.12.JNS203778.
Sze V, Chen Y-H, Yang T-J, Emer JS. Efficient Processing of Deep Neural Networks. Synthesis Lectures on Computer Architecture. 2020;15. https://doi.org/10.1007/978-3-031-01766-7.
Takara H, Suzuki S, Satoh S, Abe Y, Miyazato S, Kohatsu Y, et al. Association between early mobilization and functional outcomes in patients with aneurysmal subarachnoid hemorrhage: a multicenter retrospective propensity score-matched study. Neurocrit Care. 2024;41:445–54. https://doi.org/10.1007/s12028-024-01946-y.
Proust F, Gérardin E, Derrey S, Lesvèque S, Ramos S, Langlois O, et al. Interdisciplinary treatment of ruptured cerebral aneurysms in elderly patients. J Neurosurg. 2010;112:1200–7. https://doi.org/10.3171/2009.10.JNS08754.
Park J, Woo H, Kang DH, Kim Y. Critical age affecting 1-year functional outcome in elderly patients aged ≥ 70 years with aneurysmal subarachnoid hemorrhage. Acta Neurochir (Wien). 2014;156:1655–61. https://doi.org/10.1007/s00701-014-2133-6.
Roquer J, Cuadrado-Godia E, Guimaraens L, Conesa G, Rodríguez-Campello A, Capellades J, et al. Short-and long-term outcome of patients with aneurysmal subarachnoid hemorrhage. Neurology. 2020;95:e1819–29. https://doi.org/10.1212/WNL.0000000000010618.
Lantigua H, Ortega-Gutierrez S, Schmidt JM, Lee K, Badjatia N, Agarwal S, et al. Subarachnoid hemorrhage: who dies, and why? Crit Care. 2015;19:1–10. https://doi.org/10.1186/s13054-015-1036-0.
Van Heuven AW, Mees SMD, Algra A, Rinkel GJE. Validation of a prognostic subarachnoid hemorrhage grading scale derived directly from the Glasgow Coma Scale. Stroke. 2008;39:1347–8. https://doi.org/10.1161/STROKEAHA.107.498345.
Galea JP, Dulhanty L, Patel HC. Predictors of outcome in aneurysmal subarachnoid hemorrhage patients: Observations from a multicenter data set. Stroke. 2017;48:2958–63. https://doi.org/10.1161/STROKEAHA.117.017777.
Al-Khindi T, MacDonald RL, Schweizer TA. Cognitive and functional outcome after aneurysmal subarachnoid hemorrhage. Stroke. 2010;41:519–36. https://doi.org/10.1161/STROKEAHA.110.581975.
Boehme AK, Martin-Schild S, Marshall RS, Lazar RM. Effect of aphasia on acute stroke outcomes. Neurology. 2016;87:2348–54. https://doi.org/10.1212/WNL.0000000000003297.
Nesi M, Lucente G, Nencini P, Fancellu L, Inzitari D. Aphasia predicts unfavorable outcome in mild ischemic stroke patients and prompts thrombolytic treatment. J Stroke Cerebrovasc Dis. 2014;23:204–8. https://doi.org/10.1016/j.jstrokecerebrovasdis.2012.11.018.
Liu J, Luo F, Guo Y, Li Y, Jiang C, Pi Z, et al. Association between serum glucose potassium ratio and mortality in critically ill patients with intracerebral hemorrhage. Sci Rep. 2024;14:27391. https://doi.org/10.1038/s41598-024-78230-8.
Uijtendaal EV, Zwart-van Rijkom JEF, de Lange DW, Lalmohamed A, van Solinge WW, Egberts TCG. Influence of a strict glucose protocol on serum potassium and glucose concentrations and their association with mortality in intensive care patients. Crit Care. 2015;19:1–12. https://doi.org/10.1186/s13054-015-0959-9.
Yang Y, Li J, Xiao Z, Yang X, Wang L, Duan YH, et al. Relationship between stress hyperglycemia ratio and prognosis in patients with aneurysmal subarachnoid hemorrhage: a two-center retrospective study. Neurosurg Rev. 2024;47:315. https://doi.org/10.1007/s10143-024-02549-z.
Oras J, Grivans C, Bartley A, Rydenhag B, Ricksten SE, Seeman-Lodding H. Elevated high-sensitive troponin T on admission is an indicator of poor long-term outcome in patients with subarachnoid haemorrhage: a prospective observational study. Crit Care. 2016;20:1–10. https://doi.org/10.1186/s13054-015-1181-5.
Rass V, Ianosi BA, Lindner A, Kofler M, Schiefecker AJ, Pfausler B, et al. Hemodynamic response during endotracheal suctioning predicts awakening and functional outcome in subarachnoid hemorrhage patients. Crit Care. 2020;24:1–10. https://doi.org/10.1186/s13054-020-03089-w.
Acknowledgements
The authors thank the staff of Professor Kaoru Sakatani's laboratory at the University of Tokyo for their great interest and cooperation. The authors would also like to thank Kouji Kinjo for their help with data retrieval.
Funding
This work was partly supported by JSPS KAKENHI under Grant Number 22H03979, and by Yokohama City University under Grant.
Author information
Contributions
Moriya and Takara participated in the conception and design of the study. Miyazato, Suzuki, Abe, Satoh, Minakata, and Takara participated in the data acquisition and data analysis. Moriya, Suzuki, and Takara participated in the data interpretation. Moriya and Karako participated in the development of the machine-learning model. Miyazaki participated in the statistical analysis. Moriya drafted the manuscript, and all authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
This study was part of the Safety and Efficacy of Acute Rehabilitation in Subarachnoid Hemorrhage (SEASAH) study and was approved by the ethics committees of the respective hospitals (Representative institution: Naha City Hospital, 2021a26; Author Affiliation: Teikyo Heisei University, R02-007).
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Moriya, M., Karako, K., Miyazaki, S. et al. Interpretable machine learning model for outcome prediction in patients with aneurysmatic subarachnoid hemorrhage. Crit Care 29, 36 (2025). https://doi.org/10.1186/s13054-024-05245-y
DOI: https://doi.org/10.1186/s13054-024-05245-y