SeleHiTANet: Boosting Health Risk Prediction via Data Selection on Hierarchical Time-aware Attention Networks

Open Access
- Author:
- Qi, Sirui
- Area of Honors:
- Information Sciences and Technology
- Degree:
- Bachelor of Science
- Document Type:
- Thesis
- Thesis Supervisors:
- Fenglong Ma, Thesis Supervisor
- Nick Giacobe, Thesis Honors Advisor
- Keywords:
- healthcare informatics
- attention mechanism
- transformer
- denoising algorithm
- risk prediction
- Abstract:
- High medical costs and a shortage of medical personnel have created strong demand for disease prediction in healthcare. Transformer-based methods have shown early success in this task. The Hierarchical Time-aware Attention Network (HiTANet) is a model that predicts patients' health status from visit information and visit times at local and global stages, simulating how doctors make decisions in risk prediction. We propose a model called SeleHiTANet that improves on the original model by skimming the electronic health record (EHR) data and automatically ruling out irrelevant visits and codes. SeleHiTANet uses parts of the MedSkim model as an ICD-9 code selection mechanism and incorporates that mechanism into HiTANet. MedSkim showed that selecting which visits and codes enter the model input can improve performance in predicting diagnoses. This improvement can help healthcare models focus on the most important information in EHRs, improving the accuracy of diagnoses and treatment plans. The new model is intended to scan large amounts of EHR data quickly and efficiently, saving time and reducing the risk of human error. We evaluate SeleHiTANet on EHR data and examine whether it outperforms the original model in risk prediction accuracy. In our experiments, we used Accuracy (Acc), Precision (Pre), Recall, F1, and Area Under the Curve (AUC) as the evaluation metrics. Experimental results show that the proposed SeleHiTANet model outperforms the baselines and improves the performance of transformer-based models on the health risk prediction task.
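The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It is not the thesis's evaluation code: the function name `binary_metrics`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration. The sketch assumes a binary risk label (1 = at risk) and a predicted probability per patient, and it computes AUC as the fraction of positive/negative pairs that are ranked correctly, counting ties as half.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int],
                   y_score: Sequence[float],
                   threshold: float = 0.5) -> Dict[str, float]:
    """Compute Acc, Pre, Recall, F1, and AUC for a binary task (hypothetical helper)."""
    # Turn scores into hard predictions; the threshold is an assumed default.
    y_pred = [1 if s >= threshold else 0 for s in y_score]

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    acc = (tp + tn) / len(y_true)
    pre = tp / (tp + fp) if (tp + fp) else 0.0
    rec = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * pre * rec / (pre + rec) if (pre + rec) else 0.0

    # AUC as the probability that a random positive outscores a random
    # negative; tied scores count as half a correct ranking.
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if pos and neg:
        wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
                   for p in pos for n in neg)
        auc = wins / (len(pos) * len(neg))
    else:
        auc = float("nan")  # AUC is undefined with only one class present

    return {"acc": acc, "pre": pre, "recall": rec, "f1": f1, "auc": auc}


# Toy data, invented for illustration only.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.2, 0.4, 0.7, 0.6, 0.1]
print(binary_metrics(labels, scores))
```

Accuracy, Precision, Recall, and F1 depend on the chosen threshold, while AUC depends only on how the scores rank the patients, which is one reason the two kinds of metric are usually reported together.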