Predictive Analysis to Find Chances of Causing Stroke (Machine Learning)

Authors

  • T Sai Lalith Prasad Assistant Professor, Department of Artificial Intelligence and Data Science, Vignan Institute of Technology and Science, Hyderabad, India Author
  • K Saiprakash UG Student, Department of AI&DS, Vignan Institute of Technology and Science, Hyderabad, India Author
  • M Ashok Reddy UG Student, Department of AI&DS, Vignan Institute of Technology and Science, Hyderabad, India Author
  • M Varshitha UG Student, Department of AI&DS, Vignan Institute of Technology and Science, Hyderabad, India Author

DOI:

https://doi.org/10.33425/3066-1226.1140

Keywords:

Stroke prediction, Encode categorical variables, Data preprocessing

Abstract

Stroke is recognised as one of the most dangerous medical conditions: it can cause lifelong disability or immediate death. Across the world, there is an urgent need for accurate predictive models. Machine learning models can identify more accurately the people who are most likely to experience this life-altering medical event, so that early intervention and preventive medication can be based on the predicted result. Stroke risk can be assessed by analysing warning signs such as hypertension, severe headache, trouble speaking, and numbness of the face, arms, or legs. Health parameters such as age, hypertension (0 or 1), heart disease, marital status, residence type, and average glucose level are used as input features to train and test the model, which then predicts the outcome for a given set of inputs. The model uses a linear Support Vector Machine (SVM) for classification, SMOTEENN (Synthetic Minority Over-sampling Technique with Edited Nearest Neighbors) to balance the dataset, One-Hot Encoding to convert categorical variables into a numerical format, and a Pipeline that combines preprocessing and the classification algorithm into a single workflow. The model achieves approximately 85-90% accuracy.

Published

2025-07-28

Issue

Section

Articles