Project Type: Data Science · Regression Analysis
This project examines how socioeconomic, environmental, and health indicators relate to national life expectancy. Using data from the World Health Organization, a regression model predicts life expectancy from more than 20 variables, including immunization rate, GDP, BMI, and HIV prevalence.
The dataset was preprocessed to handle missing values and outliers; the features were then standardized and the data split into training and test sets. A linear regression model was trained and evaluated with metrics such as RMSE and R². Feature importance was visualized to identify the strongest predictors of life expectancy.
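The workflow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it runs on synthetic stand-in data rather than the WHO dataset, the column names are hypothetical, median imputation is an assumed choice for the missing values, and outlier handling is left out. Standardized coefficient magnitudes are used here as one plausible reading of "feature importance" for a linear model.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (NOT the WHO dataset); column names are hypothetical.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "immunization_rate": rng.uniform(40, 99, n),
    "log_gdp": rng.normal(8, 1.5, n),
    "bmi": rng.normal(25, 5, n),
    "hiv_prevalence": rng.exponential(1.0, n),
})
df["life_expectancy"] = (
    50
    + 0.1 * df["immunization_rate"]
    + 2.0 * df["log_gdp"]
    - 1.5 * df["hiv_prevalence"]
    + rng.normal(0, 2, n)
)

# Blank out a few values to mimic missing data.
df.loc[rng.choice(n, 25, replace=False), "bmi"] = np.nan

X = df.drop(columns="life_expectancy")
y = df["life_expectancy"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Impute (median, an assumed strategy), standardize, then fit the regression.
model = make_pipeline(
    SimpleImputer(strategy="median"),
    StandardScaler(),
    LinearRegression(),
)
model.fit(X_train, y_train)

# Evaluate on the held-out test set.
pred = model.predict(X_test)
rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
r2 = float(r2_score(y_test, pred))
print(f"RMSE: {rmse:.2f}  R^2: {r2:.2f}")

# Coefficients on standardized features, sorted by absolute size.
coefs = pd.Series(
    model.named_steps["linearregression"].coef_, index=X.columns
).sort_values(key=np.abs, ascending=False)
print(coefs)
```

Wrapping the steps in a pipeline means the imputer and scaler are fitted on the training split only, which keeps test-set statistics out of the preprocessing. A bar chart of `coefs` (for example with Matplotlib or Seaborn) would give a feature-importance plot of the kind mentioned above.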
Python, Pandas, Scikit-learn, NumPy, Matplotlib, Seaborn