This repository contains my solution for Task 01 of the Prodigy InfoTech Machine Learning internship.
The goal is to build a linear regression model to predict house prices using the Kaggle House Prices: Advanced Regression Techniques dataset.
- `GrLivArea` – Above-ground living area (square feet)
- `BedroomAbvGr` – Number of bedrooms above ground
- `FullBath` – Number of full bathrooms
- Target: `SalePrice`

- Loaded `train.csv` and `test.csv` from the Kaggle competition.
- Selected the three required numerical features and handled missing values using median imputation.
- Trained a baseline LinearRegression model and evaluated it with RMSE and R².
- Improved performance by adding PolynomialFeatures and tuning the polynomial degree with GridSearchCV, keeping the model in the linear regression family.
- Trained the best model on all training data and generated predictions for the Kaggle test set.
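
The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's exact code: it runs on synthetic data so that it works without `train.csv`, and the helper names (`make_synthetic_data`, `build_pipeline`, `fit_and_evaluate`), the degree grid `[1, 2, 3]`, the 80/20 split, and the `StandardScaler` step are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

FEATURES = ["GrLivArea", "BedroomAbvGr", "FullBath"]
TARGET = "SalePrice"


def make_synthetic_data(n_rows=300, seed=0):
    """Hypothetical stand-in for train.csv: made-up values, not Kaggle data."""
    rng = np.random.default_rng(seed)
    area = rng.uniform(500, 3500, n_rows)
    beds = rng.integers(1, 6, n_rows).astype(float)
    baths = rng.integers(1, 4, n_rows).astype(float)
    noise = rng.normal(0, 15_000, n_rows)
    price = 50_000 + 90 * area + 0.01 * area**2 + 5_000 * beds + 12_000 * baths + noise
    df = pd.DataFrame(
        {"GrLivArea": area, "BedroomAbvGr": beds, "FullBath": baths, TARGET: price}
    )
    # Inject a few missing values so the median imputer has something to do.
    df.loc[::25, "GrLivArea"] = np.nan
    return df


def build_pipeline():
    """Median imputation -> polynomial expansion -> scaling -> linear regression."""
    return Pipeline(
        [
            ("impute", SimpleImputer(strategy="median")),
            ("poly", PolynomialFeatures(include_bias=False)),
            ("scale", StandardScaler()),  # assumption: added for numerical stability
            ("model", LinearRegression()),
        ]
    )


def fit_and_evaluate(df):
    """Tune the polynomial degree with cross-validation, then score on a hold-out set."""
    X_train, X_val, y_train, y_val = train_test_split(
        df[FEATURES], df[TARGET], test_size=0.2, random_state=42
    )
    search = GridSearchCV(
        build_pipeline(),
        param_grid={"poly__degree": [1, 2, 3]},  # degree 1 is the plain linear baseline
        scoring="neg_root_mean_squared_error",
        cv=5,
    )
    search.fit(X_train, y_train)
    preds = search.predict(X_val)
    rmse = float(np.sqrt(mean_squared_error(y_val, preds)))
    r2 = float(r2_score(y_val, preds))
    return search, rmse, r2


if __name__ == "__main__":
    search, rmse, r2 = fit_and_evaluate(make_synthetic_data())
    print("best degree:", search.best_params_["poly__degree"])
    print(f"validation RMSE: {rmse:.0f}, R^2: {r2:.3f}")
```

With the real data, `make_synthetic_data()` would be replaced by `pd.read_csv("train.csv")`. Because `GridSearchCV` refits the best pipeline on whatever data is passed to `fit`, calling `search.fit` on the full training set afterwards gives a final model trained on all rows.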

- `house_price_prediction_task1.ipynb` – main Colab notebook (click the badge to open in Colab).
- `README.md` – overview of the project.

- Open the notebook in Google Colab using the "Open in Colab" button.
- Upload `train.csv` and `test.csv` from the Kaggle competition.
- Run all cells to reproduce the results.

- `submission.csv` – model predictions (`Id`, `SalePrice`) for all rows in `test.csv`.
- The notebook also displays a preview of the predicted prices using `submission.head()`.
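
For reference, a submission file with this layout could be assembled along these lines. The helper name `build_submission` and the example ids and prices are placeholders invented for illustration; they are not results from the notebook.

```python
import pandas as pd


def build_submission(test_ids, predictions):
    """Pair each test-set Id with its predicted SalePrice (hypothetical helper)."""
    test_ids = list(test_ids)
    predictions = list(predictions)
    if len(test_ids) != len(predictions):
        raise ValueError("test_ids and predictions must have the same length")
    return pd.DataFrame({"Id": test_ids, "SalePrice": predictions})


if __name__ == "__main__":
    # Placeholder values only; in the notebook these would come from
    # test.csv and the fitted model's predict() output.
    submission = build_submission([1, 2, 3], [120000.0, 155000.5, 98000.0])
    submission.to_csv("submission.csv", index=False)  # no index column in the file
    print(submission.head())
```

Passing `index=False` keeps the file to exactly the two columns `Id` and `SalePrice`.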