California Housing Challenge
Learn about tackling regression challenges with scikit-learn with this classic dataset 💡
This educational challenge is to develop a machine learning model for predicting median block house prices in California using features such as the number of rooms and the age of the house.
This is a great opportunity to experiment with and learn about a number of core concepts in machine learning using pandas, seaborn and scikit-learn.
To get started, check out our tutorial notebook:
https://github.com/DoxaAI/educational-challenges/blob/main/california-housing/getting-started.ipynb
This challenge uses the popular California housing dataset, which is derived from the 1990 US Census.
It contains the following data variables:
median_income: the median income in the block group, in thousands of dollars
house_age: the median house age in the block group
mean_rooms: the mean number of rooms per household
mean_bedrooms: the mean number of bedrooms per household
population: the block group population
mean_household_size: the mean household size in the block group
latitude: the latitude of the block group
longitude: the longitude of the block group
median_house_value: the median house value, in thousands of dollars
If you have any questions about the challenge, feel free to reach out in the DOXA Community Discord server.
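As a rough illustration of the kind of workflow the challenge has in mind, here is a minimal baseline sketch using pandas and scikit-learn. It is not the tutorial notebook's method. The column names come from the variable list above, but the rows are randomly generated placeholders (the helper make_placeholder_data is hypothetical, and its relationship between income and house value is made up), because this page does not describe the data file format. The choice of linear regression and of mean absolute error as the metric are likewise assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Column names taken from the variable list on this page.
FEATURES = [
    "median_income", "house_age", "mean_rooms", "mean_bedrooms",
    "population", "mean_household_size", "latitude", "longitude",
]
TARGET = "median_house_value"


def make_placeholder_data(n_rows=500, seed=0):
    """Generate random stand-in rows with the challenge's column names.

    The values are synthetic and exist only to make this sketch runnable.
    Replace this with the real challenge data when working on the task.
    """
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "median_income": rng.uniform(0.5, 15.0, n_rows),
        "house_age": rng.uniform(1.0, 52.0, n_rows),
        "mean_rooms": rng.uniform(2.0, 10.0, n_rows),
        "mean_bedrooms": rng.uniform(0.5, 3.0, n_rows),
        "population": rng.uniform(100.0, 5000.0, n_rows),
        "mean_household_size": rng.uniform(1.0, 6.0, n_rows),
        "latitude": rng.uniform(32.5, 42.0, n_rows),
        "longitude": rng.uniform(-124.3, -114.3, n_rows),
    })
    # Made-up linear relationship plus noise, so there is something to fit.
    df[TARGET] = 40.0 * df["median_income"] + rng.normal(0.0, 20.0, n_rows)
    return df


def fit_baseline(df, seed=0):
    """Fit a linear regression and return it with its held-out MAE."""
    X_train, X_test, y_train, y_test = train_test_split(
        df[FEATURES], df[TARGET], test_size=0.2, random_state=seed
    )
    model = LinearRegression().fit(X_train, y_train)
    mae = mean_absolute_error(y_test, model.predict(X_test))
    return model, mae


if __name__ == "__main__":
    data = make_placeholder_data()
    model, mae = fit_baseline(data)
    print(f"Held-out mean absolute error: {mae:.2f}")
```

Holding out a test split before fitting gives an error estimate on rows the model has not seen, which is a sensible starting point before trying richer models or feature engineering.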