California Housing Challenge

Learn how to tackle regression problems with scikit-learn using this classic dataset 🏡

In this educational challenge, you develop a machine learning model that predicts the median house value of California block groups from features such as the mean number of rooms per household and the median house age.

This is a great opportunity to experiment with and learn about a number of core concepts in machine learning using pandas, seaborn and scikit-learn.

Getting started

To get started, check out our tutorial notebook:

https://github.com/DoxaAI/educational-challenges/blob/main/california-housing/getting-started.ipynb

The dataset

This challenge uses the popular California housing dataset, which is derived from the 1990 US Census.

It contains the following data variables:

  • median_income: the median income in the block group, in thousands of dollars
  • house_age: the median house age in the block group
  • mean_rooms: the mean number of rooms per household
  • mean_bedrooms: the mean number of bedrooms per household
  • population: the block group population
  • mean_household_size: the mean household size in the block group
  • latitude: the latitude of the block group
  • longitude: the longitude of the block group
  • median_house_value: the median house value in the block group, in thousands of dollars (the prediction target)
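
To make the variables above concrete, here is a minimal baseline sketch that predicts median_house_value from the other eight columns. It is an illustration under stated assumptions, not the approach taken in the tutorial notebook: the data is a randomly generated placeholder that only borrows the column names listed above, the helper names make_placeholder_data and fit_baseline are hypothetical, and linear regression is just one simple starting point. With the real challenge data, you would load the provided file into a pandas DataFrame instead of calling make_placeholder_data.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Column names taken from the variable list above.
FEATURES = [
    "median_income",
    "house_age",
    "mean_rooms",
    "mean_bedrooms",
    "population",
    "mean_household_size",
    "latitude",
    "longitude",
]
TARGET = "median_house_value"


def make_placeholder_data(n_rows=500, seed=0):
    """Build synthetic stand-in data (NOT the real dataset) with the same columns."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame(
        {
            "median_income": rng.uniform(5, 150, n_rows),
            "house_age": rng.uniform(1, 52, n_rows),
            "mean_rooms": rng.uniform(2, 10, n_rows),
            "mean_bedrooms": rng.uniform(0.5, 3, n_rows),
            "population": rng.uniform(100, 5000, n_rows),
            "mean_household_size": rng.uniform(1, 6, n_rows),
            "latitude": rng.uniform(32.5, 42.0, n_rows),
            "longitude": rng.uniform(-124.3, -114.3, n_rows),
        }
    )
    # Invented relationship, chosen only so the model has something to learn.
    noise = rng.normal(0, 20, n_rows)
    df[TARGET] = 50 + 2.5 * df["median_income"] + 0.5 * df["house_age"] + noise
    return df


def fit_baseline(df):
    """Fit a scaled linear regression and return it with its held-out RMSE."""
    X_train, X_test, y_train, y_test = train_test_split(
        df[FEATURES], df[TARGET], test_size=0.2, random_state=0
    )
    model = make_pipeline(StandardScaler(), LinearRegression())
    model.fit(X_train, y_train)
    rmse = float(np.sqrt(mean_squared_error(y_test, model.predict(X_test))))
    return model, rmse


df = make_placeholder_data()
model, rmse = fit_baseline(df)
print(f"Held-out RMSE on placeholder data: {rmse:.2f}")
```

Holding out 20% of the rows gives an error estimate on data the model has not seen, which is a reasonable habit to keep when swapping in other scikit-learn regressors.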

If you have any questions about the challenge, feel free to reach out in the DOXA Community Discord server.