An international student machine learning competition to develop state-of-the-art solar PV forecasting models 🌍
Join the official ClimateHack.AI Discord server to become part of an international community of AI enthusiasts and receive competition announcements. 🚀
Your challenge—should you choose to accept it—is to develop a cutting-edge machine learning model for predicting near-term site-level solar power production using satellite imagery, weather forecasts and air quality data better than the current state of the art before submissions close on Friday 2nd February 2024.
Your contributions could directly help cut carbon emissions in Great Britain by up to 100 kilotonnes per year by helping to advance Open Climate Fix's solar PV nowcasting research work for the National Grid Electricity System Operator.
To get started with this challenge, read through the information posted here and then check out the starter resources on GitHub! You will be guided through loading and examining the data, training your first model using PV and HRV data, and submitting your model to the platform for evaluation.
If you just want to start exploring the data, you can run the notebook with Google Colab.
To account for the variability of solar photovoltaic (PV) power production, the National Grid Electricity System Operator (ESO) schedules a spinning reserve of natural gas generators. Because these generators can take hours to ramp up from a cold start, they are kept running below their maximum capacity, leaving headroom on the grid that can be ramped up rapidly to make up for any shortfalls.
Not only is this expensive, but it also contributes ~100 kilotonnes of excess carbon emissions each year in Great Britain alone. As such, better solar PV forecasting techniques would allow the National Grid ESO to cut its spinning reserve, thereby reducing emissions and helping to improve the deployability of cheaper, greener solar power.
Cloud coverage (especially in areas with variable meteorological environments such as the United Kingdom) can have an outsized impact on solar photovoltaic power yields. By incorporating satellite imagery into near-term forecasting (or "nowcasting") models for solar power generation, it is possible to significantly improve the minute-to-minute accuracy of machine learning-based solar PV models beyond relying on numerical weather predictions alone.
Take a look at this video animation from Open Climate Fix to see this effect in practice:
Another under-researched source of variability could be the presence of aerosols—suspended particulates that can affect the path of sunlight—at different altitudes in the atmosphere. As such, we are making 259 GB of air quality data available (related to dust, NO2, ozone and more), and you will have the option of integrating this data into the models you train. If this approach proves successful, this could be a key research contribution to come out of this competition!
The aim: develop a model for site-level PV forecasting over the next four hours that is both accurate and performant.
Features to choose from (at time T) along with associated metadata (array shapes shown):
[12, 128, 128]
[12, 128, 128, 11]
[T - 1h, T, T + 1h, T + 2h, T + 3h, T + 4h] ([6, 128, 128])
[6, 8, 128, 128]
Target: the expected site-level solar PV power to be generated over the next four hours as a proportion of installed capacity.
Evaluation metric: mean absolute error (over all four hours)
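As a rough illustration of the target and metric described above, here is a minimal Python sketch that expresses power readings as a proportion of installed capacity and then scores a forecast with mean absolute error. The function names, the sample values and the four-step horizon are all hypothetical, chosen only for illustration. How the platform actually aggregates errors across sites and timestamps is not specified here, so treat this as an assumption rather than the official scoring code.

```python
from typing import List, Sequence


def normalise_power(power_kw: Sequence[float], capacity_kw: float) -> List[float]:
    """Express raw power readings as a proportion of installed capacity."""
    if capacity_kw <= 0:
        raise ValueError("capacity_kw must be positive")
    return [p / capacity_kw for p in power_kw]


def mean_absolute_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Average of |predicted - actual| over every forecast step."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if len(predicted) == 0:
        raise ValueError("at least one forecast step is required")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


# Hypothetical site with 4 kW of installed capacity and four forecast steps.
actual = normalise_power([0.0, 1.0, 2.0, 4.0], capacity_kw=4.0)
predicted = [0.0, 0.5, 0.5, 0.5]  # hypothetical model output, already normalised

score = mean_absolute_error(predicted, actual)
print(score)  # 0.1875
```

Because the target is normalised by capacity, errors from small and large sites sit on the same 0 to 1 scale, which makes a single averaged score meaningful across sites.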
With so many different data sources to choose from, we encourage you to be creative and see how well you can do without necessarily using all of the data sources! Lighter models are ultimately easier to deploy in production.
The HRV satellite imagery data will seem very familiar to participants of ClimateHack.AI 2022, which had a future satellite imagery generation challenge based on that channel over the same year range.