ClimateWins

Machine Learning and Weather Prediction

Overview

Objective:

ClimateWins, a European nonprofit organization, is interested in using machine learning to help predict the consequences of climate change around Europe and, potentially, the world.

Role:

As a data analyst for ClimateWins, I will assess the tools available to categorize and predict the weather in Europe.

Tools:

Python, Jupyter

Data Set:

  • Observations from 18 weather stations across Europe, dating from the late 1800s to 2022

  • Values such as temperature, wind speed, snow, precipitation, global radiation and more

  • Collected by the European Climate Assessment & Data Set project

Hypothesis:

Machine-learning algorithms can be applied to the data to predict weather conditions.

Process for exploring hypothesis:

  • Clean and prepare weather data

  • Run an optimization algorithm

  • Apply various machine learning models, both supervised and unsupervised

  • Evaluate accuracy and usefulness of different models on the weather data set

  • Propose an effective method for applying machine learning to weather prediction

To better understand the structure of our data and optimize it for machine learning, we can run an optimization algorithm such as gradient descent on a feature of the data.

Here we are fitting a regression model by running gradient descent on daily mean temperatures of various stations for a chosen year.

Gradient Descent on Kassel Station, 1960:

  • Iterations: 200, step size: 0.1

  • Converges successfully

  • Minimum loss reached: 0.43

kassel1960loss.png

Kassel 1960 Loss Function
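
As a rough illustration of the procedure described above, the following sketch runs gradient descent for a simple linear regression with the same settings (200 iterations, step size 0.1). The data is synthetic and only stands in for one year of daily mean temperatures; the function names, the standardization step, and the loss definition (half the mean squared error) are assumptions made for this example, not the project's notebook code.

```python
import math
import random
import statistics


def standardize(values):
    """Rescale values to mean 0 and standard deviation 1."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def gradient_descent(x, y, step_size=0.1, iterations=200):
    """Fit y ~ theta0 + theta1 * x by minimizing half the mean squared error."""
    n = len(x)
    theta0, theta1 = 0.0, 0.0
    losses = []
    for _ in range(iterations):
        residuals = [theta0 + theta1 * xi - yi for xi, yi in zip(x, y)]
        # Record the loss at the current parameters before updating them.
        losses.append(sum(r * r for r in residuals) / (2 * n))
        grad0 = sum(residuals) / n
        grad1 = sum(r * xi for r, xi in zip(residuals, x)) / n
        theta0 -= step_size * grad0
        theta1 -= step_size * grad1
    return theta0, theta1, losses


# Synthetic stand-in for one year of daily mean temperatures (not station data).
random.seed(0)
days = list(range(365))
temps = [10 - 8 * math.cos(2 * math.pi * d / 365) + random.gauss(0, 2) for d in days]

x = standardize(days)
y = standardize(temps)
theta0, theta1, losses = gradient_descent(x, y)
print(f"first loss: {losses[0]:.3f}, final loss: {losses[-1]:.3f}")
```

Plotting `losses` against the iteration number gives a loss curve of the kind shown for Kassel 1960; a curve that flattens out indicates convergence for the chosen step size.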

Optimization

Supervised Learning

Preparation:

  • A labeled answers data set is provided to train the models to predict whether a given day is pleasant or not.

  • The data set is scaled to prevent the models from giving more weight to features with larger values.

  • The data is split, 70% for training, 30% for testing.

  • For this project, we applied the following models:

    • K-Nearest Neighbor (KNN)

    • Decision Tree

    • Artificial Neural Networks (ANN)
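
The preparation steps above can be sketched in plain Python: scale each feature, split the rows 70/30, then classify with a small K-Nearest Neighbor routine. Everything in this example is an assumption for illustration: the two features, the rule that generates the "pleasant" labels, the value of k, and all function names. It mirrors the order listed above (scale, then split) and is not the project's actual code.

```python
import math
import random
from collections import Counter


def standardize_columns(rows):
    """Scale every feature column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def train_test_split(X, y, test_fraction=0.3, seed=42):
    """Shuffle row indices and hold out test_fraction of the rows for testing."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(X) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return ([X[i] for i in train_idx], [X[i] for i in test_idx],
            [y[i] for i in train_idx], [y[i] for i in test_idx])


def knn_predict(X_train, y_train, point, k=3):
    """Return the majority label among the k training rows closest to point."""
    nearest = sorted(zip(X_train, y_train),
                     key=lambda pair: math.dist(pair[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(X_train, y_train, X_eval, y_eval, k=3):
    """Share of evaluation rows whose predicted label matches the true label."""
    hits = sum(knn_predict(X_train, y_train, p, k) == t
               for p, t in zip(X_eval, y_eval))
    return hits / len(y_eval)


# Synthetic days: [mean temperature, precipitation]; the label rule is hypothetical.
rng = random.Random(0)
X = [[rng.uniform(-5, 35), rng.uniform(0, 2)] for _ in range(200)]
y = [int(18 <= t <= 30 and p < 0.5) for t, p in X]

X_scaled = standardize_columns(X)
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y)

train_acc = accuracy(X_train, y_train, X_train, y_train)
test_acc = accuracy(X_train, y_train, X_test, y_test)
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

Comparing the two accuracies is a quick check on generalization: a training accuracy far above the test accuracy suggests the model is overfitting.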

Challenges:

  • Parameters: For each model, we must experiment with the hyperparameters to find the settings that give the highest accuracy.

  • Overfitting: The models overfit to one weather station (Sonnblick), which lowers the overall accuracy.

Results on Supervised Models:

tableSupModels.png

Why the ANN model works best in predicting current data:

  • Most accurate predictions on both training and testing sets.

  • Less overfitting than the decision tree model.

  • Room for improvement and experimentation with the parameters.

Unsupervised Learning

Neural Network Models

The success of the ANN model in the supervised learning exercise leads to an exploration of the Recurrent Neural Network (RNN) model in the unsupervised learning phase of the project. An RNN’s strength lies in handling temporal data.

The Long Short-Term Memory (LSTM) model is a variant of the RNN designed to retain information over longer sequences. The LSTM is trained to predict the same weather conditions as before: pleasant or unpleasant.
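
Recurrent layers in Keras expect input shaped as (samples, timesteps, features), so daily observations have to be grouped into sequences before training. The sketch below shows one possible way to do that with overlapping windows, each labeled with the answer for its last day. The window length, the labeling choice, and the function name are assumptions for illustration; the project's own preparation may differ.

```python
def make_windows(observations, labels, timesteps):
    """Group consecutive daily rows into overlapping windows.

    Each window holds `timesteps` consecutive days and is paired with the
    label of its last day, giving data shaped (samples, timesteps, features).
    """
    windows, targets = [], []
    for end in range(timesteps, len(observations) + 1):
        windows.append(observations[end - timesteps:end])
        targets.append(labels[end - 1])
    return windows, targets


# Ten hypothetical days with three features each (values are placeholders).
observations = [[float(day), day * 0.5, day % 2] for day in range(10)]
labels = [day % 2 for day in range(10)]

windows, targets = make_windows(observations, labels, timesteps=4)
print(len(windows), len(windows[0]), len(windows[0][0]))
```

The three printed numbers are the sample count, the timesteps per window, and the features per day, which together give the input shape a recurrent layer would receive.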

Here is the final Keras model layout for running LSTM on the weather data.

lstm.png

Challenge:

It is time-consuming and difficult to experiment with the endless variations of hyperparameters in neural network models. How do we efficiently find the optimal values to improve the performance of deep learning models?

Bayesian Optimization:

A Bayesian search over the hyperparameters produced a set of optimal values for running the model, improving on the accuracy of the pre-optimized model by 5%.

Proposal for ClimateWins

The following proposal draws on thought experiments about the possibilities of applying machine learning to prediction and on the study of various machine learning models throughout this project.

proposalTable.png

Key Insights:

Machine learning models are able to help predict weather conditions.

Neural network models work best for the ClimateWins dataset.

Complex deep learning models such as LSTM will need optimization algorithms to produce higher quality results.

Thought experiments can assist in creating a viable method for applying machine learning for extreme weather prediction.

Project Assessment:

Working with machine learning algorithms is a challenging yet exciting venture. This project has given me an insightful first look at the complex inner workings of various machine learning models. I’ve gained an understanding of which models are useful in different scenarios, and of how many possibilities are still to be explored.

As powerful as machine learning models are, I have also realized through this project how vital the quality and preparation of the data are to running a successful and useful model. In future projects, I would aim to always thoroughly clean and transform the data before modeling.

Conclusion

The project is available on GitHub.