This package contains example code to run and train GraphCast. It also provides three pretrained models:
GraphCast, the high-resolution model used in the GraphCast paper (0.25 degree resolution, 37 pressure levels), trained on ERA5 data from 1979 to 2017,
GraphCast_small, a smaller, low-resolution version of GraphCast (1 degree resolution, 13 pressure levels, and a smaller mesh), trained on ERA5 data from 1979 to 2015, useful to run a model with lower memory and compute constraints,
GraphCast_operational, a high-resolution model (0.25 degree resolution, 13 pressure levels) pre-trained on ERA5 data from 1979 to 2017 and fine-tuned on HRES data from 2016 to 2021. This model can be initialized from HRES data (it does not require precipitation inputs).
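To get a feel for the scale of those three checkpoints, a quick back-of-envelope calculation of the grid sizes helps. The lat/lon counts below are derived from the stated resolutions using the ERA5 grid convention (a global 0.25 degree grid is 721 x 1440 points, including both poles) - that convention is an assumption on my part, not something the README states.

```python
# Rough grid sizes for the three pretrained models described above.
# Resolutions and level counts are from the README; the lat/lon grid
# dimensions are derived assuming the ERA5 convention (poles included).

models = {
    "GraphCast":             {"res_deg": 0.25, "levels": 37},
    "GraphCast_small":       {"res_deg": 1.0,  "levels": 13},
    "GraphCast_operational": {"res_deg": 0.25, "levels": 13},
}

for name, spec in models.items():
    n_lat = int(180 / spec["res_deg"]) + 1   # both poles included
    n_lon = int(360 / spec["res_deg"])
    per_level = n_lat * n_lon
    print(f"{name}: {n_lat} x {n_lon} grid, "
          f"{per_level * spec['levels']:,} values per 3D variable")
```

That is roughly 38 million values per 3D variable for the full-resolution, 37-level model versus under a million for GraphCast_small, which is why the small model is the one to try first on modest hardware.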
Thank you for linking to this. I find it very interesting!
I think for ‘us’ there isn’t much hope of running this as a full model. The ERA5 dataset is about 5 petabytes so well beyond what most of us can afford!
In basic terms they are using a large volume of previous weather data to train an AI model. The model examines the data and tries to identify patterns of prior conditions that lead to particular future weather patterns.
So as an obviously ridiculous example, let's say it finds that on 100 occasions in April when the temperature decreases from 10C to 5C over a period of 5 hours there's 5mm of rain at the location in the next 5 hours. It will find lots of other hypotheses, likely very much more complicated than this, from the previous 80+ years of ERA5 data. Once you've done that, you then feed in the current conditions to the model to see if it can find any matches with its hypotheses. So if your current data shows the temperature decreasing from 10C to 5C in the last 5 hours then it will predict 5mm of rain in the next 5 hours.
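The toy example above can be sketched in a few lines of code. This is deliberately simplistic - a nearest-neighbour lookup over one invented feature, nothing like the learned graph network GraphCast actually uses - but it shows the "match current conditions against past cases" idea. All the numbers are made up.

```python
# Toy analogue of the idea above: learn a mapping from "temperature change
# over the last 5 hours" to "rain in the next 5 hours" from past cases,
# then match the current conditions against those cases.

def predict_rain(history, current_delta, k=3):
    """Average the rainfall of the k past cases whose temperature
    change most resembles the current one (nearest-neighbour lookup)."""
    nearest = sorted(history, key=lambda case: abs(case[0] - current_delta))[:k]
    return sum(rain for _, rain in nearest) / k

# (delta_temp_C, rain_mm) pairs -- invented "training" cases
past_cases = [(-5.0, 5.0), (-4.5, 4.0), (-1.0, 0.5), (0.0, 0.0), (2.0, 0.0)]

# Temperature fell 5C in the last 5 hours -> look up similar past cases.
print(predict_rain(past_cases, current_delta=-5.0))
```

A real model replaces the single hand-picked feature with full 3D fields of many variables, and the lookup with a learned, nonlinear function - but the train-then-match shape of the problem is the same.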
I’ve been thinking that someone would do this, and actually thought about how I might do it myself. I don’t have the maths or computing skills to do it though. It’s a very interesting way of using AI and unlike other potential uses I think it’s actually one that not many could argue is bad. There’s a lot of weather data and it’s very difficult for humans to sensibly look at huge volumes of data to spot the patterns. AI is very good at that.
One thing that could be interesting is training the model using local data. For example, I have many years of local weather data accumulated in a database and I could possibly train the model on that. I’d need to investigate whether the model is suitable for training with the kind of data that I have first.
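As a starting point for that kind of local experiment, something as simple as a one-feature linear fit over a station export is enough to see whether there is any signal at all. The CSV layout and column names below are assumptions standing in for whatever the database export actually looks like; the data is invented.

```python
# Minimal sketch of "train on your own station data": fit a linear model
# predicting the next hour's temperature from the current one.
# The CSV text is a stand-in for a real export from the station database.

import csv, io

csv_text = """timestamp,temp_c,pressure_hpa
2024-04-01T00:00,10.0,1012.0
2024-04-01T01:00,9.1,1010.5
2024-04-01T02:00,8.4,1009.0
2024-04-01T03:00,7.6,1007.8
2024-04-01T04:00,7.0,1006.9
"""

rows = list(csv.DictReader(io.StringIO(csv_text)))
temps = [float(r["temp_c"]) for r in rows]

# One-feature least squares: next temperature as a function of the current one.
xs, ys = temps[:-1], temps[1:]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
intercept = my - slope * mx

print(f"next_temp ~ {slope:.2f} * temp + {intercept:.2f}")
print(f"forecast for next hour: {slope * temps[-1] + intercept:.2f} C")
```

Whether GraphCast itself can be fine-tuned on single-station data is a separate question (it expects gridded global fields), but simple baselines like this are a cheap way to find out what a local dataset can support.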
Digging a bit further, there are some trained models available, but they seem to be for certain US locations, although I can’t see anything that says which locations.
There are also references to the computing power you'd need to use the models, e.g. TPUv4 or NVIDIA A100. A 512-core TPUv4 system costs about $2.1M/year, or you could rent a 4-core TPUv4 'system' from Google for about $13/hour.
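Those two quoted prices are worth putting side by side. Taking the $13/hour and $2.1M/year figures from above at face value, a quick calculation shows how far the rental option stretches:

```python
# Back-of-envelope comparison of the two quoted prices: renting a small
# 4-core TPUv4 slice versus a dedicated 512-core system.

rental_rate = 13.0          # USD/hour, 4-core TPUv4 (quoted above)
dedicated_cost = 2_100_000  # USD/year, 512-core system (quoted above)

full_year = rental_rate * 24 * 365
print(f"4-core rental, running 24/7 for a year: ${full_year:,.0f}")

# How many rental hours before you'd match the dedicated system's price?
break_even_hours = dedicated_cost / rental_rate
print(f"break-even vs the dedicated system: {break_even_hours:,.0f} hours")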
Ah, good to see interest in this!
I follow the news and replies on Tweakers as they are IT professionals, but the discussion about weather forecasts goes beyond IT knowledge.
Please feel free to translate!
Isn’t current AI mostly machine learning? You teach a system how to behave by throwing lots of previous data at it. Brute-forcing chess is quite different…that’s just trying all possible moves and seeing which has the best outcome. Playing back all Grandmaster games into a machine learning system, to identify the best move to make in a given situation without having to brute force the 500 billion moves possible in the next ‘x’ turns, is a very different beast. You could argue that the current GFS forecasting system is like brute-force forecasting…they run the model many times to generate ensembles and then try to figure out which ensembles will best match the weather in the next 10+ days.
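The ensemble idea mentioned above - run the model repeatedly from slightly perturbed starting conditions and look at how the runs diverge - can be shown with a toy stand-in. The step function here is an invented caricature, not any real NWP dynamics; only the run-many-perturbed-members-and-measure-spread structure mirrors what systems like GFS do.

```python
# Toy ensemble forecast: run the same simple "model" from slightly
# perturbed initial temperatures and look at the spread of the outcomes.

import random

def step(temp):
    # Invented toy dynamics: slow cooling with a mild damping term.
    return temp - 0.2 - 0.01 * temp

def run_member(temp0, n_steps=48):
    t = temp0
    for _ in range(n_steps):
        t = step(t)
    return t

random.seed(0)  # reproducible perturbations
members = [run_member(10.0 + random.gauss(0, 0.5)) for _ in range(20)]

mean = sum(members) / len(members)
spread = (sum((m - mean) ** 2 for m in members) / len(members)) ** 0.5
print(f"ensemble mean: {mean:.2f} C, spread: {spread:.2f} C")
```

In a real system the spread is the useful part: a tight ensemble means the forecast is trustworthy further out, a wide one means the atmosphere is in a regime where small initial errors grow quickly.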
Forecasting is just using the knowledge we’ve gained about weather patterns, and environmental physics, to try to predict future weather patterns. I suspect that weather AI/machine learning will probably mimic a lot of what we already know but may throw in a few interesting additions that nobody has noticed in the huge volume of data that exists.
I see that CERN are also resorting to AI/machine learning to handle the data produced by the LHC. That seems to be driven by the huge volumes of data being produced that the scientists can’t cope with.
If we called this ‘very clever programming to identify hidden patterns in data’ then we’d applaud the very clever programmers. I think the term AI is putting people off what is actually some very clever programming, because it makes people think we’re going to be ruled by computers in the near future (as if we aren’t already!). When I started a new job in 1986, one of the first tasks I was given was to extract data from some big operational systems to put into an ‘Information Centre’. The idea was that the data could be manipulated more easily in our own system than in a fixed-format operational system. We could programmatically slice and dice the data in different ways to identify patterns in it that couldn’t be seen in the standard reports provided by the operational systems. We didn’t worry that would make us all data slaves…at least I don’t think we did.
Just yesterday I wanted to ask here whether anyone was trying to find correlations in their weather stations’ data and, based on those, predict typical weather patterns, changes, etc. that follow a given combination of values (pressure change, dew point)! It is basically an algorithm (AI?) that does such analysis - OP was faster than me
I believe this is an interesting field, and one I’m not really “afraid” of (AI coming after us). In the end, the weather goes its own way.