Bohler 2023-10-09-18z - Bad data

Hi, @Tom_Ehrensperger @samuelbohler @administrator
All three Bohler 2023-10-09-18z run files (Lite/Data/Adv) had bad data at the +90 hour line.
I had dropped these checks from the new auto email API posts, as I thought the problem had been fixed.
It seems not, so I am manually posting this.
I will email Tom and Sam after posting this.
I have saved and attached all Bohler and McMahon 2023-10-09-18z run files for others to cross-reference if desired. I have also added the ECMWF files for good measure.

Bohler-Adv.txt (1.1 MB)
Bohler-Data.txt (9.8 KB)
Bohler-Lite.txt (4.3 KB)
ecmwf-Adv.txt (357.2 KB)
ecmwf-Data.txt (2.8 KB)
McMahon-Adv.txt (1.1 MB)
McMahon-Data.txt (9.8 KB)

Note: Looking back through my auto emails, it seems this has occurred on and off over the past couple of weeks. However, it happened during my night time (while I was asleep), so I was not able to validate the data in real time.
Kindest Regards,

Note: Bohler 2023-10-10-00z is all good.

Since all the +90h values in the advection file are bad, the data suggests that the whole file containing the +90h data was corrupt in some way. The data is packed in such a way that if even one byte is missed, or an extra byte is inserted, in an 8 MB file, then the entire data from start to end is likely to be corrupted. I had problems with this kind of issue in the early days of my script, and I put some checks in place to try to ensure that each downloaded GRIB2 file is good, e.g. checking the length of the file to make sure there’s no data missing or added. If I find a bad file, I restart the entire process in the hope that the bad file was just a corruption in transit. I have had occasions in the past where the bad file seemed to be bad at source, or where multiple random files were corrupt during serious network problems, so I still didn’t get a good download after trying multiple times.
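For anyone who wants to try this kind of check, here is a minimal Python sketch. It is not taken from either script: the function names `grib2_looks_intact` and `fetch_with_retries`, the `expected_length` parameter (for example, a size reported by the server) and the retry count are all hypothetical. It relies only on the standard GRIB2 framing, where each message starts with a 16-byte section holding `GRIB`, the edition number and the total message length, and ends with `7777`. It also assumes the file is nothing but back-to-back messages with no padding, and it retries a single file rather than restarting the whole run.

```python
import struct
from typing import Callable, Optional


def grib2_looks_intact(data: bytes, expected_length: Optional[int] = None) -> bool:
    """Return True if data is a whole number of well-framed GRIB2 messages.

    Assumption: the file contains only concatenated messages, no padding.
    """
    # Optional length check against a size known from elsewhere (hypothetical).
    if expected_length is not None and len(data) != expected_length:
        return False
    pos = 0
    while pos < len(data):
        header = data[pos:pos + 16]
        # Section 0: "GRIB", 2 reserved bytes, discipline, edition, 8-byte length.
        if len(header) < 16 or header[:4] != b"GRIB" or header[7] != 2:
            return False
        (msg_len,) = struct.unpack(">Q", header[8:16])
        end = pos + msg_len
        # Every message must fit in the file and finish with the "7777" marker.
        if msg_len < 20 or end > len(data) or data[end - 4:end] != b"7777":
            return False
        pos = end
    return len(data) > 0


def fetch_with_retries(fetch: Callable[[], bytes],
                       expected_length: Optional[int] = None,
                       attempts: int = 3) -> bytes:
    """Call fetch() until the result passes the framing check, or give up."""
    for _ in range(attempts):
        data = fetch()
        if grib2_looks_intact(data, expected_length):
            return data
    raise IOError(f"no intact GRIB2 file after {attempts} attempts")
```

A check like this should catch a missing or extra byte, because the end marker no longer sits where the length field says it should. It would not catch bytes that were altered without changing the length, and it cannot help when the file is already bad at source.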

Sam’s script was developed completely independently of mine, so I’m making assumptions about how he’s downloading and processing data, but I’m pretty sure the download and extraction of data must be somewhat similar.

Hi Chris,
Thanks for the detailed explanation, appreciated.
Hopefully @samuelbohler gets to cast an eye over it too.
Kindest Regards,