The input (conditioning) dataset is taken from the ERA5 reanalysis, a global dataset with a spatial resolution of about 25 km and a temporal resolution of 1 h. The target dataset used in this study is a subset of the proprietary RWRF model data (Radar Data Assimilation with WRFDA). The RWRF model is one of the operational regional Numerical Weather Prediction (NWP) models developed by CWA, and it focuses on radar Data Assimilation (DA) in the vicinity of Taiwan. The WRF-CWA system uses a 2 km domain nested inside a 10 km domain, which takes its boundary conditions from a global model (GFS).
To facilitate training, we interpolate the input data onto the curvirectangular grid of CWA using bilinear interpolation (with a rate of 4x), which results in 36 × 36 pixels over the region of Taiwan. Each sample in the input dataset consists of 20 channels: four variables (temperature, the eastward and northward components of the horizontal wind vector, and geopotential height) at each of four pressure levels (500 hPa, 700 hPa, 850 hPa, and 925 hPa), which gives 16 channels, plus four single-level channels (2 m temperature, the two components of the 10 m wind vector, and total column water vapor).
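The channel count above can be checked with a short sketch. The channel names and their ordering below are hypothetical and chosen only for illustration; the text fixes the variables and pressure levels but not how they are labeled or stacked.

```python
# Hypothetical labels; the source specifies the variables, not these names.
PRESSURE_LEVELS_HPA = [500, 700, 850, 925]
PRESSURE_LEVEL_VARS = ["temperature", "u_wind", "v_wind", "geopotential_height"]
SINGLE_LEVEL_VARS = ["t2m", "u10", "v10", "tcwv"]

# One channel per (variable, pressure level) pair: 4 x 4 = 16 channels.
channels = [
    f"{var}_{level}hPa"
    for var in PRESSURE_LEVEL_VARS
    for level in PRESSURE_LEVELS_HPA
]

# Single-level fields: 2 m temperature, two 10 m wind components,
# and total column water vapor add 4 more channels.
channels += SINGLE_LEVEL_VARS

# Shape of one input sample as (channels, height, width) on the 36 x 36 grid.
# The channels-first layout is an assumption.
input_shape = (len(channels), 36, 36)

print(len(channels), input_shape)
```

Note that the 10 m wind vector contributes two channels, which is what brings the total to 20.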
The target dataset covers 52 months, from January 2018 to April 2022, with a temporal frequency of one hour and a spatial resolution of 2 km. We use only the inner (nested) 2 km domain, which has 448 × 448 pixels in a Lambert conformal conic projection around Taiwan. The dataset spans approximately 116.371°E to 125.568°E in longitude and 19.5483°N to 27.8446°N in latitude. We select 4 channels as target variables, those most relevant for practical forecasting: temperature at 2 m above the surface, the two horizontal wind components at 10 m above the surface, and the 1 h maximum radar reflectivity, a surrogate for expected precipitation. Notably, the radar reflectivity channel is not present in the input data and must be predicted from the other channels, which makes its prediction strictly generative.
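As a rough sanity check on the numbers above, the following sketch computes the number of months, an upper bound on the number of hourly samples, and the approximate domain width. It assumes the record runs from the first hour of January 2018 through the last hour of April 2022 with no gaps, which the text does not state; the channel names are again hypothetical.

```python
from datetime import datetime

# Assumed period boundaries (end is exclusive); exact timestamps are not given.
start = datetime(2018, 1, 1)
end = datetime(2022, 5, 1)

# Number of calendar months covered.
n_months = (end.year - start.year) * 12 + (end.month - start.month)

# Upper bound on hourly samples, assuming no missing hours.
max_hourly_samples = int((end - start).total_seconds() // 3600)

# Hypothetical labels for the 4 target channels.
TARGET_CHANNELS = ["t2m", "u10", "v10", "max_radar_reflectivity_1h"]
target_shape = (len(TARGET_CHANNELS), 448, 448)

# Approximate side length of the inner domain: 448 pixels at 2 km each.
domain_width_km = 448 * 2

print(n_months, max_hourly_samples, target_shape, domain_width_km)
```

The resulting width of roughly 896 km is close to the extent of the 36 × 36 input grid at about 25 km spacing (roughly 900 km), so the two grids describe a similar region.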