POD - Sebastian Friedemann PhD Defense

Prediction of chaotic and non-linear systems such as the weather or the groundwater cycle
relies on a continuous fusion of sensor data (observations) with numerical models to
select good system trajectories and to compensate for non-linear feedback effects.
Ensemble-based data assimilation (DA) is a major method for this purpose. It relies
on the propagation of an ensemble of perturbed model realizations (members) that
is enriched by the integration of observation data. Performing DA at large scale to
capture continental to global geospatial effects, while running at high resolution to
accurately predict small-scale impacts, is computationally demanding. This requires
supercomputers leveraging hundreds of thousands of compute nodes, interconnected
via high-speed networks. Efficiently scaling DA algorithms to such machines requires
carefully designed, highly parallelized workflows that avoid overloading shared resources.
Fault tolerance is important too, since the probability of hardware and numerical
faults increases with the amount of resources and the number of ensemble members.
Existing DA frameworks either use the file system as intermediate storage to provide a
fault-tolerant and elastic workflow, which, at large scale, is slowed down by file system
overload, or run large monolithic jobs that suffer from intrinsic load imbalance and are
very sensitive to numerical and hardware faults. This thesis elaborates on a highly parallel,
load-balanced, elastic, and fault-tolerant solution that enables running statistical,
ensemble-based DA efficiently at large scale. We investigate two classes of DA algorithms, the
ensemble Kalman filter (EnKF) and the particle filter with sequential importance
resampling (SIR), and validate our framework under realistic conditions. Groundwater
sensor data is assimilated using a regional hydrological simulation leveraging the ParFlow
model. We efficiently run EnKF with up to 16,384 members on 16,240 compute cores
for this purpose. A comparison with an existing state-of-the-art solution on the same
domain, running 2,500 members on 20,000 cores, shows that our approach is about
50 % faster. We also present performance improvements when running the particle filter with
SIR at large scale. These experiments assimilate cloud coverage observations into
2,555 members, i.e., particles, running the Weather Research and Forecasting (WRF)
model over the European domain. To manage the many experiments performed on
various supercomputers, we developed a specific setup that we also present.
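
As a rough illustration of the SIR idea named in the abstract (not the implementation used in the thesis), the following minimal Python sketch performs one analysis step on scalar particles. It assumes the observation measures the state directly and has Gaussian error, and it uses plain multinomial resampling. The function `sir_step` and all of its parameters are hypothetical names chosen for this example.

```python
import math
import random


def sir_step(particles, observation, obs_std, rng):
    """One sequential importance resampling step for scalar states.

    particles:   list of floats, the propagated member states
    observation: observed value (assumed to observe the state directly)
    obs_std:     standard deviation of the assumed Gaussian observation error
    rng:         random.Random instance used for resampling
    """
    # Importance weights: Gaussian log-likelihood of the observation per particle.
    log_w = [-0.5 * ((observation - x) / obs_std) ** 2 for x in particles]
    max_log_w = max(log_w)  # subtract the maximum for numerical stability
    weights = [math.exp(lw - max_log_w) for lw in log_w]
    total = sum(weights)
    weights = [w / total for w in weights]

    # Multinomial resampling: draw N particles with probability equal to their weight.
    resampled = rng.choices(particles, weights=weights, k=len(particles))
    return resampled, weights


# Toy usage: a broad prior ensemble is pulled towards a single observation.
rng = random.Random(0)
prior = [rng.gauss(0.0, 2.0) for _ in range(1000)]
posterior, weights = sir_step(prior, observation=1.5, obs_std=0.5, rng=rng)
prior_mean = sum(prior) / len(prior)
posterior_mean = sum(posterior) / len(posterior)
```

In this toy setting the resampled ensemble concentrates near the observation, because particles with low likelihood are rarely drawn. At the scale described in the abstract, each particle would be a full model state rather than a single number.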

Keywords: Data Assimilation, Ensemble Based, In Situ Processing, EnKF, Particle
Filter, High Performance Computing

Added by: Millian Poquet
Last updated: 4 July 2022, 15:22
Type: Other
Main language: English
Discipline(s):
- Computer Science, Mathematics, Information and Communication Sciences and Technologies

Sebastian Friedemann PhD defense - Ensemble-based Data Assimilation for Large Scale Simulations
