Journal of Hydrogeology & Hydrologic Engineering

Exploring the Role of Domain Partitioning on Efficiency of Parallel Distributed Hydrologic Model Simulations

Mukesh Kumar and Christopher J Duffy

Spatially distributed hydrologic models of watersheds and river basins are data- and computation-intensive because they combine hydrodynamics, complex forcings, and heterogeneous parameter fields. Application of these models at fine temporal and spatial resolutions, and on large problem domains, is facilitated by parallel computation on multiprocessor clusters. The computational efficiency of parallel simulations depends critically on how efficiently data are divided and distributed in a multiprocessor environment and on how information is shared between processors. While numerous data partitioning algorithms exist and have been studied extensively in the computer science literature, the role of hydrologic model structure in data partitioning has not yet been examined in detail. In addition, the relative roles of computational load balance and interprocessor communication in the parallel efficiency of a hydrologic model are not known. Using the unstructured domain discretization scheme of the PIHM hydrologic model as an example, the paper first presents a generic methodology for incorporating hydrologic factors in optimal domain partitioning algorithms. The partitions are then used to isolate the roles of computational load balance and interprocessor communication in parallel efficiency. Results confirm that parallel simulations on partitions that minimize interprocessor communication and divide the computational load equally are the most efficient. More importantly, load balance between processors is observed to be a more sensitive control on parallel efficiency than minimization of interprocessor communication. Further analyses of the efficiency and scalability of the parallel code for different partitioning configurations reveal a direct correspondence between parallel efficiency and theoretical metrics such as the load balance ratio and the communication-to-computation ratio. Results indicate that these theoretical metrics can be used to select the best partitions before computationally intensive parallel simulations are performed. The study serves as a proof-of-concept evaluation of the impact of computation and communication on the efficiency of parallelized distributed hydrologic models at multiple resolutions.
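To make the two theoretical metrics concrete, the following Python sketch scores candidate partitions of a small mesh before any simulation is run. The abstract does not give the paper's formulas, so the definitions here are assumptions: the load balance ratio is taken as mean processor load divided by maximum processor load (1.0 is perfectly balanced), and the communication-to-computation ratio is taken as the worst per-processor ratio of cut edges to computational load. The function name `partition_metrics` and the six-element chain mesh are hypothetical and purely illustrative.

```python
from collections import defaultdict


def partition_metrics(weights, edges, part):
    """Score a partition (assumed metric definitions, not the paper's).

    weights: dict mapping element id -> computational weight
    edges:   iterable of (u, v) pairs of neighboring elements
    part:    dict mapping element id -> processor id
    Returns (load_balance_ratio, comm_to_comp_ratio).
    """
    # Total computational load assigned to each processor.
    load = defaultdict(float)
    for elem, w in weights.items():
        load[part[elem]] += w

    # An edge whose endpoints sit on different processors is a cut edge;
    # it is counted once for each of the two processors that share it.
    comm = defaultdict(int)
    for u, v in edges:
        if part[u] != part[v]:
            comm[part[u]] += 1
            comm[part[v]] += 1

    mean_load = sum(load.values()) / len(load)
    max_load = max(load.values())
    load_balance_ratio = mean_load / max_load  # assumed definition

    # Assumed definition: worst per-processor communication per unit load.
    comm_to_comp_ratio = max(comm[p] / load[p] for p in load)
    return load_balance_ratio, comm_to_comp_ratio


# Hypothetical mesh: six unit-weight elements connected in a chain.
weights = {i: 1.0 for i in range(6)}
edges = [(i, i + 1) for i in range(5)]

balanced = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}     # equal load, one cut edge
unbalanced = {0: 0, 1: 1, 2: 1, 3: 1, 4: 1, 5: 1}   # unequal load, one cut edge
interleaved = {0: 0, 1: 1, 2: 0, 3: 1, 4: 0, 5: 1}  # equal load, every edge cut

for name, p in [("balanced", balanced),
                ("unbalanced", unbalanced),
                ("interleaved", interleaved)]:
    print(name, partition_metrics(weights, edges, p))
```

Under these assumed definitions, the contiguous balanced split scores best on both metrics. The unbalanced split has the same number of cut edges but a lower load balance ratio, and the interleaved split keeps the load equal but cuts every edge. Comparing candidate partitions this way is the kind of pre-simulation screening the abstract describes.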