Job Coscheduling on Coupled High-End Computing Systems

2011 International Conference on Parallel Processing, 2011
Supercomputer centers often deploy large-scale computing systems together with an associated data analysis or visualization system. In this paper, we propose a coscheduling mechanism that coordinates execution between jobs on different systems. The mechanism is built on top of a lightweight protocol for coordination between policy domains without manual intervention. We have evaluated this system using real job traces from Intrepid and Eureka, the production Blue Gene/P and data analysis systems, respectively, deployed at Argonne National Laboratory. Our experimental results quantify the costs of coscheduling and demonstrate that coscheduling can be achieved with limited impact on system performance under varying workloads.