Paper Review: Efficient Coflow Scheduling Without Prior Knowledge

In this paper, the authors propose an approach to efficient coflow scheduling without prior knowledge. Inter-coflow scheduling improves application-level communication performance in data-parallel clusters. However, existing efficient schedulers require prior coflow information, which limits their applicability, while schedulers without prior knowledge compromise on performance to avoid head-of-line blocking.

The paper discusses prior inter-coflow schedulers. Baraat and Orchestra are FIFO-based and compromise on performance by multiplexing coflows to avoid head-of-line blocking. Another scheduler, Varys, improves performance using heuristics such as smallest-bottleneck-first and smallest-total-size-first, but it assumes complete prior knowledge of each coflow's number of flows, their sizes, and their endpoints.

The authors present Coflow-Aware Least-Attained Service (CLAS), which minimizes the average coflow completion time (CCT) without any prior knowledge. CLAS generalizes the classic least-attained service scheduling discipline to coflows. Instead of considering the number of bytes sent by each flow independently, CLAS takes into account the total number of bytes sent by all the flows of a coflow. As a result, smaller coflows get higher priority than larger ones, which brings down the average CCT.
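To make the difference between per-flow and per-coflow accounting concrete, here is a minimal sketch of the CLAS ordering rule as described above. It is not the authors' implementation: the Coflow class, the flow names, and the byte counts are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Coflow:
    """Hypothetical record of a coflow and the bytes each of its flows has sent."""
    name: str
    bytes_sent_per_flow: dict = field(default_factory=dict)

    def attained_service(self) -> int:
        # CLAS looks at the total over all flows of the coflow,
        # not at each flow in isolation.
        return sum(self.bytes_sent_per_flow.values())


def clas_order(coflows):
    """Return coflows from highest to lowest priority (least bytes sent first)."""
    return sorted(coflows, key=lambda c: c.attained_service())


# Illustrative numbers only (arbitrary byte units).
coflows = [
    Coflow("shuffle-A", {"f1": 400, "f2": 300}),            # total 700
    Coflow("shuffle-B", {"f1": 50, "f2": 20, "f3": 10}),    # total 80
    Coflow("broadcast-C", {"f1": 500}),                     # total 500
]

order = [c.name for c in clas_order(coflows)]
print(order)
```

Note that broadcast-C is ranked ahead of shuffle-A even though its single flow (500) has sent more than either individual flow of shuffle-A (400 and 300). A per-flow least-attained-service scheduler would have ranked those flows the other way around.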

The resulting system, Aalo, employs Discretized Coflow-Aware Least-Attained Service (D-CLAS) to separate coflows into a small number of priority queues based on how much they have already sent across the cluster. By prioritizing across queues and scheduling coflows in FIFO order within each queue, Aalo’s non-clairvoyant scheduler can schedule diverse coflows and minimize their completion times.
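The queue structure described above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the number of queues, the 10 MB first threshold, and the 10x spacing between thresholds are parameters chosen for this example, not values taken from this review.

```python
MB = 1024 ** 2


def queue_index(bytes_sent, first_threshold=10 * MB, multiplier=10, num_queues=4):
    """Map a coflow's total bytes sent to a priority queue (0 = highest).

    Thresholds are assumed to grow geometrically; all defaults are illustrative.
    """
    threshold = first_threshold
    for q in range(num_queues - 1):
        if bytes_sent < threshold:
            return q
        threshold *= multiplier
    return num_queues - 1  # everything larger lands in the last queue


def dclas_order(coflows):
    """Order coflows by queue first, then FIFO (arrival time) within a queue.

    Each coflow is a hypothetical (name, arrival_time, bytes_sent) tuple.
    """
    return sorted(coflows, key=lambda c: (queue_index(c[2]), c[1]))


coflows = [
    ("A", 0, 500 * MB),  # queue 2
    ("C", 1, 8 * MB),    # queue 0, arrived before B
    ("B", 2, 2 * MB),    # queue 0
    ("D", 3, 40 * MB),   # queue 1
]

schedule = [name for name, _, _ in dclas_order(coflows)]
print(schedule)
```

Within the highest-priority queue, C is served before B because it arrived first, even though it has sent more bytes. The exact byte counts only matter when they move a coflow across a queue threshold.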

The authors deployed Aalo on EC2 and also ran trace-driven simulations. The results are encouraging: they report that communication stages complete 1.93X faster on average and 3.59X faster at the 95th percentile with Aalo than with per-flow mechanisms. Aalo’s performance is comparable to that of solutions that use prior knowledge, and it outperforms them in the presence of cluster dynamics.

There are some drawbacks to this paper. First, it treats every coflow as having the same priority, a limitation discussed in Karuna [1]. Second, it assumes that all distributed data-parallel applications in the cluster use the same coflow API, which might not always be the case; this is discussed in CODA [2]. Third, the scheduler's design is rigid: it supports only one point in the bandwidth allocation policy design space, whereas operators ideally want a transport that can be tuned to different points in that space depending on workload requirements. This is discussed in NUMFabric [3].

References:

[1] Li Chen, Kai Chen, Wei Bai, Mohammad Alizadeh. Scheduling Mix-flows in Commodity Datacenters with Karuna

[2] Hong Zhang, Li Chen, Bairen Yi, Kai Chen, Mosharaf Chowdhury, Yanhui Geng. CODA: Toward Automatically Identifying and Scheduling COflows in the Dark

[3] Kanthi Nagaraj, Dinesh Bharadia, Hongzi Mao, Sandeep Chinchali, Mohammad Alizadeh, Sachin Katti. NUMFabric: Fast and Flexible Bandwidth Allocation in Datacenters

Link to the paper: https://conferences.sigcomm.org/sigcomm/2015/pdf/papers/p393.pdf
