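These are notes on the paper about Alibaba Cloud's Fuxi platform. A few points of interest:
Fuxi: a resource management and job scheduling system. (My impression is that it is built on YARN; it looks a lot like YARN.)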
1, an incremental resource management protocol
2, user-transparent failure recovery
3, an effective faulty-node detection mechanism and a multi-level blacklisting scheme
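
To make point 1 concrete, here is a minimal sketch of what an incremental protocol could look like, assuming that "incremental" means an application sends only the change in its resource demand while the master keeps the running total. The class DemandTable, the method apply_delta, and the job id are hypothetical names for illustration, not Fuxi's actual API.

```python
from collections import defaultdict


class DemandTable:
    """Hypothetical master-side bookkeeping of outstanding demand per application."""

    def __init__(self):
        # app_id -> number of containers the application still wants
        self.demand = defaultdict(int)

    def apply_delta(self, app_id, delta):
        """Apply an incremental change: +n means n more wanted, -n means n fewer."""
        new_value = self.demand[app_id] + delta
        if new_value < 0:
            raise ValueError("demand cannot go negative")
        self.demand[app_id] = new_value
        return new_value


# The application states its demand once, then sends only changes.
master = DemandTable()
master.apply_delta("job-1", +10)  # initial request
master.apply_delta("job-1", -3)   # three containers are no longer needed
```

The design choice in this sketch is that each message stays small however large the total demand is, because the full request is never resent.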
Fuxi (FuxiMaster, AppMaster, Tubo) <> YARN (ResourceManager, ApplicationMaster, NodeManager)
Differences between Fuxi and YARN:
1, Fuxi separates the notion of a task (the application process that performs the actual work) from that of a container (the unit of resource grant). Once an application master receives a grant, it explicitly controls the container's life cycle and may reuse the container to run multiple tasks.
2, Locality-tree-based scheduling.
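
The task/container separation in difference 1 can be sketched as follows. This is only an illustration under assumed names (Container, run_tasks_in_container, task_a, task_b are all hypothetical): the application master holds one granted container and runs several tasks in it back to back instead of requesting a new grant per task.

```python
from collections import deque


class Container:
    """A granted unit of resource (hypothetical type, not Fuxi's real one)."""

    def __init__(self, container_id):
        self.container_id = container_id
        self.tasks_run = []  # names of tasks executed in this container

    def run(self, task):
        # Record the task, then execute it inside this container.
        self.tasks_run.append(task.__name__)
        return task()


def run_tasks_in_container(container, tasks):
    """Application-master side: keep the grant and reuse it for every task."""
    pending = deque(tasks)
    results = []
    while pending:
        results.append(container.run(pending.popleft()))
    return results


def task_a():
    return "a done"


def task_b():
    return "b done"


container = Container("c-1")
results = run_tasks_in_container(container, [task_a, task_b])
```

In this sketch the container outlives any single task, which is the distinction the note draws: the grant is the resource, and the tasks are the work scheduled onto it.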