Parallel Failure Recovery Techniques in Cluster-based Media Servers

[Conference] Parallel Failure Recovery Techniques in Cluster-based Media Servers

2012.07.06 10:56


Cluster servers have attracted attention for large-scale video-on-demand (VOD) service because they offer high performance at low cost. A cluster server consists of a front-end node and multiple backend nodes. Adding backend nodes lets the server support more quality-of-service (QoS) streams, but the probability of a backend node failure increases proportionally. Such a failure not only interrupts the streaming services but also loses the current playing positions. This paper studies recovery mechanisms that keep the streaming service running when a backend node fails. Because they do not take the characteristics of cluster-based servers and MPEG media into account, basic redundant array of independent disks (RAID) techniques cause a bottleneck on the internal network path and use the CPUs of the backend nodes inefficiently. To address these problems, a new failure recovery mechanism based on the pipeline computing concept is proposed. The proposed method not only distributes the internal network traffic generated by the recovery operations but also uses the CPU time available in the backend nodes. In the experiments, even when a backend node fails, the proposed method provides continuous streaming media service with a short MTTR and supports more QoS streams than the existing method.
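The abstract does not spell out the recovery algorithm, so the following is only a minimal sketch under stated assumptions: that the redundancy is RAID-style XOR parity across backend nodes, and that "pipeline computing" means each surviving node folds its local block into a partial result and forwards it to the next node. All function names (`xor_blocks`, `make_parity`, `recover_centralized`, `recover_pipelined`) are hypothetical and are not taken from the paper. The sketch contrasts how many blocks a single node must receive in each scheme, which is one way to picture the internal network bottleneck mentioned above.

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length blocks byte by byte."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(data_blocks: list[bytes]) -> bytes:
    """Parity block of a stripe: the XOR of all its data blocks."""
    return reduce(xor_blocks, data_blocks)


def recover_centralized(surviving: list[bytes]) -> tuple[bytes, int]:
    """Assumed baseline: one node gathers every surviving block and XORs them.

    Returns the rebuilt block and the number of blocks that single
    node had to receive over the internal network.
    """
    return reduce(xor_blocks, surviving), len(surviving)


def recover_pipelined(surviving: list[bytes]) -> tuple[bytes, int]:
    """Assumed pipeline: each node XORs its local block into a partial
    result and forwards only that partial result to the next node.

    Returns the rebuilt block and the largest number of blocks any
    single node had to receive (always one partial result).
    """
    partial = surviving[0]            # first node starts with its own block
    for local_block in surviving[1:]:
        partial = xor_blocks(partial, local_block)  # work done on the next node
    return partial, 1


# Hypothetical stripe: three data blocks plus one parity block.
stripe = [b"AAAA", b"BBBB", b"CCCC"]
parity = make_parity(stripe)

# Suppose the node holding stripe[1] fails; the rest survive.
surviving_blocks = [stripe[0], stripe[2], parity]
rebuilt_central, inbound_central = recover_centralized(surviving_blocks)
rebuilt_pipe, inbound_pipe = recover_pipelined(surviving_blocks)
```

Both functions rebuild the same block; the difference in this toy model is that the centralized variant concentrates all inbound traffic and XOR work on one node, while the pipelined variant spreads both across the surviving nodes. Real behavior would also depend on block sizes, streaming deadlines, and MPEG frame structure, none of which are modeled here.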
