Allocation Optimization for the ATLAS Rebalancing Data Service

Abstract

The distributed data management system Rucio manages all data of the ATLAS collaboration across the grid. Automated operations such as replication and rebalancing are important for minimising workflow execution times. This paper proposes a new rebalancing algorithm based on machine learning. The algorithm is modular and can run independently of the existing rebalancing mechanism. Running in the background, it collects data from other services and learns which placements are optimal. Periodically, the learning agent selects a subset of the global datasets and proposes them for redistribution to reduce waiting times. Users can accept, decline, or override each dataset placement suggestion. Accepted datasets are shifted continuously between destination data centres by a background service that takes network and storage utilisation into account.
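
The review-and-transfer step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not use the Rucio API: the names `Suggestion`, `Decision`, `review`, and `schedule`, the site labels, and the utilisation thresholds are all hypothetical, and the learning agent that produces the suggestions is left out entirely.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    """User responses to a placement suggestion (hypothetical names)."""
    ACCEPT = "accept"
    DECLINE = "decline"
    OVERRIDE = "override"


@dataclass
class Suggestion:
    """One proposed dataset move; all field names are assumptions."""
    dataset: str
    source: str
    destination: str


def review(suggestion, decision, override_destination=None):
    """Apply a user decision and return the suggestion to queue, or None."""
    if decision is Decision.DECLINE:
        return None
    if decision is Decision.OVERRIDE:
        if override_destination is None:
            raise ValueError("override requires a destination")
        # Keep the dataset and source, replace only the destination.
        return Suggestion(suggestion.dataset, suggestion.source,
                          override_destination)
    return suggestion


def schedule(accepted, network_load, storage_used,
             max_network=0.8, max_storage=0.9):
    """Return accepted suggestions whose destination has headroom.

    Loads are fractions in [0, 1]; the thresholds are assumed values.
    A destination with no reported load is treated as full and skipped.
    """
    ready = []
    for s in accepted:
        net = network_load.get(s.destination, 1.0)
        disk = storage_used.get(s.destination, 1.0)
        if net < max_network and disk < max_storage:
            ready.append(s)
    return ready
```

In this sketch a background loop would call `schedule` repeatedly, so that suggestions held back by a busy destination are retried once utilisation drops.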

Authors
  • Vamosi, Ralf
  • Lassnig, Mario
  • Schikuta, Erich
Shortfacts
Category
Paper in Conference Proceedings or in Workshop Proceedings (Paper)
Event Title
23rd International Conference on Computing in High Energy and Nuclear Physics
Divisions
Workflow Systems and Technology
Subjects
Applied Computer Science
Event Location
Sofia, Bulgaria
Event Type
Conference
Event Dates
9-13 Jul 2018
Date
9 July 2018