Data Allocation with Neural Similarity Estimation for Data-Intensive Computing

Abstract

Science collaborations such as ATLAS at CERN's high-energy particle accelerator use a computer grid to run expensive computational tasks on massive, distributed data sets. Handling big data on a grid demands workload management and data allocation to maintain a continuous workflow. Data allocation in a computer grid requires a data placement policy that is conditioned on the resources of the system and on the usage of data. Such a policy decides at which locations subsets of data are placed, and both automatic and manual policies aim, in part, at a short time-to-result. Improving these policies is an ongoing effort, because data placement is vital to coping with the increasing amount of data processed in different data centers. This paper presents a novel approach that addresses the bottleneck caused by wide-area file transfers between data centers and by large, distributed data sets with high dimensionality. The model estimates similar data with a neural network on sparse and uncertain observations and then proceeds with the allocation process. The allocation process uses evolutionary data allocation to find near-optimal solutions and improves network transfers by more than 5% for the given data centers.
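
The allocation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the similarity matrix is assumed to be given (it stands in for the neural similarity estimate), and the cost function, capacity penalty, mutation operator, and all names (`transfer_cost`, `evolve`, `block_similarity`) are hypothetical choices made only for illustration. The idea shown is that similar data sets placed at different sites incur transfer cost, and a simple elitist evolutionary search looks for a low-cost placement.

```python
import random


def transfer_cost(assign, sim, capacity, penalty=10.0):
    """Assumed cost: similarity split across sites plus a penalty for overfull sites."""
    n = len(assign)
    split = sum(sim[i][j]
                for i in range(n) for j in range(i + 1, n)
                if assign[i] != assign[j])
    load = {}
    for site in assign:
        load[site] = load.get(site, 0) + 1
    overflow = sum(max(0, count - capacity) for count in load.values())
    return split + penalty * overflow


def evolve(sim, n_sites, capacity, pop_size=30, generations=200, seed=0):
    """Elitist evolutionary search; returns (best assignment, best cost per generation)."""
    rng = random.Random(seed)
    n = len(sim)
    pop = [[rng.randrange(n_sites) for _ in range(n)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda a: transfer_cost(a, sim, capacity))
        history.append(transfer_cost(pop[0], sim, capacity))
        survivors = pop[: pop_size // 2]      # keep the better half unchanged
        children = []
        for parent in survivors:
            child = parent[:]
            child[rng.randrange(n)] = rng.randrange(n_sites)  # move one data set
            children.append(child)
        pop = survivors + children
    best = min(pop, key=lambda a: transfer_cost(a, sim, capacity))
    return best, history


def block_similarity(n, block):
    """Toy similarity matrix: 0.9 within a block of data sets, 0.1 across blocks."""
    return [[0.0 if i == j else (0.9 if i // block == j // block else 0.1)
             for j in range(n)] for i in range(n)]


sim = block_similarity(6, 3)
best, history = evolve(sim, n_sites=2, capacity=3)
print(best, transfer_cost(best, sim, capacity=3))
```

Because the better half of the population survives each generation unchanged, the best cost never increases from one generation to the next. The search is a heuristic, so it may stop at a near-optimal rather than an optimal placement.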

Authors

Vamosi, Ralf
Schikuta, Erich

Shortfacts

Category	Paper in Conference Proceedings or in Workshop Proceedings (Paper)
Event Title	ICCS 2022: International Conference on Computational Science
Divisions	Workflow Systems and Technology
Subjects	Data processing management, Databases, Data storage
Event Location	London, United Kingdom
Event Type	Conference
Event Dates	21-23 June, 2022
Date	2022