Data Allocation with Neural Similarity Estimation for Data-Intensive Computing
Science collaborations such as ATLAS at the high-energy particle accelerator at CERN use a computer grid to run expensive computational tasks on massive, distributed data sets. Handling big data on a grid requires workload management and data allocation to maintain a continuous workflow. Data allocation in a computer grid calls for a data placement policy that is conditioned on the resources of the system and on how the data is used; such a policy decides at which locations subsets of the data are placed. Both automatic and manual data policies aim, in part, to achieve a short time-to-result, and there are ongoing efforts to improve them. Data placement is vital for coping with the growing amount of data processed across different data centers. This paper presents a novel approach to the bottleneck caused by wide-area file transfers between data centers for large, distributed data sets with high dimensionality. The model first estimates which data are similar, using a neural network on sparse and uncertain observations, and then proceeds with the allocation process. The allocation process uses evolutionary data allocation to find near-optimal solutions and improves network transfers by more than 5% for the given data centers.
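As a rough illustration of the evolutionary allocation step mentioned in the abstract, the following Python sketch places data sets on data centers so that similar data sets end up together. It is not the authors' implementation. The similarity matrix is supplied by hand here, standing in for the neural estimate, and the function names, the cost function, the capacity constraint, and the simple mutation-only (1+1) scheme are all assumptions made for this example.

```python
import random


def transfer_cost(assignment, similarity):
    """Sum the similarity of every data set pair split across two centers.

    Assumption: similar data sets tend to be read together, so a split pair
    stands in for a wide-area transfer.
    """
    n = len(assignment)
    return sum(
        similarity[i][j]
        for i in range(n)
        for j in range(i + 1, n)
        if assignment[i] != assignment[j]
    )


def is_feasible(assignment, sizes, capacity):
    """Check that no data center holds more data than its capacity."""
    used = [0.0] * len(capacity)
    for dataset, center in enumerate(assignment):
        used[center] += sizes[dataset]
    return all(u <= c for u, c in zip(used, capacity))


def evolve_allocation(similarity, sizes, capacity, start, generations=2000, seed=0):
    """Mutation-only (1+1) evolutionary search over data set placements.

    `start` must be a feasible assignment: start[i] is the center of data set i.
    """
    rng = random.Random(seed)
    best = list(start)
    best_cost = transfer_cost(best, similarity)
    for _ in range(generations):
        # Mutate: move one randomly chosen data set to a random center.
        child = list(best)
        child[rng.randrange(len(child))] = rng.randrange(len(capacity))
        if not is_feasible(child, sizes, capacity):
            continue
        cost = transfer_cost(child, similarity)
        # Keep the child if it is at least as good as the current best.
        if cost <= best_cost:
            best, best_cost = child, cost
    return best, best_cost


# Hypothetical toy input: data sets 0/1 and 2/3 are similar to each other.
similarity = [
    [0.0, 0.9, 0.1, 0.1],
    [0.9, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.0, 0.8],
    [0.1, 0.1, 0.8, 0.0],
]
sizes = [1.0, 1.0, 1.0, 1.0]
capacity = [3.0, 3.0]  # two data centers
placement, cost = evolve_allocation(similarity, sizes, capacity, start=[0, 1, 0, 1])
print(placement, cost)
```

On this toy input the search should co-locate data sets 0 and 1 on one center and 2 and 3 on the other. A population with crossover, or a cost weighted by file sizes and link bandwidth, would be natural extensions, but those are beyond what the abstract specifies.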
- Vamosi, Ralf
- Schikuta, Erich
Category | Paper in Conference Proceedings or in Workshop Proceedings (Paper)
Event Title | ICCS 2022: International Conference on Computational Science
Divisions | Workflow Systems and Technology
Subjects | Data processing management, databases, data storage
Event Location | London, United Kingdom
Event Type | Conference
Event Dates | 21-23 June, 2022
Date | 2022