Replication-Robust Payoff-Allocation for Machine Learning Data Markets

Abstract

The increasing take-up of machine learning techniques requires ever more application-specific training data. Manually collecting such training data is a time-consuming and error-prone process. Data marketplaces represent a compelling alternative, providing an easy way to acquire data from potential data providers. A key component of such marketplaces is the compensation mechanism for data providers. Classic payoff-allocation methods, such as the Shapley value, can be vulnerable to data-replication attacks, and are infeasible to compute in the absence of efficient approximation algorithms. To address these challenges, we present an extensive theoretical study of the vulnerabilities of game-theoretic payoff-allocation schemes to replication attacks. Our insights apply to a wide range of payoff-allocation schemes, and enable the design of customised replication-robust payoff-allocations. Furthermore, we present a novel, efficient sampling algorithm for approximating payoff-allocation schemes based on marginal contributions. In our experiments, we validate the replication-robustness of classic payoff-allocation schemes and of new payoff-allocation schemes derived from our theoretical insights. We also demonstrate the efficiency of our proposed sampling algorithm on a wide range of machine learning tasks.

Authors
  • Han, Dongge
  • Wooldridge, Michael
  • Rogers, Alex
  • Tople, Shruti
  • Ohrimenko, Olga
  • Tschiatschek, Sebastian
Shortfacts
  • Category: Technical Report (Working Paper)
  • Divisions: Data Mining and Machine Learning
  • Publisher: CoRR arXiv
  • Date: 25 June 2020