ULF: Unsupervised Labeling Function Correction using Cross-Validation for Weak Supervision

Abstract

A cost-effective alternative to manual data labeling is weak supervision (WS), where data samples are automatically annotated using a predefined set of labeling functions (LFs), rule-based mechanisms that generate artificial labels for the associated classes. In this work, we investigate noise reduction techniques for WS based on the principle of k-fold cross-validation. We introduce a new algorithm ULF for Unsupervised Labeling Function correction, which denoises WS data by leveraging models trained on all but some LFs to identify and correct biases specific to the held-out LFs. Specifically, ULF refines the allocation of LFs to classes by re-estimating this assignment on highly reliable cross-validated samples. Evaluation on multiple datasets confirms ULF's effectiveness in enhancing WS learning without the need for manual labeling.
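
The cross-validation idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It is a toy reading of the abstract under several assumptions: the 1-D features, the nearest-centroid classifier, the confidence threshold of 0.9, the leave-one-LF-out folds, and all names (`weak_labels`, `ulf_sketch`, `Z`, `T`) are hypothetical and chosen only for illustration. `Z` marks which LFs matched which samples, and `T` assigns each LF to a class.

```python
import math

def weak_labels(Z, T):
    """Majority-vote label per sample; None if no LF matched or the vote ties."""
    labels = []
    for row in Z:
        votes = [sum(z * t[c] for z, t in zip(row, T)) for c in range(len(T[0]))]
        best = max(votes)
        labels.append(votes.index(best) if best > 0 and votes.count(best) == 1 else None)
    return labels

def fit_centroids(xs, ys, num_classes):
    """Toy stand-in classifier (assumption): mean of the 1-D feature per class."""
    centroids = []
    for c in range(num_classes):
        members = [x for x, y in zip(xs, ys) if y == c]
        centroids.append(sum(members) / len(members) if members else None)
    return centroids

def predict_proba(x, centroids):
    """Softmax over negative squared distances to the class centroids."""
    scores = [-(x - m) ** 2 if m is not None else -math.inf for m in centroids]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def ulf_sketch(X, Z, T, folds, threshold=0.9):
    """Re-estimate the LF-to-class matrix from confident out-of-fold predictions."""
    num_classes = len(T[0])
    labels = weak_labels(Z, T)
    counts = [[0] * num_classes for _ in T]
    for held_out in folds:
        # Samples matched by any held-out LF are predicted, never trained on.
        is_test = [any(row[j] for j in held_out) for row in Z]
        train = [(x, y) for x, y, t in zip(X, labels, is_test) if not t and y is not None]
        if not train:
            continue
        centroids = fit_centroids([x for x, _ in train], [y for _, y in train], num_classes)
        for x, row, t in zip(X, Z, is_test):
            if not t:
                continue
            probs = predict_proba(x, centroids)
            pred = probs.index(max(probs))
            if probs[pred] >= threshold:  # keep only highly reliable predictions
                for j in held_out:
                    if row[j]:
                        counts[j][pred] += 1
    new_T = []
    for old_row, row_counts in zip(T, counts):
        total = sum(row_counts)
        # LFs without confident evidence keep their original assignment.
        new_T.append([c / total for c in row_counts] if total else list(old_row))
    return new_T

# Toy data: class 0 lies near x = 0, class 1 near x = 5.
X = [0.0, 0.2, 0.4, 0.1, 4.8, 5.0, 5.2, 4.9, 5.1, 5.3]
Z = [[1, 0, 0, 0, 0], [1, 0, 0, 0, 0], [0, 1, 0, 0, 0], [0, 1, 0, 0, 0],
     [0, 0, 1, 0, 0], [0, 0, 1, 0, 0], [0, 0, 0, 1, 0], [0, 0, 0, 1, 0],
     [0, 0, 0, 0, 1], [0, 0, 0, 0, 1]]
# The last LF fires on class-1 samples but is (wrongly) assigned to class 0.
T = [[1, 0], [1, 0], [0, 1], [0, 1], [1, 0]]
folds = [[j] for j in range(len(T))]  # leave-one-LF-out, an assumption
new_T = ulf_sketch(X, Z, T, folds)
```

In this toy run, the model trained without the last LF confidently places that LF's samples in class 1, so its row in `new_T` moves from class 0 to class 1 while the other rows stay put. A real setup would replace the centroid classifier with a proper model and would group several LFs per fold.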

Authors

Sedova, Anastasiia
Roth, Benjamin

Shortfacts

Category	Paper in Conference Proceedings or in Workshop Proceedings (Poster)
Event Title	Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Divisions	Data Mining and Machine Learning
Subjects	Artificial Intelligence; Language Processing
Event Location	Singapore
Event Type	Conference
Event Dates	6-10 Dec 2023
Publisher	Association for Computational Linguistics
Page Range	pp. 4162-4176
Date	1 December 2024
Official URL	https://aclanthology.org/2023.emnlp-main.254