CogMemLM: Human-Like Memory Mechanisms Improve Performance and Cognitive Plausibility of LLMs

Abstract

In the present contribution to the BabyLM STRICT track, we take a threefold approach: first, we implement a simple curriculum learning approach, splitting the provided BabyLM dataset into four sub-datasets of increasing complexity so that the data broadly reflects the kind of input available to infants and children throughout development. Second, we simulate memory-based vocabulary learning inspired by psycholinguistic work and use this information to guide the token encodings. Third, we implement redundant text representations to make the compositional aspect of language more salient: the lexicons that emerge from the respective curriculum learning steps shape the (token) encoding of the given input text. We pre-trained a RoBERTa-base architecture with masked language modeling while incorporating these simple, human memory-based mechanisms. Our CogMemLM-s model achieves improved results compared to the BabyLM RoBERTa baseline in 29 out of 39 evaluation tasks. Although the mechanisms integrated so far are simplified with regard to cognitive plausibility, it is intriguing that our pre-training method already improves performance considerably.
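
As a rough illustration of the training setup described above, the following is a minimal sketch (not the authors' released code) of curriculum-ordered masked language modeling with a RoBERTa-base architecture using Hugging Face Transformers. The four stage file names, the single epoch per stage, and the use of the off-the-shelf roberta-base tokenizer are assumptions for illustration only; the paper's memory-based, lexicon-guided token encodings are not reproduced here.

    # Hypothetical sketch: curriculum-ordered MLM pre-training of RoBERTa-base
    # over four complexity-ordered splits of the BabyLM data (assumed file names).
    from datasets import load_dataset
    from transformers import (
        RobertaConfig,
        RobertaForMaskedLM,
        RobertaTokenizerFast,
        DataCollatorForLanguageModeling,
        Trainer,
        TrainingArguments,
    )

    # Assumed complexity-ordered splits (stage 1 = simplest input).
    CURRICULUM_FILES = ["stage1.txt", "stage2.txt", "stage3.txt", "stage4.txt"]

    # Standard tokenizer used as a placeholder for the paper's memory-guided encoding.
    tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
    model = RobertaForMaskedLM(RobertaConfig())  # trained from scratch (STRICT track)
    collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, max_length=128)

    # Train on each curriculum stage in turn, continuing from the same model.
    for stage, path in enumerate(CURRICULUM_FILES, start=1):
        dataset = load_dataset("text", data_files=path, split="train")
        dataset = dataset.map(tokenize, batched=True, remove_columns=["text"])
        trainer = Trainer(
            model=model,
            args=TrainingArguments(
                output_dir=f"cogmemlm-stage{stage}",
                per_device_train_batch_size=32,
                num_train_epochs=1,
            ),
            data_collator=collator,
            train_dataset=dataset,
        )
        trainer.train()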

Authors
  • Thoma, Lukas
  • Weyers, Ivonne
  • Cano, Erion
  • Schweter, Stefan
  • Mueller, Jutta L
  • Roth, Benjamin
Shortfacts
Category
Paper in Conference Proceedings or in Workshop Proceedings (Poster)
Event Title
Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning
Divisions
Data Mining and Machine Learning
Subjects
Artificial Intelligence
Language Processing
Event Location
Singapore
Event Type
Workshop
Event Dates
6-10 Dec 2023
Series Name
Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning
Publisher
Association for Computational Linguistics
Page Range
pp. 180-185
Date
December 2023
Official URL
https://aclanthology.org/2023.conll-babylm.15