ML Pipeline Insights Service for Rule-Based Assessment of Training Practices in Reinforcement Learning
As artificial intelligence continues to advance, Reinforcement Learning (RL) has established itself as a core approach for developing intelligent agents that make decisions over time. As RL systems grow in complexity, the need for standardized training practices becomes critical. This paper introduces a rule-based assessment approach to enforce best practices in RL training. We define a comprehensive set of architectural rules focused on RL pipeline practices, model versioning, multi-agent deployment, and the management of models during inference. Our methodology integrates Large Language Models (LLMs) and custom-based code detectors to ensure compliance with these best practices across diverse RL systems. We developed an ML pipeline insights service to automatically validate RL training practices directly from the source code. We validate our approach by applying it to a large-scale industrial case study and sixteen open-source case studies. Our evaluation showed that custom-based detectors achieved near-perfect precision and recall (F1 = 0.98), while LLM-based detectors provided scalable validation with moderate F1 scores (0.67-0.71), demonstrating the hybrid approach's strength in balancing accuracy and automation. The results demonstrate our tool's accuracy in identifying and enforcing best practices with high precision and recall, highlighting its practical applicability and the feasibility of automation.

- Ntentos, Evangelos
- Urdih, Francesco
- Zdun, Uwe

Category | Paper in Conference Proceedings or in Workshop Proceedings (Paper)
Event Title | 51st Euromicro Conference Series on Software Engineering and Advanced Applications (SEAA)
Divisions | Software Architecture
Subjects | Software Engineering
Event Location | Salerno, Italy
Event Type | Conference
Event Dates | 10-12 September 2025
Date | 10 September 2025
