LLM Performance for Code Generation on Noisy Tasks
2025.
Abstract
This paper investigates the ability of large language models (LLMs) to recognise and solve tasks which have been obfuscated beyond recognition. Focusing on competitive programming and benchmark tasks (LeetCode and MATH), we compare performance across multiple models and obfuscation methods. We introduce the concept of eager pattern matching and discuss implications for benchmarking, dataset contamination, and automated software systems.