LLM Performance for Code Generation on Noisy Tasks

Radzim Sendyka; Christian Cabrera; Andrei Paleyes; Diana Robinson; Neil D. Lawrence

2025.

Abstract

This paper investigates the ability of large language models (LLMs) to recognise and solve tasks which have been obfuscated beyond recognition. Focusing on competitive programming and benchmark tasks (LeetCode and MATH), we compare performance across multiple models and obfuscation methods. We introduce the concept of eager pattern matching and discuss implications for benchmarking, dataset contamination, and automated software systems.

Cite this Paper

BibTeX


@Misc{publications/llm-performance-for-code-generation-on-noisy-tasks,
  title = 	 {LLM Performance for Code Generation on Noisy Tasks},
  author = 	 {Sendyka, Radzim and Cabrera, Christian and Paleyes, Andrei and Robinson, Diana and Lawrence, Neil D.},
  year = 	 {2025},
  url = 	 {/publications/llm-performance-for-code-generation-on-noisy-tasks.html},
  abstract = 	 {This paper investigates the ability of large language models (LLMs) to recognise and solve tasks which have been obfuscated beyond recognition. Focusing on competitive programming and benchmark tasks (LeetCode and MATH), we compare performance across multiple models and obfuscation methods. We introduce the concept of eager pattern matching and discuss implications for benchmarking, dataset contamination, and automated software systems.}
}

Endnote

%0 Generic
%T LLM Performance for Code Generation on Noisy Tasks
%A Radzim Sendyka
%A Christian Cabrera
%A Andrei Paleyes
%A Diana Robinson
%A Neil D. Lawrence
%D 2025
%F publications/llm-performance-for-code-generation-on-noisy-tasks
%U /publications/llm-performance-for-code-generation-on-noisy-tasks.html
%X This paper investigates the ability of large language models (LLMs) to recognise and solve tasks which have been obfuscated beyond recognition. Focusing on competitive programming and benchmark tasks (LeetCode and MATH), we compare performance across multiple models and obfuscation methods. We introduce the concept of eager pattern matching and discuss implications for benchmarking, dataset contamination, and automated software systems.

RIS


TY  - GEN
TI  - LLM Performance for Code Generation on Noisy Tasks
AU  - Radzim Sendyka
AU  - Christian Cabrera
AU  - Andrei Paleyes
AU  - Diana Robinson
AU  - Neil D. Lawrence
DA  - 2025/05/26
ID  - publications/llm-performance-for-code-generation-on-noisy-tasks
UR  - /publications/llm-performance-for-code-generation-on-noisy-tasks.html
AB  - This paper investigates the ability of large language models (LLMs) to recognise and solve tasks which have been obfuscated beyond recognition. Focusing on competitive programming and benchmark tasks (LeetCode and MATH), we compare performance across multiple models and obfuscation methods. We introduce the concept of eager pattern matching and discuss implications for benchmarking, dataset contamination, and automated software systems.
ER  -

APA


Sendyka, R., Cabrera, C., Paleyes, A., Robinson, D., & Lawrence, N. D. (2025). LLM Performance for Code Generation on Noisy Tasks. Available from /publications/llm-performance-for-code-generation-on-noisy-tasks.html.