Prompt Variability Effects on LLM Code Generation

Andrei Paleyes, Radzim Sendyka, Diana Robinson, Christian Cabrera, Neil D. Lawrence
Third International Workshop on Large Language Models for Code (LLM4Code 2026), co-located with ICSE 2026, 2026.

Abstract

Large language models lower the barriers to writing code, but the quality of the programs they generate is sensitive to how well the prompt is written and to the user’s background. We propose a synthetic evaluation pipeline and a systematic persona-based evaluation approach that quantify an LLM’s sensitivity to input variations and that do not depend on any specific programming task or model.