Prompt engineering in one paragraph
Prompt engineering is the practice of designing the text input to an LLM so the output is reliable enough for production. Early in the LLM era prompt engineering was treated as a creative art; in production it is treated as engineering — versioned prompts, eval suites, A/B testing, and rollback discipline. The work is similar to software engineering: clear specifications, comprehensive tests, observable behaviour, and accountability for regressions.
What's actually in a production prompt
A production prompt typically includes:
- System prompt — high-level instructions, role definition, tone, safety boundaries
- Tool definitions — when the prompt enables tool use, the available tools' schemas and descriptions
- Output format specifications — XML tags, JSON schemas, or structured-output constraints
- Few-shot examples — input/output pairs that demonstrate the desired behaviour
- Context — the actual data the model needs to reason about (retrieved chunks, user input, prior conversation)
- Reasoning scaffolding — instructions that tell the model to think step-by-step or use thinking blocks where supported
Claude responds well to specific patterns: XML-structured prompts with named tags (<context>, <task>, <constraints>), explicit instructions about output shape, and concrete examples of the desired output.
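The XML-tag pattern above can be sketched as a small prompt-assembly helper. This is an illustrative sketch, not a library API: the function and parameter names are ours, and the tag names follow the conventions just described.

```python
def build_prompt(context: str, task: str, constraints: list[str]) -> str:
    """Wrap each section in a named XML tag so the model can cleanly
    distinguish data (<context>) from instructions (<task>, <constraints>)."""
    constraint_lines = "\n".join(f"- {c}" for c in constraints)
    return (
        f"<context>\n{context}\n</context>\n\n"
        f"<task>\n{task}\n</task>\n\n"
        f"<constraints>\n{constraint_lines}\n</constraints>"
    )

prompt = build_prompt(
    context="Q3 revenue was $4.2M, up 12% quarter over quarter.",
    task="Summarise the quarter in two sentences for an executive audience.",
    constraints=["Use only figures present in <context>", "No speculation"],
)
```

Keeping assembly in code rather than hand-editing one long string is also what makes the versioning and eval discipline below practical: each section can be tested and changed independently.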
Production prompt discipline
Treating prompts as code:
- Prompt registry — every production prompt has a version, an owner, and a changelog
- Semantic versioning — major.minor.patch with breaking-change discipline
- Eval suite — golden-set test cases with expected outputs; a new prompt version must clear the eval bar before promotion
- A/B routing — incremental rollout with metric monitoring
- Rollback discipline — bad prompts can be reverted in seconds, not hours
- CI gating — prompt changes that fail the eval bar are blocked from reaching production
NINtec's deployments build this in from the architecture phase. Production prompts cannot be edited freely; they go through the same discipline as production code.
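A minimal sketch of what "prompts as code" can look like in practice: a registry that records version, owner, and changelog, gates promotion on an eval score, and supports instant rollback. All names here (PromptVersion, PromptRegistry, the 0.95 eval bar) are illustrative assumptions, not a real tool.

```python
from dataclasses import dataclass, field


@dataclass
class PromptVersion:
    version: str    # semantic version, "major.minor.patch"
    owner: str
    text: str
    changelog: str


@dataclass
class PromptRegistry:
    versions: list[PromptVersion] = field(default_factory=list)

    def promote(self, candidate: PromptVersion,
                eval_score: float, eval_bar: float = 0.95) -> bool:
        """Promote only if the candidate's golden-set score clears the bar."""
        if eval_score < eval_bar:
            return False
        self.versions.append(candidate)
        return True

    def rollback(self) -> PromptVersion:
        """Drop the latest version and return the previous one —
        reverting a bad prompt takes seconds, not hours."""
        self.versions.pop()
        return self.versions[-1]
```

In a real deployment the eval score would come from running the candidate against the golden set, and `promote` would sit behind the CI gate rather than be called by hand.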
Common prompt-engineering patterns
Patterns we use repeatedly:
- Role priming — tell Claude what role it is playing ("You are a senior compliance officer reviewing a regulatory letter")
- Constrained outputs — instruct Claude to respond only in a specific structure ("Respond only in valid JSON matching this schema")
- Citation forcing — when grounding on retrieved context, instruct Claude to cite sources ("Cite the chunk number after each claim")
- Refusal forcing — for safety boundaries, instruct Claude to refuse rather than fabricate ("If the answer is not in the provided context, say so explicitly rather than guessing")
- Thinking blocks — where supported, ask Claude to reason inside <thinking> tags before producing the final output
- Negative examples — show Claude what not to do, alongside positive examples
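Several of these patterns compose naturally in a single prompt. Below is a hedged sketch that combines role priming, a constrained JSON output, citation forcing, and refusal forcing; the exact wording, JSON schema, and the `render` helper are illustrative examples, not a fixed template.

```python
# Role priming + constrained output + citation forcing + refusal forcing,
# combined in one system prompt (wording is an example, not canonical).
SYSTEM_PROMPT = """You are a senior compliance officer reviewing a regulatory letter.

Respond only in valid JSON matching this schema:
{"summary": string, "risk_level": "low" | "medium" | "high", "citations": [int]}

Cite the chunk number after each claim in "citations".
If the answer is not in the provided context, set "summary" to
"insufficient context" rather than guessing."""


def render(context_chunks: list[str]) -> str:
    """Number the retrieved chunks so citation forcing has something to cite,
    then append them to the system prompt inside a <context> tag."""
    numbered = "\n".join(f"[{i}] {c}" for i, c in enumerate(context_chunks, 1))
    return f"{SYSTEM_PROMPT}\n\n<context>\n{numbered}\n</context>"
```

Numbering the chunks at render time is what makes "cite the chunk number" checkable downstream: an eval can verify every citation index actually exists in the supplied context.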