Treating prompts as static, ad-hoc text is a recipe for inconsistency in an engineering business; instead, we manage them as pieces of source code in version-controlled repositories. Version control also helps us catch prompt "regressions" — cases where a model update or a phrasing change suddenly degrades the quality of the generated output. Our teams use a Git-like flow so they can peer-review prompt changes and maintain a clear record of which prompt version produced a given output in production.

A modular architecture lets us generate prompts dynamically from a library of reusable components. Instead of writing bespoke, monolithic prompts for every task, we separate three concerns into three components: the function we want to accomplish, the domain we are focused on, and the formatting rules we need to follow. A resolver then supplies developers with the most current approved components when assembling the final prompt for an AI-driven workflow task. This means that when a coding or security standard changes in one component, the change propagates automatically to every workflow that links to that component.

Establishing an engineering process around prompts is critical to reliability. For our most important tasks, we maintain a set of golden outputs, and every prompt change must be validated against them before release to production. AI needs engineering oversight like any other component; the objective is not to put AI in a box but to treat it as another predictable part of the development stack.

The biggest challenge is the mindset shift: regarding natural-language prompts as serious inputs to the engineering process. Once you see a prompt as configuration rather than a message, your professional workflows will produce far more consistent output.
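The component-resolution idea above can be sketched in a few lines. This is a minimal illustration, not a specific tool: the component IDs, the library structure, and the `resolve`/`build_prompt` helpers are all hypothetical names, standing in for whatever registry a team actually uses. The key behaviors are that a workflow references components by ID, always receives the newest approved version, and keeps a manifest recording which versions built the prompt.

```python
# Illustrative sketch of modular prompt assembly. All names here are
# hypothetical; the pattern is: workflows reference component IDs, the
# newest approved version is resolved at build time, and a manifest
# records exactly which versions produced the final prompt.

COMPONENT_LIBRARY = {
    # component_id -> list of (version, text); the last entry is the
    # newest approved version of that component.
    "task.code_review": [
        ("1.0", "You review pull requests for defects."),
    ],
    "domain.payments": [
        ("1.0", "The codebase handles card payments."),
        ("1.1", "The codebase handles card payments (PCI DSS scope)."),
    ],
    "format.markdown_list": [
        ("1.0", "Respond as a Markdown bullet list."),
    ],
}

def resolve(component_id: str) -> tuple[str, str]:
    """Return (version, text) of the newest approved component."""
    return COMPONENT_LIBRARY[component_id][-1]

def build_prompt(component_ids: list[str]) -> tuple[str, dict[str, str]]:
    """Assemble a prompt and record which component versions built it."""
    parts, manifest = [], {}
    for cid in component_ids:
        version, text = resolve(cid)
        parts.append(text)
        manifest[cid] = version  # audit trail for production outputs
    return "\n\n".join(parts), manifest

prompt, manifest = build_prompt(
    ["task.code_review", "domain.payments", "format.markdown_list"]
)
```

Because the manifest travels with every generated output, a regression can be traced back to the exact component version that introduced it — and updating `domain.payments` in the library automatically changes every workflow that links to it.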
Most professionals I know treat prompts as durable, reusable resources rather than a clever thing they thought of once. Their habits come down to three disciplines.

Discipline 1: Separation. Every prompt has multiple parts: a role, constraints, inputs, and outputs. Keeping these parts distinct makes it much easier to see what can be reused and to troubleshoot a prompt when it misbehaves. A well-organized prompt library reads more like internal documentation than creative writing.

Discipline 2: Version control. Many teams keep a central repository for their prompts, documenting which version was used for which purpose, what assumptions it made, and what errors it produced. This makes it easy to trace what worked and what didn't, so that when a prompt needs to be reused, you know exactly how to do so.

Discipline 3: Testing in small increments. Experts rarely rewrite a prompt wholesale. Instead, they change one element at a time — the tone, say, or the length — and observe what impact that single change has on the output. Because everything else stays fixed, the result should remain close to the previous version, and any difference can be attributed to the element that changed.
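Disciplines 1 and 3 can be combined in a small sketch. Assuming nothing beyond the Python standard library, the hypothetical `Prompt` dataclass below keeps the four parts separate, and a `vary` helper enforces the change-one-element-at-a-time rule by refusing to modify more than one field per experiment.

```python
from dataclasses import dataclass, replace

# Sketch of Disciplines 1 and 3 (names are illustrative): a prompt
# split into its four parts, plus a helper that varies exactly one
# field so its effect on the output can be isolated.

@dataclass(frozen=True)
class Prompt:
    role: str
    constraints: str
    inputs: str
    outputs: str

    def render(self) -> str:
        """Serialize the structured parts into the final prompt text."""
        return (
            f"Role: {self.role}\n"
            f"Constraints: {self.constraints}\n"
            f"Inputs: {self.inputs}\n"
            f"Outputs: {self.outputs}"
        )

def vary(base: Prompt, **change: str) -> Prompt:
    """Return a copy with exactly one element changed (Discipline 3)."""
    if len(change) != 1:
        raise ValueError("vary exactly one element per experiment")
    return replace(base, **change)

base = Prompt(
    role="You are a release-notes writer.",
    constraints="Neutral tone; no marketing language.",
    inputs="A list of merged pull-request titles.",
    outputs="Five bullet points, under 20 words each.",
)

# One experiment, one changed element: only the constraints differ.
trial = vary(base, constraints="Friendly tone; no marketing language.")
```

Because `Prompt` is frozen, each experiment produces a new object, which pairs naturally with Discipline 2: every variant can be committed and diffed like any other configuration file.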