Crafting Superior AI Responses: Unveiling the Most Effective Prompting Secrets for ChatGPT and Bard Users

Yi Zhou
Generative AI Revolution
15 min read · Dec 9, 2023


Picture this: you’re sculpting a masterpiece, but with each stroke, the shape isn’t quite right. This is akin to the challenge many face with AI tools like ChatGPT and Bard. You provide input, hoping for a finely crafted response, yet what you get is a rough, unrefined output. This is where the magic of the Automated Output Refinement Pattern (AORP), a standout technique from the insightful book “Prompt Design Patterns”, comes in. It’s like a skilled artist’s final touches, transforming a crude sculpture into a work of art. This article unlocks the secrets of AORP, guiding you to refine AI responses into high-quality, precise answers, ensuring a successful and satisfying interaction with generative AI tools.

Why Don’t AI Tools Always Deliver the Best Responses Initially?

ChatGPT and Bard may not always deliver the best response initially due to two key factors. The first is the quality of prompts; under the principle of “garbage in, garbage out”, vague or non-specific prompts lead to lower quality responses. The second factor is computational costs. AI inference is resource-intensive, involving complex calculations. To manage these costs and maintain system efficiency, AI tools have token size limitations, restricting the length and depth of responses and often necessitating iterative refinement for optimal output quality.

The token size limitation in AI tools like ChatGPT and Bard is a critical parameter in their response generation. Tokens are units of text, such as words or parts of words. For instance, ChatGPT has a limit of roughly 8k–32k tokens per combined input and response, depending on which version you use.

1 token ≈ 4 characters ≈ ¾ words
100 tokens ≈ 75 words
16k tokens ≈ 20 pages of text

This means the model can process and generate text up to that number of tokens, affecting the length and detail of responses. If a prompt or expected response exceeds this limit, the AI has to truncate or simplify its output. This limitation is primarily due to computational costs and the need to balance response quality with efficient processing. As technology evolves, these limits may change, but they remain a fundamental aspect of how these AI models operate.
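A quick way to apply the character-based rule of thumb above is a tiny estimator function. This is a back-of-the-envelope sketch only; the 4-characters-per-token ratio and the 8k default limit are illustrative assumptions, and exact counts require the model's own tokenizer (e.g., OpenAI's tiktoken library):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters per token heuristic.

    This is an approximation; real tokenizers split text differently.
    """
    return max(1, len(text) // 4)


def fits_in_context(text: str, token_limit: int = 8000) -> bool:
    """Check whether text likely fits within an assumed token limit."""
    return estimate_tokens(text) <= token_limit


prompt = "Summarize the key ideas of the Automated Output Refinement Pattern."
print(estimate_tokens(prompt))  # rough estimate, not an exact count
print(fits_in_context(prompt))  # a short prompt easily fits
```

For production use, swap the heuristic for the actual tokenizer of your target model; the heuristic is only useful for quick sanity checks before sending a long prompt.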

The Revolutionary Role of AORP in Optimizing AI Responses

Definition: The Automated Output Refinement Pattern (AORP) is a technique directing AI models to systematically self-improve their initial outputs using defined criteria and feedback mechanisms.

AORP, featured in “Prompt Design Patterns”, is the secret weapon to overcome the challenges posed by prompt quality and computational constraints in AI tools like ChatGPT and Bard. AORP focuses on enhancing initial AI outputs through a systematic, iterative refinement process. By assessing and tweaking responses, AORP ensures that the final output is not just a direct product of the initial prompt and token limitations, but a polished, contextually enriched, and more accurate response. This pattern is particularly effective in transforming suboptimal initial replies into high-quality, precise answers, effectively bridging the gap between user expectations and AI capabilities.

The structure of AORP is built around an iterative process that alternates between two generative steps — FEEDBACK and REFINE. This structure is designed to allow the AI model to improve upon its initial output through a process of self-assessment and self-improvement.

The AORP Workflow — Figure from “Prompt Design Patterns”

Here’s a representative prompt template to illustrate the whole refinement process:

Prompt Template for the AORP Automation

I want you to act as my Generative AI content generation refinement agent. The objective of your role is to automatically refine the generated content after I enter the prompt.
Here’s the process we will follow:
1. You generate initial content after I enter the prompt.
2. You conduct a focused critique on how to enhance the content. Be rigorous in your assessment. This step is intended to drive constructive criticism, flagging even seemingly minor issues or assumptions, even if the content seems generally acceptable.
3. You generate refined content based on your assessment.
4. Run the refinement iteration [3] times.
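The generate–critique–refine loop described in steps 1–4 can be sketched as a simple driver function. This is an illustrative outline, not the book's implementation; `call_model` is a hypothetical placeholder for whatever chat-completion API you use:

```python
def call_model(prompt: str) -> str:
    """Hypothetical wrapper around your chat-completion API of choice."""
    raise NotImplementedError("plug in your AI provider here")


def aorp_refine(user_prompt: str, iterations: int = 3, model=call_model) -> str:
    """Run the AORP loop: generate once, then alternate FEEDBACK and REFINE."""
    content = model(user_prompt)  # step 1: initial generation
    for _ in range(iterations):
        # step 2: rigorous critique of the current content (FEEDBACK)
        feedback = model(
            "Conduct a focused, rigorous critique of the following content, "
            f"flagging even seemingly minor issues:\n\n{content}"
        )
        # step 3: regenerate based on the critique (REFINE)
        content = model(
            "Refine the content below based on this critique.\n\n"
            f"Critique:\n{feedback}\n\nContent:\n{content}"
        )
    return content  # step 4: stop after the fixed iteration count
```

With `iterations=3`, the model is called seven times in total: one initial generation plus three feedback/refine pairs.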

Scoring System — The Feedback Mechanism

We can further improve the above prompt template by introducing a scoring system to evaluate AI outputs. This quantitative measure assesses various aspects like coherence, relevance, and creativity. Such a system not only tracks improvement over iterations but also provides clear direction for refinement.

The following is a systematic refinement process example based on ten criteria. This example assigns a score from 1 to 3 for each of the ten criteria. The total score of a response is the sum of the scores for each criterion. This scoring system allows for a more objective evaluation of generated content and helps in identifying areas for improvement.

Figure modified from “Prompt Design Patterns”
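The 1-to-3-per-criterion scheme can be captured in a small scoring helper. The criterion names below are illustrative assumptions (the book's figure defines its own ten); only the scoring arithmetic matters here:

```python
# Hypothetical criterion names for illustration; substitute your own rubric.
CRITERIA = [
    "coherence", "relevance", "creativity", "accuracy", "depth",
    "clarity", "structure", "tone", "completeness", "engagement",
]


def total_score(scores: dict) -> int:
    """Sum per-criterion scores, validating that each is in the 1-3 range."""
    for name, value in scores.items():
        if name not in CRITERIA:
            raise ValueError(f"unknown criterion: {name}")
        if not 1 <= value <= 3:
            raise ValueError(f"score for {name} must be 1-3, got {value}")
    return sum(scores.values())


# A response scoring 3 on every criterion reaches the maximum of 30.
perfect = {name: 3 for name in CRITERIA}
print(total_score(perfect))
```

Tracking this total across iterations gives the refinement loop a concrete signal: rising totals mean the critiques are paying off, and a plateau suggests it is time to stop.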

For source code generation in software engineering, we can establish a set of ten code quality assessment criteria, each addressing a distinct aspect of quality. Applicable across programming languages, the list suits all kinds of software development projects.

  1. Readability: Code should be easy to read and understand. This is facilitated by using meaningful variable and function names, using comments sparingly but effectively, and keeping functions and methods concise.
  2. Consistency: Following a consistent coding style helps maintain readability and can assist in understanding the codebase. This can include anything from indentation style to how variables are named.
  3. Simplicity: Simple code is less prone to errors, easier to read, understand, and maintain. It is always recommended to avoid unnecessary complexity.
  4. Modularity: Code should be organized into logical and manageable modules or classes. This allows for code reuse, easier testing, and understanding of how different components interact.
  5. Efficiency: Efficient code optimizes resource usage, such as CPU time, memory, and disk space. This is achieved by using appropriate data structures, avoiding redundant operations, and minimizing IO operations.
  6. Robustness: Code should be able to handle unexpected inputs or states without crashing. This involves testing edge cases and handling exceptions properly.
  7. Testability: Good code is designed in a way that lends itself to unit testing. This often involves creating modular, decoupled units of code that can be tested independently.
  8. Documentation: While the code itself should be as self-explanatory as possible, complex algorithms, interfaces, or design decisions should be documented clearly. This can be within the code (inline comments) or outside (technical specifications, design documents).
  9. Scalability: The code should be designed with growth in mind. It should work with a small amount of data as well as it does with a large amount, within the given resource constraints.
  10. Maintainability: This encompasses many of the previous points. In essence, how easy is it for another developer to jump into the code, figure out what’s going on, fix bugs, and add new features.
Figure modified from “Prompt Design Patterns”

By integrating this list into the critique phase of the original AORP prompt template, we can significantly enhance the quality of the generated code. Furthermore, establishing a scoring system ranging from 1 to 10 per criterion would support an even more robust refinement framework.
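Integrating the checklist into the critique phase can be as simple as templating it into the FEEDBACK prompt. A minimal sketch, where the prompt wording is illustrative rather than taken from the book:

```python
# The ten code-quality criteria from the list above.
CODE_CRITERIA = [
    "Readability", "Consistency", "Simplicity", "Modularity", "Efficiency",
    "Robustness", "Testability", "Documentation", "Scalability",
    "Maintainability",
]


def build_code_critique_prompt(code: str) -> str:
    """Embed the ten code-quality criteria into an AORP critique prompt."""
    rubric = "\n".join(
        f"{i}. {name} (score 1-10)"
        for i, name in enumerate(CODE_CRITERIA, start=1)
    )
    return (
        "Critique the following code against each criterion below, "
        "assign a score from 1 to 10 per criterion, and report the total:\n"
        f"{rubric}\n\nCode:\n{code}"
    )


print(build_code_critique_prompt("def add(a, b):\n    return a + b"))
```

The resulting prompt asks the model for a per-criterion breakdown, which makes the subsequent REFINE step far more targeted than a free-form critique.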

Strategic Stopping Conditions

Knowing when to stop the refinement process is crucial. AORP uses criteria like reaching a predefined score threshold or observing no significant improvements over iterations. This ensures efficiency and prevents overcomplication. Here’s a deeper dive into the nuances of stopping conditions:

  • Threshold-based Stopping: This is intrinsically tied to the scoring system. Once the content achieves a pre-defined score threshold, indicating a desired level of quality, the refinement process can cease. This offers a tangible metric that both the system and users can easily understand and act upon.
  • Convergence Detection: If successive iterations result in marginal or no improvements in scores or quality, it’s a clear indicator of convergence. In such cases, additional iterations might be superfluous and can be halted to save resources.
  • Iteration Count: A straightforward approach is to set a maximum number of iterations. Once this limit is reached, the refinement process stops, regardless of the achieved quality. This ensures predictability in terms of computational time and resources used.
  • External Feedback Integration: If end-users or external evaluators are part of the loop, their feedback can be integral to the stopping condition. For example, once user satisfaction reaches a certain level, or when multiple users provide consistent feedback that no further refinements are necessary, the process can conclude.
  • Duration-based Stopping: Particularly relevant for time-sensitive tasks, this condition sets a time limit for the refinement process. Regardless of the iteration count or achieved quality, once the allocated time expires, the refinement halts.
  • Resource Constraints: Especially in environments with limited computational power or budgetary constraints, the process might be designed to stop once a certain amount of computational resources have been expended.
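Several of the stopping conditions above can be combined into a single check that runs after each iteration. A minimal sketch, assuming the 1-to-3-per-criterion scoring system (maximum total of 30) and illustrative default values for the threshold, cap, and convergence margin:

```python
def should_stop(scores, threshold=27, max_iterations=5, epsilon=1):
    """Decide whether to stop the refinement loop.

    scores: list of total scores, one per completed iteration.
    Stops on: threshold reached, iteration cap hit, or convergence
    (improvement between the last two iterations below epsilon).
    Duration- and resource-based conditions would be checked the same
    way, using timestamps or cost counters instead of scores.
    """
    if not scores:
        return False
    if scores[-1] >= threshold:        # threshold-based stopping
        return True
    if len(scores) >= max_iterations:  # iteration count cap
        return True
    if len(scores) >= 2 and scores[-1] - scores[-2] < epsilon:
        return True                    # convergence detection
    return False


# Example: the second iteration barely improved, so the loop halts early.
print(should_stop([20, 20]))
```

Choosing these defaults is exactly the art-versus-science balance discussed below: a tight threshold or small epsilon buys quality at the cost of extra model calls.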

Incorporating an appropriate stopping condition is as much an art as it is a science. It necessitates a balance between achieving quality outputs and being efficient. Too early, and the model might miss out on potential enhancements; too late, and it risks overfitting or wasting resources on negligible improvements. Tailoring the stopping condition to the specific requirements and constraints of the task at hand is fundamental for the successful application of AORP.

Example-1: Dialogue Response Optimization

User (Using Template):

I want you to act as my Generative AI content generation refinement agent. The objective of your role is to automatically refine the generated content after I enter the prompt. Here’s the process we will follow …

AI Response:

Understood. I’m ready to start the process. Please provide me with the initial content or prompt.

User:

What is the meaning of life?

AI Response:

Initial Content:
The meaning of life is a deeply philosophical question, one that has been pondered by scholars, religious figures, and individuals alike for millennia. At its core, it generally pertains to the purpose or significance of life itself. Interpretations can vary widely based on cultural, religious, philosophical, or personal beliefs. For some, life’s meaning might be found in the pursuit of knowledge, personal growth, or helping others, while for others, it may be more about achieving goals or seeking personal happiness and fulfillment.

Critique:
While the content is accurate and provides a broad overview, it lacks depth and specific examples, which could help to elucidate the various interpretations of life’s meaning.

It doesn’t mention the potential existential or nihilistic views, which are also significant interpretations of life’s meaning.

The tone is somewhat passive and could be made more engaging and impactful.

Refined Content:
Delving into the profound depths of what constitutes the meaning of life is no small endeavor, one that has intrigued some of the greatest minds across ages and cultures. There’s a rich tapestry of interpretations, all steeped in diverse perspectives influenced by cultural contexts, religious doctrines, philosophical schools of thought, and intimate personal beliefs.

For instance, from a Buddhist’s perspective, the meaning of life is intrinsically linked with the concept of ‘Dukkha’ or suffering, and the ultimate goal is to attain enlightenment, hence escaping the cycle of rebirth. Alternatively, an individual following existential philosophy might argue that life has no inherent meaning, and it is up to each individual to define their own purpose.

Critique (Round 2):

While the refined content delves deeper and presents more concrete examples, it could still further address the myriad perspectives on life’s meaning from scientific or atheistic viewpoints.

A more global perspective could be offered, considering views from other major philosophies or religions.

The tone has improved, but the content could be even more engaging by incorporating rhetorical questions or a more conversational style.

Refined Content (Round 2):

What, then, is the meaning of life? The labyrinthine quest for an answer plunges us into a whirlpool of perspectives, a testament to the rich diversity of human thought and experience.

Imagine the Buddhist monk, for whom life is a journey towards liberation from a cycle of rebirth, a quest for enlightenment. Or the existential philosopher who defies the notion of an inherent meaning, instead asserting that we must craft our own purpose within the chaotic theatre of existence.

But what of those guided by the rationality of science? An evolutionary biologist might propose that the “meaning” of life, in the most fundamental biological sense, is the propagation of our genes through survival and reproduction. Atheists, meanwhile, may champion a humanistic approach, suggesting our purpose lies in creating a more compassionate, rational world.

Critique (Round 3):

The piece has come a long way in offering a more engaging and comprehensive view of the topic. However, it can still be improved by offering a concise conclusion.

Although the content has provided various perspectives, it could do more to tie these disparate threads together into a coherent whole.

The transition between ideas could be smoother.

Refined Content (Round 3):

So, what does it mean to seek the meaning of life? This epic question catapults us into the rich, sometimes tumultuous sea of human understanding. The lenses through which we view this question are as diverse as humanity itself.

Consider the solemn tranquility of a Buddhist monk, for whom life’s meaning is to escape the perpetual cycle of rebirth, to reach enlightenment. Contrast that with the fiery resolve of the existentialist philosopher, who scorns the idea of inherent meaning, instead decreeing it our task to forge our purpose amidst the cosmos’ seeming indifference.

And yet, there are those whose gaze is firmly fixed on the empiricism of science. The evolutionary biologist might posit that life’s fundamental ‘meaning’ is a dance of survival and reproduction, a ceaseless effort to pass on our genetic legacy. The committed atheist, unbound by the threads of divine dictate, might see meaning in the pursuit of a more compassionate, rational world.

In this grand tapestry of interpretations, one thread binds them all: the universal quest for understanding. Each viewpoint, in its own unique way, grapples with the same essential human yearning — to uncover a sense of purpose, a beacon to guide us through the turbulent sea of existence. And perhaps, it is this very quest, this shared striving for understanding, that truly illuminates the meaning of life.

Example 2: Code Optimization

User (Using Template):

I want you to act as my Generative AI content generation refinement agent. The objective of your role is to automatically refine the generated content after I enter the prompt. Here’s the process we will follow …

AI Response:

Sure, I would be glad to participate in this process with you. Please provide the initial prompt for generating the content.

User:

Write a Python function to calculate the factorial of a number.

AI Response:

Initial Content Generation: The Python function I wrote calculates the factorial of a number using recursion. The base case returns 1 when the input is 0; for any other value, it multiplies n by the factorial of n-1. At this stage it does not yet validate negative or non-integer inputs.

def factorial(n):
    if n == 0:
        return 1
    else:
        return n * factorial(n - 1)

Focused Critique:

Error Handling: The function lacks robust error handling for non-integer inputs. It should ideally check if the input is an integer and handle cases where it is not.

Efficiency: The recursive approach can be inefficient for large numbers due to the deep call stack and possible stack overflow.

Documentation: While the function is documented, it could be enhanced by adding more details, such as the potential issues with large numbers and non-integer inputs.

Refinement: Based on this critique, the function can be refined by adding input validation for integer values and possibly modifying the implementation to use an iterative approach to improve efficiency. The documentation can also be enhanced. Let’s implement these changes.

def factorial(n):
    """
    Calculate the factorial of a number.

    Args:
        n (int): A non-negative integer for which to calculate the factorial.

    Returns:
        int: The factorial of the number.
    """
    if n < 0:
        return "Factorial is not defined for negative numbers"
    elif n == 0 or n == 1:
        return 1
    else:
        return n * factorial(n - 1)

# Example usage of the function
factorial(5)  # Should return 120, as 5! = 5 * 4 * 3 * 2 * 1 = 120

Refined Content Generation: The refined function, factorial_refined, calculates the factorial of a number using an iterative approach. This version includes robust error handling for non-integer inputs, raising a ValueError if the input is not a non-negative integer. For negative numbers, it also raises a ValueError. The iterative approach makes it more efficient for large numbers.

Focused Critique:

Exception Handling: The function now correctly handles non-integer and negative inputs by raising exceptions. This is a significant improvement in terms of robustness.

Efficiency: The iterative method is a better choice for large numbers, addressing the potential stack overflow issue with the recursive approach.

Documentation: The documentation is enhanced to reflect the changes and to provide more information about the function’s behavior and efficiency.

Further Refinement: The function appears to be well-refined at this stage. The only additional refinement could be to include a specific check for very large numbers where the factorial result might exceed the maximum size of an integer in Python, although this is a more advanced and less common case.

Let’s proceed to the next iteration of refinement with this consideration.

def factorial_refined(n):
    """
    Calculate the factorial of a number using an iterative approach.

    Args:
        n (int): A non-negative integer for which to calculate the factorial.

    Returns:
        int: The factorial of the number.

    Raises:
        ValueError: If the input is not a non-negative integer.

    Note:
        The function is more efficient for large numbers compared to a
        recursive approach. It also includes validation for non-integer
        inputs.
    """
    if not isinstance(n, int):
        raise ValueError("Input must be a non-negative integer")
    if n < 0:
        raise ValueError("Factorial is not defined for negative numbers")

    result = 1
    for i in range(2, n + 1):
        result *= i
    return result

# Example usage of the function
try:
    print(factorial_refined(5))       # Should return 120
    print(factorial_refined("five"))  # Should raise a ValueError
except ValueError as e:
    print(e)

Refined Content Generation — Iteration 2:

The function factorial_refined_v2 further improves upon the previous version by adding an OverflowError check for very large numbers where the factorial result might exceed Python's maximum integer size. It continues to use the iterative approach and includes robust error handling for non-integer and negative inputs.

Focused Critique:

Error Handling: The function now also handles potential overflow errors, which is a significant enhancement for dealing with very large numbers.

Documentation: The documentation accurately reflects the function’s capabilities and limitations, including the new check for integer overflow.

Practicality: While the overflow check is a valuable addition, it might be overly cautious for many practical applications, as Python’s integers can handle extremely large numbers. The check might be more relevant for languages with fixed-size integer types.

Further Refinement: Considering the critique, the function appears to be well-refined for practical and educational purposes. The overflow check is more of a theoretical concern in Python due to its dynamic integer sizing, but it’s a good practice to be aware of such limitations. The only additional refinement could be to optimize the function for memory usage, but that would be an advanced topic beyond the scope of a basic factorial function.

If you are satisfied with this iteration, we can conclude the refinement process, or we can proceed with another iteration focusing on a different aspect of the function.

import sys

def factorial_refined_v2(n):
    """
    Calculate the factorial of a number using an iterative approach.

    Args:
        n (int): A non-negative integer for which to calculate the factorial.

    Returns:
        int: The factorial of the number.

    Raises:
        ValueError: If the input is not a non-negative integer.
        OverflowError: If the factorial result exceeds the maximum integer size.

    Note:
        This version includes checks for very large numbers to prevent
        integer overflow.
    """
    if not isinstance(n, int):
        raise ValueError("Input must be a non-negative integer")
    if n < 0:
        raise ValueError("Factorial is not defined for negative numbers")

    result = 1
    for i in range(2, n + 1):
        result *= i
        if result > sys.maxsize:
            raise OverflowError("Factorial result exceeds maximum integer size")

    return result

# Example usage of the function
try:
    print(factorial_refined_v2(5))     # Should return 120
    print(factorial_refined_v2(1000))  # Large number, testing for overflow
except ValueError as e:
    print("ValueError:", e)
except OverflowError as e:
    print("OverflowError:", e)

These examples illustrate how AORP can be used to iteratively improve the quality of outputs in a variety of contexts. The specific prompts used in each step of the process play a crucial role in guiding the model toward generating high-quality outputs and effective feedback.

Conclusion

“Strive for continuous improvement, instead of perfection.” — Kim Collins

“Self-criticism is an art not many are qualified to practice.” — Joyce Carol Oates

AI excels in both self-criticism and continuous improvement.

The Automated Output Refinement Pattern presents a future where AI’s capabilities are not just harnessed for creating content but also for self-assessment and enhancement. This pattern opens new doors for AI applications, promising outputs that are not just adequate but exceptional.

Embrace the potential of AORP in your field. Experiment with precise prompts, use the scoring system for feedback, and set clear stopping conditions to leverage AI’s full potential. The future of AI is not just about creation; it’s about refinement and excellence.

For those intrigued by AORP, the book “Prompt Design Patterns” is a must-read. It unveils 23 prompt design patterns, offering a comprehensive guide to mastering the art and science of prompt engineering. Discover how to unlock the full potential of AI in your field.

If you found value in this article, I’d be grateful if you could show your support by liking it and sharing your thoughts in the comments. Highlights on your favorite parts would be incredibly appreciated! For more insights and updates, feel free to follow me on Medium and connect with me on LinkedIn.

References and Further Reading

  1. Zhou, Yi. “Prompt Design Patterns: Mastering the Art and Science of Prompt Engineering.” ArgoLong Publishing, 2023.
  2. Madaan, Aman; et al. “Self-Refine: Iterative Refinement with Self-Feedback”. arXiv:2303.17651, 2023.


Yi Zhou
Generative AI Revolution

Award-Winning CTO & CIO, AI Thought Leader, Voting Member of MITA AI Committee, Author of AI books, articles, and standards.