How to structure your API prompt calls for Open-Source LLMs

Short Summary of ``Prompt-Engineering for Open-Source LLMs''

Credit: Dr. Sharon Zhou, Co-Founder & CEO of Lamini, for sharing valuable insights in her talk "Prompt-Engineering for Open-Source LLMs".

This blog targets builders, developers, AI engineers, and ML engineers, highlighting the nuances of prompt engineering for open-source LLMs, specifically Mistral-7B-Instruct-v0.1 and Llama-2-7b-chat-hf, compared to closed-source models like GPT-3.5 and GPT-4.

Open vs. Closed LLMs

When switching between LLMs, especially from closed-source models like OpenAI's GPT series to open-source models like Llama, you may find that many of your prompts no longer work.

Open-source LLMs like Mistral-7B and Llama-2-7B-Chat require a different approach to prompting than closed-source models like GPT-3.5 or GPT-4. The primary reason is "metatags": special tokens that each model was trained to recognize and respond to. Closed-source chat APIs insert these behind the scenes; with open-source models, you must add them to your prompts yourself.

Adding Metatags to your prompts

When working with open-source LLMs, it's essential to know the metatags: the specific instruction syntax each model requires for effective prompting. Each model has its own template that must be followed for optimal results.

Fortunately, these are just strings that you add to your prompts when calling the LLMs.


In Mistral-7B, the prompt structure should follow:

<s>[INST] {text} [/INST]
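This wrapping can be expressed as a small helper function (a sketch; the function name is illustrative, not part of any library):

```python
def wrap_mistral(text: str) -> str:
    """Wrap a user prompt in the Mistral-7B-Instruct metatags."""
    return f"<s>[INST] {text} [/INST]"

print(wrap_mistral("Am I healthy?"))
# prints: <s>[INST] Am I healthy? [/INST]
```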


For Llama-2, the structure is slightly different:

<s>[INST] <<SYS>>\n{system_text}\n<</SYS>>\n\n{text} [/INST]
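The Llama-2 template can likewise be captured in a helper (again, an illustrative sketch rather than a library function):

```python
def wrap_llama2(system_text: str, text: str) -> str:
    """Wrap a system prompt and a user prompt in the Llama-2-chat metatags."""
    return f"<s>[INST] <<SYS>>\n{system_text}\n<</SYS>>\n\n{text} [/INST]"
```

Note that the system portion sits inside the same [INST] block as the user text, separated by the <<SYS>> markers.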

Practical Examples

Example Prompts

Let's consider a few example prompts:

  1. "Given the fact that I'm drinking green juice, am I healthy?"

  2. "I'm drinking green juice"

Using Mistral-7B-Instruct-v0.1

With Mistral-7B, we need to wrap our prompts as follows:

prompts = [
    "Given the fact that I'm drinking green juice, am I healthy?",
    "I'm drinking green juice",
    # ... other prompts
]

for prompt in prompts:
    # Imagine llm.generate() is the function to get responses from Mistral-7B
    wrapped = f"<s>[INST] {prompt} [/INST]"
    print(f"=============Prompt: {wrapped}=============")
    print(llm.generate(wrapped))

Using Llama-2-7b-chat

For Llama-2-7b-chat, the structuring is different. Each prompt is split into a system part and a user part, and both are wrapped in the metatags:

prompts = [
    {
        "system": "You are a health food nut.",
        "user": "I'm drinking green juice",
    },
    # ... other prompts
]

for prompt in prompts:
    hydrated_prompt = f"<s>[INST] <<SYS>>\n{prompt['system']}\n<</SYS>>\n\n{prompt['user']} [/INST]"
    # Imagine llm_compare.generate() is the function for Llama-2-7b-chat
    print(f"=============Prompt: {hydrated_prompt}=============")
    print(llm_compare.generate(hydrated_prompt))

In these examples, you can see that structuring a prompt amounts to wrapping the prompt text in the model-specific metatags, here via f-strings. Understanding each model's template and tailoring prompts accordingly is essential to extracting the best performance from it.
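If you work with several model families, one way to keep this manageable is a small dispatch helper that hydrates a prompt based on the model name. This is a minimal sketch under the assumption that the model name encodes the family; the function and names are illustrative:

```python
def hydrate_prompt(model: str, user_text: str, system_text: str = "") -> str:
    """Return user_text wrapped in the metatags for the given model family."""
    if model.startswith("mistral"):
        return f"<s>[INST] {user_text} [/INST]"
    if model.startswith("llama-2"):
        return f"<s>[INST] <<SYS>>\n{system_text}\n<</SYS>>\n\n{user_text} [/INST]"
    # Fall back to the raw prompt for models with no known template
    return user_text

print(hydrate_prompt("mistral-7b", "Am I healthy?"))
# prints: <s>[INST] Am I healthy? [/INST]
```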


Update your open-source prompts! Use metatags!

Credit and thanks to Dr. Sharon Zhou for sharing her expertise on this topic along with coding examples.