Toshifumi Kuga

July 14, 2025

generative ai, Google DeepMind, prompt

Prompt Engineering Mastery: The Fast Track

Since the debut of ChatGPT at the end of November 2022, the way we give instructions to computers has completely changed. Previously, programming languages like Python were necessary, but with ChatGPT, it's now possible to give instructions using the "natural languages" we use every day, such as English and Japanese. These natural language instructions are called "prompts." It has been about two and a half years since prompts came into use, and many people are likely experimenting with various prompts daily. As this is a new technology, systematically learning it can be challenging. However, Google has released a free white paper (1) of over 60 pages on the topic, so let's explore it for some hints. Let's begin!

1. Grasping the Basic Concepts

We often see simple prompt guides like "The Top 20 Prompts You Need to Know." However, it's impossible to effectively interact with a generative AI, which holds a vast amount of knowledge, with just about 20 prompts. While it may seem like a shortcut, memorizing a recommended list of 20 prompts each time is laborious and inefficient. Various studies are being conducted on how to write prompts, and the theoretical background is being investigated. While it's difficult for the average person to grasp everything, Google's white paper summarizes it concisely as follows:

Zero-shot prompting
Few-shot prompting
System prompting
Role prompting
Contextual prompting
Step-back prompting
Chain of thought
Self-consistency
Tree of thoughts

For example, the second method, "Few-shot prompting," is a technique to elicit more accurate answers from a generative AI by providing it with specific examples in "question and answer pairs." The other methods also have their own theoretical backgrounds and wide ranges of application. Rather than rote memorization, it's important to first understand the concepts and then apply them. I cannot explain them all here, so I encourage you to read the original document. I recommend taking your time to learn them one by one.
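To make the idea of few-shot prompting concrete, here is a minimal Python sketch that places a few question and answer pairs ahead of the real question. The function name `build_few_shot_prompt`, the `Q:`/`A:` layout, and the sentiment examples are my own assumptions for illustration; the white paper does not prescribe this exact format.

```python
def build_few_shot_prompt(examples, question):
    """Place worked question/answer pairs before the new question."""
    lines = []
    for q, a in examples:
        lines.append(f"Q: {q}")
        lines.append(f"A: {a}")
        lines.append("")  # blank line between examples
    lines.append(f"Q: {question}")
    lines.append("A:")  # left open for the model to complete
    return "\n".join(lines)


# Hypothetical examples: a tiny sentiment classification task.
examples = [
    ("Classify the sentiment: I loved this movie.", "positive"),
    ("Classify the sentiment: The food was cold and bland.", "negative"),
]
prompt = build_few_shot_prompt(
    examples, "Classify the sentiment: The staff were friendly and helpful."
)
print(prompt)
```

The resulting string can be pasted into any chat interface or sent through whichever API client you use.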

2. Memorize Useful Words

That said, taking the first step to actually write a prompt can be quite daunting. Google has provided a list of recommended verbs, which I'd like to introduce here. Choosing from these verbs to craft your prompts might help you create good ones, so it's worth a try.

Act, Analyze, Categorize, Classify, Contrast, Compare, Create, Describe, Define, Evaluate, Extract, Find, Generate, Identify, List, Measure, Organize, Parse (analyze the grammatical structure of sentences or data), Pick, Predict, Provide, Rank, Recommend, Return, Retrieve (information, for example), Rewrite, Select, Show, Sort, Summarize, Translate, Write

When you're unsure what to write, these verbs might give you a hint. This list includes many that I frequently use myself.

3. Finding Hints from Actual Examples

When you actually try out prompts, you'll find that some cases work well while others don't. The white paper summarizes these into 15 Best Practices. Here, I'll introduce an example from page 56.

Be specific about the output

Be specific about the desired output. A concise instruction might not guide the LLM enough or could be too generic. Providing specific details in the prompt (through system or context prompting) can help the model to focus on what’s relevant, improving the overall accuracy.

Examples:

DO:

Generate a 3 paragraph blog post about the top 5 video game consoles. The blog post should be informative and engaging, and it should be written in a conversational style.

DO NOT:

Generate a blog post about video game consoles.

Indeed, we tend to write simple prompts like the bad example. However, if we can add a bit more information and write like the good example, the information we receive will be better tailored to our needs. Just knowing this can change how you write prompts from now on. This white paper is full of such examples, so I highly recommend you read it for yourself.
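If you write this kind of prompt often, one option is to turn the details into parameters so you are reminded to fill them in each time. The sketch below is only an illustration: the function name `build_specific_prompt` and its parameters are my own assumptions, and the wording simply reproduces the good example from the white paper.

```python
def build_specific_prompt(topic, paragraphs, tone, style):
    """Spell out length, tone, and style instead of leaving them to chance."""
    return (
        f"Generate a {paragraphs} paragraph blog post about {topic}. "
        f"The blog post should be {tone}, and it should be "
        f"written in a {style} style."
    )


# The vague version from the "DO NOT" example, for comparison.
vague = "Generate a blog post about video game consoles."

specific = build_specific_prompt(
    topic="the top 5 video game consoles",
    paragraphs=3,
    tone="informative and engaging",
    style="conversational",
)
print(specific)
```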

How was that? I hope this serves as a reference for your prompt learning journey. Prompt engineering is still in its infancy, making it a great time to start learning. Let's conclude with a message from Google: "You don’t need to be a data scientist or a machine learning engineer – everyone can write a prompt." (1)

Stay tuned!

1) "Prompt Engineering", Google, Feb 2025

Notice: ToshiStats Co., Ltd. and I do not accept any responsibility or liability for loss or damage occasioned to any person or property through using materials, instructions, methods, algorithms or ideas contained herein, or acting or refraining from acting as a result of such use. ToshiStats Co., Ltd. and I expressly disclaim all implied warranties, including merchantability or fitness for any particular purpose. There will be no duty on ToshiStats Co., Ltd. and me to correct any errors or defects in the codes and the software.

Toshifumi Kuga

June 16, 2025

prompt, AI agent, startup

The Cutting Edge of Prompt Engineering: A Look at a Silicon Valley Startup

Hello everyone. How often do you find yourselves writing prompts? I imagine more and more of you are writing them daily and conversing with generative AI. So today, we're going to look at the state of cutting-edge prompt engineering, using a case study from a Silicon Valley startup. Let's get started.

1. "Parahelp," a Customer Support AI Startup

There's a startup in Silicon Valley called "Parahelp" that provides AI-powered customer support. Impressively, they have publicly shared some of their internally developed prompt know-how (1). In the hyper-competitive world of AI startups, I want to thank the Parahelp management team for generously sharing their valuable knowledge to help those who come after them. The details are in the link below for you to review, but my key takeaway from their know-how is this: "The time spent writing the prompt itself isn't long, but what's crucial is dedicating time to the continuous process of executing, evaluating, and improving that prompt."

When we write prompts in a chat, we often want an immediate answer and tend to aim for "100% quality on the first try." However, it seems the style in cutting-edge prompt engineering is to meticulously refine a prompt through numerous revisions. For an AI startup to earn its clients' trust, this expertise is essential and may very well be the source of its competitive advantage. I believe "iteration" is the key for prompts as well.
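The loop of executing, evaluating, and improving a prompt can be sketched in a few lines of Python. Please treat this as a toy under stated assumptions: `fake_model` is a hypothetical stand-in for a real LLM call, and the test cases and the "contains the expected phrase" check are my own simplification, not Parahelp's actual evaluation method.

```python
def fake_model(prompt, user_message):
    """Stand-in for a real LLM call (hypothetical); swap in your own client."""
    # Toy behavior: the reply apologizes only if the prompt asks for it.
    reply = "Your refund has been issued."
    if "apologize" in prompt.lower():
        reply = "We apologize for the trouble. " + reply
    return reply


def evaluate(prompt, test_cases):
    """Return the fraction of test cases whose reply contains the expected phrase."""
    passed = 0
    for user_message, expected_phrase in test_cases:
        reply = fake_model(prompt, user_message)
        if expected_phrase.lower() in reply.lower():
            passed += 1
    return passed / len(test_cases)


# Hypothetical test cases: (customer message, phrase the reply should contain).
test_cases = [
    ("My order arrived broken.", "apologize"),
    ("I was charged twice.", "refund"),
]

# Each revision of the prompt is scored against the same test cases.
prompt_versions = [
    "You are a support agent. Answer the customer.",
    "You are a support agent. Apologize first, then answer the customer.",
]

scores = [evaluate(p, test_cases) for p in prompt_versions]
for version, score in enumerate(scores, start=1):
    print(f"v{version}: {score:.0%} of test cases passed")
```

Keeping the test cases fixed while the prompt changes is what makes the revisions comparable from one iteration to the next.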

2. Prompts That Look Like a Computer Program

Let's take a look at a portion of the published prompt. This is a prompt for an AI agent to behave as a manager, and even this is only about half of the full version.

Here is my analysis of the prompt above:

Assigning a persona (in this case, the role of a manager)
Describing tasks clearly and specifically
Listing detailed, numbered instructions
Providing important points as context
Defining the output format

I felt it adheres to the fundamental structure of a good prompt. Perhaps because it has been forged in the fierce competition of Silicon Valley, it is written with incredible precision. There's still more to it, so if you're interested, please view it from the link. It's written in even finer detail, and with its heavy use of XML tags, you could almost mistake it for a computer program. Incredible!
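The five elements in my analysis can be assembled into a tagged prompt with a small helper. This is a minimal sketch, not Parahelp's prompt: the function name `build_structured_prompt`, the tag names, and the example policy are all hypothetical and chosen only to show the structure.

```python
def build_structured_prompt(persona, task, instructions, context, output_format):
    """Combine persona, task, numbered instructions, context, and output format."""
    numbered = "\n".join(
        f"{i}. {step}" for i, step in enumerate(instructions, start=1)
    )
    return "\n".join([
        f"<role>{persona}</role>",
        f"<task>{task}</task>",
        "<instructions>",
        numbered,
        "</instructions>",
        f"<context>{context}</context>",
        f"<output_format>{output_format}</output_format>",
    ])


# Hypothetical content, for illustration only.
prompt = build_structured_prompt(
    persona="You are a manager of a customer support agent.",
    task="Decide whether the agent's proposed reply can be sent to the customer.",
    instructions=[
        "Read the customer's message and the proposed reply.",
        "Check the reply against the policy in the context.",
        "Accept or reject the reply and give a short reason.",
    ],
    context="Refunds are only possible within 30 days of purchase.",
    output_format="Write 'accept' or 'reject' on the first line, then one sentence of feedback.",
)
print(prompt)
```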

3. The Future of Prompt Engineering

I imagine that committing this much time and cost to prompt engineering is a high hurdle for the average business person. After learning the basics of prompt writing, many people struggle with what the next step should be.

One tip is to take a prompt you've written and feed it back to the generative AI with the task, "Please improve this prompt." This is called a "meta-prompt." Of course, the challenges of how to give instructions and how to evaluate the results still remain. At ToshiStats, we plan to explore meta-prompts further.
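A meta-prompt is simply a prompt that wraps another prompt, so it is easy to build in code. The sketch below is one possible wording under my own assumptions: the function name `build_meta_prompt`, the `goal` parameter, and the extra instruction lines are hypothetical additions around the core request.

```python
def build_meta_prompt(original_prompt, goal):
    """Wrap an existing prompt in a request to improve it."""
    return (
        "Please improve this prompt.\n"
        f"Goal of the prompt: {goal}\n"
        "Return only the improved prompt.\n\n"
        "Prompt to improve:\n"
        f'"""\n{original_prompt}\n"""'
    )


original = "Generate a blog post about video game consoles."
meta_prompt = build_meta_prompt(
    original, goal="A short, engaging blog post for casual readers."
)
print(meta_prompt)
```

Stating the goal alongside the original prompt gives the AI something to improve toward, although how to judge the improved result is still up to you.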

So, what did you think? Even the simple term "prompt" has a lot of depth, doesn't it? As generative AI continues to evolve, or as methods for creating multi-AI agents advance, I believe prompt engineering itself will also continue to evolve. It's definitely something to keep an eye on. I plan to provide an update on this topic in the near future.

That's all for today. Stay tuned!

ToshiStats Co., Ltd. offers various AI-related services. Please check them out here!

1) "Prompt design at Parahelp", Parahelp, May 28, 2025

Toshifumi Kuga

July 14, 2024

AI, artificial intelligence, generative ai, in context learning, prompt

Google DeepMind's new prompt engineering technique, "Many-Shot In-Context Learning," is amazing!

I recently came across an interesting research paper, "Many-Shot In-Context Learning" (1), by Google DeepMind, and I'd like to share a brief overview. Although it's a highly technical paper, it offers valuable insights that we can apply to our own prompt writing. Let's dive in.

1. Utilizing Context Effectively

When you write prompts for language models or generative AI like ChatGPT, you probably type in the question you want answered, much as you would with a search engine: for example, "What is the capital of Japan?" However, generative AI can handle much larger amounts of information. For example, as shown in the chart below, you can load a PDF document and then write a prompt like "Summarize this," and the AI will output a summary of the PDF's content. Think of a prompt as an "instruction to the generative AI." The additional information you provide is called the context.

2. What's Needed to Use Generative AI in a Business Setting

Now that we have a basic understanding of how to use generative AI, let's consider what's needed to use it in a company or business setting. Obviously, when you represent your company and interact with customers, you wouldn't express "personal opinions or feelings." You wouldn't say, "I personally don't think this new product will sell." Instead, companies have established rules and manuals that employees must follow, and employees are normally not allowed to violate them. Therefore, to be used in a company, generative AI must output answers that comply with that company's "rules and manuals," not just general answers. So, how do you convey these rules to the generative AI? One way is to input the "rules and manuals" directly into the generative AI along with the prompt, as shown in the chart above. Many recent generative AIs have "context windows" of 100,000 tokens or more. The context window is the amount of information that can be input and output at once, and 100,000 tokens is about 70,000 words in English, so you can input a considerable amount of "rules and manuals." Some models, like Google's Gemini 1.5 Pro, accept up to 2 million tokens, which is enough for about 3,000 pages of English manuals. That's amazing. Context windows this large are sometimes called "long context windows."
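To check roughly whether your own manuals would fit, you can turn the figures above into a small calculation. This is a back-of-the-envelope sketch: the ratio of about 0.7 words per token comes from the numbers quoted in the text, while the function names, the 2,000 tokens reserved for the answer, and the example manual size are my own assumptions. Real token counts depend on the model's tokenizer.

```python
# Ratio implied by the text: 100,000 tokens is about 70,000 English words.
WORDS_PER_TOKEN = 70_000 / 100_000


def estimate_tokens(word_count):
    """Rough token estimate for English text; real tokenizers will differ."""
    return round(word_count / WORDS_PER_TOKEN)


def fits_in_context(word_count, context_window_tokens, reserved_for_answer=2_000):
    """Check whether the text plus room for the answer fits in the window."""
    needed = estimate_tokens(word_count) + reserved_for_answer
    return needed <= context_window_tokens


manual_words = 50_000  # hypothetical size of a company manual
print("Estimated tokens:", estimate_tokens(manual_words))
print("Fits in a 100,000-token window:", fits_in_context(manual_words, 100_000))
print("Fits in a 2,000,000-token window:", fits_in_context(manual_words, 2_000_000))
```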

3. Many-Shot In-Context Learning

"Many-Shot In-Context Learning" is a technique that utilizes these "long context windows" even more effectively. You may have heard of a similar term, "Few-Shot Learning." "Few-Shot Learning" is a method where you first provide the generative AI with a few "question and answer pairs" as examples and then ask the question you want to know. For instance, you might give examples like "The capital of the United States is Washington, D.C." and "The capital of China is Beijing," and then ask the AI, "What is the capital of Japan?" "Many-Shot In-Context Learning" increases the number of these "question and answer pairs" to 10-10,000. This is said to improve accuracy. The graph below shows that in machine translation and summarization tasks, increasing the number of examples to 500-1,000 improves accuracy. 2 to the power of 10 is 1024. The idea is to put as many examples as possible into the "long context window" since it can easily handle them.

The relationship between accuracy and the number of examples in machine translation and summarization.
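If you want to try this yourself, one simple approach is to build the same prompt with different numbers of examples and compare the answers. The sketch below assumes shot counts in powers of 2 up to 1,024; the function name `build_many_shot_prompt` and the arithmetic examples are placeholders of my own, not data from the paper.

```python
def build_many_shot_prompt(examples, question, num_shots):
    """Prepend the first num_shots question/answer pairs to the final question."""
    blocks = [f"Q: {q}\nA: {a}" for q, a in examples[:num_shots]]
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)


# Placeholder pool of 1,024 examples; in practice, use real task data.
pool = [(f"What is {n} + {n}?", str(2 * n)) for n in range(1, 1025)]

# Shot counts 1, 2, 4, ..., 1024 (2 to the power of 0 through 10).
shot_counts = [2 ** i for i in range(11)]

prompts = {
    k: build_many_shot_prompt(pool, "What is 2000 + 2000?", k)
    for k in shot_counts
}

for k in shot_counts:
    print(f"{k:>4} shots -> {len(prompts[k].split()):>5} words in the prompt")
```

Sending each of these prompts to the same model and scoring the answers is one way to see whether more examples actually help on your own task.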

What do you think? If simply increasing the number of examples improves accuracy, it might be worth trying. For those who say, "I can't create so many examples myself," the "Many-Shot In-Context Learning" paper also suggests a method to create synthetic data using an LLM (large language model). If you're interested, please check out the paper. But if it's just about 10 examples, you could probably create them yourself. I'll give it a try and update here if I get good results. That's all for today. Stay tuned!

1) "Many-Shot In-Context Learning", Rishabh Agarwal, Avi Singh, Lei M. Zhang, Bernd Bohnet, Luis Rosias, Stephanie Chan, Biao Zhang, Ankesh Anand, Zaheer Abbas, Azade Nova, John D. Co-Reyes, Eric Chu, Feryal Behbahani, Aleksandra Faust, Hugo Larochelle, Google DeepMind, 22 May 2024, https://arxiv.org/abs/2404.11018

Toshifumi Kuga

December 31, 2023

synthetic data, AI, fine-tuning, LLM, Google DeepMind, prompt

"REST MEETS REACT" is a new prompt-engineering method using synthetic data. It holds immense potential for enhancing AI without relying on human-generated data

Happy New Year! Thank you for your continued support. Just in time for the new year, Google DeepMind has announced a new, advanced prompt engineering method. It is described in a paper titled "REST MEETS REACT: SELF-IMPROVEMENT FOR MULTI-STEP REASONING LLM AGENT" (1). It incorporates fine-tuning with synthetic data, which looks promising! Let's get started.

1. Prompt Structure

This prompt is designed with a web Q&A system in mind that answers complex questions. The structure is as follows:

The blue part in the figure above represents the flow of the agent described in the prompt, which aims to answer complex questions using web search. In the latter half, "Relevance self-check" and "Grounding self-check" are steps in which the agent checks its own answers. For a detailed explanation of the entire flow, please refer to the paper.

2. "Reward Model" - The Key to Success

Now, let's explain the core part of self-improvement. In a nutshell, it's about "creating new high-quality data and fine-tuning the model with it." This function consists of three parts:

Grow: Start with a model capable of running the Search Agent; the paper uses Google's PaLM 2-L model for this. Trajectories are collected for a selected set of 2,000 public questions. "Trajectory" may be an unfamiliar term; it is common in reinforcement learning and here refers to the agent's reasoning process.
Improve: Convert the trajectories into fine-tuning data, using the Reward model to select only the high-quality ones. No external data, such as labels, is used.
Fine-tuning: Fine-tune a new model of the same size on this new data, so that it performs better than the original.
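The three steps above can be sketched as a short loop. This is only a skeleton under my own assumptions, not the paper's implementation: the function `self_improve`, its parameters, the 0.5 threshold, and the toy stand-ins for the agent, the Reward model, and fine-tuning are all hypothetical. How the new model is initialized is left to whatever `fine_tune` function you pass in.

```python
def self_improve(model, questions, run_agent, reward_model, fine_tune,
                 rounds=2, threshold=0.5):
    """Repeat Grow -> Improve -> Fine-tune for a number of rounds."""
    for _ in range(rounds):
        # Grow: collect one trajectory per question with the current model.
        trajectories = [run_agent(model, q) for q in questions]
        # Improve: keep only trajectories the reward model rates highly.
        kept = [t for t in trajectories if reward_model(t) >= threshold]
        # Fine-tune: train a new model on the filtered, self-generated data.
        model = fine_tune(model, kept)
    return model


# Toy stand-ins so the skeleton runs; a "model" here is just a version number.
history = []


def run_agent(model, question):
    return {"question": question, "answer": f"[model v{model}] answer to {question}"}


def reward_model(trajectory):
    return 1.0 if trajectory["answer"] else 0.0


def fine_tune(model, kept):
    history.append(len(kept))  # record how many trajectories were used
    return model + 1


final_model = self_improve(0, ["q1", "q2", "q3"], run_agent, reward_model, fine_tune)
print("Final model version:", final_model)
print("Trajectories kept per round:", history)
```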

This process is then repeated, with the improved model generating the new data. As a result, accuracy improves using only the original set of questions, without adding any external data. The accuracy with which the Reward model ranks trajectories is therefore crucial. In this paper, the Reward model is constructed as a set of prompts. Let's look more closely at these prompts; only the opening part is shown here.

The goal of this rating is to filter out bad actions so that they'll be excluded from the fine-tuning dataset.
Overall, we want the agent to produce relevant and grounded answers with minimal steps. Anything deviating from this goal is considered bad.
If any element (thoughts, comments, etc.) is empty, then it's automatically bad.

"Filter out" indicates a method of discarding items that don't meet the standards and adopting only the high-quality data that remains. Please see the paper (p19) for details.

3. Improve Accuracy with Synthetic Data

Several papers, including this one, were published in late 2023 on using a Reward model to create high-quality synthetic data for model fine-tuning and accuracy improvement. Vigorous research is expected to continue in 2024, yielding various results. Especially in the LLM field, collecting high-quality training data is becoming increasingly difficult, and fine-tuning with synthetic data is anticipated as a solution.

How was it? Improving model accuracy with synthetic data is expected to be a very effective development method for startups like us, who cannot collect vast amounts of data independently. Our blog will continue to follow synthetic data and other technological innovations, so stay tuned. Wishing you a great year!

1) "REST MEETS REACT: SELF-IMPROVEMENT FOR MULTI-STEP REASONING LLM AGENT", Renat Aksitov, Sobhan Miryoosefi, Zonglin Li, Daliang Li, Sheila Babayan, Kavya Kopparapu, Zachary Fisher, Ruiqi Guo, Sushant Prakash, Pranesh Srinivasan, Manzil Zaheer, Felix Yu, Sanjiv Kumar, Google Research, Google DeepMind, Google, 15 Dec 2023, https://arxiv.org/abs/2312.10003