
How can we achieve best practices for constructing multi-agent AI systems?

Lately, I've been hearing a lot about multi-agent AI systems. As someone who is always thinking about not just using these services but building them myself, I've been keen to know how to construct high-performance AI agents. Last week, Anthropic published an article titled "How we built our multi-agent research system" (1), which describes their construction method in detail. So today, using this article as a reference, I'd like to explore the best practices for creating multi-agent AI systems with all of you. Let's get started!

 



1. Why do we need so many agents?

ChatGPT, which debuted at the end of November 2022, was a single model. Since then, several services using generative AI have appeared, but initially, most of them used a single AI. So why have we recently seen a rise in methods that connect multiple generative AIs to operate as a single system? I believe it's because it has become clear that there are limits to what a single generative AI can accomplish when faced with complex tasks. It has gradually become apparent that by connecting and integrating several agents, even complex tasks can be handled. This trend has become particularly noticeable in conjunction with the performance improvements of standalone generative AI models like Gemini 1.5 Pro and OpenAI's o3.

 

2. What kind of agent structure should we build?

The Anthropic article included a wonderful chart that I'd love to reference. The key lies with the "Lead agent" and the "sub-agents" placed beneath it.

Here is Anthropic's explanation: "The multi-agent architecture in action: user queries flow through a lead agent that creates specialized subagents to search for different aspects in parallel." While the chart shows three sub-agents, more can of course be spawned to handle more complex tasks.
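
To make the pattern concrete, here is a minimal sketch of that lead-agent/sub-agent flow in Python. This is not Anthropic's implementation; call_llm is a hypothetical stand-in for whatever LLM API you use, and the planning step is reduced to a single decomposition prompt.

```python
# A minimal sketch of the lead-agent / sub-agent pattern (not Anthropic's code).
# call_llm() is a hypothetical placeholder for any LLM API client.
from concurrent.futures import ThreadPoolExecutor

def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM API call (e.g., an HTTP request to a model endpoint)."""
    return f"[model response to: {prompt[:60]}...]"

def lead_agent(user_query: str) -> str:
    # 1. The lead agent decomposes the query into subtasks for sub-agents.
    plan = call_llm(
        "Decompose the following research question into 3 focused subtasks, "
        "one per line:\n" + user_query
    )
    subtasks = [line.strip() for line in plan.splitlines() if line.strip()]

    # 2. Sub-agents work on their subtasks in parallel.
    with ThreadPoolExecutor(max_workers=len(subtasks) or 1) as pool:
        findings = list(pool.map(
            lambda task: call_llm(f"You are a research sub-agent. Subtask: {task}"),
            subtasks,
        ))

    # 3. The lead agent synthesizes the sub-agents' findings into one answer.
    return call_llm(
        "Synthesize these findings into a single report:\n" + "\n\n".join(findings)
    )

if __name__ == "__main__":
    print(lead_agent("What caused the recent semiconductor shortage?"))
```

In a real system, the number of sub-agents, their tools, and their task descriptions would all come from the lead agent's own reasoning rather than a fixed template.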

 

3. How do you coordinate many agents?

I've described the move to multi-agent AI as if it's all upside, but making it work requires numerous AI agents to behave as expected. Getting the desired response from even a single generative AI can be quite a challenge, so is it really possible to control multiple, simultaneously operating AI agents so that they meet our expectations? The key seems to lie in the prompt. In fact, the Anthropic article contains many very helpful methods for prompt creation. Here, I'd like to introduce two representative examples; for the rest, I highly recommend reading the original article yourself.

"Teach the orchestrator how to delegate. In our system, the lead agent decomposes queries into subtasks and describes them to subagents. Each subagent needs an objective, an output format, guidance on the tools and sources to use, and clear task boundaries. Without detailed task descriptions, agents duplicate work, leave gaps, or fail to find necessary information.

"Guide the thinking process. Extended thinking mode, which leads Claude to output additional tokens in a visible thinking process, can serve as a controllable scratchpad. The lead agent uses thinking to plan its approach, assessing which tools fit the task, determining query complexity and subagent count, and defining each subagent’s role.

In a nutshell, I think it comes down to "describing things meticulously." Apparently, simple and short instructions like "Research the semiconductor shortage" did not work well, so it seems necessary to write prompts for multi-agent AI as meticulously as possible. I'm going to work on writing better prompts from now on.
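
To illustrate what "meticulous" means in practice, here is a hypothetical delegation prompt template covering the four elements Anthropic lists (objective, output format, tool and source guidance, task boundaries). The field names, the web_search tool, and the wording are my own inventions, not Anthropic's.

```python
# Hypothetical delegation prompt template for a sub-agent (my own wording, for illustration).
SUBTASK_PROMPT = """You are a research sub-agent working on one part of a larger question.

Objective:
{objective}

Output format:
Return a bulleted list of findings, each with a one-line source note.

Tools and sources:
Use the web_search tool only; prefer primary sources published after {year}.

Task boundaries:
Cover only {scope}. Do NOT research adjacent topics; other sub-agents handle them.
"""

prompt = SUBTASK_PROMPT.format(
    objective="Identify the main supply-side causes of the semiconductor shortage.",
    year=2020,
    scope="manufacturing capacity and raw-material constraints",
)
print(prompt)
```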

 

What did you think? It appears that various techniques are necessary to make multi-agent AI systems operate as intended. As the performance of generative AI improves, the required orchestration techniques will also change. I intend to stay up to date and keep incorporating the latest cutting-edge techniques. That's all for today. Stay tuned!



Toshi Stats Co., Ltd. provides a wide range of AI-related services. Please see here for more details!

Copyright © 2025 Toshifumi Kuga. All rights reserved.

(1) "How we built our multi-agent research system", Anthropic, June 13, 2025









Notice: ToshiStats Co., Ltd. and I do not accept any responsibility or liability for loss or damage occasioned to any person or property through using materials, instructions, methods, algorithms or ideas contained herein, or acting or refraining from acting as a result of such use. ToshiStats Co., Ltd. and I expressly disclaim all implied warranties, including merchantability or fitness for any particular purpose. There will be no duty on ToshiStats Co., Ltd. and me to correct any errors or defects in the codes and the software.

The Cutting Edge of Prompt Engineering: A Look at a Silicon Valley Startup

Hello everyone. How often do you find yourselves writing prompts? I imagine more and more of you are writing them daily and conversing with generative AI. So today, we're going to look at the state of cutting-edge prompt engineering, using a case study from a Silicon Valley startup. Let's get started.

 

1. "Parahelp," a Customer Support AI Startup

There's a startup in Silicon Valley called "Parahelp" that provides AI-powered customer support. Impressively, they have publicly shared some of their internally developed prompt know-how (1). In the hyper-competitive world of AI startups, I want to thank the Parahelp management team for generously sharing their valuable knowledge to help those who come after them. The details are in the link below for you to review, but my key takeaway from their know-how is this: "The time spent writing the prompt itself isn't long, but what's crucial is dedicating time to the continuous process of executing, evaluating, and improving that prompt."

When we write prompts in a chat, we often want an immediate answer and tend to aim for "100% quality on the first try." However, it seems the style in cutting-edge prompt engineering is to meticulously refine a prompt through numerous revisions. For an AI startup to earn its clients' trust, this expertise is essential and may very well be the source of its competitive advantage. I believe "iteration" is the key for prompts as well.
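
A minimal sketch of that execute-evaluate-improve loop, under the assumption of a placeholder call_llm function and a toy keyword-based score, might look like this:

```python
# Sketch of an execute -> evaluate -> improve loop for a prompt (illustrative only).
def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM API call."""
    return f"[model output for: {prompt[:40]}...]"

def score(output: str, expected_keywords: list[str]) -> float:
    """Toy evaluation: fraction of expected keywords that appear in the output."""
    hits = sum(1 for kw in expected_keywords if kw.lower() in output.lower())
    return hits / len(expected_keywords)

prompt = "Summarize the customer's issue and propose a concrete next step."
test_case = {"input": "My order #123 arrived damaged.", "keywords": ["refund", "replacement"]}

for round_number in range(3):
    output = call_llm(prompt + "\n\nCustomer message:\n" + test_case["input"])
    quality = score(output, test_case["keywords"])
    print(f"round {round_number}: score = {quality:.2f}")
    if quality >= 0.9:
        break
    # Revise the prompt (by hand or with the model's help) and evaluate again.
    prompt = call_llm("Improve this prompt so the reply names a concrete remedy:\n" + prompt)
```

Real evaluation pipelines use much richer test sets and scoring methods, but the loop structure is the point: run, measure, revise, repeat.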

 

2. Prompts That Look Like a Computer Program

Let's take a look at a portion of the published prompt. This is a prompt for an AI agent to behave as a manager, and even this is only about half of the full version.

(Figure: the structure of Parahelp's published manager prompt)

Here is my analysis of the prompt above:

  • Assigning a persona (in this case, the role of a manager)

  • Describing tasks clearly and specifically

  • Listing detailed, numbered instructions

  • Providing important points as context

  • Defining the output format

I felt it adheres to the fundamental structure of a good prompt. Perhaps because it has been forged in the fierce competition of Silicon Valley, it is written with incredible precision. There's still more to it, so if you're interested, please view it from the link. It's written in even finer detail, and with its heavy use of XML tags, you could almost mistake it for a computer program. Incredible!
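
As a rough, invented illustration of that structure (persona, task, numbered instructions, context, output format) expressed with XML tags, a manager-style prompt might look like the sketch below. It is modeled on the elements listed above, not copied from Parahelp's actual prompt.

```python
# Invented example of an XML-structured manager prompt (not Parahelp's actual text).
MANAGER_PROMPT = """
<role>
You are a customer-support manager agent. You review the draft replies written by
worker agents before they are sent to the customer.
</role>

<task>
Decide whether the draft reply resolves the customer's issue and complies with policy.
</task>

<instructions>
1. Read the customer's message and the draft reply.
2. Check the draft against the policy notes in <context>.
3. If the draft is acceptable, approve it; otherwise list the required changes.
</instructions>

<context>
- Refunds above $100 require human approval.
- Never promise delivery dates.
</context>

<output_format>
Return JSON: {"approved": true or false, "required_changes": ["..."]}
</output_format>
"""
```

The XML tags don't do anything magical; they simply give the model unambiguous boundaries between the persona, the task, the rules, and the expected output.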

 

3. The Future of Prompt Engineering

I imagine that committing this much time and cost to prompt engineering is a high hurdle for the average business person. After learning the basics of prompt writing, many people struggle with what the next step should be.

One tip is to take a prompt you've written and feed it back to the generative AI with the task, "Please improve this prompt." This is called a "meta-prompt." Of course, the challenges of how to give instructions and how to evaluate the results still remain. At Toshi Stats, we plan to explore meta-prompts further.
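
As a minimal sketch, a meta-prompt can be as simple as wrapping your existing prompt in an improvement request; call_llm below is a hypothetical placeholder for whatever model API you use.

```python
# A minimal meta-prompt: ask the model to improve an existing prompt (illustrative).
def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM API call."""
    return "[improved prompt text]"

original_prompt = "Summarize this quarterly report for the sales team."

meta_prompt = (
    "Please improve this prompt. Make the audience, desired length, and output format "
    "explicit, and keep the original intent:\n\n" + original_prompt
)

improved_prompt = call_llm(meta_prompt)
print(improved_prompt)
```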

 

So, what did you think? Even the simple term "prompt" has a lot of depth, doesn't it? As generative AI continues to evolve and methods for building multi-agent AI systems advance, I believe prompt engineering itself will also continue to evolve. It's definitely something to keep an eye on. I plan to provide an update on this topic in the near future.

That's all for today. Stay tuned!

 

ToshiStats Co., Ltd. offers various AI-related services. Please check them out here!

 

Copyright © 2025 Toshifumi Kuga. All rights reserved.

(1) "Prompt design at Parahelp", Parahelp, May 28, 2025

 






Notice: ToshiStats Co., Ltd. and I do not accept any responsibility or liability for loss or damage occasioned to any person or property through using materials, instructions, methods, algorithms or ideas contained herein, or acting or refraining from acting as a result of such use. ToshiStats Co., Ltd. and I expressly disclaim all implied warranties, including merchantability or fitness for any particular purpose. There will be no duty on ToshiStats Co., Ltd. and me to correct any errors or defects in the codes and the software.





Google DeepMind's new prompt engineering technique, "Many-Shot In-Context Learning," is amazing!

I recently came across an interesting research paper, "Many-Shot In-Context Learning" (1), by Google DeepMind, and I'd like to share a brief overview. Although it's a highly technical paper, it offers valuable insights that we can apply to our own prompt writing. Let's dive in.

 

1. Utilizing Context Effectively

When you write prompts for language models or generative AI like ChatGPT, you probably type in a short query, much as you would in a search engine, such as "What is the capital of Japan?" However, generative AI can handle much larger amounts of information. For example, as shown in the chart below, you can load a PDF document and then write a prompt like "Summarize this," and the AI will output a summary of the PDF's content. Think of a prompt as an "instruction to the generative AI"; the additional information you provide alongside it is called the context.
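
In practice, "prompt plus context" often just means concatenating the instruction and the document text into one request. Here is a minimal sketch, with call_llm as a hypothetical stand-in for any model API and a short string standing in for the extracted PDF text.

```python
# Sketch: sending an instruction plus supporting context in one request (illustrative).
def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM API call."""
    return "[summary of the provided document]"

# In practice this would be text extracted from a PDF; a short string stands in here.
document = (
    "Q3 revenue grew 12% year over year, driven by the new subscription tier. "
    "Churn rose slightly in the enterprise segment."
)

prompt = "Summarize the following document in three bullet points."
response = call_llm(prompt + "\n\n--- context ---\n" + document)
print(response)
```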

 


2. What's Needed to Use Generative AI in a Business Setting

Now that we have a basic understanding of how to use generative AI, let's consider what's needed to use it in a company or business setting. Obviously, when you represent your company and interact with customers, you don't express personal opinions or feelings; you wouldn't say, "I personally don't think this new product will sell." Companies have established rules and manuals that employees must follow, and employees normally cannot violate them. Therefore, to use generative AI in a company, it must output answers that comply with each company's rules and manuals, not just generic answers.

So how do you convey these rules to the generative AI? One way is to input the rules and manuals directly, along with the prompt, as shown in the chart above. Many recent generative AI models have "context windows" of 100,000 tokens or more. The context window is the amount of information the model can take in and produce at once; 100,000 tokens is roughly 70,000 English words, so you can input a considerable amount of rules and manuals. Some models, like Google's Gemini 1.5 Pro, accept up to 2 million tokens, enough for about 3,000 pages of English manuals. That's amazing. These large context windows are sometimes called "long context windows."
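
A quick back-of-the-envelope check, using the rough ratio implied above (about 0.7 English words per token; the exact figure depends on the tokenizer and the text), shows how many pages of manuals fit into these windows:

```python
# Rough estimate of whether a company manual fits in a model's context window.
# The words-per-token ratio (~0.7 for English) is a rule of thumb, not exact.
WORDS_PER_TOKEN = 0.7

def words_that_fit(context_window_tokens: int) -> int:
    return int(context_window_tokens * WORDS_PER_TOKEN)

for model, window in [("100k-token model", 100_000), ("Gemini 1.5 Pro (2M tokens)", 2_000_000)]:
    words = words_that_fit(window)
    pages = words // 500  # assuming roughly 500 words per printed page
    print(f"{model}: ~{words:,} words, roughly {pages:,} pages of manuals")
```

These rough numbers line up with the figures quoted above: around 70,000 words for a 100,000-token window, and on the order of 3,000 pages for 2 million tokens.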

 


3. Many-Shot In-Context Learning

"Many-Shot In-Context Learning" is a technique that utilizes these "long context windows" even more effectively. You may have heard of a similar term, "Few-Shot Learning." "Few-Shot Learning" is a method where you first provide the generative AI with a few "question and answer pairs" as examples and then ask the question you want to know. For instance, you might give examples like "The capital of the United States is Washington, D.C." and "The capital of China is Beijing," and then ask the AI, "What is the capital of Japan?" "Many-Shot In-Context Learning" increases the number of these "question and answer pairs" to 10-10,000. This is said to improve accuracy. The graph below shows that in machine translation and summarization tasks, increasing the number of examples to 500-1,000 improves accuracy. 2 to the power of 10 is 1024. The idea is to put as many examples as possible into the "long context window" since it can easily handle them.

(Figure: the relationship between accuracy and the number of examples in machine translation and summarization tasks)
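
The mechanics themselves are simple: you build the prompt from many question-answer pairs and append the real question at the end. A toy sketch (the examples and call_llm are placeholders for illustration):

```python
# Building a many-shot prompt from question-answer example pairs (toy illustration).
def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM API call."""
    return "Tokyo"

examples = [
    ("What is the capital of the United States?", "Washington, D.C."),
    ("What is the capital of China?", "Beijing"),
    # ...in many-shot in-context learning this list would hold hundreds or thousands
    # of pairs, limited only by the model's context window.
]

shots = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
prompt = shots + "\n\nQ: What is the capital of Japan?\nA:"
print(call_llm(prompt))
```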

 


What do you think? If simply increasing the number of examples improves accuracy, it might be worth trying. For those who say, "I can't create that many examples myself," the paper also suggests a method for generating synthetic examples with an LLM. If you're interested, please check out the paper. And if it's only about 10 examples, you could probably create them yourself. I'll give it a try and post an update here if I get good results. That's all for today. Stay tuned!

 






1) "Many-Shot In-Context Learning", Rishabh Agarwal, Avi Singh, Lei M. Zhang, Bernd Bohnet, Luis Rosias, Stephanie Chan, Biao Zhang, Ankesh Anand, Zaheer Abbas, Azade Nova, John D. Co-Reyes, Eric Chu, Feryal Behbahani, Aleksandra Faust, Hugo Larochelle, Google DeepMind, 22 May 2024,  https://arxiv.org/abs/2404.11018



Copyright © 2024 Toshifumi Kuga. All rights reserved.




Notice: ToshiStats Co., Ltd. and I do not accept any responsibility or liability for loss or damage occasioned to any person or property through using materials, instructions, methods, algorithms or ideas contained herein, or acting or refraining from acting as a result of such use. ToshiStats Co., Ltd. and I expressly disclaim all implied warranties, including merchantability or fitness for any particular purpose. There will be no duty on ToshiStats Co., Ltd. and me to correct any errors or defects in the codes and the software.