
Predicting Loan Payback through "Agent Skills": The New Standard for Enterprise AI

The most common complaint about AI agents in business? 'The output isn't what I wanted.' In a corporate setting, consistency is everything, and without pre-defined formats, users get lost. Instead of just teaching everyone to prompt better, why not embed that expertise into the organization itself? By providing standardized prompts upfront, users can get consistent results from day one. The secret to this is 'Agent Skills' (1). Let’s see how it works!

 

1. What are Agent Skills?

Announced as "skills" by the AI giant Anthropic in October 2025, Agent Skills have since been adopted by almost every major AI company. They have become the de facto standard for providing domain-specific knowledge to generative AI. According to Anthropic:

“Agent Skills are modular capabilities that extend Claude's functionality. Each Skill packages instructions, metadata, and optional resources (scripts, templates) that Claude uses automatically when relevant.”

The beauty of defined Agent Skills is their portability—once created, they can be used across different platforms.

 

2. Creating Agent Skills

Now, let's dive right in. I’m going to create an 'Agent Skill' using Claude Cowork. I uploaded the PRD (Product Requirements Document) I typically use for building prediction models and entered the following prompt.

Claude Cowork

Since Claude Cowork has a built-in skill creator, it automatically generates an Agent Skills folder containing a SKILL.md file. This file stores the most fundamental information for the Agent Skill, and its header always includes the following content. AI agents like Claude Code are designed to read this section first.

SKILL.md 1
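As a rough illustration of why the header matters, the sketch below reads the metadata block at the top of a skill file, much as an agent might before deciding whether to load the rest. It assumes the header is a block of `key: value` lines between two `---` lines with `name` and `description` fields, as described in Anthropic's Agent Skills documentation. The skill text itself (`prediction-model-builder`, its description, and the body) is a made-up stand-in, not the file generated in this project.

```python
# Hypothetical skill file content; the real file in this project is about 240 lines long.
SKILL_TEXT = """---
name: prediction-model-builder
description: Builds binary classification models from tabular data. Use when asked to create a prediction model.
---

# Prediction model workflow
1. Load and validate the data.
2. Create ratio-based features.
"""


def read_skill_header(text: str) -> dict[str, str]:
    """Return the key/value pairs found between the first two '---' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("skill file must start with a '---' header line")
    header = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return header
        key, sep, value = line.partition(":")
        if sep:  # skip lines that are not 'key: value'
            header[key.strip()] = value.strip()
    raise ValueError("header is missing its closing '---' line")


header = read_skill_header(SKILL_TEXT)
print(header["name"])
print(header["description"])
```

This parser only handles simple one-line pairs; a real header is YAML and would be better read with a YAML parser.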

For tasks related to predictive modeling, the agent reads the specific implementation logic defined in the skill (which, in this case, spans about 240 lines) before moving to the coding phase.

SKILL.md 2

 

3. Building a Prediction Model via Agent Skills

Next, I utilized Claude Code for agentic coding. As shown below, the "skills" we just created are active and recognized by the environment.

Claude Code

Because the detailed modeling process is already governed by the Agent Skill, my manual prompt can be as simple as: "Please create a prediction model." For this project, I used data from the Kaggle "Predicting Loan Payback" competition (2), where the goal is to predict whether a borrower will repay their loan. The entire implementation was completed in about two hours with almost no manual corrections. The stability of Opus 4.6 (3) is truly remarkable!

The model achieved an AUC of 0.92435 on the Kaggle leaderboard—a score that is well within the range of practical, production-ready application.

Kaggle leaderboard
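For readers less familiar with the metric, AUC can be read as the probability that a randomly chosen borrower who repaid is scored higher than a randomly chosen borrower who did not. The sketch below computes it by direct pairwise comparison on a handful of made-up scores (not competition data). It is meant to show the definition only; a library routine would be the practical choice for real data.

```python
def auc_pairwise(labels: list[int], scores: list[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Illustrative labels (1 = repaid) and predicted repayment probabilities.
labels = [1, 1, 1, 0, 1, 0]
scores = [0.96, 0.81, 0.40, 0.55, 0.73, 0.20]
auc = auc_pairwise(labels, scores)
print(round(auc, 3))
```

In this toy data, seven of the eight positive/negative pairs are ordered correctly, so the printed value is 0.875.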

One secret behind this high accuracy was the creation of new features based on ratios. By analyzing feature importance, we ensured only the most impactful variables were included in the final model.

new features based on ratios
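The generated code itself is not reproduced in this post, so the following is only a minimal sketch of the idea: build a ratio for every ordered pair of numeric columns, then keep the features with the highest importance scores. The column names, the customer record, and the importance scores are all hypothetical, and filling a zero denominator with 0.0 is an arbitrary choice for this example.

```python
from itertools import permutations


def add_ratio_features(row: dict[str, float], columns: list[str]) -> dict[str, float]:
    """Return a copy of `row` with a `<a>_per_<b>` ratio for every ordered column pair."""
    out = dict(row)
    for a, b in permutations(columns, 2):
        # Guard against division by zero; 0.0 is an arbitrary fill value for this sketch.
        out[f"{a}_per_{b}"] = row[a] / row[b] if row[b] != 0 else 0.0
    return out


def keep_top_features(importance: dict[str, float], k: int) -> list[str]:
    """Names of the k features with the highest importance score."""
    return sorted(importance, key=importance.get, reverse=True)[:k]


# Hypothetical customer record and column names.
customer = {"loan_amount": 12000.0, "annual_income": 48000.0, "credit_score": 680.0}
features = add_ratio_features(customer, ["loan_amount", "annual_income", "credit_score"])
print(features["loan_amount_per_annual_income"])

# Hypothetical importance scores, e.g. as reported by a trained model.
importance = {
    "loan_amount_per_annual_income": 310.0,
    "credit_score": 250.0,
    "annual_income_per_credit_score": 12.0,
    "loan_amount": 95.0,
}
print(keep_top_features(importance, k=2))
```

Three base columns yield six ratios, which shows how quickly the feature count grows and why an importance-based filter is useful afterwards.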

 

4. Testing the Resulting Model

Let’s look at the model built via Agent Skills in action. First, we calculate the probability of repayment for an individual customer. In this example, the probability exceeds 96%, resulting in a "Success" (likely to repay) classification based on a 50% threshold. This threshold is, of course, adjustable depending on the specific business objectives.

prediction for an individual customer
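The thresholding step can be sketched in a few lines. The function name and the "Failure" label are assumptions made for this example; only the 50% default and the "Success" label come from the description above.

```python
def classify(probability: float, threshold: float = 0.5) -> str:
    """Label a repayment probability as 'Success' or 'Failure' at the given cut-off."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return "Success" if probability >= threshold else "Failure"


print(classify(0.96))                  # default 50% threshold
print(classify(0.96, threshold=0.98))  # a stricter, more risk-averse cut-off
```

A lender that wants fewer defaults at the cost of rejecting more applicants would raise the threshold, as in the second call.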

To avoid the "black box" problem, I use SHAP analysis to explain why a customer received a specific score. As seen in the graph, the length of the red arrows indicates the contribution of each feature. Here, employment_status was the most significant factor driving the "Success" prediction. This transparency is crucial for corporate accountability.

SHAP analysis for a customer
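SHAP values are built on Shapley values from cooperative game theory. To show what the arrows in such a plot represent, the sketch below computes exact Shapley values for a tiny scoring function by enumerating every subset of features and replacing "absent" features with a baseline value. This is not how the SHAP library computes values for tree models, which relies on much faster model-specific algorithms. The scoring function, feature names, and numbers are invented for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x: dict[str, float], baseline: dict[str, float]) -> dict[str, float]:
    """Exact Shapley values of `predict` at `x`, with absent features set to `baseline`."""
    names = list(x)
    n = len(names)

    def value(subset: frozenset) -> float:
        # Features in the subset take the customer's value, the rest the baseline value.
        return predict({f: (x[f] if f in subset else baseline[f]) for f in names})

    phi = {}
    for f in names:
        others = [g for g in names if g != f]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for combo in combinations(others, size):
                s = frozenset(combo)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_score(row: dict[str, float]) -> float:
    """Hypothetical repayment score with an interaction term (not the trained model)."""
    return (0.3 * row["employed"] + 0.001 * row["credit_score"]
            + 0.1 * row["employed"] * row["low_debt"])


customer = {"employed": 1.0, "credit_score": 700.0, "low_debt": 1.0}
baseline = {"employed": 0.0, "credit_score": 600.0, "low_debt": 0.0}
phi = shapley_values(toy_score, customer, baseline)
for name, contribution in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {contribution:+.3f}")
```

The contributions add up to the difference between the customer's score and the baseline score, which is the property that lets each arrow be read as a share of the final prediction.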

 

We can also apply SHAP to the entire dataset. Again, employment_status emerges as the top contributor, this time across the entire customer base rather than for a single customer.

SHAP analysis for all customers
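A common way to get a dataset-wide ranking like this is to average the absolute SHAP value of each feature over all customers. The sketch below assumes the per-customer SHAP values are already available as one dictionary per customer. The three rows are hypothetical numbers, not values from this model.

```python
from statistics import mean


def global_importance(shap_rows: list[dict[str, float]]) -> list[tuple[str, float]]:
    """Rank features by mean absolute SHAP value across all rows, largest first."""
    names = shap_rows[0].keys()
    scores = {f: mean(abs(row[f]) for row in shap_rows) for f in names}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical per-customer SHAP values (one dict per customer).
shap_rows = [
    {"employment_status": 0.40, "credit_score": 0.10, "loan_amount": -0.05},
    {"employment_status": -0.60, "credit_score": 0.20, "loan_amount": 0.05},
    {"employment_status": 0.50, "credit_score": -0.30, "loan_amount": 0.02},
]
ranking = global_importance(shap_rows)
print(ranking[0][0])
```

Taking the absolute value matters: a feature that pushes some customers strongly up and others strongly down would otherwise average out to near zero.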

Furthermore, SHAP allows us to visualize the non-linear relationship between specific features and repayment probability. For example, with credit_score, the probability doesn't just rise linearly. The data shows that the probability remains flat until a score of 550, starts to rise at 600, and accelerates significantly after 700. This level of granular insight is what makes SHAP so valuable.

Feature-wise SHAP Analysis

 

By using Agent Skills, you can embed entire libraries of domain knowledge directly into your AI’s workflow. These skills are reusable, portable, and—in my opinion—will soon be a requirement for any business using AI agents.

I look forward to seeing how Agent Skills continue to permeate the corporate world and what innovations they will trigger. TOSHI STATS Co. will continue to lead the way in this space.

Stay tuned!

 

You can enjoy our video news ToshiStats-AI from this link, too!

1) Agent Skills
2) Predicting Loan Payback, Yao Yan, Walter Reade, Elizabeth Park. Kaggle, 2025
3) Introducing Claude Opus 4.6, Anthropic, Feb 5 2026

Copyright © 2026 Toshifumi Kuga. All rights reserved.
Notice: ToshiStats Co., Ltd. and I do not accept any responsibility or liability for loss or damage occasioned to any person or property through using materials, instructions, methods, algorithms or ideas contained herein, or acting or refraining from acting as a result of such use. ToshiStats Co., Ltd. and I expressly disclaim all implied warranties, including merchantability or fitness for any particular purpose. There will be no duty on ToshiStats Co., Ltd. and me to correct any errors or defects in the codes and the software.

From Zero to Production: How Opus 4.6 Agentic Coding Revolutionizes Insurance Analytics

In the ever-evolving landscape of InsurTech, cross-selling is a goldmine. Using Opus 4.6 and Agentic Coding, I built an implementation pipeline for an "Insurance Cross-Sell Prediction Model", covering everything from memory-optimized data loading to complex feature engineering. Let’s dive in!

 

1. Agentic Coding with Opus 4.6

Unlike traditional coding, Agentic Coding with Opus 4.6 (1) allows the AI to function as an autonomous engineer. It goes beyond writing snippets; it manages directory structures, ensures memory efficiency for datasets of 11.5 million rows, and completes a production-ready Streamlit dashboard.
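The agent's actual loading code is not shown in this post. One common way to keep a table of this size in memory is to store each column in the narrowest numeric type that can hold its values, which data-frame libraries usually call downcasting. The standard-library sketch below illustrates the effect on a single hypothetical integer column; the function name and the data are assumptions made for this example.

```python
from array import array


def smallest_int_array(values: list[int]) -> array:
    """Store integers in the narrowest signed array type that can hold them all."""
    lo, hi = min(values), max(values)
    for code in ("b", "h", "i", "q"):  # 1, 2, usually 4, and 8 bytes per item
        bits = array(code).itemsize * 8
        if -(2 ** (bits - 1)) <= lo and hi <= 2 ** (bits - 1) - 1:
            return array(code, values)
    raise OverflowError("values do not fit in a 64-bit signed integer")


# Hypothetical 'age' column; a 64-bit layout is used for comparison.
ages = [23, 41, 67, 35, 52] * 1000
wide = array("q", ages)
narrow = smallest_int_array(ages)
print(wide.itemsize * len(wide), "bytes as 64-bit integers")
print(narrow.itemsize * len(narrow), "bytes after downcasting")
```

For a column of ages the saving is a factor of eight, and the same reasoning applies column by column across millions of rows.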

In this process, my role was simply to write the "Product Requirements Document (PRD)", a document in natural language (Japanese or English) defining what I wanted to build. No Python knowledge was required on my part. When I put Claude Code into plan mode, it automatically generates an implementation blueprint, allowing me to verify the coding logic before Opus 4.6 executes it. While I monitored the progress, I never had to write a single line of code myself. Truly remarkable.

 

2. Project Overview

This project features a robust ecosystem designed for real-world application:

  • LightGBM + Optuna: Automated hyperparameter optimization to maximize AUC.

  • 50 Ratio-Based Features: Generation of 50 unique indicators to capture hidden customer behavior patterns.

  • Explainability via SHAP: Implementation of SHAP values to visualize why a specific customer is likely to purchase.

The data was sourced from a Kaggle competition regarding automobile insurance cross-selling (2).

Kaggle competition regarding automobile insurance cross-selling

Performance Results: When evaluating the model built via Opus 4.6 Agentic Coding on the Kaggle leaderboard, it achieved a high score of AUC = 0.88343. This level of accuracy is more than sufficient for practical business use.

Kaggle leaderboard

 

3. Key Features of the Implementation

The model provides two primary functions: individual customer prediction and total customer portfolio analysis.

Individual Prediction

We set the threshold for a "successful" cross-sell at a probability of 35% or higher. Below is an example of a customer predicted to be a successful cross-sell target. To avoid the "Black Box" problem, we use SHAP values to show the contribution of each feature. The larger the SHAP value, the higher its contribution to the positive prediction. This allows staff to understand the concrete reasoning behind the AI's decision.

customer predicted to be success

feature contribution

Conversely, for customers predicted to fail (probability below 35%), the SHAP values indicate which factors are pulling the probability down.

customer predicted to fail

feature contribution

Customer Portfolio Analysis

We can also analyze the "Cross-Sell Success Rate" across an entire customer portfolio. In this demo, we imported a CSV of 30,000 customers. With the threshold set at 35%, the model identified 3,708 potential targets. By adjusting the threshold, marketing teams can narrow or broaden their focus for specific campaigns. The dashboard also displays the overall probability distribution across the entire dataset.

probability distribution
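In the demo above, 3,708 of 30,000 customers (about 12.4%) met the 35% threshold. The effect of moving the threshold can be sketched as follows, using a small hypothetical batch of predicted probabilities in place of the real CSV. The function name is an assumption made for this example.

```python
def count_targets(probabilities: list[float], threshold: float = 0.35) -> int:
    """Number of customers whose predicted purchase probability meets the threshold."""
    return sum(p >= threshold for p in probabilities)


# Hypothetical batch of predicted probabilities (the demo used 30,000 customers).
probabilities = [0.05, 0.12, 0.36, 0.41, 0.08, 0.77, 0.34, 0.35, 0.02, 0.60]

for threshold in (0.25, 0.35, 0.50):
    n = count_targets(probabilities, threshold)
    share = n / len(probabilities)
    print(f"threshold {threshold:.2f}: {n} targets ({share:.0%} of portfolio)")
```

Lowering the threshold widens the campaign and raising it concentrates effort on the most likely buyers, which is the trade-off the dashboard exposes to marketing teams.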

 

4. Business Impact

This high-precision model provides sales representatives with a prioritized "Hot Lead" list. Thanks to the Streamlit-based GUI, non-technical staff can execute batch predictions and verify the reasoning via SHAP instantly. This is the definition of Data-Driven Marketing.

 

Conclusion

The synergy between Opus 4.6 and human expertise is redefining the speed of machine learning development and implementation. The potential is, quite frankly, staggering. At TOSHI STATS, we will continue to explore innovations in this field.

Stay tuned!

 

1) Introducing Claude Opus 4.6, Anthropic, Feb 5 2026
2) Binary Classification of Insurance Cross Selling, Walter Reade and Ashley Chow, Kaggle


AGI in 2 Years or 5 Years? — Survival Strategies for 2030

In January 2026, several interviews with CEOs of top AI labs were released. One particularly fascinating encounter was the face-to-face interview (1) between Anthropic CEO Dario Amodei and Google DeepMind CEO Demis Hassabis. I have summarized my thoughts on what their comments imply. I hope you find this insightful!

 

1. Will AGI Arrive Within 2 Years?

Dario seems to hold a more accelerated timeline for the realization of AGI. While prefacing his thoughts with "It is difficult to predict exactly when it will happen," he pointed to the reality within his own company: "There are already engineers at Anthropic who say they no longer write code themselves. In the next 6 to 12 months, AI might handle the majority of code development. I feel that loop is closing rapidly." He argued that AI development is hitting a flywheel effect, particularly noting that progress in coding and research is so remarkable that AI intelligence will surpass public expectations within a few short years.

A prime example is Claude Code, released by Anthropic last year. This revolutionary product is currently taking the software development world by storm. It is no exaggeration to say that the common refrain "I don’t code manually anymore" is a direct result of this tool. In fact, I recently used it to tackle a past Kaggle competition; I achieved an AUC of 0.79 with zero manual coding, which absolutely stunned me (3).

 

2. AGI is Still 5 Years Away

On the other hand, Demis maintains his characteristically cautious stance. He often remarks that there is a "50% chance of achieving AGI in five years." His reasoning is grounded in the current limitations of AI: "Today’s AI isn't yet consistently superior to humans across all fields. A model might show incredible performance in one area but make elementary mistakes in another. This inconsistency means we haven't reached AGI yet." He believes two or three more major breakthroughs are required, which explains his longer timeline compared to Dario.

Unlike Anthropic, which is heavily optimized for coding and language, Google is focusing on a broader spectrum. One such focus is World Models—simulations of the physical spaces we inhabit. In these models, physics like gravity are reproduced, allowing the AI to better understand the "real" world. Genie 3 (2) is their latest version in this category. While it has only been released in the US so far, I am eagerly anticipating its global rollout. The "breakthroughs" Demis mentions likely lie at the end of this developmental path.

 

3. Are We Prepared for AGI?

While their timelines differ, Dario and Demis agree on one fundamental point: AGI, which will surpass human capabilities in every field, is not far off. Nearly ten years ago, in March 2016, DeepMind’s AlphaGo defeated Lee Sedol, one of the world’s top Go professionals. Since then, the strongest Go programs have consistently outplayed human professionals. Soon, we may reach a point where humans can no longer outperform AI in any field. What we are seeing in the world of coding today is the precursor to that shift.

It is a world that is difficult to visualize. Industrial structures will be upended, and the very role of "human work" will change. It is hard to say that we are currently prepared for this reality. In 2026, we must begin a serious global dialogue on how to adapt. I look forward to engaging in these discussions with people around the world.

I highly recommend watching the full interview with Dario and Demis. These two individuals hold the keys to our collective future. That’s all for today. Stay tuned!

 

1) The Day After AGI | World Economic Forum Annual Meeting 2026, World Economic Forum, Jan 21, 2026
2) Genie 3, Google DeepMind, Jan 29, 2026
3) Is agentic coding viable for Kaggle competitions?, January 16, 2026




Is agentic coding viable for Kaggle competitions?

The "Agentic Coding" trend continues to accelerate as we enter 2026. In this post, I will challenge myself to see how high I can push accuracy by delegating the coding process to an AI agent, using data from the Kaggle competition Home Credit Default Risk (1). Let's get started right away.

 

1. Combining Claude Code and Opus 4.5

I will be using Opus 4.5, a generative AI renowned for its coding capabilities. Additionally, I will use Claude Code as my coding assistant, as shown below. While I enter instructions into the prompt box, I do not write any Python code myself.

You can see the words "plan mode" at the bottom of the screen. In this mode, Claude Code formulates an implementation plan based on my instructions. I simply review it, and if everything looks good, I authorize the execution.

Let's look at the actual instructions I issued. It is quite long for a "prompt," spanning about two A4 pages. The beginning of the implementation instructions is shown below. I wrote it in great detail. I'd like you to pay special attention to the final instruction regarding the creation of 50 new features using ratio calculations.

Part of the Product Requirements Document

Below is a portion of the implementation plan formulated by the AI agent. It details the method for creating new features via ratio calculations. Although I only specified the quantity of features, the plan shows that it selected features likely to be relevant to loan defaults before calculating the ratios.

The AI agent utilized its own domain knowledge to make these selections; they were certainly not chosen at random. This demonstrates the high-level judgment capabilities unique to AI agents.

New feature creation plan by the AI Agent

Part of the new features actually created by the AI Agent

 

2. Achieving an AUC of 0.79

By adopting LightGBM as the machine learning library, using the newly created features, and performing hyperparameter tuning, I was able to achieve an AUC of 0.79063, as shown below.

Reaching this level without writing a single line of Python code myself marks this experiment as a success. The data used to build the machine learning model consisted of seven different CSV files. These had to be merged correctly, and the AI agent handled this task seamlessly. Truly impressive!

Evaluation results on Kaggle
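The merging step deserves a closer look, because several of the files hold many rows per applicant. The agent's code is not shown here, so the following is only a sketch of one usual approach: collapse each one-to-many table to one row per applicant, then left-join it onto the main application table. The key column `SK_ID_CURR` and the column names follow the Home Credit data as I understand it and should be treated as assumptions; the two tiny tables are invented.

```python
import csv
import io
from collections import defaultdict

# Tiny in-memory stand-ins for two of the CSV files (values are invented).
APPLICATIONS_CSV = """SK_ID_CURR,AMT_CREDIT
100001,500000
100002,250000
100003,120000
"""
BUREAU_CSV = """SK_ID_CURR,AMT_CREDIT_SUM
100001,30000
100001,45000
100003,10000
"""


def aggregate_by_key(rows, key: str, column: str) -> dict[str, dict[str, float]]:
    """Collapse a one-to-many table to one record per key (count and sum of a column)."""
    agg = defaultdict(lambda: {"count": 0, "sum": 0.0})
    for row in rows:
        agg[row[key]]["count"] += 1
        agg[row[key]]["sum"] += float(row[column])
    return dict(agg)


def left_join(main_rows, agg: dict, key: str, prefix: str) -> list[dict]:
    """Attach aggregated columns to every main row; unmatched rows get zeros."""
    merged = []
    for row in main_rows:
        stats = agg.get(row[key], {"count": 0, "sum": 0.0})
        merged.append({**row, f"{prefix}_count": stats["count"], f"{prefix}_sum": stats["sum"]})
    return merged


applications = list(csv.DictReader(io.StringIO(APPLICATIONS_CSV)))
bureau = csv.DictReader(io.StringIO(BUREAU_CSV))
bureau_agg = aggregate_by_key(bureau, "SK_ID_CURR", "AMT_CREDIT_SUM")
table = left_join(applications, bureau_agg, "SK_ID_CURR", "bureau")
for row in table:
    print(row)
```

Aggregating before joining keeps the result at one row per applicant; joining the raw tables directly would duplicate applicants and distort the training data.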

 

3. Will AI Agents Handle Future Machine Learning Model Development?

While the computation time depends on the number of features created, it generally took between 1 and 4 hours. I ran the process several times, and the calculation never stopped due to syntax errors. The AI agent likely corrected any errors itself before proceeding to the next calculation step.

Therefore, once the initial implementation plan is approved, the results are generated without any further human intervention. This could be revolutionary. You simply input what you want to achieve via a PRD (Product Requirements Document), the AI agent creates an implementation plan, and once you approve it, you just wait for the results. The potential for multiplying productivity several times over is certainly there.

 

How was it? I was personally astonished by the high potential of the "Claude Code and Opus 4.5" combination. With a little ingenuity, it seems capable of even more.

This story is just beginning. Opus 4.5 will likely be upgraded to Opus 5 within the year. I am already looking forward to seeing what AI agents will be capable of then.

That’s all for today. Stay tuned!




1) Home Credit Default Risk, Kaggle


