
What Is an AI Agent? Why Nobody Can Define It in One Sentence — and the 10 Major Agents to Know in May 2026

An AI agent is "an AI that, given a goal, plans its own steps, calls tools, and works a task to completion." Nobody summarizes it cleanly because vendors disagree across autonomy, tool-use, and task scope. The practical lens is the 3-tier gradient: Reactive → Tool-use → Autonomous. This article maps the 2026 landscape, lists the 10 agents to know, gives a free 3-step start path, and covers the 5 risks plus a sales context.

中澤 圭志

@keishi_nakazawa

Sales Claw maintainer

13 min read
This English article is a concise version of the original. For the full Japanese deep-dive, see the Japanese original.

Key Facts

One-line definition

An AI that plans, calls tools, and completes multi-step work toward a goal

3-tier framework

Reactive (ChatGPT) / Tool-use (Claude, Copilot) / Autonomous (Claude Code, Codex)

Top 10 agents

Claude Code, Codex, Devin, Replit Agent, Cursor Composer, ChatGPT Agent, Microsoft Copilot Agent, Salesforce Agentforce, ChatGPT Atlas Agent mode, Perplexity Comet

Try free today

ChatGPT free + Web Search / Claude.ai + MCP / Microsoft Copilot free / Perplexity free

"I keep hearing 'AI agent' — what is it? How is it different from ChatGPT? Claude calls itself one, Copilot calls itself one — are they all the same?" This article unpacks the term "AI agent" for non-technical readers, working from primary definitions published by Anthropic, OpenAI, Google, and Microsoft. We cover why nobody can summarize it in a sentence and how to draw practical distinctions you can act on.

Primary sources for this article: Anthropic's "Building Effective Agents" (2024-12), the OpenAI Agents Platform Docs, Google's Gemini Agents Whitepaper, and Microsoft Copilot Studio Docs. For deep dives into individual products, see our Claude Code slash-commands guide and Codex Mobile explainer. For the toolbox standard, see the MCP complete guide, and for the browser angle the ChatGPT Atlas explainer.

1. What is an AI agent — "an AI that moves its own hands"

Mid-density whiteboard explainer titled 'What is an AI agent? In one line: an AI that moves its own hands.' Central metaphor: LLM brain with three hands holding a tool, a notebook, and an autonomous loop arrow. Left zone 'Regular AI chat': question / one reply / no hands / ChatGPT default. Right zone 'AI agent': goal / plan / tool call / observe / next step. Yellow sticky callout: 'Vendors disagree on the definition; understand it as a 3-tier gradient.'
Figure: What is an AI agent — contrast with regular AI chat (mid-density whiteboard)

The simplest analogy: regular ChatGPT is a "teacher who reads the textbook aloud." It answers your questions but doesn't walk to the library or take notes. An AI agent is more like a "research assistant": ask it to "prep next week's meeting deck" and it tries to fetch sources, summarize, open PowerPoint, and lay out slides.

Three pieces of machinery make this possible: tool use, memory, and an autonomous loop. Anthropic puts it this way: agents are systems where LLMs "dynamically direct their own processes and tool usage" (Building Effective Agents, 2024-12).

【Official statement】 The key phrase is "dynamically direct": the AI decides the next step, not the human at every turn. That's the line between "chat" and "agent."
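For readers who want to see the shape of that machinery, here is a minimal Python sketch of tool use, memory, and the loop. It is an illustration under stated assumptions, not any vendor's implementation: `TOOLS`, `fake_model`, and `run_agent` are hypothetical names, and the "model" is a hard-coded stand-in rather than a real LLM call.

```python
from typing import Callable

# Hypothetical tools: plain functions the agent is allowed to call.
TOOLS: dict[str, Callable[[str], str]] = {
    "search": lambda query: f"3 sources found for '{query}'",
    "summarize": lambda text: f"summary of ({text})",
}

def fake_model(goal: str, memory: list[str]) -> tuple[str, str]:
    """Stand-in for an LLM: picks the next step from the goal and memory."""
    if not memory:
        return ("search", goal)
    if len(memory) == 1:
        return ("summarize", memory[-1])
    return ("finish", memory[-1])

def run_agent(goal: str, max_steps: int = 5) -> str:
    memory: list[str] = []                 # observations carried between steps
    for _ in range(max_steps):             # the autonomous loop, with a hard cap
        action, argument = fake_model(goal, memory)
        if action == "finish":
            return argument
        observation = TOOLS[action](argument)   # tool use
        memory.append(observation)
    return "stopped: step limit reached"

print(run_agent("next week's meeting deck"))
```

The point to notice is who chooses the next step. `run_agent` never hard-codes "search, then summarize"; it asks the model each turn, which is the "dynamically direct" idea in miniature. The `max_steps` cap is a design choice in this sketch so the loop cannot run forever.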

Why did this take off in 2026?

【Author view】 AI agents had been researched for years, but only became practical in 2024-2025. Three breakthroughs lined up: (1) long context (Claude 3.5 Sonnet hit 200K tokens, Gemini 2 crossed 1M); (2) standardized tool-use APIs (OpenAI function calling → MCP); (3) better reasoning (Claude Opus 4 / GPT-5 families made multi-step plans reliable). Devin, Replit Agent, Claude Code, and Codex reached production in 2025; 2026 is the mass-adoption year.

Timeline of major AI-agent product arrivals 2023-2026. X-axis spans 2023 H1 to 2026 H1. Markers: 2023 H1 GPT-4 function calling, 2024 H1 ChatGPT Custom GPTs, 2024 H2 Anthropic MCP / Cursor Agent / Devin preview, 2025 H1 Claude Code 1.0 / Replit Agent / Codex CLI, 2025 H2 ChatGPT Agent / Microsoft Copilot Agent / Salesforce Agentforce, 2026 H1 Claude Code 2.x / Codex Mobile / ChatGPT Atlas Agent mode / Perplexity Comet. Color coding: coding blue, business automation green, browser orange. Annotation: 'H2 2024 is the inflection — MCP plus stable reasoning lined up.'
Figure: Timeline of major AI-agent products 2023-2026 (Python diagram)

2. Why nobody can summarize it — three reasons the definition splits

High-density whiteboard explainer titled 'Three reasons AI-agent definitions split across vendors.' Three numbered zones: (1) Autonomy interpretation differs (where each vendor draws the line for 'agent'), (2) Tool-use scope differs (one tool vs many), (3) Task scope differs (coding-only vs business automation vs support). Comparison footer: Anthropic separates workflow vs agent strictly; OpenAI bundles everything as Agents Platform; Microsoft labels anything from Copilot Studio a Copilot Agent. Yellow sticky: 'Definition drift is normal in new markets.'
Figure 1: Three reasons the AI-agent definition splits (high-density whiteboard)

Reason 1: autonomy is interpreted differently

Vendors disagree on how autonomous something must be before it counts. Anthropic is strict: workflow = human directs each step; agent = LLM directs its own steps. OpenAI groups ChatGPT, Codex, and Agents Platform together. Microsoft calls anything generated in Copilot Studio a "Copilot Agent."

Reason 2: tool-use scope differs

Does a single retrieval call qualify? Anthropic says "augmented LLM" — not an agent. Many SaaS vendors market "ChatGPT with search = agentic AI" anyway. That's the rift.

Reason 3: task scope differs

Claude Code, Codex, and Devin are coding-only. Microsoft Copilot Agent targets business automation. Salesforce Agentforce focuses on customer support. Comparing them as one category fuels confusion.

| Vendor | How they use "AI agent" | Character |
|---|---|---|
| Anthropic | Strict separation: workflow vs agent | Only LLM-directed counts as agent |
| OpenAI | Bundled under Agents Platform | ChatGPT Agent, Codex, Assistants API all in |
| Google | Gemini Agents Whitepaper | "Observe and act toward a goal" |
| Microsoft | Copilot Agent ≈ generative business tool | Anything built in Copilot Studio |
| Salesforce | Agentforce = customer-facing AI | Support and engagement focus |

【Author view】 Definition drift is normal in new markets — RAG (2024) and prompt engineering (2023) went through the same phase. Expect convergence by 2027.

3. Versus "AI chat" and "automation tools"

Versus AI chat (ChatGPT)

Plain ChatGPT replies once and stops — no hands. An AI agent has tool-call permission: ask it to "send Monday's deck to Slack" and it chains calendar lookup → drive search → summary → PowerPoint → Slack post on its own.

Versus automation (Zapier / RPA)

Zapier and UiPath need humans to spell out every step. They break on edge cases. AI agents get a goal and improvise the path — they can pivot when things go wrong. The flip side: for rigid, repetitive jobs, classic RPA is cheaper, faster, and more reliable. Agents fit exploratory or variable work.

| Item | AI chat (ChatGPT) | Automation (Zapier / RPA) |
|---|---|---|
| Input | Questions, instructions, dialogue | Predefined triggers (e.g., email received) |
| Output | Text, code, images (conversation only) | API calls, file ops, notifications |
| Flexibility | High (tries any question) | Low (rigid scripted flows) |
| Cost | Cheap (per-message) | Medium (monthly subscription) |
| How an agent differs | Agents add hands (tool use) | Agents plan their own steps |

4. The three tiers — Reactive / Tool-use / Autonomous

High-density whiteboard explainer titled 'Understand AI agents in three tiers — Reactive / Tool-use / Autonomous.' Three numbered stages: (1) Reactive — no tools, just chat (ChatGPT default), zero autonomy; (2) Tool-use — calls search/files/APIs, multi-step (Claude, ChatGPT Agent, Microsoft Copilot Agent), partial autonomy; (3) Autonomous — given a goal, plans and loops (Claude Code, Codex, Devin, Replit Agent), high autonomy. Right arrows show autonomy / tool permissions / risk all climbing together. Yellow sticky: 'Start with Tool-use; supervise more as autonomy rises.'
Figure 2: The AI-agent 3-tier framework (high-density whiteboard)

Tier 1: Reactive — plain ChatGPT

Baseline. You ask, the AI answers. No tools, no outside effect. ChatGPT, Claude.ai, Gemini Web all sit here. Technically not yet an "agent," but a useful zero point for comparisons.

Tier 2: Tool-use — half-autonomous

Add tools (search, files, APIs, browser, calculator) and you have an agent. Ask "plan my weekend around the forecast" and it calls a weather API, then composes a plan from the results. ChatGPT Agent (Plus/Pro/Business in May 2026), Claude with MCP, Microsoft Copilot Agent, and ChatGPT Atlas Agent mode all live here.

Tier 3: Autonomous — high-autonomy

The most "agentic" tier. Hand it a goal and it loops: plan → act → observe → fix → next. Humans approve at branching moments (delete, send, pay). Examples: Claude Code, Codex, Devin, Replit Agent, Cursor Composer, Aider. All are coding-focused, and all can chew on goals like "fix this bug and open a PR" for minutes to hours.

【Author view】 Autonomy is also where risk spikes. File deletes, force-pushes, and payments executed without supervision are hard to undo. Anthropic and OpenAI both recommend policy-gated autonomy (humans approve consequential steps); production agents almost universally land there. Vendors selling "high-autonomy, just trust us" deserve careful enterprise scrutiny.
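Policy-gated autonomy is easier to picture as code. The sketch below is a minimal illustration, assuming a hypothetical `CONSEQUENTIAL` action list and an `approve` callback that stands in for asking a human. It is not how Claude Code, Codex, or any other product implements approval.

```python
# Hypothetical list of actions that should never run without a human "yes".
CONSEQUENTIAL = {"delete_file", "force_push", "send_email", "make_payment"}

def execute(action: str, target: str, approve=lambda action, target: False) -> str:
    """Run an action, pausing for approval when it is consequential.

    `approve` is a callback that asks a human; in this sketch it denies by default.
    """
    if action in CONSEQUENTIAL and not approve(action, target):
        return f"blocked: {action} on {target} needs human approval"
    return f"done: {action} on {target}"

print(execute("read_file", "notes.txt"))      # harmless, runs immediately
print(execute("delete_file", "notes.txt"))    # consequential, blocked by default
print(execute("delete_file", "notes.txt", approve=lambda a, t: True))  # human said yes
```

Defaulting to "deny" is the design choice worth copying: if the approval step is missing or misconfigured, the agent stops instead of acting.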

5. The 10 major AI agents shipping in 2026

| Agent | Task domain | Autonomy tier | Plan example |
|---|---|---|---|
| Claude Code | Coding (CLI) | Autonomous | Claude Pro / Max |
| Codex | Coding (CLI / web / mobile) | Autonomous | All ChatGPT plans |
| Devin | Coding (web) | Autonomous | Cognition monthly |
| Replit Agent | Full-stack dev (web) | Autonomous | Replit Core/Teams |
| Cursor Composer | Coding (IDE) | Autonomous | Cursor Pro |
| ChatGPT Agent | Business research, task hand-off | Tool-use | ChatGPT Plus/Pro/Business |
| Microsoft Copilot Agent | Office / Teams automation | Tool-use | Microsoft 365 Copilot |
| Salesforce Agentforce | Customer support | Tool-use | Salesforce upper tiers |
| ChatGPT Atlas Agent mode | Browser automation | Tool-use | ChatGPT Plus/Pro/Business |
| Perplexity Comet | Browser + research | Tool-use | Perplexity Pro |

【Author view】 Three lanes: coding (autonomous), business automation (tool-use), browser (tool-use). Non-technical users get the smoothest first experience with business-automation or browser agents — start with ChatGPT Agent or Microsoft Copilot Agent. Coding agents stay in the engineer's lane; they multiply dev productivity rather than replacing roles.

Scatter map of the 10 AI agents. X-axis = task domain (coding / business automation / browser). Y-axis = autonomy tier (Reactive / Tool-use / Autonomous). Bubble size = approximate user-base scale. Claude Code, Codex, Devin, Replit Agent, Cursor cluster top-right (coding × Autonomous). ChatGPT Agent, Microsoft Copilot Agent, Salesforce Agentforce sit center (business × Tool-use). ChatGPT Atlas Agent mode, Perplexity Comet sit right (browser × Tool-use). Annotation: 'May 2026, drawn from vendor docs; bubble size approximate.'
Figure 3: 10 AI agents, domain × autonomy map (Python diagram)

6. How to try one today — three free steps

High-density whiteboard explainer titled 'Three steps for regular users to try an AI agent today.' Numbered zones: (1) Pick a free Tool-use agent (ChatGPT free + web search / Claude.ai with MCP / Microsoft Copilot free / Perplexity free); (2) Give it a goal-style task (e.g., 'plan my weekend around next week's weather' / 'compare three competitors and produce a table'); (3) Observe results and refine — note where the AI thought vs where it called a tool. Yellow sticky callout: 'Avoid jumping straight to Autonomous; never feed sensitive data.'
Figure 4: Three steps to try an AI agent today (high-density whiteboard)

Step 1: pick a free Tool-use agent

  • ChatGPT free + Web Search: built-in web tool with cited sources
  • Claude.ai free + MCP: add MCP servers to Claude Desktop for files, GitHub, etc.
  • Microsoft Copilot free tier: Bing search + image gen + light research
  • Perplexity free: search-focused, Pro Search runs multi-step lookups

Pick one and use it 5 minutes a day. ChatGPT free + Web Search is the lowest-friction starting point.

Step 2: give it a goal-style task

  • "Look up next weekend's Tokyo weather and propose three weekend plans that work rain or shine."
  • "Compare three competitors' sites and produce a price / feature / support table."
  • "Summarize the top five industry trend stories for next week's meeting."
  • "Search for this error message and give me three likely causes and fixes."

Step 3: study where tools got called

Agent answers leave tool-call traces — citations, search results, computed values. Watching what the AI thought versus what it offloaded teaches you its strengths and weak spots. Once Tool-use feels natural, you can graduate to Autonomous tools (Claude Code / Codex) without flying blind.

7. Risks and guardrails — when to trust an agent

Destructive actions

Tool use is the appeal — and the risk vector. File deletes, force-pushes, payments, and leaked API keys are all in scope once an agent has shell or API access. The mitigation is policy-gated human approval: Claude Code, Codex, and Microsoft Copilot Agent all ship per-command approval. Disabling or rubber-stamping it is how incidents happen.

Sensitive data

Most agents run in vendor clouds. Customer PII, payment data, sealed strategy docs, and credentials should never sit in prompts; they end up in logs, and possibly in training data if training on your data is enabled. Mitigations: vet vendor policy, use Enterprise plans with training disabled, and run sensitive work in local-execution OSS like Sales Claw.

Hallucinations

Agents still hallucinate in 2026 — misreading search results, inventing functions, fabricating citations. Important decisions need a human cross-check.

Audit logs

For business use, "who ran what, when, through which tool" must be loggable. Microsoft Copilot Studio, Salesforce Agentforce, OpenAI Agents Platform, and Anthropic Claude for Enterprise all ship audit logging. Sales Claw, as local-execution OSS, logs every send by design.
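To make "who ran what, when, through which tool" concrete, here is a minimal sketch of one audit-log entry written as a JSON line. The field names and the `audit_record` / `append_audit` helpers are assumptions for illustration; they do not mirror the log format of any product named above.

```python
import json
from datetime import datetime, timezone

def audit_record(user: str, tool: str, action: str, result: str) -> str:
    """Build one JSON line answering: who ran what, when, via which tool."""
    record = {
        "when": datetime.now(timezone.utc).isoformat(),  # UTC timestamp
        "who": user,
        "tool": tool,
        "what": action,
        "result": result,
    }
    return json.dumps(record, ensure_ascii=False)

def append_audit(path: str, line: str) -> None:
    # Opens in append mode, so earlier entries are left in place.
    with open(path, "a", encoding="utf-8") as f:
        f.write(line + "\n")

print(audit_record("alice", "browser", "submit contact form", "sent"))
```

One entry per action, appended rather than edited, keeps the log easy to search later and harder to quietly rewrite.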

Vendor lock-in

Most agents are cloud SaaS, exposing you directly to vendor policy changes, repricing, or shutdown. 2026 has already seen frequent plan changes, so long-range plans need slack.

Radar chart of five AI-agent risks (destructive action / data leak / hallucination / audit gap / vendor lock-in), each scored 1-5. Tool-use type (e.g., ChatGPT Agent) in blue: destructive 2 / leak 3 / hallucination 3 / audit 2 / vendor 4. Autonomous type with all-permissions on (e.g., Claude Code unrestricted) in red: destructive 5 / leak 3 / hallucination 3 / audit 4 / vendor 4. Local-execution type (e.g., Sales Claw) in green: destructive 1 / leak 1 / hallucination 2 / audit 1 / vendor 1. Annotation: 'May 2026 author view; shifts substantially with operational design.'
Figure 5: Five-risk radar (Python diagram, indicative)

8. Business use and the Sales Claw context

Research and summarization

The most mature use case in 2026. ChatGPT Agent, Claude, and Perplexity Pro can all crawl multiple pages and produce summaries fit for competitive research, prospect prep, or weekly industry digests.

Office automation

Microsoft Copilot Agent integrates cleanly into Office / Teams / Outlook estates. Salesforce Agentforce does the same on the Salesforce side. Both centralize access control and meet enterprise audit requirements.

Sales automation (Sales Claw)

The "sales agent" category includes Sales Claw, Apollo.io, Outreach AI, and Salesloft, but the design philosophies diverge sharply. Cloud SaaS (Apollo / Outreach / Salesloft) runs lead-extraction-to-send on the vendor's infrastructure: fast to onboard, but customer data lives with the vendor. Sales Claw is local-execution OSS specialized in delivering contact-form messages to prospects' sites.

Sales Claw runs policy-gated autonomy: pre-send automated checks, sales-NG detection, CAPTCHA-aware stop, send-rate limits, and full audit logs reduce the risk of misdelivery and policy violations. On the 3-tier framework, its execution loop is Autonomous but bounded by pre-send policy, not given a free hand.
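As a rough picture of what a pre-send gate can look like, the sketch below chains three of the checks named above: a no-solicitation phrase check, a CAPTCHA stop, and an hourly send limit. Everything in it is hypothetical, including the `PreSendGate` class, the phrase list, and the default limit. It is not Sales Claw's actual code or configuration.

```python
import time
from typing import Optional

# All names and thresholds here are hypothetical, for illustration only.
NO_SOLICITATION_PHRASES = ("no sales", "no solicitation", "営業お断り")

class PreSendGate:
    def __init__(self, max_sends_per_hour: int = 20):
        self.max_sends_per_hour = max_sends_per_hour
        self.sent_at: list[float] = []    # timestamps of sends in the last hour

    def check(self, page_text: str, has_captcha: bool,
              now: Optional[float] = None) -> str:
        """Return "send" or a reason to stop; record the send if allowed."""
        now = time.time() if now is None else now
        lowered = page_text.lower()
        if any(phrase in lowered for phrase in NO_SOLICITATION_PHRASES):
            return "stop: page declines sales contact"
        if has_captcha:
            return "stop: CAPTCHA present, hand off to a human"
        recent = [t for t in self.sent_at if now - t < 3600]
        if len(recent) >= self.max_sends_per_hour:
            return "stop: hourly send limit reached"
        self.sent_at = recent + [now]     # count this send against the limit
        return "send"

gate = PreSendGate(max_sends_per_hour=2)
print(gate.check("Contact us any time", has_captcha=False))
print(gate.check("No sales calls please", has_captcha=False))
```

Every branch returns a reason string, so the same value can go straight into an audit log. The order of checks is a judgment call in this sketch: policy refusals come first, so a page that declines sales contact is never counted against the rate limit.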

A multi-agent stack

Rather than betting everything on one vendor, real-world adoption blends agents by task profile.

| Item | Cloud agents (ChatGPT Agent / Copilot) | Local-execution (Sales Claw) |
|---|---|---|
| Fits | Research, summarization, office automation, coding | Sales-form sends, sensitive-data processing |
| Data location | Vendor cloud (training-off available on Enterprise) | Local PC / self-hosted only |
| Cost | USD 20-200 / user / month | OSS free + self-run cost |
| Time to value | Same day | 1-3 days setup |
| Vendor lock-in | High (vendor policy hits you directly) | Low (OSS continuity is portable) |

Pre-rollout checklist (7 items)

  1. Map purpose to task granularity × data sensitivity first; pick agents accordingly
  2. Vet vendor data policy with infosec (training off, retention windows)
  3. Mandate human approval for destructive, sending, and payment actions
  4. Require audit logging (Enterprise plan or local-execution)
  5. Define a hallucination cross-check process for important numbers and citations
  6. Diversify vendors — don't pile everything on one provider
  7. Review each agent's feature and pricing changes quarterly

This is an English overlay of the Japanese-language original article. The Japanese version is canonical; see the Japanese original.

In the AI-agent era, use ChatGPT Agent / Claude for research, Microsoft Copilot for office automation, and Sales Claw for compliance-sensitive contact-form sending. Sales Claw is local-execution OSS with pre-send checks, sales-NG detection, CAPTCHA-aware stop, send-rate limits, and audit logging — locking the foundation of AI sales automation to your policy, not someone else's.

Free and MIT-licensed. You can also try the live demo without installing anything.

Frequently Asked Questions

What is an AI agent?
An AI agent is an AI that, given a goal, plans, calls tools (search, file ops, APIs, browser, etc.), and runs a multi-step loop until the task is done. Unlike plain ChatGPT — which answers once and stops — an agent iterates 'goal → plan → tool call → observe → next step.' Anthropic defines an agent as a system where the LLM 'dynamically directs its own processes and tool usage' (Building Effective Agents, 2024-12). Major examples: Claude Code, Codex, Devin, Replit Agent, ChatGPT Agent, Microsoft Copilot Agent. 2026 is widely called 'the year of the AI agent.'
Why can't anyone summarize "AI agent" in one sentence?
Because definitions diverge across vendors on three axes. (1) Autonomy: Anthropic strictly separates 'workflow' (human-directed) from 'agent' (LLM-directed); OpenAI bundles everything as 'Agents Platform.' (2) Tool-use scope: some vendors count one search call as agentic; others require multiple tools. (3) Task domain: coding-only (Claude Code / Codex / Devin) vs business automation (Microsoft Copilot Agent) vs customer support (Salesforce Agentforce) are not the same thing. The most useful mental model is the 3-tier gradient: Reactive / Tool-use / Autonomous.
How is an AI agent different from regular ChatGPT?
The biggest gap is tool-call permission. Plain ChatGPT lives inside the conversation — it can't write files, hit APIs, or drive a browser. An AI agent has tool privileges and chains multi-step work on its own. Example: 'Plan my weekend around next week's weather' — plain ChatGPT says it has no real-time data; a Tool-use agent calls a weather API and writes a plan from the results. As of May 2026, OpenAI ships this as 'ChatGPT Agent' for Plus, Pro, and Business plans.
Can I use an AI agent for free?
Yes — multiple Tool-use agents have free tiers. (1) ChatGPT free with Web Search: built-in retrieval, cited sources, lowest barrier. (2) Claude.ai free with MCP connectors: add MCP servers to Claude Desktop for file ops, GitHub, etc. (3) Microsoft Copilot free: Bing search, image generation, light research. (4) Perplexity free: search-focused, Pro Search runs multi-step lookups. Autonomous-tier agents (Claude Code, Codex, Devin) are largely paid — Claude Pro (USD 20/mo), ChatGPT Plus (USD 20/mo), Cursor Pro (USD 20/mo) are reasonable entry points.
How do the three tiers (Reactive / Tool-use / Autonomous) differ?
They mark an autonomy gradient. (1) Reactive: no tools, just answers. ChatGPT default, Claude.ai, Gemini Web. Strictly not yet an 'agent.' (2) Tool-use: calls search / file ops / APIs / browser, runs multi-step work. ChatGPT Agent, Claude with MCP, Microsoft Copilot Agent, ChatGPT Atlas Agent mode. Half-autonomous; humans still drive the conversation. (3) Autonomous: hand it a goal and it loops plan → act → observe → fix. Humans approve at branching moments. Claude Code, Codex, Devin, Replit Agent, Cursor Composer. Convenience and risk both rise per tier — non-technical users should start at Tool-use.
Which 10 AI agents matter in 2026?
Three families. Coding (Autonomous): (1) Claude Code, (2) Codex, (3) Devin, (4) Replit Agent, (5) Cursor Composer. Business automation (Tool-use): (6) ChatGPT Agent, (7) Microsoft Copilot Agent, (8) Salesforce Agentforce. Browser (Tool-use): (9) ChatGPT Atlas Agent mode, (10) Perplexity Comet. Non-technical users get the smoothest start with business or browser agents — try ChatGPT Agent or Microsoft Copilot Agent first. Coding agents stay in engineering and multiply dev productivity rather than replacing roles.
What should I watch out for when using AI agents for business?
Five risk axes. (1) Destructive actions (delete, force-push, payment, send): keep humans in the loop. Claude Code, Codex, and Microsoft Copilot Agent all ship per-command approval. (2) Sensitive data: don't paste cards, PII, credentials, or sealed docs into prompts; use Enterprise plans with training disabled. (3) Hallucinations: AI still invents facts in 2026 — cross-check important numbers, citations, and function names. (4) Audit logs: require 'who / when / what / via which tool' logging. (5) Vendor lock-in: spread workloads across multiple providers. High-sensitivity, compliance-sensitive sends (like contact-form outreach) are best served by local-execution OSS like Sales Claw.

References

This article draws on official X accounts and official documentation as primary sources.


About the author

中澤 圭志

Sales Claw maintainer

Designs and develops Sales Claw. Writes from the field on B2B sales automation and applied AI.
