# Curated · X (Twitter) 🔥🔥🔥🔥🔥

> 📖 Full content index for this site (documentation index): [llms.txt](/llms.txt)

> Author: Claude (@claudeai) · Platform: X (Twitter) · Date: 2026-07-01

> Original source: https://x.com/claudeai/status/2072017450611142835

## Summary

Anthropic has released Claude Sonnet 5, with across-the-board upgrades to reasoning, tool use, and coding, and performance approaching that of Opus 4.8.

<video src="https://pub-75d4fe1e4e80421b9ecb1245a7ae0d1a.r2.dev/curated/1782867145855-sfoswg1f.mp4" poster="https://pub-75d4fe1e4e80421b9ecb1245a7ae0d1a.r2.dev/curated/05f5ae17eb941c59.jpg" controls playsinline preload="metadata" style="max-width:100%;height:auto;display:block;margin:1rem 0"></video>
> An animated demo in which botanical illustrations assemble into the numeral "5" and then reveal the "Sonnet 5" title.

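**Core upgrades and performance**
Claude Sonnet 5 is Anthropic's most agentic Sonnet model to date, with its main improvements in reasoning, tool use, coding, and knowledge work. Compared with its predecessor, Sonnet 4.6, Sonnet 5 comes close to Opus 4.8 in performance while being more competitively priced. The model can draw up its own plans, operate a browser and a terminal, and check its own output without being prompted to, which addresses the tendency of earlier Sonnet models to break off partway through long tasks.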

![](https://pub-75d4fe1e4e80421b9ecb1245a7ae0d1a.r2.dev/curated/bee0dde013dc5bb5.png)
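> Claude Sonnet 5 improves markedly on its predecessor, Sonnet 4.6, across reasoning, tool use, coding, and knowledge-work benchmarks, and its results approach those of Opus 4.8.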

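**Safety and safeguards**
In safety evaluations, Sonnet 5 behaves more robustly than Sonnet 4.6 in agentic contexts, and it is better at refusing malicious requests and resisting prompt injection attacks.
- Rates of hallucination and sycophancy are lower than in the previous generation.
- Anthropic did not deliberately train the model for cybersecurity tasks, and in evaluations of dangerous skills such as developing software exploits it performs markedly worse than Opus 4.8 and Mythos 5.
- In automated behavioral audits, however, Sonnet 5's rate of misbehavior is still slightly higher than that of Opus 4.8 and Claude Mythos Preview. As a precaution, Sonnet 5 ships with the same cybersecurity safeguards as Claude Opus 4.7 and 4.8 enabled by default, which detect and block dangerous operations in real time.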

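**Availability and pricing**
Claude Sonnet 5 is fully available starting today. It is now the default model on the Free and Pro plans and is also available to Max, Team, and Enterprise users. Developers can access it through the [Claude Platform](https://www.anthropic.com/news/claude-sonnet-5) using the API model name `claude-sonnet-5`.
- Promotional pricing (through August 31, 2026): $2 per million input tokens and $10 per million output tokens.
- Standard pricing (from September 1, 2026): $3 per million input tokens and $15 per million output tokens.
- Anthropic has raised rate limits for Chat, Cowork, Claude Code, and the Claude Platform to support heavier agentic workloads.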
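
To make the two price tiers concrete, here is a minimal sketch that estimates the cost of a workload before and after the promotional period ends. The rates and the cutoff date come from the list above; the function name `estimate_cost`, the constants, and the sample token counts are hypothetical, and the sketch assumes billing depends only on raw input and output token counts (no other discounts or surcharges are modeled).

```python
from datetime import date

# USD per million tokens, as listed above.
PROMO_RATES = {"input": 2.00, "output": 10.00}
STANDARD_RATES = {"input": 3.00, "output": 15.00}
PROMO_END = date(2026, 8, 31)  # last day of promotional pricing


def estimate_cost(input_tokens: int, output_tokens: int, on: date) -> float:
    """Estimate the USD cost of a request volume on a given date.

    Illustrative helper only; it is not part of any Anthropic SDK.
    """
    rates = PROMO_RATES if on <= PROMO_END else STANDARD_RATES
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


# Hypothetical workload: 1M input tokens and 200k output tokens.
promo = estimate_cost(1_000_000, 200_000, date(2026, 8, 1))
standard = estimate_cost(1_000_000, 200_000, date(2026, 9, 1))
print(f"promotional: ${promo:.2f}, standard: ${standard:.2f}")
```

Under these assumptions the same workload costs 50% more once standard pricing takes effect, since both rates rise by the same proportion.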

## Media

**An animated demo in which botanical illustrations assemble into the numeral "5" and then reveal the "Sonnet 5" title.**

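**Prompts and actions in the video**

Steps:

1. (00:00) Botanical illustration elements gradually grow and arrange themselves into the numeral "5"
2. (00:08) The scene transitions to display the text "Sonnet 5"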

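**Claude Sonnet 5 improves markedly on its predecessor, Sonnet 4.6, across reasoning, tool use, coding, and knowledge-work benchmarks, and its results approach those of Opus 4.8.**

**Data table**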

|   | Sonnet 5 | Sonnet 4.6 | Opus 4.8 |
| --- | --- | --- | --- |
| Agentic coding (SWE-bench Pro) | 63.2% | 58.1% | 69.2% |
| Agentic coding (Terminal-Bench 2.1) | 80.4% | 67.0% | 82.7% |
| Multidisciplinary reasoning (Humanity's Last Exam - no tools) | 43.2% | 34.6% | 49.8% |
| Multidisciplinary reasoning (Humanity's Last Exam - with tools) | 57.4% | 46.8% | 57.9% |
| Computer use (OSWorld-Verified) | 81.2% | 78.5% | 83.4% |
| Knowledge work (GDPval-AA v2) | 1618 | 1395 | 1615 |
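
As a quick check on how close Sonnet 5 comes to Opus 4.8, the sketch below computes each row's gain over Sonnet 4.6 and gap to Opus 4.8. The scores are copied from the table above; the names `SCORES` and `deltas` are hypothetical. Differences are in percentage points, except for GDPval-AA v2, which the table reports as a raw score rather than a percentage.

```python
# (Sonnet 5, Sonnet 4.6, Opus 4.8) scores copied from the table above.
SCORES = {
    "SWE-bench Pro": (63.2, 58.1, 69.2),
    "Terminal-Bench 2.1": (80.4, 67.0, 82.7),
    "Humanity's Last Exam (no tools)": (43.2, 34.6, 49.8),
    "Humanity's Last Exam (with tools)": (57.4, 46.8, 57.9),
    "OSWorld-Verified": (81.2, 78.5, 83.4),
    "GDPval-AA v2": (1618, 1395, 1615),
}


def deltas(row):
    """Return (gain over Sonnet 4.6, gap to Opus 4.8), rounded to one decimal."""
    sonnet_5, sonnet_46, opus_48 = row
    return round(sonnet_5 - sonnet_46, 1), round(sonnet_5 - opus_48, 1)


for name, row in SCORES.items():
    gain, gap = deltas(row)
    print(f"{name}: {gain:+} vs Sonnet 4.6, {gap:+} vs Opus 4.8")
```

By this arithmetic Sonnet 5 trails Opus 4.8 on five of the six rows, by between 0.5 and 6.6 points, and edges ahead by 3 on GDPval-AA v2.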

## Tags

Feature update, New product, Agent, LLM, Claude, Anthropic