# Curated · X (Twitter) 🔥🔥🔥🔥🔥

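> Author: ClaudeDevs (@ClaudeDevs) · Platform: X (Twitter) · Date: 2026-06-30

> Original source: https://x.com/ClaudeDevs/status/2071671418245492926

## Summary

Spotify built Honk, its automated code migration tool, on the Claude Code Agent SDK. An LLM judge in Honk's early iterations lifted the PR success rate from roughly 20-30% to about 80%, and 73% of Spotify's PRs are now AI-assisted.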

<video src="https://pub-75d4fe1e4e80421b9ecb1245a7ae0d1a.r2.dev/curated/1782787069304-2aijgau1.mp4" poster="https://pub-75d4fe1e4e80421b9ecb1245a7ae0d1a.r2.dev/curated/4f3960dc20d91b71.jpg" controls playsinline preload="metadata" style="max-width:100%;height:auto;display:block;margin:1rem 0"></video>
> How Spotify runs agents across 20M+ lines of code, with Niklas Gustavsson

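**AI-driven development workflow**
At Spotify, 73% of PRs are now attributed to AI-assisted development, and engineer Niklas Gustavsson's daily workflow has changed completely. He keeps several `tmux` sessions open in a terminal and assigns a dedicated agent to each `git worktree`, with the agents working in the background. Spotify's backend monorepo exceeds 20 million lines of code, yet Claude Code performs well there: it draws on existing code in the repository to propose solutions, freeing engineers from tedious code editing so they can focus on prototyping and product decisions.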
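
The tmux-plus-worktree layout described above can be sketched as a small command planner. This is an illustration under stated assumptions, not Gustavsson's actual setup: the function name `plan_agent_sessions`, the `agent-<task>` session naming, the sibling-directory worktree layout, and starting the agent by typing `claude` into the pane are all hypothetical choices. The function only builds `git` and `tmux` command lines so they can be reviewed before anything is executed.

```python
from pathlib import Path


def plan_agent_sessions(repo: Path, tasks: list[str]) -> list[list[str]]:
    """Build (but do not run) commands for one worktree and one tmux session per task.

    Hypothetical helper: the naming scheme and directory layout are assumptions.
    """
    commands: list[list[str]] = []
    for task in tasks:
        worktree = repo.parent / f"{repo.name}-{task}"
        session = f"agent-{task}"
        # A new branch in its own worktree, so agents never share a working directory.
        commands.append(
            ["git", "-C", str(repo), "worktree", "add", "-b", task, str(worktree)]
        )
        # A detached tmux session whose first pane starts in that worktree.
        commands.append(
            ["tmux", "new-session", "-d", "-s", session, "-c", str(worktree)]
        )
        # Type `claude` into the session's only pane (assumes the CLI is on PATH).
        commands.append(["tmux", "send-keys", "-t", session, "claude", "Enter"])
        # A second pane in the same worktree for manual checks such as `git diff`.
        commands.append(
            ["tmux", "split-window", "-h", "-t", session, "-c", str(worktree)]
        )
    return commands
```

Each returned command could be passed to `subprocess.run(cmd, check=True)` once reviewed. Task names should avoid `.` and `:`, which tmux reserves for its target syntax.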

<video src="https://pub-75d4fe1e4e80421b9ecb1245a7ae0d1a.r2.dev/curated/1782783500093-ura7cj5x.mp4" poster="https://pub-75d4fe1e4e80421b9ecb1245a7ae0d1a.r2.dev/curated/6c6d7ba358e2fd2c.jpg" controls playsinline preload="metadata" style="max-width:100%;height:auto;display:block;margin:1rem 0"></video>
> Two software engineering professionals discuss how AI tools are changing software development workflows and efficiency.

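**Evolution of the automation infrastructure "Honk"**
Spotify's codebase was growing far faster than its engineers could maintain it, so the company built an automation system called Honk. Its evolution tracks the maturing of AI technology:
- **Early stage**: Code migrations relied on deterministic scripts, but because code has an enormous API surface, each script had to handle thousands of edge cases and was very costly to maintain.
- **Introducing a judge**: Early iterations added an LLM as a "judge" in the pipeline, which raised the PR success rate from roughly 20-30% to about 80%.
- **Current architecture**: As models improved, the team removed the judge. Honk now runs agents built on the Claude Code Agent SDK in Kubernetes pods. Users can attach their own internal tools, and automated verification (including Linux and macOS builds) safeguards code quality.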
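
To make the edge-case problem with deterministic scripts concrete, here is a minimal sketch using Python's standard `ast` module. It is not Spotify's tooling; the module name `old_api`, the function `fetch`, and the helper `naive_call_sites` are invented for illustration. The finder matches only calls written literally as `old_api.fetch(...)`, so every other spelling of the same call slips through.

```python
import ast

# Hypothetical code under migration: four ways to reach the same function.
SOURCE = '''
import old_api
from old_api import fetch as grab

old_api.fetch("a")              # direct attribute call
grab("b")                       # imported under another name
f = old_api.fetch               # aliased to a variable
f("c")
getattr(old_api, "fetch")("d")  # dynamic lookup
'''


def naive_call_sites(source: str, module: str, func: str) -> list[int]:
    """Return line numbers of calls written literally as `module.func(...)`."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == func
            and isinstance(node.func.value, ast.Name)
            and node.func.value.id == module
        ):
            hits.append(node.lineno)
    return sorted(hits)
```

Running `naive_call_sites(SOURCE, "old_api", "fetch")` reports only the direct call. Catching the renamed import, the variable alias, and the `getattr` lookup would each need extra name and state tracking, which is how a migration script grows to handle ever more special cases.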

<video src="https://pub-75d4fe1e4e80421b9ecb1245a7ae0d1a.r2.dev/curated/1782783630257-8ws4xkvo.mp4" poster="https://pub-75d4fe1e4e80421b9ecb1245a7ae0d1a.r2.dev/curated/6709c31fd63f1f03.jpg" controls playsinline preload="metadata" style="max-width:100%;height:auto;display:block;margin:1rem 0"></video>
> Two engineering experts discuss the development history and architectural evolution of Honk, an automated code migration tool.

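**Verification loops and engineering culture**
Niklas Gustavsson stresses that when agents carry out autonomous, end-to-end tasks, the most important investment is the verification loop. By strengthening test automation, Spotify lets engineers supervise agents with confidence instead of doing repetitive work by hand. He argues that speed and quality are not a trade-off: codifying quality practices as skills, in forms such as `CLAUDE.md` or MCP, can markedly speed up development.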
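
The verification loop described above can be sketched as a retry loop that feeds check failures back to the agent. This is a generic illustration, not Honk's implementation: `run_with_verification`, `CheckResult`, and the callable signatures are hypothetical, and a real setup would replace the callables with an agent invocation and actual build or test runs.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CheckResult:
    name: str
    passed: bool
    log: str = ""


def run_with_verification(
    propose: Callable[[str], str],
    checks: list[Callable[[str], CheckResult]],
    task: str,
    max_attempts: int = 3,
) -> tuple[Optional[str], int]:
    """Ask for a change, run every check, and retry with the failure logs.

    Returns (accepted_change, attempts_used); the change is None if no
    attempt passed all checks within max_attempts.
    """
    feedback = ""
    for attempt in range(1, max_attempts + 1):
        candidate = propose(task + feedback)
        failures = [r for r in (check(candidate) for check in checks) if not r.passed]
        if not failures:
            return candidate, attempt
        # Summarize what failed so the next proposal can address it.
        feedback = "\n\nPrevious attempt failed:\n" + "\n".join(
            f"- {r.name}: {r.log}" for r in failures
        )
    return None, max_attempts
```

The design point is that the checks are ordinary automated tests and builds: the same verification that guards automated merges also tells the agent when it is done, so no human has to inspect each intermediate attempt.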

<video src="https://pub-75d4fe1e4e80421b9ecb1245a7ae0d1a.r2.dev/curated/1782783687195-0o20ysca.mp4" poster="https://pub-75d4fe1e4e80421b9ecb1245a7ae0d1a.r2.dev/curated/81d3aaf7289792e9.jpg" controls playsinline preload="metadata" style="max-width:100%;height:auto;display:block;margin:1rem 0"></video>
> Two professionals in an office discuss automated verification and agent technology in software development.

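**Impact on the future**
AI tools have not only raised engineers' productivity but also lowered the barrier to building software. Inside Spotify, even senior executives now use these tools to build end-to-end prototypes within hours and share them through an internal app store. Niklas Gustavsson advises other engineering leaders to keep investing in test automation and codebase standardization, because a highly consistent code environment lets agents work more accurately, which he sees as the key to efficient AI-driven development.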

<video src="https://pub-75d4fe1e4e80421b9ecb1245a7ae0d1a.r2.dev/curated/1782783551595-hf1kgt8f.mp4" poster="https://pub-75d4fe1e4e80421b9ecb1245a7ae0d1a.r2.dev/curated/7f53272742d3e35e.jpg" controls playsinline preload="metadata" style="max-width:100%;height:auto;display:block;margin:1rem 0"></video>
> Two speakers in an office discuss how to integrate Claude Code into software development workflows.

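## Media content

**Two software engineering professionals discuss how AI tools are changing software development workflows and efficiency.**

**Prompts and actions in the video**

Steps:

1. (00:00) The two speakers talk about the transformation of software engineering
2. (03:59) The Claude Code logo appears on screen

**Transcript**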

- `00:00` I know Spotify has talked about engineers, you know, like shipping PRs on the subway,
- `00:03` which is really cool. So, you know, obviously engineering is changing. What's your advice
- `00:07` to everyone that's in the middle of it and, you know, trying to figure it out?
- `00:10` I'm someone who has always truly enjoyed the problem-solving part of coding. This is
- `00:18` going to sound as nerdy as it is, but in my spare time, I will do, like, competitive
- `00:22` programming at times. And we were talking before about how this was changing,
- `00:26` completely changing the way we were working. And I was pretty worried about that from just
- `00:30` my personal point of view, like, am I going to miss that part, the hard mental challenge
- `00:35` of solving problems? And now I find myself having, you know, five agents working in the background
- `00:41` and my way of interacting with them is very different from the way that I was working
- `00:46` a year or two ago. The thing that I like to do is solving problems, and the way that I solve those
- `00:52` problems turns out not to be the most critical piece for me. I find myself both to be more
- `00:59` productive in that I can bring more value from the work that I can do. I can also solve problems
- `01:04` that I really couldn't solve before. I can jump into code bases that would have taken me days
- `01:09` or weeks to get into before and be contributing things that I just could not do before. So for me,
- `01:16` that's been amazing. That's going to look different for different people, but I think give it a shot
- `01:22` and find a way that you can use those tools in the way that you like. I feel like for me, I've
- `01:28` seen this big shift from implementation time, because now, you know, Claude Code does it in the background
- `01:33` while I do other stuff. Yeah. And instead what's filled up that time for me is thinking about what's
- `01:39` next, talking to customers, and also actually much more prototyping than I expected. And some of it is
- `01:45` for external products. Some of it is for internal automations. How is that shift? How's that
- `01:50` change looked for you? One of the things that Claude and similar tools have unlocked is to allow
- `01:55` anyone to take their idea, whatever that idea is, express that in natural language and have Claude
- `02:02` then go implement that. So today we have a very simple way of getting going to build an end-to-end
- `02:08` prototype in our mobile apps and our backend. And that's been a real unlock for folks to be able to
- `02:15` express ideas that used to take motivating a bunch of engineers to try to build that for you. And now
- `02:20` you can go in and, within an hour or two, you have a working prototype that you can start sharing
- `02:26` with people to show what that actual idea looks like in real life. Those types of things were unimaginable a
- `02:33` year ago. And now we're doing them every day. Yeah. I love that.

**Two speakers in an office discuss how to integrate Claude Code into software development workflows.**

**Transcript**

- `00:00` So what's your workflow like today?
- `00:01` How do you use Claude Code? How does Spotify use Claude Code?
- `00:05` Yes, I use it in a, I'm going to say, fairly vanilla way, I think.
- `00:09` I run it in a bunch of tmux sessions in a terminal.
- `00:15` I usually have a bunch of agents running in the background whenever I do some work.
- `00:20` How many terminal tabs?
- `00:22` So I will have anything in between five and ten tabs.
- `00:26` And then I use some panes, because I like to have a terminal where I can actually, like, `git diff` and whatnot.
- `00:32` So I have this set up with a matrix of Claude Code sessions and matching terminals in a set of worktrees that I work in.
- `00:41` The way that we're set up is that we have a few very large monorepos, which we're gradually moving towards.
- `00:49` But we still have thousands of small polyrepos that remain.
- `00:54` So most of my work happens in those monorepos.
- `00:58` So I usually have a few Claudes and terminals going on there at any given point in time.
- `01:03` And then when I need to dip into one of our polyrepos, I will open up a more temporary Claude session there.
- `01:09` Do you feel like monorepo or polyrepo is a better fit for Claude Code?
- `01:14` I was a bit worried, to be honest, about the monorepo setup and agents originally, because I think with some of the prior tools we've been using, we've been seeing issues with indexing and things like that.
- `01:26` And these are fairly large repositories. Our backend monorepo is more than 20 million lines of code.
- `01:33` But it turns out that Claude Code works amazingly well in those repositories.
- `01:37` And I think one of the things we found is how good Claude Code is at looking at other code in the repository to get, I guess, inspiration for the problem you're trying to solve.

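**Two engineering experts discuss the development history and architectural evolution of Honk, an automated code migration tool.**

**Prompts and actions in the video**

Steps:

1. (00:00) The two speakers talk at a conference table.
2. (00:22) The camera cuts to a close-up of Boris Cherny.
3. (00:34) The camera cuts to a close-up of Niklas Gustavsson.
4. (02:36) The camera switches to an overhead view of the two interacting at the conference table.
5. (05:23) The Claude Code logo appears at the end of the video.

**Transcript**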

- `00:00` One of the things we found pretty early was code has an enormous API surface, so trying to make
- `00:05` changes to code gets very complicated very quickly. So we pretty quickly ran into a ceiling of
- `00:13` how complex changes we can do. Even switching out a method in an API becomes pretty complicated
- `00:21` when you can call that in five different ways, so doing this with just traditional static
- `00:26` analysis, like AST transformation... Exactly, because, let's say there's an API, you just, like,
- `00:29` alias it to a variable or something. Now you need, kind of, variable and state tracking. That's
- `00:34` exactly right. That's messy. Yeah. So each script that we had to migrate code turned into thousands of
- `00:40` lines of taking care of every edge case in that code. So that inspired us, as I mentioned before,
- `00:46` pretty much as soon as the early LLMs came along, of like, hey, these things, can we apply them to this
- `00:53` problem? And early on it didn't work all that well, partially because the models weren't
- `01:01` good enough, partially because we were very naive in how we were trying to do it. We basically
- `01:06` just put the code in front of the model and tried to get it to one-shot that change, so that didn't work.
- `01:14` Over time models improved and our thinking about how to do this improved, so we started applying
- `01:20` LLMs as judge to make sure that the output was as intended. We started breaking down the problem
- `01:26` and decomposing the problem in various ways. So many, many, many iterations of this, and many internal
- `01:32` hacks to try to take on this problem in different ways. We started consolidating that, and that then
- `01:37` became what we now call Honk. It was a very different beast originally. It was not on top of Claude.
- `01:45` It was more a bunch of homegrown type of things in there, but it was the first sort of light in the
- `01:52` tunnel of like, yeah, this is actually a problem that we can solve. And then we've done many, many iterations
- `01:57` on Honk. Yeah. So today we released what we call v2, but I think in reality it's v8 or something
- `02:04` like that. We just didn't keep track of the iterations we did on it. And it started out as this,
- `02:09` like, automate these code changes, schedule that, and orchestrate over all our repositories. But pretty
- `02:14` quickly engineers figured out that, hey, this is useful for other things as well. I want to
- `02:20` mention this thing on Slack and have it do a task for me, or all of those types of things. So today
- `02:24` Honk has grown into being a much more ubiquitous tool for us. Tell me about the architecture of Honk,
- `02:30` like, what are the big pieces? So you talked about having, there's the agent that
- `02:37` codes, and this is just built on the Claude Code Agent SDK. Yes. And then you also have a
- `02:42` verification step, like an agentic verifier. Tell me more about that. So we used to have a judge in Honk, but
- `02:48` we actually have removed that, because we found that the agent and models, just again going back to
- `02:55` 4.5, got good enough that we didn't need the judge anymore. The judge was very important
- `03:00` in the first iterations of Honk. So it made us go from, if I remember the numbers correctly,
- `03:05` roughly 20-30% success rate on PRs to like 80% success rate. So it's a big, big change. But then
- `03:13` again, as we've talked about, the models caught up and the agent harness caught up, so we have now
- `03:20` eliminated that judge from Honk.

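**Two professionals in an office discuss automated verification and agent technology in software development.**

**Prompts and actions in the video**

Steps:

1. (00:00) The two speakers talk in an office.
2. (02:21) The Claude Code logo appears on screen.

**Transcript**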

- `00:00` I feel like verification is one of these things that we talk about a lot. Yeah. But I
- `00:04` think when you're doing this kind of closed-loop development, where it's an agent that's given a
- `00:09` task, and then it has to maybe fan out and break down the task, and it just needs to do a lot
- `00:15` of work without a human in the loop, I feel like one of the common mistakes I see is companies
- `00:19` underinvest in how well that verification works. I think that's very true, and I think it's true for
- `00:25` us as well. One of the major changes that we did in our engineering practices as part of that
- `00:30` was to strengthen our test automation. We have a very strong notion of software ownership within
- `00:38` Spotify. We have divided our code base into many thousands of components. Each of those components
- `00:44` has well-defined ownership, so it's owned by a particular team, and that team is fully responsible
- `00:49` for that. They probably designed it originally, they implemented it, and they operate it. That team
- `00:55` was in the loop for every change that got merged to their code base. That meant that in some cases
- `01:01` we could be a bit sloppy on our test automation, because that team could always check every PR
- `01:05` if they needed to. But with starting to automate PRs towards our source code, one of the things was
- `01:11` we needed to change the expectations for teams, like, we're going to be automerging most of these changes
- `01:16` without you ever seeing the PR. So that meant then having to build out much better test automation to
- `01:22` make sure that all our software could sort of survive those types of automated changes. Now, zooming into
- `01:30` where we are today, that's been very, very helpful for us, because now we can throw agents at that
- `01:35` and use the same verification that we had in place before.

**How Spotify runs agents across 20M+ lines of code, with Niklas Gustavsson**

**Transcript**

- `00:09` I actually remember talking to you back in, I think, September last year, and you said something
- `00:16` like, yeah, I think at the end of the year no one is going to be using an IDE. And in my head I
- `00:20` was thinking, like, that's crazy, that's never going to happen. Like, I could imagine that happening on
- `00:25` a two-year time frame, something like that, but two months seemed extreme. And then two months later
- `00:31` I found myself not using an IDE anymore, and the way that I was working had completely
- `00:37` changed. It's a change that I had not seen in the 30 years that I've been doing this type of work.
- `00:42` It's funny, internally it felt exactly the same way that it did externally. Okay. But, you know, we had a
- `00:47` head start of like a few weeks. Yeah. That was it. But it felt exactly the same way. So, okay, here I wanted
- `00:54` to start with, how did you get into coding? My formal background is actually in biology, so I'm a
- `01:00` molecular biologist by training, and in that area, when I was doing my PhD studies, we started having
- `01:07` what was then considered big data. So we had a lot of data from genome sequencing, so I felt that I needed
- `01:13` to improve my ability to do programming, essentially. So I switched over. What was intended to be a sabbatical
- `01:19` year ended up being, I guess, now close to 30 years of being in this industry. So fast forward to
- `01:28` today, with all the change right now with agents and LLMs, I feel like your personal usage and
- `01:34` Spotify's usage is on the frontier of what I see in the industry. What was your first feel-the-
- `01:42` AGI moment? Personally, I think I've had a few, depending a little bit on the problem that we
- `01:48` were trying to solve. We started pretty early, as LLMs came about, to try to use them to automate code
- `01:54` changes, and that was a real struggle to begin with. But after a while, as we started figuring out, like,
- `02:01` how we can use LLMs and judges and whatnot, we started getting some pretty inspiring results from that.
- `02:08` And this was like a few years ago. Yeah. It was pre-Claude, it was like early GPT days,
- `02:15` something like that. And again, the results we got then weren't like, we can fix all our problems,
- `02:20` but it was giving an insight of where this is heading in the future. So that was certainly one.
- `02:25` I have to say, for my own personal coding, the real breakthrough moment was probably Opus 4.5,
- `02:33` back in November, December. It went from being this, like, smart autocomplete to something that I can
- `02:39` actually throw real problems at, and I didn't have to do all that much prompt engineering. The biggest
- `02:45` thing for me was also just not having to edit code anymore, because my workflow up to then was, I have
- `02:51` the model write, you know, maybe 80 percent of the code, or 70 percent of the code, depending on the
- `02:56` model, and then I always had to go into the IDE to do the last-mile edits. Yeah. And I just stopped
- `03:01` having to do that. Right. And that was crazy. Yeah. But yeah, I think that's a big part
- `03:06` of the reason that it felt like such a leap. So what's your workflow like today? Like, how
- `03:10` do you use Claude Code? How does Spotify use Claude Code? Yes, I use it in a, I'm gonna say, fairly
- `03:17` vanilla way, I think. I run it in a bunch of tmux sessions in a terminal. I usually have a bunch
- `03:25` of agents running in the background whenever I do some work. How many terminal tabs? So I will
- `03:31` have anything in between five and ten tabs, and then I use some panes, because I like to have a terminal
- `03:38` where I can actually, like, `git diff` and whatnot. So I have this set up with a matrix of
- `03:43` Claude Code sessions and matching terminals in a set of worktrees that I work in. The way
- `03:50` that we're set up is that we have a few very large monorepos, which we're gradually moving towards,
- `03:57` but we still have thousands of small polyrepos that remain. So most of my work happens
- `04:05` in those monorepos, so I usually have a few Claudes and terminals going on there at any given point
- `04:11` in time. And then when I need to dip into one of our polyrepos, I will open up a more temporary
- `04:17` Claude session there. Do you feel like monorepo or polyrepo is a better fit for Claude Code?
- `04:23` I was a bit worried, to be honest, about the monorepo setup and agents originally, because
- `04:28` I think with some of the prior tools we've been using, we've been seeing issues with indexing and
- `04:33` things like that. And these are fairly large repositories. Our backend monorepo is
- `04:39` more than 20 million lines of code. But it turns out Claude works amazingly well in those repositories.
- `04:46` And I think one of the things we found is how good Claude is at looking at other code in the
- `04:55` repository to get, I guess, inspiration for the problem you're trying to solve. I wanted to ask about
- `05:00` some of the infra that you built. So, you know, at Spotify, obviously you built Honk. Yep. I feel like
- `05:07` from the earliest days of experimenting with models to building Honk and building background agents on, you
- `05:14` know, the Agent SDK, yep, you see the future before other people do. What is it about the
- `05:21` culture or the people working on it that kind of leads to this? And just tell me that story and how
- `05:26` has it been going. Five, six years ago now, we identified that our code base was growing much, much
- `05:33` faster than the number of engineers we had to support it, like seven times faster. So that meant that over
- `05:40` time we just had more and more code that we needed to maintain. And Spotify is a company that has
- `05:46` an endless source of ideas of things we want to ship to our users, so being bogged down by our maintenance
- `05:52` was not a good place to be in. So we started trying to automate as much of that maintenance as
- `05:58` possible. A lot of that was pretty dull work, like migrating to the latest Java version, or a library update, or
- `06:05` whatever. A lot of it was moving from some API to some other API across all our code. So we built
- `06:13` out this infrastructure that we call fleet management, which is all about, like, imagine before that,
- `06:19` when we were doing a migration, we would send out the migration description or, like, a tutorial to all
- `06:25` our teams and ask them to do that migration manually for all of their components. And instead of doing that,
- `06:31` we imagined, like, can we find ways where we can do mutations towards our entire code base instead,
- `06:37` living in thousands of repositories? Because every team was kind of doing the same thing. Yeah.
- `06:42` Yeah, hundreds of teams doing the same operation manually over thousands of components. So each of
- `06:48` these migrations took months and months and months to complete. We could maybe do 10 of them a year, so
- `06:53` we were barely keeping up with being on the supported version of the frameworks that we're on.
- `07:00` So again, we started automating this. We built out all of this infrastructure to do this. We've merged
- `07:05` millions and millions of those types of PRs, but they all relied on these, like, deterministic scripts
- `07:11` that you would apply, and that would make those code changes or configuration changes. And one of the
- `07:16` things we found pretty early was code has an enormous API surface, so trying to make changes to code
- `07:22` gets very complicated very quickly. So we pretty quickly ran into a ceiling of
- `07:29` how complex changes we can do. Even switching out a method in an API becomes pretty complicated
- `07:36` where you can call that in five different ways, so doing this with just traditional, like,
- `07:41` static analysis, like AST transformation... Exactly, because, let's say there's an API, you just, like,
- `07:45` alias it to a variable or something. Now you need, kind of, variable and state tracking. That's
- `07:49` exactly right. That's messy. Yeah. So each script that we had to migrate code turned into thousands of
- `07:55` lines of taking care of every edge case in that code. So that inspired us, as I mentioned before, pretty
- `08:02` much as soon as the early LLMs came along, of like, hey, these things, can we apply them to this problem?
- `08:10` 跟我說說 Honk 的架構，有哪些核心組件？你提到有一個負責寫程式的 Agent，這是基於 Claude Code Agent SDK 構建的，對吧？（and early on it didn't work at all all that well uh partially because the models weren't）
- `08:16` 是的，然後你還有一個驗證步驟，像是一個 Agentic 驗證器，再多跟我說說這個。（good enough partially because we just we were very naive in how we were trying to do it we were）
- `08:21` 我們過去在 Honk 中確實有一個驗證步驟，像是一個 Agentic 驗證器，一個評審。（basically just put the code in front of the model and try to get it one shot that that change so that）
- `08:29` 但我們後來真的移除了它嗎？因為我們發現……（didn't work over time models improved and are thinking about how to do this improved so we started）
- `08:35` 應用大型語言模型來判斷以確保輸出符合預期，我們開始拆解問題，（applying llmsets judge to make sure that the output was as intended we started breaking down the problem）
- `08:41` 用各種方式分解問題，經過了非常多次的迭代，還有許多內部的（decomposing the problem in various ways so many many many iterations of this uh and many internal）
- `08:47` hacks 來嘗試以不同方式解決這個問題，我們開始整合這些成果，這後來（hacks to try to take on this problem in different ways uh we started consolidating that and that then）
- `08:53` 成為我們現在所稱的 honk，它最初是一個截然不同的東西，並不是建立在 Claude 之上，（became what we now call honk um it was a very different beast originally it was not on top of claude）
- `09:00` 嗯，它更多是我們內部開發的一堆東西，但它是第一道曙光，（um it was more a bunch of homegrown type of things in there but it was the first sort of light in the）
- `09:08` 讓我們覺得，這確實是一個我們可以解決的問題，之後我們對 honk 進行了非常多次的（tunnel of like yeah this is actually a problem that we can solve and then we've done many many）
- `09:12` iterations on Honk. Yeah, so today we released what we call v2, but I think in reality it's v8 or
- `09:19` something like that; we just didn't keep track of the iterations we did on it. And it started out
- `09:24` as this: automate these code changes, schedule that, and orchestrate over all our repositories. But
- `09:29` pretty quickly engineers figured out that, hey, this is useful for other things as well: I want to
- `09:35` mention this thing on Slack and have it do a task for me, or all of those types of things. So today
- `09:40` Honk has grown into being a much more ubiquitous tool for us. Tell me about the architecture of Honk.
- `09:46` What are the big pieces? So you talked about there being the agent that
- `09:52` codes, and this is just built on the Claude Code Agent SDK? Yes. And then you also have
- `09:58` a verification step, like an agentic verifier. Tell me more about that. So we used to have
- `10:02` a judge in Honk, but we have actually removed that, because we found that the
- `10:08` agent and models, again going back to four or five, just got good enough that we didn't need
- `10:13` the judge anymore. The judge was very important in the first iterations of Honk, so it made us go
- `10:19` from, if I remember the numbers correctly, roughly a 20 to 30 percent success rate on PRs to about
- `10:25` an 80 percent success rate. So it's a big, big change. But then again, as we talked about, the models caught up
- `10:32` and the agent harness caught up, so we have now eliminated that judge from Honk. So Honk
- `10:39` architecturally is fairly simple: it's the Agent SDK running in a Kubernetes pod. It has access to
- `10:48` a set of tools. It used to be, prior to v2, that those tools were a predefined, allow-listed set of
- `10:58` tools that we trusted to give to that agent. Now in v2, users can add their own tools in addition to
- `11:06` those tools, so now the agent can use any of our internal tools. And one of the most important tools
- `11:12` that it has access to is that it can run verification, basically run CI builds, and it can run those
- `11:20` both on Linux and macOS. macOS is particularly important to us because any iOS development, for
- `11:28` example, needs macOS builds. And is this just building, or are you doing a full "open up
- `11:34` the iOS simulator, have the model start the app" kind of thing? How deep does it go? It can do those
- `11:40` types of tests. We definitely have cases where we integrate the simulator and Claude to automate
- `11:46` things like going directly from designs in Figma to UI implementations, and we've been using
- `11:55` that for porting, for example, our TV apps from our iOS apps. I feel like verification is one of
- `12:03` these things that we talk about a lot. Yeah. But I think when you're doing this kind of closed-loop
- `12:08` development, where it's an agent that's given a task and then it has to maybe
- `12:13` fan out and break down the task, and it just needs to do a lot of work without a human in the loop, yes,
- `12:18` it's just the single most important thing. Yeah, and I feel like one of the common mistakes I see is
- `12:24` companies underinvest in how well that verification loop works. I think that's very true, and I think it's
- `12:29` true for us as well. One of the major changes that we made in our engineering practices as part of
- `12:35` that was to strengthen our test automation. We have divided our codebase into many thousands of
- `12:42` components. Each of those components has well-defined ownership, so it's owned by a particular team,
- `12:48` and that team is fully responsible for it: they probably designed it originally, they implemented it,
- `12:53` and they operate it. And part of that, prior to the investments we made in fleet management, was
- `12:59` that that team was in the loop for every change that got merged to their codebase, and that
- `13:06` meant that in some cases we could be a bit sloppy on test automation, because that
- `13:11` team could always check every PR if they needed to. But with starting to automate PRs towards our source
- `13:18` code, one of the things was we needed to change the expectations for teams: you might no longer be
- `13:23` in the loop for these changes; we're going to be auto-merging most of these changes without you
- `13:29` ever seeing the PR. So that meant having to build out much better test automation to make sure that
- `13:34` all our software could survive those types of automated changes. Now, zooming into where we are
- `13:43` today, that's been very, very helpful for us, because now we can throw agents at that and use the same
- `13:50` verification that we had in place before. There's one of these trade-offs that people talk about all
- `13:54` the time in engineering, of reliability and quality on one side and speed on the other side.
- `13:59` Yep. And to me it feels like a false dichotomy, because if you want to go faster, the thing that you
- `14:05` need to do is automate your quality practices so that they're better encoded. It's not in
- `14:10` someone's head; it's actually in a skill, or in a CLAUDE.md, or in some set of MCPs. It's
- `14:16` something that Claude Code can do, and that's ultimately what lets you go faster. And this is just another
- `14:20` example of how engineering productivity is always about investing in infrastructure. It's not
- `14:26` about working more hours; it's about making the infrastructure better and better. And that sounds
- `14:30` like what you're talking about. We're seeing that we're keeping our quality metrics neutral while
- `14:35` significantly improving our speed. But that does not come for free; we've needed to make
- `14:43` these investments into test automation. Now, as we talked about, I think we're going to have
- `14:50` to continue our investments into our reliability practices as well. Some of those are changing as
- `14:55` part of this transition as well. And I guess as you try to go faster and faster and
- `15:00` faster, you have to invest even more in reliability. Yes, that's exactly right. So we make
- `15:06` something like four and a half thousand production deployments every day, so there's a lot of opportunity
- `15:13` for things to go wrong. So yeah, we need to have good practices around making sure that everything that
- `15:19` ships into production has the quality that we want. What's the idea with doing this many deployments?
- `15:23` Is it that in the past it was just continuous deployment, and now maybe it's faster signal for the
- `15:28` agent? Or how are you thinking about it? This is something we've always been optimizing for, for as long as
- `15:33` Spotify has existed. We want a developer to be able to have
- `15:41` an idea and ship that into production as quickly as possible. That used to be weeks or months,
- `15:47` going back a few years, and we've just continued to try to optimize that, and now it's,
- `15:55` you know, an hour or something like that. As I mentioned before, we have lots of ideas we want to validate
- `15:59` and explore, and the faster we can get feedback on them, and in some cases that might be
- `16:05` feedback from our internal users, in some cases feedback from our external users, but in both of
- `16:13` those cases, the faster we can iterate, we've found that we both build better products and are able
- `16:20` to ship them faster to our users. Not every idea ships in an hour; many ideas take lots of
- `16:26` exploration before we're able to ship them. But the notion of being able to get that quick
- `16:33` validation is super important to us. And yeah, agents are certainly part of that loop as well. So for
- `16:38` Spotify, the engineering org is fairly big; it's thousands of engineers, right? Yeah, it's 2,900
- `16:44` engineers, something like that. As you do all this stuff, how do you think
- `16:50` about ROI, like measurements, just making sure you're moving in the right direction?
- `16:54` In terms of measuring ROI, it's been easy, and we've seen very clear signals in that
- `17:02` space. We're seeing a 75%-plus improvement in PR frequency, for example, that we can directly
- `17:09` attribute to AI tooling, and I think by now 73-ish percent of PRs are directly attributed to being AI-authored.
- `17:17` So those types of metrics we're doing pretty well on. But then, of course, we want to
- `17:24` connect that to user value and revenue. And how do you measure something like that? Is it
- `17:30` A/B tests, or some kind of holdout, or case studies? How are you thinking about it? Yeah, so we
- `17:37` want to be able to connect the deliverables that engineers produce, so PRs and
- `17:43` deployments, into what we call work items, so basically the planned work that we have.
- `17:48` And then that connects to A/B tests and rollouts, and then we're able, from that, to
- `17:55` attribute back and say this PR contributed to this DoD that we have, and that contributed
- `18:03` to this user value. That's the idea, and we're trying to build those connections right now. Yeah, I feel
- `18:09` like back in the day, you know, we've worked in developer productivity for a while; when you
- `18:14` have a big team, you want to make them more productive. Yep. And I feel like back in the day
- `18:18` a big win was a few percent, single percentage points. Exactly, exactly. Yeah, if you're
- `18:23` lucky enough to be able to measure that. Yes. And with improvements nowadays it's just so obvious
- `18:28` to everyone, yet, you know, as engineers we still want to measure it. Yeah, I'm going to say the ROI
- `18:35` discussion initially was fairly easy because we could see such large improvements. But as the
- `18:46` maturity is getting there and the costs have been improving, I think the precision around those ROI
- `18:53` estimates, the expectations on the precision, are going up as well. So that's why we're trying to improve
- `18:57` how we can do that type of measurement. Part of it is about the improvement in productivity,
- `19:02` and then part of it is how much it costs to get that improvement. That's exactly right. And now,
- `19:07` you know, people are seeing these many dozens or hundreds of percentage points of
- `19:10` improvement, and now you really want to attribute it, to figure out: how many tokens did it take,
- `19:16` how many hours did it take, yep, what was the productive output? Yeah, that's exactly right. I want
- `19:21` to end on maybe one question: what advice would you give your peers? What advice would you
- `19:27` give to other CTOs and engineering leaders, like VPs of engineering at other companies?
- `19:33` What we've found is that these investments in foundational capabilities, we talked about test
- `19:39` automation and verification, I'm going to say the same is true for another aspect that we've seen, which
- `19:45` is standardization. So we've been driving more consistent codebases, more alignment on the tools
- `19:53` that we use, the frameworks that we use.
- `19:56` And we've seen that this was originally an investment we made to simplify
- `20:03` things for humans and make humans more productive, but we've seen the same thing transition really
- `20:07` well to agents as well. I mentioned this before, on Claude being able to find inspiration
- `20:14` from other pieces of code in our monorepos: if they look 10 different ways, Claude is going
- `20:20` to be more confused. So we've been seeing that the more consistency we have, the better
- `20:26` our agents work. So I think if there's one piece of advice I would give, it would be to not ignore those types
- `20:34` of investments. The same engineering practices that we had
- `20:40` before still apply in this new world. It might look different, there's a new actor in your
- `20:46` codebase, but the fundamentals seem to apply equally well; at least that's been the case in
- `20:51` our environment. What's your advice for engineers that maybe have been doing engineering
- `20:57` work for a while? I know Spotify has talked about engineers shipping PRs on the subway,
- `21:02` which is really cool. So obviously engineering is changing. What's your advice to
- `21:07` everyone that's in the middle of it and trying to figure it out? Yeah, let me talk about
- `21:11` this from a more personal angle. I'm someone who has always truly enjoyed the problem-solving
- `21:20` part of coding. This is going to sound as nerdy as it is, but in my spare time I will do
- `21:25` competitive programming at times, because it's just a fun mental exercise. In the back of my head
- `21:31` I was always a bit worried, like we were talking about before, about how this was completely
- `21:36` changing the way we were working, and I was pretty worried about that from just my personal point of
- `21:41` view: am I going to miss that part, the hard mental challenge of solving problems? And now I
- `21:47` find myself having five agents working in the background, and my way of interacting with them
- `21:54` is very different from the way that I was working a year or two ago. And for me, it
- `21:59` turned out that I was wrong. The thing that I like to do is solving problems, and the
- `22:07` way that I solve those problems turned out not to be the most critical piece for me. This is always
- `22:13` going to be personal; different people are going to have to make that transition in different
- `22:17` ways. But I think: focus on the types of problems that you're able to solve. I find myself both
- `22:26` more productive, in that I can bring more value from the work that I do, and I can also solve problems
- `22:33` that I really couldn't solve before. I can jump into codebases that would have taken me days or
- `22:38` weeks to get into before, and be contributing things that I just could not do before. So for me that's
- `22:45` been amazing. And again, it's going to look different for different people, but I think give it a shot and
- `22:52` find the way that you can use those tools in the way that you like. I feel like for me, I've seen
- `22:58` this big shift away from implementation time, because now Claude Code does it in the background
- `23:04` while I do other stuff. Yeah. And instead, what's filled up that time for me is thinking about what's
- `23:09` next, talking to customers, and also actually much more prototyping than I expected. Some of
- `23:15` it is for external products, some of it is for internal automations. How has that shift, how has that
- `23:21` change worked for you? I think it's been similar for me. And yeah, we didn't talk about this, but one
- `23:26` thing that we're making a big investment in is prototyping in particular. And this is targeted
- `23:35` both towards engineers and also the non-engineering cohort. One of the things that
- `23:41` Claude and similar tools have unlocked is to allow anyone to take their idea, whatever that idea is,
- `23:47` express it in natural language, and have Claude then go implement it. So as folks started
- `23:56` figuring this out, including, again, non-engineers, they started trying to do this in our real
- `24:02` apps, and they're pretty complex beasts of code, but they were starting to see, again, signs of
- `24:10` light that they could do it. So we started a few months ago; we basically built out the
- `24:15` infrastructure to make that simple. So today we have a very simple way of getting going to build
- `24:21` an end-to-end prototype in our mobile apps and our backend, and we have an internal app store for
- `24:27` those prototypes, where you can share them and take a look at, or try out, someone else's
- `24:32` prototype in your app. And that's been a real unlock for folks, again
- `24:39` including engineers, that maybe weren't super familiar with how to build something in our mobile apps,
- `24:44` to be able to express ideas that used to take, you know, motivating a bunch of engineers to try to
- `24:50` build that for you. And now you can go in and, within an hour or two, you have a working
- `24:56` prototype that you can start sharing with people to show what that actual idea looks like in real life,
- `25:02` with users' real data and so on. So yeah, those types of things were unimaginable a year ago, and now
- `25:12` we're doing them every day. Yeah, I love that. Have you seen a shift in who's producing this?
- `25:19` Is it engineers doing it? Is it mostly coming from designers and product managers? How
- `25:24` has that changed? It's everyone, up to one of our co-CEOs, who has prototypes in that app store at the
- `25:32` moment. So it's actually been... Is it good? Yeah, yeah. A bunch of our senior execs
- `25:41` have built prototypes that are good. Again, ideas that they always had in the back of
- `25:45` their head; they have an entire engineering team that could build it out, but that team is focused on
- `25:50` other things. So for them to then be able to try something out more quickly than they could before, and,
- `25:56` you know, get a touch and feel for what this thing is going to look like... Yeah, it allows you to test out an idea
- `26:01` in a day instead of weeks or months. Niklas, thank you so much.
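
The Honk architecture described around `10:32` to `11:20` (an agent with an allow-listed tool set, user-added tools in v2, and CI builds on Linux and macOS as verification) can be sketched roughly as follows. This is only an illustration under stated assumptions: `ToolRegistry`, `verify_change`, and `should_open_pr` are hypothetical names invented here, and the verifiers are stand-in callables rather than real CI calls. Nothing below is Honk's or the Claude Code Agent SDK's actual API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# All names here are hypothetical illustrations, not Honk's or the Agent SDK's API.
Tool = Callable[..., object]
Verifier = Callable[[str], bool]  # candidate diff in, True if the build passes


@dataclass
class ToolRegistry:
    """Trusted allow-listed tools plus tools that users register themselves."""
    allow_listed: Dict[str, Tool] = field(default_factory=dict)
    user_added: Dict[str, Tool] = field(default_factory=dict)

    def add_user_tool(self, name: str, tool: Tool) -> None:
        # Refuse to shadow a trusted tool with a user-supplied one.
        if name in self.allow_listed:
            raise ValueError(f"{name!r} is already an allow-listed tool")
        self.user_added[name] = tool

    def available(self) -> List[str]:
        # Sorted names of every tool the agent may call.
        return sorted({**self.allow_listed, **self.user_added})


def verify_change(diff: str, verifiers: Dict[str, Verifier]) -> Dict[str, bool]:
    """Run every platform verifier (for example Linux and macOS builds) on a diff."""
    return {platform: check(diff) for platform, check in verifiers.items()}


def should_open_pr(results: Dict[str, bool]) -> bool:
    """Only propose the change when at least one build ran and all of them passed."""
    return bool(results) and all(results.values())
```

The design choice worth noting is that the gate is deterministic: with the LLM judge removed, the sketch trusts build and test results alone, which matches the interview's point that verification quality decides how much can be auto-merged.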

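The ROI attribution chain discussed from `17:24` to `18:03` (PRs and deployments link to work items, which link to A/B tests and rollouts) could be modeled along these lines. It is a minimal sketch, assuming a flat mapping from work item to experiment; the `PullRequest` record, its field names, and the `"unattributed"` bucket are hypothetical and not taken from Spotify's tooling.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class PullRequest:
    """Hypothetical PR record; field names are illustrative assumptions."""
    pr_id: str
    work_item: Optional[str]  # planned work this PR belongs to, if linked
    ai_assisted: bool


def ai_assisted_share(prs: List[PullRequest]) -> float:
    """Fraction of PRs flagged as AI-assisted (0.0 for an empty list)."""
    if not prs:
        return 0.0
    return sum(pr.ai_assisted for pr in prs) / len(prs)


def attribute_prs(
    prs: List[PullRequest], work_item_to_experiment: Dict[str, str]
) -> Dict[str, List[str]]:
    """Group PR ids by the experiment their work item feeds into."""
    grouped: Dict[str, List[str]] = {}
    for pr in prs:
        experiment = "unattributed"
        if pr.work_item is not None:
            experiment = work_item_to_experiment.get(pr.work_item, "unattributed")
        grouped.setdefault(experiment, []).append(pr.pr_id)
    return grouped
```

PRs that cannot be traced to an experiment stay visible in their own bucket, so gaps in the linkage show up instead of silently inflating or deflating the attribution.
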
## Tags

Claude Code, CLI, Automation, Deployment, Anthropic, Spotify
