科技

GPT-4.1淘汰了4.5：主打一個性價比，但仍不如DeepSeek R1

04月15日 11:07 新浪網 tech-auto-hilite

4.1與4.5孰大？OpenAI剛剛給出答案：

發佈GPT-4.1，比GPT-4.5強的那種。

新模型系列更新，一共帶來三個版本：GPT-4.1，GPT-4.1 mini、GPT-4.1 nano。

與通常中杯大杯超大杯的設置不同，這回翻譯過來，是中杯、小杯、超小杯。

OpenAI表示，4.1系列是API專供，不過列位非開發者先別急哈，人家也補充了，在ChatGPT里，4.1的能力將主要通過「融入最新版本的GPT-4o」體現。

能力方面，總結起來4.1系列紙面上最突出的優勢有兩點：

長上下文，3個型號均擁有100萬token上下文窗口；

性價比，用內部老哥的說法就是：

現在你可以用4%的價格，暢享GPT-4o模型品質。

OpenAI還表示，GPT-4.1系列會在API里取代GPT-4.5 Preview，後者將於今年（2025年）7月14日下架。

GPT-4.1：主打性價比

展開來看，OpenAI整體上是把GPT-4.1和GPT-4o拿來對比的。

以延遲為橫軸，以智能為縱軸，可以看到，GPT-4.1比GPT-4o強了一丟丟，而4.1 mini則超出了4o mini一大截。

定量比較的結果是，編碼方面，GPT-4.1在衡量真實世界軟件工程技能的SWE-bench Verified上得分為54.6%，比GPT-4o的分數提高了21.4%，比GPT-4.5強了26.6%。

指令遵循方面，在MultiChallenge基準中，GPT-4.1得分38.3%，而GPT-4o的得分是27.8%。

長上下文方面，在多模態長下文理解基準Video-MME上，GPT-4.1刷新SOTA，在長篇無字幕類別中得分72.0%，比GPT-4o高了6.7%。

值得注意的是，GPT-4.1 mini在多項基準測試中超過了GPT-4o。

比如在智能評估基準MMLU上，GPT-4.1 mini的得分為87.5%，超過了GPT-4o的85.7%，同時延遲降低一半，成本降低83%。

GPT-4.1 nano則被定位為OpenAI「目前速度最快、成本最低」的模型。並且在部分測試中有超出GPT-4o mini的表現。

編碼能力

OpenAI著重強調了GPT-4.1的編碼能力。除了在各種編程任務上都超過GPT-4o，OpenAI還演示了其在前端編程方面的實際優勢：

能夠創建功能更強大、更美觀的Web應用。

人類評分的結果顯示，在80%的對比測試中，GPT-4.1的網站都比GPT-4o的網站更受歡迎。

比如給出同一段提示詞：

Prompt: Make a flashcard web application. The user should be able to create flashcards， search through their existing flashcards， review flashcards， and see statistics on flashcards reviewed. Preload ten cards containing a Hindi word or phrase and its English translation. Review interface: In the review interface， clicking or pressing Space should flip the card with a smooth 3-D animation to reveal the translation. Pressing the arrow keys should navigate through cards. Search interface: The search bar should dynamically provide a list of results as the user types in a query. Statistics interface: The stats page should show a graph of the number of cards the user has reviewed， and the percentage they have gotten correct. Create cards interface: The create cards page should allow the user to specify the front and back of a flashcard and add to the user’s collection. Each of these interfaces should be accessible in the sidebar. Generate a single page React app (put all styles inline).

GPT-4o生成的網站長這樣：