Showing top 139 results for "GPT-5" · filtered from 165 indexed

Tracked topic

GPT-5

219 articles indexed · Last updated 2d ago

Videos

Paper page - EnterpriseClawBench: Benchmarking Agents from Real Workplace Sessions

…On EnterpriseClawBench, the best configuration reaches only 0.663 (Codex with GPT-5.5). These results show that enterprise agent evaluation must report harness-model combinations, artifact delivery, visual quality, cost, runtime…

Jun 23, 2026

How an AI would fare as a start-up CEO

…Opus 4.8 built a cohort-based cash forecast, while GPT-5.5 analyzed negotiation histories to infer customer preferences. The variance between runs of the same model is also large. GPT-5.5, for example, fluctuated…

Jun 29, 2026 · Carolin Riethmüller

Gemini 3.5 Flash lands on Google's Android coding rankings, but it's 3x the cost for slower performance

…Gemini 3.5 Flash ranks 6th in the Android Bench list under models like GPT 5.5 and Gemini 3.1 Pro Preview, which was tested in February. Gemini 3.5 Flash…

Jun 12, 2026 · Andrew Romero

Claude Fable 5 is making a dramatic return with 'extraordinarily strong' safeguards

…Anthropic claims its own testing yielded the same results with less capable models across developers, such as Opus 4.8 and GPT-5.5. Further, every model Anthropic tested produced the same…

Jul 1, 2026 · Andrew Romero

Discussions and forums

Hacker News · u/e2e4 · 1w ago

GPT-5.6 leaks in Codex PR

Google's Android coding tests reveal an unexpected Gemini 3.5 Flash weakness

…Topping the list was OpenAI’s GPT 5.5, which scored 74, followed by GPT 5.4 and an older Google model, Gemini 3.1 Pro Preview, both with 72.4. The…

Jun 15, 2026 · Jay Bonggolto

Paper page - FineVerify: Scaling Test-Time Compute with Fine-Grained Self-Verification for Agentic Search

…With only four sampled trajectories, it improves GPT-5-mini by 8.2 accuracy points and Gemini-3-flash by 5.6% on average. With 12 samples, FineVerify enables GPT-5-mini…

Jun 2, 2026

Access to Mythos and Cyber: EU negotiates with Anthropic and OpenAI - Hardwareluxx

…Commission spokesperson Thomas Regnier told Reuters in this regard that OpenAI had actively approached the EU and wants to enable European institutions, the member states, cybersecurity authorities, and partner companies to use GPT-5.5-Cyber…

May 12, 2026 · Martin Gerke

Top stories

GPT-5.6: OpenAI promises more performance with lower token consumption

OpenAI limits GPT-5.6 rollout after government request, says restrictions shouldn’t be the norm | TechCrunch

OpenAI will delay GPT-5.6 after Trump administration request

OpenAI's free GPT-5.5 model makes ChatGPT better at understanding context - Engadget

Discussions and forums

GPT-5.6 Preview System Card

GPT-5.6 System Card [pdf]

GPT-5.6 family added to codex

OpenAI just released new personal finance features for ChatGPT customers - 9to5Mac

I tested a local LLM against a frontier cloud model, and the gap was smaller than I expected

Prepare to expect less from your cheap AI subscription