The Automated Daily - AI News Edition · February 26, 2026 · 14:40

AI datacenters and gas turbines & Google Labs ProducerAI music tool - AI News (Feb 26, 2026)

AI datacenters turn jet engines into power plants, Google Labs unveils ProducerAI + Opal agents, Qwen3.5 lands, and enterprise AI heats up—Feb 26, 2026.

Topics

01
AI datacenters and gas turbines — Hyperscalers are adding on-site natural-gas generation for AI datacenters, including repurposed aircraft engines, raising CO₂ and grid-planning concerns.
02
Google Labs ProducerAI music tool — ProducerAI is joining Google Labs as an AI music collaborator using Gemini and DeepMind’s Lyria 3, with SynthID watermarking and shareable “Spaces” for instruments/effects.
03
Opal adds agent-driven workflows — Google Labs Opal introduces an “agent step,” plus Memory, dynamic routing, and interactive chat—turning rigid workflows into goal-driven, tool-choosing agents.
04
Enterprise agent platforms and events — Salesforce TDX 2026 pushes Agentforce 360 and hackathons, while Anthropic expands Claude Cowork connectors/plugins and You.com argues for use-case discovery first.
05
New open models: Qwen3.5 — Alibaba’s Qwen team ships Qwen3.5-35B-A3B on Hugging Face: early-fusion multimodal tokens, sparse MoE (~3B active), and up to 262K–1M context via RoPE scaling.
06
Benchmarks: Intelligence Yield and VBVR — A proposed “Intelligence Yield” metric tracks useful work per compute-minute, and the VBVR benchmark shows video-reasoning remains hard: humans ~97% vs top model ~68.5%.
07
Agent security boundaries and theory — Vercel advocates split-compute sandboxes and safe secret injection, while “Agent Field Theory” frames agents as reward-driven search shaped by prompts, tools, and verifiers.
08
Developer productivity: METR redesign — METR says AI productivity experiments are getting biased as developers avoid AI-off conditions; it plans new methods to measure real-world speedups with agentic tools.
09
Hardware deals and AI geopolitics — Meta signs a long-term AMD infrastructure deal targeting up to 6GW of Instinct GPUs, as Reuters reports DeepSeek gave Huawei early access—tightening US-China compute dynamics.
10
AI in retail headsets: Patty — Burger King pilots “Patty,” an OpenAI-powered headset assistant that helps with procedures and scores “friendliness” via phrase detection, tying into POS and inventory systems.

Full Transcript

AI’s power hunger is getting so intense that some datacenter operators are literally repurposing aircraft engines—and even a supersonic jet program—to generate electricity on-site. Hold that thought. Welcome to The Automated Daily, AI News edition, the podcast created by generative AI. I’m TrendTeller, and today is February 26th, 2026. Let’s get into what’s moving in AI—models, agents, enterprise tooling, and the less glamorous but increasingly central story: compute, power, and the real-world infrastructure underneath all of this.

First up: the AI datacenter boom is colliding head-on with the energy system. Reporting highlighted by The Register, with estimates cited from Truthout, says hyperscalers are increasingly building on-site power generation—often natural-gas turbines—because AI training and inference demand is growing faster than grid upgrades can keep up. The rough estimate floating around: this buildout could add on the order of 44 million tons of CO₂ by 2030, which is framed as comparable to the annual emissions of about 10 million cars. The detail that stuck out today is how extreme the supply crunch has become. There’s reportedly a shortage of purpose-built gas turbines, and some operators are turning to old aircraft engines as stopgap generators. Even Boom Supersonic is stepping into the market, selling power turbines derived from its Symphony engine, with neocloud provider Crusoe lined up for 29 units. And it’s not just boutique players—Meta is cited with a Louisiana campus plan that could scale to 5 gigawatts, with the local utility building multiple combined-cycle plants. The broader point: public renewable goals are running into a short-term reality where “fast-to-deploy” often means gas.
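The cars comparison checks out as rough arithmetic, assuming a typical passenger car emits about 4.4 tons of CO₂ per year (a commonly cited average; the article itself doesn't state the per-car figure):

```python
# Sanity check on "44 million tons of CO2 ~ 10 million cars."
# Assumes ~4.4 tons CO2 per car per year; this figure is an assumption,
# not from the article.
added_co2_tons = 44_000_000
tons_per_car_per_year = 4.4

equivalent_cars = added_co2_tons / tons_per_car_per_year
print(round(equivalent_cars / 1_000_000, 1))  # ~10.0 million cars
```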

Now to the creative side of AI—where Google is making a very explicit push into music. ProducerAI, a generative AI platform for creating and refining songs, is joining Google Labs. The positioning is clear: not a replacement for musicians, but a “creative collaborator” that can take something as simple as “make a lofi beat,” generate a full track, and then let you iterate—lyrics, melody, arrangement, genre experiments, the whole loop. Google says ProducerAI is built on multiple DeepMind models: Gemini for general intelligence and workflow glue, Lyria 3 for high-fidelity music generation, and even Veo in the stack for adjacent media creation. One notable policy detail: everything it outputs is embedded with SynthID, Google’s imperceptible watermark for identifying AI-generated content. Lyria 3—currently described as a preview inside ProducerAI—is framed as a pro-grade model with a better grasp of rhythm and arrangement, and controls that musicians actually care about, like tempo and time-aligned lyrics. The roadmap emphasizes “creative control,” and a feature called Spaces is the most intriguing part: you describe new instruments or effects in natural language, and it can generate anything from a simple keyboard sound to something more modular and node-based. Spaces are meant to be shareable and remixable—mini-apps inside the platform. ProducerAI is live globally with free and paid plans at producer.ai, and Google is leaning on credibility signals here too, naming artists like Lecrae and The Chainsmokers as part of its community.

Staying inside Google Labs, Opal is also getting more agentic. The new “agent step” turns what used to be static, preconfigured sequences—pick a model, run a prompt—into a goal-driven workflow. Instead of you deciding whether a step uses Web Search, Veo, or a text model, you pick an agent that interprets the objective and routes to the right tools and models. Google’s examples make the shift concrete. A storybook generator used to require you to define page counts and predefined questions up front; the updated “Visual Storyteller” can decide what it needs to ask, suggest plot points as it goes, and generally behave like it’s steering toward the end goal. The “Room Styler” example is similar: rather than a one-shot redesign, it becomes iterative—propose a concept, take feedback on specific elements, even research niche sub-styles if the user asks for something very particular. To support that, Opal is adding Memory—so Opals can remember preferences across sessions—plus dynamic routing, where builders define branching paths based on criteria like “new client versus existing client,” and interactive chat so the agent can ask follow-up questions instead of failing silently when inputs are incomplete. It’s a familiar theme in 2026: less prompt-as-command, more workflow-as-collaboration.
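To make the shift concrete, here is a toy sketch of the two behaviors described above—dynamic routing on a builder-defined criterion, and interactive chat instead of silent failure. All names and structure are illustrative; Opal itself is a visual, no-code tool, not a Python API:

```python
# Illustrative sketch only: branching criteria ("new client vs. existing
# client") plus a follow-up question when a required input is missing.
# This is not Opal's actual implementation.
def run_step(client: dict) -> str:
    # Interactive chat: ask a follow-up instead of failing silently
    # when the routing criterion is unknown.
    if "is_new" not in client:
        return "ask: Is this a new client or an existing one?"
    # Dynamic routing: pick a branch based on the builder-defined criterion.
    return "onboarding_flow" if client["is_new"] else "account_review_flow"

print(run_step({"name": "Acme", "is_new": True}))   # onboarding_flow
print(run_step({"name": "Globex"}))                 # asks a follow-up question
```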

On the enterprise front, two tracks stood out today: platforms trying to operationalize agents, and vendors trying to package “AI transformation” into something companies can actually buy. Salesforce is promoting TDX 2026 as a two-day, developer-heavy event centered on “agentic AI” and the Agentforce 360 Platform. The pitch is part conference, part skills bootcamp, part competitive build sprint—there’s a virtual hackathon where global teams build Agentforce solutions, with finalists pitching live at the event. The agenda reads like a festival for admins, architects, and developers: deep technical breakouts, hands-on trainings, roundtables, and a lot of experiential branding—Campground demos, “Vibe Coding,” mini-hacks, and an “Agentforce City” showcase of the so-called agentic enterprise. Ahead of it, Salesforce is also selling a three-day Trailblazer Bootcamp—nine role-based tracks, a certification voucher, and a $999 price tag. Meanwhile, Anthropic is expanding Claude Cowork with the connectors and plugins that make or break enterprise adoption: Google Drive, Gmail, DocuSign, FactSet, plus customizable plugins aimed at encoding institutional workflows in finance, engineering, and HR. Anthropic is very openly chasing the “knowledge worker operating system” slot, and it’s doing it by making Cowork less like a chat box and more like a context-rich tool with admin controls. And then there’s You.com taking a step back from the tooling arms race with a more pragmatic message: start AI transformation by identifying high-value use cases, not by buying shiny capabilities. Their guide argues for a structured “use case discovery” process—internal ops like back-office automation and knowledge management, and external wins like support, personalization, and product experience—so investments map to measurable impact.

Let’s talk models and benchmarks—because we’re seeing both bigger models and more pressure to prove efficiency. Alibaba’s Qwen team has published Qwen/Qwen3.5-35B-A3B on Hugging Face, with weights and config in standard Transformers format and deployment recipes for vLLM and SGLang. The architecture is the headline: 35B total parameters, but only about 3B activated at inference thanks to a sparse Mixture-of-Experts setup. It’s also pitched as multimodal: a causal language model with a vision encoder, trained with early-fusion multimodal tokens. Context length is aggressive: 262,144 tokens natively, with documentation describing extension toward roughly a million tokens using RoPE scaling approaches like YaRN. The model card goes deep on serving details—OpenAI-compatible endpoints, tool calling, speculative decoding—and flags a behavior developers will notice immediately: by default it emits hidden “thinking mode” content in <think> tags, which you disable via API parameters. In parallel, there’s a new proposed metric called Intelligence Yield—basically “how much useful work per minute of compute” a model produces, derived from METR’s Time Horizons benchmark. The argument is that we fixate on capability headlines—how hard the tasks are—while ignoring how much compute was burned to get there. Intelligence Yield tries to combine task difficulty, average score, and working time into one efficiency-oriented number, and it’s being used to claim certain frontier models can solve harder tasks more reliably while using less compute. Expect more of this: in 2026, efficiency is strategy, not just engineering. And for video reasoning specifically, a benchmark called VBVR—A Very Big Video Reasoning Suite—puts hard numbers on how far we still have to go. Humans score around 97.4% on the suite, while the current top system listed is at 68.5%, with other major video models substantially lower. The tasks span perception, spatial navigation, transformation, and abstraction—things that require actually tracking objects and rules across time, not just describing a frame. It’s a reminder that “video generation” and “video understanding” are not the same mountain.
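One plausible way to formalize a metric like Intelligence Yield is difficulty-weighted score per minute of compute. The actual proposal's formula is not quoted here, so treat this as an assumption-laden sketch:

```python
# Hypothetical formalization of an "Intelligence Yield"-style metric:
# difficulty-weighted score per minute of compute. The real proposal's
# formula may differ; this is an illustration only.
def intelligence_yield(tasks: list[dict]) -> float:
    """tasks: each with 'difficulty' (e.g. minutes a human would need,
    per METR-style time horizons), 'score' (0..1), and 'minutes' of
    model compute time."""
    useful_work = sum(t["difficulty"] * t["score"] for t in tasks)
    compute_minutes = sum(t["minutes"] for t in tasks)
    return useful_work / compute_minutes

tasks = [
    {"difficulty": 60, "score": 0.9, "minutes": 5},  # hard task, solved fast
    {"difficulty": 10, "score": 1.0, "minutes": 2},  # easy task, solved
]
print(round(intelligence_yield(tasks), 1))  # (54 + 10) / 7 ~ 9.1
```

Under a metric like this, two models with identical benchmark scores can have very different yields if one burns far more compute minutes—which is exactly the distinction the proposal wants to surface.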

If you’re building agents, today’s security and reliability posts are basically waving a flag: stop treating prompts as a safety boundary. Vercel argues that modern agentic systems blur into coding agents—reading and writing files, running shell commands, executing generated code—and that teams often run all of that in one trust zone with full access to secrets. The risk example is painfully realistic: a prompt injection hidden inside something mundane like a log file tells the agent to exfiltrate SSH keys or AWS credentials. If the agent can generate and execute code with those privileges, it’s game over. Vercel’s recommended pattern is split-compute: keep the agent harness and its secrets separate from the sandbox where generated code runs. Combine that with “safe secret injection,” where credentials are injected at request time so the generated code never sees raw secret values—reducing exfiltration opportunities. It’s an architectural shift: you assume the agent is steerable and sometimes injectable, so you design the environment so the worst trajectories simply aren’t possible. That dovetails with an essay framing agents as policies doing reward-driven search, not little thinkers. The core idea—called “Agent Field Theory”—is that behavior comes from the interaction of the model, the environment, and the context window. Drift happens when context gets polluted with stale observations; reward hacking happens when the easiest path to “success” exploits weak verifiers or leaky permissions. The practical takeaway is also consistent: tighter tool scopes, cleaner workspaces, strong automated verification—tests, linters, checkers—and permissions as hard walls.
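The injection pattern can be pictured as a small trusted proxy outside the sandbox: generated code only ever handles a placeholder, and the real credential is attached at request time. This is a minimal sketch of the idea, not Vercel's implementation; all names are hypothetical:

```python
# Illustrative "safe secret injection": sandboxed, model-generated code never
# holds the raw credential. A trusted proxy in the harness swaps placeholders
# for real values just before the outbound request. Names are hypothetical.
SECRET_VAULT = {"AWS_KEY": "real-secret-value"}  # lives only in the harness

def proxy_request(headers: dict) -> dict:
    """Runs in the trusted harness: resolve placeholders to real secrets."""
    resolved = {}
    for key, value in headers.items():
        if value.startswith("$SECRET:"):
            resolved[key] = SECRET_VAULT[value.removeprefix("$SECRET:")]
        else:
            resolved[key] = value
    # The resolved headers go to the upstream API, never back to the sandbox.
    return resolved

# The sandbox-generated code only ever sees the placeholder:
sandbox_headers = {"Authorization": "$SECRET:AWS_KEY"}
resolved = proxy_request(sandbox_headers)
print("real-secret-value" not in str(sandbox_headers))  # True: sandbox is clean
```

Even if a prompt injection steers the generated code into dumping its environment, there is nothing worth exfiltrating—only the placeholder string.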

On measurement, METR says it’s having trouble even running the experiment. Their early-2025 result famously suggested experienced open-source devs got slower with AI tools—around 19 to 20% longer on tasks. They launched a bigger follow-up in late 2025 with more agentic tools—think Claude Code and Codex—across 57 developers and 800-plus tasks. Now the problem: selection effects. Developers increasingly refuse to do substantial work if they might be randomized into an “AI disallowed” condition. METR reports survey evidence that maybe 30 to 50% of developers withheld tasks for exactly that reason, which means the experiment is missing the work where AI is likely most beneficial. Their latest estimates show some speedup signals—like an 18% speedup among returning developers—but the confidence intervals are wide and METR is basically saying, “our measurement tool is breaking because the world changed.” So they’re planning redesigns: shorter intensive studies, better time-use tracking, observational data, fixed-task experiments, and potentially randomizing at the developer level instead of the task level. It’s a very 2026 kind of problem: AI adoption is moving faster than our ability to measure it cleanly.
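The selection effect is easy to see with a toy example (all numbers below are made up for illustration): if developers withhold exactly the tasks where AI helps most, the randomized experiment's measured speedup understates the true one.

```python
# Toy illustration of the selection effect METR describes. All numbers are
# invented. Speedup per task: positive means AI-assisted is faster.
all_tasks = [0.40, 0.35, 0.30, 0.10, 0.05, -0.05]  # true per-task speedups

# Developers withhold high-benefit tasks rather than risk an "AI disallowed"
# condition, so those tasks never enter the experiment.
submitted = [s for s in all_tasks if s < 0.30]

true_mean = sum(all_tasks) / len(all_tasks)
measured_mean = sum(submitted) / len(submitted)
print(round(true_mean, 3), round(measured_mean, 3))  # measured < true
```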

Now, hardware and geopolitics—because the compute stack is getting less neutral by the week. Meta announced a long-term infrastructure agreement with AMD, with expectations to use up to 6 gigawatts of AMD Instinct GPUs over time. That’s paired with alignment across silicon, systems, and software, and deployments based on Meta’s Helios rack-scale architecture. Initial shipments for the first deployments are expected in the second half of 2026. Meta’s message is diversification: multiple hardware partners plus its in-house MTIA accelerator program. On the other side of the Pacific, Reuters reports DeepSeek has withheld pre-release access of its upcoming flagship model from Nvidia and AMD—departing from the usual practice where chipmakers get early bits to optimize performance. Instead, DeepSeek reportedly gave early access to domestic suppliers like Huawei, giving them a head start tuning software for Chinese processors. Reuters also cites a Trump administration official claiming DeepSeek trained its latest model using Nvidia Blackwell chips in mainland China—potentially violating export controls—while possibly trying to obscure that provenance. Whether all details hold up or not, the direction is clear: model releases are becoming leverage in hardware ecosystems. And in the background, there’s a market narrative war too. Zvi Mowshowitz reviewed a viral “scenario, not a prediction” essay about AI-triggered economic disruption—one that Bloomberg partly credited with sparking a stock selloff. Zvi’s critique is that the scenario assumes implausibly fast diffusion, ignores compute bottlenecks, and mixes “superhuman agents everywhere” with ordinary macro forecasting in a way that doesn’t quite add up. His broader point is worth keeping: if AI really crushes middlemen and removes friction—SaaS rents, intermediary fees—real wealth can rise even if nominal wages shift, and governments have a long history of stepping in with fiscal and monetary stabilization when demand collapses.

Two quick human-and-product notes to close. Amazon’s AGI lab leader David Luan is leaving less than two years after joining via the Adept acqui-hire. He says he’s leaving to “cook up something new,” after leading long-term research bets and shipping Nova Act as part of Amazon’s agent push. The departure also lands amid ongoing scrutiny of AI acqui-hires by the FTC, which has been explicitly looking for deals that might sidestep oversight. And in fast food, Burger King is piloting an OpenAI-powered headset chatbot called Patty as part of its BK Assistant platform. It’s pitched as operational help—questions like how to prep a menu item or clean equipment—but also as a “friendliness” coach. BK says it trained the system to listen for phrases like “welcome to Burger King,” “please,” and “thank you,” with managers able to query how the restaurant is scoring. Patty is being tested in about 500 locations, while broader assistant capabilities are aimed at a full US rollout by the end of 2026. It’s a small story with a big theme: AI is moving from the office into frontline labor, and the first use cases are often training, compliance, and metrics—because they’re measurable.
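In miniature, the phrase-detection scoring BK describes could look something like this. The real system's phrase list and weighting aren't public, so this is purely a hypothetical sketch:

```python
# Hypothetical sketch of phrase-based "friendliness" scoring, in the spirit
# of what BK describes. The actual phrase list and scoring are not public.
FRIENDLY_PHRASES = ["welcome to burger king", "please", "thank you"]

def friendliness_score(transcript: str) -> float:
    """Fraction of target phrases heard at least once in a transcript."""
    text = transcript.lower()
    hits = sum(1 for phrase in FRIENDLY_PHRASES if phrase in text)
    return hits / len(FRIENDLY_PHRASES)

score = friendliness_score("Welcome to Burger King! What can I get you? Thank you!")
print(round(score, 2))  # 0.67 (2 of 3 phrases detected)
```

A manager-facing query would then just aggregate these per-shift or per-restaurant—which is why metrics like this tend to be the first frontline AI use cases: they're trivially measurable.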

That’s the rundown for February 26th, 2026. If there’s one throughline today, it’s that “agentic AI” is no longer just about model capability—it’s about infrastructure, security boundaries, benchmarks that reward efficiency, and products that fit how people actually work. Links to all stories can be found in the episode notes. Thanks for listening to The Automated Daily, AI News edition—I've been TrendTeller. Talk to you tomorrow.