December 2025: AI updates from the past month

Anthropic makes Skills an open standard

Skills—a capability that allows users to teach Claude repeatable workflows—was first introduced in October, and now the company is making it an open standard. “Like MCP, we believe skills should be portable across tools and platforms—the same skill should work whether you’re using Claude or other AI platforms,” the company wrote in a blog post.

Additionally, the company announced a directory of pre-built skills from companies like Notion, Canva, Figma, and Atlassian.

Other new features, which vary by plan, include the ability to provision skills from admin settings and easier methods for creating and editing skills.
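Under the hood, a skill is packaged as a folder whose entry point is a SKILL.md file: YAML frontmatter telling Claude when to use the skill, followed by free-form instructions. A minimal sketch (the `name` and `description` fields follow Anthropic's documented format; the body content here is illustrative):

```markdown
---
name: release-notes
description: Drafts release notes from a list of merged pull requests. Use when the user asks for release notes or a changelog summary.
---

# Release notes

1. Group the merged PRs by area (features, fixes, docs).
2. Summarize each PR in one sentence, linking the PR number.
3. Output a Markdown document with one section per area.
```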

OpenAI GPT-5.2-Codex released

This is a version of GPT-5.2 that is optimized for the company’s coding agent Codex. It includes “improvements on long-horizon work through context compaction, stronger performance on large code changes like refactors and migrations, improved performance in Windows environments, and significantly stronger cybersecurity capabilities,” OpenAI wrote in a post.

GPT-5.2-Codex is available across all Codex surfaces for paid ChatGPT users and is planned to be added to the API in the coming weeks after more safety improvements are made. The company also announced that it is piloting a new invite-only program where it gives access to new capabilities and more permissive models for vetted professionals and organizations in the cybersecurity space.

“By rolling GPT‑5.2-Codex out gradually, pairing deployment with safeguards, and working closely with the security community, we’re aiming to maximize defensive impact while reducing the risk of misuse. What we learn from this release will directly inform how we expand access over time as the software and cyber frontiers continue to advance,” OpenAI wrote.

Google releases Gemini 3 Flash, enabling faster, more cost-effective reasoning

Google has announced the release of Gemini 3 Flash, its latest frontier model designed for speed at a lower token cost.

According to Google, this model is ideal for iterative development, as it is able to quickly reason and solve tasks in high-frequency workflows. It also outperforms all Gemini 2.5 models as well as Gemini 3 Pro in coding capabilities on SWE-bench Verified.

Additionally, due to its strong performance in reasoning, tool use, and multimodal capabilities, it is well suited to tasks like complex video analysis, data extraction, and visual Q&A, enabling applications that demand both advanced reasoning and quick answers, such as in-game assistants or A/B testing experiments.

Zencoder introduces AI Orchestration layer to cut down on issues in AI-generated code

Zencoder is introducing its Zenflow desktop app to help development teams transition from vibe coding to AI-First Engineering.

According to the company, AI coding has hit a ceiling due to LLMs producing code that looks correct but fails in production or gets worse as it is iterated on.

Zenflow introduces an AI Orchestration layer to turn “chaotic model interactions into repeatable, verifiable engineering workflows.”

This orchestration layer is based on four pillars:

  1. Structured AI workflows that follow a Plan > Implement > Test > Review cycle
  2. Spec-driven development, where agents are anchored to technical specifications
  3. Multi-agent verification, leveraging model diversity to reduce blind spots, such as having Claude review code written by OpenAI models
  4. Parallel execution of multiple models running at the same time in isolated sandboxes
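The cycle the four pillars describe can be sketched as a plain loop: draft against a spec, test, then have a second, different model review before accepting. Everything below is a hypothetical illustration, not Zenflow's actual implementation; `call_model` stands in for whatever LLM client the orchestration layer wires in:

```python
# Hypothetical sketch of a Plan > Implement > Test > Review loop with
# multi-agent verification. call_model() is a stand-in for a real LLM client.

def call_model(model: str, prompt: str) -> str:
    """Placeholder: a real implementation would call the model's API."""
    return f"[{model}] response to: {prompt[:40]}"

def run_tests(code: str) -> bool:
    """Placeholder verification step: run the project's test suite."""
    return True  # pretend the generated change passes

def orchestrate(spec: str, author: str = "gpt-5.2-codex",
                reviewer: str = "claude") -> dict:
    plan = call_model(author, f"Plan the change described by this spec:\n{spec}")
    code = call_model(author, f"Implement this plan:\n{plan}")
    if not run_tests(code):
        code = call_model(author, f"Tests failed, fix this implementation:\n{code}")
    # Multi-agent verification: a *different* model reviews the result,
    # trading on model diversity to reduce shared blind spots.
    review = call_model(reviewer, f"Review this change against the spec:\n{code}")
    return {"plan": plan, "code": code, "review": review}

result = orchestrate("Add pagination to the /users endpoint")
print(result["review"])
```

The key design point is the `reviewer` parameter: because the reviewing model is different from the authoring model, the two are less likely to share the same failure modes.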

Google launches A2UI project to enable agents to build contextually relevant UIs

Google has announced a new project that aims to leverage generative AI to build contextually relevant UIs.

A2UI is an open source tool that generates UIs based on the current conversation’s needs. For example, an agent designed to help users book restaurant reservations would be more useful if it featured an interface to input the party size, date and time, and dietary requirements, rather than the user and agent going back and forth discussing that information in a regular conversation. In this scenario, A2UI can help generate a UI with input fields for the necessary information to complete a reservation.

“With A2UI, LLMs can compose bespoke UIs from a catalog of widgets to provide a graphical, beautiful, easy to use interface for the exact task at hand,” Google wrote in a blog post.
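In practice, this means the model emits a declarative description of the screen rather than application code. The sketch below is purely illustrative and does not use A2UI's real widget catalog or schema; it only shows the general idea of composing a form from a fixed set of widget types:

```python
# Illustrative only: compose a reservation form from a catalog of widget
# types. This is NOT A2UI's actual schema, just the shape of the idea.

WIDGET_CATALOG = {"number_input", "datetime_picker", "text_input", "button"}

def widget(kind: str, label: str, **props) -> dict:
    """Build one widget node, rejecting types outside the catalog."""
    if kind not in WIDGET_CATALOG:
        raise ValueError(f"unknown widget type: {kind}")
    return {"type": kind, "label": label, **props}

# What an agent might emit for the restaurant-reservation example above:
reservation_ui = {
    "title": "Book a table",
    "children": [
        widget("number_input", "Party size", min=1, max=12),
        widget("datetime_picker", "Date and time"),
        widget("text_input", "Dietary requirements"),
        widget("button", "Reserve", action="submit_reservation"),
    ],
}

print(len(reservation_ui["children"]))  # prints 4
```

Constraining the model to a known catalog is what keeps the generated UI renderable: the client only needs to know how to draw each catalog entry, not arbitrary markup.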

Patronus AI announces Generative Simulators

Generative Simulators are simulation environments that can create new tasks and scenarios, update the rules of the world over time, and evaluate an agent’s actions as it learns.

The company additionally announced a new training method called Open Recursive Self-Improvement (ORSI) that allows agents to improve through interaction and feedback without requiring a full retraining cycle between attempts.

“Traditional benchmarks measure isolated capabilities, but they miss the interruptions, context switches, and multi-layered decision-making that define actual work,” said Anand Kannappan, CEO and co-founder of Patronus AI. “For agents to perform tasks at human-comparable levels, they need to learn the way humans do – through dynamic, feedback-driven experience that captures real-world nuance.”

OpenAI announces GPT-5.2

GPT-5.2 is optimized for professional knowledge work, scoring 70.9% (as GPT-5.2 Thinking) on the GDPval benchmark of knowledge work tasks, compared to 38.8% for GPT-5.1 Thinking.

The company has started rolling out GPT-5.2 in ChatGPT today, with Instant, Thinking, and Pro modes, starting with paid plans. It is also available in the OpenAI API for all developers.

“Overall, GPT‑5.2 brings significant improvements in general intelligence, long-context understanding, agentic tool-calling, and vision—making it better at executing complex, real-world tasks end-to-end than any previous model,” the company said.

Google launches improved Gemini audio models

Gemini 2.5 Flash Native Audio improves the model’s ability to handle complex workflows, navigate user instructions, and hold natural conversations.

It is now available in Google AI Studio and Vertex AI, as well as being incorporated into Google’s user-facing products like Gemini Live and Search Live.

The company also announced live speech translation in the Google Translate app, which allows speech to be translated in real-time while preserving speaker intonation, pacing, and pitch. It supports over 70 languages and 2000 language pairs.

“For two-way conversation, Gemini’s live speech translation handles translation between two languages in real-time, automatically switching the output language based on who is speaking. For example, if you speak English and want to chat with a Hindi speaker, you’ll hear English translations in real-time in your headphones, while your phone broadcasts Hindi when you’re done speaking,” the company explained.

Google announces beta for Interactions API

Another update from Google this week was the beta release of the Interactions API, an interface for working with Google’s models and agents like Gemini Deep Research.

“The Gemini Interactions API represents a major step forward in how we model AI communication. Whether you are building custom agents from scratch using any framework like the ADK or connecting existing agents together via A2A, this is a new set of capabilities to start exploring today,” the company wrote in a blog post.

Mistral releases Devstral 2

Devstral 2 is the company’s latest open source coding model, and it is available in two different sizes: Devstral 2 (123B) and Devstral Small 2 (24B).

The company also released Mistral Vibe CLI, an open-source command-line coding assistant that leverages Devstral. It can explore and modify a developer’s codebase using natural language from the terminal or an IDE. Key features include project-aware context, smart references, multi-file orchestration, persistent history, autocompletion, and customizable themes.

Linux Foundation forms Agentic AI Foundation to be new home for MCP, goose, and AGENTS.md

The Linux Foundation today announced that it is forming the Agentic AI Foundation (AAIF) to promote transparent and collaborative evolution of agentic AI.

Three major projects have been donated to the foundation at launch: Anthropic’s Model Context Protocol (MCP), Block’s goose, and OpenAI’s AGENTS.md.

“Donating MCP to the Linux Foundation as part of the AAIF ensures it stays open, neutral, and community-driven as it becomes critical infrastructure for AI,” said Mike Krieger, chief product officer at Anthropic. “We remain committed to supporting and advancing MCP, and with the Linux Foundation’s decades of experience stewarding the projects that power the internet, this is just the beginning.”
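Of the three projects, AGENTS.md is deliberately the simplest: a plain Markdown file at a repository's root that tells coding agents how to work in that project. A minimal illustrative example (the section names and commands here are invented for the sketch):

```markdown
# AGENTS.md

## Setup
- Install dependencies with `npm install`.
- Run the dev server with `npm run dev`.

## Conventions
- TypeScript strict mode; no `any`.
- Run `npm test` and `npm run lint` before committing.
```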

Progress adds Agentic UI Generator to latest versions of Telerik and Kendo UI

Progress Software announced the latest releases of its Telerik and Kendo UI products, which both include an Agentic UI Generator that can create multi-component, fully styled, enterprise-grade page layouts.

The Agentic UI Generator is currently available for Progress Telerik UI for Blazor, Progress KendoReact, and Progress Kendo UI for Angular.

“With today’s release, AI-based code generation is now enterprise-ready, providing new horizons for UI development,” said Loren Jarrett, EVP and GM of digital experience at Progress Software. “Instead of simply generating code with AI that requires review and revision, with the Agentic UI Generator, developers can now build production-ready interfaces based on best practices from simply a prompt. This marks an important milestone—not just for Telerik and Kendo UI, but for how modern applications will be built going forward.”

Wherobots launches RasterFlow to provide foundations needed to apply AI models on satellite image datasets

Spatial intelligence company Wherobots announced a private preview of RasterFlow, a satellite image preparation and inference solution that makes it easier to gain insights from satellite imagery.

“RasterFlow is a new compute engine that is going to help feed data about the physical world to all sorts of different types of applications, but then also make it so that we can process it and serve other applications as well,” said Ben Pruden, head of go-to-market at Wherobots.

By streamlining this process, customers will be able to run AI models on physical world data to get answers to physical world questions, such as predicting fields and their boundaries from an overhead view of farmland.

Augment Code launches Code Review Agent

As AI coding assistants churn out ever greater amounts of code, the first – and arguably most painful – bottleneck that software teams face is code review. A company called Augment Code, which has developed an AI code assistant, announced a Code Review Agent to relieve that pressure and improve flow in the development life cycle.

Guy Gur-Ari, Augment Code co-founder and chief scientist, explained that a key differentiator from other code assistants is that the Code Review Agent works at a higher semantic level, making the agent almost a peer to the developer.

“You can talk to it at a very high level. You almost never have to point it to specific files or classes,” he said in an interview with SD Times. “You can talk about, oh, add a button that looks like this in this page, or explain the lifetime of a request through our system, and it will give you good answers, so you can stay at this level and just get better results out of it.”

Anthropic acquires Bun

Bun is a JavaScript, TypeScript, and JSX toolkit, and Anthropic plans to incorporate it into Claude Code to improve performance and stability and enable new capabilities.

“Bun is redefining speed and performance for modern software engineering and development. Founded by Jarred Sumner in 2021, Bun is dramatically faster than the leading competition. As an all-in-one toolkit—combining runtime, package manager, bundler, and test runner—it’s become essential infrastructure for AI-led software engineering, helping developers build and test applications at unprecedented velocity,” Anthropic wrote in a post.

GPT-5.1-Codex-Max now available in OpenAI API

GPT-5.1-Codex-Max is the company’s latest frontier agentic coding model; it is faster and more intelligent than the base GPT-5.1-Codex while using fewer tokens.

OpenAI also announced that developers can now delegate tasks from Linear to Codex. They can assign or mention Codex in an issue to trigger it, and then as Codex works through the task, it posts updates back to Linear.

Google adds Data Commons extension to Gemini CLI

Google is adding a Data Commons extension to the Gemini CLI to make it easier for developers to access and interact with publicly available data.

Data Commons is a large library of public data from around the world, gathered from sources like the United Nations, the World Bank, and a number of government agencies.

The new extension can be used to ask questions like “What are some interesting statistics about India?” or “Analyze the impact of education expenditure on GDP per capita in Scandinavian countries” directly in the CLI.

Amazon releases Nova Forge, Nova Act, and new Nova models

Nova Forge allows developers to build their own frontier models using Nova models. Users can combine their own datasets with Amazon Nova-curated training data, and then host their models on AWS.

Nova Act is a new service that helps developers build, deploy, and manage fleets of agents for UI workflows.

Finally, Nova 2 Lite is a fast and cost-effective reasoning model that supports extended thinking, and Nova 2 Sonic is a speech-to-speech model for building voice interactivity.

Amazon adds 18 new open weight models to Bedrock

The new models include ones from Google, Mistral, NVIDIA, OpenAI, Moonshot AI, MiniMax AI, and Qwen. These include the four newest models from Mistral, which are only available on Bedrock: Mistral Large 3, Ministral 3 3B, Ministral 3 8B, and Ministral 3 14B.

“With this launch, Amazon Bedrock now provides nearly 100 serverless models, offering a broad and deep range of models from leading AI companies, so customers can choose the precise capabilities that best serve their unique needs,” the company wrote in a blog post.

Parasoft releases latest version of C/C++test with agentic AI workflows

First previewed at embedded world North America last month, the updates include agentic AI workflows, static analysis for CUDA C/C++, and improved support for GoogleTest.

Parasoft’s MCP server allows AI agents to be connected to C/C++test to automatically fix violations, optimize rule sets, and generate documentation.

“This is what AI developers actually want—one that acts as a true partner,” said Igor Kirilenko, chief product officer at Parasoft. “By automating the heavy lifting, it frees up your experts to focus on more complex challenges, turning quality and compliance from a burden into their greatest advantage.”


Source: sdtimes.com…
