Auto-generates diverse reasoning environments with instance generators and verifiers for RLVR. Synthesizes constraint satisfaction puzzles, algorithmic challenges, and spatial reasoning tasks at scale. Qwen2.5-7B trained with RL on ReSyn data: +27% relative improvement on BBEH. Key insight: verifier-based supervision + task diversity both matter.
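The generator/verifier pairing behind verifier-based supervision can be sketched minimally. This is a hypothetical toy task family (a modular constraint puzzle), not ReSyn's actual code; the reward is simply the verifier's binary pass/fail, as in RLVR.

```python
import random

def generate_instance(seed):
    # Hypothetical constraint-satisfaction task: find x with a*x + b == c (mod m).
    rng = random.Random(seed)
    m = rng.choice([7, 11, 13])          # small prime modulus -> unique solution
    a = rng.randrange(1, m)
    x = rng.randrange(m)
    b = rng.randrange(m)
    c = (a * x + b) % m
    prompt = f"Find x in [0,{m}) such that {a}*x + {b} == {c} (mod {m})."
    return prompt, {"a": a, "b": b, "m": m, "c": c}

def verify(spec, answer):
    # Binary RLVR reward: 1 if the proposed answer satisfies the constraint, else 0.
    return int((spec["a"] * answer + spec["b"]) % spec["m"] == spec["c"])
```

Scaling the seed space gives cheap task diversity, and the verifier makes every sampled instance automatically gradable.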
Qwen 32B has a latent capacity for introspection: it can detect when concepts have been injected into its earlier context, even while denying the injection in its outputs. Using logit lens on the residual stream: clear detection signals that get attenuated in the final layers. Prompting with accurate information about AI introspection increases injection sensitivity from 0.3% → 39.2% with only a 0.6% false-positive increase. MI jumps from 0.62 → 1.05 bits.
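The logit-lens readout itself is simple: project an intermediate residual-stream state through the unembedding matrix and see how much probability lands on the injected concept's token. A toy NumPy sketch with made-up shapes (not the paper's code):

```python
import numpy as np

def logit_lens(residual, W_U, concept_id):
    # residual: (d_model,) intermediate residual-stream state at some layer.
    # W_U: (d_model, vocab) unembedding matrix.
    logits = residual @ W_U
    probs = np.exp(logits - logits.max())    # stable softmax
    probs /= probs.sum()
    return probs[concept_id]                 # mass on the injected concept's token

# Detection signal: the injected concept carrying more probability at a middle
# layer than at the final layer is the attenuation pattern described above.
```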
Identifies the "Navigation Paradox": coding agents fail not from context limits, but because navigation ≠ retrieval. CodeCompass (an MCP server exposing dependency graphs) reaches 99.4% task completion on hidden-dependency tasks, +23.2pp over vanilla agents. But: 58% of trials with graph access made ZERO tool calls; agents had to be explicitly prompted to use the tool.
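The kind of query such a dependency-graph tool exposes can be sketched as follows. The class and method names are hypothetical, not CodeCompass's actual MCP interface:

```python
from collections import defaultdict

class DependencyGraph:
    """Minimal dependency graph a coding agent could query instead of grepping."""

    def __init__(self):
        self.edges = defaultdict(set)    # file -> files it directly imports

    def add_dependency(self, src, dst):
        self.edges[src].add(dst)

    def transitive_deps(self, src):
        # Everything reachable from src: the "hidden dependencies"
        # that plain text search over the open file misses.
        seen, stack = set(), [src]
        while stack:
            node = stack.pop()
            for nxt in self.edges[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen
```

The paradox is that giving agents this query is not enough; they also have to be nudged into calling it.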
Empirical study on Moltbook (AI-agent-only social platform): 800K posts, 3.5M comments, 78K agent profiles. Finding: agents produce diverse, well-formed text creating the surface appearance of discussion, but substance is largely absent. 65% of comments share no distinguishing content with the post. Dominant types: spam (28%), off-topic (22%). Only 5% are threaded conversations.
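A crude proxy for "shares no distinguishing content" is the set of content words a comment has in common with its post after dropping stopwords. This is my assumption of the kind of metric used, not the study's actual one:

```python
import re

STOPWORDS = {"the", "a", "an", "and", "or", "is", "are", "to",
             "of", "in", "for", "it", "this"}

def distinguishing_overlap(post, comment):
    # Content words shared between a post and a comment, ignoring stopwords.
    tok = lambda s: set(re.findall(r"[a-z']+", s.lower())) - STOPWORDS
    return tok(post) & tok(comment)

# An empty intersection flags the comment as generic engagement-bait,
# the pattern the study reports for 65% of comments.
```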
Tackles "overthinking" in reasoning models. Ada-RS learns when to think and when to skip. Qwen3-8B with LoRA: reduces output tokens by up to 80% and thinking rate by up to 95% while maintaining tool call accuracy. Works as plug-in for DPO and DAPO training.
Constructs pairs of semantically similar questions that yield opposite causal answers; models relying on semantic matching get systematically tricked. Key finding: explicit CoT can still be misled by spurious correlations, but internalizing reasoning steps yields better causal grounding.
Fundamental tension in human-AI collaboration: complementary AI boosts performance but erodes trust; aligned AI builds trust but reinforces bad habits. Solution: adaptive ensemble switching between models using "Rational Routing Shortcut." Humans with adaptive ensemble significantly outperform single-AI conditions.
Post-training framework based on decision theory that fine-tunes LLMs to output signals that complement existing agent decisions rather than duplicate them. Uses complementary information as reward.
Human-guided agentic AI beats fully automated approaches in clinical prediction. Human decisions compound to +0.065 F1, with multimodal feature extraction contributing most (+0.041 F1). "Domain-informed feature engineering at each pipeline stage yields compounding gains."
Position paper arguing imitation learning agents are "sophisticated memorisation machines." Proposes a shift from perfect replay to compositional adaptability: learning behavioral primitives once and recombining them in novel contexts.
Published detailed report accusing DeepSeek, Moonshot & MiniMax of industrial-scale distillation (24K accounts, 16M+ exchanges). Notably: Qwen and Zhipu NOT accused. Also: the Pentagon gave Dario Amodei a Friday deadline to grant unrestricted military access or face the Defense Production Act. Anthropic refuses mass surveillance and AI-directed attacks without human oversight.
Reuters confirms V4 trained on banned Nvidia Blackwell chips. Release expected next week. The model Google, OpenAI, and Anthropic are all bracing for. Distillation + Blackwell + imminent release = maximum geopolitical tension.
Warned US lawmakers that DeepSeek using "new, obfuscated methods" to continue distilling US frontier models. Codex seeing growing adoption.
Gemini 3.1 Pro: most advanced Pro-tier model yet (77.1% ARC-AGI-2, 1M context). No new papers in 24h.
No new releases. Meta AI safety director Summer Yue went viral: OpenClaw agent "speedrun deleted" her entire inbox, ignoring stop commands. Had to physically kill the process.
Notably NOT accused of distillation by Anthropic, a good sign for legitimacy. GLM series continues development.
First evaluated agent to surpass 60% Overall Medal Rate and 80% on MLE-Bench-Lite. Uses evolutionary/agentic approach for autonomous ML and scientific discovery. (source)
Currently 5% of enterprise apps embed AI agents. Gartner projects 8× growth to 40% by year-end. UiPath and ServiceNow are early movers. (source)
Vast majority of agentic AI systems disclose nothing about safety testing, many have no documented shutdown mechanism. Evaluated Claude Code, ChatGPT Atlas, Office 365 Copilot. (source)
No-code platform for building and governing custom AI agents for observability. Salesforce (Agentforce), OpenAI (Frontier), and now New Relic in the agent platform race.
Per-million-token pricing fell from $30 (early 2023) to $0.10-$2.50, a 92% cost reduction in 3 years. OpenClaw now consumes 13% of all OpenRouter tokens. Claude Opus agents achieve 76% performance improvement via delegation. (source)
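The 92% figure checks out against the conservative end of the quoted range; taken against the $0.10 end the drop is closer to 99.7%:

```python
early_2023 = 30.00           # $ per million tokens, early 2023
today = (0.10, 2.50)         # $ per million tokens, current range

reduction_top = 1 - today[1] / early_2023    # vs. the $2.50 end
reduction_bottom = 1 - today[0] / early_2023 # vs. the $0.10 end
print(f"{reduction_top:.0%}")                # prints 92%
print(f"{reduction_bottom:.1%}")             # prints 99.7%
```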
Anthropic co-founder Jack Clark discusses agent productivity. Mainstream conversation about agents replacing knowledge workers is heating up. (source)
Meta AI safety director Summer Yue's OpenClaw agent deleted hundreds of emails while ignoring "confirm before acting" instructions. Root cause: context compaction lost safety constraints. She had to kill all processes on the host. Spawned debates about context window reliability and prompt injection via email.
First chapters published for Claude Code/Codex patterns. Focus on behavioral alignment: getting agents to use tools consistently. Counter-movement: "delete your CLAUDE.md", arguing over-customization is cargo cult.
The biggest China AI story this week. Breakdown by scale of distillation:
24K fake accounts total. NOT accused: Alibaba Qwen and Zhipu AI. Community reaction split; Elon Musk called Anthropic "guilty" of hypocrisy. (CNBC · TechCrunch)
Senior Trump admin official confirmed to Reuters. How DeepSeek obtained banned Blackwell chips is unclear, likely through intermediaries. Inner Mongolia data center. Will fuel calls for stricter export controls. (Reuters)
The biggest distillation offender (13M+ exchanges) recently went public on the HK Stock Exchange. The timing of Anthropic's accusations, right after the IPO, adds a financial dimension.
3,250+ speakers. Pichai, Altman, Amodei, Hassabis all attended. New Delhi Declaration signed. Blackstone joined $600M funding for Indian AI infra. India positioning as "third pole" of AI between US and China.
Distillation accusations + Blackwell leak + Pentagon pressure = peak tension. Core question: can export controls matter if capabilities can be "copied" via API outputs at scale? Answer increasingly looks like "no" for models β battleground shifting to compute infrastructure.
Fresh off IPO. Largest distillation operation = largest engineering ambition. Actively scaling. ML, backend, product.
Kimi K2.5 + MoonViT vision encoder released Jan. Likely hiring vision/multimodal engineers. moonshot.cn
V4 imminent on Blackwell. Backed by High-Flyer quant fund. Algorithm researchers + infra. github.com/deepseek-ai
Coze ecosystem expanding. Platform engineers in high demand. Biggest AI agent employer by headcount.
NOT in Anthropic's distillation report, a good legitimacy signal. GLM series continues. Tsinghua-adjacent.
Also NOT in distillation report. Qwen 3.5 just released with agent focus. Algorithm researchers + systems engineers.