29 Jul 2025

Mr Softy's AI on the Edge; Small Model... Big Result

Claude Usage Caps: Anthropic is implementing new usage caps for all Claude subscription tiers starting August 28. This is primarily aimed at hardcore users and those reselling access, as continuous use of Claude Code has pushed the system beyond its limits, leading to frequent outages and scarce compute.

Copilot Mode in Edge: Microsoft has rolled out an experimental Copilot Mode in its Edge browser. This AI assistant can search across tabs, summarize comparisons, assist via voice, and even book hotels or dinner reservations using browsing history and saved credentials. This marks a shift in the browser wars towards full-stack task automation, with major players like Microsoft, OpenAI, Google and Perplexity aiming to make the browser an all-encompassing hub. The implication is that browsers are evolving into agents that require user data and permission to act on the user's behalf.

Harmonic and Aristotle: Harmonic, co-founded by Robinhood CEO Vlad Tenev, has launched a chatbot app featuring Aristotle, a math-focused AI model. The model promises “zero hallucinations” and formal proof verification. Harmonic recently raised $100M at an $875M valuation. This initiative aims to make AI predictable by design through embedded formal systems, starting with mathematics. The key takeaway is that Harmonic is developing verifiable, accountable, and deployable AI tools.

Google Chrome: Now offers AI-powered store summaries for US shoppers.
Booz Allen: Launched Vellox-Reverser, an AI tool for automating malware reverse engineering.
Nvidia CEO Jensen Huang: Predicts AI will create more millionaires in the next five years than the internet did in twenty.
Silicon Valley AI Startups: Some are adopting China's "996" work schedule to gain a competitive edge.
AI Chatbots: Becoming essential for neurodivergent individuals for social communication, though experts warn against over-reliance.

CopyCat: A no-code browser automation platform combining AI prompts with step-by-step web task actions.
Doco: An AI writing assistant integrated into Word, blending features from Grammarly and Copilot.
Nitrode: An AI game engine enabling vibe-style coding to build playable 3D games rapidly.
Unitree R1: An ultra-lightweight humanoid robot with flexible joints and modular design, intended for education, research, and industry.

Z.ai's GLM-4.5: Chinese startup Z.ai (formerly Zhipu) has released GLM-4.5, an open-source agentic AI model family. It offers performance comparable to leading models in reasoning, coding, and autonomous tasks, while being more cost-effective than DeepSeek. GLM-4.5 combines reasoning, coding, and agentic abilities into a single 355B-parameter model with hybrid thinking. Z.ai claims it's the top open-source model globally and excels in agentic tasks with a 90% tool-use success rate. They also open-sourced their ‘slime’ training framework. This rapid development from Chinese labs is closing the gap with frontier systems and pressuring OpenAI.
Alibaba's Wan2.2: Alibaba's Tongyi Lab launched Wan2.2, a new open-source video model. It provides advanced cinematic capabilities and high-quality motion for text-to-video and image-to-video generation. Wan2.2 uses two specialized "experts" for efficiency and has surpassed rivals like Seedance, Hailuo, Kling, and Sora in aesthetics, text rendering, and camera control. It was trained on significantly more data than its predecessor, improving its handling of complex motion and scenes. Users can also fine-tune video aspects. This highlights China’s broad open-source efforts across the AI toolbox, building a parallel ecosystem.

Runway Aleph: A tool for editing, transforming, and generating video content.
Qwen3-Thinking: Alibaba’s AI with enhanced reasoning and knowledge.
Hunyuan3D World Model 1.0: Tencent’s open-world generation model.
Aeneas: Google’s open-source AI for restoring ancient texts.
Alibaba Quark AI glasses: Alibaba debuted its Quark AI glasses, powered by the company’s Qwen model.
Tesla & Samsung Deal: $16.5B deal for manufacturing Tesla’s next-gen AI6 chips.
Runway & IMAX: Partnership to bring AI-generated shorts to theaters.
Google DeepMind: CEO Demis Hassabis reported 980 trillion tokens processed across Google AI products in June, a 2x increase from May.

Hierarchical Reasoning Model (HRM): Singapore-based Sapient Intelligence has developed a new AI model, HRM, which challenges the conventional belief that “bigger is better” in AI. This tiny 27M-parameter model solves complex reasoning puzzles that larger AI models like ChatGPT struggle with. HRM mimics the human brain’s hierarchical structure, using a high-level “planner” for strategic thinking and a low-level “worker” for rapid calculations. This allows HRM to “think” deeply in a single forward pass and learn from minimal examples without extensive pre-training.
Performance: On the ARC-AGI benchmark (an AI IQ test), HRM achieved 40.3%, significantly outperforming Claude 3.7 (21.2%) and OpenAI’s o3-mini-high (34.5%). It also solved 55% of Sudoku-Extreme puzzles and 74.5% of 30x30 mazes, where other models scored 0%. HRM can be trained in just two GPU hours.
Implications: HRM demonstrates that architecture can be more important than size, leading to cheaper AI deployment, faster training, and better reasoning without expensive compute. Its open-source nature suggests a future where powerful AI can run efficiently on local machines, potentially disrupting the dominance of massive data centers.
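
The planner/worker split described for HRM can be pictured as two recurrent states running at different speeds. The toy below is a minimal sketch of that idea only, not Sapient Intelligence's implementation: the function names, the fixed scalar weights, and the cycle counts are all hypothetical, and nothing here is trained.

```python
import math

def low_step(z_low, z_high, x):
    """Fast 'worker' update: mixes its own state, the planner state, and the input.
    The 0.5 / 0.3 / 0.2 weights are arbitrary placeholders, not learned values."""
    return [math.tanh(0.5 * l + 0.3 * h + 0.2 * xi)
            for l, h, xi in zip(z_low, z_high, x)]

def high_step(z_high, z_low):
    """Slow 'planner' update: absorbs the worker's latest state.
    The 0.6 / 0.4 weights are likewise placeholders."""
    return [math.tanh(0.6 * h + 0.4 * l) for h, l in zip(z_high, z_low)]

def hierarchical_forward(x, n_cycles=3, steps_per_cycle=4):
    """Run the worker several steps per cycle, then update the planner once."""
    z_low = [0.0] * len(x)
    z_high = [0.0] * len(x)
    low_updates = 0
    high_updates = 0
    for _ in range(n_cycles):
        for _ in range(steps_per_cycle):
            z_low = low_step(z_low, z_high, x)
            low_updates += 1
        z_high = high_step(z_high, z_low)
        high_updates += 1
    return z_high, low_updates, high_updates

if __name__ == "__main__":
    state, low_n, high_n = hierarchical_forward([1.0, -1.0])
    print(state, low_n, high_n)
```

With the default settings the worker updates 12 times while the planner updates 3 times, all within one call. That nesting is the point of the sketch: many fast steps feed a few slow ones inside a single pass.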

Honesty in AI: A key tip from leaked Claude Code and Gemini 2.5 Pro system prompts is to force AI to admit when it doesn't know something. For example, include a rule like: “If you cannot find a complete solution, you must not guess.”
Structured Prompting: An upgraded prompt format suggests: “Can you fully answer this question? Summary: [your main findings] Details: [step-by-step with explanations] Gaps: [anything you’re unsure about].” Additionally, verify answers by asking the AI to find errors in its own response.
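
The two tips above can be wired into a pair of small prompt builders. This is a sketch under stated assumptions: the function names are hypothetical, the "Question:" label and the self-check wording are illustrative additions, and the resulting strings would be sent through whichever chat API you already use (no API call is shown).

```python
# The no-guess rule quoted in the tip above.
NO_GUESS_RULE = "If you cannot find a complete solution, you must not guess."

def build_structured_prompt(question: str) -> str:
    """Combine the no-guess rule with the Summary / Details / Gaps format."""
    return "\n".join([
        NO_GUESS_RULE,
        "",
        f"Question: {question}",  # "Question:" label is an assumption
        "Can you fully answer this question?",
        "Summary: [your main findings]",
        "Details: [step-by-step with explanations]",
        "Gaps: [anything you're unsure about]",
    ])

def build_self_check_prompt(previous_answer: str) -> str:
    """Follow-up turn asking the model to look for errors in its own answer.
    The wording here is illustrative, not taken from any leaked prompt."""
    return (
        "Review the answer below and list any errors you find. "
        "If you find none, say so explicitly.\n\n"
        f"Answer to review:\n{previous_answer}"
    )

if __name__ == "__main__":
    print(build_structured_prompt("Why does my cron job run twice?"))
    print()
    print(build_self_check_prompt("(paste the model's first answer here)"))
```

Keeping the rule and the format in one builder means every question gets the same scaffolding, and the self-check stays a separate second turn so the model reviews a finished answer rather than hedging the first one.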

Flow: An AI voice keyboard for iPhone that converts speech to text 5x faster.
GLM-4.5: An open-source, GPT-4-level AI for complex reasoning, coding, and agent tasks.
Julius AI: Transforms Excel files into charts based on natural language questions, now with data connectors.
Shortcut: Automates complex Excel tasks, including financial modeling and spreadsheet filling.
Seed LiveInterpret 2.0: Provides instant voice conversation translation while preserving the original voice.
Wan2.2 5B: Generates videos from text or images at 720P/24fps, compatible with consumer GPUs.

China AI Adoption: China reports a 99% AI adoption rate in universities and has published an AI Global Governance Initiative.