24 Jul 2025

Lovable Seduces; U.S. Strips Barriers to AI Growth

📰 New AI Benchmark & Industry Insights

AI Coding Benchmarks & Reality Check: The Laude Institute and Andy Konwinski, a co-founder of both Databricks and Perplexity, introduced the K Prize, a new AI coding challenge. The first winner scored only 7.5%, underscoring that current AI models are not yet production-ready for complex coding tasks. The low score also suggests that many existing benchmarks may be inflated by training data contamination, highlighting the need for cleaner, more rigorous evaluation methods.

Cloud Dynamics & Strategic Implications: Google Cloud is now supplying compute to OpenAI, a major AI competitor, to help it scale as Microsoft faces GPU capacity limitations. While this boosts Google Cloud's revenue, it raises strategic questions about enabling a rival that could threaten Google's core businesses, echoing past instances where Google helped competitors only to see them grow into significant threats.

Lovable's Startup Success Story: Swedish "vibe coding" startup Lovable has reportedly reached $100M in annual recurring revenue (ARR) in less than eight months with a lean team of only 45 employees. It serves 2.3 million active users and hosts over 10 million projects, demonstrating remarkable efficiency and rapid monetization. The company has also refined its pricing strategy to target the mid-market, a growth trajectory usually seen only in venture capital fever dreams. Its success highlights the potential of AI-native, highly efficient businesses.

Quick Hits:

Google’s AI Overviews are now used by two billion people monthly, with its new AI search mode attracting 100 million users in the US and India.
Proton launched a privacy-focused AI assistant featuring end-to-end encryption and no user conversation logs.
A report from Holistic AI suggests that structured adversarial testing could have prevented the public failures observed in xAI’s Grok 4 model.
Mastodon has introduced in-app donations, spurred by an exodus from Twitter.
Google Photos is rolling out new generative AI features, including photo stylization and cinematic video creation.

Trending Launches:

Clearitty: A sales platform that uses intent data and AI scoring to target in-market accounts.
Commitify: An AI agent that makes phone calls to help users stay accountable.
Qwen3-Coder: A 480B MoE model from Alibaba, excelling in coding tasks with 1M context support.
atypica.AI: Automates market research in 10 minutes by simulating consumer behavior to reveal key insights.
Heardly: A tool designed for quickly reading best-selling books.
CopyOwl: Billed as the first AI research agent, enabling deep research on any topic with a single click.

U.S. AI Action Plan & Healthcare AI

The Plan: The White House has unveiled an AI Action Plan, marking a significant shift towards accelerating U.S. dominance in AI. This plan outlines over 90 policy actions focused on fostering innovation, building infrastructure, and strengthening diplomacy. Key aspects include developing new data centers, removing legal barriers to AI growth, promoting open-source AI, and incentivizing its adoption. The plan also emphasizes addressing "ideological bias" in AI systems used by government contractors. However, critics express concerns that this blueprint primarily benefits tech giants and may compromise public safeguards.

AI in Healthcare: A partnership between OpenAI and Penda Health in Nairobi, Kenya, demonstrated the positive impact of AI copilots in medical clinics. Clinicians using the AI system showed a 16% reduction in diagnostic errors and 13% fewer treatment mistakes. The success of this initiative was attributed to the use of capable models (GPT-4o), seamless integration into existing workflows, and personalized training for medical staff. This serves as a promising blueprint for integrating AI into healthcare, particularly in underserved regions.

Other Key Developments:

Google DeepMind’s Aeneas: An AI system designed to assist historians in restoring, dating, and deciphering damaged Latin inscriptions. It boasts 72% accuracy in attributing inscriptions to Roman provinces and 73% accuracy in restoring damaged text.
AI Training with Claude and Canva: A tutorial highlights how to create comprehensive social media campaigns by integrating Claude with Canva’s Connector, enabling automated content generation across various platforms.

New!

⚙️ Qwen3-Coder: Alibaba’s new state-of-the-art agentic coding model.
⚡️ Gemini 2.5 Flash-Lite: Google’s fastest and most cost-efficient model.
🗣️ Higgs Audio V2: Boson’s open-source audio model for voice generation.
📸 Google Photos: New AI features, including photo-to-video conversion and Remix capabilities.

ChatGPT Agent, Vercel AI Cloud, and Technical Debt

ChatGPT Agent: More users are getting access to ChatGPT Agent, a new feature that combines web browsing with deep research analysis and uses a virtual computer to perform multi-step tasks such as calendar management and competitive analysis. This represents a significant step towards more autonomous and capable AI agents.

Hidden Technical Debt in AI: Despite the promise of simplicity, AI systems, including large language models (LLMs), come with substantial hidden technical debt. This includes the need for extensive infrastructure, complex data management, and significant operational overhead, revealing the underlying complexity of these powerful tools.

Vercel AI Cloud: Vercel has launched the AI Cloud, a platform aimed at simplifying AI application development. It integrates AI-first tools like the AI SDK and AI Gateway to ensure flexible and secure execution. The platform also optimizes AI workloads, introduces Vercel BotID for enhanced security, and provides a Vercel Sandbox for safely running untrusted code, all of which contribute to advancing the agentic era of web development.

Roundup:

Anthropic Tightens Claude Code Usage Limits: Users of Claude Code have reported unexpectedly restrictive usage limits without any prior notification, leading to concerns about subscription downgrades and inaccurate usage tracking.
FutureBench by Hugging Face: A new benchmark designed to test the ability of AI agents to predict future events across various domains, including science, geopolitics, and technology.
Shopify’s Internal AI Adoption Strategy: Shopify has aggressively adopted AI by purchasing 3,000 Cursor licenses with unlimited token spending and creating an internal LLM proxy. This has empowered non-technical staff to build performance auditing tools and sales engineers to streamline their workflows.
“Power” Attention for Scaling Context: A novel attention implementation that allows for independent control of state size, outperforming standard attention on long sequences and enabling significantly faster custom GPU kernels.
Weighted Perplexity Benchmark: A new evaluation method that normalizes perplexity scores across different tokenizers, providing a fairer and more accurate comparison of language models.
Perplexity’s Expansion into India: Perplexity is expanding its reach into India by partnering with Airtel to offer free 12-month Perplexity Pro subscriptions to 360 million subscribers, aiming to capture a significant share of the next wave of AI adoption.
Meta’s Aggressive Hiring: Meta continues its aggressive hiring spree, recruiting two more key figures from Apple’s AI team, Mark Lee and Tom Gunter.
Mistral’s Le Chat Enhancements: Mistral has added new capabilities to Le Chat, including a deep research mode for generating structured reports and a voice mode powered by Voxtral.
OpenAI’s Bio Bug Bounty: OpenAI has launched a bug bounty program offering a $25,000 reward for a universal jailbreak that bypasses the safety filters on ChatGPT Agent, which the company has classified as high risk in the biological and chemical domain.
Updated FineWeb: The FineWeb dataset has been updated with 18.5 trillion tokens from CommonCrawl snapshots from January to June 2025.
Windsurf Wave 11: Windsurf has shipped Wave 11, the latest in the numbered series of product updates to its AI coding editor.
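
To see why the Weighted Perplexity Benchmark item above matters, here is a minimal sketch of the underlying problem. It does not reproduce that benchmark's actual method, which the roundup does not describe; it swaps in a common, simpler normalization (bits per byte) as a stand-in. The function names and the toy loss values are assumptions made up for illustration.

```python
import math

def token_perplexity(token_nlls):
    """Standard perplexity: exp of the mean negative log-likelihood per token (nats)."""
    return math.exp(sum(token_nlls) / len(token_nlls))

def bits_per_byte(token_nlls, text):
    """Tokenizer-independent score: total NLL in bits divided by UTF-8 byte count."""
    total_bits = sum(token_nlls) / math.log(2)
    return total_bits / len(text.encode("utf-8"))

text = "hello world"

# Hypothetical per-token losses for the SAME text under two tokenizers.
# Both models assign the text the same total likelihood (4.0 nats),
# but model A splits it into 2 tokens and model B into 4.
nlls_a = [2.0, 2.0]
nlls_b = [1.0, 1.0, 1.0, 1.0]

ppl_a, ppl_b = token_perplexity(nlls_a), token_perplexity(nlls_b)
bpb_a, bpb_b = bits_per_byte(nlls_a, text), bits_per_byte(nlls_b, text)

print(f"token perplexity: A={ppl_a:.3f}  B={ppl_b:.3f}")
print(f"bits per byte:    A={bpb_a:.4f}  B={bpb_b:.4f}")
```

Per-token perplexity makes model B look better only because its tokenizer produces more, shorter tokens. Dividing by a unit that does not depend on the tokenizer removes that artifact, which is the kind of fairness a cross-tokenizer benchmark is after.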

Final Bits:

OpenAI CEO Sam Altman issued a warning about impending “AI fraud,” noting that AI has already compromised authentication methods widely used by financial institutions.
YouTube has introduced new AI tools for Shorts creators, including features for photo-to-video conversion.

📚 Today's Sources

The Internet
AI Secret
The Rundown AI
TLDR AI
The Neuron
There's An AI For That
