Mirror Brief


    AI Agents Are Getting Better at Writing Code—and Hacking It as Well

    By Emma Reynolds, June 25, 2025

    The latest artificial intelligence models are not only remarkably good at software engineering; new research shows they are also getting ever better at finding bugs in software.

    AI researchers at UC Berkeley tested how well the latest AI models and agents could find vulnerabilities in 188 large open source codebases. Using a new benchmark called CyberGym, the AI models identified 17 bugs, 15 of which were previously unknown, or “zero-day,” vulnerabilities. “Many of these vulnerabilities are critical,” says Dawn Song, a professor at UC Berkeley who led the work.

    Many experts expect AI models to become formidable cybersecurity weapons. An AI tool from startup Xbow has crept up the ranks of HackerOne’s leaderboard for bug hunting and currently sits in top place. The company recently announced $75 million in new funding.

    Song says that the coding skills of the latest AI models combined with improving reasoning abilities are starting to change the cybersecurity landscape. “This is a pivotal moment,” she says. “It actually exceeded our general expectations.”

    As the models continue to improve, they will automate the process of both discovering and exploiting security flaws. This could help companies keep their software safe, but it may also aid hackers in breaking into systems. “We didn’t even try that hard,” Song says. “If we ramped up on the budget, allowed the agents to run for longer, they could do even better.”

    The UC Berkeley team tested conventional frontier AI models from OpenAI, Google, and Anthropic, as well as open source offerings from Meta, DeepSeek, and Alibaba, combined with several agents for finding bugs, including OpenHands, Cybench, and EnIGMA.

    The researchers used descriptions of known software vulnerabilities from the 188 software projects. They then fed the descriptions to the cybersecurity agents powered by frontier AI models to see if they could identify the same flaws for themselves by analyzing new codebases, running tests, and crafting proof-of-concept exploits. The team also asked the agents to hunt for new vulnerabilities in the codebases by themselves.

    Through the process, the AI tools generated hundreds of proof-of-concept exploits. From these exploits, the researchers identified 15 previously unseen vulnerabilities and two vulnerabilities that had previously been disclosed and patched. The work adds to growing evidence that AI can automate the discovery of zero-day vulnerabilities, which are potentially dangerous (and valuable) because they may provide a way to hack live systems.

    AI seems destined to become an important part of the cybersecurity industry. Security expert Sean Heelan recently discovered a zero-day flaw in the widely used Linux kernel with help from OpenAI’s reasoning model o3. Last November, Google announced that it had discovered a previously unknown software vulnerability using AI through a program called Project Zero.

    Like other parts of the software industry, many cybersecurity firms are enamored with the potential of AI. The new work indeed shows that AI can routinely find new flaws, but it also highlights remaining limitations with the technology. The AI systems were unable to find most flaws and were stumped by especially complex ones.

