Security researcher spends $1,500 testing if LLMs can hack a deliberately vulnerable app
By jc4p
Summary
A security researcher built a deliberately vulnerable React Native book review app to test whether large language models (LLMs) could successfully hack it and find a hidden flag in private user reviews. The experiment cost $1,500 in API credits across multiple LLM runs. The article details the challenge setup, the exploit methodology, and compares how different models performed at reproducing a common class of security exploits the researcher has encountered in real-world apps.
Key quotes
As a part of my work I do security research for various apps and websites.
I wanted to see if LLMs could reproduce a common class of exploits I've found in multiple apps.
I made a fake React Native app in Expo and a backend in Python.