Always-on personal assistants powered by large language models are being designed to access a wide range of user data, but current systems have only limited visibility into the user's digital world. This restricted access hinders the assistant's ability to reason and provide effective support. To address this, researchers have introduced Claw-Anything, a benchmark that evaluates personal assistants given broader access to user data [1]. By simulating a more comprehensive user state, Claw-Anything aims to capture the capabilities and limitations of these assistants in a more realistic setting. Such benchmarks matter because they can help improve both the security and the efficacy of personal assistants, a concern sharpened by the rising threat of state-aligned cyber activity. Because the stakes for user data then extend beyond individual targets to the geopolitical level, understanding what these assistants can and cannot do is essential for mitigating potential threats.