Software vulnerability detection is hindered by the lack of scalable, automatically constructed repository-level datasets, which are essential for capturing realistic, interprocedural settings. Existing function-centric benchmarks examine functions in isolation and therefore fail to reflect real-world scenarios, where a vulnerability often depends on code spread across several functions and files. Recent repository-level security benchmarks have shown promise, but they rely on manual curation, which limits their scale. A new approach is needed to build scalable, automated datasets that support the development and evaluation of vulnerability detection in complex, realistic environments. The need is particularly pressing given the rise of state-aligned threat activity, which elevates the stakes from criminal to geopolitical, with implications extending far beyond the immediate target [1]. Such datasets are critical for improving the accuracy and efficiency of vulnerability detection and, ultimately, for enhancing the security of software systems.