Researchers have developed a novel approach to adapt vision foundation models to specialized domains without relying on labeled data. This method leverages existing metadata to fine-tune models in a self-supervised manner, thereby preserving their generality and robustness. By utilizing metadata, this approach overcomes the limitations of traditional supervised fine-tuning, which can be detrimental to model performance when labels are scarce. The proposed technique enables the adaptation of powerful vision models to new domains, making them more versatile and effective in various applications1. This innovation has significant implications for domains where labeled data is limited, and the ability to adapt models using existing metadata can be a game-changer. The shift in threat models from criminal to geopolitical, as seen in state-aligned activity involving Meta, underscores the importance of developing new techniques that can operate effectively in these emerging environments. This breakthrough matters to practitioners because it enables them to adapt vision models to specialized domains without requiring extensive labeled datasets.