PSK at SemEval-2026 Task 9: Multilingual Polarization Detection Using Ensemble Gemma Models with Synthetic Data Augmentation

PSK's system for SemEval-2026 Task 9 detects polarization in text across 22 languages. It ensembles Gemma models with 12B and 27B parameters, fine-tuned using Low-Rank Adaptation (LoRA), and trains separate models for each language¹. To enhance performance, the system adds synthetic training data generated by a large language model, using strategies such as direct generation and paraphrasing. This augmentation helps the models capture nuanced language patterns, which improves polarization detection accuracy. For cybersecurity practitioners, the relevance is that polarizing content can be used to manipulate public opinion and influence geopolitical events, including as part of state-aligned threat activity whose consequences can reach well beyond the immediate target. More effective detection of such content supports efforts to mitigate that impact.
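
The per-language ensemble idea can be illustrated with a minimal sketch. The summary does not say how the ensemble members are combined, so averaging the predicted probabilities and applying a 0.5 threshold is an assumption here, and every name in the block (`ensemble_score`, `detect`, `models_by_lang`, the stub scorers) is hypothetical. The stubs return fixed values in place of real fine-tuned Gemma models.

```python
from statistics import fmean
from typing import Callable, Dict, List, Tuple

# A scorer maps a text to the probability that it is polarized (0.0 to 1.0).
Scorer = Callable[[str], float]


def ensemble_score(text: str, scorers: List[Scorer]) -> float:
    """Average the polarization probabilities of all ensemble members.

    Assumption: simple probability averaging; the source does not specify
    the actual combination rule.
    """
    if not scorers:
        raise ValueError("ensemble needs at least one scorer")
    return fmean(scorer(text) for scorer in scorers)


def detect(
    text: str,
    lang: str,
    models_by_lang: Dict[str, List[Scorer]],
    threshold: float = 0.5,  # assumed decision threshold
) -> Tuple[float, bool]:
    """Route the text to its language's ensemble and threshold the score."""
    if lang not in models_by_lang:
        raise KeyError(f"no ensemble registered for language {lang!r}")
    score = ensemble_score(text, models_by_lang[lang])
    return score, score >= threshold


# Hypothetical stand-ins for two fine-tuned models of one language;
# a real scorer would run model inference instead of returning a constant.
def smaller_model_stub(text: str) -> float:
    return 0.62


def larger_model_stub(text: str) -> float:
    return 0.48


models_by_lang = {"en": [smaller_model_stub, larger_model_stub]}
score, is_polarized = detect("example text", "en", models_by_lang)
print(f"score={score:.2f} polarized={is_polarized}")
```

Keeping one list of scorers per language code mirrors the separate-models-per-language design described above, and it lets each language use a different number of ensemble members without changing the detection logic.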
