In work that could reshape how researchers study neurodevelopmental disorders, scientists at the Korea Advanced Institute of Science and Technology (KAIST) have unveiled an artificial‑intelligence system that translates the subtle dance of a mouse’s body into a language the machine can understand.

The system, named BehaVERT, was introduced in a March 24 paper in the International Journal of Computer Vision. BehaVERT treats the skeletal positions of a mouse’s nose, ears, spine, limbs and tail as a sequence of tokens—numerical symbols that serve as the raw material for transformer‑based language models. These tokens feed into a BERT‑style transformer, a neural architecture originally designed for natural‑language processing. Remarkably, the model was trained on raw video recordings of mice, without any prior biological annotations.
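
To make the tokenization idea concrete, here is a minimal Python sketch of one way tracked body points could be turned into discrete tokens. It is an illustration under stated assumptions, not the method described in the paper: the keypoint names, the 16-by-16 grid, and the functions `quantize`, `frame_to_tokens` and `video_to_tokens` are all hypothetical, and the article does not say how BehaVERT actually builds its vocabulary.

```python
from typing import Dict, List, Tuple

# Hypothetical keypoint names, loosely based on the body parts the article lists.
KEYPOINTS = ["nose", "left_ear", "right_ear", "spine", "tail_base"]
GRID = 16  # assumed number of bins per image axis

Frame = Dict[str, Tuple[float, float]]  # keypoint name -> (x, y) in pixels


def quantize(value: float, size: float, bins: int = GRID) -> int:
    """Map a coordinate in [0, size] to an integer bin in [0, bins - 1]."""
    clipped = min(max(value, 0.0), size)
    return min(int(clipped / size * bins), bins - 1)


def frame_to_tokens(frame: Frame, width: float, height: float) -> List[int]:
    """Turn one video frame into one token id per keypoint."""
    tokens = []
    for index, name in enumerate(KEYPOINTS):
        x, y = frame[name]
        cell = quantize(y, height) * GRID + quantize(x, width)
        # Offset by keypoint index so each body part gets its own id range.
        tokens.append(index * GRID * GRID + cell)
    return tokens


def video_to_tokens(frames: List[Frame], width: float, height: float) -> List[int]:
    """Concatenate per-frame tokens into one sequence, in time order."""
    sequence: List[int] = []
    for frame in frames:
        sequence.extend(frame_to_tokens(frame, width, height))
    return sequence
```

In this sketch each frame contributes five tokens, so a clip becomes a long integer sequence that a transformer could read in much the same way it reads a sentence.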

When applied to mice engineered to carry a deletion of the Shank3B gene—a genetic alteration that models autism spectrum disorder (ASD) in humans—the AI could reliably separate the altered mice from their normal counterparts. In its analysis, the model singled out “oral‑oral contact” as a key differentiator, a finding that echoes earlier behavioral studies showing that Shank3B knockout mice approach conspecifics but exhibit reduced direct social interaction.

According to the paper, BehaVERT achieved higher accuracy than existing methods on each of five international benchmarks, which cover social interaction, multi‑animal behavior, 3D motion analysis, and autism‑related behavior analysis.

Lead author Kim Dae‑soo, a professor in KAIST’s Department of Brain and Cognitive Sciences, described the system as moving “beyond simply classifying behavior to understanding its meaning.” He added that the tool could prove useful for drug development, psychiatric research, and behavioral genetics.

The study received funding from Korea’s Ministry of Science and ICT and the National Research Foundation of Korea. It was published on March 24, 2026. This article was produced with assistance from generative AI and edited by The Korea Times.

BehaVERT represents a novel application of transformer models to non‑linguistic data. By converting sequences of body‑joint positions into tokens, the system learns contextual relationships between movements, mirroring how language models learn relationships between words. This approach could be extended to other animal species or to human motion‑capture data.
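
The analogy with language models can be illustrated with the masked-token setup that BERT popularized: hide a fraction of the tokens and ask the model to recover them from the surrounding context. The article does not state whether BehaVERT uses exactly this objective, so the Python sketch below simply assumes it. `MASK_ID`, `IGNORE` and `mask_tokens` are hypothetical names, and the code simplifies BERT's original recipe, which sometimes swaps a selected token for a random one or leaves it unchanged instead of always masking it.

```python
import random
from typing import List, Tuple

MASK_ID = -1    # hypothetical id reserved for the mask symbol
IGNORE = -100   # label for positions the model is not asked to predict


def mask_tokens(
    tokens: List[int], mask_prob: float = 0.15, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Hide a random subset of movement tokens for masked prediction.

    Returns (inputs, labels): inputs has MASK_ID at hidden positions,
    labels holds the original token there and IGNORE everywhere else.
    """
    rng = random.Random(seed)  # seeded so the example is repeatable
    inputs: List[int] = []
    labels: List[int] = []
    for token in tokens:
        if rng.random() < mask_prob:
            inputs.append(MASK_ID)
            labels.append(token)
        else:
            inputs.append(token)
            labels.append(IGNORE)
    return inputs, labels
```

A model trained this way has to use neighboring movements to guess the hidden ones, which is the sense in which it learns context between movements rather than between words.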

In the broader context of ASD research, mouse models such as the Shank3B knockout are widely used to probe the neurobiological underpinnings of social deficits. Traditional behavioral assays rely on manual scoring or simple automated metrics, which can miss subtle patterns. BehaVERT’s ability to identify specific movement signatures without pre‑defined features may accelerate the discovery of biomarkers and therapeutic targets.

The paper also notes that validation on five established datasets suggests the model is robust across different experimental conditions. Although the study focused on a single genetic model, the authors suggest that the framework could be applied to other genetic or pharmacological manipulations.

Future work will likely involve testing the system on larger cohorts and exploring its applicability to other behavioral phenotypes. The researchers also plan to release the code and datasets to the scientific community to facilitate replication and further development.

At present, BehaVERT remains a research prototype. Its potential impact on drug discovery pipelines, clinical trials, and basic neuroscience will become clearer as additional studies confirm its predictive power and generalizability.

The development underscores the growing intersection of machine learning and behavioral neuroscience, offering a new tool that translates complex motion into interpretable patterns. As AI models continue to evolve, such interdisciplinary applications may become increasingly common.