About Me

I am currently a fourth-year PhD student in the Mathis Laboratory of Adaptive Intelligence at EPFL, working under the supervision of Professor Mackenzie Mathis.

My vision is to close the gap between biological and artificial systems. Despite the massive compute poured into training deep networks, these models still suffer from skyrocketing energy consumption, a lack of robustness, and an inability to improve autonomously. My research aim is to build an efficient multi-modal agent that can improve itself. I ask: can we make a model that is energy efficient, robust, and capable of evolving on its own?

Biological brains use analog signals and spontaneously prune the connections among neurons. Can we make neural networks more alike? Before starting my PhD, I led several works that let neural networks exploit sparse connections and lower-precision weight representations without sacrificing performance. While progress was made there, I saw another emerging problem: neural networks seem to trade off compression against robustness. To that end, I then led works asking whether we can achieve both, through adversarial training and through continuous domain adaptation that does not adapt model weights.

I then began to realize that the root cause of deep networks' energy consumption and lack of robustness might be that they require too much human intervention: what to learn, which architecture to use, and the inevitable gap between training and test time. If we could build an agent that improves autonomously, it should be able to find the most efficient ways to improve with minimal human intervention. To draw inspiration from biological systems from a neuroscience perspective, I started my doctoral studies in a neuroscience lab.

During my doctoral studies, I have asked whether we can build an autonomous AI system in a bottom-up way. I first built the first foundation models for animal behavior analysis, designed to be efficient and robust. I then built an LLM-based agent on top of these foundation models to automate animal behavior analysis. Now I am extending this work to a multi-modal agent that can evolve over time. I believe we will soon see systems that behave more like biological systems, at an affordable resource cost.

My academic contributions span computer vision foundation models, LLM-based systems, the robustness of neural networks, and efficient neural networks. My work has been recognized at conferences such as ICCV, NeurIPS, ECCV, CVPR, ASPLOS, and DAC, and in journals including Nature Methods, Nature Communications, and TNNLS, accumulating over 2,000 citations as of 2024. Additionally, I have served as a reviewer for ECCV, Nature Methods, and Science.

Before starting my PhD, I gained substantial industry experience as a senior algorithm engineer at Alibaba Group and as a researcher at the Institute for Interdisciplinary Information Core Technology (IIISCT) in Xi'an and Beijing. Earlier, I worked as a software engineer at Geonumerical Solutions.

News

Latest Updates

[April 2024]

[October 2023]

[March 2022]

[March 2021]

  • Two co-authored papers on the adversarial vulnerability of neural networks accepted to CVPR 2021.

Education

  • M.S. in Computer Engineering, Syracuse University
  • B.S. in Computer Engineering, Saint Louis University