Jiahui Zhang

Research Assistant · Computer Vision Lab · University of Virginia

  • School of Engineering & Applied Science · Department of Computer Science
  • Email: jiahui9923 [at] gmail [dot] com
About Me

I am a Research Assistant in the Computer Vision Lab at the University of Virginia, working under the guidance of Prof. Zezhou Cheng. My research focuses on 3D computer vision, particularly multi-frame scene understanding, geometry-grounded perception, and transformer-based approaches for large-scale spatial reasoning. Broadly, I am interested in how visual systems integrate temporal, geometric, and structural information to form coherent 3D representations of complex environments. I aim to advance fundamental methods in visual scene modeling and contribute to applications in ecology, robotics, and large-scale environmental analysis.

Recent Updates
  • Expected to graduate from the University of Virginia with an M.S. in Computer Science.

  • Started working as a Research Assistant in the Computer Vision Lab at UVA.

  • Graduated from Portland State University.

Projects

Multi-Frame 3D Perception for Large-Scale Video Sequences

An end-to-end multi-frame 3D perception system developed to leverage temporal consistency across large video datasets. The pipeline introduces SE(3)-based sequence sampling, high-throughput data processing for the CA-1M dataset, and a unified training framework that integrates multi-objective losses and Hungarian matching. The system establishes a foundation for robust temporal reasoning and large-scale 3D scene reconstruction. [Preparing for submission to ECCV 2026]

3D Perception · Temporal Consistency · SE(3) · Hungarian Matching
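
To make the Hungarian matching step mentioned above concrete, here is a minimal sketch of how ground-truth objects might be assigned one-to-one to predicted queries. It is an illustration only, not the project's implementation: the cost (L1 distance between 3D centers) and the names `l1_center_cost` and `hungarian_match` are assumptions, and the project description does not specify the actual cost terms.

```python
def l1_center_cost(gt_centers, pred_centers):
    """Hypothetical matching cost: L1 distance between 3D centers.

    Rows index ground-truth objects, columns index predictions.
    """
    return [
        [sum(abs(g - p) for g, p in zip(gt, pred)) for pred in pred_centers]
        for gt in gt_centers
    ]


def hungarian_match(cost):
    """Minimum-cost assignment of every row to a distinct column.

    Classic O(n^2 * m) Hungarian algorithm with row/column potentials.
    Requires len(rows) <= len(columns). Returns a list where entry i is
    the column index assigned to row i.
    """
    n = len(cost)
    m = len(cost[0]) if n else 0
    if n > m:
        raise ValueError("need at least as many columns as rows")
    inf = float("inf")
    u = [0.0] * (n + 1)    # row potentials (1-indexed)
    v = [0.0] * (m + 1)    # column potentials (1-indexed)
    p = [0] * (m + 1)      # p[j] = row currently matched to column j (0 = none)
    way = [0] * (m + 1)    # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0 = p[j0]
            delta = inf
            j1 = 0
            for j in range(1, m + 1):
                if used[j]:
                    continue
                cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                if cur < minv[j]:
                    minv[j] = cur
                    way[j] = j0
                if minv[j] < delta:
                    delta = minv[j]
                    j1 = j
            # Shift potentials so the next cheapest edge becomes tight.
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:  # reached a free column: augmenting path found
                break
        # Flip the matching along the augmenting path.
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [-1] * n
    for j in range(1, m + 1):
        if p[j] != 0:
            assignment[p[j] - 1] = j - 1
    return assignment


# Two ground-truth objects, three predicted queries (made-up coordinates).
gt = [(0, 0, 0), (5, 5, 5)]
pred = [(5, 5, 4), (9, 9, 9), (0, 1, 0)]
matches = hungarian_match(l1_center_cost(gt, pred))
print(matches)  # [2, 0]: gt 0 pairs with pred 2, gt 1 with pred 0
```

In set-prediction training of this kind, unmatched predictions (here, prediction 1) are typically treated as "no object" when the losses are computed. Whether this project does the same is not stated in the description.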

Transformer-Based Tree Point Cloud Segmentation

A transformer-based framework designed for joint semantic and instance segmentation of large-scale tree point clouds. The approach incorporates hierarchical geometric priors that reflect biological tree structure and employs a sparse 3D transformer to model long-range spatial dependencies. A query-based decoding head enables scalable instance prediction without clustering, offering an efficient solution for forestry-scale 3D understanding.

Transformer · Point Cloud · 3D Segmentation · Geometric Priors