Guofeng Zhang

I am currently a third-year PhD student in the Department of Computer Science at Johns Hopkins University. I am a member of CCVL, advised by Bloomberg Distinguished Professor Dr. Alan Yuille. My current research focuses on Generative AI and 3D vision. Previously, I received my Bachelor's degree with a double major in Computer Science and Applied Mathematics from UCLA in 2023. During my undergraduate years, I had a great time with Prof. Cho-Jui Hsieh and Prof. M. Khalid Jawed.

Google Scholar / Email: zhangguofeng1123@gmail.com

I also had a great time at ByteDance as a research intern.

I am currently looking for an internship opportunity for Summer 2026. Feel free to contact me if you are interested!

Selected Publications

	HECTOR: Hybrid Editable Compositional Object References for Video Generation Guofeng Zhang, Angtian Wang, Jacob Zhiyuan Fang, Liming Jiang, Haotian Yang, Alan Yuille, and Chongyang Ma International Conference on Machine Learning (ICML), 2026 paper / bibtex A video generation pipeline that enables fine-grained, explicit control over individual scene elements by combining static image and dynamic video references with precise spatiotemporal trajectories.
	PASR: Pose-Aware 3D Shape Retrieval from Occluded Single Views Jiaxin Shi, Guofeng Zhang, Wufei Ma, Naifu Liang, Adam Kortylewski, and Alan Yuille Project Lead The IEEE/CVF Conference on Computer Vision and Pattern Recognition Findings (CVPR findings), 2026 paper / bibtex A robust 3D shape retrieval framework that uses an interpretable analysis-by-synthesis approach to jointly optimize for object shape and pose by aligning 3D features with a 2D foundation model.
	TGT: Text-Grounded Trajectories for Locally Controlled Video Generation Guofeng Zhang, Angtian Wang, Jacob Zhiyuan Fang, Liming Jiang, Haotian Yang, Bo Liu, Yiding Yang, Guang Chen, Longyin Wen, Alan Yuille, and Chongyang Ma The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026 paper / bibtex A Text-to-Video framework that pairs point trajectories with localized text descriptions to enable precise control over subject appearance and motion.
	X-LRM: X-ray Large Reconstruction Model for Extremely Sparse-View Computed Tomography Recovery in One Second Guofeng Zhang, Ruyi Zha, Hao He, Yixun Liang, Alan Yuille, Hongdong Li, and Yuanhao Cai International Conference on 3D Vision (3DV), 2025 paper / bibtex A feedforward method and a large-scale dataset for instant CT reconstruction.
	Scaling 3D Compositional Models for Robust Classification and Pose Estimation Xiaoding Yuan, Guofeng Zhang, Prakhar Kaushik, Artur Jesslen, Adam Kortylewski, and Alan Yuille International Conference on Computer Vision (ICCV), 2025* paper / bibtex Large-scale robust classification and 3D pose estimation model applicable to 200+ rigid object categories.
	Development and evaluation of a computer vision algorithm for quantification of children’s microactivities Sara Lupolt, Guofeng Zhang, Jiahao Wang, Qihao Liu, Zhang Yi, Jiawei Peng, Xingrui Wang, Xiaoding Yuan, Yin Oscar, Alan Yuille, and Keeve E. Nachman Journal of Exposure Science and Environmental Epidemiology, 2025 paper A new baby-video dataset and 3D keypoint detection of babies in real-world settings.
	ImageNet3D: Towards General-Purpose Object-Level 3D Understanding Wufei Ma, Guofeng Zhang, Qihao Liu, Guanning Zeng, Adam Kortylewski, Yaoyao Liu, and Alan Yuille Advances in Neural Information Processing Systems (NeurIPS, Dataset Track), 2024 paper / bibtex A large-scale rigid-object dataset with 3D information.
	NOVUM: Neural Object Volumes for Robust Object Classification Artur Jesslen, Guofeng Zhang, Angtian Wang, Wufei Ma, Alan Yuille, and Adam Kortylewski The European Conference on Computer Vision (ECCV), 2024 paper / bibtex Robust and fast classification and 3D pose estimation of objects in a single pass.
	Deep-CNN based Robotic Multi-Class Under-Canopy Weed Control in Precision Farming Yayun Du, Guofeng Zhang, Darren Tsang, and Mohammad Khalid Jawed International Conference on Robotics and Automation (ICRA), 2022 paper / bibtex A real-world deployable precision farming pipeline with a weed dataset.