Huiyu Wang


I am a research scientist at FAIR, Meta. Before that, I received my Ph.D. in Computer Science from Johns Hopkins University, advised by Bloomberg Distinguished Professor Alan Yuille. I obtained an M.S. in Electrical Engineering from the University of California, Los Angeles and a B.S. in Information Engineering from Shanghai Jiao Tong University. I also spent two years as a student researcher at Google, and I had wonderful summers at the Allen Institute for Artificial Intelligence and TuSimple.

My research interest is computer vision, with a focus on recognition and segmentation of images and videos.

Hiring: We are recruiting research interns interested in video understanding, generation, and vision-language. If these opportunities interest you, please email me for details.

News

  • Ego4D Goal-Step and HT-Step presented at NeurIPS 2023 in New Orleans.
  • Ego-Only, DiffMAE, and SMAUG presented remotely at ICCV 2023, Paris, France.
  • DMAE presented at CVPR 2023.
  • 3 / 3 submissions accepted to ECCV 2022.
  • 3 / 3 submissions accepted to CVPR 2022.
  • iBOT for masked image modeling with an online tokenizer is accepted to ICLR 2022.
  • DeepLab2 has been released, with MaX-DeepLab and Axial-DeepLab officially re-implemented in TensorFlow 2.
  • MaX-DeepLab, accepted to CVPR 2021, proposes Mask Xformers for end-to-end panoptic segmentation.
  • Axial-DeepLab, the first architecture with global attention in all layers, is accepted to ECCV 2020.

Selected Publications

Ego-Only: Egocentric Action Detection without Exocentric Transferring
Huiyu Wang, Mitesh Kumar Singh, Lorenzo Torresani
In International Conference on Computer Vision (ICCV), 2023
arXiv | poster | slides | video

Diffusion Models as Masked Autoencoders
Chen Wei, Karttikeya Mangalam, Po-Yao Huang, Yanghao Li, Haoqi Fan, Hu Xu, Huiyu Wang, Cihang Xie, Alan Yuille, Christoph Feichtenhofer
In International Conference on Computer Vision (ICCV), 2023
arXiv | project

k-means Mask Transformer
Qihang Yu, Huiyu Wang, Siyuan Qiao, Maxwell Collins, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen
In European Conference on Computer Vision (ECCV), 2022
arXiv | code | Google AI blog

iBOT: Image BERT Pre-Training with Online Tokenizer
Jinghao Zhou, Chen Wei, Huiyu Wang, Wei Shen, Cihang Xie, Alan Yuille, Tao Kong
In International Conference on Learning Representations (ICLR), 2022
arXiv | code

MaX-DeepLab: End-to-End Panoptic Segmentation with Mask Transformers
Huiyu Wang, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen
In Conference on Computer Vision and Pattern Recognition (CVPR), 2021
arXiv | code | poster | slides | video | Google AI blog

Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation
Huiyu Wang, Yukun Zhu, Bradley Green, Hartwig Adam, Alan Yuille, Liang-Chieh Chen
In European Conference on Computer Vision (ECCV), 2020 (Spotlight)
arXiv | official code | PyTorch code | slides | video | Google AI blog

ELASTIC: Improving CNNs with Dynamic Scaling Policies
Huiyu Wang, Aniruddha Kembhavi, Ali Farhadi, Alan Yuille, Mohammad Rastegari
In Conference on Computer Vision and Pattern Recognition (CVPR), 2019 (Oral)
arXiv | code | poster | video

Full List of Publications

Ego4D Goal-Step: Toward Hierarchical Understanding of Procedural Activities
Yale Song, Gene Byrne, Tushar Nagarajan, Huiyu Wang, Miguel Martin, Lorenzo Torresani
In Conference on Neural Information Processing Systems (NeurIPS), 2023 (Spotlight)
paper

HT-Step: Aligning Instructional Articles with How-To Videos
Triantafyllos Afouras, Effrosyni Mavroudi, Tushar Nagarajan, Huiyu Wang, Lorenzo Torresani
In Conference on Neural Information Processing Systems (NeurIPS), 2023
paper

Ego-Only: Egocentric Action Detection without Exocentric Transferring
Huiyu Wang, Mitesh Kumar Singh, Lorenzo Torresani
In International Conference on Computer Vision (ICCV), 2023
arXiv | poster | slides | video

Diffusion Models as Masked Autoencoders
Chen Wei, Karttikeya Mangalam, Po-Yao Huang, Yanghao Li, Haoqi Fan, Hu Xu, Huiyu Wang, Cihang Xie, Alan Yuille, Christoph Feichtenhofer
In International Conference on Computer Vision (ICCV), 2023
arXiv | project

SMAUG: Sparse Masked Autoencoder for Efficient Video-Language Pre-training
Yuanze Lin, Chen Wei, Huiyu Wang, Alan Yuille, Cihang Xie
In International Conference on Computer Vision (ICCV), 2023
arXiv

Masked Autoencoders Enable Efficient Knowledge Distillers
Yutong Bai, Zeyu Wang, Junfei Xiao, Chen Wei, Huiyu Wang, Alan Yuille, Yuyin Zhou, Cihang Xie
In Conference on Computer Vision and Pattern Recognition (CVPR), 2023
arXiv | code

k-means Mask Transformer
Qihang Yu, Huiyu Wang, Siyuan Qiao, Maxwell Collins, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen
In European Conference on Computer Vision (ECCV), 2022
arXiv | code | Google AI blog

CP2: Copy-Paste Contrastive Pretraining for Semantic Segmentation
Feng Wang, Huiyu Wang, Chen Wei, Alan Yuille, Wei Shen
In European Conference on Computer Vision (ECCV), 2022
arXiv | code

In Defense of Image Pre-Training for Spatiotemporal Recognition
Xianhang Li, Huiyu Wang, Chen Wei, Jieru Mei, Alan Yuille, Yuyin Zhou, Cihang Xie
In European Conference on Computer Vision (ECCV), 2022
arXiv | code

CMT-DeepLab: Clustering Mask Transformers for Panoptic Segmentation
Qihang Yu, Huiyu Wang, Dahun Kim, Siyuan Qiao, Maxwell Collins, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen
In Conference on Computer Vision and Pattern Recognition (CVPR), 2022 (Oral)
arXiv | video | Google AI blog

TubeFormer-DeepLab: Video Mask Transformer
Dahun Kim, Jun Xie, Huiyu Wang, Siyuan Qiao, Qihang Yu, Hong-Seok Kim, Hartwig Adam, In So Kweon, Liang-Chieh Chen
In Conference on Computer Vision and Pattern Recognition (CVPR), 2022
arXiv | visualization

A Simple Data Mixing Prior for Improving Self-Supervised Learning
Sucheng Ren, Huiyu Wang, Zhengqi Gao, Shengfeng He, Alan Yuille, Yuyin Zhou, Cihang Xie
In Conference on Computer Vision and Pattern Recognition (CVPR), 2022
arXiv | code

On Modeling Long-Range Dependencies for Visual Perception
Huiyu Wang
Ph.D. thesis, Johns Hopkins University, 2022
dissertation

iBOT: Image BERT Pre-Training with Online Tokenizer
Jinghao Zhou, Chen Wei, Huiyu Wang, Wei Shen, Cihang Xie, Alan Yuille, Tao Kong
In International Conference on Learning Representations (ICLR), 2022
arXiv | code

Searching for TrioNet: Combining Convolution with Local and Global Self-Attention
Huaijin Pi, Huiyu Wang, Yingwei Li, Zizhang Li, Alan Yuille
In British Machine Vision Conference (BMVC), 2021
arXiv | code

DeepLab2: A TensorFlow Library for Deep Labeling
Mark Weber*, Huiyu Wang*, Siyuan Qiao*, Jun Xie, Maxwell D. Collins, Yukun Zhu, Liangzhe Yuan, Dahun Kim, Qihang Yu, Daniel Cremers, Laura Leal-Taixe, Alan L. Yuille, Florian Schroff, Hartwig Adam, Liang-Chieh Chen
In arXiv preprint, 2021
arXiv | code

MaX-DeepLab: End-to-End Panoptic Segmentation with Mask Transformers
Huiyu Wang, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen
In Conference on Computer Vision and Pattern Recognition (CVPR), 2021
arXiv | code | poster | slides | video | Google AI blog

SpecTr: Spectral Transformer for Hyperspectral Pathology Image Segmentation
Boxiang Yun, Yan Wang, Jieneng Chen, Huiyu Wang, Wei Shen, Qingli Li
In arXiv preprint, 2021
arXiv | code

CO2: Consistent Contrast for Unsupervised Visual Representation Learning
Chen Wei, Huiyu Wang, Wei Shen, Alan Yuille
In International Conference on Learning Representations (ICLR), 2021
arXiv

Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation
Huiyu Wang, Yukun Zhu, Bradley Green, Hartwig Adam, Alan Yuille, Liang-Chieh Chen
In European Conference on Computer Vision (ECCV), 2020 (Spotlight)
arXiv | official code | PyTorch code | slides | video | Google AI blog

Scaling Wide Residual Networks for Panoptic Segmentation
Liang-Chieh Chen, Huiyu Wang, Siyuan Qiao
In arXiv preprint, 2020
arXiv | code

Rethinking Normalization and Elimination Singularity in Neural Networks
Siyuan Qiao, Huiyu Wang, Chenxi Liu, Wei Shen, Alan Yuille
In arXiv preprint, 2019
arXiv | code

Combining Compositional Models and Deep Networks For Robust Object Classification under Occlusion
Adam Kortylewski, Qing Liu, Huiyu Wang, Zhishuai Zhang, Alan Yuille
In Winter Conference on Applications of Computer Vision (WACV), 2020 (Spotlight)
arXiv

Semantic-Aware Knowledge Preservation for Zero-Shot Sketch-Based Image Retrieval
Qing Liu, Lingxi Xie, Huiyu Wang, Alan Yuille
In International Conference on Computer Vision (ICCV), 2019
arXiv | code

Weight Standardization
Siyuan Qiao, Huiyu Wang, Chenxi Liu, Wei Shen, Alan Yuille
In arXiv preprint, 2019
arXiv | code

ELASTIC: Improving CNNs with Dynamic Scaling Policies
Huiyu Wang, Aniruddha Kembhavi, Ali Farhadi, Alan Yuille, Mohammad Rastegari
In Conference on Computer Vision and Pattern Recognition (CVPR), 2019 (Oral)
arXiv | code | poster | video

Semantic Mapping for Safe and Comfortable Navigation of a Brain-Controlled Wheelchair
Zhixuan Wei, Weidong Chen, Jingchuan Wang, Huiyu Wang, Kang Li
In International Conference on Intelligent Robotics and Applications (ICIRA), 2013
paper

Novelty

Novelty (Noah) is my cat. He is a core contributor featured in the Ego How-To research by Meta:

[Photo: Noah]

Here are some images reconstructed with DiffMAE (original, masked, generated):

[Images: Noah]