A Survey of Vision and Language Related Multi-modal Task

CAAI Artificial Intelligence Research, 2023

A Multi-granularity Feature Fusion Model for Pedestrian Attribute Recognition

The International Conference on Digital Image Computing: Techniques and Applications (DICTA, Oral), 2022

CrossDet++: Growing Crossline Representation for Object Detection

IEEE Transactions on Circuits and Systems for Video Technology, 2022

Real-time panoptic segmentation with relationship between adjacent pixels and boundary prediction

Neurocomputing, 2022

What Happens in Crowd Scenes: A New Dataset about Crowd Scenes for Image Captioning

IEEE Transactions on Multimedia, 2022

POS-trends Dynamic-Aware Model for Video Caption

IEEE Transactions on Circuits and Systems for Video Technology, 2021