Publications

You can also find my articles on my Google Scholar profile.

Conference Papers


DPSeg: Dual-Prompt Cost Volume Learning for Open-Vocabulary Semantic Segmentation

Published in CVPR, 2025

This paper presents DPSeg, a dual-prompt framework for open-vocabulary semantic segmentation that integrates both visual and textual prompts to generate spatial-semantic cost volumes. A multi-scale cost volume-guided decoder and a semantic-guided prompt refinement strategy are introduced to enhance spatial detail and alignment. The method significantly improves segmentation accuracy across diverse benchmarks by effectively mitigating the domain gap between image and text embeddings.

Download Paper

Crossmodal Few-shot 3D Point Cloud Semantic Segmentation via View Synthesis

Published in ACM Multimedia, 2024

This paper introduces a cross-modal few-shot approach for 3D point cloud segmentation, using multi-view synthesis with color and depth inpainting to address occlusions and reduce reliance on 3D annotations. A Co-embedding Network aligns features between synthesized views and original 3D data, while a weighted prototype network enhances segmentation performance.

Download Paper

Few-Shot 3D Point Cloud Semantic Segmentation via Stratified Class-Specific Attention Based Transformer Network

Published in AAAI, 2023

This paper presents a multi-layer transformer network for few-shot 3D point cloud semantic segmentation, addressing limitations in computational complexity and fine-grained relationship learning in existing methods. By aggregating query point cloud features with class-specific support features at multiple scales and avoiding pooling, our approach fully utilizes pixel-level support features.

Download Paper

Crossmodal Few-Shot 3D Point Cloud Semantic Segmentation

Published in ACM Multimedia, 2022

This paper introduces a cross-modal few-shot approach for 3D point cloud segmentation that uses labeled 2D images instead of 3D annotations. By converting 2D images to 3D format and employing a co-embedding network, the method achieves effective segmentation through prototype-based cosine similarity, performing competitively on benchmarks with minimal labeled 2D support.

Download Paper

Leveraging Adaptive Implicit Representation Mapping for Ultra High-Resolution Image Segmentation

Published on arXiv, 2022

This paper proposes a novel Adaptive Implicit Representation Mapping (AIRM) approach for ultra-high-resolution image segmentation, addressing limitations in current CNN-based IRM methods. Our method includes an Affinity Empowered Encoder (AEE) with transformer architecture to capture long-distance semantic information and an Adaptive Implicit Representation Mapping Function (AIRMF) that dynamically translates pixel-wise features while preserving global context.

Download Paper