DPSeg: Dual-Prompt Cost Volume Learning for Open-Vocabulary Semantic Segmentation
Published in CVPR, 2025
This paper presents DPSeg, a dual-prompt framework for open-vocabulary semantic segmentation that integrates visual and textual prompts to generate spatial-semantic cost volumes. It introduces a multi-scale cost volume-guided decoder and a semantic-guided prompt refinement strategy to enhance spatial detail and alignment. By mitigating the domain gap between image and text embeddings, the method significantly improves segmentation accuracy across diverse benchmarks.
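As a rough illustration of what a dual-prompt cost volume could look like, the sketch below computes the cosine similarity between every pixel embedding and every class prompt, once for text prompts and once for visual prompts, then pairs the two scores per pixel and class. This is a minimal sketch under stated assumptions and not the paper's implementation: the function names (`cost_volume`, `dual_cost_volume`), the choice of cosine similarity, and the pairing of the two volumes are all hypothetical, and plain Python lists stand in for the tensors a real model would use.

```python
import math


def _normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cost_volume(features, prompts):
    """Cosine similarity between each pixel embedding and each class prompt.

    features: H x W x D nested lists; prompts: N x D nested lists.
    Returns an H x W x N nested list.
    """
    unit_prompts = [_normalize(p) for p in prompts]
    volume = []
    for row in features:
        out_row = []
        for pixel in row:
            u = _normalize(pixel)
            out_row.append(
                [sum(a * b for a, b in zip(u, p)) for p in unit_prompts]
            )
        volume.append(out_row)
    return volume


def dual_cost_volume(features, text_prompts, visual_prompts):
    """Pair text and visual similarities per pixel and class (H x W x N x 2).

    Assumption: the two volumes are simply stacked; the paper's actual
    fusion may differ.
    """
    text_cv = cost_volume(features, text_prompts)
    vis_cv = cost_volume(features, visual_prompts)
    return [
        [
            [[t, v] for t, v in zip(t_px, v_px)]
            for t_px, v_px in zip(t_row, v_row)
        ]
        for t_row, v_row in zip(text_cv, vis_cv)
    ]


# Toy 1 x 2 feature map with embedding size D = 2 and N = 2 classes.
features = [[[1.0, 0.0], [0.0, 2.0]]]
text_prompts = [[1.0, 0.0], [0.0, 1.0]]
visual_prompts = [[2.0, 0.0], [1.0, 1.0]]
volume = dual_cost_volume(features, text_prompts, visual_prompts)
```

In this toy setup, `volume[h][w][n]` holds the pair of text and visual similarity scores for class `n` at pixel `(h, w)`; a decoder would then consume such a volume to predict per-pixel class labels.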