Event-based open-vocabulary scene parsing

Published on:

8 May 2024

Primary Category:

Computer Vision and Pattern Recognition

Paper Authors:

Lingdong Kong, Youquan Liu, Lai Xing Ng, Benoit R. Cottereau, Wei Tsang Ooi

Key Details

Transfers knowledge from images and text to events

Performs segmentation on open-ended textual queries

Uses contrastive learning and text embedding alignment

Achieves state-of-the-art accuracy with no event labels

AI-generated summary

This paper introduces OpenESS, a method for event-based semantic segmentation that accepts open-ended textual queries instead of a fixed label set. It transfers knowledge from image and text models to event data, so new categories can be segmented without retraining. Its two key techniques are contrastive learning between event features and image regions, and alignment of event embeddings with the text embeddings of the queried categories.
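The two techniques named in the summary can be illustrated with a minimal NumPy sketch. This is not the OpenESS implementation: the function names, the temperature value, the one-directional InfoNCE form of the contrastive loss, and the assumption that event and image-region features arrive paired row by row are all illustrative choices made here.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def info_nce(event_feats, image_feats, temperature=0.07):
    """Contrastive loss between paired event and image-region features.

    event_feats, image_feats: arrays of shape (N, D). Row i of each is
    assumed (hypothetically) to describe the same region. The loss is low
    when each event feature is closer to its own image region than to
    the other N - 1 regions.
    """
    e = l2_normalize(event_feats)
    v = l2_normalize(image_feats)
    logits = e @ v.T / temperature                      # (N, N) similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))


def segment_by_text(event_pixel_feats, text_feats):
    """Assign each pixel to the text query with the most similar embedding.

    event_pixel_feats: (H, W, D) per-pixel event embeddings.
    text_feats: (K, D) embeddings of K free-form text queries.
    Returns an (H, W) array of indices into the K queries.
    """
    e = l2_normalize(event_pixel_feats)
    t = l2_normalize(text_feats)
    sims = e @ t.T                                       # (H, W, K)
    return sims.argmax(axis=-1)
```

Because the class set enters only through `text_feats`, changing the queries changes the segmentation without retraining the event encoder, which is the property the summary describes as open-vocabulary.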
