Event-based open-vocabulary scene parsing

Paper Title: OpenESS: Event-based Semantic Scene Understanding with Open Vocabularies

Published on: 8 May 2024

Primary Category: Computer Vision and Pattern Recognition

Paper Authors: Lingdong Kong, Youquan Liu, Lai Xing Ng, Benoit R. Cottereau, Wei Tsang Ooi

Key Details

• Transfers knowledge from images and text to events

• Performs segmentation on open-ended textual queries

• Uses contrastive learning and text embedding alignment

• Achieves state-of-the-art accuracy with no event labels

Topics: contrastive learning, event-based segmentation, multimodal embeddings, open-ended queries, transfer learning

AI-generated summary

This paper introduces OpenESS, a method for event-based semantic segmentation driven by open-ended textual queries rather than a fixed label set. It transfers knowledge from image and text models to event data, so new categories can be segmented without retraining and without event labels. The key techniques are contrastive learning between event features and image regions, and alignment of event embeddings with text embeddings.
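
The two techniques named in the summary can be illustrated with a small NumPy sketch. This is not the paper's implementation: the summary does not specify the loss or the architecture, so the sketch assumes a standard InfoNCE-style contrastive objective and cosine-similarity matching against text embeddings. The function names (`info_nce`, `segment_with_text`), the temperature value, and the feature shapes are all hypothetical choices made for illustration.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    # Scale vectors to unit length so dot products become cosine similarities.
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def info_nce(event_feats, image_feats, temperature=0.07):
    """Assumed contrastive objective (InfoNCE), not confirmed by the source.

    event_feats, image_feats: (N, D) arrays. Row i of event_feats is treated
    as the positive match for row i of image_feats; all other rows act as
    negatives.
    """
    e = l2_normalize(event_feats)
    v = l2_normalize(image_feats)
    logits = e @ v.T / temperature                     # (N, N) similarity matrix
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))                 # pull matched pairs together

def segment_with_text(event_pixel_feats, text_feats):
    """Assign each pixel the text query with the highest cosine similarity.

    event_pixel_feats: (H, W, D) per-pixel event embeddings.
    text_feats: (K, D) embeddings of K free-form text queries.
    Returns an (H, W) array of query indices.
    """
    e = l2_normalize(event_pixel_feats)
    t = l2_normalize(text_feats)
    sim = e @ t.T                                      # (H, W, K)
    return sim.argmax(axis=-1)

# Toy usage with random stand-in features (no real encoders involved).
rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 16))
loss_matched = info_nce(feats, feats)          # event and image features agree
loss_mismatched = info_nce(feats, feats[::-1])  # pairs deliberately scrambled

text = np.eye(3, 4)                            # three hypothetical query embeddings
pixels = text[np.array([[0, 1], [2, 0]])]      # 2x2 "image" built from the queries
labels = segment_with_text(pixels, text)
```

Because the class set enters only through `text_feats`, changing the queries at inference time changes the label space without touching the event encoder, which is the sense in which such a model can segment new categories without retraining.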