Journal Article


Feature boosting with efficient attention for scene parsing

Abstract

The complexity of scene parsing grows with the number of object and scene classes, which is higher in unrestricted open scenes. The biggest challenge is to model the spatial relation between scene elements while succeeding in identifying objects at smaller scales. This paper presents a novel feature-boosting network that gathers spatial context from multiple levels of feature extraction and computes the attention weights for each level of representation to generate the final class labels. A novel ‘channel attention module’ is designed to compute the attention weights, ensuring that features from the relevant extraction stages are boosted while the others are attenuated. The model also learns spatial context information at low resolution to preserve the abstract spatial relationships among scene elements and reduce computational cost. Spatial attention is subsequently concatenated into a final feature set before applying feature boosting. Low-resolution spatial attention features are trained using an auxiliary task that helps to learn a coarse global scene structure. The proposed model outperforms all state-of-the-art models on both the ADE20K and the Cityscapes datasets.
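
The idea of weighting each extraction stage can be illustrated with a minimal sketch. It is not the authors' channel attention module, whose details are not given in this record. It assumes that each level is summarised by global average pooling, scored with a hypothetical learned vector (`scoring_vectors`), and that a softmax across levels produces the attention weights. The function names and the choice of softmax are illustrative assumptions.

```python
import math


def global_average_pool(feature_map):
    """Average each channel of a C x H x W feature map (nested lists) to one value."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def boost_levels(level_features, scoring_vectors):
    """Scale each level's feature map by an attention weight.

    level_features: one C x H x W nested list per extraction stage.
    scoring_vectors: hypothetical learned vectors, one per level, each of length C
    (an assumption made for this sketch, not taken from the paper).
    Returns (weights, boosted), where the weights sum to 1 across levels.
    """
    scores = []
    for feature_map, vector in zip(level_features, scoring_vectors):
        pooled = global_average_pool(feature_map)
        scores.append(sum(p * v for p, v in zip(pooled, vector)))
    weights = softmax(scores)
    # Levels with larger weights are boosted relative to the others.
    boosted = [
        [[[weight * value for value in row] for row in channel] for channel in feature_map]
        for weight, feature_map in zip(weights, level_features)
    ]
    return weights, boosted
```

In this sketch a level with a higher score receives a weight closer to 1, and the remaining levels are scaled down accordingly. A real network would learn the scoring parameters end to end and would likely operate on tensors rather than nested lists.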



The full-text files of this resource are currently embargoed.
Embargo end: 2025-07-24

Authors

Singh, Vivek
Sharma, Shailza
Cuzzolin, Fabio

Oxford Brookes departments

School of Engineering, Computing and Mathematics

Dates

Year of publication: 2024
Date of RADAR deposit: 2024-10-22


This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.


Related resources

This RADAR resource is the Accepted Manuscript of Feature boosting with efficient attention for scene parsing

Details

  • Owner: Daniel Croft (removed)
  • Collection: Outputs
  • Version: 1
  • Status: Live
  • Views (since Sept 2022): 49