Fourier Feature-based CBAM and Vision Transformer for Text Detection in Drone Images

Roy, Ayush and Shivakumara, Palaiahnakote and Pal, Umapada and Mokayed, Hamam and Liwicki, Marcus (2023) Fourier Feature-based CBAM and Vision Transformer for Text Detection in Drone Images. In: Document Analysis and Recognition - ICDAR 2023 Workshops, Pt II, 24-26 August 2023, San Jose, California.

Full text not available from this repository.
Official URL: https://doi.org/10.1007/978-3-031-41501-2_18

Abstract

The use of drones in real-world applications such as monitoring, surveillance, and security is growing rapidly. Most existing scene text detection methods were developed for normal scene images. This work aims to develop a model for detecting text in drone images as well as scene images. To reduce the adverse effects of degradation in drone images, we explore the combination of the Fourier transform and the Convolutional Block Attention Module (CBAM) to enhance the degraded information in the images without affecting high-contrast images. This combination helps to extract prominent features that represent text irrespective of degradations. The refined features extracted from the Fourier Contouring Network (FCN) are then supplied to a Vision Transformer, which uses ResNet50 as a backbone and an encoder-decoder, for text detection in both drone and scene images. Hence, the model is called the Fourier Transform based Transformer. Experimental results on drone datasets and on the natural scene text detection benchmarks Total-Text and ICDAR 2015 show that the proposed model is effective and outperforms state-of-the-art models.

Item Type: Conference or Workshop Item (Paper)
Funders: Technology Innovation Hub (TIH), Indian Statistical Institute, Kolkata
Additional Information: 17th International Conference on Document Analysis and Recognition Workshop (ICDAR), San Jose, CA, AUG 24-26, 2023
Uncontrolled Keywords: Scene text detection; Drone images; Deep learning; Transformer; Detection transformer
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Faculty of Computer Science & Information Technology
Depositing User: Ms. Juhaida Abd Rahim
Date Deposited: 16 Jan 2025 08:09
Last Modified: 16 Jan 2025 08:09
URI: http://eprints.um.edu.my/id/eprint/47679
