Accurate Scene Text Detection via Scale-Aware Data Augmentation and Shape Similarity Constraint

Journal Article (2021)
Author(s)

Pengwen Dai (Chinese Academy of Sciences)

Yang Li (TU Delft - Algorithmics)

Hua Zhang (Chinese Academy of Sciences)

Jingzhi Li (Chinese Academy of Sciences)

Xiaochun Cao (Chinese Academy of Sciences)

Research Group
Algorithmics
Copyright
© 2021 Pengwen Dai, Y. Li, Hua Zhang, Jingzhi Li, Xiaochun Cao
DOI
https://doi.org/10.1109/TMM.2021.3073575
Publication Year
2021
Language
English
Volume number
24
Pages (from-to)
1883-1895
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Scene text detection has attracted increasing attention with the rapid development of deep neural networks in recent years. However, existing scene text detectors may overfit on public datasets due to limited training data, or produce inaccurate localizations for arbitrary-shape scene texts. This paper presents an arbitrary-shape scene text detection method that achieves better generalization and more accurate localization. We first propose a Scale-Aware Data Augmentation (SADA) technique to increase the diversity of training samples. SADA considers the scale variations and local visual variations of scene texts, which effectively relieves the dilemma of limited training data. At the same time, SADA enriches the training minibatch, which helps accelerate the training process. Furthermore, a Shape Similarity Constraint (SSC) technique models the global shape structure of arbitrary-shape scene texts and backgrounds from the perspective of the loss function. SSC encourages the segmentation of text or non-text in candidate boxes to be similar to the corresponding ground truth, which helps localize more accurate boundaries for arbitrary-shape scene texts. Extensive experiments demonstrate the effectiveness of the proposed techniques, with state-of-the-art performance achieved on public arbitrary-shape scene text benchmarks (e.g., CTW1500, Total-Text, and ArT).
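The shape-similarity idea in SSC can be illustrated with a Dice-style similarity between the predicted text/non-text segmentation inside a candidate box and its ground-truth mask. The sketch below is our own minimal NumPy interpretation of that general idea, not the paper's exact loss; the function name `shape_similarity_loss` and its formulation are assumptions for illustration:

```python
import numpy as np

def shape_similarity_loss(pred_mask, gt_mask, eps=1e-6):
    """Dice-style shape similarity loss between a predicted segmentation
    inside a candidate box and its ground-truth mask (illustrative sketch).
    Returns ~0 when the shapes match exactly, approaching 1 when disjoint."""
    inter = np.sum(pred_mask * gt_mask)
    union = np.sum(pred_mask) + np.sum(gt_mask)
    dice = (2.0 * inter + eps) / (union + eps)
    return 1.0 - dice

# Ground-truth text region inside a candidate box
gt = np.zeros((8, 8)); gt[2:6, 2:6] = 1.0
# A perfect prediction incurs near-zero loss
print(shape_similarity_loss(gt, gt))        # ~0.0
# A shifted prediction overlaps only partially -> larger loss
pred = np.zeros((8, 8)); pred[3:7, 3:7] = 1.0
print(shape_similarity_loss(pred, gt))      # 0.4375
```

Encouraging the predicted in-box segmentation to match the ground-truth shape in this way penalizes boundary errors globally rather than per pixel, which is consistent with the abstract's motivation for tighter boundaries on curved texts.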

Files

09405411.pdf
(pdf | 3.38 MB)
License info not available