Mussola: Following The Pattern: Scene Text Spotting Guided by Regular Expressions. Sergi García Bordils will defend his PhD thesis

What is the thesis about?

Scene-Text Recognition (STR) is a sub-field of computer vision that tackles the problem of localizing and recognizing text in natural images. Since scene text provides crucial semantic information for high-level tasks, continued research interest has resulted in great leaps in performance. Much of this success is thanks to the surge of deep learning, which has significantly pushed the capabilities of STR models. However, these models adopt a purely generic approach to text extraction: all text is treated alike, and the possible semantics of the textual content are ignored. We identify and study two main disadvantages that follow from this generic nature.

The first is the recognition step's reliance on vocabulary priors, which can degrade recognition performance on unseen words and morphological constructions. The second concerns the detection granularity, which we define as the boundary at which the network separates text into individual instances. Most networks establish this localization boundary at word level. If a downstream application requires textual expressions that contain spaces or line breaks, generic STR detectors will split them into separate instances.
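The granularity problem can be illustrated with a minimal sketch. It is not taken from the thesis: the pattern, the sample strings and the helper matching_instances are all hypothetical, and the list of strings merely stands in for the output of a word-level detector. The point is only that an expression containing spaces cannot be matched when each word is a separate instance.

```python
import re

# Hypothetical target expression: a label that contains spaces.
PATTERN = re.compile(r"BEST BEFORE \d{2}/\d{2}/\d{4}")

# Hypothetical output of a word-level detector: one string per instance.
word_instances = ["BEST", "BEFORE", "21/10/2024"]


def matching_instances(instances, pattern):
    """Return the instances that fully match the pattern on their own."""
    return [text for text in instances if pattern.fullmatch(text)]


# At word level, no single instance contains the whole expression.
word_level_hits = matching_instances(word_instances, PATTERN)

# If the same text were delivered as one line-level instance, it would match.
line_level_hits = matching_instances([" ".join(word_instances)], PATTERN)

print(word_level_hits)
print(line_level_hits)
```

In this toy setting the application would have to regroup the words itself before it could apply the pattern, which is the extra burden that word-level granularity places on downstream use.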

When: 21/10/2024
