I will start by introducing DTLR, our general approach for recognizing text lines, whether printed (OCR) or handwritten (HTR), written in Latin, Chinese, or ciphered characters. Most HTR methods have focused on autoregressive decoding, which predicts characters one at a time. In contrast, DTLR processes the entire line at once. Our method shows strong results across various scripts, even those typically addressed by specialized techniques. We have achieved state-of-the-art performance for Chinese script recognition on the CASIA v2 dataset and for cipher recognition on the Borg and Copiale datasets.
Finally, I will discuss my future project and briefly introduce the Learnable Typewriter (Best Paper Award ICDAR 2024), a generative approach to character analysis and recognition developed by my colleague Ioannis Siglidis from ENPC.
Short bio:
Raphael Baena is a postdoctoral researcher in computer vision at École des Ponts ParisTech (ENPC), working with Mathieu Aubry. He is part of the Imagine group at the LIGM lab. Raphael earned his PhD from IMT Atlantique in 2023, supervised by Vincent Gripon and Lucas Drumetz. His research focuses on computer vision and deep learning, with his current work particularly centered on applications to historical documents.
When: 29/10/2024