MAIN CONFERENCE
The main conference will take place from September 8-10, 2021, at the Beaulieu Lausanne Convention Center.
The detailed program is below. Please note that minor changes may still be made. A program overview can be downloaded here, a flyer is available here, and the program booklet is available here.
Papers accepted into the ICDAR-IJDAR journal track will appear in a special issue of IJDAR to be published online simultaneously with the opening of the conference and will receive an oral presentation slot at ICDAR 2021.
Wednesday, 8 September 2021
Time | Event | Location |
08:00 | Registration & Coffee | |
08:45-09:15 | Opening | Room “Rome” (Third floor) |
09:15-10:00 | Award Keynote (IAPR/ICDAR Outstanding Achievement): Masaki Nakagawa | Room “Rome” (Third floor) |
10:00-10:30 | Award Keynote (IAPR/ICDAR Young Investigator): Mickaël Coustaty | Room “Rome” (Third floor) |
10:30-11:00 | Coffee Break | |
11:00-12:40 | Oral Session 1: Journal Track 1 | Room “Rome” (Third floor) |
11:00-12:40 | Oral Session 2: Journal Track 2 | Room “St-Moritz” (Second floor) |
12:40-14:00 | Lunch | Forum (Third floor) |
14:00-16:00 | Oral Session 3: Historical Document Analysis 1 | Room “Rome” (Third floor) |
14:00-16:00 | Oral Session 4: Document Analysis Systems | Room “St-Moritz” (Second floor) |
16:00-16:30 | Coffee Break | |
16:30-17:15 | TC10/11 Meeting | Room “Rome” (Third floor) |
18:00 | Reception and Gala Dinner | Olympic Museum |
Thursday, 9 September 2021
Time | Event | Location |
08:00 | Registration & Coffee | |
08:30-10:00 | Poster: Session 1 | Room “Barcelone” (Second floor) |
10:00-10:30 | Coffee Break | |
10:30-12:00 | Poster: Session 2 | Room “Barcelone” (Second floor) |
12:00-13:00 | Lunch | Forum (Second floor) |
13:00-14:00 | Keynote: Prem Natarajan | Room “Rome” (Third floor) |
14:00-14:30 | Coffee Break | |
14:30-16:30 | Oral Session 5: Handwriting Recognition | Room “Rome” (Third floor) |
14:30-16:30 | Oral Session 6: Scene Text Detection and Recognition | Room “St-Moritz” (Second floor) |
16:30-17:00 | Coffee Break | |
17:00-17:45 | Industrial Session |
Friday, 10 September 2021
Time | Event | Location |
08:30 | Registration & Coffee | |
09:00-10:00 | Keynote: Beáta Megyesi | Room “Rome” (Third floor) |
10:00-10:30 | Coffee Break | |
10:30-12:00 | Oral Session 7: Historical Document Analysis 2 | Room “Rome” (Third floor) |
10:30-12:00 | Oral Session 8: Document Image Processing | Room “St-Moritz” (Second floor) |
12:00-14:00 | Lunch | Forum (Third floor) |
14:00-15:30 | Oral Session 9: NLP for Document Understanding | Room “Rome” (Third floor) |
14:00-15:30 | Oral Session 10: Graphics, Diagram, and Math Recognition | Room “St-Moritz” (Second floor) |
15:30-16:00 | Coffee Break | |
16:00-16:45 | Competition Session | Room “Rome” (Third floor) |
16:45-17:00 | ICDAR Awards Ceremony & Closing | Room “Rome” (Third floor) |
Oral Sessions
Oral Session 1: Journal Track 1
Chairs: Paul Maergner, Mickael Coustaty, György Kovacs, Sana Sabah
11:00-12:40, Wednesday 8
Room “Rome”, 3rd floor
ID | Title | Authors |
O1.1 | Learning from similarity and information extraction from structured documents | Martin Holeček |
O1.2 | Learning-free Pattern Detection for Manuscript Research: An Efficient Approach Toward Making Manuscript Images Searchable | Hussein Mohammed, Volker Märgner and Giovanni Ciotti |
O1.3 | A two-step framework for text line segmentation in historical Arabic and Latin document images | Olfa Mechi, Maroua Mehri, Rolf Ingold and Najoua Essoukri Ben Amara |
O1.4 | Self-Supervised Deep Metric Learning for ancient papyrus fragments retrieval | Antoine Pirrone, Marie Beurton-Aimar and Nicholas Journet |
O1.5 | Data Augmentation using Geometric, Frequency, and Beta Modeling approaches for Improving Multi-lingual Online Handwriting Recognition | Yahia Hamdi, Houcine Boubaker and Adel Alimi |
Oral Session 2: Journal Track 2
Chairs: Jean-Christophe Burie, Hubert Cardot & Tosin Adewumi
11:00-12:40, Wednesday 8
Room “St Moritz”, 2nd floor
ID | Title | Authors |
O2.1 | EAPML: Ensemble Self-Attention-based Positive Mutual Learning Network for Document Image Classification | Souhail Bakkali, Zuheng Ming, Mickael Coustaty and Marcal Rusinol |
O2.2 | Beyond Document Object Detection: Instance-Level Segmentation of Complex Layouts | Sanket Biswas, Pau Riba, Josep Llados and Umapada Pal |
O2.3 | Asking Questions on Handwritten Document Collections | Minesh Mathew, Lluis Gomez, Dimosthenis Karatzas and C V Jawahar |
O2.4 | Revealing a History: Palimpsest Text Separation with Generative Networks | Anna Starynska, David Messinger and Yu Kong |
Oral Session 3: Historical Document Analysis 1
Chairs: Bertrand Couasnon, Enrique Vidal, Nosheen Abid & Konstantina Nikolaidou
14:00-16:00, Wednesday 8
Room “Rome”, 3rd floor
ID | Title | Authors |
O3.1 | BoundaryNet: An Attentive Deep Network with Fast Marching Distance Maps for Semi-automatic Layout Annotation | Abhishek Trivedi and Ravi Kiran Sarvadevabhatla |
O3.2 | Pho(SC)Net: An Approach Towards Zero-shot Word Image Recognition in Historical Documents | Anuj Rai, Narayanan C. Krishnan and Sukalpa Chanda |
O3.3 | Versailles-FP dataset: Wall Detection in Ancient Floor Plans | Wassim Swaileh, Dimitrios Kotzinos, Suman Ghosh, Michel Jordan, Ngoc-Son Vu and Yaguan Qian |
O3.4 | Graph Convolutional Neural Networks for Learning Attribute Representations for Word Spotting | Fabian Wolf, Andreas Fischer and Gernot A. Fink |
O3.5 | Context Aware Generation of Cuneiform Signs | Kai Brandenbusch, Eugen Rusakov and Gernot A. Fink |
O3.6 | Adaptive Scaling for Archival Table Structure Recognition | Xiao-Hui Li, Fei Yin, Xu-Yao Zhang and Cheng-Lin Liu |
Oral Session 4: Document Analysis Systems
Chairs: Jean-Yves Ramel, Joseph Chazalon, Prakash Chandra Chhipa & Rajkumar Saini
14:00-16:00, Wednesday 8
Room “St Moritz”, 2nd floor
ID | Title | Authors |
O4.1 | LGPMA: Complicated Table Structure Recognition with Local and Global Pyramid Mask Alignment | Liang Qiao, Zaisheng Li, Zhanzhan Cheng, Peng Zhang, Shiliang Pu, Yi Niu, Wenqi Ren, Wenming Tan and Fei Wu |
O4.2 | VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations | Peng Zhang, Can Li, Liang Qiao, Zhanzhan Cheng, Shiliang Pu, Yi Niu and Fei Wu |
O4.3 | LayoutParser: A Unified Toolkit for Deep Learning Based Document Image Analysis | Zejiang Shen, Ruochen Zhang, Melissa Dell, Benjamin Charles Germain Lee, Jacob Carlson and Weining Li |
O4.4 | Understanding and Mitigating the Impact of Model Compression for Document Image Classification | Shoaib Ahmed Siddiqui, Andreas Dengel and Sheraz Ahmed |
O4.5 | Hierarchical and Multimodal Classification of Images from Soil Remediation Reports | Korlan Rysbayeva, Romain Giot and Nicholas Journet |
O4.6 | Competition and Collaboration in Document Analysis and Recognition | Daniel Lopresti and George Nagy |
Oral Session 5: Handwriting Recognition
Chairs: Eric Anquetil, Robert Sabourin, Hamam Mokayed, Rajkumar Saini
14:30-16:30, Thursday 9
Room “Rome”, 3rd floor
ID | Title | Authors |
O5.1 | 2D Self-Attention Convolutional Recurrent Network for Offline Handwritten Text Recognition | Nam Tuan Ly, Hung Tuan Nguyen and Masaki Nakagawa |
O5.2 | Handwritten Text Recognition with Convolutional Prototype Network and Most Aligned Frame Based CTC Training | Likun Gao, Heng Zhang and Cheng-Lin Liu |
O5.3 | Online Spatio-Temporal 3D Convolutional Neural Network for Early Recognition of Handwritten Gestures | William Mocaër, Eric Anquetil and Richard Kulpa |
O5.4 | Mix-Up Augmentation for Oracle Character Recognition with Imbalanced Data Distribution | Jing Li, Qiu-Feng Wang, Rui Zhang and Kaizhu Huang |
O5.5 | Radical Composition Network for Chinese Character Generation | Mobai Xue, Jun Du, Jianshu Zhang, Zi-Rui Wang, Bin Wang and Bo Ren |
O5.6 | SmartPatch: Improving Handwritten Word Imitation with Patch Discriminators | Alexander Mattick, Martin Mayr, Mathias Seuret, Andreas Maier and Vincent Christlein |
Oral Session 6: Scene Text Detection and Recognition
Chairs: Dimos Karatzas, Jean-Marc Ogier, Tosin Adewumi & Hamam Mokayed
14:30-16:30, Thursday 9
Room “St Moritz”, 2nd floor
ID | Title | Authors |
O6.1 | Reciprocal Feature Learning via Explicit and Implicit Tasks in Scene Text Recognition | Hui Jiang, Yunlu Xu, Zhanzhan Cheng, Shiliang Pu, Yi Niu, Wenqi Ren, Fei Wu and Wenming Tan |
O6.2 | Text Detection by Jointly Learning Character and Word Regions | Deyang Wu, Xingfei Hu, Zhaozhi Xie, Haiyan Li, Usman Ali and Hongtao Lu |
O6.3 | Vision Transformer for Fast and Efficient Scene Text Recognition | Rowel Atienza |
O6.4 | Look, Read and Ask: Learning to Ask Questions by Reading Text in Images | Soumya Jahagirdar, Shankar Gangisetty and Anand Mishra |
O6.5 | CATNet: Scene Text Recognition Guided by Concatenating Augmented Text Features | Ziyin Zhang, Lemeng Pan, Lin Du, Qingrui Li and Ning Lu |
O6.6 | Explore Hierarchical Relations Reasoning and Global Information Aggregation | Lei Li, Chun Yuan and Kai Fan |
Oral Session 7: Historical Document Analysis 2
Chairs: Thierry Paquet, Laurence Likforman, Konstantina Nikolaidou & Nosheen Abid
10:30-12:00, Friday 10
Room “Rome”, 3rd floor
ID | Title | Authors |
O7.1 | One-Model Ensemble-Learning for Text Recognition of Historical Printings | Christoph Wick and Christian Reul |
O7.2 | On the use of attention in deep learning based denoising method for ancient Cham inscription images | Tien-Nam Nguyen, Jean-Christophe Burie, Thi-Lan Le and Anne-Valerie Schweyer |
O7.3 | Visual FUDGE: Form Understanding via Dynamic Graph Editing | Brian Davis, Bryan Morse, Brian Price, Chris Tensmeyer and Curtis Wiginton |
O7.4 | Annotation-Free Character Detection in Historical Vietnamese Stele Images | Anna Scius-Bertrand, Michael Jungo, Beat Wolf, Andreas Fischer and Marc Bui |
Oral Session 8: Document Image Processing
Chairs: Elisa Barney Smith, Jihad El-Sana, Prakash Chandra Chhipa, Hamam Mokayed
10:30-12:00, Friday 10
Room “St Moritz”, 2nd floor
ID | Title | Authors |
O8.1 | DocReader: Bounding-Box Free Training of a Document Information Extraction Model | Shachar Klaiman and Marius Lehne |
O8.2 | Document Dewarping with Control Points | Guo-Wang Xie, Fei Yin, Xu-Yao Zhang and Cheng-Lin Liu |
O8.3 | Unknown-box Approximation to Improve Optical Character Recognition Performance | Ayantha Randika, Nilanjan Ray, Xiao Xiao and Allegra Latimer |
O8.4 | Document Domain Randomization for Deep Learning Document Layout Extraction | Meng Ling, Jian Chen, Torsten Möller, Petra Isenberg, Tobias Isenberg, Michael Sedlmair, Robert S. Laramee, Han-Wei Shen, Jian Wu and C. Lee Giles |
Oral Session 9: NLP for Document Understanding
Chairs: David Smith, Joan Andreu Sanchez & Tosin Adewumi
14:00-15:30, Friday 10
Room “Rome”, 3rd floor
ID | Title | Authors |
O9.1 | Distilling the Documents for Relation Extraction by Topic Segmentation | Minghui Wang, Ping Xue, Ying Li and Zhonghai Wu |
O9.2 | LAMBERT: Layout-Aware Language Modeling for Information Extraction | Łukasz Garncarek, Rafał Powalski, Tomasz Stanisławek, Bartosz Topolski, Piotr Halama, Michał Turski and Filip Graliński |
O9.3 | ViBERTgrid: A Jointly Trained Multi-Modal 2D Document Representation for Key Information Extraction from Documents | Weihong Lin, Qifang Gao, Lei Sun, Zhuoyao Zhong, Kai Hu, Qin Ren and Qiang Huo |
O9.4 | Kleister: Key Information Extraction Datasets Involving Long Documents with Complex Layouts | Tomasz Stanisławek, Filip Graliński, Anna Wróblewska, Dawid Lipiński, Agnieszka Kaliska, Paulina Rosalska, Bartosz Topolski and Przemysław Biecek |
Oral Session 10: Graphics, Diagram, and Math Recognition
Chairs: Rajkumar Saini, Lama Alkhaled, Harold Mouchere, Christophe Rigaud, Pedro Alonso & Saleha Javed
14:00-15:30, Friday 10
Room “St Moritz”, 2nd floor
ID | Title | Authors |
O10.1 | Towards an efficient framework for Data Extraction from Chart Images | Weihong Ma, Hesuo Zhang, Shuang Yan, Guangshun Yao, Yichao Huang, Hui Li, Yaqiang Wu and Lianwen Jin |
O10.2 | Geometric Object 3D Reconstruction from Single Line Drawing Image Based on a Network for Classification and Sketch Extraction | Zhuoying Wang, Qingkai Fang and Yongtao Wang |
O10.3 | DiagramNet: Hand-drawn Diagram Recognition using Visual Arrow-relation Detection | Bernhard Schäfer and Heiner Stuckenschmidt |
O10.4 | Formula Citation Graph Based Mathematical Information Retrieval | Ke Yuan, Liangcai Gao, Zhuoren Jiang and Zhi Tang |
Poster Sessions
Poster Session 1
08:30-10:00, Thursday 9
Room “Barcelone”, 2nd floor
ID | Authors | Title |
P1-1 | Rongyu Cao, Hongwei Li, Ganbin Zhou and Ping Luo | Towards Document Panoptic Segmentation with Pinpoint Accuracy: Method and Evaluation |
P1-2 | Ayush Kumar Shah, Abhisek Dey and Richard Zanibbi | A Math Formula Extraction and Evaluation Framework for PDF Documents |
P1-3 | Laura E. Brandt and William T. Freeman | Toward Automatic Interpretation of 3D Plots |
P1-4 | Marta Vicente, Robiert Sepúlveda-Torres, Cristina Barros, Estela Saquete and Elena Lloret | Can Text Summarization Enhance the Headline Stance Detection Task? Benefits and Drawbacks |
P1-5 | Justin Wood, Wei Wang and Corey Arnold | The Biased Coin Flip Process for Nonparametric Topic Modeling |
P1-6 | Sayali Kulkarni, Sheide Chammas, Wan Zhu, Fei Sha and Eugene Ie | CoMSum and SIBERT: A Dataset and Neural Model for Query-Based Multi-Document Summarization |
P1-7 | Tonghua Su, Shuchen Liu and Shengjie Zhou | RTNet: An End-to-End Method for Handwritten Text Image Translation |
P1-8 | Ziyi Zhu, Liangcai Gao, Yibo Li, Yilun Huang, Lin Du, Ning Lu and Xianfeng Wang | NTable: A Dataset for Camera-based Table Detection |
P1-9 | Tianqi Ji, Jun Li and Jianhua Xu | Label Selection Algorithm Based on Boolean Interpolative Decomposition with Sequential Backward Selection for Multi-label Classification |
P1-10 | Huy Quang Ung, Cuong Tuan Nguyen, Hung Tuan Nguyen and Masaki Nakagawa | GSSF: A Generative Sequence Similarity Function based on a Seq2Seq model for clustering online handwritten mathematical answers |
P1-11 | Vaibhavi Gupta, Vinay Detani, Vivek Khokar and Chiranjoy Chattopadhyay | C2VNet: A Deep Learning Framework Towards Comic Strip to Audio-Visual Scene Synthesis |
P1-12 | Jie He, Xingjiao Wu, Wenxin Hu and Jing Yang | LSTMVAEF: Vivid Layout via LSTM-based Variational Autoencoder Framework |
P1-13 | Andrii Grygoriev, Illya Degtyarenko, Ivan Deriuga, Serhii Polotskyi, Volodymyr Melnyk, Dmytro Zakharchuk and Olga Radyvonenko | HCRNN: A Novel Architecture for Fast Online Handwritten Stroke Classification |
P1-14 | Daniil Matalov, Elena Limonova, Natalya Skoryukina and Vladimir V. Arlazarov | RFDoc: memory efficient local descriptors for ID documents localization and classification |
P1-15 | Haibo Qin, Chun Yang, Xiaobin Zhu and Xucheng Yin | Dynamic Receptive Field Adaptation for Attention-Based Text Recognition |
P1-16 | Ryota Yoshihashi, Tomohiro Tanaka, Kenji Doi, Takumi Fujino and Naoaki Yamashita | Context-Free TextSpotter for Real-Time and Mobile End-to-End Text Detection and Recognition |
P1-17 | Yulia Chernyshova, Ekaterina Emelianova, Alexander Sheshkus and Vladimir V. Arlazarov | MIDV-LAIT: a challenging dataset for recognition of IDs with Perso-Arabic, Thai, and Indian scripts |
P1-18 | Konstantin Bulatov and Vladimir V. Arlazarov | Determining optimal frame processing strategies for real-time document recognition systems |
P1-19 | Eugen Rusakov, Turna Somel, Gerfrid G.W. Müller and Gernot A. Fink | Embedded Attributes for Cuneiform Sign Spotting |
P1-20 | Adrià Molina, Pau Riba, Lluis Gomez, Oriol Ramos-Terrades and Josep Lladós | Date Estimation in the Wild of Scanned Historical Photos: An Image Retrieval Approach |
P1-21 | Muhammad Osama Zeeshan, Imran Siddiqi and Momina Moetesum | Two-Step Fine-Tuned Convolutional Neural Networks for Multi-Label Classification of Children’s Drawings |
P1-22 | Tamal Chowdhury, Palaiahnakote Shivakumara, Umapada Pal, Tong Lu, Ramachandra Raghavendra and Sukalpa Chanda | DCINN: Deformable Convolution and Inception Based Neural Network for Tattoo Text Detection through Skin Region |
P1-23 | Fatma Najar and Nizar Bouguila | Sparse Document Analysis using Beta-Liouville Naive Bayes with Vocabulary Knowledge |
P1-24 | Sk Md Obaidullah, Mridul Ghosh, Himadri Mukherjee, Kaushik Roy and Umapada Pal | Automatic Signature-based Writer Identification in Mixed-script Scenarios |
P1-25 | Pau Riba, Adrià Molina, Lluis Gomez, Oriol Ramos-Terrades and Josep Lladós | Learning to Rank Words: Optimizing Ranking Metrics for Word Spotting |
P1-26 | Trung Tan Ngo, Hung Tuan Nguyen and Masaki Nakagawa | A-VLAD: An End-to-End Attention-based Neural Network for Writer Identification in Historical Documents |
P1-27 | Nhu-Van Nguyen, Christophe Rigaud, Arnaud Revel and Jean-Christophe Burie | Manga-MMTL: multimodal multitask transfer learning for manga character analysis |
P1-28 | Enrique Vidal and Alejandro H. Toselli | Probabilistic Indexing and Search for Hyphenated Words |
P1-29 | Sieben Bocklandt, Gust Verbruggen and Thomas Winters | SandSlide: Automatic Slideshow Normalization |
P1-30 | Alejandro H. Toselli, Si Wu and David A. Smith | Digital Editions as Distant Supervision for Layout Analysis of Printed Books |
P1-31 | S P Sharan, Sowmya Aitha, Amandeep Kumar, Abhishek Trivedi, Aaron Augustine and Ravi Kiran Sarvadevabhatla | Palmira: A Deep Deformable Network for Instance Segmentation of Dense and Uneven Layouts in Handwritten Manuscripts |
P1-32 | Oldřich Kodym and Michal Hradiš | Page Layout Analysis System for Unconstrained Historic Documents |
P1-33 | Jose Ramón Prieto and Enrique Vidal | Improved Graph Methods for Table Layout Understanding |
P1-34 | Berat Kurar Barakat, Ahmad Droby, Raid Saabni and Jihad El-Sana | Unsupervised learning of text line segmentation by differentiating coarse patterns |
P1-35 | Yibo Li, Yilun Huang, Ziyi Zhu, Lemeng Pan, Yongshuai Huang, Lin Du, Zhi Tang and Liangcai Gao | Rethinking Table Structure Recognition Using Sequence Labeling Methods |
P1-36 | Harsh Desai, Pratik Kayal and Mayank Singh | TabLeX: A Benchmark Dataset for Structure and Content Information Extraction from Scientific Tables |
P1-37 | Wenqi Zhao, Liangcai Gao, Zuoyu Yan, Shuai Peng, Lin Du and Ziyin Zhang | Handwritten Mathematical Expression Recognition with Bidirectionally Trained Transformer |
P1-38 | Umar Khan, Sohaib Zahid, Muhammad Asad Ali, Adnan Ul-Hasan and Faisal Shafait | TabAug: Data Driven Augmentation for Enhanced Table Structure Recognition |
P1-39 | Haisong Ding, Kai Chen and Qiang Huo | An Encoder-Decoder Approach to Handwritten Mathematical Expression Recognition with Multi-Head Attention and Stacked Decoder |
P1-40 | Cuong Tuan Nguyen, Thanh-Nghia Truong, Hung Tuan Nguyen and Masaki Nakagawa | Global Context for improving recognition of Online Handwritten Mathematical Expressions |
P1-41 | Koji Ichikawa | Image-based Relation Classification Approach for Table Structure Recognition |
P1-42 | Shuai Peng, Liangcai Gao, Ke Yuan and Zhi Tang | Image to LaTeX with Graph Neural Network for Mathematical Formula Recognition |
P1-43 | Badal Agrawal, Mohit Mishra and Varun Parashar | A Novel Method for Automated Suggestion of Similar Software Incidents using 2-Stage Filtering: Findings on Primary Data |
P1-44 | Lianxi Wang, Xiaotian Lin and Nankai Lin | Research on pseudo-label technology for multi-label news classification |
P1-45 | Ahmed Hamdi, Elodie Carel, Aurélie Joseph, Mickael Coustaty and Antoine Doucet | Information Extraction from Invoices |
P1-46 | Apoorva Singh and Sriparna Saha | Are You Really Complaining? A Multi-task Framework for Complaint Identification, Emotion and Sentiment Classification |
P1-47 | Rafał Powalski, Łukasz Borchmann, Dawid Jurkiewicz, Tomasz Dwojak, Michał Pietruszka and Gabriela Pałka | Going Full-TILT Boogie on Document Understanding with Text-Image-Layout Transformer |
P1-48 | Luisa März, Stefan Schweter, Nina Poerner, Benjamin Roth and Hinrich Schütze | Data Centric Domain Adaptation for Historical Text with OCR Errors |
P1-49 | Nafaa Haffar, Rami Ayadi, Emna Hkiri and Mounir Zrigui | Temporal Ordering of Events via Deep Neural Networks |
P1-50 | Rubèn Tito, Dimosthenis Karatzas and Ernest Valveny | Document Collection Visual Question Answering |
P1-51 | Jiří Martínek, Pavel Král and Ladislav Lenc | Dialogue Act Recognition using Visual Information |
P1-52 | Oliver Tüselmann, Fabian Wolf and Gernot A. Fink | Are End-to-End Systems Really Necessary for NER on Handwritten Document Images? |
P1-53 | Harsh Kohli | Training Bi-Encoders for Word Sense Disambiguation |
P1-54 | Freddy C. Chua and Nigel P. Duffy | DeepCPCFG: Deep Learning and Context Free Grammars for End-to-End Information Extraction |
P1-55 | Djedjiga Belhadj, Yolande Belaïd and Abdel Belaïd | Consideration of the word’s neighborhood in GATs for information extraction in semi-structured documents |
P1-56 | Paola A. Buitrago, Evgeny Toropov, Rajanie Prabha, Julian Uran and Raja Adal | MiikeMineStamps: A Long-Tailed Dataset of Japanese Stamps via Active Learning |
P1-57 | Romain Carletto, Hubert Cardot and Nicolas Ragot | Deep Learning for Document Layout Generation: A First Reproducible Quantitative Evaluation and a Baseline Model |
P1-58 | Jiaming Wang, Qing Wang, Jun Du, Jianshu Zhang, Bin Wang and Bo Ren | MRD: A Memory Relation Decoder for Online Handwritten Mathematical Expression Recognition |
P1-59 | Sumeet S. Singh and Sergey Karayev | Full Page Handwriting Recognition via Image to Sequence Extraction |
P1-60 | Denis Coquenet, Clément Chatelain and Thierry Paquet | SPAN: a Simple Predict & Align Network for Handwritten Paragraph Recognition |
P1-61 | Manh Tu Vu, Van Linh Le and Marie Beurton-Aimar | IHR-NomDB: The Old Degraded Vietnamese Handwritten Script Archive Database |
P1-62 | Valerii Dziubliuk, Mykhailo Zlotnyk and Oleksandr Viatchaninov | Sequence Learning Model for Syllables Recognition Arranged in Two Dimensions |
P1-63 | Christoph Wick, Jochen Zöllner and Tobias Grüning | Transformer for Handwritten Text Recognition using Bidirectional Post-Decoding |
P1-64 | Yuhao Huang, Lianwen Jin and Dezhi Peng | Zero-Shot Chinese Text Recognition via Matching Class Embedding |
P1-65 | Ryohei Tanaka, Kunio Osada and Akio Furuhata | Text-conditioned Character Segmentation for CTC-based Text Recognition |
P1-66 | Dezhi Peng, Canyu Xie, Hongliang Li, Lianwen Jin, Zecheng Xie, Kai Ding, Yichao Huang and Yaqiang Wu | Towards Fast, Accurate and Compact Online Handwritten Chinese Text Recognition |
P1-67 | Siqi Cai, Wenyuan Xue, Qingyong Li and Peng Zhao | HCADecoder: A Hybrid CTC-Attention Decoder for Chinese Text Recognition |
P1-68 | Takato Otsuzuki, Heon Song, Seiichi Uchida and Hideaki Hayashi | Meta-learning of Pooling Layers for Character Recognition |
P1-69 | Chandranath Adak, Bidyut B. Chaudhuri, Chin-Teng Lin and Michael Blumenstein | Text-line-up: Don’t Worry about the Caret |
P1-70 | Ibrahim Souleiman Mahamoud, Joris Voerman, Mickaël Coustaty, Aurélie Joseph, Vincent Poulain d’Andecy and Jean-Marc Ogier | Multimodal Attention-based Learning for Imbalanced Corporate Documents Classification |
P1-71 | Soumyadeep Dey and Pratik Jawanpuria | Light-weight Document Image Cleanup using Perceptual Loss |
Poster Session 2
10:30-12:00, Thursday 9
Room “Barcelone”, 2nd floor
ID | Authors | Title |
P2-1 | Zhenzhou Zhuang, Zonghao Liu, Kin-Man Lam, Shuangping Huang and Gang Dai | A New Semi-Automatic Annotation Model via Semantic Boundary Estimation for Scene Text Detection |
P2-2 | Brian Liu, Weicong Sun, Wenjing Kang and Xianchao Xu | Searching from the Prediction of Visual and Language Model for Handwritten Chinese Text Recognition |
P2-3 | Mohamad Wehbi, Tim Hamann, Jens Barth, Peter Kaempf, Dario Zanca and Bjoern Eskofier | Towards an IMU-based Pen Online Handwriting Recognizer |
P2-4 | Antonio Parziale, Cristina Carmona-Duarte, Miguel Angel Ferrer and Angelo Marcelli | 2D vs 3D online writer identification: a comparative study |
P2-5 | Celso A. M. Lopes Junior, Murilo C. Stodolni, Byron L. D. Bezerra and Donato Impedovo | A Handwritten Signature Segmentation Approach for Multi-resolution and Complex Documents Acquired by Multiple Sources |
P2-6 | Yu-Jie Xiong and Song-Yang Cheng | Attention based Multiple Siamese Network for Offline Signature Verification |
P2-7 | Shinnosuke Matsuo, Xiaomeng Wu, Gantugs Atarsaikhan, Akisato Kimura, Kunio Kashino, Brian Kenji Iwana and Seiichi Uchida | Attention to Warp: Deep Metric Learning for Multivariate Time Series |
P2-8 | Huaigu Cao and Wael AbdAlmageed | Customizable Camera Verification for Media Forensic |
P2-9 | Barbara Gawda | Density Parameters of Handwriting in Schizophrenia and Affective Disorders Assessed Using the Raygraf Computer Software |
P2-10 | Catherine Taleb, Laurence Likforman-Sulem and Chafic Mokbel | Language-Independent Bimodal System for Early Parkinson’s Disease Detection |
P2-11 | Taylor Archibald, Mason Poggemann, Aaron Chan and Tony Martinez | TRACE: A Differentiable Approach to Line-level Stroke Recovery for Offline Handwritten Text |
P2-12 | Arnaud Lods, Éric Anquetil and Sébastien Macé | Segmentation and graph matching for online analysis of student arithmetic operations |
P2-13 | Elmokhtar Mohamed Moussa, Thibault Lelore and Harold Mouchère | Applying End-to-end Trainable Approach on Stroke Extraction in Handwritten Math Expressions Images |
P2-14 | Jianhuan Huang and Zili Zhang | A Novel Sigma-Lognormal Parameter Extractor for Online Signatures |
P2-15 | George Nagy | Near-perfect Relation Extraction from Family Books |
P2-16 | Simon Brenner, Lukas Schügerl and Robert Sablatnig | Estimating Human Legibility in Historic Manuscript Images – A Baseline |
P2-17 | Chahan Vidal-Gorène, Boris Dupin, Aliénor Decours-Perez and Thomas Riccioli | A Modular and Automated Annotation Platform for Handwritings: Evaluation on Under-resourced Languages |
P2-18 | Emilio Granell, Lorenzo Quirós, Verónica Romero and Joan Andreu Sánchez | Reducing the Human Effort in Text Line Segmentation for Historical Documents |
P2-19 | Fan Peng, Zhendong Zhuang and Yang Xue | DSCNN: Dimension Separable Convolutional Neural Networks for character recognition based on inertial sensor signal |
P2-20 | Sanket Biswas, Pau Riba, Josep Lladós and Umapada Pal | DocSynth: A Layout Guided Approach for Controllable Document Image Synthesis |
P2-21 | Taiga Miyazono, Brian Kenji Iwana, Daichi Haraguchi and Seiichi Uchida | Font Style that Fits an Image — Font Generation Based on Image Context |
P2-22 | Sinda Jlassi, Imen Jdey and Hela Ltifi | Bayesian Hyperparameter optimization of Deep Neural Network algorithms based on Ant Colony optimization |
P2-23 | Mengqiao Zhao, Andre Gustavo Hochuli and Abbas Cheddad | End-to-End Approach for Recognition of Historical Digit Strings |
P2-24 | Lars Vögtlin, Manuel Drazyk, Vinaychandran Pondenkandath, Michele Alberti and Rolf Ingold | Generating Synthetic Handwritten Historical Documents With OCR Constrained GANs |
P2-25 | Jiří Mayer and Pavel Pecina | Synthesizing Training Data for Handwritten Music Recognition |
P2-26 | Wensheng Zhang, Yan Zheng, Taiga Miyazono, Seiichi Uchida and Brian Kenji Iwana | Towards Book Cover Design via Layout Graphs |
P2-27 | Antonio Ríos-Vila, David Rizo and Jorge Calvo-Zaragoza | Complete Optical Music Recognition via Agnostic Transcription and Machine Translation |
P2-28 | Sihang Wu, Canyu Xie, Yuhao Huang, Guozhi Tang, Qianying Liao, Jiapeng Wang, Bangdong Chen, Hongliang Li, Xinfeng Chang, Hui Li, Kai Ding, Yichao Huang and Lianwen Jin | Improving Machine Understanding of Human Intent in Charts |
P2-29 | Hesuo Zhang, Weihong Ma, Lianwen Jin, Yichao Huang, Kai Ding and Yaqiang Wu | DeMatch: Towards Understanding the Panel of Chart Documents |
P2-30 | Enrique Mas-Candela, Maria Alfaro-Contreras and Jorge Calvo-Zaragoza | Sequential Next-Symbol Prediction for Optical Music Recognition |
P2-31 | Masaya Ueda, Akisato Kimura and Seiichi Uchida | Which Parts Determine the Impression of the Font? |
P2-32 | Seiya Matsuda, Akisato Kimura and Seiichi Uchida | Impressions2Font: Generating Fonts by Specifying Impressions |
P2-33 | Chia-Wei Tang, Chao-Lin Liu and Po-Sen Chiu | HRRegionNet: Chinese Character Segmentation in Historical Documents with Regional Awareness |
P2-34 | Jiri Kralicek and Jiri Matas | Fast Text v. Non-text Classification of Images |
P2-35 | Haodong Shi, Liangrui Peng, Ruijie Yan, Gang Yao, Shuman Han and Shengjin Wang | Mask Scene Text Recognizer |
P2-36 | Jusung Lee, Jaemyung Lee, Cheoljong Yang, Younghyun Lee and Joonsoo Lee | Rotated Box Is Back: An Accurate Box Proposal Network for Scene Text Detection |
P2-37 | Qianyi Jiang, Qi Song, Nan Li, Rui Zhang and Xiaolin Wei | Heterogeneous Network Based Semi-supervised Learning For Scene Text Recognition |
P2-38 | Wenqing Zhang, Yang Qiu, Minghui Liao, Rui Zhang, Xiaolin Wei and Xiang Bai | Scene Text Detection with Scribble Line |
P2-39 | Jiedong Hao, Yafei Wen, Jie Deng, Jun Gan, Shuai Ren, Hui Tan and Xiaoxin Chen | EEM: An End-to-end Evaluation Metric for Scene Text Detection and Recognition |
P2-40 | Moonbin Yim, Yoonsik Kim, Han-Cheol Cho and Sungrae Park | SynthTIGER: Synthetic Text Image GEneratoR Towards Better Text Recognition Models |
P2-41 | Qi Liu, Song-Lu Chen, Zhen-Jia Li, Chun Yang, Feng Chen and Xu-Cheng Yin | Fast Recognition for Multidirectional and Multi-Type License Plates with 2D Spatial Attention |
P2-42 | Qianying Liao, Qingxiang Lin, Lianwen Jin, Canjie Luo, Jiaxin Zhang, Dezhi Peng and Tianwei Wang | A Multi-level Progressive Rectification Mechanism for Irregular Scene Text Recognition |
P2-43 | Mengmeng Cui, Wei Wang, Jinjin Zhang and Liang Wang | Representation and Correlation Enhanced Encoder-Decoder Framework for Scene Text Recognition |
P2-44 | Yash Patel and Jiří Matas | FEDS – Filtered Edit Distance Surrogate |
P2-45 | Tao Sheng and Zhouhui Lian | Bidirectional Regression for Arbitrary-Shaped Text Detection |
P2-46 | Ahmad Droby, Berat Kurar Barakat, Daria Vasyutinsky Shapira, Irina Rabaev and Jihad El-Sana | VML-HP: Hebrew paleography dataset |
P2-47 | Sarkhan Badirli, Mary Borgo Ton, Abdulmecit Gungor and Murat Dundar | Open Set Authorship Attribution toward Demystifying Victorian Periodicals |
P2-48 | Amit Maraj, Miguel Vargas Martin and Masoud Makrehchi | A More Effective Sentence-Wise Text Segmentation Approach using BERT |
P2-49 | Fabio Pignelli, Yandre M. G. Costa, Luiz S. Oliveira and Diego Bertolini | Data Augmentation for Writer Identification Using a Cognitive Inspired Model |
P2-50 | Xiaojie Xia, Wei Liu, Ying Zhang, Liuan Wang and Jun Sun | Key-guided Identity Document Classification Method by Graph Attention Network |
P2-51 | Dmitry Rodin, Vasily Loginov, Ivan Zagaynov and Nikita Orlov | Document Image Quality Assessment via Explicit Blur and Text Size Estimation |
P2-52 | Shoaib Ahmed Siddiqui, Andreas Dengel and Sheraz Ahmed | Analyzing the potential of Zero-Shot Recognition for Document Image Classification |
P2-53 | Fahimeh Alaei and Alireza Alaei | Gender Detection Based on Spatial Pyramid Matching |
P2-54 | Akrem Sellami and Salvatore Tabbone | EDNets: Deep Feature Learning for Document Image Classification based on Multi-view Encoder-Decoder Neural Networks |
P2-55 | Guillaume Chiron, Florian Arrestier and Ahmad Montaser Awal | Fast End-to-end Deep Learning Identity Document Detection, Classification and Cropping |
P2-56 | Ryad Kaoua, Xi Shen, Alexandra Durr, Stavros Lazaris, David Picard and Mathieu Aubry | Image Collation: Matching illustrations in manuscripts |
P2-57 | Joseph Chazalon and Edwin Carlinet | Revisiting the Coco Panoptic Metric to Enable Visual and Qualitative Analysis of Historical Map Instance Segmentation |
P2-58 | Samiul Alam, Tahsin Reasat, Asif Shahriyar Sushmit, Sadi Mohammad Siddique, Fuad Rahman, Mahady Hasan and Ahmed Imtiaz Humayun | A Large Multi-Target Dataset of Common Bengali Handwritten Graphemes |
P2-59 | Alex W. C. Lee, Jonathan Chung and Marco Lee | GNHK: A Dataset for English Handwriting in the Wild |
P2-60 | Christian Gold, Dario van den Boom and Torsten Zesch | Personalizing Handwriting Recognition Systems with Limited User-Specific Samples |
P2-61 | Haoran Zhang, Wei Chen, Xiangdong Su, Hui Guo and Huali Xu | An Efficient Local Word Augment Approach for Mongolian Handwritten Script Recognition |
P2-62 | Santhoshini Gongidi and C V Jawahar | IIIT-INDIC-HW-WORDS: A Dataset for Indic Handwritten Text Recognition |
P2-63 | Martin Kišš, Karel Beneš and Michal Hradiš | AT-ST: Self-Training Adaptation Strategy for OCR in Domains with Limited Transcriptions |
P2-64 | Jan Kohút and Michal Hradiš | TS-Net: OCR Trained to Switch Between Text Transcription Styles |
P2-65 | Derek S. Prijatelj, Samuel Grieggs, Futoshi Yumoto, Eric Robertson and Walter J. Scheirer | Handwriting Recognition with Novelty |
P2-66 | Yizi Chen, Edwin Carlinet, Joseph Chazalon, Clément Mallet, Bertrand Duménieu and Julien Perret | Vectorization of Historical Maps Using Deep Edge Filtering and Closed Shape Extraction |
P2-67 | Hongxi Wei, Kexin Liu, Jing Zhang and Daoerji Fan | Data Augmentation Based on CycleGAN for Improving Woodblock-printing Mongolian Words Recognition |
P2-68 | Deng Li, Yue Wu and Yicong Zhou | SauvolaNet: Learning Adaptive Sauvola Network for Degraded Document Binarization |
P2-69 | Shi Yan, Jin-Wen Wu, Fei Yin and Cheng-Lin Liu | Recognizing Handwritten Chinese Texts with Insertion and Swapping Using A Structural Attention Network |
P2-70 | Raphaela Heil, Ekta Vats and Anders Hast | Strikethrough Removal From Handwritten Words Using CycleGANs |
P2-71 | George Retsinas, Giorgos Sfikas and Christophoros Nikou | Iterative Weighted Transductive Learning for Handwriting Recognition |
Competition Session
16:00-16:45, Friday 10
Room “Rome”, 3rd floor
ID | Competition | Authors |
Competition – 1 | ICDAR 2021 Competition on Scientific Literature Parsing | Antonio Jimeno Yepes, Peter Zhong, Douglas Burdick |
Competition – 2 | ICDAR 2021 Competition on Historical Document Classification | Mathias Seuret, Anguelos Nicolaou, Dalia Rodríguez-Salas, Nikolaus Weichselbaumer, Dominique Stutzmann, Martin Mayr, Andreas Maier, Vincent Christlein |
Competition – 3 | ICDAR 2021 Competition on Document Visual Question Answering | Rubèn Tito, Minesh Mathew, C.V. Jawahar, Ernest Valveny, Dimosthenis Karatzas |
Competition – 4 | ICDAR 2021 Competition on Scene Video Text Spotting | Zhanzhan Cheng, Jing Lu, Baorui Zou, Shuigeng Zhou, Fei Wu |
Competition – 5 | ICDAR 2021 Competition on Integrated Circuit Text Spotting and Aesthetic Assessment | Chun Chet Ng, Akmalul Khairi Bin Nazaruddin, Yeong Khang Lee, Xinyu Wang, Yuliang Liu, Chee Seng Chan, Lianwen Jin, Yipeng Sun, Lixin Fan |
Competition – 6 | ICDAR 2021 Competition on Components Segmentation Task of Document Photos | Celso A. M. Lopes Junior, Ricardo B. Neves Junior, Byron L. D. Bezerra, Alejandro H. Toselli, Donato Impedovo |
Competition – 7 | ICDAR 2021 Competition on Historical Map Segmentation | Joseph Chazalon, Edwin Carlinet, Yizi Chen, Julien Perret, Bertrand Duménieu, Clément Mallet, Thierry Géraud, Vincent Nguyen, Nam Nguyen, Josef Baloun, Ladislav Lenc, Pavel Král |
Competition – 8 | ICDAR 2021 Competition on Time-Quality Document Image Binarization | Rafael Dueire Lins, Rodrigo Barros Bernardino, Elisa Barney Smith, Ergina Kavallieratou |
Competition – 9 | ICDAR 2021 Competition on On-Line Signature Verification | Ruben Tolosana, Ruben Vera-Rodriguez, Carlos Gonzalez-Garcia, Julian Fierrez, Santiago Rengifo, Aythami Morales, Javier Ortega-Garcia, Juan Carlos Ruiz-Garcia, Sergio Romero-Tapiador, Jiajia Jiang, Songxuan Lai, Lianwen Jin, Yecheng Zhu, Javier Galbally, Moises Diaz, Miguel Angel Ferrer, Marta Gomez-Barrero, Ilya Hodashinsky, Konstantin Sarin, Artem Slezkin, Marina Bardamova, Mikhail Svetlakov, Mohammad Saleem, Cintia Lia Szücs, Bence Kovari, Falk Pulsmeyer, Mohamad Wehbi, Dario Zanca, Sumaiya Ahmad, Sarthak Mishra, Suraiya Jabin |
Competition – 10 | ICDAR 2021 Competition on Script Identification in the Wild | Abhijit Das, Miguel A. Ferrer, Aythami Morales, Moises Diaz, Umapada Pal, Donato Impedovo, Hongliang Li, Wentao Yang, Kensho Ota, Tadahito Yao, Le Quang Hung, Nguyen Quoc Cuong, Seungjae Kim, and Abdeljalil Gattal |
Competition – 11 | ICDAR 2021 Competition on Scientific Table Image Recognition to LaTeX | Pratik Kayal, Mrinal Anand, Harsh Desai, Mayank Singh |
Competition – 12 | ICDAR 2021 Competition on Multimodal Emotion Recognition on Comics Scenes | Nhu-Van Nguyen, Xuan-Son Vu, Christophe Rigaud, Lili Jiang, Jean-Christophe Burie |
Competition – 13 | ICDAR 2021 Competition on Mathematical Formula Detection | Dan Anitei, Joan Andreu Sánchez, José Manuel Fuentes, Roberto Paredes, José Miguel Benedí |
Copyright © ICDAR 2021 Organizing Committee