Conference Programme


Day 1: Friday, August 30, 2024

Session 1: (9:00 am - 10:45 am) Document Analysis & Understanding / Retrieval & VQA Venue: Parthenon II
9:00 am - 9:15 am Opening Ceremony
9:15 am - 10:00 am Keynote 1: AURELIE JOSEPH - Unveiling the Power of AI: The Critical Role of Explainability and Frugality in Modern Companies
10:00 am - 10:15 am O1: Two Experiments for Automatic Scoring of Handwritten Descriptive Answers
10:15 am - 10:30 am O2: Instruction Makes a Difference
10:30 am - 10:45 am O3: Image-text matching for large-scale book collections
10:45 am - 11:15 am Morning Coffee Break
Session 2: (11:15 am - 1:15 pm) Document Analysis & Understanding / Retrieval & VQA Venue: Parthenon II
11:15 am - 12:00 pm Keynote 2: LUKASZ BORCHMANN - From Research to Production and Back Again
Poster Session I (12:00 pm – 12:45 pm) Venue: Parthenon II
P1: Two Experiments for Automatic Scoring of Handwritten Descriptive Answers
P2: Transformer-Based Architecture for Judgment Prediction and Explanation in Legal Proceedings
P3: Enhanced Bank Check Security: Introducing a Novel Dataset and Transformer-Based Approach for Detection and Verification
P4: Multi-page document VQA with Recurrent Memory Transformer
P5: Instruction Makes a Difference
P6: Image-text matching for large-scale book collections
Demo Session I (12:00 pm – 12:45 pm) Venue: Parthenon II
D1: Automatic Scoring of Digital Ink Answers for Japanese, English and Math
D2: Delfos and Dilbert: Advancing Entity and Relation Extraction in Complex Document Scenarios
D3: GDP: Generic Document Pretraining to Improve Document Understanding
12:45 pm - 1:15 pm Discussion Group
1:15 pm - 2:15 pm Lunch Break
Session 3: (2:15 pm – 4:00 pm) Layout Analysis / Document Classification Venue: Parthenon II
2:15 pm - 2:30 pm O4: LD-DOC: Light-weight Domain-Adaptive Document Layout Analysis
2:30 pm - 2:45 pm O5: Leveraging Semantic Segmentation Masks with Embeddings for Fine-Grained Form Classification
Poster Session II (2:45 pm – 3:30 pm) Venue: Parthenon II
Layout Analysis
P7: RCAM-Transformer: A Novel Approach to Table Reconstruction Using Row-Column Attention Mechanism
P8: LD-DOC: Light-weight Domain-Adaptive Document Layout Analysis
P9: UnSupDLA: Towards Unsupervised Document Layout Analysis
Document Classification
P10: What Text Design Characterizes Book Genres?
P11: Leveraging Semantic Segmentation Masks with Embeddings for Fine-Grained Form Classification
P12: DocLightDetect: A new algorithm for occlusion classification in identification documents
Demo Session II (2:45 pm – 3:30 pm) Venue: Parthenon II
D4: Let's Create a Dataset! The Chinese Musical Notation Annotation Tool in Action
D5: End-to-end information extraction from handwritten Paris marriage records (1880-1940) using the DANIEL model
3:30 pm – 4:00 pm Discussion Group
4:00 pm – 4:30 pm Afternoon Coffee Break
Session 4: (4:30 pm - 5:45 pm) Short Paper / Demo Session Venue: Parthenon II
Poster Session III (4:30 pm – 5:45 pm) Venue: Parthenon II
P13: A collaborative platform for the segmentation and transcription of historical texts
P14: Arabic Handwritten Text Recognition using Advanced CNN-RNN Architecture
P15: OCR4all 1.0 – Flexible open-source OCR/HTR based on various single-step Solutions
P16: Behind the Smoke: Misleading mistakes on the Tobacco3482 Document Benchmark
P17: DiffLBP: towards differentiable and explainable texture analysis for documents
Demo Session III (4:30 pm – 5:45 pm) Venue: Parthenon II
D6: OCR4all 1.0 – Flexible open-source OCR/HTR based on various single-step Solutions
D7: Arkindex + Callico, the open-source document processing solution by TEKLIA
D8: IoT Paper digitizing paper device for academic research
D9: On Image Processing and Pattern Recognition for Thermograms of Watermarks in Manuscripts - A First Proof-of-Concept

Day 2: Saturday, August 31, 2024

Session 1: (9:15 am - 10:45 am) OCR Correction & NLP / Recognition Systems Venue: Parthenon II
9:15 am - 10:00 am Keynote 1: OLIVIER LESSARD - Revolutionizing Identity Verification: Deep Learning-based Facial Recognition for Identity Documents
10:00 am - 10:15 am O1: Confidence-Aware Document OCR Error Detection
10:15 am - 10:30 am O2: Speed-up Pre-trained Vision Encoder-Decoder Transformers by Leveraging Lightweight Mixer Layers for Text Recognition
10:30 am - 10:45 am O3: Full-page music symbols recognition: state-of-the-art deep model comparison for handwritten and printed music scores
10:45 am - 11:15 am Morning Coffee Break
Session 2: (11:15 am - 1:15 pm) OCR Correction & NLP / Recognition Systems Venue: Parthenon II
11:15 am - 12:00 pm Keynote 2: THOMAS BREUEL - LLMs, Knowledge, and Document Analysis
Poster Session I (12:00 pm – 12:45 pm) Venue: Parthenon II
P1: Confidence-Aware Document OCR Error Detection
P2: Error Correction of Japanese Character-recognition in Answers to Writing-type Questions Using T5
P3: How does changing the Optical Character Recognition system impact the Layout-Aware Named Entity Recognition models?
P4: RUATS: Abstractive Text Summarization for Roman Urdu
P5: Speed-up Pre-trained Vision Encoder-Decoder Transformers by Leveraging Lightweight Mixer Layers for Text Recognition
P6: Maximizing Data Efficiency of HTR Models by Synthetic Text
P7: Contrastive Self-Supervised Learning for Optical Music Recognition
P8: Full-page music symbols recognition: state-of-the-art deep model comparison for handwritten and printed music scores
Demo Session I (12:00 pm – 12:45 pm) Venue: Parthenon II
D1: PERO-OCR tool
D2: General purpose text information search and retrieval from large series of untranscribed text images
D3: AltChart: a chart summarization model with enhanced visual perception
12:45 pm - 1:15 pm Discussion Group
1:15 pm - 2:15 pm Lunch Break
Session 3: (2:15 pm – 4:00 pm) Historical Documents Venue: Parthenon II
2:15 pm - 2:30 pm O4: Fetch-A-Set: A Large-Scale OCR-Free Benchmark for Historical Document Retrieval
2:30 pm - 2:45 pm O5: From Detection to Modeling: An End-to-End Paleographic System for Analysing Historical Handwriting Styles
Poster Session II (2:45 pm – 3:30 pm) Venue: Parthenon II
P9: Fetch-A-Set: A Large-Scale OCR-Free Benchmark for Historical Document Retrieval
P10: From Detection to Modeling: An End-to-End Paleographic System for Analysing Historical Handwriting Styles
P11: FAnG: Fast Annotation of Glyphs in Historical Printed Documents
P12: Bessarion: Medieval Greek Inscriptions on a challenging dataset for Vision and NLP tasks
P13: Automatic Lemmatization of Old Church Slavonic Language Using A Novel Dictionary-Based Approach
P14: Automatic Transcription of Ottoman Documents Using Deep Learning
Demo Session II (2:45 pm – 3:30 pm) Venue: Parthenon II
D4: ClusterTabNet: Supervised clustering method for table detection and table structure recognition
D5: Melodic pattern and Lyrics search in handwritten series of sheet music images
3:30 pm – 4:00 pm Discussion Group
4:00 pm – 4:30 pm Afternoon Coffee Break
4:30 pm – 5:00 pm Closing Ceremony and Awards