The table recognition system in MinerU is a dual-path pipeline that handles both "Wired" tables (explicit grid lines) and "Wireless" tables (structure implied by whitespace alignment). It combines automated orientation correction, wired/wireless classification, and a custom grid reconstruction algorithm to produce HTML and Markdown representations of tabular data.
The table recognition logic is orchestrated within the MineruPipelineModel and BatchAnalyze classes. When a table region is detected by the layout model, the system extracts the table image and routes it through a series of specialized models managed by the AtomModelSingleton.
The following diagram illustrates the lifecycle of a table from a raw image to a reconstructed HTML structure, mapping high-level steps to their corresponding code entities.
Diagram: Table Processing Pipeline
Sources: mineru/model/table/rec/unet_table/main.py68-129 mineru/model/table/rec/slanet_plus/main.py165-187 mineru/model/table/cls/paddle_table_cls.py134-150
Before recognition begins, classification steps ensure the table image is in the optimal state for the recognition models.
The orientation logic ensures tables are upright before structural analysis.
- Orientation classifier: MineruTableOrientationClsModel mineru/model/table/cls/mineru_table_ori_cls.py25-27

The PaddleTableClsModel mineru/model/table/cls/paddle_table_cls.py16-26 determines which recognition path to take by classifying the table structure.
- Wired tables are handled by TSRUnet.
- Wireless tables are handled by SlanetPlus or RapidTable.
- An ONNX Runtime session (onnxruntime.InferenceSession) is used to predict the class and confidence score mineru/model/table/cls/paddle_table_cls.py18-20
- The model accepts a batch of images and updates the table_res dictionary with cls_label and cls_score mineru/model/table/cls/paddle_table_cls.py134-150
- Inputs are normalized with mean [0.485, 0.456, 0.406] and std [0.229, 0.224, 0.225] mineru/model/table/cls/paddle_table_cls.py21-25

Sources: mineru/model/table/cls/mineru_table_ori_cls.py25-192 mineru/model/table/cls/paddle_table_cls.py16-76 mineru/model/table/cls/paddle_table_cls.py134-150
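The classification step can be sketched in plain Python. This is a minimal illustration, not the code in paddle_table_cls.py: scaling pixels to [0, 1] before normalization, the softmax scoring, and the label names `wired_table` / `wireless_table` are all assumptions. Only the mean and std values and the `cls_label` / `cls_score` keys come from the description above.

```python
import math

# Mean and std as quoted in the text; RGB channel order is assumed.
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb):
    """Scale an 8-bit RGB pixel to [0, 1], then standardize each channel."""
    return tuple((v / 255.0 - m) / s for v, m, s in zip(rgb, MEAN, STD))


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits, labels=("wired_table", "wireless_table")):
    """Return (label, score) for one image; the label names are hypothetical."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


def update_table_results(table_res_list, batch_logits):
    """Write cls_label and cls_score into each table_res dict, in batch order."""
    for table_res, logits in zip(table_res_list, batch_logits):
        table_res["cls_label"], table_res["cls_score"] = classify(logits)
    return table_res_list
```

In this sketch `batch_logits` stands in for the raw output of the inference session, one score list per table image.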
The wired path uses a UNet-based architecture to segment table lines and a geometric recovery algorithm to reconstruct the grid.
The TSRUnet class mineru/model/table/rec/unet_table/table_structure_unet.py25-38 performs:
- Applies morphological operations (cv2.morphologyEx) and get_table_line mineru/model/table/rec/unet_table/utils_table_line_rec.py129-146 to extract discrete line segments from the mask mineru/model/table/rec/unet_table/table_structure_unet.py117-128
- Adjusts the extracted lines with final_adjust_lines and adjust_lines mineru/model/table/rec/unet_table/table_structure_unet.py131-137
- Computes the rotation angle with cal_rotate_angle mineru/model/table/rec/unet_table/table_structure_unet.py164-177 and optionally performs rotate_image to normalize the grid mineru/model/table/rec/unet_table/table_structure_unet.py140-146

The WiredTableRecognition class mineru/model/table/rec/unet_table/main.py61-66 orchestrates the full recovery:
- TableRecover: Transforms predicted polygons into a logical grid with logic_points mineru/model/table/rec/unet_table/main.py89-91
- match_ocr_cell: Matches OCR results to the recovered cells mineru/model/table/rec/unet_table/main.py107
- fill_blank_rec: Fills in recognition results for cells left blank mineru/model/table/rec/unet_table/main.py109. This includes a noise filter, _should_drop_blank_cell_rec_result, to drop low-confidence or irrelevant characters mineru/model/table/rec/unet_table/main.py174-184
- plot_html_table: Renders the final HTML table mineru/model/table/rec/unet_table/main.py121

Sources: mineru/model/table/rec/unet_table/main.py46-129 mineru/model/table/rec/unet_table/table_structure_unet.py25-150 mineru/model/table/rec/unet_table/utils_table_line_rec.py129-175 mineru/model/table/rec/unet_table/utils_table_recover.py122-158
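The idea of turning cell geometry into a logical grid and then into HTML can be sketched as follows. This is an illustrative simplification, not the TableRecover or plot_html_table implementation: it assumes axis-aligned `(x0, y0, x1, y1)` cell boxes instead of polygons, a hypothetical pixel tolerance `tol`, and a logical-point layout of `[row_start, row_end, col_start, col_end]` with inclusive ranges.

```python
from html import escape


def cluster_edges(values, tol=5.0):
    """Merge nearby coordinates into representative grid lines."""
    groups = []
    for v in sorted(values):
        if groups and v - groups[-1][-1] <= tol:
            groups[-1].append(v)
        else:
            groups.append([v])
    return [sum(g) / len(g) for g in groups]


def nearest_index(lines, value):
    """Index of the grid line closest to value."""
    return min(range(len(lines)), key=lambda i: abs(lines[i] - value))


def to_logic_points(boxes, tol=5.0):
    """Map (x0, y0, x1, y1) boxes to [row_start, row_end, col_start, col_end]."""
    xs = cluster_edges([b[0] for b in boxes] + [b[2] for b in boxes], tol)
    ys = cluster_edges([b[1] for b in boxes] + [b[3] for b in boxes], tol)
    points = []
    for x0, y0, x1, y1 in boxes:
        points.append([
            nearest_index(ys, y0), nearest_index(ys, y1) - 1,
            nearest_index(xs, x0), nearest_index(xs, x1) - 1,
        ])
    return points


def render_html(logic_points, texts):
    """Emit an HTML table; spans are derived from the inclusive ranges."""
    n_rows = max(p[1] for p in logic_points) + 1
    rows = [[] for _ in range(n_rows)]
    ordered = sorted(zip(logic_points, texts), key=lambda it: (it[0][0], it[0][2]))
    for (r0, r1, c0, c1), text in ordered:
        attrs = ""
        if r1 > r0:
            attrs += f' rowspan="{r1 - r0 + 1}"'
        if c1 > c0:
            attrs += f' colspan="{c1 - c0 + 1}"'
        rows[r0].append(f"<td{attrs}>{escape(text)}</td>")
    body = "".join(f"<tr>{''.join(cells)}</tr>" for cells in rows)
    return f"<table>{body}</table>"
```

For example, a header box spanning the full width above two side-by-side cells yields one `colspan="2"` cell in the first row and two plain cells in the second.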
Wireless tables are processed using the PaddleTableModel wrapper around the SlanetPlus architecture.
The TableStructurer mineru/model/table/rec/slanet_plus/table_structure.py28-40 uses an ONNX model to predict both the table structure (HTML tags) and the cell bounding boxes.
- Produces a sequence of HTML structure tokens (<tr>, <td>, </td>) and a list of cell coordinates mineru/model/table/rec/slanet_plus/table_structure.py53-68
- batch_process enables high-throughput inference across multiple table images using BatchTablePreprocess mineru/model/table/rec/slanet_plus/table_structure.py70-113

The PaddleTable class aligns OCR text with the predicted structure:
- adapt_slanet_plus mineru/model/table/rec/slanet_plus/main.py137-145
- match_result mineru/model/table/rec/slanet_plus/matcher.py132-159 uses IoU and distance scores to pair OCR boxes with cell boxes mineru/model/table/rec/slanet_plus/matcher.py73-102
- decode_logic_points mineru/model/table/rec/slanet_plus/main.py68
- table_matcher generates the final HTML string by combining predicted structures, cell bboxes, and OCR results mineru/model/table/rec/slanet_plus/main.py62

Sources: mineru/model/table/rec/slanet_plus/main.py37-71 mineru/model/table/rec/slanet_plus/table_structure.py28-113 mineru/model/table/rec/slanet_plus/matcher.py120-159
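A rough sketch of pairing OCR boxes with cell boxes by IoU and distance, then filling the text into the structure tokens, might look like this. The function names, the ranking rule (highest IoU first, smallest corner distance as tie-breaker), and the simplified token set are assumptions for illustration; the scoring in matcher.py may differ.

```python
def iou(a, b):
    """Intersection over union of two (x0, y0, x1, y1) boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def corner_distance(a, b):
    """Sum of absolute coordinate differences; smaller means closer."""
    return sum(abs(p - q) for p, q in zip(a, b))


def match_ocr_to_cells(ocr_boxes, cell_boxes):
    """Map each cell index to the list of OCR box indices assigned to it."""
    matched = {}
    for i, ocr in enumerate(ocr_boxes):
        # Rank cells by highest IoU first, then by smallest distance.
        best = min(
            range(len(cell_boxes)),
            key=lambda j: (1.0 - iou(ocr, cell_boxes[j]),
                           corner_distance(ocr, cell_boxes[j])),
        )
        matched.setdefault(best, []).append(i)
    return matched


def fill_structure(tokens, matched, ocr_texts):
    """Insert matched text before each closing </td>, in cell order."""
    out, cell_idx = [], 0
    for tok in tokens:
        if tok == "</td>":
            words = [ocr_texts[i] for i in matched.get(cell_idx, [])]
            out.append(" ".join(words))
            cell_idx += 1
        out.append(tok)
    return "".join(out)
```

The sketch assumes the n-th `</td>` token corresponds to the n-th predicted cell box, and that several OCR boxes may fall into the same cell.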
The system uses the AtomModelSingleton to manage the lifecycle of these models. The specific recognition model is selected based on the output of the classification step.
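The combination of a cached model manager and classification-driven routing can be sketched as below. `ModelRegistry`, `select_recognizer`, the label strings, the `min_score` threshold, and the fallback to the wireless path on low confidence are all hypothetical; they illustrate the pattern rather than the behavior of AtomModelSingleton.

```python
class ModelRegistry:
    """Hypothetical stand-in for a singleton that caches one model per key."""

    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._models = {}
        return cls._instance

    def get(self, name, factory):
        """Build the model on first request, then reuse the cached object."""
        if name not in self._models:
            self._models[name] = factory()
        return self._models[name]


def select_recognizer(table_res, registry, factories, min_score=0.5):
    """Pick a recognizer from cls_label and cls_score (assumed routing rule)."""
    label = table_res.get("cls_label")
    score = table_res.get("cls_score", 0.0)
    # Assumption: only a confident "wired" prediction takes the wired path.
    if label == "wired_table" and score >= min_score:
        name = "wired"
    else:
        name = "wireless"
    return registry.get(name, factories[name])
```

Because the registry constructs each model lazily and only once, repeated tables of the same class reuse the already-loaded recognizer.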
Diagram: Code Entity Mapping
| System Concept | Implementation Class | Primary File |
|---|---|---|
| Model Manager | AtomModelSingleton | mineru/backend/pipeline/model_init.py |
| Wired Recognition | WiredTableRecognition | mineru/model/table/rec/unet_table/main.py61-66 |
| Wireless Recognition | PaddleTableModel | mineru/model/table/rec/slanet_plus/main.py153-163 |
| Structure Recovery | TableRecover | mineru/model/table/rec/unet_table/main.py65 |
| Table Classifier | PaddleTableClsModel | mineru/model/table/cls/paddle_table_cls.py16-26 |
| Table Structurer | TableStructurer | mineru/model/table/rec/slanet_plus/table_structure.py28-40 |
| Inference Session | OrtInferSession | mineru/model/table/rec/unet_table/utils.py26-41 |
For tables spanning multiple pages, MinerU implements specialized merging logic in mineru/utils/table_merge.py.
- _scan_rows calculates effective column metrics and preserved rowspans across boundaries mineru/utils/table_merge.py78-146
- is_table_continuation_text and _is_continuation_caption identify "Continued Table" markers in captions mineru/utils/table_merge.py11 mineru/utils/table_merge.py203-205
- TableMergeState caches header signatures and row metrics to determine if a subsequent table is a logical continuation of the previous one mineru/utils/table_merge.py54-67

Sources: mineru/utils/table_merge.py11-205 mineru/model/table/rec/unet_table/main.py46-51 mineru/model/table/rec/slanet_plus/main.py153-163 mineru/model/table/rec/unet_table/utils.py26-41 mineru/model/table/rec/slanet_plus/table_structure.py28-40 mineru/model/table/cls/paddle_table_cls.py16-76
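A simplified sketch of the continuation check is shown below. The regular expression, the `MergeState` fields, and the helper names are hypothetical, and the sketch compares raw cell counts per row rather than effective column counts that account for rowspans and colspans, so it only illustrates the general approach.

```python
import re
from dataclasses import dataclass

# Hypothetical marker pattern; the markers actually recognized are not listed here.
CONTINUATION_RE = re.compile(r"\b(continued|cont\.)", re.IGNORECASE)


def is_continuation_caption(caption):
    """True if the caption carries a 'continued' style marker."""
    return bool(CONTINUATION_RE.search(caption))


def header_signature(rows):
    """Normalize the first row so headers can be compared across pages."""
    return tuple(cell.strip().lower() for cell in rows[0]) if rows else ()


@dataclass
class MergeState:
    """Cached facts about the previous table (hypothetical fields)."""
    header: tuple = ()
    column_count: int = 0


def remember(rows):
    """Build the cached state from the table that ends a page."""
    return MergeState(header_signature(rows), max((len(r) for r in rows), default=0))


def is_continuation(state, rows, caption=""):
    """Same column count, plus either a caption marker or a repeated header."""
    cols = max((len(r) for r in rows), default=0)
    if cols == 0 or cols != state.column_count:
        return False
    return is_continuation_caption(caption) or header_signature(rows) == state.header


def merge_rows(prev_rows, next_rows, state):
    """Append next_rows to prev_rows, skipping a repeated header row."""
    start = 1 if header_signature(next_rows) == state.header else 0
    return prev_rows + next_rows[start:]
```

Caching the header signature and column count means each new table is compared against a small summary rather than the full previous table.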