I Want to Build a Small OCR App: Installing PaddleOCR


There is no shortage of open-source OCR frameworks, for example tesseract, PaddleOCR, EasyOCR, and chineseocr_lite. Here I chose PaddleOCR, which is open-sourced by Baidu.

First, check which CentOS version I am running:

[root@instance-epknpagk ~]# cat  /etc/redhat-release
CentOS Linux release 8.4.2105

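Python has to be installed before PaddleOCR. The system actually ships with a Python, but it is not exposed through the environment variables, so it cannot be invoked directly from the command line. I will just install one with yum.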

Installing Python

[root@instance-epknpagk ~]# yum -y install python
Last metadata expiration check: 0:17:32 ago on Sat 23 Oct 2021 12:30:05 PM CST.
No match for argument: python
There are following alternatives for "python": python2, python36, python38, python39
Error: Unable to find a match: python

This message asks us to pick a version to install. I will play it safe and choose python36:

[root@instance-epknpagk ~]# yum -y install python36
Last metadata expiration check: 0:19:44 ago on Sat 23 Oct 2021 12:30:05 PM CST.
Dependencies resolved.
=================================================================================================================================================================================================================
Package Architecture Version Repository Size
=================================================================================================================================================================================================================
Installing:
python36 x86_64 3.6.8-2.module_el8.4.0+790+083e3d81 appstream 19 k
Installing dependencies:
python3-pip noarch 9.0.3-19.el8 appstream 20 k
Enabling module streams:
python36 3.6

Transaction Summary
=================================================================================================================================================================================================================
Install 2 Packages

Total download size: 39 k
Installed size: 16 k
Downloading Packages:
(1/2): python3-pip-9.0.3-19.el8.noarch.rpm 2.8 MB/s | 20 kB 00:00
(2/2): python36-3.6.8-2.module_el8.4.0+790+083e3d81.x86_64.rpm 2.6 MB/s | 19 kB 00:00
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Total 3.8 MB/s | 39 kB 00:00
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
Preparing : 1/1
Installing : python36-3.6.8-2.module_el8.4.0+790+083e3d81.x86_64 1/2
Running scriptlet: python36-3.6.8-2.module_el8.4.0+790+083e3d81.x86_64 1/2
Installing : python3-pip-9.0.3-19.el8.noarch 2/2
Running scriptlet: python3-pip-9.0.3-19.el8.noarch 2/2
Verifying : python3-pip-9.0.3-19.el8.noarch 1/2
Verifying : python36-3.6.8-2.module_el8.4.0+790+083e3d81.x86_64 2/2

Installed:
python3-pip-9.0.3-19.el8.noarch python36-3.6.8-2.module_el8.4.0+790+083e3d81.x86_64

Complete!

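Now we can start the Python interpreter by running python3. Besides yum, you can also download the source code and compile it yourself; use whichever is more convenient.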

Installing PaddleOCR

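There are two ways to install PaddleOCR: install it with pip, or git clone the code and install it locally. Let's go through the simplest one first to build up some confidence.
Before starting, upgrade pip: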

[root@instance-epknpagk ~]# python3 -m pip install --upgrade pip
WARNING: Running pip install with root privileges is generally not a good idea. Try `__main__.py install --user` instead.
Collecting pip
Downloading https://files.pythonhosted.org/packages/a4/6d/6463d49a933f547439d6b5b98b46af8742cc03ae83543e4d7688c2420f8b/pip-21.3.1-py3-none-any.whl (1.7MB)
100% |████████████████████████████████| 1.7MB 30kB/s
Installing collected packages: pip
Successfully installed pip-21.3.1

Next, point pip at a mirror of the package index:

[root@instance-epknpagk ~]# pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
Writing to /root/.config/pip/pip.conf
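
To confirm what that command stored, the config file can be read back. The sketch below assumes pip.conf is an INI file with a [global] section holding index-url, which is how pip lays it out; the sample string stands in for the real file at /root/.config/pip/pip.conf, and the function name read_index_url is made up for illustration.

```python
import configparser


def read_index_url(pip_conf_text):
    """Return the global index-url found in pip.conf text, or None if unset."""
    parser = configparser.ConfigParser()
    parser.read_string(pip_conf_text)
    # `pip config set global.index-url ...` stores the value under [global].
    return parser.get("global", "index-url", fallback=None)


# Assumed file content after the command above (not copied from the real file).
sample = "[global]\nindex-url = https://pypi.tuna.tsinghua.edu.cn/simple\n"
print(read_index_url(sample))
```

Running pip config list on the server should show the same value without any code.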

Now run the command pip install paddleocr. The installation did not go entirely smoothly; I ran into this problem:

Levenshtein/_levenshtein.c:99:10: fatal error: Python.h: No such file or directory
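
The compiler is looking for the CPython header Python.h while building the C extension in python-Levenshtein. A minimal sketch for checking whether that header is present for the current interpreter, assuming the headers live in the include directory reported by sysconfig (the helper name has_python_headers is made up for illustration):

```python
import os
import sysconfig


def has_python_headers(include_dir=None):
    """Return True if Python.h exists in the interpreter's include directory."""
    if include_dir is None:
        # Directory where this interpreter expects its C headers to live.
        include_dir = sysconfig.get_paths()["include"]
    return os.path.isfile(os.path.join(include_dir, "Python.h"))


print("Python.h present:", has_python_headers())
```

If this prints False, building any C extension from source with this interpreter is likely to fail the same way.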

A search on Stack Overflow told me that I needed to run yum install python3-devel. With that installed, I tried again:

[root@instance-epknpagk ~]# pip install paddleocr
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting paddleocr
Using cached https://pypi.tuna.tsinghua.edu.cn/packages/fc/27/399895cf9623d2cf3b1edbabead47b907261c93813e7f82608544514b898/paddleocr-2.3.0.1-py3-none-any.whl (239 kB)
Collecting opencv-contrib-python==4.4.0.46
Using cached https://pypi.tuna.tsinghua.edu.cn/packages/b4/ec/a66505cb25704066235369c8a1c1ed8d37b21f260f7b66d2cfa3264f0724/opencv_contrib_python-4.4.0.46-cp36-cp36m-manylinux2014_x86_64.whl (55.7 MB)
Collecting openpyxl
Using cached https://pypi.tuna.tsinghua.edu.cn/packages/1c/a6/8ce4d2ef2c29be3235c08bb00e0b81e29d38ebc47d82b17af681bf662b74/openpyxl-3.0.9-py2.py3-none-any.whl (242 kB)
Collecting premailer
Using cached https://pypi.tuna.tsinghua.edu.cn/packages/b1/07/4e8d94f94c7d41ca5ddf8a9695ad87b888104e2fd41a35546c1dc9ca74ac/premailer-3.10.0-py2.py3-none-any.whl (19 kB)
Collecting pyclipper
Using cached https://pypi.tuna.tsinghua.edu.cn/packages/93/fe/5fec2e268d568b93bf20cb655c99e382d29b81925cf19d836cf1fa0cf0ec/pyclipper-1.3.0-cp36-cp36m-manylinux_2_5_x86_64.manylinux1_x86_64.whl (124 kB)
Requirement already satisfied: scikit-image==0.17.2 in /usr/local/lib64/python3.6/site-packages (from paddleocr) (0.17.2)
Requirement already satisfied: numpy in /usr/local/lib64/python3.6/site-packages (from paddleocr) (1.19.5)
Requirement already satisfied: visualdl in /usr/local/lib/python3.6/site-packages (from paddleocr) (2.2.1)
Collecting lmdb
Using cached https://pypi.tuna.tsinghua.edu.cn/packages/4d/48/8b040e5120c3dc1bdd85d26e5301336a2185756567c9fe449fbf681c0936/lmdb-1.2.1-cp36-cp36m-manylinux2010_x86_64.whl (297 kB)
Collecting python-Levenshtein
Using cached https://pypi.tuna.tsinghua.edu.cn/packages/2a/dc/97f2b63ef0fa1fd78dcb7195aca577804f6b2b51e712516cc0e902a9a201/python-Levenshtein-0.12.2.tar.gz (50 kB)
Preparing metadata (setup.py) ... done
Requirement already satisfied: lxml in /usr/lib64/python3.6/site-packages (from paddleocr) (4.2.3)
Requirement already satisfied: shapely in /usr/local/lib64/python3.6/site-packages (from paddleocr) (1.7.1)
Requirement already satisfied: tqdm in /usr/local/lib/python3.6/site-packages (from paddleocr) (4.62.3)
Collecting imgaug==0.4.0
Using cached https://pypi.tuna.tsinghua.edu.cn/packages/66/b1/af3142c4a85cba6da9f4ebb5ff4e21e2616309552caca5e8acefe9840622/imgaug-0.4.0-py2.py3-none-any.whl (948 kB)
Requirement already satisfied: scipy in /usr/local/lib64/python3.6/site-packages (from imgaug==0.4.0->paddleocr) (1.5.4)
Requirement already satisfied: imageio in /usr/local/lib/python3.6/site-packages (from imgaug==0.4.0->paddleocr) (2.9.0)
Requirement already satisfied: opencv-python in /usr/local/lib64/python3.6/site-packages (from imgaug==0.4.0->paddleocr) (4.5.4.58)
Requirement already satisfied: six in /usr/local/lib/python3.6/site-packages (from imgaug==0.4.0->paddleocr) (1.16.0)
Requirement already satisfied: matplotlib in /usr/local/lib64/python3.6/site-packages (from imgaug==0.4.0->paddleocr) (3.3.4)
Requirement already satisfied: Pillow in /usr/local/lib64/python3.6/site-packages (from imgaug==0.4.0->paddleocr) (8.4.0)
Requirement already satisfied: PyWavelets>=1.1.1 in /usr/local/lib64/python3.6/site-packages (from scikit-image==0.17.2->paddleocr) (1.1.1)
Requirement already satisfied: networkx>=2.0 in /usr/local/lib/python3.6/site-packages (from scikit-image==0.17.2->paddleocr) (2.5.1)
Requirement already satisfied: tifffile>=2019.7.26 in /usr/local/lib/python3.6/site-packages (from scikit-image==0.17.2->paddleocr) (2020.9.3)
Requirement already satisfied: et-xmlfile in /usr/local/lib/python3.6/site-packages (from openpyxl->paddleocr) (1.1.0)
Requirement already satisfied: cssselect in /usr/local/lib/python3.6/site-packages (from premailer->paddleocr) (1.1.0)
Requirement already satisfied: cachetools in /usr/local/lib/python3.6/site-packages (from premailer->paddleocr) (4.2.4)
Requirement already satisfied: requests in /usr/lib/python3.6/site-packages (from premailer->paddleocr) (2.20.0)
Requirement already satisfied: cssutils in /usr/local/lib/python3.6/site-packages (from premailer->paddleocr) (2.3.0)
Requirement already satisfied: setuptools in /usr/lib/python3.6/site-packages (from python-Levenshtein->paddleocr) (39.2.0)
Requirement already satisfied: Flask-Babel>=1.0.0 in /usr/local/lib/python3.6/site-packages (from visualdl->paddleocr) (2.0.0)
Requirement already satisfied: shellcheck-py in /usr/local/lib64/python3.6/site-packages (from visualdl->paddleocr) (0.7.2.1)
Requirement already satisfied: pre-commit in /usr/local/lib/python3.6/site-packages (from visualdl->paddleocr) (2.15.0)
Requirement already satisfied: flake8>=3.7.9 in /usr/local/lib/python3.6/site-packages (from visualdl->paddleocr) (4.0.1)
Requirement already satisfied: flask>=1.1.1 in /usr/local/lib/python3.6/site-packages (from visualdl->paddleocr) (2.0.2)
Requirement already satisfied: protobuf>=3.11.0 in /usr/local/lib64/python3.6/site-packages (from visualdl->paddleocr) (3.19.0)
Requirement already satisfied: pandas in /usr/local/lib64/python3.6/site-packages (from visualdl->paddleocr) (1.1.5)
Requirement already satisfied: bce-python-sdk in /usr/local/lib/python3.6/site-packages (from visualdl->paddleocr) (0.8.62)
Requirement already satisfied: pycodestyle<2.9.0,>=2.8.0 in /usr/local/lib/python3.6/site-packages (from flake8>=3.7.9->visualdl->paddleocr) (2.8.0)
Requirement already satisfied: pyflakes<2.5.0,>=2.4.0 in /usr/local/lib/python3.6/site-packages (from flake8>=3.7.9->visualdl->paddleocr) (2.4.0)
Requirement already satisfied: importlib-metadata<4.3 in /usr/local/lib/python3.6/site-packages (from flake8>=3.7.9->visualdl->paddleocr) (4.2.0)
Requirement already satisfied: mccabe<0.7.0,>=0.6.0 in /usr/local/lib/python3.6/site-packages (from flake8>=3.7.9->visualdl->paddleocr) (0.6.1)
Requirement already satisfied: click>=7.1.2 in /usr/local/lib/python3.6/site-packages (from flask>=1.1.1->visualdl->paddleocr) (8.0.3)
Requirement already satisfied: itsdangerous>=2.0 in /usr/local/lib/python3.6/site-packages (from flask>=1.1.1->visualdl->paddleocr) (2.0.1)
Requirement already satisfied: Werkzeug>=2.0 in /usr/local/lib/python3.6/site-packages (from flask>=1.1.1->visualdl->paddleocr) (2.0.2)
Requirement already satisfied: Jinja2>=3.0 in /usr/local/lib/python3.6/site-packages (from flask>=1.1.1->visualdl->paddleocr) (3.0.2)
Requirement already satisfied: pytz in /usr/lib/python3.6/site-packages (from Flask-Babel>=1.0.0->visualdl->paddleocr) (2017.2)
Requirement already satisfied: Babel>=2.3 in /usr/lib/python3.6/site-packages (from Flask-Babel>=1.0.0->visualdl->paddleocr) (2.5.1)
Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib64/python3.6/site-packages (from matplotlib->imgaug==0.4.0->paddleocr) (1.3.1)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.6/site-packages (from matplotlib->imgaug==0.4.0->paddleocr) (0.10.0)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.3 in /usr/local/lib/python3.6/site-packages (from matplotlib->imgaug==0.4.0->paddleocr) (2.4.7)
Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.6/site-packages (from matplotlib->imgaug==0.4.0->paddleocr) (2.8.2)
Requirement already satisfied: decorator<5,>=4.3 in /usr/local/lib/python3.6/site-packages (from networkx>=2.0->scikit-image==0.17.2->paddleocr) (4.4.2)
Requirement already satisfied: future>=0.6.0 in /usr/local/lib/python3.6/site-packages (from bce-python-sdk->visualdl->paddleocr) (0.18.2)
Requirement already satisfied: pycryptodome>=3.8.0 in /usr/local/lib64/python3.6/site-packages (from bce-python-sdk->visualdl->paddleocr) (3.11.0)
Requirement already satisfied: virtualenv>=20.0.8 in /usr/local/lib/python3.6/site-packages (from pre-commit->visualdl->paddleocr) (20.8.1)
Requirement already satisfied: identify>=1.0.0 in /usr/local/lib/python3.6/site-packages (from pre-commit->visualdl->paddleocr) (2.3.1)
Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib64/python3.6/site-packages (from pre-commit->visualdl->paddleocr) (6.0)
Requirement already satisfied: toml in /usr/local/lib/python3.6/site-packages (from pre-commit->visualdl->paddleocr) (0.10.2)
Requirement already satisfied: nodeenv>=0.11.1 in /usr/local/lib/python3.6/site-packages (from pre-commit->visualdl->paddleocr) (1.6.0)
Requirement already satisfied: importlib-resources in /usr/local/lib/python3.6/site-packages (from pre-commit->visualdl->paddleocr) (5.3.0)
Requirement already satisfied: cfgv>=2.0.0 in /usr/local/lib/python3.6/site-packages (from pre-commit->visualdl->paddleocr) (3.3.1)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /usr/lib/python3.6/site-packages (from requests->premailer->paddleocr) (3.0.4)
Requirement already satisfied: idna<2.8,>=2.5 in /usr/lib/python3.6/site-packages (from requests->premailer->paddleocr) (2.5)
Requirement already satisfied: urllib3<1.25,>=1.21.1 in /usr/lib/python3.6/site-packages (from requests->premailer->paddleocr) (1.24.2)
Requirement already satisfied: typing-extensions>=3.6.4 in /usr/local/lib/python3.6/site-packages (from importlib-metadata<4.3->flake8>=3.7.9->visualdl->paddleocr) (3.10.0.2)
Requirement already satisfied: zipp>=0.5 in /usr/local/lib/python3.6/site-packages (from importlib-metadata<4.3->flake8>=3.7.9->visualdl->paddleocr) (3.6.0)
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib64/python3.6/site-packages (from Jinja2>=3.0->flask>=1.1.1->visualdl->paddleocr) (2.0.1)
Requirement already satisfied: platformdirs<3,>=2 in /usr/local/lib/python3.6/site-packages (from virtualenv>=20.0.8->pre-commit->visualdl->paddleocr) (2.4.0)
Requirement already satisfied: filelock<4,>=3.0.0 in /usr/local/lib/python3.6/site-packages (from virtualenv>=20.0.8->pre-commit->visualdl->paddleocr) (3.3.1)
Requirement already satisfied: backports.entry-points-selectable>=1.0.4 in /usr/local/lib/python3.6/site-packages (from virtualenv>=20.0.8->pre-commit->visualdl->paddleocr) (1.1.0)
Requirement already satisfied: distlib<1,>=0.3.1 in /usr/local/lib/python3.6/site-packages (from virtualenv>=20.0.8->pre-commit->visualdl->paddleocr) (0.3.3)
Requirement already satisfied: dataclasses in /usr/local/lib/python3.6/site-packages (from Werkzeug>=2.0->flask>=1.1.1->visualdl->paddleocr) (0.8)
Using legacy 'setup.py install' for python-Levenshtein, since package 'wheel' is not installed.
Installing collected packages: python-Levenshtein, pyclipper, premailer, openpyxl, opencv-contrib-python, lmdb, imgaug, paddleocr
Running setup.py install for python-Levenshtein ... done
Successfully installed imgaug-0.4.0 lmdb-1.2.1 opencv-contrib-python-4.4.0.46 openpyxl-3.0.9 paddleocr-2.3.0.1 premailer-3.10.0 pyclipper-1.3.0 python-Levenshtein-0.12.2
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
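
The last warning recommends a virtual environment instead of installing as root into the system Python. I did not do that here, but a rough sketch with the standard-library venv module could look like this; the directory name is hypothetical and the bin/python layout assumes Linux.

```python
import os
import venv


def create_env(env_dir, with_pip=True):
    """Create a virtual environment at env_dir and return its interpreter path."""
    venv.create(env_dir, with_pip=with_pip)
    # On Linux the environment's interpreter is placed at <env_dir>/bin/python.
    return os.path.join(env_dir, "bin", "python")


# Example (hypothetical path): create_env("/opt/ocr-env")
```

Packages would then be installed with the pip that belongs to that interpreter, for example /opt/ocr-env/bin/python -m pip install paddleocr.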

I thought the installation was finished, but running paddleocr produced another error:

[root@instance-epknpagk ~]# paddleocr
Traceback (most recent call last):
File "/usr/local/bin/paddleocr", line 5, in <module>
from paddleocr.paddleocr import main
File "/usr/local/lib/python3.6/site-packages/paddleocr/__init__.py", line 15, in <module>
from .paddleocr import *
File "/usr/local/lib/python3.6/site-packages/paddleocr/paddleocr.py", line 21, in <module>
import cv2
File "/usr/local/lib64/python3.6/site-packages/cv2/__init__.py", line 5, in <module>
from .cv2 import *
ImportError: libGL.so.1: cannot open shared object file: No such file or directory

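Something is missing, so it has to be installed. After some searching on Baidu, the fix was yum install mesa-libGL.x86_64. Next came ModuleNotFoundError: No module named 'paddle', for which I installed two packages, pip install paddlepaddle and pip install paddlepaddle-gpu (on a machine without a GPU, paddlepaddle alone should be enough). After that, one more error appeared: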

ModuleNotFoundError: No module named 'urllib3.packages.six'

For this one, uninstall urllib3 first and then reinstall it:

pip uninstall urllib3 -y 
pip install --no-cache-dir -U urllib3
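
Several of these failures are modules that Python cannot find. A small sketch for listing such modules up front, using importlib.util.find_spec; the helper name missing_modules is made up for illustration. Note that this only checks whether a module can be located. It does not catch errors raised while a module is being imported, such as the missing libGL.so.1 that cv2 ran into.

```python
import importlib.util


def missing_modules(names):
    """Return the names from the input that cannot be located for import."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec imports the parent packages of a dotted name and
            # raises if one of them is missing or malformed.
            found = False
        if not found:
            missing.append(name)
    return missing


print(missing_modules(["cv2", "paddle", "urllib3.packages.six"]))
```

An empty list means all three names can be located by the current interpreter.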

With that, running paddleocr is more or less working. Now let's recognize an image. I found a picture of an ID card online:

[root@instance-epknpagk ~]# paddleocr --image_dir 1.png
/usr/lib/python3.6/site-packages/setuptools/depends.py:2: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
import imp
/usr/local/lib64/python3.6/site-packages/paddle/fluid/framework.py:301: UserWarning: You are using GPU version Paddle, but your CUDA device is not set properly. CPU device will be used by default.
"You are using GPU version Paddle, but your CUDA device is not set properly. CPU device will be used by default."
[2021/10/23 14:00:43] root WARNING: version PP-OCRv2 not support cls models, use version PP-OCR instead
Namespace(benchmark=False, cls_batch_num=6, cls_image_shape='3, 48, 192', cls_model_dir='/root/.paddleocr/2.3.0.1/ocr/cls/ch_ppocr_mobile_v2.0_cls_infer', cls_thresh=0.9, cpu_threads=10, det=True, det_algorithm='DB', det_db_box_thresh=0.6, det_db_score_mode='fast', det_db_thresh=0.3, det_db_unclip_ratio=1.5, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2, det_east_score_thresh=0.8, det_limit_side_len=960, det_limit_type='max', det_model_dir='/root/.paddleocr/2.3.0.1/ocr/det/ch/ch_PP-OCRv2_det_infer', det_sast_nms_thresh=0.2, det_sast_polygon=False, det_sast_score_thresh=0.5, drop_score=0.5, e2e_algorithm='PGNet', e2e_char_dict_path='./ppocr/utils/ic15_dict.txt', e2e_limit_side_len=768, e2e_limit_type='max', e2e_model_dir=None, e2e_pgnet_mode='fast', e2e_pgnet_polygon=True, e2e_pgnet_score_thresh=0.5, e2e_pgnet_valid_set='totaltext', enable_mkldnn=False, gpu_mem=500, help='==SUPPRESS==', image_dir='1.png', ir_optim=True, label_list=['0', '180'], lang='ch', layout_path_model='lp://PubLayNet/ppyolov2_r50vd_dcn_365e_publaynet/config', max_batch_size=10, max_text_length=25, min_subgraph_size=15, output='./output/table', precision='fp32', process_id=0, rec=True, rec_algorithm='CRNN', rec_batch_num=6, rec_char_dict_path='/usr/local/lib/python3.6/site-packages/paddleocr/ppocr/utils/ppocr_keys_v1.txt', rec_char_type='ch', rec_image_shape='3, 32, 320', rec_model_dir='/root/.paddleocr/2.3.0.1/ocr/rec/ch/ch_PP-OCRv2_rec_infer', save_log_path='./log_output/', show_log=True, table_char_dict_path=None, table_char_type='en', table_max_len=488, table_model_dir=None, total_process_num=1, type='ocr', use_angle_cls=False, use_dilation=False, use_gpu=True, use_mp=False, use_pdserving=False, use_space_char=True, use_tensorrt=False, version='PP-OCRv2', vis_font_path='./doc/fonts/simfang.ttf', warmup=True)
/bin/sh: nvidia-smi: command not found
Traceback (most recent call last):
File "/usr/local/bin/paddleocr", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.6/site-packages/paddleocr/paddleocr.py", line 442, in main
engine = PaddleOCR(**(args.__dict__))
File "/usr/local/lib/python3.6/site-packages/paddleocr/paddleocr.py", line 314, in __init__
super().__init__(params)
File "/usr/local/lib/python3.6/site-packages/paddleocr/tools/infer/predict_system.py", line 45, in __init__
self.text_detector = predict_det.TextDetector(args)
File "/usr/local/lib/python3.6/site-packages/paddleocr/tools/infer/predict_det.py", line 99, in __init__
args, 'det', logger)
File "/usr/local/lib/python3.6/site-packages/paddleocr/tools/infer/utility.py", line 165, in create_predictor
"Not found GPU in current device. Please check your device or set args.use_gpu as False"
ValueError: Not found GPU in current device. Please check your device or set args.use_gpu as False

This is rather awkward: it says my machine has no GPU. Nothing to be done about that, so I ran it without GPU computation: paddleocr --image_dir 1.png --use_gpu false
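
The log above also contains "nvidia-smi: command not found", which hints at a way to choose the flag automatically. The sketch below builds the command line and turns the GPU on only when nvidia-smi is on the PATH. That is a heuristic of mine, not something PaddleOCR documents, and the helper name paddleocr_command is made up; I am also assuming --use_gpu accepts true in the same way it accepts false.

```python
import shutil


def paddleocr_command(image_path, which=shutil.which):
    """Build the paddleocr CLI call, enabling the GPU only if nvidia-smi exists."""
    # Heuristic assumption: no nvidia-smi on PATH means no usable NVIDIA GPU.
    use_gpu = which("nvidia-smi") is not None
    return ["paddleocr", "--image_dir", image_path,
            "--use_gpu", "true" if use_gpu else "false"]


print(" ".join(paddleocr_command("1.png")))
```

The returned list can be passed to subprocess.run as is.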

[root@instance-epknpagk ~]# paddleocr --image_dir 1.png --use_gpu false
/usr/lib/python3.6/site-packages/setuptools/depends.py:2: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
import imp
/usr/local/lib64/python3.6/site-packages/paddle/fluid/framework.py:301: UserWarning: You are using GPU version Paddle, but your CUDA device is not set properly. CPU device will be used by default.
"You are using GPU version Paddle, but your CUDA device is not set properly. CPU device will be used by default."
[2021/10/23 14:01:48] root WARNING: version PP-OCRv2 not support cls models, use version PP-OCR instead
Namespace(benchmark=False, cls_batch_num=6, cls_image_shape='3, 48, 192', cls_model_dir='/root/.paddleocr/2.3.0.1/ocr/cls/ch_ppocr_mobile_v2.0_cls_infer', cls_thresh=0.9, cpu_threads=10, det=True, det_algorithm='DB', det_db_box_thresh=0.6, det_db_score_mode='fast', det_db_thresh=0.3, det_db_unclip_ratio=1.5, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2, det_east_score_thresh=0.8, det_limit_side_len=960, det_limit_type='max', det_model_dir='/root/.paddleocr/2.3.0.1/ocr/det/ch/ch_PP-OCRv2_det_infer', det_sast_nms_thresh=0.2, det_sast_polygon=False, det_sast_score_thresh=0.5, drop_score=0.5, e2e_algorithm='PGNet', e2e_char_dict_path='./ppocr/utils/ic15_dict.txt', e2e_limit_side_len=768, e2e_limit_type='max', e2e_model_dir=None, e2e_pgnet_mode='fast', e2e_pgnet_polygon=True, e2e_pgnet_score_thresh=0.5, e2e_pgnet_valid_set='totaltext', enable_mkldnn=False, gpu_mem=500, help='==SUPPRESS==', image_dir='1.png', ir_optim=True, label_list=['0', '180'], lang='ch', layout_path_model='lp://PubLayNet/ppyolov2_r50vd_dcn_365e_publaynet/config', max_batch_size=10, max_text_length=25, min_subgraph_size=15, output='./output/table', precision='fp32', process_id=0, rec=True, rec_algorithm='CRNN', rec_batch_num=6, rec_char_dict_path='/usr/local/lib/python3.6/site-packages/paddleocr/ppocr/utils/ppocr_keys_v1.txt', rec_char_type='ch', rec_image_shape='3, 32, 320', rec_model_dir='/root/.paddleocr/2.3.0.1/ocr/rec/ch/ch_PP-OCRv2_rec_infer', save_log_path='./log_output/', show_log=True, table_char_dict_path=None, table_char_type='en', table_max_len=488, table_model_dir=None, total_process_num=1, type='ocr', use_angle_cls=False, use_dilation=False, use_gpu=False, use_mp=False, use_pdserving=False, use_space_char=True, use_tensorrt=False, version='PP-OCRv2', vis_font_path='./doc/fonts/simfang.ttf', warmup=True)
--- Running analysis [ir_graph_build_pass]
--- Running analysis [ir_graph_clean_pass]
--- Running analysis [ir_analysis_pass]
--- Running IR pass [simplify_with_basic_ops_pass]
--- Running IR pass [layer_norm_fuse_pass]
--- Fused 0 subgraphs into layer_norm op.
--- Running IR pass [attention_lstm_fuse_pass]
--- Running IR pass [seqconv_eltadd_relu_fuse_pass]
--- Running IR pass [seqpool_cvm_concat_fuse_pass]
--- Running IR pass [mul_lstm_fuse_pass]
--- Running IR pass [fc_gru_fuse_pass]
--- Running IR pass [mul_gru_fuse_pass]
--- Running IR pass [seq_concat_fc_fuse_pass]
--- Running IR pass [squeeze2_matmul_fuse_pass]
--- Running IR pass [reshape2_matmul_fuse_pass]
--- Running IR pass [flatten2_matmul_fuse_pass]
--- Running IR pass [map_matmul_to_mul_pass]
--- Running IR pass [fc_fuse_pass]
--- Running IR pass [repeated_fc_relu_fuse_pass]
--- Running IR pass [squared_mat_sub_fuse_pass]
--- Running IR pass [conv_bn_fuse_pass]
I1023 14:01:48.296001 31681 graph_pattern_detector.cc:91] --- detected 33 subgraphs
--- Running IR pass [conv_eltwiseadd_bn_fuse_pass]
--- Running IR pass [conv_transpose_bn_fuse_pass]
--- Running IR pass [is_test_pass]
--- Running IR pass [runtime_context_cache_pass]
--- Running analysis [ir_params_sync_among_devices_pass]
--- Running analysis [adjust_cudnn_workspace_size_pass]
--- Running analysis [inference_op_replace_pass]
--- Running analysis [memory_optimize_pass]
I1023 14:01:48.318467 31681 memory_optimize_pass.cc:199] Cluster name : batch_norm_68.tmp_0 size: 1920
I1023 14:01:48.318507 31681 memory_optimize_pass.cc:199] Cluster name : conv2d_185.tmp_0 size: 1382400
I1023 14:01:48.318514 31681 memory_optimize_pass.cc:199] Cluster name : conv2d_150.tmp_0 size: 29491200
I1023 14:01:48.318518 31681 memory_optimize_pass.cc:199] Cluster name : elementwise_add_5 size: 576000
I1023 14:01:48.318522 31681 memory_optimize_pass.cc:199] Cluster name : nearest_interp_v2_3.tmp_0 size: 5529600
I1023 14:01:48.318528 31681 memory_optimize_pass.cc:199] Cluster name : batch_norm_31.tmp_3 size: 29491200
I1023 14:01:48.318536 31681 memory_optimize_pass.cc:199] Cluster name : conv2d_182.tmp_0 size: 22118400
I1023 14:01:48.318540 31681 memory_optimize_pass.cc:199] Cluster name : x size: 11059200
I1023 14:01:48.318544 31681 memory_optimize_pass.cc:199] Cluster name : batch_norm_30.tmp_3 size: 7372800
--- Running analysis [ir_graph_to_program_pass]
I1023 14:01:48.366503 31681 analysis_predictor.cc:636] ======= optimize end =======
I1023 14:01:48.370296 31681 naive_executor.cc:98] --- skip [feed], feed -> x
I1023 14:01:48.372720 31681 naive_executor.cc:98] --- skip [batch_norm_31.tmp_3], fetch -> fetch
--- Running analysis [ir_graph_build_pass]
--- Running analysis [ir_graph_clean_pass]
--- Running analysis [ir_analysis_pass]
--- Running IR pass [simplify_with_basic_ops_pass]
--- Running IR pass [layer_norm_fuse_pass]
--- Fused 0 subgraphs into layer_norm op.
--- Running IR pass [attention_lstm_fuse_pass]
--- Running IR pass [seqconv_eltadd_relu_fuse_pass]
--- Running IR pass [seqpool_cvm_concat_fuse_pass]
--- Running IR pass [mul_lstm_fuse_pass]
--- Running IR pass [fc_gru_fuse_pass]
--- Running IR pass [mul_gru_fuse_pass]
--- Running IR pass [seq_concat_fc_fuse_pass]
--- Running IR pass [squeeze2_matmul_fuse_pass]
--- Running IR pass [reshape2_matmul_fuse_pass]
--- Running IR pass [flatten2_matmul_fuse_pass]
--- Running IR pass [map_matmul_to_mul_pass]
I1023 14:01:48.411522 31681 graph_pattern_detector.cc:91] --- detected 2 subgraphs
--- Running IR pass [fc_fuse_pass]
I1023 14:01:48.412515 31681 graph_pattern_detector.cc:91] --- detected 2 subgraphs
--- Running IR pass [repeated_fc_relu_fuse_pass]
--- Running IR pass [squared_mat_sub_fuse_pass]
--- Running IR pass [conv_bn_fuse_pass]
I1023 14:01:48.421797 31681 graph_pattern_detector.cc:91] --- detected 14 subgraphs
--- Running IR pass [conv_eltwiseadd_bn_fuse_pass]
--- Running IR pass [conv_transpose_bn_fuse_pass]
--- Running IR pass [is_test_pass]
--- Running IR pass [runtime_context_cache_pass]
--- Running analysis [ir_params_sync_among_devices_pass]
--- Running analysis [adjust_cudnn_workspace_size_pass]
--- Running analysis [inference_op_replace_pass]
--- Running analysis [memory_optimize_pass]
I1023 14:01:48.431990 31681 memory_optimize_pass.cc:199] Cluster name : lstm_1._generated_var_0 size: 1
I1023 14:01:48.432016 31681 memory_optimize_pass.cc:199] Cluster name : lstm_1.tmp_3 size: 1
I1023 14:01:48.432021 31681 memory_optimize_pass.cc:199] Cluster name : softmax_0.tmp_0 size: 662500
I1023 14:01:48.432025 31681 memory_optimize_pass.cc:199] Cluster name : batch_norm_33.tmp_2 size: 204800
I1023 14:01:48.432034 31681 memory_optimize_pass.cc:199] Cluster name : x size: 38400
I1023 14:01:48.432039 31681 memory_optimize_pass.cc:199] Cluster name : lstm_1.tmp_2 size: 1024
I1023 14:01:48.432042 31681 memory_optimize_pass.cc:199] Cluster name : student_ctc_head_2.tmp_1 size: 662500
I1023 14:01:48.432046 31681 memory_optimize_pass.cc:199] Cluster name : batch_norm_52.tmp_1 size: 2048
I1023 14:01:48.432050 31681 memory_optimize_pass.cc:199] Cluster name : lstm_1.tmp_1 size: 1024
--- Running analysis [ir_graph_to_program_pass]
I1023 14:01:48.459770 31681 analysis_predictor.cc:636] ======= optimize end =======
I1023 14:01:48.461709 31681 naive_executor.cc:98] --- skip [feed], feed -> x
I1023 14:01:48.463284 31681 naive_executor.cc:98] --- skip [student_ctc_head_2.tmp_1], fetch -> fetch
[2021/10/23 14:01:48] root INFO: **********1.png**********
[2021/10/23 14:01:48] root DEBUG: dt_boxes num : 11, elapse : 0.29398012161254883
[2021/10/23 14:01:49] root DEBUG: rec_res num : 11, elapse : 1.2093861103057861
[2021/10/23 14:01:49] root INFO: [[[157.0, 123.0], [233.0, 125.0], [232.0, 151.0], [156.0, 148.0]], ('卢博汉', 0.99022883)]
[2021/10/23 14:01:49] root INFO: [[[83.0, 170.0], [177.0, 170.0], [177.0, 194.0], [83.0, 194.0]], ('性别男', 0.94476193)]
[2021/10/23 14:01:49] root INFO: [[[208.0, 169.0], [291.0, 173.0], [290.0, 195.0], [207.0, 192.0]], ('民族汉', 0.9764979)]
[2021/10/23 14:01:49] root INFO: [[[80.0, 213.0], [339.0, 216.0], [338.0, 237.0], [80.0, 234.0]], ('出生2006年10月10日', 0.99801433)]
[2021/10/23 14:01:49] root INFO: [[[78.0, 263.0], [132.0, 263.0], [132.0, 280.0], [78.0, 280.0]], ('住址', 0.64965886)]
[2021/10/23 14:01:49] root INFO: [[[144.0, 261.0], [379.0, 264.0], [379.0, 284.0], [144.0, 281.0]], ('江西省南昌市南昌县小蓝', 0.9973393)]
[2021/10/23 14:01:49] root INFO: [[[147.0, 291.0], [380.0, 295.0], [380.0, 315.0], [147.0, 312.0]], ('经济开发区塔田村塔田自', 0.98016495)]
[2021/10/23 14:01:49] root INFO: [[[143.0, 322.0], [234.0, 320.0], [234.0, 344.0], [144.0, 347.0]], ('然村62号', 0.9896429)]
[2021/10/23 14:01:49] root INFO: [[[72.0, 381.0], [237.0, 381.0], [237.0, 398.0], [72.0, 398.0]], ('公民身份号码', 0.9898131)]
[2021/10/23 14:01:49] root INFO: [[[232.0, 380.0], [546.0, 380.0], [546.0, 400.0], [232.0, 400.0]], ('360121200610100056', 0.95592064)]
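
Each result line above carries four corner points of a text box followed by a (text, confidence) pair. For the small app I have in mind, those lines need to become data. A rough sketch, assuming the log format stays exactly as printed here; the function names and the 0.9 threshold are my own choices for illustration.

```python
import ast


def parse_result_line(line):
    """Parse one 'root INFO: [box, (text, score)]' line; return None otherwise."""
    marker = "INFO: "
    if marker not in line:
        return None
    payload = line.split(marker, 1)[1].strip()
    try:
        box, (text, score) = ast.literal_eval(payload)
    except (SyntaxError, ValueError, TypeError):
        # Not a result line, e.g. the '**********1.png**********' banner.
        return None
    return {"box": box, "text": text, "score": float(score)}


def confident_text(lines, min_score=0.9):
    """Return the recognized strings whose confidence is at least min_score."""
    parsed = (parse_result_line(line) for line in lines)
    return [p["text"] for p in parsed if p and p["score"] >= min_score]


log_lines = [
    "[2021/10/23 14:01:48] root INFO: **********1.png**********",
    "[2021/10/23 14:01:49] root INFO: [[[208.0, 169.0], [291.0, 173.0], "
    "[290.0, 195.0], [207.0, 192.0]], ('民族汉', 0.9764979)]",
    "[2021/10/23 14:01:49] root INFO: [[[78.0, 263.0], [132.0, 263.0], "
    "[132.0, 280.0], [78.0, 280.0]], ('住址', 0.64965886)]",
]
print(confident_text(log_lines))  # ['民族汉']
```

Parsing log text is fragile; calling PaddleOCR from Python and using its return value directly would probably be the sturdier route.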

The recognition rate is quite high. Next, I took a screenshot of an article from the web to see how it does:

[root@instance-epknpagk ~]# paddleocr --image_dir ocr_20211023140320.png --use_gpu false
/usr/lib/python3.6/site-packages/setuptools/depends.py:2: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
import imp
/usr/local/lib64/python3.6/site-packages/paddle/fluid/framework.py:301: UserWarning: You are using GPU version Paddle, but your CUDA device is not set properly. CPU device will be used by default.
"You are using GPU version Paddle, but your CUDA device is not set properly. CPU device will be used by default."
[2021/10/23 14:05:27] root WARNING: version PP-OCRv2 not support cls models, use version PP-OCR instead
Namespace(benchmark=False, cls_batch_num=6, cls_image_shape='3, 48, 192', cls_model_dir='/root/.paddleocr/2.3.0.1/ocr/cls/ch_ppocr_mobile_v2.0_cls_infer', cls_thresh=0.9, cpu_threads=10, det=True, det_algorithm='DB', det_db_box_thresh=0.6, det_db_score_mode='fast', det_db_thresh=0.3, det_db_unclip_ratio=1.5, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2, det_east_score_thresh=0.8, det_limit_side_len=960, det_limit_type='max', det_model_dir='/root/.paddleocr/2.3.0.1/ocr/det/ch/ch_PP-OCRv2_det_infer', det_sast_nms_thresh=0.2, det_sast_polygon=False, det_sast_score_thresh=0.5, drop_score=0.5, e2e_algorithm='PGNet', e2e_char_dict_path='./ppocr/utils/ic15_dict.txt', e2e_limit_side_len=768, e2e_limit_type='max', e2e_model_dir=None, e2e_pgnet_mode='fast', e2e_pgnet_polygon=True, e2e_pgnet_score_thresh=0.5, e2e_pgnet_valid_set='totaltext', enable_mkldnn=False, gpu_mem=500, help='==SUPPRESS==', image_dir='ocr_20211023140320.png', ir_optim=True, label_list=['0', '180'], lang='ch', layout_path_model='lp://PubLayNet/ppyolov2_r50vd_dcn_365e_publaynet/config', max_batch_size=10, max_text_length=25, min_subgraph_size=15, output='./output/table', precision='fp32', process_id=0, rec=True, rec_algorithm='CRNN', rec_batch_num=6, rec_char_dict_path='/usr/local/lib/python3.6/site-packages/paddleocr/ppocr/utils/ppocr_keys_v1.txt', rec_char_type='ch', rec_image_shape='3, 32, 320', rec_model_dir='/root/.paddleocr/2.3.0.1/ocr/rec/ch/ch_PP-OCRv2_rec_infer', save_log_path='./log_output/', show_log=True, table_char_dict_path=None, table_char_type='en', table_max_len=488, table_model_dir=None, total_process_num=1, type='ocr', use_angle_cls=False, use_dilation=False, use_gpu=False, use_mp=False, use_pdserving=False, use_space_char=True, use_tensorrt=False, version='PP-OCRv2', vis_font_path='./doc/fonts/simfang.ttf', warmup=True)
--- Running analysis [ir_graph_build_pass]
--- Running analysis [ir_graph_clean_pass]
--- Running analysis [ir_analysis_pass]
--- Running IR pass [simplify_with_basic_ops_pass]
--- Running IR pass [layer_norm_fuse_pass]
--- Fused 0 subgraphs into layer_norm op.
--- Running IR pass [attention_lstm_fuse_pass]
--- Running IR pass [seqconv_eltadd_relu_fuse_pass]
--- Running IR pass [seqpool_cvm_concat_fuse_pass]
--- Running IR pass [mul_lstm_fuse_pass]
--- Running IR pass [fc_gru_fuse_pass]
--- Running IR pass [mul_gru_fuse_pass]
--- Running IR pass [seq_concat_fc_fuse_pass]
--- Running IR pass [squeeze2_matmul_fuse_pass]
--- Running IR pass [reshape2_matmul_fuse_pass]
--- Running IR pass [flatten2_matmul_fuse_pass]
--- Running IR pass [map_matmul_to_mul_pass]
--- Running IR pass [fc_fuse_pass]
--- Running IR pass [repeated_fc_relu_fuse_pass]
--- Running IR pass [squared_mat_sub_fuse_pass]
--- Running IR pass [conv_bn_fuse_pass]
I1023 14:05:27.458253 31860 graph_pattern_detector.cc:91] --- detected 33 subgraphs
--- Running IR pass [conv_eltwiseadd_bn_fuse_pass]
--- Running IR pass [conv_transpose_bn_fuse_pass]
--- Running IR pass [is_test_pass]
--- Running IR pass [runtime_context_cache_pass]
--- Running analysis [ir_params_sync_among_devices_pass]
--- Running analysis [adjust_cudnn_workspace_size_pass]
--- Running analysis [inference_op_replace_pass]
--- Running analysis [memory_optimize_pass]
I1023 14:05:27.481014 31860 memory_optimize_pass.cc:199] Cluster name : batch_norm_68.tmp_0 size: 1920
I1023 14:05:27.481048 31860 memory_optimize_pass.cc:199] Cluster name : conv2d_185.tmp_0 size: 1382400
I1023 14:05:27.481055 31860 memory_optimize_pass.cc:199] Cluster name : conv2d_150.tmp_0 size: 29491200
I1023 14:05:27.481062 31860 memory_optimize_pass.cc:199] Cluster name : elementwise_add_5 size: 576000
I1023 14:05:27.481066 31860 memory_optimize_pass.cc:199] Cluster name : nearest_interp_v2_3.tmp_0 size: 5529600
I1023 14:05:27.481071 31860 memory_optimize_pass.cc:199] Cluster name : batch_norm_31.tmp_3 size: 29491200
I1023 14:05:27.481076 31860 memory_optimize_pass.cc:199] Cluster name : conv2d_182.tmp_0 size: 22118400
I1023 14:05:27.481079 31860 memory_optimize_pass.cc:199] Cluster name : x size: 11059200
I1023 14:05:27.481083 31860 memory_optimize_pass.cc:199] Cluster name : batch_norm_30.tmp_3 size: 7372800
--- Running analysis [ir_graph_to_program_pass]
I1023 14:05:27.529305 31860 analysis_predictor.cc:636] ======= optimize end =======
I1023 14:05:27.533017 31860 naive_executor.cc:98] --- skip [feed], feed -> x
I1023 14:05:27.535477 31860 naive_executor.cc:98] --- skip [batch_norm_31.tmp_3], fetch -> fetch
--- Running analysis [ir_graph_build_pass]
--- Running analysis [ir_graph_clean_pass]
--- Running analysis [ir_analysis_pass]
--- Running IR pass [simplify_with_basic_ops_pass]
--- Running IR pass [layer_norm_fuse_pass]
--- Fused 0 subgraphs into layer_norm op.
--- Running IR pass [attention_lstm_fuse_pass]
--- Running IR pass [seqconv_eltadd_relu_fuse_pass]
--- Running IR pass [seqpool_cvm_concat_fuse_pass]
--- Running IR pass [mul_lstm_fuse_pass]
--- Running IR pass [fc_gru_fuse_pass]
--- Running IR pass [mul_gru_fuse_pass]
--- Running IR pass [seq_concat_fc_fuse_pass]
--- Running IR pass [squeeze2_matmul_fuse_pass]
--- Running IR pass [reshape2_matmul_fuse_pass]
--- Running IR pass [flatten2_matmul_fuse_pass]
--- Running IR pass [map_matmul_to_mul_pass]
I1023 14:05:27.574136 31860 graph_pattern_detector.cc:91] --- detected 2 subgraphs
--- Running IR pass [fc_fuse_pass]
I1023 14:05:27.575150 31860 graph_pattern_detector.cc:91] --- detected 2 subgraphs
--- Running IR pass [repeated_fc_relu_fuse_pass]
--- Running IR pass [squared_mat_sub_fuse_pass]
--- Running IR pass [conv_bn_fuse_pass]
I1023 14:05:27.584664 31860 graph_pattern_detector.cc:91] --- detected 14 subgraphs
--- Running IR pass [conv_eltwiseadd_bn_fuse_pass]
--- Running IR pass [conv_transpose_bn_fuse_pass]
--- Running IR pass [is_test_pass]
--- Running IR pass [runtime_context_cache_pass]
--- Running analysis [ir_params_sync_among_devices_pass]
--- Running analysis [adjust_cudnn_workspace_size_pass]
--- Running analysis [inference_op_replace_pass]
--- Running analysis [memory_optimize_pass]
I1023 14:05:27.595206 31860 memory_optimize_pass.cc:199] Cluster name : lstm_1._generated_var_0 size: 1
I1023 14:05:27.595235 31860 memory_optimize_pass.cc:199] Cluster name : lstm_1.tmp_3 size: 1
I1023 14:05:27.595240 31860 memory_optimize_pass.cc:199] Cluster name : softmax_0.tmp_0 size: 662500
I1023 14:05:27.595247 31860 memory_optimize_pass.cc:199] Cluster name : batch_norm_33.tmp_2 size: 204800
I1023 14:05:27.595254 31860 memory_optimize_pass.cc:199] Cluster name : x size: 38400
I1023 14:05:27.595258 31860 memory_optimize_pass.cc:199] Cluster name : lstm_1.tmp_2 size: 1024
I1023 14:05:27.595263 31860 memory_optimize_pass.cc:199] Cluster name : student_ctc_head_2.tmp_1 size: 662500
I1023 14:05:27.595268 31860 memory_optimize_pass.cc:199] Cluster name : batch_norm_52.tmp_1 size: 2048
I1023 14:05:27.595271 31860 memory_optimize_pass.cc:199] Cluster name : lstm_1.tmp_1 size: 1024
--- Running analysis [ir_graph_to_program_pass]
I1023 14:05:27.623514 31860 analysis_predictor.cc:636] ======= optimize end =======
I1023 14:05:27.625537 31860 naive_executor.cc:98] --- skip [feed], feed -> x
I1023 14:05:27.627077 31860 naive_executor.cc:98] --- skip [student_ctc_head_2.tmp_1], fetch -> fetch
[2021/10/23 14:05:27] root INFO: **********ocr_20211023140320.png**********
[2021/10/23 14:05:28] root DEBUG: dt_boxes num : 15, elapse : 0.39180588722229004
[2021/10/23 14:05:31] root DEBUG: rec_res num : 15, elapse : 3.1125948429107666
[2021/10/23 14:05:31] root INFO: [[[27.0, 21.0], [76.0, 21.0], [76.0, 40.0], [27.0, 40.0]], ('学而篇', 0.92950207)]
[2021/10/23 14:05:31] root INFO: [[[30.0, 77.0], [76.0, 77.0], [76.0, 92.0], [30.0, 92.0]], ('子日:', 0.9615998)]
[2021/10/23 14:05:31] root INFO: [[[83.0, 78.0], [657.0, 78.0], [657.0, 92.0], [83.0, 92.0]], ('"学而时习之,不亦说乎?有朋自远方来,不亦乐乎?人不知而不温,不亦君子乎?', 0.9332169)]
[2021/10/23 14:05:31] root INFO: [[[27.0, 121.0], [910.0, 122.0], [910.0, 139.0], [27.0, 138.0]], ('有子日:“其为人也孝弟而好犯上者,鲜矣;不好犯上而好作乱者,未之有也。君子务本,本立而道生。孝弟也者,其为仁之本', 0.95606136)]
[2021/10/23 14:05:31] root INFO: [[[27.0, 150.0], [56.0, 150.0], [56.0, 170.0], [27.0, 170.0]], ('与!"', 0.88232946)]
[2021/10/23 14:05:31] root INFO: [[[30.0, 200.0], [224.0, 200.0], [224.0, 213.0], [30.0, 213.0]], ('子日:"巧言令色,鲜仁!', 0.93246955)]
[2021/10/23 14:05:31] root INFO: [[[30.0, 246.0], [522.0, 246.0], [522.0, 259.0], [30.0, 259.0]], ('曾子日“吾日三省昔影:为人谋而不思平7与朋友交而不信平?传不习平?', 0.8734449)]
[2021/10/23 14:05:31] root INFO: [[[30.0, 291.0], [419.0, 291.0], [419.0, 305.0], [30.0, 305.0]], ('子日:“道千乘之国.敬事而信,节用而爱人,使民以时。', 0.93525136)]
[2021/10/23 14:05:31] root INFO: [[[28.0, 335.0], [603.0, 336.0], [603.0, 353.0], [28.0, 352.0]], ('子日:“弟子入则孝,出则弟,谨而信,泛爱众,而亲仁。行有余力,则以学文。', 0.99160206)]
[2021/10/23 14:05:31] root INFO: [[[29.0, 382.0], [87.0, 382.0], [87.0, 399.0], [29.0, 399.0]], ('子夏日:', 0.96413237)]
[2021/10/23 14:05:31] root INFO: [[[94.0, 381.0], [803.0, 383.0], [803.0, 400.0], [93.0, 398.0]], ('“贤贤易色;事父母,能竭其力;事君,能致其身;与朋友交,言而有信。虽日未学,吾必谓之学矣。', 0.9826589)]
[2021/10/23 14:05:31] root INFO: [[[30.0, 429.0], [76.0, 429.0], [76.0, 443.0], [30.0, 443.0]], ('子日:', 0.9726858)]
[2021/10/23 14:05:31] root INFO: [[[77.0, 427.0], [567.0, 428.0], [567.0, 444.0], [77.0, 443.0]], ('“君子不重则不威,学则不固。主忠信,无友不如己者,过则勿惮改。', 0.9420074)]
[2021/10/23 14:05:31] root INFO: [[[29.0, 474.0], [84.0, 474.0], [84.0, 490.0], [29.0, 490.0]], ('曾子日:', 0.97829574)]
[2021/10/23 14:05:31] root INFO: [[[102.0, 476.0], [274.0, 476.0], [274.0, 489.0], [102.0, 489.0]], ('“绎追远,民德归厚矣,', 0.86066586)]

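The recognition accuracy is high, but some characters are still lost or misread. Raw results in this form are also hard for us to read, so we need another component to help format the output.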
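For plain text, a few lines of Python already make the log above easier to read. This is only a rough sketch: it assumes each result is a four-point box plus a `(text, score)` tuple, as printed in the log, and the function name `group_into_rows` and the `row_tolerance` parameter are my own, not part of PaddleOCR. Boxes whose top edges are within a few pixels of each other are treated as one visual row and joined left to right.

```python
def group_into_rows(result, row_tolerance=10.0, min_score=0.0):
    """Join OCR boxes that sit on the same visual row into one string.

    `result` is a list of [box, (text, score)] items, where `box` is a list
    of four [x, y] points. `row_tolerance` (pixels) is a hypothetical knob.
    """
    items = []
    for box, (text, score) in result:
        if score < min_score:
            continue  # drop results below the chosen confidence
        top = min(y for _, y in box)
        left = min(x for x, _ in box)
        items.append((top, left, text))
    items.sort()  # top to bottom, then left to right

    rows = []  # each entry: (top of the row's first box, [(left, text), ...])
    for top, left, text in items:
        if rows and top - rows[-1][0] <= row_tolerance:
            rows[-1][1].append((left, text))
        else:
            rows.append((top, [(left, text)]))
    return ["".join(text for _, text in sorted(cells)) for _, cells in rows]


# Three entries copied from the log above (misread characters kept as-is).
sample = [
    [[[77.0, 427.0], [567.0, 428.0], [567.0, 444.0], [77.0, 443.0]],
     ('“君子不重则不威,学则不固。主忠信,无友不如己者,过则勿惮改。', 0.9420074)],
    [[[27.0, 21.0], [76.0, 21.0], [76.0, 40.0], [27.0, 40.0]], ('学而篇', 0.92950207)],
    [[[30.0, 429.0], [76.0, 429.0], [76.0, 443.0], [30.0, 443.0]], ('子日:', 0.9726858)],
]

lines = group_into_rows(sample)
for line in lines:
    print(line)
```

This only handles simple horizontal text. For layout-aware results (regions such as figures or tables), the structure mode used below is the better fit.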

pip install -U https://paddleocr.bj.bcebos.com/whl/layoutparser-0.0.0-py3-none-any.whl
paddleocr --image_dir=1.png --use_gpu false --type=structure

[2021/10/23 14:17:37] root INFO: {'type': 'Figure', 'bbox': [0, 1, 694, 519], 'res': ([[157.0, 124.0, 233.0, 124.0, 233.0, 150.0, 157.0, 150.0], [85.0, 171.0, 175.0, 171.0, 175.0, 192.0, 85.0, 192.0], [194.0, 169.0, 290.0, 173.0, 289.0, 196.0, 193.0, 191.0], [80.0, 213.0, 338.0, 217.0, 338.0, 237.0, 80.0, 234.0], [78.0, 263.0, 146.0, 263.0, 146.0, 280.0, 78.0, 280.0], [144.0, 261.0, 380.0, 264.0, 379.0, 284.0, 144.0, 281.0], [147.0, 291.0, 380.0, 293.0, 379.0, 315.0, 147.0, 313.0], [144.0, 322.0, 232.0, 322.0, 232.0, 343.0, 144.0, 343.0], [72.0, 379.0, 238.0, 379.0, 238.0, 400.0, 72.0, 400.0], [231.0, 378.0, 547.0, 378.0, 547.0, 402.0, 231.0, 402.0]], [('卢博汉', 0.991118), ('性别男', 0.9331177), ('民族汉', 0.804858), ('出生2006年10月10日', 0.99847186), ('往址:', 0.70957905), ('江西省南昌市南昌县小蓝', 0.9975168), ('经济开发区塔田村塔田自', 0.9736971), ('然村62号', 0.99629766), ('公民身份号码', 0.9932396), ('360121200610100056', 0.9873691)])}
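
In the structure output above, `res` holds two parallel lists: the boxes as flat lists of eight numbers, and the `(text, score)` pairs. A small sketch of pairing them up might look like this. The helper name `pair_structure_result` is hypothetical, and the sample is trimmed to two entries from the output above.

```python
def pair_structure_result(region):
    """Pair each box in a structure region with its recognized text.

    `region` is a dict shaped like the log line above:
    {'type': ..., 'bbox': ..., 'res': (boxes, texts)}.
    """
    boxes, texts = region["res"]
    records = []
    for flat_box, (text, score) in zip(boxes, texts):
        xs = flat_box[0::2]  # x1, x2, x3, x4
        ys = flat_box[1::2]  # y1, y2, y3, y4
        records.append({
            "text": text,
            "score": score,
            # axis-aligned bounding box: [left, top, right, bottom]
            "bbox": [min(xs), min(ys), max(xs), max(ys)],
        })
    return records


# Trimmed from the output above: only two of the ten entries are kept.
region = {
    "type": "Figure",
    "bbox": [0, 1, 694, 519],
    "res": (
        [[85.0, 171.0, 175.0, 171.0, 175.0, 192.0, 85.0, 192.0],
         [194.0, 169.0, 290.0, 173.0, 289.0, 196.0, 193.0, 191.0]],
        [("性别男", 0.9331177), ("民族汉", 0.804858)],
    ),
}

records = pair_structure_result(region)
for rec in records:
    print(rec["text"], rec["score"], rec["bbox"])
```

Keeping the score next to each text makes it easy to flag low-confidence fields for a manual check.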

https://blogs.52fx.biz/posts/184948635.html

Author: eyiadmin

Published: 2021-10-20

Updated: 2024-05-31
