
Replacing a TrueType font with PyMuPDF

hbghlyj Posted at 2023-6-27 18:30:23
font-replacement: Using PyMuPDF v1.17.6 or later, replacing fonts in an existing PDF becomes possible.
Take page 2 of this novel as an example:
Attachment: page2.pdf (4.13 KB, Downloads: 31)
Write the following content to page2.pdf-fontnames.json:
    [
      {
        "oldfont": [
          "SimHei,Regular"
        ],
        "newfont": "D:/华文黑体.ttf"
      }
    ]
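The mapping file can also be generated from Python instead of being typed by hand. This is only a minimal sketch: the helper name write_fontnames is hypothetical, and the demo writes into a temporary directory rather than next to the real page2.pdf. The JSON layout and the font names are the ones shown above.

```python
import json
import os
import tempfile


def write_fontnames(pdf_path, old_fonts, new_font):
    """Write '<pdf_path>-fontnames.json' mapping old font names to one replacement.

    Hypothetical helper: it only reproduces the JSON layout shown in the post.
    """
    mapping = [{"oldfont": list(old_fonts), "newfont": new_font}]
    json_path = pdf_path + "-fontnames.json"
    with open(json_path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps the Chinese font file name readable in the file
        json.dump(mapping, f, ensure_ascii=False, indent=2)
    return json_path


# Demo in a temporary directory so the working directory is left untouched.
demo_dir = tempfile.mkdtemp()
json_path = write_fontnames(
    os.path.join(demo_dir, "page2.pdf"),
    ["SimHei,Regular"],
    "D:/华文黑体.ttf",
)

# Read the file back to confirm it round-trips.
with open(json_path, encoding="utf-8") as f:
    loaded = json.load(f)
print(loaded)
```

In real use the first argument would be the path of the PDF itself, so that the JSON file ends up beside it under the name the script expects.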
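Text copied out of the original PDF contains spaces between the characters (although the PDF itself does not display any spaces).
To remove these spaces, find the two occurrences of the following line in repl-font.py (the script that is run as python repl-font.py page2.pdf):
    text = span["text"].replace(chr(0xFFFD), chr(0xB6))
and change each of them to
    text = span["text"].replace(chr(0xFFFD), chr(0xB6)).replace(' ','')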
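To see what the modified line does in isolation, here is a small sketch that applies the same chain of replacements to a span-like dict. The function name clean_span_text and the sample span are made up for illustration and are not part of repl-font.py.

```python
def clean_span_text(text):
    """Apply the same replacements as the modified line in the post.

    U+FFFD (the replacement character) becomes U+00B6 (pilcrow), as in the
    original line; the added .replace(' ', '') then drops every ASCII space.
    """
    return text.replace(chr(0xFFFD), chr(0xB6)).replace(" ", "")


# Hypothetical span dict, shaped like the span["text"] access in the post.
sample_span = {"text": "警 世 通 言\ufffd"}
cleaned = clean_span_text(sample_span["text"])
print(cleaned)
```

Note that this removes all ASCII spaces, including legitimate ones between Latin words, which is acceptable for a page that contains only Chinese text.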

Python: How can I replace full-width characters with half-width characters?
Running python repl-font.py page2.pdf removes the spaces from the text and replaces the old font with 华文黑体 (STHeiti).
The stdout is shown below:
Processing PDF 'page2.pdf' with 1 page.

Phase 1: Analyze use of fonts.
Font replacement overview:
  SimHei,Regular replaced by: STHeiti Regular.

Attachment: page2-new.pdf (62.14 KB, Downloads: 31)
The file size increased from 4 KB to 62 KB.
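For a rough sense of scale, the growth factor can be computed from the two attachment sizes quoted above. This is just an arithmetic sketch; the helper size_ratio is hypothetical and is defined here only to show how the same comparison could be made on local files.

```python
import os


def size_ratio(old_path, new_path):
    """Return how many times larger new_path is than old_path (hypothetical helper)."""
    return os.path.getsize(new_path) / os.path.getsize(old_path)


# Sizes in KB as reported for page2.pdf and page2-new.pdf in the post.
old_kb = 4.13
new_kb = 62.14
reported_ratio = new_kb / old_kb
print(f"page2-new.pdf is about {reported_ratio:.0f} times the size of page2.pdf")
```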

 Author| hbghlyj Posted at 2023-6-27 18:39:52
Last edited by hbghlyj at 2023-6-28 15:34:00
Original PDF vs. after font replacement:
Screenshot 2023-06-27 113938.png Screenshot 2023-06-27 113904.png

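In the original PDF, the opening bracket of 〔明〕 has little blank space, so the character inside sits off to the left; after the font replacement, the blank space inside the brackets of 〔明〕 is symmetric.
The book title marks 《》 became smaller.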


Text copied from the original PDF: 《 十 大 古 典 白 话 短 篇 小 说 》 丛 书警 世 通 言
After the spaces are removed,
text copied from the new PDF: 《十大古典白话短篇小说》丛书警世通言
The displayed text is unchanged.
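If a page mixes Chinese and Latin text, removing every space would also glue English words together. One possible variation, which is not what the post does, is to drop only the spaces that sit between two CJK characters or CJK punctuation marks. The sketch below uses a regular expression for this; the character ranges and the name strip_cjk_spaces are assumptions chosen for illustration.

```python
import re

# Assumed ranges: CJK symbols and punctuation, CJK unified ideographs,
# and halfwidth/fullwidth forms. Other scripts would need more ranges.
_CJK = r"[\u3000-\u303f\u4e00-\u9fff\uff00-\uffef]"

# Lookarounds do not consume the neighbouring characters, so runs such as
# "十 大 古" are handled in a single pass.
_SPACE_BETWEEN_CJK = re.compile(rf"(?<={_CJK}) +(?={_CJK})")


def strip_cjk_spaces(text):
    """Remove ASCII spaces only where both neighbours are CJK characters."""
    return _SPACE_BETWEEN_CJK.sub("", text)


before = "《 十 大 古 典 白 话 短 篇 小 说 》 丛 书警 世 通 言"
after = strip_cjk_spaces(before)
mixed = strip_cjk_spaces("PyMuPDF v1.17.6 替 换 字 体")
print(after)
print(mixed)
```

On the sample string from the post this gives the same result as removing all spaces, while the spaces around the Latin words in the mixed example are kept.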
