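Last edited by hbghlyj on 2022-8-11 19:11.
On Linux the Ghostscript command is gs; on Windows it is gswin32c.exe (gswin64c.exe for 64-bit builds). The examples below are for Linux.
First, use wget to download the PDF file from the Internet into the current directory: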
- wget http://pcjohnson.net/312/Sqrt.pdf
Extract page 9 into outfile_9.pdf:
- gs -sDEVICE=pdfwrite -dNOPAUSE -dBATCH -dSAFER -dFirstPage=9 -dLastPage=9 -sOutputFile=outfile_9.pdf Sqrt.pdf
The result is a single-page PDF, outfile_9.pdf.
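The same switches also work for a range of pages. Here is a minimal sketch that only prints the command instead of running it. The helper name extract_cmd and the outfile_FIRST-LAST.pdf naming are my own assumptions, and file names containing spaces are not handled.

```shell
#!/bin/sh
# extract_cmd FIRST LAST INPUT   (hypothetical helper, not part of Ghostscript)
# Prints the gs command that copies pages FIRST..LAST of INPUT to a new PDF.
extract_cmd() {
    first=$1; last=$2; input=$3
    # Page numbers start at 1 and the range must not be reversed.
    if [ "$first" -lt 1 ] || [ "$first" -gt "$last" ]; then
        echo "bad page range: $first-$last" >&2
        return 1
    fi
    # Assumed naming scheme: outfile_9.pdf for one page, outfile_9-11.pdf for a range.
    if [ "$first" -eq "$last" ]; then
        out="outfile_${first}.pdf"
    else
        out="outfile_${first}-${last}.pdf"
    fi
    echo "gs -sDEVICE=pdfwrite -dNOPAUSE -dBATCH -dSAFER" \
         "-dFirstPage=$first -dLastPage=$last -sOutputFile=$out $input"
}

extract_cmd 9 9 Sqrt.pdf     # the single-page command used above
extract_cmd 9 11 Sqrt.pdf    # pages 9 through 11
```

To actually run a printed command, pipe it to the shell, for example extract_cmd 9 9 Sqrt.pdf | sh.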
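Next, crop the page. The four numbers are given in this order: the distance from the original left edge to the new left edge (121), from the original bottom edge to the new bottom edge (120), from the original left edge to the new right edge (523), and from the original bottom edge to the new top edge (359). All values are in PostScript points, i.e. 1/72 inch.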
- gs -o cropped.pdf -sDEVICE=pdfwrite -c "[/CropBox [121 120 523 359]" -c " /PAGES pdfmark" -f outfile_9.pdf
Reference: stackoverflow.com/questions/6183479/cropping-a-pdf-using-ghostscript-9-01
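Because both the right and the top values are measured from the original lower-left corner, the size of the cropped page is the difference of the coordinates. A small sketch of that arithmetic follows. The function name crop_size is my own, and integer coordinates are assumed.

```shell
#!/bin/sh
# crop_size LLX LLY URX URY   (hypothetical helper)
# Prints the width and height of a CropBox in points and in inches.
crop_size() {
    llx=$1; lly=$2; urx=$3; ury=$4
    width=$((urx - llx))     # horizontal extent in points
    height=$((ury - lly))    # vertical extent in points
    # One inch is 72 PostScript points.
    awk -v w="$width" -v h="$height" \
        'BEGIN { printf "%d x %d pt (%.2f x %.2f in)\n", w, h, w / 72, h / 72 }'
}

crop_size 121 120 523 359    # prints: 402 x 239 pt (5.58 x 3.32 in)
```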
Then convert the PDF to SVG:
- pdftocairo -svg cropped.pdf Sqrt.svg
Attachment: Sqrt.svg (38.23 KB)
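Note 1.
If the numbers are given in the wrong order, gs does not report an error, but it writes an empty PDF (only a thin slit is left). pdfcrop, by contrast, does complain: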
!!!Warning: Empty Bounding Box is returned by Ghostscript!
!!! Page 1: 121 120 121 432
!!! Either there is a problem with the page or with Ghostscript.
!!! Recovery is tried by embedding the page in its original size.
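Since gs stays silent in this case, one option is to check the box before calling it. This is a minimal sketch, assuming integer coordinates. The helper name cropbox_mark is hypothetical, and it only builds the pdfmark string without running gs.

```shell
#!/bin/sh
# cropbox_mark LLX LLY URX URY   (hypothetical helper)
# Prints the pdfmark string for the box, or fails if the box would be empty.
cropbox_mark() {
    # A usable box needs left < right and bottom < top.
    if [ "$1" -ge "$3" ] || [ "$2" -ge "$4" ]; then
        echo "empty CropBox: need LLX < URX and LLY < URY, got $*" >&2
        return 1
    fi
    printf '[/CropBox [%s %s %s %s] /PAGES pdfmark\n' "$1" "$2" "$3" "$4"
}

cropbox_mark 121 120 523 359                      # valid box
cropbox_mark 121 120 121 432 || echo "rejected"   # zero-width box, as in the warning above
```

The printed string can be passed in a single -c argument, for example gs -o cropped.pdf -sDEVICE=pdfwrite -c "$(cropbox_mark 121 120 523 359)" -f outfile_9.pdf.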
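Note 2.
In addition, gs can separate the image, vector, and text content of a PDF. In this example, the coordinate axes, the function curve, the fraction bars, and the divider lines are all vector graphics.
Reference: stackoverflow.com/questions/29657335/how-can-i-remove-all-images-from-a-pdf/37858893#37858893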
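The separation relies on the pdfwrite switches -dFILTERIMAGE, -dFILTERVECTOR and -dFILTERTEXT, each of which drops one kind of content. As far as I know they need Ghostscript 9.23 or later, so check your version. The sketch below only prints the commands as a dry run. The helper name filter_flags and the only_KIND.pdf output names are my own assumptions.

```shell
#!/bin/sh
# filter_flags KIND   (hypothetical helper)
# Prints the Ghostscript switches that drop everything except KIND.
filter_flags() {
    case "$1" in
        image)  echo "-dFILTERVECTOR -dFILTERTEXT" ;;
        vector) echo "-dFILTERIMAGE -dFILTERTEXT" ;;
        text)   echo "-dFILTERIMAGE -dFILTERVECTOR" ;;
        *)      echo "unknown kind: $1" >&2; return 1 ;;
    esac
}

for kind in image vector text; do
    # Dry run: remove the leading "echo" to execute gs.
    # The unquoted substitution is intentional so the two flags split into words.
    echo gs -o "only_$kind.pdf" -sDEVICE=pdfwrite $(filter_flags "$kind") cropped.pdf
done
```

For the page used here, the vector-only output should keep the axes, the curve and the rule lines while dropping the text.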
Background on the PDF format:
Many PDF documents may contain vector artwork or line-art. However, due to the nature of PDF, it is not always possible for a computer program to determine where on a page such artwork occurs since each page is stored as a general mix of text, images and line-art.