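An earlier post covered deploying Puppeteer in Docker to generate screenshots; see https://blog.terrynow.com/2022/10/29/use-node-puppeteer-docker-mingalevme-screenshoter-as-service-to-fullscreen-screenshot-and-support-chinese/. That setup cannot produce PDFs, though, so this post introduces another Docker image that does.
The image is at: https://hub.docker.com/r/hmtx/puppeteer-pdf
As before, a container deployed with the default settings does not support generating PDFs from Chinese web pages; the text comes out garbled. You need to download Chinese fonts, place them under /opt/fonts on the server, and map that directory into the container when you deploy it.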
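- Prepare Chinese fonts
You can download Alibaba PuHuiTi 2.0, which has no copyright issues (safe for commercial use): https://done.alibabadesign.com/puhuiti2.0. Of course, you can also use any other font you prefer.
After downloading, unzip the archive and copy the fonts to the Linux server, as follows (using /opt/fonts as the example):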
[root@localhost fonts]# pwd
/opt/fonts
[root@localhost fonts]# ll
total 61048
-rw-r--r-- 1 root root 2035700 Apr 30  2021 AlibabaPuHuiTi-2-105-Heavy.ttf
-rw-r--r-- 1 root root 2022644 Apr 30  2021 AlibabaPuHuiTi-2-115-Black.ttf
-rw-r--r-- 1 root root 8465416 Apr 30  2021 AlibabaPuHuiTi-2-35-Thin.ttf
-rw-r--r-- 1 root root 8476208 Apr 30  2021 AlibabaPuHuiTi-2-45-Light.ttf
-rw-r--r-- 1 root root 8449680 Apr 30  2021 AlibabaPuHuiTi-2-55-Regular.ttf
-rw-r--r-- 1 root root 8347080 Apr 30  2021 AlibabaPuHuiTi-2-65-Medium.ttf
-rw-r--r-- 1 root root 8293500 Apr 30  2021 AlibabaPuHuiTi-2-75-SemiBold.ttf
-rw-r--r-- 1 root root 8289188 Apr 30  2021 AlibabaPuHuiTi-2-85-Bold.ttf
-rw-r--r-- 1 root root 8124312 Apr 30  2021 AlibabaPuHuiTi-2-95-ExtraBold.ttf
- Pull the image
docker pull hmtx/puppeteer-pdf
- Start the container
docker run -d --name puppeteer-pdf -v /opt/fonts:/usr/share/fonts --restart always -p 3000:3000 hmtx/puppeteer-pdf
- Check the result
Enter the following in a browser:
http://localhost:3000/?url=https://www.baidu.com
You should see the Baidu home page rendered as a PDF.
To save it directly to a file on Linux:
curl "http://localhost:3000/?url=https%3A%2F%2Fwww.baidu.com" > /tmp/baidu.pdf
A simple Java example for fetching the PDF:
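import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Usage, from your own code:
byte[] pdf = httpGetPDFBytes("http://localhost:3000/?url=https://www.baidu.com");
// Alternatively, save the InputStream to a file inside the method; that is basic enough to skip here.

public static byte[] httpGetPDFBytes(String urlString) {
    try {
        URL url = new URL(urlString);
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setDoInput(true);
        connection.connect();
        // try-with-resources closes the response stream even if reading fails
        try (InputStream input = connection.getInputStream()) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            int nRead;
            byte[] data = new byte[4096];
            // Read the response in 4 KB chunks until the end of the stream
            while ((nRead = input.read(data, 0, data.length)) != -1) {
                buffer.write(data, 0, nRead);
            }
            return buffer.toByteArray();
        }
    } catch (IOException e) {
        // Log exception
        return null;
    }
}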
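For the save-to-file variant mentioned above, here is a minimal sketch. The class name `PdfDownloader` and the method names `saveStream` and `downloadPdf` are hypothetical, chosen only for illustration; the sketch assumes the service answers a plain GET with the PDF bytes, as the curl command earlier suggests.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper class; not part of the puppeteer-pdf image or any library.
class PdfDownloader {

    // Copies the whole stream into the target file, replacing an existing file.
    // Returns the number of bytes written.
    static long saveStream(InputStream input, Path target) throws IOException {
        return Files.copy(input, target, StandardCopyOption.REPLACE_EXISTING);
    }

    // Sends a GET request to the given service URL and writes the response body to the target path.
    static long downloadPdf(String urlString, Path target) throws IOException {
        HttpURLConnection connection =
                (HttpURLConnection) URI.create(urlString).toURL().openConnection();
        try (InputStream input = connection.getInputStream()) {
            return saveStream(input, target);
        } finally {
            connection.disconnect();
        }
    }
}
```

A call such as `PdfDownloader.downloadPdf("http://localhost:3000/?url=https%3A%2F%2Fwww.baidu.com", Path.of("/tmp/baidu.pdf"))` would then do roughly what the curl command does, without holding the whole PDF in memory.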
Notes on some of the parameters:
- username
- password
- format (paper size; options include "A4", "A3", "Letter", and so on)
- landscape (whether to use landscape orientation; options: true|false)
- url (the web page to render as a PDF; required)
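To show how these parameters might be combined, here is a small sketch that builds a request URL. It assumes `format` and `landscape` are passed as query-string parameters in the same way as `url`, which this post does not confirm; `PdfRequestUrl` and `build` are hypothetical names. `username` and `password` are left out because their meaning is not described here.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the puppeteer-pdf image.
class PdfRequestUrl {

    // Builds the service URL. The page URL is percent-encoded so that its own
    // "?", "&" and "=" characters are not mistaken for parameters of the service.
    // Assumption: format and landscape are plain query parameters like url.
    static String build(String serviceBase, String pageUrl, String format, boolean landscape) {
        return serviceBase
                + "/?url=" + URLEncoder.encode(pageUrl, StandardCharsets.UTF_8)
                + "&format=" + URLEncoder.encode(format, StandardCharsets.UTF_8)
                + "&landscape=" + landscape;
    }

    public static void main(String[] args) {
        // prints http://localhost:3000/?url=https%3A%2F%2Fwww.baidu.com&format=A4&landscape=true
        System.out.println(build("http://localhost:3000", "https://www.baidu.com", "A4", true));
    }
}
```

Encoding the page address matters most when the target page has a query string of its own; without it, everything after the first `&` in the page URL would be read as a separate parameter of the PDF service.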