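Our project recently needed an online document preview feature, which means converting uploaded Word documents to PDF on the backend. I ran into quite a few problems along the way, so here are my notes.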

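At first I got the conversion working with documents4j, but it failed once deployed to the server. From the logs, documents4j seems to convert by creating temporary files: when running locally it created them on the C: drive, but on the Linux server the directory layout is different, the temporary files could not be created, and the conversion failed.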

After countless Baidu searches, I tried Apache POI for the conversion instead. Add these dependencies to the pom:

<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>3.13</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>3.13</version>
</dependency>
<dependency>
    <groupId>fr.opensagres.xdocreport</groupId>
    <artifactId>org.apache.poi.xwpf.converter.pdf</artifactId>
    <version>1.0.4</version>
</dependency>

Here is the implementation:

 

public R<FileVo> uploadExamFile(MultipartFile file) throws Exception {
String contentType = file.getContentType();
InputStream inputStream = file.getInputStream();
String fileName = FileUploadUtil.extractFilename(file);
FileVo fileVo = new FileVo();
String fileType = FileTypeUtil.getExtension(file);
fileVo.setFileType(fileType);
boolean flag = FileUploadUtil.isAllowedExtension(fileType, MimeTypeUtils.EXAM_FILE_TYPE);
if (!flag) {
    throw new BaseException("Invalid file type");
}
long size = file.getSize();
ByteArrayInputStream input = null;
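// Word document? Convert it to PDF before uploading.
// Note: XWPFDocument only reads .docx (OOXML); a legacy .doc file cannot be opened this way.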
if (OssConstants.DOCX.equals(fileType) || OssConstants.DOC.equals(fileType)) {
    String path = "/usr/tmp/";

    // Save the upload to a temporary .docx file.
    // Write only the bytes actually read; writing the whole buffer corrupts the file.
    byte[] bytes = new byte[1024];
    int len;
    try (FileOutputStream out = new FileOutputStream(path+"template.docx")) {
        while ((len = inputStream.read(bytes)) != -1) {
            out.write(bytes, 0, len);
        }
    }
    inputStream.close();

    // Read the saved file back and convert it to PDF.
    // Failures propagate to the caller instead of being swallowed,
    // so a broken conversion never uploads an empty PDF.
    try (InputStream docxIn = new FileInputStream(path+"template.docx");
         FileOutputStream out1 = new FileOutputStream(path+"template.pdf")) {
        XWPFDocument document = new XWPFDocument(docxIn);
        PdfOptions options = PdfOptions.create();
        PdfConverter.getInstance().convert(document, out1, options);
    }
    // From here on, upload the converted PDF instead of the original file.
    inputStream = new FileInputStream(path+"template.pdf");
    contentType = "application/pdf";
    fileName = fileName.substring(0, fileName.lastIndexOf(".")) + ".pdf";
    fileVo.setFileType(OssConstants.PDF);
} else if (FileUtil.isVideo(file)) { // Video file?
    if (size > OssConstants.VIDEO_MAX_SIZE) {
        throw new BaseException(OssConstants.FILE_MORE_THAN_MAX_SIZE);
    }
}else if (OssConstants.PDF.equals(fileType)){
    if (size > OssConstants.DOCUMENT_MAX_SIZE) {
        throw new BaseException(OssConstants.FILE_MORE_THAN_MAX_SIZE);
    }
}

PutObjectArgs args = PutObjectArgs.builder()
        .bucket(minioProperties.getBucket())
        .object(fileName)
        // available() is not a reliable object size; pass -1 (unknown size) with a 10 MiB part size instead.
        .stream(inputStream, -1, 10485760)
        .contentType(contentType)
        .build();
client.putObject(args);
inputStream.close();
File file1 = new File("/usr/tmp/template.pdf");
File file2 = new File("/usr/tmp/template.docx");
file1.delete();
file2.delete();
String fileUrl = minioUtil.getPresignedObjectUrl(minioProperties.getBucket(), fileName);
fileVo.setFileUrl(fileUrl);
return R.ok().data(fileVo);
}

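My approach is to save the uploaded file to the Linux filesystem first, read it back and convert it, then read the converted file and upload it to MinIO. Converting the upload stream directly had also succeeded, but the resulting PDF was garbled. I suspected an encoding problem, so I tried saving first and converting afterwards; the conversion succeeded but the text was still garbled. The real cause was that the server lacked Chinese fonts, and installing the fonts fixes it.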
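The "save first, then convert" step can be sketched with the JDK alone. This is a minimal sketch, not the code used in the project: the class name `TempUpload` and the method `saveToTemp` are hypothetical. It assumes that a uniquely named file under `java.io.tmpdir` is acceptable in place of the fixed `/usr/tmp/template.docx`, which avoids both the Windows/Linux path difference and two concurrent uploads overwriting each other.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of the original project.
final class TempUpload {

    /**
     * Copies the whole stream into a uniquely named file under java.io.tmpdir
     * and returns its path. The caller is responsible for deleting the file.
     */
    static Path saveToTemp(InputStream in, String suffix) throws IOException {
        Path tmp = Files.createTempFile("upload-", suffix);
        try {
            // Files.copy writes exactly the bytes read, so no manual buffer handling is needed.
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
            return tmp;
        } catch (IOException e) {
            Files.deleteIfExists(tmp); // do not leave a partial file behind
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[1500]; // deliberately not a multiple of 1024
        Path tmp = saveToTemp(new ByteArrayInputStream(data), ".docx");
        try {
            System.out.println(Files.size(tmp)); // prints 1500
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

In the upload method, the returned path would take the place of `path+"template.docx"`, with the delete moved into a `finally` block so the temporary file is removed even when the conversion throws.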

In short, the converted PDF comes out garbled on Linux, and following the article below resolves it:

Linux下word转pdf中文乱码问题 (Garbled Chinese text when converting Word to PDF on Linux), from 西南疯啊疯's blog on CSDN
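Before and after installing the fonts, it can help to see which font files the server actually has. The sketch below only lists font files under a directory. The class name `FontFileCheck` is hypothetical, `/usr/share/fonts` is an assumed default location, and finding a file there does not by itself prove that the converter will use it. Which fonts are needed depends on the fonts named in the Word document.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical diagnostic helper, not part of the original project.
final class FontFileCheck {

    /** Recursively lists .ttf/.ttc/.otf files under dir; returns an empty list if dir does not exist. */
    static List<Path> listFontFiles(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            return Collections.emptyList();
        }
        try (Stream<Path> walk = Files.walk(dir)) {
            return walk.filter(Files::isRegularFile)
                       .filter(p -> {
                           String name = p.getFileName().toString().toLowerCase(Locale.ROOT);
                           return name.endsWith(".ttf") || name.endsWith(".ttc") || name.endsWith(".otf");
                       })
                       .sorted()
                       .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed default directory; pass another one as the first argument.
        Path dir = Paths.get(args.length > 0 ? args[0] : "/usr/share/fonts");
        List<Path> fonts = listFontFiles(dir);
        if (fonts.isEmpty()) {
            System.out.println("No font files found under " + dir);
        }
        fonts.forEach(System.out::println);
    }
}
```

If the list contains no Chinese font files, that is consistent with the garbled output described above.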
