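I covered PDF to JPG in an earlier post. Someone just asked about converting PDF to TXT, so here are my notes on that too.
PDFMiner does it with a single command!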
1. Install
pip install pdfminer.six
2. Run
pdf2txt.py -o outfile.txt -t text text.pdf
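The same conversion can also be driven from Python instead of the command line. This is a minimal sketch, assuming a pdfminer.six release that ships `pdfminer.high_level.extract_text` (older releases may not have it). The helper name `pdf_to_txt` and its `extract` parameter are made up for illustration and are not part of pdfminer.

```python
from pathlib import Path


def pdf_to_txt(pdf_path, txt_path, extract=None):
    """Write the text of pdf_path to txt_path and return the character count.

    `extract` is a hypothetical hook: any callable that takes a path and
    returns a string. It defaults to pdfminer's extract_text.
    """
    if extract is None:
        # Imported lazily so this helper loads even without pdfminer.six installed.
        from pdfminer.high_level import extract_text as extract
    text = extract(str(pdf_path))
    Path(txt_path).write_text(text, encoding="utf-8")
    return len(text)
```

Calling `pdf_to_txt("text.pdf", "outfile.txt")` should produce roughly what the command above writes, though layout details can differ depending on the version and its default layout parameters.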
3. However, you may run into this error
pdfminer.pdfdocument.PDFTextExtractionNotAllowed: Text extraction is not allowed: <_io.BufferedReader name='text.pdf'>
4. Comment out line 132 (the exact line number may vary with the installed version)
vi ~/anaconda/lib/python3.6/site-packages/pdfminer/pdfpage.py
# if check_extractable and not doc.is_extractable:
#     raise PDFTextExtractionNotAllowed('Text extraction is not allowed: %r' % fp)
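The path above is specific to an Anaconda install with Python 3.6, so it may not match your machine. Here is a small sketch for finding where a module actually lives, using only the standard library. `module_path` is a hypothetical helper name.

```python
import importlib.util


def module_path(name):
    """Return the file path of an importable module, or None if it is not found."""
    try:
        spec = importlib.util.find_spec(name)
    except ModuleNotFoundError:
        # Raised when the parent package (e.g. pdfminer) is not installed.
        return None
    return spec.origin if spec else None


# Prints the location of pdfpage.py, or None if pdfminer.six is missing.
print(module_path("pdfminer.pdfpage"))
```

Open whatever path this prints in your editor and look for the `check_extractable` condition rather than relying on the line number.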
And you're good to go!
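If you would rather not edit files inside site-packages, the check can probably be skipped from your own script instead. This is a sketch, assuming your pdfminer.six release exposes the `check_extractable` parameter on `PDFPage.get_pages`. The function name `extract_ignoring_restriction` is made up for illustration.

```python
from io import StringIO


def extract_ignoring_restriction(pdf_path):
    """Return the text of a PDF without enforcing its 'no extraction' flag."""
    # Imported lazily so the function can be defined without pdfminer.six installed.
    from pdfminer.converter import TextConverter
    from pdfminer.layout import LAParams
    from pdfminer.pdfinterp import PDFPageInterpreter, PDFResourceManager
    from pdfminer.pdfpage import PDFPage

    output = StringIO()
    resource_manager = PDFResourceManager()
    device = TextConverter(resource_manager, output, laparams=LAParams())
    interpreter = PDFPageInterpreter(resource_manager, device)
    with open(pdf_path, "rb") as fp:
        # check_extractable=False skips the condition that step 4 comments out.
        for page in PDFPage.get_pages(fp, check_extractable=False):
            interpreter.process_page(page)
    device.close()
    return output.getvalue()
```

Unlike the patch in step 4, this survives a reinstall or upgrade of pdfminer.six, since nothing in the library itself is changed.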
5. Link to the PDF to JPG post