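I covered PDF to JPG in an earlier post. This time someone asked about converting PDF to TXT, so here are my notes along the way.
With PDFMiner, a single command does the job!
1. Install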
pip install pdfminer.six
2. Run
pdf2txt.py -o outfile.txt -t text text.pdf
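If there are several PDFs to convert, the command above can be wrapped in a small script. This is only a rough sketch: `build_pdf2txt_command` and `convert_folder` are hypothetical helper names of my own, and it assumes `pdf2txt.py` is on your PATH and directly executable (as it is after the pip install on Linux/macOS).

```python
import subprocess
from pathlib import Path


def build_pdf2txt_command(pdf_path):
    """Build the same pdf2txt.py call as above; the .txt is written next to the PDF."""
    pdf_path = Path(pdf_path)
    out_path = pdf_path.with_suffix(".txt")
    return ["pdf2txt.py", "-o", str(out_path), "-t", "text", str(pdf_path)]


def convert_folder(folder):
    """Run pdf2txt.py on every *.pdf directly inside folder (hypothetical helper)."""
    for pdf_path in sorted(Path(folder).glob("*.pdf")):
        # check=True raises CalledProcessError if pdf2txt.py exits with an error
        subprocess.run(build_pdf2txt_command(pdf_path), check=True)
```

Calling `convert_folder(".")` would then convert every PDF in the current directory.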
3. You may run into this error, though:
pdfminer.pdfdocument.PDFTextExtractionNotAllowed: Text extraction is not allowed: <_io.BufferedReader name='text.pdf'>
4. Comment out line 132 of pdfpage.py (the exact line number may differ in other pdfminer.six versions)
vi ~/anaconda/lib/python3.6/site-packages/pdfminer/pdfpage.py
#if check_extractable and not doc.is_extractable:
#    raise PDFTextExtractionNotAllowed('Text extraction is not allowed: %r' % fp)
After that, you're good to go!
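An edit inside site-packages is lost the next time the package is upgraded. As a possible alternative, `PDFPage.get_pages` accepts a `check_extractable` argument, so the same check can be skipped from your own script instead. The sketch below is not what `pdf2txt.py` does internally; `extract_text_unchecked` is a hypothetical name, and I am assuming a pdfminer.six version where `TextConverter` can write to a `StringIO`.

```python
from io import StringIO
from pathlib import Path


def extract_text_unchecked(pdf_path):
    """Return the text of pdf_path without enforcing the 'extractable' flag."""
    path = Path(pdf_path)
    if not path.is_file():
        raise FileNotFoundError(path)

    # Imported here so a missing file is reported before pdfminer is needed.
    from pdfminer.converter import TextConverter
    from pdfminer.layout import LAParams
    from pdfminer.pdfinterp import PDFPageInterpreter, PDFResourceManager
    from pdfminer.pdfpage import PDFPage

    buffer = StringIO()
    manager = PDFResourceManager()
    device = TextConverter(manager, buffer, laparams=LAParams())
    interpreter = PDFPageInterpreter(manager, device)
    with path.open("rb") as fp:
        # check_extractable=False skips the check that raises
        # PDFTextExtractionNotAllowed
        for page in PDFPage.get_pages(fp, check_extractable=False):
            interpreter.process_page(page)
    device.close()
    return buffer.getvalue()
```

For example, `Path("outfile.txt").write_text(extract_text_unchecked("text.pdf"), encoding="utf-8")` would write roughly the same output file as the command in step 2.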
5. Link to the PDF to JPG post