submitted 1 year ago* (last edited 1 year ago) by Historical_General@lemmy.world to c/piracy@lemmy.dbzer0.com

I saw someone else asking how you'd download protected, view-only Drive documents, and that reminded me of another annoying characteristic of PDFs: they're 'protected'.

How do you deal with PDFs that are inherently uncustomisable and have fixed formatting? I appreciate that KOReader and other readers can reflow text, but I'd prefer not to rely on that; EPUB, TXT, or any other customisable format would be better.

Any good methods for converting PDF to text/EPUB out there?

liliumstar@lemmy.dbzer0.com 2 points 1 year ago

If you have one of those really annoying PDFs where the structure is all over the place or some letters are actually images, it's possible to OCR them with a mask over the page numbers.

There are also tools that just extract the text elements and smush them together but, as others have said, this doesn't always work as intended.
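To give a rough idea of what "extract the text elements and smush them together" means, here's a minimal sketch in plain Python. It isn't taken from any particular tool: the `Fragment` class, the `merge_fragments` function, and the `line_tolerance` parameter are all made up for illustration, and it assumes an extractor has already handed you each piece of text with its position on the page.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """One positioned piece of text (hypothetical shape, not a real library type)."""
    x: float   # left edge of the fragment
    y: float   # baseline, measured downwards from the top of the page
    text: str


def merge_fragments(fragments, line_tolerance=2.0):
    """Group fragments into lines by baseline, then join each line left to right."""
    lines = []  # each entry is (baseline_y, [fragments on that line])
    for frag in sorted(fragments, key=lambda f: (f.y, f.x)):
        # Same line if the baseline is close enough to the current line's baseline.
        if lines and abs(frag.y - lines[-1][0]) <= line_tolerance:
            lines[-1][1].append(frag)
        else:
            lines.append((frag.y, [frag]))
    return "\n".join(
        " ".join(f.text for f in sorted(frags, key=lambda f: f.x))
        for _, frags in lines
    )


# Fragments arrive in arbitrary order, as they often do inside a PDF.
page = [
    Fragment(105, 115, "line"),
    Fragment(72, 100, "Hello"),
    Fragment(72, 115, "Second"),
    Fragment(110, 100.5, "world"),
]
print(merge_fragments(page))
```

Note that this naive approach reads straight across the page, so a two-column layout would get its columns interleaved line by line. That's one reason this kind of extraction doesn't always work as intended.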

this post was submitted on 30 Jul 2023
44 points (67.2% liked)
