An archived version, for anyone having access difficulties:
https://web.archive.org/web/20250713020638/https://www.bbc.com/news/articles/cyvj986l615o
and doesn’t need to be exactly right
What kind of tasks do you consider that don’t need to be exactly right?
I’m not sure if it would be viable for a long book, and I’m also avoiding Google, but thanks for helping. I got some nice suggestions in this thread.
Well, I’m avoiding Google, but I will keep it in mind as a last, last resort, thanks.
I’m giving preference to open source tools, but that’s a good thing to know, thanks
Thanks for the suggestions. That OCR_translate looks interesting. I will prioritize other recommended tools that seem to be more focused on books, but I bookmarked it for future needs.
I used Tesseract, but the output PDF didn’t have visible text, and I found no way to change that. Maybe I don’t know how to use it properly, or it’s not intended to keep formatting.
That PaddleOCR looks very interesting. It will even extract images and formulas and somewhat preserve formatting in the output! I will try this one, even if it takes more than a day to process with my low-end CPU. Thank you for the suggestion!
By the way, how can a site know your age when you’re not even logged in?
If you’re not in a position to change your workflow and deal with new stuff, you can simply use Windows 10 LTSC for longer support and postpone the decision between Linux and Windows 11.
Personally, I’d recommend trying Linux some day. It can drain some free time at first, but in the long run, you will find yourself dealing with much less bullshit than on Windows, and actually saving time in your life. Some Linux users like to make things complicated and spend their time tinkering with the system, which gives the impression that Linux is always like that, but if you run a simple and stable distro, things will work nicely and will rarely require your time. I’ve been running Fedora for a few years, and my laptop has become so boring. I just use it for my work and hobbies, and turn it off when done. No bullshit.