I had reason to convert a PDF file into an eBook format at the weekend so that I could read it more easily on my Kindle. It was a 41,000-word document in 40 sections. Here’s how it went.
I initially just used cut and paste to drop the text into a single page in Scrivener. Sometimes it’s useful to paste the text into a plain text editor first, as that can help strip out hidden formatting codes, most notably when the source is Microsoft Word. I didn’t need to do that in this case.
I then split it up into its original sections, one text block per section, which makes scrolling and editing in Scrivener easier. This is especially noticeable if you are working with the data files on a USB stick; I get an occasional lag using it this way (using Scrivener 126.96.36.199 beta).
I then reformatted the paragraphs, removing the returns at the end of each line so that there was a single return character at the end of each paragraph. This ensures the eventual eBook will flow properly. There is probably a search and replace combination that would do this far quicker.
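As a rough illustration of that search and replace idea, here is a minimal Python sketch. It is not what I used. It assumes the pasted text has a blank line between paragraphs, which is not always true of text copied out of a PDF, and the function name `unwrap_paragraphs` is made up for this example.

```python
import re


def unwrap_paragraphs(text: str) -> str:
    """Join the hard-wrapped lines of each paragraph into a single line.

    Assumption: paragraphs are separated by at least one blank line.
    """
    # Normalise Windows and old Mac line endings to plain newlines.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Split the text wherever one or more blank lines occur.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    unwrapped = []
    for paragraph in paragraphs:
        # Drop the returns inside the paragraph and join with single spaces.
        lines = [line.strip() for line in paragraph.split("\n") if line.strip()]
        unwrapped.append(" ".join(lines))
    # One return character at the end of each paragraph.
    return "\n".join(unwrapped)
```

If the source has no blank lines between paragraphs, this will merge everything into one block, so check a small sample before running it over the whole document.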
I then proofread for hidden breaks, word run-ins and hard hyphens. These typically appear when the PDF has been tweaked so that paragraphs format properly in the A4 print version of the document. You only really start seeing them when you re-flow the text.
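A script can at least point out where the hard hyphens might be hiding. The sketch below is an assumption on my part rather than part of the workflow above: it lists every hyphenated word in the re-flowed text so you can check each one by eye. It cannot tell a leftover break like "docu-ment" from a genuine compound like "well-known", so it only flags candidates and changes nothing. The name `find_hyphen_candidates` is hypothetical.

```python
import re


def find_hyphen_candidates(text: str) -> list[str]:
    """Return each distinct hyphenated word, in order of first appearance.

    These are only candidates for review: some will be leftover hard
    hyphens from the print layout, others will be legitimate compounds.
    """
    # Letters, a hyphen, then lowercase letters, e.g. "docu-ment".
    matches = re.findall(r"\b[A-Za-z]+-[a-z]+\b", text)
    # dict.fromkeys removes duplicates while keeping the original order.
    return list(dict.fromkeys(matches))
```

Running it over each section gives a short checklist to work through, which is quicker than hunting for hyphens by reading alone.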
I then went through and corrected spellings and Americanisations. I did this as the eBook is going to be for my own personal use (the PDF is a commercial document without a Kindle version to purchase).
I then edited the text to switch to formatted bullet points and numbered lists. Often you’ll end up with an inserted character for each bullet point, or inserted numbers for the numbered lists, and it’s easier for both editing and correct eBook generation to have them properly formatted.
A quick final proofread for any final glaring errors, then it’s a compile in Scrivener using Kindlegen to generate the final eBook. Copy it to your reader and away you go.
You may still find errors in the formatting which you’ll see on your eReader; sometimes you can’t spot these until it’s in the end format. You’ll also notice errors when you change font size, or switch from portrait to landscape reading.
All in all, I would say it took around four hours to complete all the work. Ultimately, whether you do this or not will depend on the source document’s value to you, its length and the usefulness of reading it on an eBook reader.
One final tip: do the work in stages, otherwise staring at this amount of text for a long time will make you go cross-eyed.