Sep 25 2012 00:13
The libcom library has so many great texts, but they are in PDF, which is not great for e-readers. I tried converting them, but it doesn't work out too well. I managed to source some of the major books elsewhere, but there are so many that are only available here. Would it be possible to have some kind of drive to update the library with additional formats, i.e. mobi/epub?


Sep 25 2012 11:52

Yeah, some of the PDF files are small book versions which are okay, but most of them are A4, which is no good.

We have a guide to using e-readers with libcom here: http://libcom.org/library/using-e-book-readers-or-kindles-libcomorg

Basically I would like to turn lots of the library texts into e-books. The thing is I don't know how to do it nicely, with chapters properly separated and footnotes linking to the text at the back (if you copy and paste entire books from libcom, the chapters aren't picked up and footnotes still link back to libcom.org).
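For the footnote part, a rough sketch of one possible fix is to rewrite the absolute links into in-document anchors before building the e-book. This assumes the copied HTML contains footnote links of the form `href="http://libcom.org/...#something"`; the real markup may well differ, and the function name `localise_footnote_links` is made up for illustration.

```python
import re

# Assumption: footnote links in the copied HTML are absolute libcom.org URLs
# ending in a #fragment. Adjust the pattern if the actual markup differs.
ABSOLUTE_FRAGMENT = re.compile(
    r'href="https?://(?:www\.)?libcom\.org/[^"#]*#([^"]+)"'
)


def localise_footnote_links(html):
    """Turn absolute libcom.org links with a #fragment into local #fragment links.

    Links without a fragment are left alone. Note this also rewrites fragment
    links that point at *other* libcom pages, so check the result by hand.
    """
    return ABSOLUTE_FRAGMENT.sub(r'href="#\1"', html)
```

Run over a page's HTML, this should leave the footnote markers pointing at anchors inside the same file, provided the matching `id` attributes were copied across too.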

I need to look at how to do this, but my time is very limited. If someone else wanted to take it on that would be great. Certainly we are keen to have our content in as many useful formats as possible.

Oct 8 2012 00:56

Thanks. I have got quite a few now through that conversion method, although those in pdf still don't work for me. It would be nice for the library to be in ebook format, although I do appreciate that it is a large undertaking!

Oct 8 2012 04:15

Yeah, it's a nice idea, was chatting to Uncreative about this the other day, but it seems to be a real bastard to actually convert web pages/pdfs to epub or mobi or what have you without fucking up the formatting.


Oct 8 2012 08:49

I'm not sure it's so hard [ETA: for web pages, at least]. An epub is essentially a cut-down form of HTML packed into a zip file. Try copying an epub, renaming the copy whatever.zip, then opening it in whatever program you use to view zips. You'll see a whole bunch of HTML and CSS files, maybe some images, and a few other files peculiar to epub.

If an epub isn't too complex (e.g. no tables embedded within tables), you can probably convert it to mobi with either Calibre or maybe kindlegen (supplied by Amazon).
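For the Calibre route, the conversion can also be scripted through Calibre's `ebook-convert` command-line tool, which takes an input file and an output file. A minimal sketch, assuming Calibre is installed and `ebook-convert` is on your PATH; the helper names here are made up, and no extra conversion options are passed.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(source, target_format="mobi"):
    """Build the ebook-convert command line for one file.

    The output file sits next to the source, with the extension swapped.
    """
    source = Path(source)
    target = source.with_suffix("." + target_format)
    return ["ebook-convert", str(source), str(target)]


def convert(source, target_format="mobi"):
    """Run the conversion and return the output path as a string."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found; is Calibre installed?")
    command = build_convert_command(source, target_format)
    # check=True raises CalledProcessError if the conversion fails.
    subprocess.run(command, check=True)
    return command[2]
```

Calling `convert("some-book.epub")` would then try to produce `some-book.mobi` in the same folder. It is still worth checking each result by eye, since complex layouts may not survive.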

There is also a free epub editor called Sigil. You can import webpages into this, and edit WYSIWYG or raw html. That said, things don't always work, and it can be hard work fixing things, although global search and replace can help some. But converting a large number of texts will require doing it one at a time, so it still won't be that easy. (Too many things can go wrong for a simple batch convert.)

ETA: PDFs are a little more complex. It depends on whether the PDF contains actual text or just a series of page images. If it's text, and the text is simply laid out (e.g. no tables or boxed-out sections of text), then reflowing it is possible, though the result won't be perfect. But again, it will be hard work. I tend to just crop the borders/headers/footers/page numbers from the PDF, which usually makes the text easier to read (unless it is formatted for A4).