Kontrrazvedka: The story of the Makhnovist intelligence service - Vyacheslav Azarov

Comments

Steven.
Jan 20 2012 13:23

Brilliant, thanks for this!

Django
Jan 20 2012 17:05

Yeah, I've been pondering scanning this for ages - nice one!

flaneur
Jan 20 2012 17:58

Do your Anarchy's Cossack instead.

Django
Jan 20 2012 18:12

That's been donated to the Sparrow's Nest I'm afraid.

flaneur
Jan 20 2012 18:39

I see how it is, 'mate'.

Cooked
Jan 21 2012 21:22

I did a quick test ocr'ing the scan.

Dumped the PDF to PPM > turned the pages into greyscale TIFFs > cropped the spreads using ImageMagick > ran tesseract-ocr on the result. It's fairly messy but actually a little better than I expected. It needs serious spellchecking and more, and the notes are totally f'ed. Have you OCR'd stuff before? How successful have you been?

I automated it all using pdftoppm, ImageMagick, tesseract and a *tiny* bit of shell scripting (Linux tools). I had to manually blank out the photos, though, and add spaces between paragraphs. I was using tesseract 2.04-2.1 on Debian Sid; it might be worthwhile trying the latest version.
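For anyone wanting to try the same thing, the steps described here could be sketched roughly as follows. This is a reconstruction, not the actual script: the function names, the working directory, the 300 dpi setting and the 50% crop for two-page spreads are all assumptions, and it presumes pdftoppm (poppler-utils), ImageMagick's `convert` and `tesseract` are installed. Blanking out the photos and re-spacing paragraphs were done by hand and are not covered.

```python
import subprocess
from pathlib import Path


def rasterize_cmd(pdf, prefix, dpi=300):
    # pdftoppm with -gray writes one greyscale image per page,
    # named <prefix>-<page number>.pgm. The dpi value is an assumption.
    return ["pdftoppm", "-r", str(dpi), "-gray", str(pdf), str(prefix)]


def split_spread_cmd(page_image, out_pattern):
    # ImageMagick: cut a two-page spread into left and right halves and
    # save them as uncompressed TIFFs. %d in out_pattern becomes 0, 1.
    # Newer ImageMagick installs may need "magick" instead of "convert".
    return ["convert", str(page_image), "-crop", "50%x100%", "+repage",
            "-compress", "none", str(out_pattern)]


def ocr_cmd(tiff, lang="eng"):
    # tesseract <image> <outputbase> writes its result to <outputbase>.txt
    tiff = Path(tiff)
    return ["tesseract", str(tiff), str(tiff.with_suffix("")), "-l", lang]


def run_pipeline(pdf, workdir="ocr_work"):
    # Hypothetical driver: runs the three stages and joins the text.
    work = Path(workdir)
    work.mkdir(exist_ok=True)
    subprocess.run(rasterize_cmd(pdf, work / "page"), check=True)
    # Plain lexical sort: assumes the page numbers in the file names
    # are zero-padded so that they sort in reading order.
    for page in sorted(work.glob("page-*.pgm")):
        pattern = work / f"{page.stem}-%d.tif"
        subprocess.run(split_spread_cmd(page, pattern), check=True)
    for tiff in sorted(work.glob("page-*-*.tif")):
        subprocess.run(ocr_cmd(tiff), check=True)
    texts = sorted(work.glob("page-*-*.txt"))
    return "\n\n".join(t.read_text(errors="replace") for t in texts)
```

Calling `run_pipeline("scan.pdf")` (file name hypothetical) would return the raw OCR text, which would still need the spellchecking and cleanup mentioned above.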

It was mainly an experiment and didn't take much more than 30 minutes. Do you reckon it's worthwhile to do this sort of stuff?

Text can be found at http://pastebin.com/qxpJuf50

Steven.
Jan 21 2012 23:37

Hey, that's really great.

We have a lot of OCRed content. It's generally pretty painstaking and takes a long time to correct things, especially if, like here, the quality of the scan isn't that great.

Especially with more people using Kindles and screen readers, it's great having stuff in text format. Would you be able to edit this article and paste in a formatted text version?

If you could do this for anything else in the library as well (in the PDFs tag), that would be amazing. Or could you write a short guide so that others can do it too?

Karetelnik
Jan 21 2012 23:51

Perhaps I'm missing something here, but if you want to use a work which comrades in Ukraine and Canada have gone to a lot of trouble to prepare, why not ask them for a digital copy instead of going through this arcane procedure and possibly ending up with an inferior product?

Khawaga
Jan 22 2012 00:00

Good point. Black Cat Press/Thoughtcrime are very reasonable. They will most likely send the text file if asked (they did this with the Atamansha text, I think).

Battlescarred
Feb 8 2012 09:36

"Perhaps I'm missing something here, but if you want to use a work which comrades in Ukraine and Canada have gone to a lot of trouble to prepare, why not ask them for a digital copy instead of going through this arcane procedure and possibly ending up with an inferior product?"
As I said, ask authors of texts first before putting them up on libcom. Not only is it good manners but, as Karetelnik says, it can save a lot of bother.

MT
Jul 16 2016 20:20

Can anyone clarify whether the current PDF is the "flawed" one, or whether someone has actually contacted the publishers and asked them for a text or PDF?