Ruby script: download document JPGs from ISSUU

Date: Friday, 2010.08.06
Author: roblogic
43 Comments

Ian Wishart has released his excellent book ‘The Divinity Code’ onto the wild woolly web. Sign up to issuu & download the document PDF now!

HOWEVER, sometimes publishers don’t give you a PDF. You can download the JPGs individually using laborious methods, but come on! You’re using a computer, a machine that specialises in automating boring tasks. Time for a script, methinks.

This little Ruby snippet downloads all the JPG files that make up an issuu.com document into the current directory (the folder you run it from). It needs a wee bit of setup; instructions are provided in the code comments.
NOTE: you’ll need Ruby installed to use it!

#-------------------------------------------------
# get_issuu.rb - retrieve all jpg's for a document
#-------------------------------------------------

# 1. Open the issuu.com document in your web browser, as usual
#    example: http://issuu.com/iwishart/docs/thedivinity
# 2. Read the document page count; set variable $PAGES below
# 3. From browser menu, choose View > Source
# 4. Do a text search for "documentId" 
# 5. Copy string such as "081230122554-f76b0df1e7464a149caf5158813252d9" 
#    to $PUB variable below
# 6. Execute script:
#    ruby get_issuu.rb

require 'open-uri'
$PUB="081230122554-f76b0df1e7464a149caf5158813252d9"
$PAGES=20

for $X in 1..$PAGES do
  $PX="page_#{$X}.jpg"
  $PADX="page_#{"%03d" % $X}.jpg"
  puts(Time.now.strftime('%Y-%m-%d %X') +" get "+ $PX +">>"+ $PADX)
  # URI.open needs Ruby 2.5+; File.binwrite writes the bytes and closes the file
  File.binwrite($PADX, URI.open("http://image.issuu.com/#{$PUB}/jpg/#{$PX}").read)
end
puts("#{Time.now.strftime('%Y-%m-%d %X')} DONE")

Hat tip: rubynoob.

43 thoughts on “Ruby script: download document JPGs from ISSUU”


Archie says:

Friday, 2010.10.01 at 4:02 am

Sorry a stupid question, but where does it save the downloaded images?
Could you please email me?

ropata says:

Saturday, 2010.10.02 at 12:45 am

JPGs go to wherever the script is invoked.

Try this
1) copy the script “get_issuu.rb” to your desktop
2) start command prompt and cd to desktop (C:\winnt\profiles\???\desktop , or C:\users\???\desktop , depends on your OS)
3) run the script as described
4) JPGs copied to the desktop 🙂
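
If you would rather not depend on which folder the command prompt happens to be in, one option is to build the output path from the script's own location. A minimal sketch; the helper name `output_path` is made up for illustration and is not part of the original script:

```ruby
# Hypothetical helper: build an output path inside a given directory
# (by default the folder containing this script) instead of relying on
# the current working directory.
def output_path(page, dir = File.dirname(File.expand_path(__FILE__)))
  File.join(dir, format("page_%03d.jpg", page))
end

puts Dir.pwd          # where the original script saves its JPGs
puts output_path(1)   # where this variant would save page 1
```

Passing the result of `output_path` as the filename to write would keep the JPGs beside the script no matter where it is started from.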

Johan M Kjartansson says:

Saturday, 2010.11.27 at 5:57 am

I installed macruby on my mac and tried to run the script in terminal and got this message:

get_issuu.rb:1: syntax error, unexpected $undefined, expecting ‘}’
{\rtf1\ansi\ansicpg1252\cocoartf1038\cocoasubrtf350
^
???

Johan M Kjartansson says:

Saturday, 2010.11.27 at 6:11 am

Disregard.

It worked after I compiled the .rb file in Dreamweaver instead of TextEdit!?

But just so you know, the images are saved in my users folder even though I had the get_issuu.rb file located on my desktop and I executed it from there also in Terminal 😉

Thanks a lot!

Gonzalo says:

Friday, 2011.06.03 at 12:22 pm

Many Thanx, dude! This worked just perfectly! I was looking for picture books so badly! Cheers!

Kremena Dyakova says:

Monday, 2011.10.24 at 2:55 am

Thank you. Big time!

Oleg says:

Wednesday, 2011.10.26 at 6:23 pm

Is it possible to download images in their original size?
For example : on this book http://issuu.com/junkshow/docs/2012_program_special_blend
the script worked just fine, but the images are scaled down to fit the screen and practically unreadable.
thanks.

Kostas says:

Saturday, 2011.11.19 at 5:31 am

Excellent! Thanks a lot for that.

nicolas says:

Sunday, 2012.01.22 at 4:06 am

it works!!!! a m a z i n g….thanks

Herman Sasmita says:

Thursday, 2012.02.02 at 2:44 am

Waaa It Worked….. ^^ THank YOu……

Image Boss says:

Friday, 2012.02.03 at 4:52 pm

Can you give me the steps for how to do it?

1. ropata says:
  
  Friday, 2014.09.12 at 11:41 am
  
  # 1. Open the issuu.com document in your web browser, as usual
  # example: http://issuu.com/iwishart/docs/thedivinity
  # 2. Read the document page count; set variable $PAGES below
  # 3. From browser menu, choose View > Source
  # 4. Do a text search for “documentId”
  # 5. Copy string such as “081230122554-f76b0df1e7464a149caf5158813252d9”
  # to $PUB variable below
  # 6. Execute script:
  # ruby get_issuu.rb
  
2. Miika Mäki says:
  
  Thursday, 2016.11.24 at 9:57 am
  
  Step-by-step guide is in the original post. You basically…
  1) paste the code provided to Notepad++
  2) change the number of pages the publication has ($PAGES=)
  3) find and change the documentid ($PUB=) by pressing Ctrl+U in your browser, then Ctrl+F “documentid”, and copy the string (see OP # 3,4,5).
  4) save the modified code as a Ruby file (.rb)
  5) see this video on how to run the Ruby script you just created ( https://youtu.be/oNzoxzECces)
  
  That’s it!
  
Mr.Thanks says:

Tuesday, 2012.02.14 at 2:13 pm

I second the explanation on downloading Original FULL SIZE images… I’m getting compressed hardly legible images 😐 totally bumming me out brah…

sundeep says:

Monday, 2012.02.20 at 4:49 am

How do I download only the pages in between, from the middle of the book?
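
One way to fetch only part of a document is to loop over a smaller range of page numbers. A minimal sketch that reuses the URL pattern from the script in the post; the helper name `page_jobs` and the range 5..8 are just for illustration:

```ruby
# Hypothetical helper: list [url, filename] pairs for a range of pages,
# using the same URL pattern as the script in the post.
def page_jobs(pub, pages)
  pages.map do |n|
    ["http://image.issuu.com/#{pub}/jpg/page_#{n}.jpg", format("page_%03d.jpg", n)]
  end
end

# Example: pages 5 to 8 only (nothing is downloaded here, only listed)
page_jobs("081230122554-f76b0df1e7464a149caf5158813252d9", 5..8).each do |url, file|
  puts "#{url} >> #{file}"
end
```

In the script from the post, the same effect comes from changing `1..$PAGES` in the `for` line to the range you want, for example `5..8`.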

S says:

Sunday, 2012.02.26 at 2:43 am

Works,

Seconding 100% original full size image download script proposal.

Suren says:

Saturday, 2012.03.31 at 3:23 am

Perfect post!

Cesare says:

Tuesday, 2012.05.01 at 12:23 am

thx work perfectly!!!

hunghanh says:

Friday, 2012.06.15 at 8:47 pm

Thanks for sharing!
And this is my python code:

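# get_issuu.py

try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2
from datetime import date

pub = "081230122554-f76b0df1e7464a149caf5158813252d9"
page = 20
for x in range(1, page + 1):  # range() excludes its end value, so add 1
    px = "page_%s.jpg" % x
    padx = "page_%03d.jpg" % x
    print(date.today().strftime("%Y-%m-%d") + " " + str(x) + " get " + px + " >> " + padx)
    # Download the page, then write it out; the with-block closes the file
    with open(padx, "wb") as f:
        f.write(urlopen("http://image.issuu.com/" + pub + "/jpg/" + px).read())

print(date.today().strftime("%Y-%m-%d ") + str(page) + " downloaded. Done!")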

Good luck!

Hung

1. luluganeta says:
  
  Monday, 2012.12.31 at 2:48 pm
  
  Hi, this Python version is great! I tweaked it a bit to use arguments instead of hardcoded values: http://pastebin.com/LsrRWMhV
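
The same idea can be sketched in Ruby. This is an independent illustration, not a port of the linked pastebin; the helper name `parse_args` and the usage text are assumptions:

```ruby
# Hypothetical helper: read the document id and page count from the
# command line instead of editing $PUB and $PAGES in the file.
def parse_args(argv)
  abort "usage: ruby get_issuu.rb DOCUMENT_ID PAGES" unless argv.length == 2
  pub, pages = argv
  [pub, Integer(pages)]   # Integer() raises ArgumentError on non-numeric input
end

# Only runs when two arguments were actually given on the command line.
if ARGV.length == 2
  pub, pages = parse_args(ARGV)
  puts "would fetch #{pages} pages of #{pub}"
end
```

It would be invoked as `ruby get_issuu.rb 081230122554-f76b0df1e7464a149caf5158813252d9 20`, with the two values then used in place of `$PUB` and `$PAGES`.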
  
  1. ropata says:
    
    Wednesday, 2013.02.20 at 5:40 pm
    
    Excellent.. I should put a link in the main post 🙂
afrodite p. says:

Tuesday, 2012.06.19 at 2:43 am

Hi there. I have come across a different situation where this code does not work. The ‘issuu’ document I would like to download is embedded in a website, but the owner has removed it from his/her issuu.com account page. In other words, you can only see the file on the owner’s webpage, and only if you register (which is free, but requires an email & password). It is obvious that the Ruby code is written so that it only downloads documents from the domain “http://image.issuu.com/#{$PUB}”, but is it possible to download from other sites, as in the case I described above?

Any help would be appreciated.
Thanks.

iodurodisodio says:

Wednesday, 2012.08.01 at 9:11 pm

Thanks, the code works perfectly!

Allen says:

Wednesday, 2012.12.12 at 9:43 pm

You are amazing… thanks for your script…..

1. ropata says:
  
  Thursday, 2012.12.13 at 1:30 am
  
  🙂 thanks for kind feedback!
  
Eks says:

Friday, 2013.03.22 at 12:51 am

Thanks. Using this and convert from ImageMagick, I was able to make a PDF file from a 300-page document.

convert *.pdf full_document.pdf

1. Eks says:
  
  Friday, 2013.03.22 at 12:52 am
  
  I meant:
  
  convert *.jpg full_document.pdf
  
  ImageMagick is free software, for those interested.
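
The same step can be driven from Ruby, which makes the page order explicit. A rough sketch, assuming ImageMagick's `convert` is installed and on the PATH; the helper name `page_files` is made up for illustration:

```ruby
# Collect the downloaded pages in page order. The zero-padded names
# (page_001.jpg, page_002.jpg, ...) make a plain sort match page order.
def page_files(dir = ".")
  Dir.glob(File.join(dir, "page_*.jpg")).sort
end

files = page_files
if files.empty?
  puts "no page_*.jpg files found"
else
  # Passing the arguments as a list avoids shell quoting problems.
  # Assumes ImageMagick's `convert` is installed and on the PATH.
  system("convert", *files, "full_document.pdf")
end
```

The zero-padded filenames the download script writes are what keep page 10 from sorting ahead of page 2.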
  
Nick Nikolaev says:

Sunday, 2013.05.12 at 4:44 am

I installed Python 3.3.0 on my laptop (Win 7 Ult. SP1 x64) and made a script file named get_issuu.py. When I try to run it, it gives me “Invalid syntax”. Any solution?

1. Nick Nikolaev says:
  
  Sunday, 2013.05.12 at 5:03 am
  
  The Ruby script works fine on Windows! 🙂 Thank you!
  
  1. ropata says:
    
    Thursday, 2013.06.06 at 1:25 pm
    
    Yeah I shoulda mentioned that I wrote this ruby script on windows … cheers Nick
An says:

Sunday, 2013.06.23 at 8:42 pm

It is working well. Thanks

kea says:

Wednesday, 2013.10.23 at 12:47 pm

The “compressed barely legible images” problem is because the files stored on image.issuu.com are basically higher quality thumbnails—JPGs generated from the original source flash videos. Those are accessed via https://page.issuu.com/$documentid/swf/page_1.swf, and so forth. (I think those SWF files are in turn generated from vector images, which I haven’t yet worked out how to find.)

I don’t suppose you also happen to have a ruby script lying around that will grab those SWFs, download them, extract the original images as SVGs and put them together into a PDF file do you? 😆
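
For the download half of that wish, a sketch could look like the following. It assumes the URL pattern quoted in this comment still resolves, which is untested here; the helper names `swf_url` and `fetch_swf` are hypothetical. Extracting images from the SWFs and assembling a PDF is not covered.

```ruby
require 'open-uri'

# URL pattern taken from the comment above; whether it still resolves is untested.
def swf_url(document_id, page)
  "https://page.issuu.com/#{document_id}/swf/page_#{page}.swf"
end

# Hypothetical helper: fetch one SWF page and save it with a zero-padded name.
def fetch_swf(document_id, page)
  File.binwrite(format("page_%03d.swf", page), URI.open(swf_url(document_id, page)).read)
end

# Example call (commented out so that loading this file downloads nothing):
# fetch_swf("081230122554-f76b0df1e7464a149caf5158813252d9", 1)
```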

1. ropata says:
  
  Friday, 2013.11.22 at 1:48 pm
  
  Ahh heck no i just chucked the script together and forgot about it a few years ago. Now this is the busiest page on my blog 😛
  
Yashodhan Bhatt says:

Saturday, 2013.11.02 at 8:31 pm

Awesome script man, you don’t have the slightest idea how valuable this was to me! Cheers… Keep up the good work..

user says:

Friday, 2014.03.14 at 7:48 am

GUI anyone?

Tom Zhang says:

Sunday, 2014.06.15 at 6:55 am

Works amazingly well. Thank you!

HwG says:

Wednesday, 2014.08.27 at 6:07 pm

thanks for the script!..great work!

Annie says:

Wednesday, 2014.10.29 at 6:03 am

I copied the code into notepad and saved it as .rb file.
Then I double-clicked the file and all I get is a “page_001.jpg” but it is BLANK.

What am I doing wrong?
I copied EXACTLY the code provided.

Thanks!
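
One possible cause, offered as a guess rather than a diagnosis: the request for the first page failed, so nothing was written into page_001.jpg and the script stopped there. Double-clicking also closes the window before any error message can be read, so running it from a command prompt helps. A sketch that reports a failed download and never leaves an empty file behind; `save_page` and its `fetch` parameter are hypothetical names:

```ruby
require 'open-uri'

# Hypothetical helper: download first, write the file only if that worked.
# `fetch` is injectable so the logic can be tried without network access.
def save_page(url, path, fetch = ->(u) { URI.open(u).read })
  data = fetch.call(url)
  File.binwrite(path, data)
  true
rescue StandardError => e
  # OpenURI::HTTPError (404 and friends) and network errors all land here.
  warn "failed: #{url} (#{e.message})"
  false
end
```

Called once per page inside the loop, it returns false for a page that could not be fetched, so the loop can carry on or stop as you prefer.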

Rand says:

Sunday, 2015.03.01 at 5:04 am

instead of all of that just use the following tool
http://abuouday.com/tools/issuu-downloader/

1. ropata says:
  
  Thursday, 2015.08.27 at 4:54 pm
  
  Even easier to just download the PDFs from issuu.
  My script is applicable when you *can’t* get the PDF.
  
Miika Mäki says:

Thursday, 2016.11.24 at 9:59 am

Thanks a lot! This code is awesome 😀

Has anyone found a way to get even better quality images/PDFs out?
