Ruby script: download document JPGs from ISSUU

Ian Wishart has released his excellent book ‘The Divinity Code’ onto the wild woolly web. Sign up to issuu & download the document PDF now!

HOWEVER sometimes publishers don’t give you a PDF. You can download the JPGs individually using laborious methods, but come on! You’re using a computer, a machine that specialises in automating boring tasks. Time for a script, methinks.

This little Ruby snippet downloads all the JPG files comprising a document from ISSUU to the local directory. It needs a wee bit of setup; instructions are provided in the code comments.
NOTE: you’ll need Ruby installed to use it!

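# get_issuu.rb - retrieve all jpg's for a document

# 1. Open the document in your web browser, as usual
#    example:
# 2. Read the document page count; set variable $PAGES below
# 3. From browser menu, choose View > Source
# 4. Do a text search for "documentId"
# 5. Copy string such as "081230122554-f76b0df1e7464a149caf5158813252d9"
#    to $PUB variable below
# 6. Execute script:
#    ruby get_issuu.rb

require 'open-uri'

$PAGES = 1  # step 2: replace with the document's page count
$PUB = "081230122554-f76b0df1e7464a149caf5158813252d9"  # step 5: documentId
# Placeholder: the image host did not survive in this copy of the post.
$HOST = "http://example.invalid"

for $X in 1..$PAGES do
  $PX = "page_#{$X}.jpg"             # remote file name
  $PADX = "page_#{"%03d" % $X}.jpg"  # zero-padded local name, so files sort in page order
  puts(Time.now.strftime('%Y-%m-%d %X') + " get " + $PX + ">>" + $PADX)
  # Assumed URL layout (<host>/<documentId>/jpg/<page>); adjust if it differs.
  File.binwrite($PADX, URI.open("#{$HOST}/#{$PUB}/jpg/#{$PX}").read)
end
puts("#{Time.now.strftime('%Y-%m-%d %X')} DONE")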

Hat tip: rubynoob.
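
Steps 3 to 5 in the script comments (view source, search for "documentId", copy the string) could probably be automated too. Here is a rough sketch, assuming you have the page source as a string and that the id always looks like the example in step 5: 12 digits, a dash, then 32 hex characters. The helper name find_document_id and the sample source line are made up for illustration; the real page markup may differ.

```ruby
# Hypothetical helper: pull the documentId out of saved page source.
# Assumes the id has the same shape as the example in step 5
# (12 digits, a dash, 32 lowercase hex characters).
DOCUMENT_ID = /documentId\W{0,5}(\d{12}-[0-9a-f]{32})/

def find_document_id(html)
  match = DOCUMENT_ID.match(html)
  match && match[1]   # nil when no id is found
end

# Made-up sample line; the actual source may quote or separate the id differently.
sample = 'var cfg = {"documentId":"081230122554-f76b0df1e7464a149caf5158813252d9"};'
puts find_document_id(sample)
```

Whatever it prints is what would go into $PUB. If it prints nothing, the regular expression needs adjusting to the real markup.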


46 thoughts on “Ruby script: download document JPGs from ISSUU”

  1. JPGs go to wherever the script is invoked.

    Try this
    1) copy the script “get_issuu.rb” to your desktop
    2) start command prompt and cd to desktop (C:\winnt\profiles\???\desktop , or C:\users\???\desktop , depends on your OS)
    3) run the script as described
    4) JPGs copied to the desktop 🙂
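
    A relative file name like "page_001.jpg" is resolved against the current working directory, which is why the JPGs land wherever the script is invoked. If you would rather have them sit next to the script file no matter where you run it from, something like this sketch might do. The helper name output_path is made up for illustration and is not part of the original script.

```ruby
# Sketch: build output paths beside the script file instead of relying on
# the current working directory. `output_path` is a hypothetical helper.
# __dir__ is nil when there is no script file (e.g. ruby -e), so fall back
# to the working directory in that case.
def output_path(page_number, dir = __dir__ || Dir.pwd)
  File.join(dir, format('page_%03d.jpg', page_number))
end

puts Dir.pwd          # where a bare "page_001.jpg" would be written
puts output_path(1)   # full path beside the script
```

    In the script you would then write to output_path($X) instead of $PADX.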

  2. I installed macruby on my mac and tried to run the script in terminal and got this message:

    get_issuu.rb:1: syntax error, unexpected $undefined, expecting ‘}’

  3. Disregard.

    It worked after I compiled the .rb file in Dreamweaver instead of TextEdit!?

    But just so you know, the images are saved in my users folder even though I had the get_issuu.rb file located on my desktop and I executed it from there also in Terminal 😉

    Thanks a lot!

    • Step-by-step guide is in the original post. You basically…
      1) paste the code provided to Notepad++
      2) change the number of pages the publication has ($PAGES=)
      3) find and change the documentId ($PUB=) by pressing Ctrl+U in your browser, then Ctrl+F “documentId”, and copy the string (see OP # 3,4,5).
      4) save the modified code as a Ruby file (.rb)
      5) see this video on how to run the Ruby script you just created (

      That’s it!

  4. I second the explanation on downloading Original FULL SIZE images… I’m getting compressed hardly legible images 😐 totally bumming me out brah…

  5. Thanks for sharing!
    And this is my python code:

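    from datetime import date
    from urllib.request import urlopen  # Python 3 replacement for urllib2

    pub = "081230122554-f76b0df1e7464a149caf5158813252d9"  # documentId
    host = "http://example.invalid"  # placeholder: the image host is missing from this comment
    for x in range(1, 20):
        px, padx = "page_%d.jpg" % x, "page_%03d.jpg" % x
        print(date.today().strftime("%Y-%m-%d ") + str(x) + " get " + px + " >> " + padx)
        # Assumed URL layout: <host>/<documentId>/jpg/<page>
        with open(padx, "wb") as f:
            f.write(urlopen("%s/%s/jpg/%s" % (host, pub, px)).read())
    print(date.today().strftime("%Y-%m-%d ") + "all pages downloaded. Done!")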

    Good luck!


  6. Hi there. I have come across a different situation, where this code does not work. The ‘issuu’ document I would like to download is found on a website, but the owner has removed it from his/her account webpage. In other words, you can only see the file on the owner’s webpage, and only if you register (which is free, but requires email & password). It is obvious that the Ruby code is developed so that it only downloads documents from this domain “{$PUB}”, but is it possible to download from other sites, as in the case I described above?

    Any help would be appreciated.

  7. The “compressed barely legible images” problem is because the JPG files stored on the server are basically higher-quality thumbnails, generated from the original source Flash files. Those are accessed via a path ending in $documentid/swf/page_1.swf, and so forth. (I think those SWF files are in turn generated from vector images, which I haven’t yet worked out how to find.)

    I don’t suppose you also happen to have a ruby script lying around that will grab those SWFs, download them, extract the original images as SVGs and put them together into a PDF file do you? 😆

  8. I copied the code into notepad and saved it as .rb file.
    Then I double-clicked the file and all I get is a “page_001.jpg” but it is BLANK.

    What am I doing wrong?
    I copied EXACTLY the code provided.
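
    One possible cause of a blank page_001.jpg is that the request returned something other than image data, for instance because $PAGES and $PUB were never changed from whatever was pasted. A quick sanity check, sketched below, is to look at the first two bytes before saving: a JPEG file starts with FF D8. The helper name looks_like_jpeg? is made up for illustration and is not part of the original script.

```ruby
# Sketch: check that downloaded data starts with the JPEG signature (FF D8)
# before writing it to disk. `looks_like_jpeg?` is a hypothetical helper.
JPEG_MAGIC = "\xFF\xD8".b

def looks_like_jpeg?(data)
  data.b.start_with?(JPEG_MAGIC)
end

puts looks_like_jpeg?("\xFF\xD8\xFF\xE0 rest of file".b)  # JPEG-style header
puts looks_like_jpeg?("<html>Not found</html>")            # an error page
puts looks_like_jpeg?("")                                  # an empty response
```

    In the download loop you could skip, or at least report, any page whose data fails this check instead of saving it.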


