Ruby script: download document JPGs from ISSUU

Ian Wishart has released his excellent book ‘The Divinity Code’ onto the wild woolly web. Sign up to issuu & download the document PDF now!

HOWEVER, sometimes publishers don’t give you a PDF. You can download the JPGs one by one, which is laborious, but come on! You’re using a computer, a machine that specialises in automating boring tasks. Time for a script, methinks.

This little Ruby snippet downloads all the JPG files that make up a document on issuu.com into the current directory (wherever you run the script from). It needs a wee bit of setup; instructions are provided in the code comments.
NOTE: you’ll need Ruby installed to use it!

#-------------------------------------------------
# get_issuu.rb - retrieve all jpg's for a document
#-------------------------------------------------

# 1. Open the issuu.com document in your web browser, as usual
#    example: http://issuu.com/iwishart/docs/thedivinity
# 2. Read the document page count; set variable $PAGES below
# 3. From browser menu, choose View > Source
# 4. Do a text search for "documentId" 
# 5. Copy string such as "081230122554-f76b0df1e7464a149caf5158813252d9" 
#    to $PUB variable below
# 6. Execute script:
#    ruby get_issuu.rb

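require 'open-uri'

$PUB = "081230122554-f76b0df1e7464a149caf5158813252d9"  # documentId (step 5)
$PAGES = 20                                             # page count (step 2)

(1..$PAGES).each do |x|
  px   = "page_#{x}.jpg"             # file name on the issuu image server
  padx = format("page_%03d.jpg", x)  # zero-padded local name, so files sort in page order
  puts "#{Time.now.strftime('%Y-%m-%d %X')} get #{px} >> #{padx}"
  # URI.open works on Ruby 2.5+; plain open() stopped accepting URLs in Ruby 3.0
  File.binwrite(padx, URI.open("http://image.issuu.com/#{$PUB}/jpg/#{px}").read)
end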
puts("#{Time.now.strftime('%Y-%m-%d %X')} DONE")
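
Steps 3 to 5 in the comments (view source, search for "documentId", copy the string) could probably be automated too. Here is a minimal sketch, not a tested scraper: the helper name extract_document_id and the pattern are assumptions, based only on the shape of the example ID above (12 digits, a hyphen, 32 hex characters). The real page source may look different.

```ruby
# Hypothetical helper: pull the documentId out of page source text.
# ASSUMPTION: the ID follows the word "documentId" after a few punctuation
# characters, and has the same shape as the example ID in the script.
DOCUMENT_ID_PATTERN = /documentId\W{0,10}(\d{12}-[0-9a-f]{32})/

def extract_document_id(html)
  match = DOCUMENT_ID_PATTERN.match(html)
  match && match[1]  # nil when nothing that looks like an ID is found
end

# Made-up snippet of page source, for illustration only.
sample = 'var cfg = {"documentId":"081230122554-f76b0df1e7464a149caf5158813252d9"};'
puts extract_document_id(sample)
```

If the helper returns nil, fall back to the manual search in your browser.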

Hat tip: rubynoob.


38 thoughts on “Ruby script: download document JPGs from ISSUU”

  1. JPGs go to wherever the script is invoked.

    Try this
    1) copy the script “get_issuu.rb” to your desktop
    2) start command prompt and cd to desktop (C:\winnt\profiles\???\desktop , or C:\users\???\desktop , depends on your OS)
    3) run the script as described
    4) JPGs copied to the desktop :)
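
If you would rather not depend on where the script is invoked from, the output path can be built explicitly. A rough sketch only: page_path and SCRIPT_DIR are made-up names, not part of the original script.

```ruby
# Directory holding this script; falls back to the current directory
# when Ruby cannot tell (for example with `ruby -e`).
SCRIPT_DIR = __dir__ || Dir.pwd

# Hypothetical helper: zero-padded JPG path inside a chosen directory.
def page_path(page, dir = SCRIPT_DIR)
  File.join(dir, format("page_%03d.jpg", page))
end

puts page_path(7, "/tmp/issuu")
```

In the download loop you would then write to page_path(x) instead of the bare file name, so the JPGs land next to the script no matter which directory the terminal is in.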

  2. I installed macruby on my mac and tried to run the script in terminal and got this message:

    get_issuu.rb:1: syntax error, unexpected $undefined, expecting ‘}’
    {\rtf1\ansi\ansicpg1252\cocoartf1038\cocoasubrtf350
    ^
    ???

  3. Disregard.

    It worked after I saved the .rb file in Dreamweaver instead of TextEdit!?

    But just so you know, the images are saved in my users folder even though I had the get_issuu.rb file located on my desktop and I executed it from there also in Terminal ;)

    Thanks a lot!

  4. I second the explanation on downloading Original FULL SIZE images… I’m getting compressed hardly legible images :| totally bumming me out brah…

  5. Thanks for sharing!
    And this is my python code:

    # get_issuu.py (Python 3; urllib2 became urllib.request)

    from datetime import date
    from urllib.request import urlopen

    pub = "081230122554-f76b0df1e7464a149caf5158813252d9"
    pages = 20
    for x in range(1, pages + 1):  # range() excludes its end, so add 1 to get the last page
        px = "page_%s.jpg" % x
        padx = "page_%03d.jpg" % x
        print(date.today().strftime("%Y-%m-%d") + " " + str(x) + " get " + px + " >> " + padx)
        with open(padx, "wb") as f:
            f.write(urlopen("http://image.issuu.com/" + pub + "/jpg/" + px).read())

    print(date.today().strftime("%Y-%m-%d ") + str(pages) + " pages downloaded. Done!")

    Good luck!

    Hung

  6. Hi there. I have come across a different situation, where this code does not work. The ‘issuu’ document I would like to download is embedded in a website, but the owner has removed it from his/her issuu.com account page. In other words, you can only see the file on the owner’s website, and only if you register (which is free, but requires email & password). It is obvious that the ruby code is developed so that it only downloads documents from this domain “http://image.issuu.com/#{$PUB}”, but is it possible to download from other sites, like in the case I described above?

    Any help would be appreciated.
    Thanks.

  7. The “compressed barely legible images” problem is because the files stored on image.issuu.com are basically higher quality thumbnails—JPGs generated from the original source flash videos. Those are accessed via https://page.issuu.com/$documentid/swf/page_1.swf, and so forth. (I think those SWF files are in turn generated from vector images, which I haven’t yet worked out how to find.)

    I don’t suppose you also happen to have a ruby script lying around that will grab those SWFs, download them, extract the original images as SVGs and put them together into a PDF file do you? :lol:
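
For anyone who wants to poke at those SWF pages, the two URL patterns mentioned on this page can be wrapped in one small function. This is a sketch under assumptions: page_url is a made-up helper, and both patterns come only from the post and the comment above, so issuu may not serve them in this form.

```ruby
# Hypothetical helper: build a per-page URL for either format.
# ASSUMPTION: URL patterns are copied from the post (jpg) and the
# comment above (swf); neither is confirmed by issuu.
def page_url(document_id, page, kind)
  case kind
  when :jpg then "http://image.issuu.com/#{document_id}/jpg/page_#{page}.jpg"
  when :swf then "https://page.issuu.com/#{document_id}/swf/page_#{page}.swf"
  else raise ArgumentError, "unknown kind: #{kind.inspect}"
  end
end

doc = "081230122554-f76b0df1e7464a149caf5158813252d9"
(1..3).each { |n| puts page_url(doc, n, :swf) }
```

This only builds the URLs. Downloading works the same way as in the script, and extracting images from the SWF files is a separate problem that this sketch does not attempt.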

