Tag archive: download

Download from VitalSource

Log in and save the session cookie:

wget --keep-session-cookies --user-agent=Mozilla/5.0 --save-cookies cookies.txt --post-data 'user%5Bemail%5D=someone%40someplace.com&user%5Bpassword%5D=asdf&return=https%3A%2F%2Fevantage.gilmoreglobal.com%2F%23%2F&failure=https%3A%2F%2Fevantage.gilmoreglobal.com%2F%23%2Fuser%2Fsignin%2Ffailure%2Fsomeone%2540someplace.com&jigsaw_brand=evantage' https://jigsaw.vitalsource.com/login

"Print" the whole book in the reader and look for the page whose URL contains "print":
https://jigsaw.vitalsource.com/api/v0/books/somebook/print?from=chapter-1&to=chapter-end

This returns an HTML page containing all the image links. I saved it as links.txt and then downloaded the images:
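# Keep only lines with a src attribute, drop script and stylesheet lines,
# build absolute URLs and ask for the 2048 variant instead of 800.
LINKS=$(sed -re '/src/!d' -e '/\.js/d' -e '/\.css/d' -e 's#.*src="([^"]*)".*#https://jigsaw.vitalsource.com/\1#' -e 's#800#2048#' -e 's#%20# #' links.txt)
counter=1
# Read one URL per line so that links containing spaces are not split apart.
while IFS= read -r i; do
  wget --load-cookies cookies.txt "${i}" -O "$(printf "%03d" "${counter}").jpg"
  ((counter+=1))
done <<< "$LINKS"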
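
To see what the sed filter does without touching the server, here is a small offline sketch. The sample HTML lines and the function name extract_links are made up for illustration and are not taken from a real book. It uses -E, which GNU sed treats the same as -r, and [^"]* to stop at the first closing quote, since sed has no non-greedy match.

```shell
# Hypothetical helper: reads HTML on stdin, prints one absolute image URL per line.
extract_links() {
  sed -E -e '/src/!d' -e '/\.js/d' -e '/\.css/d' \
      -e 's#.*src="([^"]*)".*#https://jigsaw.vitalsource.com/\1#' \
      -e 's#800#2048#'
}

# Made-up sample input: a script line, a stylesheet line and one image line.
extract_links <<'EOF'
<script src="app.js"></script>
<link href="style.css" rel="stylesheet">
<img src="books/somebook/images/1/800">
EOF
```

Only the image line survives, and it is printed as https://jigsaw.vitalsource.com/books/somebook/images/1/2048.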

If printing isn’t available, you can iterate through the pages and extract the image source from each page's HTML. Open the first page and look in the frame source code for a URL like "https://jigsaw.vitalsource.com/books/HK758SH-00-DATASH-E/pages/348102883/content". The number 348102883 can be incremented to reach the following pages.
You get the image source link with:


wget --header="Accept-Encoding: compress, gzip" --header="Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8" --user-agent='Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/53.0.2785.143 Chrome/53.0.2785.143 Safari/537.36' --ignore-length --keep-session-cookies --save-cookies cookies.txt --load-cookies cookies.txt 'https://jigsaw.vitalsource.com/books/HK758SH-00-DATASH-E/pages/348102883/content' -O - | gunzip | sed -re '/src/!d' -e '/\.js/d' -e 's#.*src="([^"]*)".*#https://jigsaw.vitalsource.com/\1#' -e 's#800#2000#' -e 's#%20# #'

Some page numbers may be missing; a request for a missing number returns a 302 status code. When running this in a loop, first check the status with

wget --keep-session-cookies --save-cookies cookies.txt --load-cookies cookies.txt -S "https://jigsaw.vitalsource.com/books/HK758SH-00-DATASH-E/pages/348102884/content?create=true" 2>&1 | grep "HTTP/" | awk '{print $2}' | head -n1

to see whether it is 200 or 302. If it is 200, you can fetch the image source link for that page.
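
The loop could look roughly like the sketch below. The function names, the BASE variable and the command-line arguments are my own assumptions, and -O /dev/null is added only so the check does not leave downloaded files behind; the status pipeline itself is the one shown above.

```shell
#!/bin/sh
# Sketch only: BASE and all function names are illustrative, not part of any API.
BASE='https://jigsaw.vitalsource.com/books/HK758SH-00-DATASH-E/pages'

# Read "wget -S" output on stdin and print the first HTTP status code.
first_status() {
  grep "HTTP/" | awk '{print $2}' | head -n1
}

# Print the status code for one page number ($1).
page_status() {
  wget --keep-session-cookies --save-cookies cookies.txt --load-cookies cookies.txt \
       -S "${BASE}/$1/content?create=true" -O /dev/null 2>&1 | first_status
}

# Walk the page numbers from $1 to $2 and report which ones exist.
scan_pages() {
  page=$1
  while [ "$page" -le "$2" ]; do
    status=$(page_status "$page")
    if [ "$status" = "200" ]; then
      echo "$page: 200, page exists"
    else
      echo "$page: ${status:-no response}, skipped"
    fi
    page=$((page + 1))
  done
}

# Run only when a first and a last page number are given as arguments.
if [ "$#" -eq 2 ]; then
  scan_pages "$1" "$2"
fi
```

Saved under a name of your choice, for example scan.sh, it would be called as "sh scan.sh 348102883 348102893"; the end of the range is a guess and has to be adjusted per book. For every page reported as 200, run the image-link command from above with that page number.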