Archive for the month: December 2016

Download from VitalSource

Log in and save the session cookie:

wget --keep-session-cookies --user-agent=Mozilla/5.0 --save-cookies cookies.txt --post-data 'user%5Bemail%5D=someone%40someplace.com&user%5Bpassword%5D=asdf&return=https%3A%2F%2Fevantage.gilmoreglobal.com%2F%23%2F&failure=https%3A%2F%2Fevantage.gilmoreglobal.com%2F%23%2Fuser%2Fsignin%2Ffailure%2Fsomeone%2540someplace.com&jigsaw_brand=evantage' https://jigsaw.vitalsource.com/login
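
The --post-data string above is percent-encoded by hand (@ becomes %40, [ and ] become %5B and %5D). A minimal sketch of a helper that could do this for you, assuming plain ASCII input; the name urlencode is made up here and is not part of wget:

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the original post): percent-encode an ASCII string.
# Unreserved characters (letters, digits, . ~ _ -) are kept, everything else becomes %XX.
urlencode() {
    local LC_ALL=C            # byte-wise character ranges; ASCII input assumed
    local s="$1" out="" c i
    for ((i = 0; i < ${#s}; i++)); do
        c="${s:i:1}"
        case "$c" in
            [a-zA-Z0-9.~_-]) out+="$c" ;;
            *) out+=$(printf '%%%02X' "'$c") ;;   # "'c" makes printf print the character code
        esac
    done
    printf '%s\n' "$out"
}
```

It could then be used as --post-data "user%5Bemail%5D=$(urlencode "$EMAIL")&user%5Bpassword%5D=$(urlencode "$PASSWORD")&...", where EMAIL and PASSWORD are placeholder variables.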

Use the reader's print function on the whole book and look for the request whose URL contains "print":
https://jigsaw.vitalsource.com/api/v0/books/somebook/print?from=chapter-1&to=chapter-end

This returns an HTML page containing all the image links. I saved it as links.txt.
sed -r -e '/src/!d' -e '/\.js/d' -e '/\.css/d' -e 's#.*src="([^"]*)".*#https://jigsaw.vitalsource.com/\1#' -e 's#800#2048#' -e 's#%20# #' links.txt > urls.txt
counter=1
# read line by line so that URLs containing spaces are not split
while IFS= read -r url; do
    wget --load-cookies cookies.txt "$url" -O "$(printf "%03d" "$counter").jpg"
    ((counter+=1))
done < urls.txt
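
To see what the sed filter does to a single line, here is a small sketch that wraps it in a function and can be fed any HTML on stdin. The function name image_urls is made up here; the host, the 800 to 2048 size swap and the %20 handling are taken from the command above.

```bash
#!/usr/bin/env bash
# Read HTML on stdin and print one full-size image URL per line.
image_urls() {
    sed -E -e '/src="/!d' -e '/\.js/d' -e '/\.css/d' \
        -e 's#.*src="([^"]*)".*#https://jigsaw.vitalsource.com/\1#' \
        -e 's#800#2048#' -e 's#%20# #'
}

# Example with an invented line (the path is hypothetical):
#   echo '<img src="books/ABC/images/800/page%201.jpg" alt="p">' | image_urls
```

Lines that reference scripts or stylesheets are dropped, and only the value of the src attribute survives, prefixed with the host.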

If printing isn't available, you can iterate through the pages and extract the image source from the HTML of each one. Open the first page and look in the frame source for something like "https://jigsaw.vitalsource.com/books/HK758SH-00-DATASH-E/pages/348102883/content". The number 348102883 can be incremented to reach the following pages.
You get the image source link with:


wget --header="Accept-Encoding: compress, gzip" --header="Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8" --user-agent='Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/53.0.2785.143 Chrome/53.0.2785.143 Safari/537.36' --ignore-length --keep-session-cookies --save-cookies cookies.txt --load-cookies cookies.txt 'https://jigsaw.vitalsource.com/books/HK758SH-00-DATASH-E/pages/348102883/content' -O - | gunzip | sed -re '/src/!d' -e '/\.js/d' -e 's#.*src="([^"]*)".*#https://jigsaw.vitalsource.com/\1#' -e 's#800#2000#' -e 's#%20# #'

If a page number does not exist, the server answers with a 302 status code. When running in a loop, first check the status code with

wget --keep-session-cookies --save-cookies cookies.txt --load-cookies cookies.txt -S "https://jigsaw.vitalsource.com/books/HK758SH-00-DATASH-E/pages/348102884/content?create=true" 2>&1 | grep "HTTP/" | awk '{print $2}' | head -n1

to see whether it is 200 or 302. If it is 200, you can fetch the image source link for that page.
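
The status check and the iteration over page numbers can be combined. A rough sketch, under a few assumptions: the function names first_status, fetch_headers and existing_pages are made up here, and -O /dev/null is added so that the probe does not leave files behind.

```bash
#!/usr/bin/env bash
# Print the first HTTP status code found in `wget -S` output read from stdin.
first_status() {
    awk '/HTTP\// { print $2; exit }'
}

# Hypothetical helper: request one URL and print the server response headers.
fetch_headers() {
    wget --keep-session-cookies --save-cookies cookies.txt --load-cookies cookies.txt \
        -S -O /dev/null "$1" 2>&1
}

# Print every page number between first and last whose content URL answers 200.
existing_pages() {
    local book="$1" first="$2" last="$3" id url
    for ((id = first; id <= last; id++)); do
        url="https://jigsaw.vitalsource.com/books/${book}/pages/${id}/content?create=true"
        if [ "$(fetch_headers "$url" | first_status)" = "200" ]; then
            echo "$id"
        fi
    done
}
```

For example, existing_pages HK758SH-00-DATASH-E 348102883 348102900 would list the page numbers in that range that are worth fetching; the upper bound is only an illustration.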

Download a film from Arte as individual segments and merge them

  • Find and download the .m3u8 file
  • Remove the superfluous information (comment lines and blank lines)
    sed -i -e '/^#/d' -e '/^$/d' index_1_av.m3u8
  • Download the files in parallel
    cat index_1_av.m3u8 | parallel --gnu "wget -c {}"
  • Rename the files so that they are joined in the right order (10# forces base 10 in case a number has leading zeros)
    for i in *.ts; do extracted_number=$(sed -re 's/segment([0-9]*)_.*/\1/' <<<"$i"); mv "$i" "segment_$(printf "%04d" "$((10#$extracted_number))").ts"; done
  • Join them into one film
    cat segment_* > film.ts
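
The renaming step matters because the shell sorts file names as text, so segment10 would come before segment2. A small sketch of the filtering and renaming steps as functions, assuming the files are named like segment12_something.ts as in the loop above; the function names playlist_urls and pad_segments are made up here.

```bash
#!/usr/bin/env bash
# Print only the segment URLs of a playlist: drop tag/comment lines and blank lines.
playlist_urls() {
    sed -e '/^#/d' -e '/^$/d' "$1"
}

# Rename segmentN_*.ts in the current directory to segment_NNNN.ts (zero-padded).
pad_segments() {
    local f n
    for f in segment[0-9]*_*.ts; do
        [ -e "$f" ] || continue      # nothing matched the pattern
        n=${f#segment}               # strip the "segment" prefix
        n=${n%%_*}                   # keep only the digits before the first underscore
        mv -- "$f" "segment_$(printf '%04d' "$((10#$n))").ts"
    done
}
```

After pad_segments has run, cat segment_* > film.ts joins the parts in numeric order.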