One obvious issue with both these downloaders is the lack of proper modularization, which I believe greatly hinders adoption. I would expect some kind of plugin system. Naturally, there are attempts to fix that: https://github.com/un-def/dl-plus is one example. As a bonus, that would help greatly with situations like the recent youtube-dl takedown, and the RIAA would have barely made a splash.
Plenty of other features immediately come to mind as well: universal media support, proper parallelism, a GUI, desktop integrations, proxy support, anti-captcha... The punchline is: you could find all of this and much more in jdownloader 10 years ago. But somehow youtube-dl won that race. How did that happen?
Jdownloader has the aesthetic of sketchy Windows freeware from a bygone era. I tried it once, then uninstalled it a minute later. youtube-dl also integrates well with other tools and autonomous workflows; for example, mpv invokes it to stream media from the web without downloading it to disk.
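For what it's worth, that integration is literally just pointing mpv at the page URL; mpv shells out to youtube-dl behind the scenes. Something like this (VIDEO_ID is a placeholder, and the 720p cap is just an example of youtube-dl's format syntax):

    # stream without saving to disk; mpv hands the URL to youtube-dl
    mpv "https://www.youtube.com/watch?v=VIDEO_ID"
    # optionally cap the resolution via youtube-dl's format selection
    mpv --ytdl-format="best[height<=720]" "https://www.youtube.com/watch?v=VIDEO_ID"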
I think it is OSS [1], although all the code is in Subversion and I don't see a web interface. I haven't dug through it much, but poking around inside the "trunk" module, I see files with headers that say they're GPLv3.
EDIT: Here's a Git mirror of trunk [2]. See, e.g., this file [3].
The emphasis was more on the "no web interface" part. In 2020, I feel like Subversion is significantly less common than git, so having to install SVN + possibly learn SVN commands is a moderate barrier to browsing the code.
I always run it under a separate user account because it looks so sketchy.
I thought it was sketchy by association. It used to always be recommended by sketchy pirate sites to bypass the download limits on sketchy one-click hosters.
I still use jdownloader; it does its job. It works on a wide range of sources, and works on YouTube very well.
The install bit is true: you literally need to google "jdownloader clean installers no adware", which is pretty bad.
This comment comes up every time weboob is mentioned, but it'd probably see much better adoption if it was named anything other than "we boob". It's a bit of an off-putting name.
> youtube-dl sort of situations and RIAA would have barely made a splash
First, that wouldn't be good; such situations should make big splashes. Second, removing a YouTube plug-in from youtube-dl would make a significant splash anyway.
I mean, proxy support is listed in the config docs[1]. Those docs also break down the functionality of gallery-dl by module, leading me to believe you either didn't read or didn't understand the docs.
[1] describes very basic proxy support indeed. However, I can't imagine a use case for that. What would be useful is support for rotating through a proxy list with every download request, together with auto-updating that list (a rough wrapper sketch follows below).
As for modules: splitting each site's support into its own file is an absolute minimum for sanity. Keeping up all that support in one project quickly becomes quite a chore for a single maintainer. The whole process of fetching the entire project and initiating a PR on GitHub is rather awkward for developers too. Hence the existence of the https://github.com/un-def/dl-plus project.
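To illustrate the proxy-rotation idea above (just a rough sketch, not a built-in feature of either tool): assuming your downloader accepts a --proxy flag the way youtube-dl does, a thin wrapper can at least pick a fresh proxy per invocation from a list that some external job keeps refreshed. True per-request rotation would still need support inside the tool itself.

    #!/usr/bin/env bash
    # Rough sketch: pick a random proxy per run. Assumes youtube-dl's
    # --proxy flag and a proxies.txt (one proxy URL per line) that is
    # refreshed by some external job -- both are assumptions here,
    # not built-in behaviour.
    PROXY=$(shuf -n 1 proxies.txt)
    exec youtube-dl --proxy "$PROXY" "$@"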
Not to take anything away from how great it is to have something like this, but some of the supported sites have heavy rate limiting and bot detection, and using this with your account can easily get you banned.
For example, I had immense difficulty parsing my own saved posts from Instagram (I used a one-off script that runs in the browser).
I had an account rate limited with this, though maybe I was using it wrong. It ended up forcing me to give it a mobile number to confirm, which I didn't want to do.
Not sure why they'd do this when a spammer (not me) would just create a new account. I was just archiving twice a day with a random sleep time between accounts.
I have been scraping regularly from Instagram (and a bunch of different sites) using gallery-dl for a while now, and I have yet to face any issues. Granted, I scrape at most only once a day, in a batch. So I don't know what problems you might face if you scrape more often.
The only rate-limit issues I've faced are with Twitter, because the way I do it is that I feed gallery-dl a text file containing the profile URLs that I want to scrape. But gallery-dl doesn't add a delay in between each input URL, so Twitter might force a temporary cooldown on you if your list is a bit long.
But you can easily avoid this by writing your own shell script that adds a custom delay between URLs.
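Something along these lines works for me (a minimal sketch; urls.txt and the 30-90 second range are arbitrary choices, not anything gallery-dl prescribes):

    #!/usr/bin/env bash
    # Run gallery-dl on each profile URL from a list, sleeping a random
    # 30-90 seconds between invocations so Twitter doesn't force a cooldown.
    while IFS= read -r url; do
        [ -z "$url" ] && continue           # skip blank lines
        gallery-dl "$url"
        sleep $(( RANDOM % 61 + 30 ))       # random pause, 30-90 s
    done < urls.txt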
Does Instagram not have an export function (similar to Facebook) for personal data to comply with CCPA and GDPR? If not, they should be reported to the appropriate regulatory bodies to encourage such an export function.
Disclaimer: I don’t have or use Insta, so apologies if this is a naive question.
I'm perfectly serious; I want to download a bunch of stupid memes that my account has liked and saved into various groups in my account over the years. They're public photos that shitpost accounts have posted.
If you find a way, please comment here again. I have saved thousands of posts and even sorted them. I desperately want to download them before nuking my account.
Your likes absolutely are your own interactions, and it's no violation of privacy to allow you access to data to which you already have access. You can't like it unless you can view it first. In fact (it's hidden but it's there) you can go back and view all of your past likes on Instagram.
This is not privacy, this is walled garden nonsense. Instagram thinks they own all your curation activity on the site, and they do not.
Does anyone know of a downloader that works for private Facebook Groups? I'm a member of some local history groups and the material posted is amazing. I'd like to refer to it in the future, likely long after the posters are gone. I can't rely on Facebook being around then.
Copying and pasting into a personal archive is slow when you want to capture everything (posts and comments) since you don't know which history you'll want to refer to in the future.
There are browser extensions that simply let you save all (or some selection of) images that load on a website. Maybe that's good enough if you can't find one that has proper support for Facebook groups.
I imagine you could quite quickly just click through the material and save it. Won't work for text or metadata like descriptions/titles I guess...
Facebook actively wants to make this hard. I hope you figure out a way. If so, please post to HN. I think many Facebook group users and admins would be interested.
I wrote something similar, but for the purpose of saving material to a personal image board while using a mobile device or browser, using a web service architecture. This means you don't have to use a terminal every time you want to save an image, and several different clients can be used without needing to rewrite the extraction code for each one. It also saves the original tags if the source supports them, which makes everything way more searchable. But this new program has a lot more site support than mine.
We need good wrappers for these kinds of programs for use with mobile devices.
The call to curl is missing its closing " character. Also, it seems some people submit a low-res image in the post, then post another link as the first reply, which is the case today:
Interesting that the list of image upload sites is a who's-who of image hosts. Always looking for alternatives to Imgur, which is laden with ads and doesn't work with an ad-blocker turned on (it explicitly asks you to turn it off).
Hm, I couldn't find the list of supported sites. Is Pinterest covered? Or does anyone know of similar tools that would work for Pinterest? As a means of "backup" and/or sync with Pinry.
FB (and Instagram, WhatsApp) do something weird with all media files. They assign a unique ID to each image/video/file, and they don't let you download it if you try to remove or change that ID. It changes every time you reload the page. Maybe it has something to do with that.
Plus there is also massive rate limiting, especially on Instagram, since it is only images.