Subject | Re: [firebird-support] Re: Mailing list change? |
---|---|
Author | River~~ |
Post date | 2018-08-11T10:33:51Z |
Hi Mark
I'm actually thinking about an accessible archive so the old posts
remain available in some form.
Yes so I understood.
But creating a 'just-in-case' backup is a
good idea.
Not least because it allows the post-disaster creation of an accessible archive.
Me:
it would be straightforward to wget everything as a zillion
> html files ...
Mark:
Not quite that easy: Yahoo applies rate limiting on their rest endpoints.
I suggested wget as it has options to cope with that issue. I recommend these:
--wait=66 --random-wait
Together these would insert a random wait of 33 to 99 sec between files. Just over half the gaps are >1min which helps stay under the radar, as does the random interval. Vary the figure as required:
A wait of 66 will fetch about 1300 files per day. For an 18-year archive, assuming a dozen posts per day, that would take about 2 months to download (adjust in proportion to the actual average traffic over the history).
Long time to stay up for a Windows machine, reasonable for Linux.
A wait of 10 would take under a fortnight but 8000 files/day might be noticed. Maybe try it to start with?
(Uptime on the ancient laptop that now runs Linux and serves as my router is over a year, so a two or three month fetch is doable; a Raspberry Pi with an attached external hard drive would do it easily and use less power.)
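
The arithmetic behind those figures can be checked with a short Python sketch. It assumes the numbers used above (18 years of history, a dozen posts per day, one file per post) and that --random-wait spreads each gap evenly between 0.5 and 1.5 times the --wait value. It ignores the time each download itself takes, so treat the results as rough lower bounds. The function names are made up for illustration.

```python
# Rough estimate of how long a slow, polite wget crawl would take.
# All inputs are assumptions taken from the discussion, not measured values.
SECONDS_PER_DAY = 24 * 60 * 60


def files_per_day(wait_seconds):
    """Average files fetched per day; the random wait averages out to --wait."""
    return SECONDS_PER_DAY / wait_seconds


def days_to_fetch(years, posts_per_day, wait_seconds):
    """Days needed to fetch one file per post over the whole history."""
    total_files = years * 365 * posts_per_day
    return total_files / files_per_day(wait_seconds)


def fraction_of_gaps_over(threshold, wait_seconds):
    """Share of gaps longer than threshold, assuming a uniform 0.5x..1.5x wait."""
    low, high = 0.5 * wait_seconds, 1.5 * wait_seconds
    clipped = min(max(threshold, low), high)
    return (high - clipped) / (high - low)


if __name__ == "__main__":
    for wait in (66, 10):
        print(f"--wait={wait}: about {files_per_day(wait):.0f} files/day, "
              f"about {days_to_fetch(18, 12, wait):.0f} days in total")
    print(f"gaps over 60 s with --wait=66: {fraction_of_gaps_over(60, 66):.0%}")
```

Change the years and posts-per-day arguments once the real traffic figures are known.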
The advantage of taking a precautionary backup is that there is (probably!) no need to hurry. Take it slow and you won't take too much bandwidth from other Yahoo customers, won't make Yahoo's problems worse, and won't fall foul of their rate limiter.
See
to figure out which options you need to define your recursive download. You can avoid picking up graphic files for example.
Once it's running I suggest taking a look at its download directory tree after a few hours and then once a day to confirm it's still running and not downloading unwanted stuff.
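
One possible way to do that daily check without browsing the tree by hand is a small Python script that counts files by extension. The directory name "archive" and the function name are hypothetical. The sketch compares counts rather than file dates because wget normally stamps downloaded files with the server's modification time, so dates may not show when a file actually arrived.

```python
import os
from collections import Counter


def summarize_tree(root):
    """Count files by extension under root and add up their sizes in bytes."""
    counts = Counter()
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1].lower() or "(none)"
            counts[ext] += 1
            total_bytes += os.path.getsize(os.path.join(dirpath, name))
    return counts, total_bytes


if __name__ == "__main__":
    # "archive" is a hypothetical download directory; use wherever wget writes.
    counts, total_bytes = summarize_tree("archive")
    for ext, n in counts.most_common():
        print(f"{ext:10} {n}")
    print(f"total: {sum(counts.values())} files, {total_bytes} bytes")
```

If the total has not grown since the previous check, the fetch has probably stalled; a pile of image extensions suggests the recursion options need tightening.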
You probably know that wget is a standard command line utility on GNU / Linux.
Various people have compiled it as an .exe for the Windows command line, some are listed at
Good luck!
R~~