It would shock me if there's a working copy of 'rm' allowed anywhere near the Internet Archive. They take this stuff down for compliance, but my dream is that the data lives on, waiting for a saner day when the legal climate for archivists gets a little warmer.
It's tough though--supportive as I am of the Internet Archive's goals. How is an "archivist" different from a random individual who scrapes stuff off the Internet and rehosts it? In the aggregate, the Internet Archive looks different from the typical person who is copying articles and blog posts, wrapping them up in ads, and displaying them. The IA is non-profit, doesn't run ads, etc. It also respects robots.txt. But it's not that clear to me what the legal regime would be that allows the Internet Archive to function free and clear and doesn't hit cases that most would agree are shady.
The only difference between science and screwing around is writing it down.
You can be trained as an archivist. You can get a Masters and a PhD in archival practice. There are industry-standard procedures and codes of ethics. There's a very specific understanding of what is important to save, how to save it, and how to document its context and its provenance.
That's why the Internet Archive requires a certain fidelity of capture (WARCs) that a screenshot service or a citation tool doesn't provide.
That's also why they are legally a library. Libraries have particular copyright exemptions for preservation. A typical person doesn't. But you generally have the right to make backups for your own use, and so you can also donate those backups.
It's like if you were a famous person, and you bought a newspaper and a book, and when you died your personal effects were donated to your alma mater, which put on a big exhibit of your life and times. That newspaper (your backup of the original that lives in the hard drives of the publisher) and that book (your backup of the original that some author wrote) are there, too. No one's conferring any rights to the content; the publisher still owns the newspaper, and the author still owns the book, but that was your copy that is now available for everyone to see.
But libraries can't freely republish out-of-print books, which is about what the IA is doing. The equivalent would require the archive to have a room you'd visit with a terminal connected to the archive.