Sunday, October 24, 2010

The TED Talks Torrents Project

"There are those that look at things the way they are, and ask why? 
I dream of things that never were, and ask why not?" - Robert F. Kennedy
Problem: TED.com Talks pages often fail to load from my home internet connection now (Pune, India, BSNL broadband, in case anyone wants to analyze). I have to refresh the page several times just to get it to load. But the direct Talks download links from their RSS/podcast feed work fine. I’m not sure of the maximum download speed you can get on them, though. I get up to 60 KB/s, which is the most my connection can take. I wonder what speed someone on an 8 Mbps or faster connection gets. If you can clock it, please post it as a comment here along with your ISP, as I’ve done above. Back to the problem: could it be that the TED website is getting inundated by its increasing viewership?

Casual comments: They ought to start releasing all their Talks through torrents now, so that new converts (read: people who just came across it and gotta catch ‘em all) can download a whole bunch using a torrent, without affecting the performance of TED.com’s servers. We already have an aggregated list (though it’s not exhaustive); this should go one step further. If TED doesn’t, then some teenager somewhere will write code to spider through the list of Talks, track down the videos attached to them and download them en masse. I haven’t seen anything like this yet (I haven’t looked), but I’m just saying it’s a logical consequence and is bound to happen at some point.

Foresight: Once this code gets out and gets thousands of users, it’s surely going to crash TED’s servers in a few days or even hours. Compare the costs: on one side, fixes, downtime, IP-restrictive patches (a retrograde move, denying a user more Talks after a quota has been consumed, like you’re limiting knowledge! People will walk away!) and a massive expansion of bandwidth capacity; on the other, just making a Torrent of all the Talks and having the whole world share them peer-to-peer, which will cost next to nothing!

Ponderings: Technically, the Talks ARE released under a Creative Commons license: the video files can be reproduced and shared all over as long as the actual content isn’t manipulated. But someone like me or you doing this independently wouldn’t be able to include, or would have to work too hard at, the features I want in that torrent:
  1. With the Talk’s video file there should be an accompanying HTML/other doc, documenting whatever info is there on the website, including the description, links to the speaker’s bio and his/her contact links, and the full transcript. Also maybe a separate XML file holding meta information (e.g. tags) that’s easily readable by programs.
  2. The subtitles feature is present only in TED’s web player. When we download the Talks, no subs! Someone has made a site where you can get the subs, but it’s too cumbersome to do for the many Talks I’ve already downloaded! TED.com has the database linking each of its Talks with the relevant subtitles. The Torrent would include the subs with the Talks, just like DVDs do.
  3. All versions of the Talks (mp3, low-res FLV, iPod-level mp4 and HD) should be made available in the same place.
How great would this be! It would easily become the biggest and most downloaded and shared torrent on the web, replacing all the allegedly illegal ones! (I maintain my neutral stance: I have objections to modern copyright laws.) Initially TED could host it on a drive of its own, and then in a few weeks, as more and more people download and share it, TED could just close it off and not need to bother about past Talks anymore!
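The machine-readable metadata file from point 1 could be as simple as the Python sketch below. Everything specific in it is my own assumption for illustration: the element names (talk, title, speaker, tags, files), the function name build_talk_metadata and the sample values are made up, not a schema that TED publishes.

```python
import xml.etree.ElementTree as ET


def build_talk_metadata(title, speaker, description, tags, files):
    """Return an XML string describing one Talk.

    The element names are a hypothetical schema chosen for this sketch.
    """
    talk = ET.Element("talk")
    ET.SubElement(talk, "title").text = title
    ET.SubElement(talk, "speaker").text = speaker
    ET.SubElement(talk, "description").text = description

    # One <tag> element per tag, so programs can filter Talks easily.
    tags_el = ET.SubElement(talk, "tags")
    for tag in tags:
        ET.SubElement(tags_el, "tag").text = tag

    # One <file> element per bundled version (audio, video, subtitles).
    files_el = ET.SubElement(talk, "files")
    for kind, name in files.items():
        ET.SubElement(files_el, "file", {"kind": kind}).text = name

    return ET.tostring(talk, encoding="unicode")


# Placeholder values only; a real script would read these from the site.
xml_text = build_talk_metadata(
    title="Example Talk",
    speaker="Example Speaker",
    description="A short description copied from the Talk's page.",
    tags=["technology", "design"],
    files={
        "audio": "ExampleTalk.mp3",
        "video-hd": "ExampleTalk-HD.mp4",
        "subtitles-en": "ExampleTalk.en.srt",
    },
)
print(xml_text)
```

Dropping one such file next to each video would let any script rebuild an index of the whole collection without going back to the website.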

Proposal: With this, I’m going to propose: the TED Talks Torrent project – what I want the fascinating team at TED, or a bunch of volunteers, to get working on ASAP!
  1. Just like their amazing success with the multi-language subtitles, launch an initiative to have a definite .torrent, with a magnet URL, created for every TED Talk. The torrent should contain the Talk’s audio as well as all available video versions (typically FLV, iPod MP4 and HD), plus a text or HTML file with links to the speaker’s bio, the Talk’s short description as it appears on the website and video feed, and the full transcript. And of course, all available subtitles in the best available format (.sub? .srt?).
  2. Suggestion: The TED website’s backend team can code an automated script to parse the database and do all of this for them in a short time, as they already have it all linked together in each Talk’s webpage. It’ll be easier to do this internally rather than having a regular visitor deploy a spider that can easily make mistakes and waste time. Nevertheless, if they don’t, we will.
  3. As in the subs project, an army of volunteers can be recruited for free from the site’s visitors to audit all the torrents, raise flags and make corrections wherever needed.
  4. Now at every Talk’s page, in the layer that appears when one presses the download button, simply include a link to the .torrent file, or the magnet URL or both! They seem to have space over there for that anyway!
  5. Also create a unified torrent that has all the Talks, possibly split among folders by event, category etc. Have some way of automatically updating that torrent every month or so as new Talks are added to the hive. Maybe we could have a script running which aggregates all the .torrent files together, same way RSS feed aggregators work.
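To give an idea of how little is involved in the ".torrent plus magnet URL" part, here is a minimal Python sketch for a single-file torrent. It relies on how BitTorrent (version 1) works: the torrent's "info" dictionary is bencoded, and the SHA-1 of those bytes is the info hash that goes into the magnet link. The function names, the tracker URL and the file name are placeholders I made up, and the sketch skips multi-file torrents, which a real per-Talk bundle would need.

```python
import hashlib
from urllib.parse import quote


def bencode(value):
    """Encode ints, strings/bytes, lists and dicts in bencoding."""
    if isinstance(value, bool):
        raise TypeError("booleans are not part of bencoding")
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must appear in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(value))


def make_info(name, data, piece_length=262144):
    """Build the 'info' dictionary for a single-file torrent."""
    # 'pieces' is the concatenation of the 20-byte SHA-1 of every piece.
    pieces = b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )
    return {"name": name, "length": len(data),
            "piece length": piece_length, "pieces": pieces}


def build_torrent(info, tracker_url):
    """Return the bytes of a .torrent file (tracker_url is a placeholder)."""
    return bencode({"announce": tracker_url, "info": info})


def magnet_uri(info):
    """Return a magnet link carrying the info hash and a display name."""
    info_hash = hashlib.sha1(bencode(info)).hexdigest()
    return "magnet:?xt=urn:btih:%s&dn=%s" % (info_hash, quote(info["name"]))


# Stand-in bytes instead of a real video file.
demo_info = make_info("ExampleTalk.mp4", b"\x00" * 600000)
demo_torrent = build_torrent(demo_info, "http://tracker.example.com/announce")
print(magnet_uri(demo_info))
```

The magnet link only carries the hash and a name, so it is short enough to sit next to the existing download links on each Talk's page.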

What’s in this for TED?


image credit: http://wurstgott.deviantart.com/gallery/#/d1fgjcy

Intangible: Everything! Torrents and P2P are a (relatively) new, disruptive and paradigm-shifting technology and I’m surprised I haven’t seen a Talk on it yet! (Invite me!) Though by common perception it’s seen as bad, remember that even a kitchen knife can be used to cook food as well as to kill people. If we got scared of the latter and banned knives altogether, we would cease to exist! Even so, TED being TED should harness it to spread Ideas like never before. Even the TEDx events can follow suit: forget about expensive webcasting and third-party video hosting with strings attached; a TEDx event can be shared with the whole world from a single computer with just a free torrent client and a community of diligent seeders! Spread ideas even further! But apart from this intangible benefit, a more tangible, economic benefit is on the cards:

Tangible: TED being a non-profit, every penny that comes in has to be optimally utilized. Presently, with 700+ Talks being viewed by over 300 million people and the numbers growing by the day, the bandwidth expenses of their servers are just going to go up and up (which is why I started this post in the first place!). They have to insert ads from fat-ass evil (?) corporates after most of their Talks just to offset these expenses. Though most of the ads look benign in themselves, I’d rather go without them if given the choice! I do not want to see a situation where a big sponsor threatens to pull out and puts TED in jeopardy just because of one Talk with a disruptive idea that would put the said sponsor out of business (which is probably why they haven’t featured P2P technology yet!). Think about it: would we have had that incredible Wikileaks talk had the Pentagon been a benefactor? I think not! By making Torrent links available for each Talk, they can take a huge load off their servers, maybe move the HD versions to torrent-only download, and save massively on bandwidth expenses while allowing for infinite viewership growth, thus reducing their dependency on sponsors. And then we might just live to see a TED Talk that doesn’t have an advertisement at the end!

Challenge: So, what say you? I don’t have the technical know-how or any contacts to pull this off, but maybe together we can! Comment on this post to show your support, tweet and share it if you think it’s an Idea worth Spreading! Try to get the TED team and your local TEDx team to read this! At least the TEDx guys all over the world can start putting out torrents of their events! You can reach me directly on nikhil.js [at] gmail.com

Further ideas: (I don’t have the know-how to implement these, but maybe some day someone could take it from here and do this)
  1. We need a way to browse websites P2P (peer-to-peer). If a site has 100 visitors who have its pages, images and all sorts of files (including streaming video) cached, they could become intermediate servers for another 100,000 visitors. Thus a website that’s hosted for free on a limited-bandwidth server will not get bogged down by its own increasing popularity, and the website’s creators don’t have to worry. Presently, I can host a basic website on my own PC using Opera Unite or some other personal web server solution, but I can’t afford to have too many visitors. The P2P or torrent model could eliminate these limitations and make us independent of third-party web hosts.
  2. It should be possible to have a torrent that’s dynamic and updatable, so that more files can be added to it over time. Then the user doesn’t have to go through the trouble of deleting an old torrent, starting a new one and pointing it to the original files, and doesn’t get his/her Karma level (which shows how much you have already seeded) reset.
  3. The rigid folder structure of the torrent should make way for something more dynamic like Gmail’s labels, so that two categories can have the same file under them. Also, even if individual files are moved to different places on the user’s system (I mean file A gets put here and B there), the torrent should still be able to point to them wherever they are and share them. Maybe a shortcut to the file placed in the download location could do the trick.
Interesting reads:
http://www.brainpickings.org/index.php/2009/04/28/ted-talks-typographic-visualization/
When You Realize That Copyright Law Violates Free Speech Rights, You Begin To Recognize The Problems...
No Copyright Law - The Real Reason for Germany's Industrial Expansion?

2 comments:

Nikhil Sheth said...

This link answered one of my biggest ponderings:
http://en.wikipedia.org/wiki/Metalink

It's a new format that combines the best of http/ftp and bit torrent.

soawesomejohn said...

I came across this looking for torrents of some of the ted talks. I actually wanted to get audio files to listen to during my commute. I understand I'd miss out on the power point stuff, but I figure the talk itself should still be fairly interesting.

I think it would be fairly simple code-wise to create individual torrents. One could track the RSS feed, get the base name, and then download the matching mp4 and audio files. A torrent file could be generated from this as well. In fact, torrent supports web seeds. So you don't even have to download the original file.

http://getright.com/seedtorrent.html
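The feed-tracking step could look roughly like the Python sketch below. The sample feed, its example.com URLs and the function name list_enclosures are made up for illustration; a real script would fetch the actual podcast feed rather than parse an inline string. Each enclosure URL is kept so it can later be listed as a web seed (the url-list field of a .torrent file).

```python
import posixpath
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Hypothetical feed content standing in for the real RSS/podcast feed.
SAMPLE_FEED = """<rss version="2.0"><channel>
  <title>Example Talks feed</title>
  <item>
    <title>Example Talk One</title>
    <enclosure url="http://example.com/talks/ExampleTalkOne_2010.mp4"
               length="1000" type="video/mp4"/>
  </item>
  <item>
    <title>Example Talk Two</title>
    <enclosure url="http://example.com/talks/ExampleTalkTwo_2010.mp4"
               length="2000" type="video/mp4"/>
  </item>
  <item>
    <title>Item without a media file</title>
  </item>
</channel></rss>"""


def list_enclosures(feed_xml):
    """Return one dict per feed item that has an <enclosure> element."""
    root = ET.fromstring(feed_xml)
    results = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        if enclosure is None:
            continue  # nothing to download for this item
        url = enclosure.get("url")
        # Base name = file name from the URL path, without its extension.
        file_name = posixpath.basename(urlparse(url).path)
        base_name = posixpath.splitext(file_name)[0]
        results.append({
            "title": item.findtext("title"),
            "url": url,
            "base_name": base_name,
            # Candidate value for the torrent's 'url-list' (web seed) key.
            "web_seeds": [url],
        })
    return results


for entry in list_enclosures(SAMPLE_FEED):
    print(entry["base_name"], entry["url"])
```

How the base name maps to the matching audio file depends on the real feed's URL layout, so the sketch stops at extracting the name.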


I would try and avoid making one massive torrent with everything that gets outdated each week. Instead, I would have torrents released every three months containing that quarter's TED talks.

Ideally, TED themselves would get involved and generate the torrents themselves. But if not, you could have a simple website listing the existing torrent files along with an RSS feed for these torrent files.
