Sunday, January 20, 2013

Sunday Globe Special: Government Has Got Your Tweet

Where do you think it went? 

"Library of Congress finds archiving tweets no small task" by Adrienne LaFrance  |  Washington Post, January 19, 2013

WASHINGTON — In the few minutes it will take you to read this story, 3 million new tweets will have flitted across the publishing platform Twitter and ricocheted across the Internet.

I almost stopped reading right there. 

The Library of Congress is busy archiving the sprawling and frenetic Twitter canon — with some key exceptions — dating back to the site’s 2006 launch. That means saving for posterity more than 170 billion tweets and counting, with an average of more than 400 million new tweets sent each day, according to Twitter.

But in the two years since the library announced this unprecedented acquisition project, few details have emerged about how its unwieldy corpus of 140-character bursts will be made available to the public....

Colorado-based data company Gnip is managing the transfer of tweets to the archive, which is populated by a fully automated system that processes tweets from across the globe.

Each archived tweet comes with more than 50 fields of metadata — where the tweet originated, how many times it was retweeted, who follows the account that posted the tweet, and so on — although content from links, photos, and videos attached to tweets are not included. For security’s sake, there are two copies of the complete collection.

But the library hasn’t started the task of sorting or filtering its 133 terabytes of Twitter data, which it receives from Gnip in chronological bundles, in any meaningful way.

“It’s pretty raw,” Dizard said. “You often hear a reference to Twitter as a fire hose, that constant stream of tweets going around the world. What we have here is a large and growing lake. What we need is the technology that allows us to both understand and make useful that lake of information.”

For now, giving researchers access to the archive remains cost-prohibitive for the library, which has spent tens of thousands of dollars on the project so far, Dizard says.

Aren't the tweets already out there somewhere anyway?

This at a time of austerity, when they can't possibly analyze all the information coming in through their various spy webs. It's all a big con, a big lie, and the co$t to the American taxpayer for a self-administered tyranny is infinite.

Oh, I imagine the voices out there saying, no, no, they are just trying to make records available to the public. Overwhelming amounts of tweets you couldn't analyze with more coming in every day? Reminds me of the stack of unread Globes I'm trying to whittle down for you fine readers.

Like many federal agencies, the Library of Congress has been hit by budget cuts in recent years. Without a major overhaul to its computing infrastructure, it isn’t equipped to handle even the simplest queries....

After six long years of looking out and commenting on all this stuff, I sit here and see the same things over and over again. Unless it is for wars, Wall Street, Israel, corporate welfare, or the lavish, taxpayer-funded lifestyles of the politicos, it's all hatchet.

Oh, I know, I know, some people will say they are paying out retirement and health benefits. Yeah, they HAVE TO DO THAT because if they DIDN'T AT LEAST do SOMETHING they would have us all out in the streets. Over the six years I have been here I have come to learn that government and its mouthpiece media are not looking out for the best interests of the people of this country (or the world). They serve selfish, narrow interests, because they are either part of the agenda-pushing plans or under the control of those who are.

And get this la$t bit:

The library is exploring whether it might be able to afford to pay a third party to provide public access to the archive.

Yeah, let's PRIVATIZE ACCE$$ so some well-connected company can get a contract to manage the unmanageable tweets. 

But for those who have immediate research interests — and many people have contacted the library, Dizard says — the wait is maddening.

--more--"