The Library Of Congress Archives Your Tweets
Posted on January 14th 2013
In 2010, the Library of Congress used its Facebook page to announce it was acquiring the entire Twitter archive – all public tweets – back to March 2006. And it has been archiving public tweets ever since. Think about that. In the few minutes it will take you to read this, over three million new tweets will have flooded the Internet, part of the roughly 400 million that Twitter estimates are sent every day.
In the couple of years since the Library of Congress made its announcement, no details have emerged as to how this database of tweets is going to be made available to the public. As it turns out, the Library of Congress hasn’t figured that out yet.
“People expect fully indexed – if not online searchable – databases, and that’s very difficult to apply to massive digital databases in real time,” said Deputy Librarian of Congress Robert Dizard Jr. “The technology for archival access has to catch up with the technology that has allowed for content creation and distribution on a massive scale. Twitter is focused on creating and distributing content; that’s the model. Our focus is on collecting that data, archiving it, stabilizing it and providing access; a very different model.”
Gnip is a Colorado company providing “Full historical access to the Twitter firehose.” Gnip manages the flow of tweets to the Library of Congress archive. Each tweet arrives at the archive with multiple fields of metadata, including where the tweet originated, how many times it was retweeted, who follows the account that posted the tweet, and more. But the Library of Congress has yet to determine how it is going to sort its 133 terabytes of Twitter data, received from Gnip in chronological bundles. Robert Dizard Jr. says:
It’s pretty raw. You often hear a reference to Twitter as a fire hose, that constant stream of tweets going around the world. What we have here is a large and growing lake. What we need is the technology that allows us to both understand and make useful that lake of information.
As it stands, the Library is not able to provide access to people wanting to research the database. Doing so is cost-prohibitive, and the Library has been hit with budget cuts. Without a major overhaul to its technological infrastructure, the Library doesn’t have the ability to process even the most basic of search requests.
“We know from the testing we’ve done with even small parts of the data that we are not going to be able to, on our own, provide really useful access at a cost that is reasonable for us,” Dizard said. “For even just the 2006 to 2010 [portion of the] archive, which is about 21 billion tweets, just to do one search could take 24 hours using our existing servers.”
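To put Dizard's figures in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes a search is a single linear pass over the data, which is a simplification for illustration and not a description of how the Library's servers actually work. The function names are hypothetical; the numbers come from the quotes in this post.

```python
# Figures quoted in the article.
TWEETS_2006_2010 = 21_000_000_000   # "about 21 billion tweets"
SEARCH_SECONDS = 24 * 60 * 60       # "one search could take 24 hours"
NEW_TWEETS_PER_DAY = 400_000_000    # Twitter's estimate of daily tweets


def implied_scan_rate(tweets: int, seconds: int) -> float:
    """Tweets examined per second, ASSUMING one full pass over the data."""
    return tweets / seconds


def minutes_to_scan(tweets: int, rate_per_second: float) -> float:
    """Minutes needed to scan `tweets` at the given rate."""
    return tweets / rate_per_second / 60


rate = implied_scan_rate(TWEETS_2006_2010, SEARCH_SECONDS)
minutes_per_day_of_tweets = minutes_to_scan(NEW_TWEETS_PER_DAY, rate)

print(f"Implied scan rate: {rate:,.0f} tweets per second")
print(f"Extra search time per day of new tweets: {minutes_per_day_of_tweets:.1f} minutes")
```

Under that assumption, the existing servers would be working through roughly 243,000 tweets per second, and every additional day of tweets would add close to half an hour to a single search.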
“Milliseconds is not uncommon for expected latency from when the tweet happened to when someone would be able to get it and analyze it,” he said.
One day we’ll be able to visit the Library of Congress and research the archive in person. Dizard says this was a condition of the deal with Twitter, which gifted the archive, so that the Library won’t be “competing with the commercial sector.”
Certainly this project is further evidence of the fact that what you say online is going to be online forever.
What do you think about this project to archive all our tweets? Is it a useful one?