Blog

Historical Architecture - Data Mining Billions of Tweets

The technology behind the platform: In addition to the existing live and buffered streaming, DataSift can now record data to its massive storage cluster. The core of the recording platform is an HBase cluster of more than 30 nodes, with over 400TB of storage. Every piece of information is replicated…


Monitoring Eurozone sentiment for just 20 cents an hour

Introduction: I wrote a stream today to monitor social media commentary on the meeting between German Chancellor Angela Merkel and French President Nicolas Sarkozy in Paris. It’s “the start of a crucial week for the Eurozone,” one report read, and it almost sounded like an understatement. Definitely, senti…


High Scalability

DataSift is the subject of the latest post on the High Scalability blog, which includes a detailed overview of the platform architecture and the problems involved in meaningfully filtering unstructured data from the Twitter API in real time. ‘You have to be able to reliably consume it, normalize i…


Standard and Poor's Downgrades US Banks

Here's a filter that collects comments and sentiment on Standard and Poor's downgrade of US banks. With DataSift, it's easy to filter out the insignificant content and focus on the things that are being retweeted, the thoughts from key players, and the comments that have strong sentiment. tag…


Introducing : Links Augmentations

One of our favourite features of DataSift is our Links Augmentation. In short, it is used to fully resolve any URL to its original, un-shortened form, allowing us to fetch content from the page the link points to. On top of resolving the link, we also fetch the content of the page and allow you to…
