Richard Caudle | 30th April 2014
This is a quick post to update you on some changes we've introduced recently to help you work with our platform and make your life a little easier.
Filtering On Content Age
We aim to deliver you data as soon as we possibly can, but for some sources there can be a delay between publication to the web and our delivery which is out of our control.
In most cases this does not have an impact, but in some situations (perhaps you only want to display extremely fresh content to a user) this is an issue.
For these sources we have introduced a new target, .age, which allows you to specify the maximum time since the content was posted. For instance if you want to filter on blog posts mentioning 'DataSift', making sure that you only receive posts published within the last hour:
blog.content contains "DataSift" AND blog.age < 3600
This new target applies to the Blog, Board, DailyMotion, IMDB, Reddit, Topix, Video and YouTube sources.
Push Destinations - New Payload Options
Many of our customers are telling us they can take much larger data volumes from our system. We aim to please, so have introduced options to help you get more data quicker.
Increased Payload Sizes
To enable you to receive more data quicker from our push connectors, we have upped the maximum delivery sizes for many of our destinations. See the table below for the new maximum delivery sizes.
As the data we deliver to you is text, compression can be used to greatly reduce the size of files we deliver, making transport far more efficient. Although compression rates do vary, we are typically seeing an 80% reduction in file size with this option enabled.
We have introduced GZip and ZLib compression to our most popular destinations. You can enable compression on a destination by selecting the option in your dashboard, or by specifying the output_param.compression parameter through the API.
When data is delivered you can tell it has been compressed in two ways:
HTTP destination: The HTTP header 'X-DataSift-Compression' will have the value none, zlib or gzip as appropriate
S3, SFTP destinations: Files delivered to your destination will have an addition '.gz' extension is they have been compressed, for example DataSift-xxxxxxxxxxxxxxxxxxx-yyyyyyy.json.gz
Here's a summary of our current push destinations support for these features.
|Destination||Maximum Payload Size||Compression Support|
|HTTP||200 MB||GZip, ZLib|
To stay in touch with all the latest developer news please subscribe to our RSS feed at http://dev.datasift.com/blog/feed
And, or follow us on Twitter at @DataSiftDev