POST /pylon/{service}/task

Submit a new PYLON task.

An HTTPS POST request sent to:

https://api.datasift.com/v1.4/pylon/{service}/task

A successful call to this endpoint returns 202 Accepted and a JSON object containing the id of the new task.
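As a rough illustration, the request could be assembled in Python along the following lines. This is a sketch rather than an official client: the helper name build_task_request is hypothetical, and the "username:api_key" format of the Authorization header is an assumption that should be checked against the authentication documentation. The request is built but deliberately not sent.

```python
import json
import urllib.request

API_BASE = "https://api.datasift.com/v1.4"


def build_task_request(service, username, api_key, body):
    """Build (but do not send) the POST request for a new PYLON task."""
    url = f"{API_BASE}/pylon/{service}/task"
    headers = {
        "Content-Type": "application/json",
        # Assumption: credentials are passed as "username:api_key".
        "Authorization": f"{username}:{api_key}",
    }
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


# Request body taken from the time series example in the Examples section.
body = {
    "subscription_id": "e9dde04774540ac119c2317a4d15a8b3a1350937",
    "name": "Time series analysis 1",
    "type": "analysis",
    "parameters": {
        "parameters": {
            "analysis_type": "timeSeries",
            "parameters": {"interval": "hour", "span": 1},
        }
    },
}

request = build_task_request("linkedin", "your_username", "your_api_key", body)
# To actually submit the task: urllib.request.urlopen(request)
```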

Parameters

Parameter Description
service
required

The service to which the task relates.

Example value: linkedin

subscription_id
required

The id of the recording you want the task to run against.

Example value: f8dde04774540ac119c2317a4d15a8b3a1350939

name
required

A human-readable name for the task.

Example value: Top articles in the US

type
required

The type of task to run. Currently only analysis tasks are supported.

Example value: analysis

parameters
required

The parameters for the task to run.

The parameters depend on the type of analysis you are looking to perform.

start
optional

You can optionally specify the time period for the analysis task to process. The start parameter sets the start of the time period.

Note that time ranges are treated as inclusive on the start and exclusive on the end.

Default:

  • If you specify a start and no end, the task's time period will be from the start point until now.
  • If you omit both the start and end parameters, the task falls back to a default time period:
    • For frequency distribution analysis tasks, the last 24 hours are analyzed.
    • For time series analysis tasks, the maximum period allowed by the selected interval is analyzed. For example, if the interval is hour, the analysis defaults to 336 hours (2 weeks).
end
optional

You can optionally specify the time period for the analysis task to process. The end parameter sets the end of the time period.

Note that time ranges are treated as inclusive on the start and exclusive on the end.

Default:

  • If you specify a start and no end, the task's time period will be from the start point until now.
  • If you omit both the start and end parameters, the task falls back to a default time period:
    • For frequency distribution analysis tasks, the last 24 hours are analyzed.
    • For time series analysis tasks, the maximum period allowed by the selected interval is analyzed. For example, if the interval is hour, the analysis defaults to 336 hours (2 weeks).
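The default rules for start and end can be sketched as a small client-side helper. The function names resolve_time_period and in_period are hypothetical, and the sketch covers only the defaults stated above (24 hours for freqDist, 336 hours for timeSeries with an hour interval). It assumes timestamps are Unix seconds, and it raises an error for cases this page does not describe instead of guessing.

```python
import time

HOUR = 3600  # seconds


def resolve_time_period(analysis_type, interval=None, start=None, end=None, now=None):
    """Return (start, end) as Unix timestamps, filling in the documented defaults."""
    now = int(time.time()) if now is None else now
    if start is not None:
        # A start with no end runs from the start point until now.
        return start, (now if end is None else end)
    if end is not None:
        raise ValueError("behavior for an end without a start is not documented here")
    if analysis_type == "freqDist":
        return now - 24 * HOUR, now
    if analysis_type == "timeSeries" and interval == "hour":
        return now - 336 * HOUR, now
    raise ValueError("default period for this interval is not stated on this page")


def in_period(timestamp, start, end):
    """Time ranges are inclusive on the start and exclusive on the end."""
    return start <= timestamp < end
```

For example, in_period(start, start, end) is true while in_period(end, start, end) is false, which mirrors the inclusive start and exclusive end.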
filter
optional

You can optionally specify a CSDL filter to be applied to the data processed by the task.

If you omit this parameter, the task processes the entire content of the index within the time period selected by the start and end parameters.

offset
optional

When specified this time offset is automatically applied to the start and end parameters to adjust for your timezone.

The offset is expressed in hours.

Examples:

  • 8 Adjust for a timezone that is eight hours ahead of UTC.
  • +8 Adjust for a timezone that is eight hours ahead of UTC.
  • -8 Adjust for a timezone that is eight hours behind UTC.

For example, by setting the offset to -8 and passing a time range that matches a 24-hour PST day, you can receive time series results at daily intervals and ensure that an author whose interactions fall on two different UTC days, but on the same PST day, is never double counted.
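The double-counting scenario can be illustrated with two timestamps that fall on different UTC days but on the same PST day. This is only a local illustration of the day-bucketing idea; it does not reproduce how the platform applies the offset internally, and the helper name local_day is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def local_day(timestamp, offset_hours=0):
    """Return the calendar day a Unix timestamp falls on after shifting UTC by offset_hours."""
    utc = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    # int() accepts the forms shown above: "8", "+8" and "-8".
    return (utc + timedelta(hours=int(offset_hours))).date()


# Two interactions by the same author on 1 January 2017, PST (UTC-8).
morning = int(datetime(2017, 1, 1, 17, 0, tzinfo=timezone.utc).timestamp())  # 09:00 PST
evening = int(datetime(2017, 1, 2, 7, 0, tzinfo=timezone.utc).timestamp())   # 23:00 PST

utc_days = {local_day(morning), local_day(evening)}              # two distinct UTC days
pst_days = {local_day(morning, "-8"), local_day(evening, "-8")}  # one PST day
```

Bucketed by UTC day, the author appears in two daily buckets; with an offset of -8, both interactions land in the same day.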

analysis_type
required for analysis tasks

Specifies the type of analysis to perform. One of:

  • timeSeries
  • freqDist

For nested frequency distribution analysis, freqDist is the only valid type for child analysis groups.

interval
required for timeSeries

For time series analysis, the size of the time buckets into which results are broken down. One of:

  • month
  • week
  • day
  • hour
  • minute

This parameter is not used for freqDist.

span
optional

For time series analysis, how many interval units to span.

A span value greater than 1 can be applied to the intervals week, day, hour, minute but not to month.

For example, if the interval is "week" and span is 2, the output is grouped into two-week buckets.

This parameter is not used for freqDist.
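The interval and span rules can be checked on the client before a task is submitted. The following is a minimal sketch under the rules stated above; the function name time_series_parameters is hypothetical, and the server remains the authority on what is valid.

```python
INTERVALS = ("month", "week", "day", "hour", "minute")


def time_series_parameters(interval, span=1):
    """Build the analysis object for a timeSeries task, enforcing the documented rules."""
    if interval not in INTERVALS:
        raise ValueError(f"interval must be one of {INTERVALS}")
    if interval == "month" and span > 1:
        raise ValueError("a span greater than 1 cannot be applied to month")
    return {
        "analysis_type": "timeSeries",
        "parameters": {"interval": interval, "span": span},
    }


# Two-week buckets, as in the example above.
two_week_buckets = time_series_parameters("week", 2)
```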

target
required for freqDist

For frequency distribution analysis, the target to analyze.

This parameter is not used for timeSeries.

threshold
required for freqDist

For frequency distribution analysis, the maximum number of results to return.

Maximum value is 200.

This parameter is not used for timeSeries.
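A similar client-side check can be sketched for frequency distribution parameters. The function name freq_dist_parameters is hypothetical; only the maximum threshold of 200 comes from this page.

```python
MAX_THRESHOLD = 200


def freq_dist_parameters(target, threshold):
    """Build the analysis object for a freqDist task, enforcing the documented maximum."""
    if threshold > MAX_THRESHOLD:
        raise ValueError(f"threshold must not exceed {MAX_THRESHOLD}")
    return {
        "analysis_type": "freqDist",
        "parameters": {"target": target, "threshold": threshold},
    }


top_countries = freq_dist_parameters("li.user.member.country", 10)
```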

child
optional

For frequency distribution analysis only, specifies a nested analysis query. Up to three levels of nesting are permitted (a parent plus children nested to two levels).

This is an object containing:

  • analysis_type
  • parameters
    • target
    • threshold
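Nested queries can be tedious to write by hand, so a helper along these lines may be useful. This is a sketch: the function name nested_freq_dist is hypothetical, and it assumes each child object sits alongside analysis_type and parameters in its parent, as in the nested examples on this page.

```python
MAX_LEVELS = 3


def nested_freq_dist(levels):
    """Build a nested freqDist analysis object.

    levels is a list of (target, threshold) tuples, outermost level first.
    """
    if not 1 <= len(levels) <= MAX_LEVELS:
        raise ValueError(f"between 1 and {MAX_LEVELS} levels are permitted")
    node = None
    # Build from the innermost level outwards so each parent can wrap its child.
    for target, threshold in reversed(levels):
        current = {
            "analysis_type": "freqDist",
            "parameters": {"target": target, "threshold": threshold},
        }
        if node is not None:
            current["child"] = node
        node = current
    return node


age_by_gender = nested_freq_dist([
    ("li.user.member.age", 5),
    ("li.user.member.gender", 2),
])
```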

Examples

  1. Create a time series analysis task

    This set of parameters will create a time series analysis task that breaks volume down into hourly buckets.

    {
      "subscription_id": "e9dde04774540ac119c2317a4d15a8b3a1350937",
      "name": "Time series analysis 1",
      "type": "analysis",
      "parameters": {
        "parameters": {
          "analysis_type": "timeSeries",
          "parameters": {
            "interval": "hour",
            "span": 1
          }
        }
      }
    }
  2. Create a frequency distribution analysis task

    This set of parameters will create a frequency distribution analysis task that analyzes the breakdown of active members by country.

    {
      "subscription_id": "e9dde04774540ac119c2317a4d15a8b3a1350937",
      "name": "Frequency distribution analysis 1",
      "type": "analysis",
      "parameters": {
        "parameters": {
          "analysis_type": "freqDist",
          "parameters": {
            "target": "li.user.member.country",
            "threshold": 10
          }
        }
      }
    }
  3. Create an analysis task with query filter

    This set of parameters performs the same analysis as the previous example, but uses the filter parameter to analyze only members in a certain age group.

    {
      "subscription_id": "e9dde04774540ac119c2317a4d15a8b3a1350937",
      "name": "Frequency distribution analysis 2",
      "type": "analysis",
      "parameters": {
        "filter": "li.user.member.age == \"25-34\"",
        "parameters": {
          "analysis_type": "freqDist",
          "parameters": {
            "target": "li.user.member.country",
            "threshold": 10
          }
        }
      }
    }
  4. Create an analysis task for an explicit time period

    This set of parameters performs the same analysis as the previous example, but uses the start and end parameters to restrict analysis to the period from 1 January 2017 up to, but not including, 15 January 2017.

    {
      "subscription_id": "e9dde04774540ac119c2317a4d15a8b3a1350937",
      "name": "Frequency distribution analysis 3",
      "type": "analysis",
      "parameters": {
        "filter": "li.user.member.age == \"25-34\"",
        "start": 1483228800,
        "end": 1484438400,
        "parameters": {
          "analysis_type": "freqDist",
          "parameters": {
            "target": "li.user.member.country",
            "threshold": 10
          }
        }
      }
    }
  5. Create a two-level nested analysis task

    This set of parameters performs an age-gender analysis using a nested query with a top level and one child analysis level.

    {
      "subscription_id": "e9dde04774540ac119c2317a4d15a8b3a1350937",
      "name": "Frequency distribution analysis 4",
      "type": "analysis",
      "parameters": {
        "parameters": {
          "analysis_type": "freqDist",
          "parameters": {
            "target": "li.user.member.age",
            "threshold": 5
          },
          "child": {
            "analysis_type": "freqDist",
            "parameters": {
              "target": "li.user.member.gender",
              "threshold": 2
            }
          }
        }
      }
    }
  6. Create a three-level nested analysis task

    This set of parameters performs a country, age, gender analysis using a nested query with a top level and two child analysis levels.

    {
      "subscription_id": "e9dde04774540ac119c2317a4d15a8b3a1350937",
      "name": "Frequency distribution analysis 5",
      "type": "analysis",
      "parameters": {
        "parameters": {
          "analysis_type": "freqDist",
          "parameters": {
            "target": "li.user.member.country",
            "threshold": 10
          },
          "child": {
            "analysis_type": "freqDist",
            "parameters": {
              "target": "li.user.member.age",
              "threshold": 6
            },
            "child": {
              "analysis_type": "freqDist",
              "parameters": {
                "target": "li.user.member.gender",
                "threshold": 2
              }
            }
          }
        }
      }
    }
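The start and end values used in example 4 are consistent with Unix timestamps (seconds) at midnight UTC, and can be derived as shown in this sketch. The helper name utc_timestamp is hypothetical.

```python
from datetime import datetime, timezone


def utc_timestamp(year, month, day):
    """Return the Unix timestamp (seconds) for midnight UTC on the given date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


start = utc_timestamp(2017, 1, 1)   # 1483228800
end = utc_timestamp(2017, 1, 15)    # 1484438400
days_covered = (end - start) // 86400  # 14
```

Because the end of a time range is exclusive, this period covers the 14 full days from 1 January to 14 January 2017.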

Output

If the task is accepted for processing, the id of the new task is provided in the output:

{
    "id": "f3756f8de519cbd7449b8780e7eaffd407eb7f00"
}

Responses

Response code Description
Status 202 Accepted

On success the response provides the task id you need to use to track progress of the task and retrieve analysis results.

{
    "id": "{task id}"
}
Status 400 Bad Request

One or more of the passed parameters is invalid, or a required parameter is not present.

{
    "error": "{error message}"
}
Status 404 Not Found

The recording you specified using subscription_id was not found.

{
    "error": "Subscription not found"
}
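Client code might map these responses to a return value or an exception along the following lines. This is a sketch: TaskError and parse_task_response are hypothetical names, and only the three status codes listed above are interpreted.

```python
import json


class TaskError(Exception):
    """Raised when the task was not accepted."""

    def __init__(self, status, message):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message


def parse_task_response(status, body):
    """Return the new task id on 202, otherwise raise TaskError."""
    if status == 202:
        return json.loads(body)["id"]
    if status in (400, 404):
        # Both documented error responses carry an "error" message.
        raise TaskError(status, json.loads(body)["error"])
    raise TaskError(status, "unexpected response status")


task_id = parse_task_response(202, '{"id": "f3756f8de519cbd7449b8780e7eaffd407eb7f00"}')
```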

Notes

  1. All calls to the API must be properly authenticated with a DataSift username and API key.
  2. All calls to the API must be versioned. The current version is v1.4.
  3. The Rate Limit Cost for this endpoint is 25. However, this cost is not taken from your regular allowance of credits. Instead it is taken from a special allowance described under the API rate limit for POST /pylon/{service}/task section on our platform allowances page. Your exact rate limit for this endpoint depends on your package.

Resource information

Rate limit cost: 25

Requires authentication: Yes

Response formats: JSON, JSONP