Connecting the Query Builder to DataSift

Once your implementation of the Query Builder is fully operational, it is time to connect it to our platform. You need to capture the JCSDL code that your users generate with the Query Builder, pass it on to DataSift, capture the results, and send them back to the person who created the filter. You are free to implement your own solution here, including user management, which also means it is your responsibility to write the code that glues it all together.

This is not a difficult task, but there are a few things you need to be aware of, and a simple example like the one in this document will help you understand what is going on.
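
As a rough orientation before the step-by-step setup, the server-side part of that flow boils down to two form-encoded POST requests: one to compile the code saved by the Query Builder, and one to create a Push subscription for the resulting hash. The sketch below only builds the request bodies, in Python, using the endpoints and parameters that appear in call.php later in this document. The helper names are hypothetical, the CSDL and hash values are placeholders, and nothing is sent over the network.

```python
from urllib.parse import urlencode

# Endpoints as used in call.php.
COMPILE_URL = "https://api.datasift.com/compile"
PUSH_CREATE_URL = "https://api.datasift.com/push/create"


def auth_headers(username, api_key):
    """Headers sent with both calls, mirroring the ones set in call.php."""
    return {
        "Auth": f"{username}:{api_key}",
        "Content-Type": "application/x-www-form-urlencoded",
    }


def compile_body(csdl):
    """Form body for the compile call: the code saved by the Query Builder."""
    return urlencode({"csdl": csdl})


def push_create_body(stream_hash, receiver_url):
    """Form body for the push/create call that delivers results over HTTP."""
    return urlencode({
        "hash": stream_hash,
        "name": "connectorhttp",
        "output_type": "http",
        "output_params.method": "post",
        "output_params.url": receiver_url,
        "output_params.use_gzip": "false",
        "output_params.delivery_frequency": "60",
        "output_params.max_size": "10485760",
        "output_params.verify_ssl": "false",
        "output_params.auth.type": "none",
    })


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(compile_body('interaction.content contains "music"'))
    print(push_create_body("<hash returned by /compile>", "http://example.com:8888/"))
```

The hash returned by the first call is the only piece of state that links the two requests, which is why call.php performs them strictly in sequence.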

We will be building a page that lets the user construct filters using the Query Builder and see the results without having to reload the page. We need the following components:

  1. A server connected to the internet.

    An AWS EC2 micro instance will do just fine. In this example we will assume that you are running Ubuntu Linux, but the principles apply to any operating system.

  2. Open ports 22, 80, and 8888 on your server.

    If you are using the Amazon AWS EC2 service, you will need to modify the inbound rules for the security group associated with your EC2 instance. You can do that via the AWS EC2 Management Console.

  3. Install Apache.

    $ sudo apt-get install apache2

  4. Install PHP5 along with the Apache PHP5 module, cURL, and JSON.

    $ sudo apt-get install php5 libapache2-mod-php5 php5-curl php5-json

  5. Restart Apache.

    $ sudo apachectl restart

  6. Install Python.

    $ sudo apt-get install python2.7

  7. Install the Tornado HTTP server framework.

    See the TornadoWeb site for the installation instructions.

  8. Copy our sample HTTP connector receiver code to your server.

    You will find a discussion of what that code does on our HTTP connector page.

  9. Start the HTTP connector receiver in a separate terminal window.

    $ python ./connector-http-server.py

    The server prints out some basic debugging information when it receives interactions from DataSift. It is a handy sanity check, which is why it makes sense to start it in a separate terminal window.

  10. Copy the Query Builder source code archive to the document root directory of your HTTP server.

    With Apache running on Ubuntu, the document root is /var/www:

    $ cd /var/www

    $ sudo unzip editor-master.zip

    $ sudo mv editor-master/minified jcsdl

  11. Copy index.html to the document root.

    <head>
    
      <title>How To Connect the Visual Query Builder to DataSift</title>
    
      <script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js"></script>
      <script type="text/javascript" src="jcsdl/minified/jcsdl.min.js"></script>
      <script type="text/javascript" src="jcsdl/minified/jcsdl.definition.js"></script>
      <link rel="stylesheet" type="text/css" href="jcsdl/minified/jcsdl.min.css" />
    
      </head>
    
      <body>
    
      <div id="jcsdl"></div>
    
      <div id="newinteractions"></div>
    
      <script type="text/javascript">
    
      // Base URL of the server hosting call.php and read.php (must end with "/").
      var yourHTTPServerURL = "http://example.com:port/";
      var xDSid = '';
      var editor = new JCSDLGui($('#jcsdl'), {
          // Called when the user saves a filter; sends the JCSDL to call.php.
          save : function( code ) {
              $.post(yourHTTPServerURL + "call.php",
              {csdl: code},
              function(data) {
                  // call.php echoes the id of the new Push subscription.
                  window.xDSid = data;
                  alert(window.xDSid);
              });
          }
      });
    
      setTimeout(updateList, 5000);
    
      // Poll read.php and show the latest interactions under the Query Builder.
      function updateList()
      {
          $.get(yourHTTPServerURL + "read.php", function(data) {
    
              $('#newinteractions').html(data);
    
          }).error(function(x, e) {
    
              if (x.status == 0) {
                  alert('Check Your Network.');
              } else if (x.status == 404) {
                  alert('Requested URL not found.');
              } else if (x.status == 500) {
                  alert('Internal Server Error.');
              } else {
                  alert('Unknown Error.\n' + x.responseText);
              }
          });
    
          setTimeout(updateList, 5000);
      }
    
      </script>
    
      </body>

    Make sure to change the value of the yourHTTPServerURL variable to the actual URL of your server.

  12. Copy call.php to the document root.

    <?php
    
      require '/home/ubuntu/datasift-php/lib/datasift.php';
    
      $USERNAME = 'YourDataSiftUserName';
      $API_KEY = 'YourDataSiftAPIKey';
      $httpServer = 'http://example.com:8888';
      $compileURL = 'https://api.datasift.com/compile';
      $pushCreateURL = 'https://api.datasift.com/push/create';
    
      $headers = array(
              "Auth: ${USERNAME}:${API_KEY}",
              "Content-Type: application/x-www-form-urlencoded"
      );
    
      $pushCreateParams = "name=connectorhttp" . "&" .
              "output_type=http" . "&" .
              "output_params.method=post" . "&" .
              "output_params.url=" . urlencode($httpServer) . "&" .
              "output_params.use_gzip=false" . "&" .
              "output_params.delivery_frequency=60" . "&" .
              "output_params.max_size=10485760" . "&" .
              "output_params.verify_ssl=false" . "&" .
              "output_params.auth.type=none";
    
      $csdl = '';
    
      if ($_POST == Array()) {
              print "Looks like you are trying to run this script from the command line. It will not work.\n";
              exit();        
      } else {
              if (array_key_exists('csdl', $_POST)) {
                      $csdl = $_POST['csdl'];
              }
      }
    
      // compile JCSDL
    
      $compileParams = "csdl=" . urlencode($csdl);
    
      $ch = curl_init();
    
      curl_setopt($ch, CURLOPT_URL, $compileURL);
      curl_setopt($ch, CURLOPT_POST, true);
      curl_setopt($ch, CURLOPT_POSTFIELDS, $compileParams);
      curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
      curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    
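      $response = curl_exec($ch);
      curl_close($ch);
      $response_json = json_decode($response);
    
      // isset() also copes with a failed request, where json_decode() returns null
      if (isset($response_json->{"hash"})) {
              $hash = $response_json->{"hash"};
      } else {
              print_r(isset($response_json->{"error"}) ? $response_json->{"error"} : $response);
              exit();
      }
    
      // create HTTP Push subscription
      // (the hash is prepended to $pushCreateParams in CURLOPT_POSTFIELDS below)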
    
      $ch = curl_init();
    
      curl_setopt($ch, CURLOPT_URL, $pushCreateURL);
      curl_setopt($ch, CURLOPT_POST, true);
      curl_setopt($ch, CURLOPT_POSTFIELDS, "hash=" . $hash . "&" . $pushCreateParams);
      curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
      curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    
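      $response = curl_exec($ch);
      $response_json = json_decode($response);
    
      // on success, echo the subscription id so that index.html can display it
      if (isset($response_json->{"id"})) {
              echo $response_json->{"id"};
      } else {
              print_r(isset($response_json->{"error"}) ? $response_json->{"error"} : $response);
      }
    
      curl_close($ch);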
    
      ?>

    Change the value of $httpServer to the URL that the HTTP connector receiver is listening on, and set your DataSift username and API key. The script makes two API calls: one to /compile, which turns the JCSDL into a stream hash, and one to /push/create, which sets up an HTTP Push subscription for that hash. Do not forget to call /push/delete when you are done experimenting. You will need the id of each Push subscription you want to delete; call.php prints the id when it creates a subscription, and you can also use /push/get to obtain the ids.

  13. Copy read.php to the document root.

    <?php
    
      $debug = FALSE;
      $ids = "";
      $dp = "/tmp";
      $f_ctime = 0;
      $fn = "";
      $fp = "";
    
      if ($dh = opendir($dp)) {
    
          while (false !== ($df= readdir($dh))) {
    
              $fp = "{$dp}/{$df}";
    
              if (is_file($fp) && filectime($fp) > $f_ctime) {
    
                  if (preg_match('/^DataSift-.*\.json$/', $df) > 0) {
    
                      $f_ctime = filectime($fp);
                      $fn = $fp;
    
                  }
              }
          } 
    
          closedir($dh);
      }
    
      // get the latest interactions file (skipped when no matching file exists yet)
    
      if ($fn !== "" && ($fh = fopen($fn, "r"))) {
    
          $f = fread($fh, filesize($fn));
          fclose($fh);
    
          // decode JSON
          $i = json_decode($f);
          $interactions = $i->{"interactions"};
          $interactionCount = count($interactions);
    
          // extract interaction content
    
          $n = 0;
    
          while ($n < $interactionCount) {
    
              $ids = $ids . $interactions[$n]->{"interaction"}->{"created_at"} . "<br />";
              $ids = $ids . $interactions[$n]->{"interaction"}->{"content"} . "<br />";
              $n++;
          }
      }
    
      if ($debug == TRUE) {
              echo 'Now: '. date('Y-m-d, H:i:s') ."\n";
      } else {
              echo $ids;
      }
    
      ?>

  14. Load index.html into your favorite web browser.
  15. Create a new filter.
  16. Click Save and Preview.
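  17. Wait for the interactions you filter for to appear under the Query Builder. The page polls read.php every five seconds, but call.php sets delivery_frequency to 60, so the first batch can take a minute or more to arrive.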
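
Step 12 recommends calling /push/delete once you have finished experimenting. A minimal Python sketch of that cleanup call follows. It reuses the API host and the Auth header from call.php. The helper names are hypothetical, and the assumption that /push/delete takes the subscription id in a parameter named id should be checked against the DataSift API documentation before you rely on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_BASE = "https://api.datasift.com"  # same host as in call.php


def build_request(endpoint, params, username, api_key):
    """Build a form-encoded POST request with the Auth header used in call.php."""
    return Request(
        f"{API_BASE}/{endpoint}",
        data=urlencode(params).encode("ascii"),
        headers={"Auth": f"{username}:{api_key}"},
        method="POST",
    )


def call(endpoint, params, username, api_key):
    """Send the request and decode the JSON reply (an empty reply becomes None)."""
    with urlopen(build_request(endpoint, params, username, api_key)) as resp:
        body = resp.read()
    return json.loads(body) if body else None


def delete_subscription(subscription_id, username, api_key):
    """Delete one Push subscription.

    Assumption: the endpoint identifies the subscription by an "id" parameter.
    """
    return call("push/delete", {"id": subscription_id}, username, api_key)
```

Calling `delete_subscription()` with the id shown in the alert raised by index.html, together with your DataSift username and API key, would send the request. Building the request in a separate function keeps the network call isolated, so the URL, body, and headers can be inspected without contacting the API.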