Solutions

The StreamX platform addresses many of the challenges of big streaming data. Whether the focus is on analyzing streams as they arrive, mining the data offline for deeper understanding, or training AI models, StreamX can find the exact data you are looking for and accelerate your use of that data.

 

Test and Validation

Challenges
  • Finding the right time slices of data to test with from inside many large data files
  • Quickly testing the new application version against those test data
  • Sensitivity analysis based on parametric variation across the state space
StreamX Solution
  • Content-based search of up to 100 PB of streaming data in parallel
  • Automatic parallel execution of the application against that data
  • Test automation engine allowing flexible parametric variation schemes and active feedback control of test scenarios

Training Classifiers

Challenges
  • Finding the right slices of video or sensor data files for training out of hundreds of terabytes or petabytes
  • Manual labor or scripts to feed the training data to the classifier
StreamX Solution
  • Content-based search of up to 100 PB of streaming data in parallel, extracting just the relevant scenes
  • Automated test harness trains the classifier after initial setup from a web-based user interface

Real-time Analysis

Challenges
  • Making newly acquired data instantly available for analysis
  • Quickly comparing newly acquired data to mounds of historical data
StreamX Solution
  • Instantly makes newly acquired data on StreamX data recorders part of the StreamX distributed file system
  • Automatic parallel execution of a comparison application for the new data against existing data

Automated Regression Testing

Challenges
  • Separating test data from data used to develop the application
  • Manually setting up each regression run, or writing complicated scripts to manage some of the testing
StreamX Solution
  • Data management that keeps regression test data separate from development data
  • Automated runs of test data in parallel, executing the application on the cluster nodes where the data resides

How We Do It

XCube has five guiding principles that drive the design of our stream data management solutions:
  • Never require a rewrite of the customer’s application. Customer applications are complex and proprietary.
  • Never move the data. Individual datasets can be terabytes in size, collections of test data can be petabytes, and some datasets are restricted from being moved.
  • Support globally distributed data and teams, so that any of the data can be accessed and used anywhere in the world.
  • Enable customers to define what content is important for searches.
  • Provide the maximum amount of automation, so the customer can focus on the science and engineering instead of IT and data management.