MIT researchers say they have developed a simple tool to help time-series data analysts predict the future. Time-series datasets, defined as collections of observations recorded over time, and the predictions drawn from them are critical for stock analysis, medical diagnostics, and even weather forecasting.
ABILITY TO PREDICT THE FUTURE IS A NEED FOR MANY SCIENTISTS
No one knows what the future holds. Less-than-scientific approaches to forecasting the future include those of psychics, mediums, and astrologers. But such practitioners and their methods have consistently failed to produce scientifically viable results.
Conversely, mathematicians and statisticians who work in data analysis often employ predictive analysis tools to offer a reasonably accurate glimpse of possible future events. Unfortunately, most of these prediction systems depend on complex algorithms and significant computing power, putting them largely out of reach of rank-and-file time-series data analysts.
MIT researchers say they have changed that equation by developing a simplified future prediction algorithm that any researcher can use.
SIMPLE ALGORITHM OUTPERFORMS COMPLEX ‘PREDICT THE FUTURE’ TOOLS
“Making predictions using time-series data typically requires several data-processing steps and the use of complex machine-learning algorithms,” explains a press release announcing the new future prediction tool, “which have such a steep learning curve they aren’t readily accessible to nonexperts.”
Enter MIT researchers and their new tool, the tspDB (time series predict database). Unlike other complex data analysis and prediction tools, tspDB “does all the complex modeling behind the scenes so a nonexpert can easily generate a prediction in only a few seconds.”
Surprisingly, the team behind the new system says tspDB is more accurate and more efficient than nearly all state-of-the-art deep learning methods in two key areas: predicting future values and filling in missing data points. Program researcher Abdullah Alomar says that efficiency exists because tspDB incorporates a “novel time-series-prediction algorithm,” which is uniquely effective at analyzing multivariate time-series data. One cited example is weather analysis, where data points like cloud cover, temperature, and dew point depend on past values.
In their published results, the MIT team explains how they tested the tspDB system against competing algorithms, including cutting-edge deep-learning methods, by analyzing real-world time-series datasets. These included data points from the electricity grid, traffic patterns, and financial markets. As ‘predicted,’ the new algorithm came through with flying colors, outperforming all but one of the other tested systems in forecasting future values.
“One reason I think this works so well is that the model captures a lot of time series dynamics, but at the end of the day, it is still a simple model,” said Alomar. “When you are working with something simple like this, instead of a neural network that can easily overfit the data, you can actually perform better.”
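The article does not describe how tspDB works internally, so the following Python sketch is not the MIT team's algorithm. It only illustrates the general point in Alomar's quote: a simple linear model that predicts each value from a few past values can capture periodic time-series dynamics. The function names, the made-up temperature-like series, and the choice of four lags are all assumptions for illustration.

```python
import numpy as np

def fit_lagged_linear_model(series, lags):
    """Least-squares weights that predict each value from the `lags` values before it."""
    x = np.asarray(series, dtype=float)
    # Each row holds `lags` consecutive past values (oldest first);
    # the target is the value that immediately follows them.
    design = np.array([x[i:i + lags] for i in range(len(x) - lags)])
    target = x[lags:]
    # rcond discards near-zero singular values so the fit stays stable
    # when the lagged values are linearly dependent.
    weights, *_ = np.linalg.lstsq(design, target, rcond=1e-10)
    return weights

def forecast(series, weights, steps):
    """Roll the model forward `steps` times, feeding each prediction back in."""
    history = list(np.asarray(series, dtype=float))
    lags = len(weights)
    for _ in range(steps):
        history.append(float(np.dot(weights, history[-lags:])))
    return np.array(history[len(series):])

# Hypothetical data (not from the article): a noiseless 24-step cycle,
# loosely resembling hourly temperatures.
t = np.arange(200)
temps = 15.0 + 10.0 * np.sin(2 * np.pi * t / 24)

# Fit on the first 150 points, then forecast the remaining 50.
weights = fit_lagged_linear_model(temps[:150], lags=4)
predicted = forecast(temps[:150], weights, steps=50)
max_error = float(np.max(np.abs(predicted - temps[150:])))
print(f"largest forecast error over 50 steps: {max_error:.2e}")
```

On this idealized series the forecast error is negligible because the series follows an exact linear recurrence. Real data is noisy, so a fit like this would be approximate at best, and a production system would need to handle noise, missing values, and multiple variables.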
IS TSPDB PREDICT THE FUTURE TOOL THE FUTURE?
In its current form, the MIT prediction tool can be installed on top of an existing database, allowing researchers to run a prediction query “with just a few keystrokes in about 0.9 milliseconds, as compared to 0.5 milliseconds for a standard search query.” Along with this unprecedented speed and accuracy, the researchers note that their prediction tool becomes even more accurate when more data is added to the system.
“Even as the time-series data becomes more and more complex, this algorithm can effectively capture any time-series structure out there,” says senior author Devavrat Shah. “It feels like we have found the right lens to look at the model complexity of time-series data.”
The MIT team is celebrating the system’s efficiency and accuracy, but they say it is the system’s ease of use for rank-and-file researchers that is driving their efforts.
“Our interest at the highest level is to make tspDB a success in the form of a broadly utilizable, open-source system,” added Alomar. “Time-series data are very important, and this is a beautiful concept of actually building prediction functionalities directly into the database. It has never been done before, and so we want to make sure the world uses it.”
Follow and connect with author Christopher Plain on Twitter: @plain_fiction