Identifying one faulty turbine in a wind farm, which can involve looking at hundreds of signals and millions of data points, is akin to finding a needle in a haystack.
Engineers often streamline this complex problem using deep-learning models that can detect anomalies in measurements taken repeatedly over time by each turbine, known as time-series data.
But with hundreds of wind turbines recording dozens of signals each hour, training a deep-learning model to analyze time-series data is expensive and cumbersome. This is compounded by the fact that the model may need to be retrained after deployment, and wind farm operators may lack the necessary machine-learning expertise.
In a new study, MIT researchers found that large language models (LLMs) hold the potential to be more efficient anomaly detectors for time-series data. Importantly, these pretrained models can be deployed right out of the box.
The researchers developed a framework, called SigLLM, which includes a component that converts time-series data into text-based inputs an LLM can process. A user can feed these prepared data to the model and ask it to start identifying anomalies. The LLM can also be used to forecast future time-series data points as part of an anomaly detection pipeline.
While LLMs couldn’t beat state-of-the-art deep learning models at anomaly detection, they did perform as well as some other AI approaches. If researchers can improve the performance of LLMs, this framework could help technicians flag potential problems in equipment like heavy machinery or satellites before they occur, without the need to train an expensive deep-learning model.
“Since this is just the first iteration, we didn’t expect to get there from the first go, but these results show that there’s an opportunity here to leverage LLMs for complex anomaly detection tasks,” says Sarah Alnegheimish, an electrical engineering and computer science (EECS) graduate student and lead author of a paper on SigLLM.
Her co-authors include Linh Nguyen, an EECS graduate student; Laure Berti-Equille, a research director at the French National Research Institute for Sustainable Development; and senior author Kalyan Veeramachaneni, a principal research scientist in the Laboratory for Information and Decision Systems. The research will be presented at the IEEE Conference on Data Science and Advanced Analytics.
An off-the-shelf solution
Large language models are autoregressive, which means they can understand that the newest values in sequential data depend on previous values. For instance, models like GPT-4 can predict the next word in a sentence using the words that precede it.
Since time-series data are sequential, the researchers thought the autoregressive nature of LLMs might make them well-suited for detecting anomalies in this type of data.
However, they wanted to develop a technique that avoids fine-tuning, a process in which engineers retrain a general-purpose LLM on a small amount of task-specific data to make it an expert at one task. Instead, the researchers deploy an LLM off the shelf, with no additional training steps.
But before they could deploy it, they had to convert time-series data into text-based inputs the language model could handle.
They achieved this through a sequence of transformations that capture the most important parts of the time series while representing data with the fewest number of tokens. Tokens are the basic inputs for an LLM, and more tokens require more computation.
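The article doesn’t spell out the exact transformation steps, but the idea can be sketched in Python as a minimal, illustrative example: shift the series so it is non-negative, round to a fixed precision, and drop the decimal point so each reading becomes a short integer string. The function name and the specific steps here are assumptions, not SigLLM’s actual pipeline.

```python
def series_to_text(values, decimals=2):
    """Convert a numeric time series into a compact, comma-separated
    string of integers that an LLM can tokenize.

    Each step reduces the token count while preserving the shape of
    the signal: shift to non-negative, round to fixed precision, then
    remove the decimal point by scaling to integers.
    """
    low = min(values)
    shifted = [v - low for v in values]                   # make non-negative
    scaled = [round(v * 10**decimals) for v in shifted]   # drop the decimal point
    return ",".join(str(v) for v in scaled)

print(series_to_text([1.013, 1.025, 0.998, 5.120]))
```

Representing each value with fewer digits matters because an LLM’s cost and context limit both grow with the number of tokens it must ingest.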
“If you don’t handle these steps very carefully, you might end up chopping off some part of your data that does matter, losing that information,” Alnegheimish says.
Once they had figured out how to transform time-series data, the researchers developed two anomaly detection approaches.
Approaches for anomaly detection
For the first, which they call Prompter, they feed the prepared data into the model and prompt it to locate anomalous values.
“We had to iterate a number of times to figure out the right prompts for one specific time series. It is not easy to understand how these LLMs ingest and process the data,” Alnegheimish adds.
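As a rough illustration of the Prompter idea, a query might wrap the textual series in instructions like the following. This wording is hypothetical; the paper’s actual prompts differed and, as Alnegheimish notes, took many iterations to tune.

```python
def build_prompt(series_text):
    """Wrap a text-encoded time series in a hypothetical anomaly-finding
    instruction of the kind a Prompter-style approach might use."""
    return (
        "Below is a time series of sensor readings, one value per step:\n"
        f"{series_text}\n"
        "List the zero-based indices of any anomalous values, "
        "or reply 'none' if all values look normal."
    )

print(build_prompt("2,3,0,412"))
```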
For the second approach, called Detector, they use the LLM as a forecaster to predict the next value from a time series. The researchers compare the predicted value to the actual value. A large discrepancy suggests that the actual value is likely an anomaly.
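The Detector idea (forecast, compare, flag large discrepancies) can be sketched as follows. The z-score rule and threshold below are illustrative assumptions, not the paper’s exact scoring method; the point is that the LLM only supplies the forecasts, and a simple statistical rule turns forecast errors into anomaly flags.

```python
import statistics

def detect_anomalies(actual, predicted, threshold=2.5):
    """Flag indices where the forecaster's error is unusually large.

    Computes absolute residuals between the actual series and the
    one-step-ahead forecasts, then flags any point whose residual
    sits more than `threshold` standard deviations above the mean
    residual.
    """
    residuals = [abs(a - p) for a, p in zip(actual, predicted)]
    mean = statistics.mean(residuals)
    std = statistics.pstdev(residuals)
    return [i for i, r in enumerate(residuals)
            if std > 0 and (r - mean) / std > threshold]

# A steady signal with one spike that the forecaster did not predict.
actual    = [10.0, 10.2, 9.9, 25.0, 10.1, 10.0, 9.8, 10.3, 10.1, 9.9]
predicted = [10.1, 10.1, 10.0, 10.1, 10.0, 10.1, 9.9, 10.2, 10.0, 10.0]
print(detect_anomalies(actual, predicted))  # → [3]
```

In a real pipeline the `predicted` values would come from the LLM forecasting each next value from the text-encoded history, rather than being hand-written as they are here.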
With Detector, the LLM would be part of an anomaly detection pipeline, while Prompter would complete the task on its own. In practice, Detector performed better than Prompter, which generated many false positives.
“I think, with the Prompter approach, we were asking the LLM to jump through too many hoops. We were giving it a harder problem to solve,” says Veeramachaneni.
When they compared both approaches to current techniques, Detector outperformed transformer-based AI models on seven of the 11 datasets they evaluated, even though the LLM required no training or fine-tuning.
In the future, an LLM may also be able to provide plain-language explanations with its predictions, so an operator could be better able to understand why an LLM identified a certain data point as anomalous.
However, state-of-the-art deep learning models outperformed LLMs by a wide margin, showing that there is still work to do before an LLM could be used for anomaly detection.
“What will it take to get to the point where it’s doing as well as these state-of-the-art models? That is the million-dollar question staring at us right now. An LLM-based anomaly detector has to be a game-changer for us to justify this sort of effort,” Veeramachaneni says.
Moving forward, the researchers want to see whether fine-tuning can improve performance, though that would require additional time, cost, and expertise for training.
Their LLM approaches also take between 30 minutes and two hours to produce results, so increasing the speed is a key area of future work. The researchers also want to probe LLMs to understand how they perform anomaly detection, in the hopes of finding a way to boost their performance.
“When it comes to complex tasks like anomaly detection in time series, LLMs really are a contender. Maybe other complex tasks can be addressed with LLMs, as well?” says Alnegheimish.
This research was supported by SES S.A., Iberdrola and ScottishPower Renewables, and Hyundai Motor Company.