Time series modeling is vital across many fields, including demand planning, anomaly detection, and weather forecasting, but it faces challenges such as high dimensionality, non-linearity, and distribution shifts. While traditional methods rely on task-specific neural network designs, there is potential for adapting foundational small-scale pretrained language models (SLMs) to universal time series applications. However, SLMs are trained primarily on text and may struggle with continuous time series data and patterns such as seasonality. Recent approaches, such as Retrieval-Augmented Generation (RAG), enhance models with external knowledge, offering new possibilities for improving time series analysis and complex goal-oriented tasks.
Researchers from IIT Dharwad and TCS Research propose an agentic RAG framework for time series analysis built on a hierarchical, multi-agent architecture. A master agent orchestrates specialized sub-agents, each fine-tuned with SLMs for a specific time series task such as forecasting or anomaly detection. These sub-agents retrieve relevant prompts from specialized knowledge repositories, or prompt pools, that store historical patterns, enabling better predictions on new data. This modular design improves flexibility and accuracy, allowing the framework to outperform traditional task-specific methods across a range of complex time series tasks.
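To make the orchestration concrete, here is a minimal, illustrative sketch of the master-agent/sub-agent routing pattern described above. The class names, task labels, and stubbed sub-agent behavior are assumptions for exposition, not the authors' implementation.

```python
# Minimal sketch of hierarchical routing: a master agent inspects the task
# request and delegates it to a task-specific sub-agent. All names here are
# illustrative, not from the paper's codebase.
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TimeSeriesTask:
    kind: str            # e.g. "forecasting", "anomaly_detection", "imputation"
    series: List[float]  # the raw input window


class SubAgent:
    """A sub-agent wraps one fine-tuned SLM specialized for a single task."""

    def __init__(self, name: str):
        self.name = name

    def run(self, task: TimeSeriesTask) -> str:
        # In the real system this would retrieve prompts from the agent's
        # prompt pool and query its fine-tuned SLM; here it is stubbed out.
        return f"[{self.name}] processed {len(task.series)} points"


class MasterAgent:
    """Routes each incoming task to the matching specialized sub-agent."""

    def __init__(self, sub_agents: Dict[str, SubAgent]):
        self.sub_agents = sub_agents

    def dispatch(self, task: TimeSeriesTask) -> str:
        if task.kind not in self.sub_agents:
            raise ValueError(f"No sub-agent registered for task '{task.kind}'")
        return self.sub_agents[task.kind].run(task)


master = MasterAgent({
    "forecasting": SubAgent("forecaster"),
    "anomaly_detection": SubAgent("anomaly_detector"),
    "imputation": SubAgent("imputer"),
})
print(master.dispatch(TimeSeriesTask("forecasting", [0.1, 0.4, 0.3, 0.8])))
```

The design choice worth noting is that the master agent holds no task-specific logic; adding a new capability only requires registering another sub-agent, which is what gives the architecture its modularity.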
The proposed method introduces a framework for time series analysis, utilizing a hierarchical, multi-agent architecture where a master agent coordinates specialized sub-agents focused on tasks like forecasting, anomaly detection, and imputation. These sub-agents leverage pre-trained language models and employ a dynamic prompting mechanism to retrieve relevant prompts from an internal knowledge base. This mechanism allows the model to adapt to various trends within complex time series data by accessing historical patterns stored as key-value pairs in a shared prompt pool. The dynamic prompting approach overcomes the limitations of traditional fixed-window methods by enabling the model to adjust to different trends and patterns, enhancing the accuracy of predictions across diverse time series tasks.
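The prompt pool can be pictured as a key-value lookup: keys summarize historical patterns, values hold reusable prompt text, and retrieval is done by similarity between the current window's embedding and the stored keys. The toy encoder, pool entries, and cosine-similarity scoring below are illustrative assumptions rather than the paper's learned components.

```python
# Illustrative prompt-pool retrieval: keys are (here, fixed) pattern embeddings,
# values are prompt snippets describing the pattern. The encoder and similarity
# metric are simplifications for the sake of the example.
import numpy as np

prompt_pool = {
    # name: (key embedding, prompt text describing that historical pattern)
    "rush_hour_peak": (np.array([0.9, 0.1, 0.0]), "Traffic rises sharply toward a daily peak."),
    "weekend_lull":   (np.array([0.1, 0.8, 0.1]), "Demand stays flat with low variance."),
    "sensor_dropout": (np.array([0.0, 0.1, 0.9]), "Values drop to zero abruptly, then recover."),
}


def encode_window(window: np.ndarray) -> np.ndarray:
    """Toy encoder: summarize a window as (mean, std, zero-fraction), normalized."""
    feats = np.array([window.mean(), window.std(), (window == 0).mean()])
    norm = np.linalg.norm(feats)
    return feats / norm if norm > 0 else feats


def retrieve_prompts(window: np.ndarray, top_k: int = 2):
    """Return the top-k prompts whose keys are most similar to the query embedding."""
    query = encode_window(window)
    scored = []
    for name, (key, prompt) in prompt_pool.items():
        key_n = key / np.linalg.norm(key)
        scored.append((float(query @ key_n), name, prompt))
    scored.sort(reverse=True)
    return scored[:top_k]


window = np.array([12.0, 14.5, 30.2, 55.8, 60.1, 58.7])
for score, name, prompt in retrieve_prompts(window):
    print(f"{name} (cos={score:.2f}): {prompt}")
```

Because retrieval is driven by the query window itself rather than a fixed template, the same mechanism adapts to whatever trend or regime the incoming data happens to exhibit, which is the point of the dynamic prompting described above.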
Additionally, the framework builds upon recent advancements in SLMs by incorporating a two-tiered attention mechanism to handle long-range dependencies in time series data, improving the processing of long sequences without additional fine-tuning. At the same time, it leverages instruction tuning and parameter-efficient fine-tuning (PEFT) to enhance SLM performance on specific time series tasks, extending the context length of the SLMs to 32K tokens so they can capture complex spatio-temporal dependencies. Furthermore, the framework uses Direct Preference Optimization (DPO) to fine-tune the SLMs so that they favor more accurate, task-specific outputs, ultimately enhancing the effectiveness of time series analysis.
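As a hedged sketch of the fine-tuning ingredients named above, the snippet below wraps a small causal LM with LoRA adapters via the Hugging Face peft library (LoRA is one common PEFT choice; the paper's exact adapter configuration is not reproduced here) and implements the DPO objective directly from per-sequence log-probabilities. The stand-in checkpoint, adapter hyperparameters, and dummy log-probability values are assumptions for illustration.

```python
# Sketch of (1) parameter-efficient fine-tuning with LoRA adapters and
# (2) the DPO loss computed from policy/reference log-probabilities.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Stand-in, non-gated checkpoint; the paper uses SLMs such as Gemma and Llama 3.
base = AutoModelForCausalLM.from_pretrained("distilgpt2")
lora_cfg = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["c_attn"],   # GPT-2-style attention projection; Gemma/Llama
                                 # would target modules like q_proj / v_proj
    fan_in_fan_out=True,         # required for GPT-2's Conv1D weight layout
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_cfg)   # only the adapter weights are trainable
model.print_trainable_parameters()


def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Direct Preference Optimization: push the policy to prefer the chosen
    (more accurate) output over the rejected one, relative to a frozen
    reference model."""
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()


# Toy check with dummy per-sequence log-probabilities.
pol_c = torch.tensor([-12.0, -9.5]); pol_r = torch.tensor([-14.0, -11.0])
ref_c = torch.tensor([-12.5, -10.0]); ref_r = torch.tensor([-13.5, -10.5])
print(dpo_loss(pol_c, pol_r, ref_c, ref_r))
```

In practice the preference pairs would come from task-specific outputs ranked by accuracy, so that DPO steers each sub-agent's SLM toward the better prediction without a separate reward model.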
The proposed Agentic RAG framework was evaluated on forecasting, classification, anomaly detection, and imputation tasks, using SLM variants such as SelfExtend-Gemma-2B-instruct, Gemma-7B-instruct, and Llama 3-8B-instruct. Experiments covered real-world traffic datasets (e.g., PeMS, METR-LA) and multivariate anomaly detection datasets (e.g., SWaT, NASA telemetry), with evaluation metrics including MAE, RMSE, accuracy, precision, and F1-score. The framework consistently outperformed baselines in forecasting, especially on the METR-LA and PEMS-BAY datasets, demonstrating superior predictive accuracy and robustness across all metrics.
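For reference, the reported metrics can be computed as in the short sketch below using NumPy and scikit-learn on toy arrays; dataset loading and model inference are omitted, and the numbers are placeholders rather than results from the paper.

```python
# Toy computation of the evaluation metrics named above.
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, f1_score

# Forecasting-style metrics (MAE, RMSE) on continuous predictions.
y_true = np.array([52.1, 48.7, 55.3, 60.2])
y_pred = np.array([50.0, 49.5, 57.1, 58.8])
mae = np.mean(np.abs(y_true - y_pred))
rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))

# Classification / anomaly-detection metrics on binary labels.
labels = np.array([0, 0, 1, 1, 0, 1])
preds  = np.array([0, 1, 1, 1, 0, 0])
acc = accuracy_score(labels, preds)
prec = precision_score(labels, preds)
f1 = f1_score(labels, preds)

print(f"MAE={mae:.3f}  RMSE={rmse:.3f}  Acc={acc:.2f}  Prec={prec:.2f}  F1={f1:.2f}")
```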
In conclusion, the Agentic RAG framework proposed for time series analysis addresses challenges such as distribution shifts and fixed-length subsequences. It employs a hierarchical, multi-agent architecture with specialized sub-agents for different tasks, each using prompt pools as knowledge bases to retrieve relevant information and improve predictions on new data. This modular design allows the framework to outperform traditional methods on complex time series tasks, and the use of SLMs within it provides flexibility while achieving state-of-the-art performance across major time series benchmarks.
Check out the Paper. All credit for this research goes to the researchers of this project.
Sana Hassan, a consulting intern at Marktechpost and dual-degree student at IIT Madras, is passionate about applying technology and AI to address real-world challenges. With a keen interest in solving practical problems, he brings a fresh perspective to the intersection of AI and real-life solutions.