Key Takeaways

  • In real-world business, a statistical forecast is rarely ready to be used immediately.
  • Time series forecasting has advanced rapidly, especially with the emergence of foundation models that show strong zero-shot performance on numerical extrapolation.
  • However, in real-world forecasting settings, a statistically plausible baseline is rarely the final forecast used in practice.
  • Before a forecast becomes decision-ready, it often needs to be revised using weakly structured business context such as holiday effects, campaign plans, external events, historical analogs, and expert feedback.
  • This practical stage remains underexplored in the forecasting literature.
Paper Abstract

Time series forecasting has advanced rapidly, especially with the emergence of foundation models that show strong zero-shot performance on numerical extrapolation. However, in real-world forecasting settings, a statistically plausible baseline is rarely the final forecast used in practice. Before a forecast becomes decision-ready, it often needs to be revised using weakly structured business context such as holiday effects, campaign plans, external events, historical analogs, and expert feedback. This practical stage remains underexplored in the forecasting literature. In this paper, we formulate this stage as the last-mile forecasting problem and present an LLM-agent framework that sits on top of a forecasting backbone. Our system maintains a unified forecast workspace, invokes tools to retrieve contextual evidence, and converts reasoning trajectories into explicit forecast revision actions under structural safety constraints. It also supports long-horizon forecasting through map-reduce-style decomposition and post-hoc reflection through a memory bank. The resulting system is designed to be controllable and auditable. Through real-world case studies, we show how LLM agents can bridge the gap between statistical prediction and business-ready forecasting.

Bridging the Last Mile of Time Series Forecasting with LLM Agents
In real-world business, a statistical forecast is rarely ready to be used immediately. While foundation models are excellent at predicting numerical trends based on historical data, they often lack the "business context"—such as holiday schedules, marketing campaigns, or unexpected external events—that human planners use to adjust predictions. This paper introduces a framework that treats this final stage, called "last-mile forecasting," as a structured, auditable, and agent-driven process. By placing an LLM agent on top of a standard forecasting model, the system allows for context-aware revisions that are transparent and easy to track.

A Unified Workspace for Forecast Revision

The core of this framework is a "forecast workspace." Instead of asking an AI to simply generate a new forecast from scratch, the system maintains a shared state that includes the original historical data, the immutable baseline forecast, and an editable version of the forecast. This separation ensures that the agent cannot accidentally overwrite the baseline or corrupt the historical record. By keeping these elements in one place, the agent can compare its proposed changes against the original statistical prediction, ensuring that every adjustment is intentional and grounded in specific evidence.
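The separation described above can be pictured with a minimal sketch. The paper's actual data structures are not given in this summary, so the class name `ForecastWorkspace`, its fields, and the `diff` helper are all hypothetical; the only idea taken from the text is that history and baseline are read-only while a separate copy is editable.

```python
class ForecastWorkspace:
    """Hypothetical shared state: read-only history and baseline, editable revision."""

    def __init__(self, history, baseline):
        # Tuples cannot be modified in place, and the properties below
        # expose no setter, so neither series can be overwritten.
        self._history = tuple(history)
        self._baseline = tuple(baseline)
        # The working copy starts as a copy of the baseline and is the
        # only part of the workspace meant to be edited.
        self.revised = list(baseline)

    @property
    def history(self):
        return self._history

    @property
    def baseline(self):
        return self._baseline

    def diff(self):
        """Per-step difference between the revised forecast and the baseline."""
        return [r - b for r, b in zip(self.revised, self._baseline)]


ws = ForecastWorkspace(history=[100.0, 110.0, 105.0], baseline=[108.0, 109.0, 111.0])
ws.revised[1] = 130.0  # edit the working copy only
print(ws.baseline)     # (108.0, 109.0, 111.0)
print(ws.diff())       # [0.0, 21.0, 0.0]
```

Because the baseline never changes, the difference between the two series is always available, which is what lets every adjustment be compared against the original statistical prediction.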

Constrained Actions and Evidence-Based Reasoning

To ensure the system remains controllable and reliable, the agent is restricted to a specific set of "revision actions." Rather than outputting free-form text, the agent uses tools to retrieve evidence—such as calendar events or historical analogs—and then applies precise edits, such as adjusting a specific date range or overriding a point in time. Every action taken by the agent is recorded in a revision trace. This creates an audit trail, allowing human planners to see exactly why a change was made, what evidence supported it, and how the final forecast differs from the initial statistical baseline.
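One way to picture a restricted action set with an audit trail is sketched below. The action names (`scale_range`, `override_point`), the `MAX_SCALE` bound, and the trace layout are assumptions made for illustration; the summary does not specify the paper's action vocabulary or its safety constraints.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RevisionAction:
    kind: str      # "scale_range" or "override_point" (hypothetical action names)
    start: int     # first forecast step affected (inclusive)
    end: int       # last forecast step affected (inclusive)
    value: float   # multiplier for scale_range, absolute value for override_point
    evidence: str  # pointer to the evidence that motivated the edit


ALLOWED_KINDS = {"scale_range", "override_point"}
MAX_SCALE = 2.0  # assumed safety bound on multiplicative adjustments


def apply_action(revised: List[float], action: RevisionAction, trace: List[dict]) -> List[float]:
    """Validate one action, apply it to a copy of the forecast, and log it."""
    if action.kind not in ALLOWED_KINDS:
        raise ValueError(f"unknown action kind: {action.kind}")
    if not (0 <= action.start <= action.end < len(revised)):
        raise ValueError("action range falls outside the forecast horizon")
    out = list(revised)
    if action.kind == "scale_range":
        if not (0.0 < action.value <= MAX_SCALE):
            raise ValueError("scale factor violates the safety bound")
        for i in range(action.start, action.end + 1):
            out[i] *= action.value
    else:  # override_point
        if action.start != action.end:
            raise ValueError("override_point must target a single step")
        out[action.start] = action.value
    # Record what changed and why, so a planner can audit the edit later.
    trace.append({
        "action": action,
        "before": revised[action.start:action.end + 1],
        "after": out[action.start:action.end + 1],
    })
    return out


trace: List[dict] = []
forecast = [100.0, 100.0, 100.0, 100.0]
holiday = RevisionAction("scale_range", 1, 2, 1.5, "calendar: public holiday")
forecast = apply_action(forecast, holiday, trace)
print(forecast)              # [100.0, 150.0, 150.0, 100.0]
print(trace[0]["before"])    # [100.0, 100.0]
```

Rejecting anything outside the allowed set before it touches the forecast is the design choice that keeps the agent controllable: an invalid proposal raises an error and leaves both the forecast and the trace untouched.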

Handling Long Horizons and Self-Improvement

For long-term forecasting, the framework uses a "map-reduce" approach. It decomposes a long timeline into smaller, manageable event windows. A local reasoner examines each window individually and proposes specific revisions, which the system then aggregates into the final forecast. Furthermore, the system includes a memory bank for post-hoc reflection. Once actual data becomes available, the system compares its revised forecast to the real-world outcome. It stores these lessons as structured experiences, allowing the agent to improve its future calibration and decision-making without needing to retrain the underlying forecasting model.
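The decomposition and reflection steps can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the window layout, the function names, and the memory record format are hypothetical, windows are assumed not to overlap, actual values are assumed nonzero, and `toy_reasoner` is a lookup-table stand-in for the LLM-based local reasoner.

```python
from typing import Callable, Dict, List, Tuple

Window = Tuple[int, int, str]  # (start, end inclusive, event label); hypothetical layout


def map_reduce_revise(baseline: List[float], windows: List[Window],
                      local_reasoner: Callable[[List[float], str], float]) -> List[float]:
    """Map: get one multiplier per window. Reduce: merge them into one forecast."""
    # Map step: each window is examined independently of the others.
    proposals = [(w, local_reasoner(baseline[w[0]:w[1] + 1], w[2])) for w in windows]
    # Reduce step: apply every proposal onto a copy of the baseline.
    revised = list(baseline)
    for (start, end, _label), factor in proposals:
        for i in range(start, end + 1):
            revised[i] = baseline[i] * factor
    return revised


def reflect(memory: List[Dict], window: Window,
            revised: List[float], actual: List[float]) -> None:
    """Store the mean signed relative error for one window as a structured record."""
    start, end, label = window
    errors = [(revised[i] - actual[i]) / actual[i] for i in range(start, end + 1)]
    memory.append({"event": label, "mean_relative_error": sum(errors) / len(errors)})


def toy_reasoner(values: List[float], label: str) -> float:
    # Stand-in for the local reasoner: a fixed uplift for holidays, none otherwise.
    return {"holiday": 1.25}.get(label, 1.0)


baseline = [100.0] * 6
windows = [(1, 2, "holiday")]
revised = map_reduce_revise(baseline, windows, toy_reasoner)
print(revised)  # [100.0, 125.0, 125.0, 100.0, 100.0, 100.0]

# Once actuals arrive, record how the revision fared (toy numbers).
memory: List[Dict] = []
actual = [100.0, 150.0, 150.0, 100.0, 100.0, 100.0]
reflect(memory, windows[0], revised, actual)
print(memory[0]["event"], memory[0]["mean_relative_error"] < 0)  # holiday True
```

A negative stored error here means the revision undershot the holiday peak. A later run could consult such records when proposing a multiplier for a similar event, which is how calibration can improve without retraining the forecasting backbone.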

Real-World Performance

Case studies on air travel ticket data indicate that this agentic approach outperforms standard statistical models during complex periods such as holidays. Targeted, evidence-backed revisions reduce forecast error during high-impact events while maintaining accuracy across the rest of the forecast horizon. By bridging the gap between raw statistical output and business-ready planning, the system offers a more reliable and transparent tool for operational decision-making.
