Key Takeaways

  • Human-labeled data are widely used as reference annotations in ML, despite known variability across annotators in many expert-driven domains.
  • Expert annotation is slow and inconsistent, and it remains a major bottleneck for scaling tasks like tree height bias classification in forestry remote sensing.
  • We formalize a Decoupled Declarative Decision (D3) Framework that enables zero-modification generalization across diverse expert-defined decision structures.
  • On a tree bias classification testbed, our framework outperforms supervised ML baselines and reduces the amount of expert labeling effort required.

Paper Abstract

Human-labeled data are widely used as reference annotations in ML, despite known variability across annotators in many expert-driven domains. In addition, expert annotation is slow, inconsistent, and remains a major bottleneck for scaling tasks like tree height bias classification in forestry remote sensing. We propose a multi-agent system (MAS) that orchestrates expert decision trees with Vision-Language Models (VLMs), treating the decision tree as a structural prior while VLMs perform localized semantic perception at individual nodes, with multi-agent voting to mitigate VLM stochasticity. We formalize a Decoupled Declarative Decision (D3) Framework that enables zero-modification generalization across diverse expert-defined decision structures. On a tree bias classification testbed, our framework outperforms supervised ML baselines and reduces the amount of expert labeling effort required. These results suggest that agentic orchestration of VLMs with expert priors can reproduce expert-defined labeling procedures at substantially lower annotation cost while maintaining interpretability.

TreeAgent: A Generalizable Multi-Agent Framework for Automated Bias Labeling in Forestry via Compiled Expert Rules and Vision-Language Models

Identifying tree height biases is critical for accurate carbon accounting and climate policy, but forestry experts struggle to scale the task. Currently, experts must manually inspect complex data (such as field measurements, lidar point clouds, and canopy height models) to classify these biases, a process that is slow and inconsistent. TreeAgent addresses this bottleneck with a multi-agent system that combines structured expert rules with the perceptual capabilities of Vision-Language Models (VLMs) to automate the labeling process.

A New Framework for Expert Logic

The core of the system is the Decoupled Declarative Decision (D3) framework. Instead of forcing an AI to learn expert rules from scratch, D3 separates the "what" (the expert's decision logic) from the "how" (the machinery that executes it). Experts write their diagnostic rules in natural language, and those rules are then compiled into a structured, executable decision graph. Every decision the system makes is therefore traceable to a specific rule, which preserves the interpretability required in scientific workflows. Because the logic is decoupled from the execution, experts can update their rules as configuration changes rather than rewriting the underlying code.
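To make the decoupling concrete, here is a minimal sketch of rules stored as plain data and walked by a small generic interpreter. It is an illustration under stated assumptions, not the paper's implementation: the node names, field names, thresholds, labels, and the `run_rules` helper are all hypothetical.

```python
import operator
from typing import Any, Dict, List, Tuple

# Hypothetical comparison primitives; the actual D3 inventory is not given here.
OPS = {"gt": operator.gt, "ge": operator.ge, "lt": operator.lt, "le": operator.le}

# Hypothetical compiled rules. Node names, fields, thresholds and labels are
# illustrative only. A target starting with "label:" is a terminal decision.
RULES: Dict[str, Dict[str, Any]] = {
    "height_gap": {
        "field": "height_diff_m", "op": "gt", "value": 2.0,
        "yes": "crown_overlap", "no": "label:no_bias",
    },
    "crown_overlap": {
        "field": "overlap_fraction", "op": "gt", "value": 0.5,
        "yes": "label:occluded_crown", "no": "label:height_bias",
    },
}


def run_rules(
    rules: Dict[str, Dict[str, Any]], record: Dict[str, float], start: str
) -> Tuple[str, List[Tuple[str, bool]]]:
    """Walk the graph from `start`; return the label and the path taken."""
    node_id = start
    trace: List[Tuple[str, bool]] = []
    while not node_id.startswith("label:"):
        node = rules[node_id]
        outcome = OPS[node["op"]](record[node["field"]], node["value"])
        trace.append((node_id, outcome))  # record which rule fired and how
        node_id = node["yes"] if outcome else node["no"]
    return node_id[len("label:"):], trace


label, trace = run_rules(
    RULES, {"height_diff_m": 3.1, "overlap_fraction": 0.2}, "height_gap"
)
print(label, trace)
```

In this sketch, changing a threshold or adding a node means editing `RULES` only; `run_rules` stays untouched, and the returned trace lists every rule that contributed to the label.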

How the Multi-Agent System Works

TreeAgent functions as a multi-agent system that navigates the compiled decision graph. It uses two types of nodes: deterministic nodes, which perform standard arithmetic calculations on numerical data, and VLM nodes, which use vision-language models to interpret complex visual data like canopy maps and cross-section transects. To mitigate the stochasticity of VLM outputs, the system uses a majority-vote mechanism: it takes multiple independent samples for each visual judgment and accepts the consensus answer.
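The voting step can be sketched as follows. This is a hedged illustration rather than TreeAgent's actual code: `majority_vote` and `make_noisy_judge` are hypothetical names, and the noisy judge is a stand-in for a real VLM call, which is not shown. The sketch assumes binary answers and requires an odd sample count so that ties cannot occur.

```python
import random
from collections import Counter
from typing import Callable, Tuple


def majority_vote(ask: Callable[[], bool], n_samples: int = 5) -> Tuple[bool, float]:
    """Call `ask` n_samples times; return the majority answer and its vote share."""
    if n_samples < 1 or n_samples % 2 == 0:
        raise ValueError("n_samples must be a positive odd number to avoid ties")
    votes = Counter(ask() for _ in range(n_samples))
    winner, count = votes.most_common(1)[0]
    return winner, count / n_samples


def make_noisy_judge(truth: bool, flip_prob: float, rng: random.Random) -> Callable[[], bool]:
    """Stand-in for a VLM node: returns `truth`, flipped with probability flip_prob."""
    def ask() -> bool:
        return (not truth) if rng.random() < flip_prob else truth
    return ask


rng = random.Random(0)  # seeded so the simulation is repeatable
ask = make_noisy_judge(truth=True, flip_prob=0.2, rng=rng)
answer, agreement = majority_vote(ask, n_samples=7)
print(answer, agreement)
```

Returning the vote share alongside the answer is a design choice of this sketch: a low share could flag a node judgment for human review.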

Performance and Scalability

In testing, TreeAgent outperformed traditional machine learning approaches. On a testbed of expert-labeled trees from diverse ecosystems, the framework achieved a 67.6% Macro-F1 score, well ahead of tuned tabular machine learning baselines, which reached only 36.2%. The system is also fast, requiring only about 0.040 minutes (roughly 2.4 seconds) per tree compared to the 3–5 minutes required for human experts. This suggests that agentic orchestration can replicate expert-defined labeling procedures at a fraction of the cost and time.
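For readers unfamiliar with the metric, Macro-F1 averages the per-class F1 scores with equal weight, so rare classes count as much as common ones. The sketch below is a plain-Python illustration with made-up labels, not the paper's evaluation code; it assumes the class set is the union of true and predicted labels.

```python
from typing import Hashable, Sequence


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 over all classes seen in either list."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred), key=str)
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Illustrative labels only: a classifier that always predicts the common class.
y_true = ["ok", "ok", "ok", "ok", "biased"]
y_pred = ["ok", "ok", "ok", "ok", "ok"]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy={accuracy:.3f} macro_f1={macro_f1(y_true, y_pred):.3f}")
# prints: accuracy=0.800 macro_f1=0.444
```

The toy case shows why the metric suits imbalanced labeling tasks: ignoring the rare class still yields 80% accuracy but a Macro-F1 of only about 0.44.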

Key Considerations

The D3 framework is designed to be generalizable, meaning it can be applied to other scientific labeling tasks where expert reasoning is structured but requires occasional visual perception. By using a fixed inventory of logic primitives, the system remains robust and verifiable. While the framework excels at automating tasks that rely on established expert diagnostic rules, its success depends on the ability to decompose those rules into a clear, binary decision process.
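One way a fixed primitive inventory supports verifiability is that a compiled graph can be checked before it ever runs. The following is a minimal sketch under assumptions: the primitive names in `ALLOWED_KINDS`, the node schema, and `validate_graph` are hypothetical and not taken from the paper.

```python
from typing import Any, Dict, List

# Hypothetical inventory of decision primitives; the real list is not given here.
ALLOWED_KINDS = {"threshold", "vlm_question"}


def validate_graph(graph: Dict[str, Dict[str, Any]], root: str) -> List[str]:
    """Return a list of problems; an empty list means the graph passed the checks."""
    errors: List[str] = []
    if root not in graph:
        errors.append(f"root node {root!r} is missing")
    for node_id, node in graph.items():
        kind = node.get("kind")
        if kind == "leaf":
            if "label" not in node:
                errors.append(f"{node_id}: leaf has no label")
            continue
        if kind not in ALLOWED_KINDS:
            errors.append(f"{node_id}: unknown primitive {kind!r}")
            continue
        # Every decision node must be binary, with both branches resolvable.
        for branch in ("yes", "no"):
            target = node.get(branch)
            if target not in graph:
                errors.append(f"{node_id}: branch {branch!r} points to missing node {target!r}")
    return errors


graph = {
    "start": {"kind": "threshold", "yes": "look", "no": "fine"},
    "look": {"kind": "vlm_question", "yes": "flag", "no": "fine"},
    "fine": {"kind": "leaf", "label": "no_bias"},
    "flag": {"kind": "leaf", "label": "biased"},
}
print(validate_graph(graph, "start"))  # prints: []
```

Checks like these catch a mistyped rule at compile time rather than midway through a labeling run. A fuller validator would also reject cycles and unreachable nodes, which this sketch omits.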
