A Foundation Model for Zero-Shot Logical Rule Induction
Inductive Logic Programming (ILP) is a method for discovering interpretable, human-readable logical rules from data. Traditionally, these systems are "transductive," meaning they are built for specific tasks and must be retrained from scratch whenever the data or domain changes. This paper introduces the Neural Rule Inducer (NRI), a foundation model designed to perform zero-shot rule induction. By training on massive amounts of synthetic data, NRI learns the underlying process of logical reasoning rather than memorizing specific domain patterns, allowing it to generate logical rules for new, unseen tasks without the need for retraining.
How NRI Works
Instead of identifying variables by their specific names or labels, NRI represents them using domain-agnostic statistical properties, such as class-conditional rates, marginal entropy, and pairwise co-occurrence statistics. By focusing on these statistical signatures, the model can process any set of boolean variables regardless of their identity or count. The architecture consists of a statistical encoder that captures these properties and a parallel slot-based decoder. The decoder generates multiple logical clauses simultaneously, which preserves permutation invariance: the order of the clauses does not change the meaning of the final rule.
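To make the idea concrete, here is a minimal sketch of what a domain-agnostic statistical signature for a single boolean variable might look like. The function name and the exact set of features are illustrative assumptions, not the paper's actual encoder:

```python
import math

def signature(x, y):
    """Illustrative statistical signature for one boolean variable x
    (list of 0/1) given binary labels y (list of 0/1).
    Returns (marginal rate, P(x=1|y=1), P(x=1|y=0), marginal entropy)."""
    n = len(x)
    p1 = sum(x) / n  # marginal rate P(x=1)
    pos = [xi for xi, yi in zip(x, y) if yi == 1]
    neg = [xi for xi, yi in zip(x, y) if yi == 0]
    rate_pos = sum(pos) / len(pos) if pos else 0.0  # class-conditional rate, y=1
    rate_neg = sum(neg) / len(neg) if neg else 0.0  # class-conditional rate, y=0
    # Binary entropy of the variable's marginal distribution
    if p1 in (0.0, 1.0):
        ent = 0.0
    else:
        ent = -(p1 * math.log2(p1) + (1 - p1) * math.log2(1 - p1))
    return (p1, rate_pos, rate_neg, ent)

# Toy data: x perfectly tracks y, so the class-conditional rates separate
print(signature([1, 1, 0, 0], [1, 1, 0, 0]))  # (0.5, 1.0, 0.0, 1.0)
```

Because the signature depends only on counts, not on the variable's name or position, any new variable can be mapped into the same feature space the model was trained on.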
Differentiable Logical Reasoning
A key challenge in training neural models to perform symbolic logic is that logical operations are typically discrete and non-differentiable. NRI overcomes this by using "product t-norm relaxation," a mathematical technique that treats logical conjunctions and disjunctions as continuous values between 0 and 1. This allows the model to be trained end-to-end using standard gradient descent, optimizing for prediction accuracy while simultaneously refining the logical rules. The model also employs a multi-objective loss function to balance accuracy with the need for concise, diverse, and interpretable rules.
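The product t-norm relaxation can be sketched in a few lines: AND becomes a product of truth values in [0, 1], and OR becomes its dual t-conorm. This is a standard fuzzy-logic construction, shown here as a plain-Python illustration rather than the model's actual implementation:

```python
def soft_and(values):
    """Product t-norm: conjunction as a product of truth values in [0, 1]."""
    out = 1.0
    for v in values:
        out *= v
    return out

def soft_or(values):
    """Dual t-conorm: OR(v1, v2, ...) = 1 - prod(1 - vi)."""
    out = 1.0
    for v in values:
        out *= (1.0 - v)
    return 1.0 - out

# At the boolean corners the relaxation agrees with crisp logic...
assert soft_and([1.0, 1.0]) == 1.0 and soft_and([1.0, 0.0]) == 0.0
assert soft_or([0.0, 0.0]) == 0.0 and soft_or([1.0, 0.0]) == 1.0
# ...while intermediate values remain smooth, so gradients can flow
print(soft_and([0.9, 0.8]))  # approximately 0.72
```

Since both operations are compositions of multiplication and subtraction, an entire DNF formula built from them is differentiable end-to-end, which is what allows standard gradient descent to shape the rules.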
Performance and Generalization
The researchers evaluated NRI on its ability to recover known rules, its resilience to noisy labels, and its capacity to handle spurious correlations. Because the model is trained on diverse synthetic boolean formulas, it learns to recognize the general structure of induction. This enables it to perform zero-shot transfer to real-world tabular benchmarks. By outputting rules in Disjunctive Normal Form (DNF), the model provides transparent, interpretable explanations for its predictions, offering a potential path toward building foundation models for symbolic reasoning that are grounded in observed data rather than just abstract language concepts.
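A DNF rule of the kind described above is simply an OR over AND-clauses of possibly negated variables, which is what makes it directly readable. A minimal sketch of evaluating such a rule, with a hypothetical rule and variable names chosen for illustration:

```python
# Hypothetical DNF rule: (a AND NOT b) OR (c).
# Each clause is a list of (variable, required_polarity) pairs.
rule = [[("a", True), ("b", False)], [("c", True)]]

def eval_dnf(rule, assignment):
    """Evaluate a DNF rule on a dict mapping variable names to booleans:
    the rule fires if any clause has all of its literals satisfied."""
    return any(
        all(assignment[var] == polarity for var, polarity in clause)
        for clause in rule
    )

print(eval_dnf(rule, {"a": True, "b": False, "c": False}))  # True
print(eval_dnf(rule, {"a": True, "b": True, "c": False}))   # False
```

The same structure that makes the rule executable also makes it explainable: each clause that fires is itself a human-readable reason for the prediction.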
Considerations
While NRI demonstrates strong capabilities in zero-shot rule induction, it is currently focused on boolean variables. The authors note that the framework could be extended to multi-valued or continuous domains through discretization or fuzzy predicates. Furthermore, while the model is designed to handle varying numbers of variables, its example-conditioned encoder uses a pragmatic approximation for dimension adaptation when encountering tasks with a different number of features than those seen during training. The model is intended to provide interpretable logical hypotheses, making it particularly relevant for high-stakes fields like healthcare and finance where black-box models are often insufficient.