Learning Local Constraints for Reinforcement-Learned Content Generators

Key Takeaways

  • This paper introduces a hybrid approach to procedural game level generation that combines constraint-based generation with reinforcement learning (RL).
  • Reinforcement-learning-trained generators can guarantee global properties, because such properties can easily be included in reward functions, but the results can be visually dissatisfying.
  • The paper explores ways to combine constraint-based generators such as Wave Function Collapse (WFC) with reinforcement-learning-trained generators.
  • The action space of a PCGRL generator is constrained with constraints learned by WFC, allowing the generator to achieve global properties while being forced to adhere to local constraints.
  • To analyze how this hybrid method operates, the authors vary the number and type of inputs, and test whether to randomly collapse the starting state and whether to exclude rare patterns.
Paper Abstract

Constraint-based game content generators that learn local constraints from existing content, such as Wave Function Collapse (WFC), can generate visually satisfying game levels but face challenges in guaranteeing global properties, such as playability. On the other hand, reinforcement-learning trained generators can guarantee global properties -- because such properties can easily be included in reward functions -- but the results can be visually dissatisfying. In this paper, we explore ways to combine these methods. Specifically, we constrain the action space of a PCGRL generator with constraints learned by WFC, effectively allowing the PCGRL generator to achieve global properties while forced to adhere to local constraints. To better analyze how this hybrid content generation method operates, we vary the number and type of inputs, and we test whether to randomly collapse the starting state and exclude rare patterns. While the method is sensitive to hyperparameter tuning, the best of our trained generators produce visually satisfying and playable puzzle-platform game levels -- such as Lode Runner levels -- with desired global properties.

Learning Local Constraints for Reinforcement-Learned Content Generators
This paper introduces a hybrid approach to procedural game level generation that combines the visual consistency of constraint-based systems with the functional reliability of reinforcement learning (RL). While traditional constraint-based methods like Wave Function Collapse (WFC) excel at creating visually coherent levels, they often struggle to guarantee that a level is actually playable. Conversely, RL-based generators can be trained to ensure playability but often produce levels that look messy or aesthetically broken. By using WFC to define local rules that constrain the RL agent’s choices, this research creates a framework that generates levels that are both visually satisfying and functionally playable.

Combining Constraints and Learning

The core of this framework is to use WFC to learn the "grammar" of a game level: the local patterns and adjacency rules that determine how tiles must be placed for the result to resemble a human-designed level. The RL agent then builds the level step by step. Instead of letting the agent place any tile anywhere, the system restricts its action space to moves that are valid under the WFC rules. Actions that would violate these local constraints are masked out, so the agent cannot make an invalid move.
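The masking idea can be illustrated with a minimal Python sketch. It is not the paper's implementation: it assumes simple pairwise adjacency rules between 4-connected neighbours (WFC variants often use larger overlapping patterns), and the tile codes ".", "G", and "B" as well as the function names `learn_adjacency` and `action_mask` are hypothetical.

```python
# Offsets to the four neighbours: (row delta, column delta).
DIRECTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def learn_adjacency(level):
    """Collect, for each (tile, direction), the neighbours seen in an example level."""
    allowed = {}
    rows, cols = len(level), len(level[0])
    for r in range(rows):
        for c in range(cols):
            for name, (dr, dc) in DIRECTIONS.items():
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    allowed.setdefault((level[r][c], name), set()).add(level[nr][nc])
    return allowed


def action_mask(grid, r, c, tiles, allowed):
    """Return one boolean per tile: True if placing it at (r, c) breaks no learned rule.

    Cells holding None are still undecided and impose no constraint.
    """
    rows, cols = len(grid), len(grid[0])
    mask = []
    for tile in tiles:
        ok = True
        for name, (dr, dc) in DIRECTIONS.items():
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] is not None:
                if grid[nr][nc] not in allowed.get((tile, name), set()):
                    ok = False
                    break
        mask.append(ok)
    return mask


# Hypothetical tile codes: "." empty, "G" gold, "B" brick.
example = ["...",
           ".G.",
           "BBB"]
allowed = learn_adjacency(example)
grid = [[None, None, None],
        [None, None, None],
        ["B", "B", "B"]]
print(action_mask(grid, 1, 1, [".", "G", "B"], allowed))  # [True, True, False]
```

In this toy example the example level never shows a brick directly above another brick, so placing "B" above the brick floor is masked. A mask in which every entry is False corresponds to a contradiction: no tile can be placed at that cell.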

Ensuring Playability

To ensure the generated levels are functional, the RL agent is guided by a reward function. This function uses a simplified game-playing agent to test whether the gold pieces in a generated Lode Runner level are reachable from the player's starting position. If the agent’s choices improve the connectivity of the level, it receives a positive reward. If they lead to a "contradiction" (a state where no valid tile can be placed according to the WFC rules), the agent receives a significant penalty. This forces the agent to balance the aesthetic requirements of the WFC constraints against the functional requirement that the gold be reachable.
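The shape of such a reward can be sketched as follows. This is only an illustration under strong assumptions: reachability is approximated by a breadth-first search over 4-connected non-solid tiles, which ignores Lode Runner's actual movement rules (gravity, ladders, ropes, digging); the tile codes, the penalty value, and the names `reachable_gold` and `step_reward` are hypothetical and not taken from the paper.

```python
from collections import deque

SOLID = {"B"}                  # assumed tile code for brick
CONTRADICTION_PENALTY = -10.0  # illustrative value only


def reachable_gold(level, start):
    """Count gold tiles ("G") reachable from start via 4-connected non-solid tiles."""
    rows, cols = len(level), len(level[0])
    seen = {start}
    queue = deque([start])
    found = 0
    while queue:
        r, c = queue.popleft()
        if level[r][c] == "G":
            found += 1
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in seen
                    and level[nr][nc] not in SOLID):
                seen.add((nr, nc))
                queue.append((nr, nc))
    return found


def step_reward(prev_level, new_level, start, contradiction):
    """Reward the change in reachable gold; penalize a contradiction heavily."""
    if contradiction:
        return CONTRADICTION_PENALTY
    return float(reachable_gold(new_level, start) - reachable_gold(prev_level, start))


before = ["..BG",
          "BBBB"]
after = ["...G",
         "BBBB"]
print(step_reward(before, after, (0, 0), contradiction=False))  # 1.0
print(step_reward(before, after, (0, 0), contradiction=True))   # -10.0
```

Rewarding the difference in reachable gold between consecutive states, rather than the absolute count, means the agent is paid only for changes that improve connectivity, which matches the description above.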

Experimental Insights

The researchers tested their framework by varying several factors, including the number of input levels, the diversity of those inputs, and the starting conditions of the generation process. They found that the system is sensitive to hyperparameter tuning, such as whether to include or exclude rare tile patterns found in the training data. By experimenting with these variables, the team successfully generated playable, visually consistent levels for the puzzle-platformer Lode Runner. The results demonstrate that constraining an RL agent with learned local rules is an effective way to bridge the gap between aesthetic quality and functional design in procedural content generation.
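One way to picture the "exclude rare patterns" setting is as a frequency filter over the patterns extracted from the input levels. The sketch below assumes 2x2 tile windows and a simple minimum-count threshold; both choices, along with the names `extract_patterns` and `drop_rare`, are assumptions for illustration rather than details from the paper.

```python
from collections import Counter


def extract_patterns(levels, size=2):
    """Count every size-by-size tile window across all input levels."""
    counts = Counter()
    for level in levels:
        rows, cols = len(level), len(level[0])
        for r in range(rows - size + 1):
            for c in range(cols - size + 1):
                # Each window is a tuple of row slices, e.g. ("..", "BB").
                window = tuple(level[r + i][c:c + size] for i in range(size))
                counts[window] += 1
    return counts


def drop_rare(counts, min_count):
    """Keep only patterns seen at least min_count times."""
    return {p: n for p, n in counts.items() if n >= min_count}


level_a = ["...",
           "...",
           "BBB"]
level_b = ["...",
           ".G.",
           "BBB"]
counts = extract_patterns([level_a, level_b])
common = drop_rare(counts, min_count=2)
print(len(counts), len(common))  # 6 2
```

In this toy case every pattern containing gold occurs only once and is dropped. That hints at the trade-off behind the sensitivity noted above: a stricter threshold gives cleaner local structure but can remove patterns the generator needs, while a looser one admits unusual combinations.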
