Getting Started

Let's walk through how to pit GPT-4o-mini against Claude-3.5-haiku in a text-based game, explaining each component along the way.

Installation

Install TextArena directly from PyPI:

Bash
pip install textarena

Step 1: Initialize Agents

We provide several ready-made agent classes for publicly available LLMs. The OpenRouterAgent wrapper handles the API communication and response formatting:

Python
import textarena as ta

agents = {
    0: ta.agents.OpenRouterAgent(model_name="GPT-4o-mini"),
    1: ta.agents.OpenRouterAgent(model_name="anthropic/claude-3.5-haiku"),
}

The dictionary keys (0 and 1) are the player IDs that the environment uses to track turns. An agent is called whenever it is its player ID's turn.
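Since the game loop in Step 4 invokes each agent as agents[player_id](observation), any callable that takes an observation string and returns an action string should be able to fill a slot in this dictionary. The sketch below illustrates that idea with a scripted agent for offline testing. The class name ScriptedAgent, its fallback parameter, and the example action strings are assumptions made for illustration and are not part of TextArena.

```python
from typing import Iterable


class ScriptedAgent:
    """Hypothetical offline agent that replays a fixed list of actions."""

    def __init__(self, actions: Iterable[str], fallback: str = "pass"):
        self._actions = list(actions)
        self._fallback = fallback  # returned once the script runs out
        self._turn = 0

    def __call__(self, observation: str) -> str:
        # The observation is ignored here; an LLM-backed agent would
        # send it to a model and return the model's reply instead.
        if self._turn < len(self._actions):
            action = self._actions[self._turn]
        else:
            action = self._fallback
        self._turn += 1
        return action


agent = ScriptedAgent(["[rock]", "[paper]"])
first = agent("Choose your move.")
second = agent("Choose again.")
third = agent("One more round.")
```

A scripted agent like this can be handy for checking the rest of the pipeline without spending API credits.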

Step 2: Create Environment

As in OpenAI Gym, we use make() to create the environment:

Python
env = ta.make(env_id="BalancedSubset-v0")

The BalancedSubset environment randomly selects one game from its collection each time it's initialized. This encourages the development of generalist agents that can handle various game types rather than specializing in a single game.
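Conceptually, the selection described above amounts to a random draw made when the environment is constructed. The following is only a sketch of that idea. The game IDs in HYPOTHETICAL_GAMES and the pick_game helper are placeholders invented for illustration, and they do not reflect the actual contents or implementation of BalancedSubset-v0.

```python
import random
from typing import Optional, Sequence

# Placeholder IDs, not the real contents of the subset.
HYPOTHETICAL_GAMES = ("GameA-v0", "GameB-v0", "GameC-v0")


def pick_game(games: Sequence[str], seed: Optional[int] = None) -> str:
    """Return one game ID chosen uniformly at random."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.choice(games)


chosen = pick_game(HYPOTHETICAL_GAMES)
```

An agent evaluated this way cannot know in advance which game it will face, which is the reason it has to cope with all of them.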

Step 3: Add Wrappers

Wrappers modify how the environment behaves. Each wrapper serves a specific purpose:

Python
# This wrapper accumulates game history and formats it for language models
env = ta.wrappers.LLMObservationWrapper(env=env)

# This wrapper provides nicely formatted output for human readability
env = ta.wrappers.SimpleRenderWrapper(
    env=env,
    player_names={0: "GPT-4o-Mini", 1: "Claude-3.5-Haiku"},
)

The LLMObservationWrapper is particularly important because:

  • The base environment provides observations as a list of (sender_id, message) tuples
  • Language models expect a single string input
  • This wrapper maintains the conversation history and formats everything as a coherent dialogue
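The conversion from (sender_id, message) tuples to a single prompt string can be pictured roughly as follows. This is a minimal sketch of the idea, not the wrapper's actual code: the HistoryFormatter class, the "[sender] message" layout, and the "Player N" fallback label are all assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

Message = Tuple[int, str]  # (sender_id, message)


class HistoryFormatter:
    """Hypothetical helper: accumulates messages and renders one string."""

    def __init__(self, names: Dict[int, str]):
        self._names = names
        self._history: List[Message] = []

    def add(self, new_messages: List[Message]) -> str:
        # Keep everything seen so far so the model receives the full dialogue.
        self._history.extend(new_messages)
        lines = []
        for sender_id, message in self._history:
            sender = self._names.get(sender_id, f"Player {sender_id}")
            lines.append(f"[{sender}] {message}")
        return "\n".join(lines)


formatter = HistoryFormatter({0: "GPT-4o-Mini"})
formatter.add([(0, "I open with e4.")])
prompt = formatter.add([(1, "I reply with e5.")])
```

After the second call, prompt holds both messages in order, one per line, which is the kind of single-string input a language model expects.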

The SimpleRenderWrapper helps with:

  • Color-coding messages by player
  • Adding clear turn indicators
  • Formatting game state information
  • Making the output more readable in the terminal

Step 4: Game Loop

Run the main game loop with clear control flow:

Python
# Reset the environment
env.reset()
done = False

# Continue until the game is complete
while not done:
    # Get the current observation
    player_id, observation = env.get_observation()

    # Generate your model's action
    action = agents[player_id](observation)

    # Apply the action
    done, info = env.step(action=action)

# Get final rewards when game is complete
rewards = env.close()

The game loop handles:

  • Turn management (which player goes when)
  • Observation delivery to agents
  • Action processing
  • Game state updates
  • Victory/defeat determination
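To see the control flow without API keys or a real game, the same loop can be run against a stand-in environment. The sketch below assumes only the four methods used in the loop (reset, get_observation, step, close). The CountdownEnv class, its turn-counting rule, its log attribute, and the shape of the returned rewards are hypothetical and exist purely for illustration.

```python
from typing import Dict, List, Tuple


class CountdownEnv:
    """Hypothetical two-player stand-in exposing the methods the loop uses."""

    def __init__(self, turns: int = 4):
        self._total_turns = turns
        self._turns_left = turns
        self._current_player = 0
        self.log: List[Tuple[int, str]] = []

    def reset(self) -> None:
        self._turns_left = self._total_turns
        self._current_player = 0
        self.log = []

    def get_observation(self) -> Tuple[int, str]:
        return self._current_player, f"{self._turns_left} turns left."

    def step(self, action: str) -> Tuple[bool, Dict]:
        self.log.append((self._current_player, action))
        self._turns_left -= 1
        self._current_player = 1 - self._current_player  # alternate turns
        return self._turns_left == 0, {}

    def close(self) -> Dict[int, int]:
        return {0: 0, 1: 0}  # dummy rewards: a draw


# Plain callables stand in for LLM-backed agents.
agents = {
    0: lambda observation: "left",
    1: lambda observation: "right",
}

env = CountdownEnv(turns=4)
env.reset()
done = False
while not done:
    player_id, observation = env.get_observation()
    action = agents[player_id](observation)
    done, info = env.step(action=action)
rewards = env.close()
```

After the loop, env.log records which player acted on each turn, which makes it easy to confirm that turns alternate as expected.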

Full Code

Python
import textarena as ta

# Initialize agents
agents = {
    0: ta.agents.OpenRouterAgent(model_name="GPT-4o-mini"),
    1: ta.agents.OpenRouterAgent(model_name="anthropic/claude-3.5-haiku"),
}

# Initialize environment from subset
env = ta.make(env_id="BalancedSubset-v0")
env = ta.wrappers.LLMObservationWrapper(env=env)
env = ta.wrappers.SimpleRenderWrapper(
    env=env,
    player_names={0: "GPT-4o-Mini", 1: "Claude-3.5-Haiku"},
)

env.reset()
done = False
while not done:
    player_id, observation = env.get_observation()
    action = agents[player_id](observation)
    done, info = env.step(action=action)
rewards = env.close()