Getting Started

This guide walks through setting up a match between GPT-4o-mini and Claude-3.5-haiku in a text-based game, explaining each component along the way.

Installation

Install TextArena directly from PyPI:

Bash

pip install textarena

Step 1: Initialize Agents

TextArena provides several ready-made agent classes for publicly available LLMs. The OpenRouterAgent wrapper handles all the API communication and response formatting:

Python

import textarena as ta

# Map each player ID to the agent that plays for it
agents = {
    0: ta.agents.OpenRouterAgent(model_name="GPT-4o-mini"),
    1: ta.agents.OpenRouterAgent(model_name="anthropic/claude-3.5-haiku"),
}

The dictionary keys (0 and 1) are player IDs that the environment uses to track turns. On each turn the environment reports which player is to move, and the game loop calls the agent stored under that ID.
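Because the game loop only ever calls `agents[player_id](observation)`, any callable that takes an observation string and returns an action string can sit in this dictionary. The sketch below illustrates that contract with a hypothetical `FixedActionAgent`. The class name and the bracketed action strings are made up for illustration and are not part of TextArena.

```python
class FixedActionAgent:
    """Hypothetical stand-in agent: always answers with the same action."""

    def __init__(self, action: str):
        self.action = action
        self.calls = 0  # how many times the game loop asked this agent to move

    def __call__(self, observation: str) -> str:
        # A real agent would send the observation to a model; this one ignores it.
        self.calls += 1
        return self.action


agents = {0: FixedActionAgent("[rock]"), 1: FixedActionAgent("[paper]")}

# The game loop looks up the agent by the player ID whose turn it is.
player_id, observation = 1, "Choose rock, paper, or scissors."
action = agents[player_id](observation)
print(action)
```

A stub like this can be handy for checking the surrounding loop without spending API credits.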

Step 2: Create Environment

As in OpenAI Gym, environments are created with make():

Python

env = ta.make(env_id="BalancedSubset-v0")

The BalancedSubset environment randomly selects one game from its collection each time it's initialized. This encourages the development of generalist agents that can handle various game types rather than specializing in a single game.
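The selection behaviour described above can be pictured as a random draw from a pool of game IDs. This is only a conceptual sketch and not TextArena's implementation. The pool contents and the `pick_game` helper are hypothetical, since the actual games behind BalancedSubset-v0 are not listed here.

```python
import random

# Hypothetical game IDs, used purely for illustration.
GAME_POOL = ["GameA-v0", "GameB-v0", "GameC-v0"]


def pick_game(pool, rng):
    """Draw one game ID uniformly at random from the pool."""
    return rng.choice(pool)


rng = random.Random(0)  # seeded so repeated runs give the same sequence
picks = [pick_game(GAME_POOL, rng) for _ in range(5)]
print(picks)
```

An agent evaluated over many such draws has to cope with every game in the pool, which is what pushes it toward generalist behaviour.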

Step 3: Add Wrappers

Wrappers modify how the environment behaves. Each wrapper serves a specific purpose:

Python

# This wrapper accumulates game history and formats it for language models
env = ta.wrappers.LLMObservationWrapper(env=env)

# This wrapper provides nicely formatted output for human readability
env = ta.wrappers.SimpleRenderWrapper(
    env=env,
    player_names={0: "GPT-4o-Mini", 1: "Claude-3.5-Haiku"}
)

The LLMObservationWrapper is particularly important because:

The base environment provides observations as a list of (sender_id, message) tuples
Language models expect a single string input
This wrapper maintains the conversation history and formats everything as a coherent dialogue
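The conversion from tuples to a single prompt string might look roughly like the sketch below. It is a simplified illustration and not the wrapper's actual code. The `format_history` function, the speaker labels, and the use of -1 as the ID for messages from the game itself are all assumptions of this example.

```python
from typing import Dict, List, Tuple


def format_history(history: List[Tuple[int, str]], names: Dict[int, str]) -> str:
    """Join (sender_id, message) tuples into one dialogue-style string."""
    lines = []
    for sender_id, message in history:
        speaker = names.get(sender_id, f"Player {sender_id}")
        lines.append(f"[{speaker}] {message}")
    return "\n".join(lines)


names = {-1: "GAME", 0: "Player 0", 1: "Player 1"}  # -1 for the game is an assumption
history = [
    (-1, "You are playing a word game."),
    (0, "I pick 'apple'."),
    (1, "I pick 'eagle'."),
]
prompt = format_history(history, names)
print(prompt)
```

The resulting string is what would be handed to an agent as its observation.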

The SimpleRenderWrapper helps with:

Color-coding messages by player
Adding clear turn indicators
Formatting game state information
Making the output more readable in the terminal
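To give a feel for the color-coding and turn indicators, here is a minimal sketch using standard ANSI escape codes. It does not reproduce SimpleRenderWrapper's real output. The `render_turn` function, the color choices, and the header layout are hypothetical.

```python
RESET = "\033[0m"
COLORS = {0: "\033[94m", 1: "\033[92m"}  # bright blue for player 0, bright green for player 1


def render_turn(turn: int, player_id: int, name: str, message: str) -> str:
    """Return a colored block with a turn header followed by the player's message."""
    color = COLORS.get(player_id, "")  # unknown players are left uncolored
    return f"{color}--- Turn {turn}: {name} ---\n{message}{RESET}"


rendered = render_turn(1, 0, "GPT-4o-Mini", "I pick 'apple'.")
print(rendered)
```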

Step 4: Game Loop

Run the main game loop with clear control flow:

Python

# Reset the environment
env.reset()
done = False

# Continue until the game is complete
while not done:
    # Get the current observation
    player_id, observation = env.get_observation()

    # Generate your model's action
    action = agents[player_id](observation)

    # Apply the action
    done, info = env.step(action=action)

# Get final rewards when game is complete
rewards = env.close()

The game loop handles:

Turn management (which player goes when)
Observation delivery to agents
Action processing
Game state updates
Victory/defeat determination
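To see these responsibilities in isolation, the same loop can be run against a toy environment that exposes the four methods used above (reset, get_observation, step, close). The `CountdownEnv` class below is a hypothetical stand-in written for this illustration. Its rules and reward values are made up, and it makes no claim about how TextArena environments work internally.

```python
class CountdownEnv:
    """Toy game: players alternately subtract 1 or 2; whoever reaches 0 wins."""

    def __init__(self, start: int = 5):
        self.start = start

    def reset(self):
        self.counter = self.start
        self.current_player = 0
        self.rewards = {0: 0, 1: 0}

    def get_observation(self):
        # Turn management: report whose move it is along with what they see.
        return self.current_player, f"Counter is {self.counter}. Reply 1 or 2."

    def step(self, action: str):
        # Action processing: anything other than "2" counts as 1.
        take = 2 if action.strip() == "2" else 1
        self.counter -= min(take, self.counter)
        done = self.counter == 0
        if done:
            # Victory/defeat determination.
            winner = self.current_player
            self.rewards = {winner: 1, 1 - winner: -1}
        else:
            self.current_player = 1 - self.current_player
        return done, {"counter": self.counter}

    def close(self):
        return self.rewards


# Scripted agents: player 0 always takes 2, player 1 always takes 1.
agents = {0: lambda obs: "2", 1: lambda obs: "1"}

env = CountdownEnv(start=5)
env.reset()
done = False
while not done:
    player_id, observation = env.get_observation()
    action = agents[player_id](observation)
    done, info = env.step(action=action)
rewards = env.close()
print(rewards)  # {0: 1, 1: -1}
```

Starting from 5, player 0 takes 2, player 1 takes 1, and player 0 takes the last 2, so player 0 wins.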

Full Code

Python

import textarena as ta

# Initialize agents
agents = {
    0: ta.agents.OpenRouterAgent(model_name="GPT-4o-mini"),
    1: ta.agents.OpenRouterAgent(model_name="anthropic/claude-3.5-haiku"),
}

# Initialize environment from subset
env = ta.make(env_id="BalancedSubset-v0")
env = ta.wrappers.LLMObservationWrapper(env=env)
env = ta.wrappers.SimpleRenderWrapper(
    env=env,
    player_names={0: "GPT-4o-Mini", 1: "Claude-3.5-Haiku"}
)

# Play until the game ends, then collect the final rewards
env.reset()
done = False
while not done:
    player_id, observation = env.get_observation()
    action = agents[player_id](observation)
    done, info = env.step(action=action)
rewards = env.close()

Next Steps

Train a Model

Compete Online