Env
Gymnasium environment wrapper for Grid Universe.
Provides a structured observation that pairs a rendered RGBA image with rich
info dictionaries (agent status / inventory / active effects, environment
config). The reward for each step is the change in state.score. terminated is
True on a win, and truncated is True on a loss.
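The reward and episode-end convention above can be sketched in isolation. This is a minimal illustration, not the library's code: `ToyState` and its `win` / `lose` fields are hypothetical stand-ins, and only `score` is named in this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyState:
    """Hypothetical stand-in for the library's State; only the fields used here."""

    score: int = 0
    win: bool = False   # assumed flag name
    lose: bool = False  # assumed flag name


def step_signals(prev: ToyState, new: ToyState) -> tuple[float, bool, bool]:
    """Return (reward, terminated, truncated) following the convention described above."""
    reward = float(new.score - prev.score)  # reward is the score delta for this step
    terminated = new.win                    # True on win
    truncated = new.lose                    # True on lose
    return reward, terminated, truncated


before = ToyState(score=10)
after = ToyState(score=15, win=True)
signals = step_signals(before, after)
```

Because the reward is a delta, summing the rewards over an episode recovers the final score minus the starting score.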
ImageObservation schema (see docs for full details):
    {
        "image": np.ndarray(H,W,4),
        "info": {
            "agent": {...},
            "status": {...},
            "config": {...},
            "message": str  # empty string if None
        }
    }

or

    GridState  # if observation_type="gridstate"
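A short sketch of reading an image-style observation may help. It relies only on the keys listed in the schema above; `summarize_observation` is a hypothetical helper, and the stand-in object replaces a real `np.ndarray` so the snippet runs without NumPy. It does not apply when observation_type="gridstate".

```python
from types import SimpleNamespace
from typing import Any


def summarize_observation(obs: Any) -> dict[str, Any]:
    """Pull a few commonly needed values out of an image-style observation."""
    height, width, channels = obs["image"].shape  # expected (H, W, 4)
    info = obs["info"]
    return {
        "size": (width, height),
        "channels": channels,
        "has_message": info["message"] != "",  # empty string means "no message"
        "sections": sorted(k for k in info if k != "message"),
    }


# Stand-in observation: only `.shape` is read, so a real array is not required here.
fake_obs = {
    "image": SimpleNamespace(shape=(48, 64, 4)),
    "info": {"agent": {}, "status": {}, "config": {}, "message": ""},
}
summary = summarize_observation(fake_obs)
```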
Usage:
from grid_universe.env import GridUniverseEnv
from grid_universe.examples.maze import generate as maze_generate
env = GridUniverseEnv(initial_state_fn=maze_generate, width=9, height=9, seed=123)
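Once an environment exists, a rollout follows the usual Gymnasium reset/step loop. The sketch below is generic: `run_episode` is a hypothetical helper, and `CountdownEnv` is a toy stand-in used so the snippet runs without the library installed. With the real environment you would pass the `env` built above and a policy that returns a valid action index (the number of actions is not stated on this page).

```python
from typing import Any, Callable


def run_episode(env: Any, policy: Callable[[Any], int], max_steps: int = 200) -> float:
    """Roll out one episode and return the summed reward."""
    obs, _info = env.reset()
    total = 0.0
    for _ in range(max_steps):
        obs, reward, terminated, truncated, _info = env.step(policy(obs))
        total += reward
        if terminated or truncated:  # win or lose ends the episode
            break
    return total


class CountdownEnv:
    """Toy stand-in with the same reset/step signature; NOT the real environment."""

    def reset(self, *, seed=None, options=None):
        self.remaining = 3
        return {"remaining": self.remaining}, {}

    def step(self, action):
        self.remaining -= 1
        done = self.remaining == 0
        return {"remaining": self.remaining}, 1.0, done, False, {}


episode_return = run_episode(CountdownEnv(), policy=lambda obs: 0)
```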
Customization hooks:
- initial_state_fn: Provide a callable that returns a fully built State.
- render_image_map / render_resolution: Let you swap assets or change the resolution.
The environment is purposely not vectorized; wrap externally if needed.
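One way to wrap externally is a plain sequential batch that owns several independent environments. This is only a sketch under assumptions: `SequentialBatch` and `_ToyEnv` are hypothetical names, the auto-reset on episode end is a design choice rather than library behavior, and the toy environment stands in for real instances.

```python
from typing import Any, Callable, Sequence


class SequentialBatch:
    """Steps several independent environments one after another (hypothetical helper)."""

    def __init__(self, env_fns: Sequence[Callable[[], Any]]):
        self.envs = [fn() for fn in env_fns]

    def reset(self) -> list[Any]:
        return [env.reset()[0] for env in self.envs]

    def step(self, actions: Sequence[int]) -> list[tuple]:
        results = []
        for env, action in zip(self.envs, actions):
            obs, reward, terminated, truncated, info = env.step(action)
            if terminated or truncated:
                obs, info = env.reset()  # auto-reset so the batch keeps running
            results.append((obs, reward, terminated, truncated, info))
        return results


class _ToyEnv:
    """Toy stand-in that ends after `length` steps; NOT the real environment."""

    def __init__(self, length: int):
        self.length = length
        self.t = 0

    def reset(self, *, seed=None, options=None):
        self.t = 0
        return self.t, {}

    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= self.length, False, {}


batch = SequentialBatch([lambda: _ToyEnv(1), lambda: _ToyEnv(2)])
first_obs = batch.reset()
results = batch.step([0, 0])
```

Passing factories rather than instances keeps each environment's state separate. With the real environment, each factory would construct its own GridUniverseEnv, typically with a different seed.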
AgentInfo
Bases: TypedDict
Agent sub‑observation grouping health, effects and inventory.
ConfigInfo
Bases: TypedDict
Environment configuration.
EffectEntry
Bases: TypedDict
Single active effect entry.
Fields use sentinel defaults in the runtime observation:
- Empty string ("") for absent text fields.
- -1 for numeric fields that are logically None / not applicable.
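Downstream code often wants `None` back instead of the sentinels. The following is a minimal sketch, assuming -1 and "" never occur as legitimate values and that fields hold plain Python types; `decode_sentinels` is a hypothetical helper, and the field names in the example are illustrative, not the real EffectEntry keys.

```python
from typing import Any, Mapping


def decode_sentinels(entry: Mapping[str, Any]) -> dict[str, Any]:
    """Map "" and -1 back to None; leave every other value untouched."""
    decoded = {}
    for key, value in entry.items():
        is_empty_text = isinstance(value, str) and value == ""
        # bool is a subclass of int, so exclude it explicitly.
        is_missing_number = (
            isinstance(value, int) and not isinstance(value, bool) and value == -1
        )
        decoded[key] = None if (is_empty_text or is_missing_number) else value
    return decoded


# Field names below are illustrative only, not the real EffectEntry keys.
raw_effect = {"type": "SPEED", "limit_type": "", "limit_amount": -1, "multiplier": 2}
effect = decode_sentinels(raw_effect)
```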
GridUniverseEnv(initial_state_fn, render_mode='rgb_array', render_resolution=DEFAULT_RESOLUTION, render_image_map=DEFAULT_IMAGE_MAP, render_asset_root=DEFAULT_ASSET_ROOT, observation_type='image', **kwargs)
Bases: Env[ImageObservation | GridState, integer]
Gymnasium Env implementation for the Grid Universe.
Create a new environment instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `render_mode` | `str` | "rgb_array" to return PIL image frames, "human" to open a window. | `'rgb_array'` |
| `render_resolution` | `int` | Width (pixels) of rendered image (height derived). | `DEFAULT_RESOLUTION` |
| `render_image_map` | `ImageMap` | Mapping of | `DEFAULT_IMAGE_MAP` |
| `initial_state_fn` | `Callable[..., State]` | Callable returning an initial | required |
| `**kwargs` | `Any` | Forwarded to | `{}` |
gridstate (property)
Return the current state as a GridState dataclass.
close()
Release any renderer resources (no-op placeholder).
render(mode=None)
Render the current state.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `mode` | `str \| None` | "human" to display, "rgb_array" to return PIL image. Defaults to the instance's configured render mode. | `None` |
reset(*, seed=None, options=None)
Start a new episode.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `seed` | `int \| None` | Currently unused (procedural seed is passed via kwargs on construction). | `None` |
| `options` | `dict \| None` | Gymnasium options (unused). | `None` |
Returns:
| Type | Description |
|---|---|
| `tuple[ImageObservation \| GridState, dict[str, object]]` | Tuple[ImageObservation, dict]: ImageObservation dict and empty info dict per Gymnasium API. |
state_info()
Return structured info sub-dict used in observations.
step(action)
Apply one environment step.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `action` | `int \| integer \| Action` | Integer index (or | required |
Returns:
| Type | Description |
|---|---|
| `tuple[ImageObservation \| GridState, float, bool, bool, dict[str, object]]` | Tuple[ImageObservation, float, bool, bool, dict]: |
HealthInfo
Bases: TypedDict
Health block; -1 indicates missing (agent has no health component).
ImageObservation
Bases: TypedDict
Top‑level observation returned by the environment.
image: RGBA image array (H x W x 4, dtype=uint8)
info: Rich structured dictionaries (see InfoDict).
InfoDict
Bases: TypedDict
Full structured info payload accompanying every observation.
InventoryItem
Bases: TypedDict
Inventory item entry (key / core / coin / generic item).
StatusInfo
Bases: TypedDict
Environment status (score, phase, current turn).
agent_observation_dict(state, agent_id)
Compose structured agent sub‑observation.
Includes health, list of active effect entries, and inventory items.
Missing health is represented by None values, which are later converted to
the sentinel value -1 used in the space definition when the observation is
serialized to numpy arrays (Gym leaves them as ints here).
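The None-to-sentinel conversion described above can be sketched as follows. This is an assumption-laden illustration, not the library's implementation: `health_block` and the keys `health` / `max_health` are hypothetical names chosen for the example.

```python
from typing import Optional

MISSING = -1  # sentinel used for "no health component"


def health_block(current: Optional[int], maximum: Optional[int]) -> dict[str, int]:
    """Build a health dict, substituting -1 wherever a value is None."""
    return {
        "health": MISSING if current is None else current,      # assumed key name
        "max_health": MISSING if maximum is None else maximum,  # assumed key name
    }


with_health = health_block(3, 5)
without_health = health_block(None, None)
```

Keeping every field an integer means the dict always fits a fixed-dtype numeric space, at the cost of consumers having to check for -1.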
env_config_observation_dict(state)
Config portion of observation (function names, seed, dimensions).
env_status_observation_dict(state)
Status portion of observation (score, phase, turn).