Env

Gymnasium environment wrapper for Grid Universe.

Provides a structured observation that pairs a rendered RGBA image with rich info dictionaries (agent status, inventory, active effects, and environment config). The reward for each step is the change in state.score over that step. terminated is True on a win; truncated is True on a loss.
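The reward rule can be illustrated with a small sketch. It assumes only that the state exposes a numeric score; score_delta_rewards and the score values are hypothetical and not part of grid_universe.

```python
from typing import Sequence


def score_delta_rewards(scores: Sequence[float]) -> list[float]:
    """Per-step rewards from a sequence of scores.

    scores[0] is the score right after reset; scores[i] is the score after
    step i. Each reward is the change in score caused by that step.
    """
    return [float(after - before) for before, after in zip(scores, scores[1:])]


# Hypothetical score trajectory: two steps that cost a point, then a gain of 10.
scores = [0, -1, -2, 8]
rewards = score_delta_rewards(scores)
print(rewards)       # [-1.0, -1.0, 10.0]
print(sum(rewards))  # 8.0, i.e. final score minus initial score
```

Because each reward is a difference of consecutive scores, the undiscounted return of an episode equals the final score minus the score at reset.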

ImageObservation schema (see docs for full details):

{
    "image": np.ndarray(H, W, 4),
    "info": {
        "agent": {...},
        "status": {...},
        "config": {...},
        "message": str  # empty string if None
    }
}

or

GridState  # if observation_type="gridstate"
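As a rough sketch of how calling code might handle both observation forms, the helper below branches on whether the observation is the image dict or something else. summarize_observation is a hypothetical name; only the "image", "info" and "message" keys come from the schema above.

```python
from typing import Any


def summarize_observation(obs: Any) -> str:
    """Return a one-line description of either observation form."""
    if isinstance(obs, dict) and "image" in obs and "info" in obs:
        height, width, channels = obs["image"].shape  # expected (H, W, 4)
        message = obs["info"]["message"]  # "" when there is no message
        text = message if message else "<no message>"
        return f"image {width}x{height}x{channels}, message: {text}"
    # Anything else is assumed to be a GridState (observation_type="gridstate").
    return f"gridstate of type {type(obs).__name__}"
```

Checking for an empty message string rather than None matches the schema comment, which says a missing message is delivered as "".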

Usage:

from grid_universe.env import GridUniverseEnv
from grid_universe.examples.maze import generate as maze_generate

env = GridUniverseEnv(initial_state_fn=maze_generate, width=9, height=9, seed=123)

Customization hooks
  • initial_state_fn: Provide a callable that returns a fully built State.
  • render_image_map / render_resolution: Swap the assets or change the rendered resolution.

The environment is purposely not vectorized; wrap externally if needed.
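One way to "wrap externally" is a plain sequential batch, sketched below. SequentialEnvBatch is a hypothetical helper, not part of grid_universe; it assumes only the Gymnasium-style reset() and step() methods and resets an environment as soon as its episode ends.

```python
from typing import Any, Callable, Sequence


class SequentialEnvBatch:
    """Step several independent environments one after another."""

    def __init__(self, env_fns: Sequence[Callable[[], Any]]) -> None:
        # Each factory builds one environment instance.
        self.envs = [fn() for fn in env_fns]

    def reset(self) -> list[Any]:
        # Keep only the observation; the info dict is discarded.
        return [env.reset()[0] for env in self.envs]

    def step(self, actions: Sequence[Any]) -> tuple[list, list, list, list]:
        if len(actions) != len(self.envs):
            raise ValueError("need exactly one action per environment")
        observations, rewards, terminated, truncated = [], [], [], []
        for env, action in zip(self.envs, actions):
            obs, reward, term, trunc, _info = env.step(action)
            if term or trunc:
                # Auto-reset: the returned observation starts a new episode.
                obs, _ = env.reset()
            observations.append(obs)
            rewards.append(reward)
            terminated.append(term)
            truncated.append(trunc)
        return observations, rewards, terminated, truncated
```

With the usage snippet above, the factories could be written as lambda s=s: GridUniverseEnv(initial_state_fn=maze_generate, width=9, height=9, seed=s) for each seed s; binding s as a default argument avoids every lambda sharing the last seed.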

AgentInfo

Bases: TypedDict

Agent sub‑observation grouping health, effects and inventory.

ConfigInfo

Bases: TypedDict

Environment configuration.

EffectEntry

Bases: TypedDict

Single active effect entry.

Fields use sentinel defaults in the runtime observation
  • Empty string ("") for absent text fields.
  • -1 for numeric fields that are logically None / not applicable.
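Consumers that prefer None over sentinels could decode an entry along these lines. This is a minimal sketch: decode_sentinels is a hypothetical helper and the field names in the example are illustrative only, so check EffectEntry for the real keys.

```python
from typing import Any, Optional


def decode_sentinels(entry: dict[str, Any]) -> dict[str, Optional[Any]]:
    """Map sentinel values back to None: "" for text, -1 for numbers."""
    decoded: dict[str, Optional[Any]] = {}
    for key, value in entry.items():
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if isinstance(value, str) and value == "":
            decoded[key] = None
        elif is_number and value == -1:
            decoded[key] = None
        else:
            decoded[key] = value
    return decoded


# Illustrative field names (assumptions, not the documented schema).
raw_effect = {"id": 7, "type": "SPEED", "limit_type": "", "limit_amount": -1}
print(decode_sentinels(raw_effect))
# {'id': 7, 'type': 'SPEED', 'limit_type': None, 'limit_amount': None}
```

The decoding is lossy in one direction: a field whose real value happens to be -1 would also become None, so apply it only to fields where -1 cannot occur naturally.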

GridUniverseEnv(initial_state_fn, render_mode='rgb_array', render_resolution=DEFAULT_RESOLUTION, render_image_map=DEFAULT_IMAGE_MAP, render_asset_root=DEFAULT_ASSET_ROOT, observation_type='image', **kwargs)

Bases: Env[ImageObservation | GridState, integer]

Gymnasium Env implementation for the Grid Universe.

Create a new environment instance.

Parameters:

  • render_mode (str, default 'rgb_array'): "rgb_array" to return PIL image frames, "human" to open a window.
  • render_resolution (int, default DEFAULT_RESOLUTION): Width (pixels) of rendered image (height derived).
  • render_image_map (ImageMap, default DEFAULT_IMAGE_MAP): Mapping of (appearance_name, properties) to asset paths.
  • initial_state_fn (Callable[..., State], required): Callable returning an initial State.
  • **kwargs (Any, default {}): Forwarded to initial_state_fn (e.g., size, densities, seed).

gridstate property

Return the current state as a GridState dataclass.

close()

Release any renderer resources (no-op placeholder).

render(mode=None)

Render the current state.

Parameters:

  • mode (str | None, default None): "human" to display, "rgb_array" to return PIL image. Defaults to the instance's configured render mode.

reset(*, seed=None, options=None)

Start a new episode.

Parameters:

  • seed (int | None, default None): Currently unused (procedural seed is passed via kwargs on construction).
  • options (dict | None, default None): Gymnasium options (unused).

Returns:

  • tuple[ImageObservation | GridState, dict[str, object]]: The observation (an ImageObservation dict, or a GridState when observation_type="gridstate") and an empty info dict, per the Gymnasium API.
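Since reset() ignores its seed argument, a different procedural seed requires a fresh environment. The sketch below shows one way to organize that. make_env_factory is a hypothetical helper, and it assumes the chosen initial_state_fn accepts a seed keyword, as the maze example in the usage snippet does.

```python
from typing import Any, Callable


def make_env_factory(
    env_cls: Callable[..., Any],
    initial_state_fn: Callable[..., Any],
    **fixed_kwargs: Any,
) -> Callable[[int], Any]:
    """Return a function that builds a fresh environment for a given seed.

    The seed travels through the constructor's **kwargs, which the
    environment forwards to initial_state_fn.
    """

    def build(seed: int) -> Any:
        return env_cls(initial_state_fn=initial_state_fn, seed=seed, **fixed_kwargs)

    return build
```

For example, build = make_env_factory(GridUniverseEnv, maze_generate, width=9, height=9) followed by env = build(123) reproduces the constructor call from the usage snippet, and build(124) gives an environment constructed with the next seed.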

state_info()

Return structured info sub-dict used in observations.

step(action)

Apply one environment step.

Parameters:

  • action (int | integer | Action, required): Integer index (or Action enum member) selecting an action from the discrete action space.

Returns:

  • tuple[ImageObservation | GridState, float, bool, bool, dict[str, object]]: (observation, reward, terminated, truncated, info). The observation is an ImageObservation dict, or a GridState when observation_type="gridstate".
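A typical episode loop over this return tuple might look like the sketch below. run_episode, the policy callable and the max_steps cap are assumptions for illustration; the function relies only on the reset() and step() signatures documented here.

```python
from typing import Any, Callable


def run_episode(
    env: Any,
    policy: Callable[[Any], Any],
    max_steps: int = 1_000,
) -> tuple[float, int, str]:
    """Run one episode and return (total_reward, steps, outcome)."""
    obs, _info = env.reset()
    total_reward = 0.0
    for step_count in range(1, max_steps + 1):
        obs, reward, terminated, truncated, _info = env.step(policy(obs))
        total_reward += reward
        if terminated:
            return total_reward, step_count, "terminated"
        if truncated:
            return total_reward, step_count, "truncated"
    # Neither flag was raised within the cap chosen by the caller.
    return total_reward, max_steps, "step_limit"
```

Per the module description, "terminated" corresponds to a win and "truncated" to a loss, so the outcome string can be used to count wins across episodes. The step cap is a guard added by this sketch, not an environment feature.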

HealthInfo

Bases: TypedDict

Health block; -1 indicates missing (agent has no health component).

ImageObservation

Bases: TypedDict

Top‑level observation returned by the environment.

  • image: RGBA image array (H x W x 4, dtype=uint8).
  • info: Rich structured dictionaries (see InfoDict).

InfoDict

Bases: TypedDict

Full structured info payload accompanying every observation.

InventoryItem

Bases: TypedDict

Inventory item entry (key / core / coin / generic item).

StatusInfo

Bases: TypedDict

Environment status (score, phase, current turn).

agent_observation_dict(state, agent_id)

Compose structured agent sub‑observation.

Includes health, a list of active effect entries, and inventory items. Missing health is represented by None values, which are converted to the sentinel -1 from the space definition when the observation is serialized to numpy arrays (Gym leaves the values as plain ints at this stage).
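The None-to-sentinel conversion described above could be sketched as follows. to_sentinel is a hypothetical helper and the health keys are illustrative assumptions; the library's own conversion may differ.

```python
from typing import Optional


def to_sentinel(value: Optional[int], sentinel: int = -1) -> int:
    """Replace None with the numeric sentinel; pass real values through."""
    return sentinel if value is None else int(value)


# Illustrative health blocks (key names are assumptions).
no_health = {"health": None, "max_health": None}
some_health = {"health": 3, "max_health": 5}

print({k: to_sentinel(v) for k, v in no_health.items()})
# {'health': -1, 'max_health': -1}
print({k: to_sentinel(v) for k, v in some_health.items()})
# {'health': 3, 'max_health': 5}
```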

env_config_observation_dict(state)

Config portion of observation (function names, seed, dimensions).

env_status_observation_dict(state)

Status portion of observation (score, phase, turn).