compiler_gym.spaces
CompilerGym extends the builtin gym spaces to better describe the observation spaces available to compilers.
Additional spaces:
Commandline
- class compiler_gym.spaces.Commandline(items: Iterable[CommandlineFlag], name: str)[source]
A NamedDiscrete space where each element represents a commandline flag.
Example usage:
>>> space = Commandline([
...     CommandlineFlag("a", "-a", "A flag"),
...     CommandlineFlag("b", "-b", "Another flag"),
... ])
>>> space.n
2
>>> space["a"]
0
>>> space.names[0]
a
>>> space.flags[0]
-a
>>> space.descriptions[0]
A flag
>>> space.sample()
1
>>> space.commandline([0, 1])
-a -b
- Variables
flags – A list of flag strings.
descriptions – A list of flag descriptions.
- __init__(items: Iterable[CommandlineFlag], name: str)[source]
Constructor.
- Parameters
items – The commandline flags that comprise the space.
name – The name of the space.
- commandline(values: Union[int, Iterable[int]]) str [source]
Produce a commandline invocation from a sequence of values.
- Parameters
values – A numeric value from the space, or sequence of values.
- Returns
A string commandline invocation.
- from_commandline(commandline: str) List[int] [source]
Produce a sequence of actions from a commandline.
- Parameters
commandline – A string commandline invocation, as produced by commandline().
- Returns
A list of action values.
- Raises
LookupError – If any of the flags in the commandline are not recognized.
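The round trip between commandline() and from_commandline() can be sketched without CompilerGym installed. The classes below (Flag, CommandlineSketch) are hypothetical stand-ins written for illustration. They mirror the documented behavior but are not the library's implementation, and they assume flags are separated by whitespace.

```python
from typing import Iterable, List, NamedTuple, Union


class Flag(NamedTuple):
    """Hypothetical stand-in for CommandlineFlag: (name, flag, description)."""

    name: str
    flag: str
    description: str


class CommandlineSketch:
    """Illustrative stand-in for the Commandline space, not the real class."""

    def __init__(self, items: Iterable[Flag]):
        items = list(items)
        self.names = [item.name for item in items]
        self.flags = [item.flag for item in items]

    def commandline(self, values: Union[int, Iterable[int]]) -> str:
        # Accept a single value or a sequence of values.
        if isinstance(values, int):
            values = [values]
        return " ".join(self.flags[value] for value in values)

    def from_commandline(self, commandline: str) -> List[int]:
        actions = []
        for token in commandline.split():
            if token not in self.flags:
                # Mirrors the documented LookupError for unrecognized flags.
                raise LookupError(f"Unknown flag: {token!r}")
            actions.append(self.flags.index(token))
        return actions


space = CommandlineSketch(
    [Flag("a", "-a", "A flag"), Flag("b", "-b", "Another flag")]
)
text = space.commandline([0, 1])
round_trip = space.from_commandline(text)
```

In this sketch, from_commandline(commandline(values)) returns the original values, which is the property the two methods are documented to share.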
Dict
Discrete
NamedDiscrete
- class compiler_gym.spaces.NamedDiscrete(items: Iterable[str], name: str)[source]
An extension of the Discrete space in which each point in the space has a name. Additionally, the space itself may have a name.
- Variables
name (str) – The name of the space.
names (List[str]) – A list of names for each element in the space.
Example usage:
>>> space = NamedDiscrete(["a", "b", "c"])
>>> space.n
3
>>> space["a"]
0
>>> space.names[0]
a
>>> space.sample()
1
- __init__(items: Iterable[str], name: str)[source]
Constructor.
- Parameters
items – A list of names for items in the space.
name – The name of the space.
- __getitem__(name: str) int [source]
Lookup the numeric value of a point in the space.
- Parameters
name – A name.
- Returns
The numeric value.
- Raises
ValueError – If the name is not in the space.
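The name lookup described above can be illustrated with a small stand-in. NamedDiscreteSketch is a hypothetical class written for this example, not the CompilerGym implementation. It assumes the numeric value of a point is simply its position in the list of names.

```python
from typing import Iterable, List


class NamedDiscreteSketch:
    """Illustrative stand-in for NamedDiscrete, not the real class."""

    def __init__(self, items: Iterable[str], name: str):
        self.name = name
        self.names: List[str] = list(items)
        self.n = len(self.names)

    def __getitem__(self, name: str) -> int:
        # list.index raises ValueError when the name is absent, which
        # matches the exception documented for __getitem__.
        return self.names.index(name)


space = NamedDiscreteSketch(["a", "b", "c"], name="letters")
index_of_b = space["b"]
```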
Permutation
- class compiler_gym.spaces.Permutation(name: str, scalar_range: Scalar)[source]
The space of permutations of all numbers in the range scalar_range.
- __init__(name: str, scalar_range: Scalar)[source]
Constructor.
- Parameters
name – The name of the permutation space.
scalar_range – Range of numbers in the permutation. For example, the scalar range [1, 3] defines permutations such as [1, 2, 3] and [2, 1, 3].
- Raises
TypeError – If scalar_range.dtype is not an integral type.
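To make the scalar_range example concrete, the helper below checks whether a sequence is a permutation of an inclusive integer range. is_permutation is a hypothetical function written for illustration. It is not part of CompilerGym and does not reflect how the Permutation space performs membership checks.

```python
from typing import Sequence


def is_permutation(values: Sequence[int], low: int, high: int) -> bool:
    """Return True if values contains each integer in [low, high] exactly once."""
    return sorted(values) == list(range(low, high + 1))


# The scalar range [1, 3] from the parameter description above.
valid = is_permutation([2, 1, 3], low=1, high=3)
repeated = is_permutation([1, 1, 3], low=1, high=3)
too_short = is_permutation([1, 2], low=1, high=3)
```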
Reward
- class compiler_gym.spaces.Reward(name: str, observation_spaces: Optional[List[str]] = None, default_value: float = 0, min: Optional[float] = None, max: Optional[float] = None, default_negates_returns: bool = False, success_threshold: Optional[float] = None, deterministic: bool = False, platform_dependent: bool = True)[source]
An extension of the Scalar space that is used for computing a reward signal.
A Reward is a scalar value used to determine the reward for a particular action. An instance of Reward is used to represent the reward function for a particular episode. For every env.step() of the environment, the reward.update() method is called to produce a new incremental reward.
Environments provide implementations of Reward that compute reward signals based on observation values computed by the backend service.
- __init__(name: str, observation_spaces: Optional[List[str]] = None, default_value: float = 0, min: Optional[float] = None, max: Optional[float] = None, default_negates_returns: bool = False, success_threshold: Optional[float] = None, deterministic: bool = False, platform_dependent: bool = True)[source]
Constructor.
- Parameters
name – The name of the reward space. This is a unique name used to represent the reward.
observation_spaces – A list of observation space IDs (space.id values) that are used to compute the reward. May be an empty list if no observations are requested. Requested observations will be provided to the observations argument of reward.update().
default_value – A default reward. This value will be returned by env.step() if the service terminates.
min – The lower bound of the reward.
max – The upper bound of the reward.
default_negates_returns – If true, the default value will be offset by the sum of all rewards for the current episode. For example, given a default reward value of -10.0 and an episode with prior rewards [0.1, 0.3, -0.15], the default value is: -10.0 - sum([0.1, 0.3, -0.15]) = -10.25.
success_threshold – The cumulative reward threshold before an episode is considered successful. For example, episodes where reward is scaled to an existing heuristic can be considered “successful” when the reward exceeds the existing heuristic.
deterministic – Whether the reward space is deterministic.
platform_dependent – Whether the reward values depend on the execution environment of the service.
- property range: Tuple[float, float]
The lower and upper bounds of the reward.
- reset(benchmark: str, observation_view: ObservationView) None [source]
Reset the reward space. This is called on env.reset().
- Parameters
benchmark – The URI of the benchmark that is used for this episode.
observation_view – An observation view for reward initialization.
- reward_on_error(episode_reward: float) float [source]
Return the reward value for an error condition.
This method should be used to produce the reward value that should be used if the compiler service cannot be reached, e.g. because it has crashed or the connection has dropped.
- Parameters
episode_reward – The current cumulative reward of an episode.
- Returns
A reward.
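The interaction between default_value, default_negates_returns, and episode_reward can be written out as a short function. This is a sketch based only on the parameter descriptions in this section. The free function reward_on_error_sketch is a hypothetical name, and the real method is defined on the Reward class.

```python
def reward_on_error_sketch(
    default_value: float,
    default_negates_returns: bool,
    episode_reward: float,
) -> float:
    """Return the error reward, optionally cancelling the episode's returns."""
    if default_negates_returns:
        # Offset the default by the cumulative reward earned so far.
        return default_value - episode_reward
    return default_value


# Values taken from the default_negates_returns example in this section.
prior_rewards = [0.1, 0.3, -0.15]
penalty = reward_on_error_sketch(
    default_value=-10.0,
    default_negates_returns=True,
    episode_reward=sum(prior_rewards),
)
plain = reward_on_error_sketch(-10.0, False, sum(prior_rewards))
```

With default_negates_returns set, the episode's total return after the error equals default_value regardless of what was earned before the failure.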
- update(actions: List[ActionType], observations: List[ObservationType], observation_view: ObservationView) float [source]
Calculate a reward for the given actions.
- Parameters
actions – The actions performed.
observations – A list of observation values as requested by the observation_spaces constructor argument.
observation_view – The ObservationView instance.
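The reset() and update() protocol can be sketched with a stand-in that does not depend on CompilerGym. CostDeltaRewardSketch is a hypothetical class, and "CodeSize" is a made-up observation space name. The sketch assumes the observation view can be indexed by space name, which a plain dict imitates here. A real reward would subclass Reward and receive an ObservationView.

```python
from typing import Any, List, Mapping, Optional


class CostDeltaRewardSketch:
    """Illustrative reward: the reduction in a cost observation per step."""

    def __init__(self, cost_space: str):
        # Observations requested here are passed to update() in this order.
        self.observation_spaces = [cost_space]
        self.previous_cost: Optional[float] = None

    def reset(self, benchmark: str, observation_view: Mapping[str, Any]) -> None:
        # Record the starting cost at the beginning of the episode.
        self.previous_cost = float(observation_view[self.observation_spaces[0]])

    def update(
        self,
        actions: List[Any],
        observations: List[Any],
        observation_view: Mapping[str, Any],
    ) -> float:
        cost = float(observations[0])
        reward = self.previous_cost - cost  # positive when the cost drops
        self.previous_cost = cost
        return reward


reward = CostDeltaRewardSketch("CodeSize")
view = {"CodeSize": 100}  # dict standing in for an ObservationView
reward.reset(benchmark="benchmark://example", observation_view=view)
first = reward.update(actions=[0], observations=[90], observation_view=view)
second = reward.update(actions=[1], observations=[95], observation_view=view)
```

Each call returns an incremental reward, so summing the per-step values gives the total cost reduction over the episode.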
Scalar
- class compiler_gym.spaces.Scalar(name: str, min: ~typing.Optional[float] = None, max: ~typing.Optional[float] = None, dtype=<class 'numpy.float64'>)[source]
A scalar value.
- __init__(name: str, min: ~typing.Optional[float] = None, max: ~typing.Optional[float] = None, dtype=<class 'numpy.float64'>)[source]
Constructor.
- Parameters
name – The name of the space.
min – The lower bound for a value in this space. If None, there is no lower bound.
max – The upper bound for a value in this space. If None, there is no upper bound.
dtype – The type of this scalar.
SpaceSequence
- class compiler_gym.spaces.SpaceSequence(name: str, space: Space, size_range: Tuple[int, Optional[int]] = (0, None))[source]
Variable-length sequence of subspaces that have the same definition.
Sequence
- class compiler_gym.spaces.Sequence(name: str, size_range: ~typing.Tuple[int, ~typing.Optional[int]] = (0, None), dtype=<class 'bytes'>, opaque_data_format: ~typing.Optional[str] = None, scalar_range: ~typing.Optional[~compiler_gym.spaces.scalar.Scalar] = None)[source]
A sequence of values. Each element of the sequence is of dtype. The length of the sequence is bounded by size_range.
Example:
>>> space = Sequence(size_range=(0, None), dtype=str)
>>> space.contains("Hello, world!")
True
>>> space = Sequence(size_range=(256, 256), dtype=bytes)
>>> space.contains("Hello, world!")
False
- Variables
size_range – A tuple indicating the (lower, upper) bounds for sequence lengths. An upper bound of None means no upper bound. All sequences must have a lower bound of length >= 0.
dtype – The data type for each element in a sequence.
opaque_data_format – An optional string describing an opaque data format, e.g. a data structure that is serialized to a string/binary array for transmission to the client. It is up to the client and service to agree on how to decode observations using this value. For example, an opaque_data_format of string_json could be used to indicate that the observation is a string-serialized JSON value.
- __init__(name: str, size_range: ~typing.Tuple[int, ~typing.Optional[int]] = (0, None), dtype=<class 'bytes'>, opaque_data_format: ~typing.Optional[str] = None, scalar_range: ~typing.Optional[~compiler_gym.spaces.scalar.Scalar] = None)[source]
Constructor.
- Parameters
name – The name of the space.
size_range – A tuple indicating the (lower, upper) bounds for sequence lengths. An upper bound of None means no upper bound. All sequences must have a lower bound of length >= 0.
dtype – The data type for each element in a sequence.
opaque_data_format – An optional string describing an opaque data format, e.g. a data structure that is serialized to a string/binary array for transmission to the client. It is up to the client and service to agree on how to decode observations using this value. For example, an opaque_data_format of string_json could be used to indicate that the observation is a string-serialized JSON value.
scalar_range – If specified, this denotes the legal range of each element in the sequence. This is enforced by contains() checks.
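The way size_range, dtype, and scalar_range combine in a membership check can be sketched as a plain function. sequence_contains is a hypothetical helper written from the descriptions above, not the library's contains() implementation. It assumes scalar_range is given as an inclusive (low, high) tuple rather than a Scalar space.

```python
from typing import Any, Optional, Tuple


def sequence_contains(
    value: Any,
    size_range: Tuple[int, Optional[int]] = (0, None),
    dtype: type = bytes,
    scalar_range: Optional[Tuple[float, float]] = None,
) -> bool:
    """Return True if value satisfies the length, type, and range constraints."""
    lower, upper = size_range
    if len(value) < lower or (upper is not None and len(value) > upper):
        return False
    if dtype in (str, bytes):
        # Strings and bytes are checked as a whole, not per element.
        return isinstance(value, dtype)
    if not all(isinstance(element, dtype) for element in value):
        return False
    if scalar_range is not None:
        low, high = scalar_range
        return all(low <= element <= high for element in value)
    return True


unbounded_text = sequence_contains("Hello, world!", (0, None), str)
fixed_bytes = sequence_contains("Hello, world!", (256, 256), bytes)
in_range = sequence_contains([1, 2, 3], (0, None), int, scalar_range=(0, 5))
out_of_range = sequence_contains([1, 9], (0, None), int, scalar_range=(0, 5))
```

The first two calls correspond to the two Sequence examples shown earlier in this section.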