compiler_gym.envs
The CompilerEnv environment is a drop-in replacement for the basic gym.Env class, with extended functionality for compilers. Some compiler services may further extend the functionality by subclassing from CompilerEnv. The following environment classes are available:
Environment Classes:
CompilerEnv
- class compiler_gym.envs.CompilerEnv[source]
An OpenAI gym environment for compiler optimizations.
The easiest way to create a CompilerGym environment is to call gym.make() on one of the registered environments:
>>> env = gym.make("llvm-v0")
See compiler_gym.COMPILER_GYM_ENVS for a list of registered environment names.
Alternatively, an environment can be constructed directly, such as by connecting to a running compiler service at localhost:8080 (see this document for more details):
>>> env = ClientServiceCompilerEnv(
...     service="localhost:8080",
...     observation_space="features",
...     reward_space="runtime",
...     rewards=[env_reward_spaces],
... )
Once constructed, an environment can be used in exactly the same way as a regular gym.Env, e.g.
>>> observation = env.reset()
>>> cumulative_reward = 0
>>> for i in range(100):
...     action = env.action_space.sample()
...     observation, reward, done, info = env.step(action)
...     cumulative_reward += reward
...     if done:
...         break
>>> print(f"Reward after {i + 1} steps: {cumulative_reward}")
Reward after 100 steps: -0.32123
- abstract __init__()[source]
Construct an environment.
Do not construct an environment directly. Use gym.make() on one of the registered environments:
>>> with gym.make("llvm-v0") as env:
...     pass  # Use environment
- abstract property action_space: Space
The current action space.
- Getter
Get the current action space.
- Setter
Set the action space to use. Must be an entry in action_spaces. If None, the default action space is selected.
- abstract property action_spaces: List[str]
A list of supported action space names.
- abstract apply(state: CompilerEnvState) None [source]
Replay this state on the given environment.
- Parameters
state – A CompilerEnvState instance.
- Raises
ValueError – If this state cannot be applied.
- abstract property benchmark: Benchmark
Get or set the benchmark to use.
- Getter
Get the Benchmark that is currently in use.
- Setter
Set the benchmark to use. Either a Benchmark instance, or the URI of a benchmark as in env.datasets.benchmark_uris().
Note
Setting a new benchmark has no effect until env.reset() is called.
- abstract close()[source]
Close the environment.
Once closed, reset() must be called before the environment is used again.
Note
You must make sure to call env.close() on a CompilerGym environment when you are done with it. This is needed to perform manual tidying up of temporary files and processes. See the FAQ for more details.
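One way to make sure close() always runs is a try/finally block. The sketch below uses a hypothetical StubEnv stand-in rather than a real CompilerGym environment, so the class and the run_episode helper are illustrative assumptions only.

```python
class StubEnv:
    """Hypothetical stand-in exposing only reset() and close()."""

    def __init__(self):
        self.in_episode = False
        self.close_calls = 0

    def reset(self):
        self.in_episode = True

    def close(self):
        self.in_episode = False
        self.close_calls += 1


def run_episode(env):
    """Use the environment, guaranteeing close() even if an error occurs."""
    try:
        env.reset()
        # ... step the environment here ...
    finally:
        env.close()


env = StubEnv()
run_episode(env)
print(env.close_calls)  # 1
```

The `with gym.make("llvm-v0") as env:` form shown for __init__() gives the same guarantee with less code.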
- abstract commandline() str [source]
Interface for CompilerEnv subclasses to provide an equivalent commandline invocation to the current environment state.
See also commandline_to_actions().
- Returns
A string commandline invocation.
- abstract commandline_to_actions(commandline: str) List[ActionType] [source]
Interface for CompilerEnv subclasses to convert from a commandline invocation to a sequence of actions.
See also commandline().
- Returns
A list of actions.
- abstract property compiler_version: str
The version string of the underlying compiler that this service supports.
- abstract property episode_reward: Optional[float]
If CompilerEnv.reward_space is set, this value is the sum of all rewards for the current episode.
- abstract property episode_walltime: float
Return the amount of time in seconds since the last call to reset().
- abstract fork() CompilerEnv [source]
Fork a new environment with exactly the same state.
This creates a duplicate environment instance with the current state. The new environment is entirely independent of the source environment. The user must call close() on the original and new environments.
If not already in an episode, reset() is called.
Example usage:
>>> env = gym.make("llvm-v0")
>>> env.reset()
# ... use env
>>> new_env = env.fork()
>>> new_env.state == env.state
True
>>> new_env.step(1) == env.step(1)
True
Note
The client/service implementation of CompilerGym means that the forked and base environments share a common backend resource. This means that if either of them crashes, such as due to a compiler assertion, both environments must be reset.
- Returns
A new environment instance.
- abstract property in_episode: bool
Whether the service is ready for step() to be called, i.e. reset() has been called and close() has not.
- Returns
True if in an episode, else False.
- abstract multistep(actions: Iterable[ActionType], observation_spaces: Optional[Iterable[Union[str, ObservationSpaceSpec]]] = None, reward_spaces: Optional[Iterable[Union[str, Reward]]] = None, observations: Optional[Iterable[Union[str, ObservationSpaceSpec]]] = None, rewards: Optional[Iterable[Union[str, Reward]]] = None, timeout: float = 300)[source]
Take a sequence of steps and return the final observation and reward.
- Parameters
actions – A sequence of actions to apply in order.
observation_spaces – A list of observation spaces to compute observations from. If provided, this changes the observation element of the return tuple to be a list of observations from the requested spaces. The default env.observation_space is not returned.
reward_spaces – A list of reward spaces to compute rewards from. If provided, this changes the reward element of the return tuple to be a list of rewards from the requested spaces. The default env.reward_space is not returned.
timeout – The maximum number of seconds to wait for the steps to succeed. Accepts a float value. The default is 300 seconds.
- Returns
A tuple of observation, reward, done, and info. Observation and reward are None if default observation/reward is not set.
- abstract property observation: ObservationView
A view of the available observation spaces that permits on-demand computation of observations.
- abstract property observation_space: Optional[Space]
The observation space that is used to return an observation value in step().
- Getter
Returns the underlying observation space, or None if not set.
- Setter
Set the default observation space.
- abstract render(mode='human') Optional[str] [source]
Render the environment.
- Parameters
mode – The render mode to use.
- Raises
TypeError – If a default observation space is not set, or if the requested render mode does not exist.
- abstract reset(benchmark: Optional[Union[str, Benchmark]] = None, action_space: Optional[str] = None, observation_space: Union[OptionalArgumentValue, str, ObservationSpaceSpec] = OptionalArgumentValue.UNCHANGED, reward_space: Union[OptionalArgumentValue, str, Reward] = OptionalArgumentValue.UNCHANGED, timeout: float = 300) Optional[ObservationType] [source]
Reset the environment state.
This method must be called before step().
- Parameters
benchmark – The name of the benchmark to use. If provided, it overrides any value that was set during __init__(), and subsequent calls to reset() will use this benchmark. If no benchmark is provided, and no benchmark was provided to __init__(), the service will randomly select a benchmark to use.
action_space – The name of the action space to use. If provided, it overrides any value that was set during __init__(), and subsequent calls to reset() will use this action space. If no action space is provided, the default action space is used.
observation_space – Compute and return observations at each step() from this space. Accepts a string name or an ObservationSpaceSpec. If None, step() returns None for the observation value. If OptionalArgumentValue.UNCHANGED (the default value), the observation space remains unchanged from the previous episode. For available spaces, see env.observation.spaces.
reward_space – Compute and return reward at each step() from this space. Accepts a string name or a Reward. If None, step() returns None for the reward value. If OptionalArgumentValue.UNCHANGED (the default value), the reward space remains unchanged from the previous episode. For available spaces, see env.reward.spaces.
timeout – The maximum number of seconds to wait for reset to succeed.
- Returns
The initial observation.
- Raises
BenchmarkInitError – If the benchmark is invalid. This can happen if the benchmark contains code that the compiler does not support, or because of some internal error within the compiler. In this case, another benchmark must be used.
TypeError – If no benchmark has been set, and the environment does not have a default benchmark to select from.
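The distinction between passing None and leaving an argument at OptionalArgumentValue.UNCHANGED is a sentinel-default pattern. The following is a minimal sketch of that pattern, not CompilerGym's implementation: ArgumentValue and SpaceHolder are hypothetical names introduced here for illustration.

```python
from enum import Enum
from typing import Optional, Union


class ArgumentValue(Enum):
    """Hypothetical sentinel, modelled on OptionalArgumentValue.UNCHANGED."""

    UNCHANGED = 1


class SpaceHolder:
    """Hypothetical holder for a default observation space name."""

    def __init__(self, observation_space: Optional[str] = None):
        self.observation_space = observation_space

    def reset(
        self,
        observation_space: Union[ArgumentValue, str, None] = ArgumentValue.UNCHANGED,
    ) -> Optional[str]:
        # Only overwrite the stored space when the caller passed a value,
        # which lets an explicit None mean "no observation".
        if observation_space is not ArgumentValue.UNCHANGED:
            self.observation_space = observation_space
        return self.observation_space


holder = SpaceHolder("Ir")
print(holder.reset())                        # Ir (unchanged from before)
print(holder.reset(observation_space=None))  # None (explicitly cleared)
print(holder.reset())                        # None (still cleared)
```

Using a dedicated sentinel rather than None as the default is what allows None itself to be a meaningful argument.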
- abstract property reward: RewardView
A view of the available reward spaces that permits on-demand computation of rewards.
- abstract property reward_range: Tuple[float, float]
A tuple indicating the range of reward values.
Default range is (-inf, +inf).
- abstract property reward_space: Optional[Reward]
The default reward space that is used to return a reward value from step().
- Getter
Returns a Reward, or None if not set.
- Setter
Set the default reward space.
- abstract property state: CompilerEnvState
The tuple representation of the current environment state.
- abstract step(action: ActionType, observation_spaces: Optional[Iterable[Union[str, ObservationSpaceSpec]]] = None, reward_spaces: Optional[Iterable[Union[str, Reward]]] = None, observations: Optional[Iterable[Union[str, ObservationSpaceSpec]]] = None, rewards: Optional[Iterable[Union[str, Reward]]] = None, timeout: float = 300) Tuple[Optional[Union[ObservationType, List[ObservationType]]], Optional[Union[float, List[float]]], bool, Dict[str, Any]] [source]
Take a step.
- Parameters
action – An action.
observation_spaces – A list of observation spaces to compute observations from. If provided, this changes the observation element of the return tuple to be a list of observations from the requested spaces. The default env.observation_space is not returned.
reward_spaces – A list of reward spaces to compute rewards from. If provided, this changes the reward element of the return tuple to be a list of rewards from the requested spaces. The default env.reward_space is not returned.
timeout – The maximum number of seconds to wait for the step to succeed. Accepts a float value. The default is 300 seconds.
- Returns
A tuple of observation, reward, done, and info. Observation and reward are None if default observation/reward is not set.
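The shape of the observation element described above can be sketched with a small stand-alone function. This is an illustration of the documented behavior under stated assumptions, not the library's code: select_observation and the compute callback are hypothetical, and the space names are used only as labels.

```python
from typing import Callable, Iterable, List, Optional, Union


def select_observation(
    compute: Callable[[str], str],
    default_space: Optional[str],
    observation_spaces: Optional[Iterable[str]] = None,
) -> Union[None, str, List[str]]:
    """Return a list for explicitly requested spaces, else the default value."""
    if observation_spaces is not None:
        # Requested spaces replace the default; one entry per space.
        return [compute(space) for space in observation_spaces]
    if default_space is None:
        # No default observation space is set.
        return None
    return compute(default_space)


def fake_compute(space: str) -> str:
    """Hypothetical observation function that just tags the space name."""
    return f"<{space}>"


print(select_observation(fake_compute, "Ir"))                     # <Ir>
print(select_observation(fake_compute, None))                     # None
print(select_observation(fake_compute, "Ir", ["Ir", "Runtime"]))  # ['<Ir>', '<Runtime>']
```

The reward element follows the same rule with reward_spaces.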
- abstract validate(state: Optional[CompilerEnvState] = None) ValidationResult [source]
Validate an environment’s state.
- Parameters
state – A state to validate. If not provided, the current state is validated.
- Returns
A ValidationResult instance.
- abstract property version: str
The version string of the compiler service.
LlvmEnv
- class compiler_gym.envs.LlvmEnv(*args, benchmark: Optional[Union[str, Benchmark]] = None, datasets_site_path: Optional[Path] = None, **kwargs)[source]
A specialized ClientServiceCompilerEnv for LLVM.
This extends the default ClientServiceCompilerEnv environment, adding extra LLVM functionality. Specifically, the actions use the CommandlineFlag space, which is a type of Discrete space that provides additional documentation about each action, and the LlvmEnv.commandline() method can be used to produce an equivalent LLVM opt invocation for the current environment state.
- commandline(textformat: bool = False) str [source]
Returns an LLVM opt command line invocation for the current environment state.
- Parameters
textformat – Whether to generate a command line that processes text-format LLVM-IR or bitcode (the default).
- Returns
A command line string.
- commandline_to_actions(commandline: str) List[int] [source]
Returns a list of actions from the given command line.
- Parameters
commandline – A command line invocation, as generated by env.commandline().
- Returns
A list of actions.
- Raises
ValueError – In case the command line string is malformed.
- fork()[source]
Fork a new environment with exactly the same state.
This creates a duplicate environment instance with the current state. The new environment is entirely independent of the source environment. The user must call close() on the original and new environments.
If not already in an episode, reset() is called.
Example usage:
>>> env = gym.make("llvm-v0")
>>> env.reset()
# ... use env
>>> new_env = env.fork()
>>> new_env.state == env.state
True
>>> new_env.step(1) == env.step(1)
True
Note
The client/service implementation of CompilerGym means that the forked and base environments share a common backend resource. This means that if either of them crashes, such as due to a compiler assertion, both environments must be reset.
- Returns
A new environment instance.
- property ir: str
Print the LLVM-IR of the program in its current state.
Alias for env.observation["Ir"].
- Returns
A string of LLVM-IR.
- property ir_sha1: str
Return the 40-character hex sha1 checksum of the current IR.
Equivalent to: hashlib.sha1(env.ir.encode("utf-8")).hexdigest().
- Returns
A 40-character hexadecimal sha1 string.
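As a minimal sketch of the equivalence stated above, the checksum can be reproduced with the standard library alone. The ir string here is a made-up stand-in for env.ir, not output from a real environment, and the ir_sha1 function is a local helper defined for this example.

```python
import hashlib

# Hypothetical stand-in for the string returned by env.ir.
ir = "define i32 @main() {\n  ret i32 0\n}\n"


def ir_sha1(ir_text: str) -> str:
    """Return the hex SHA-1 digest of UTF-8 encoded IR text."""
    return hashlib.sha1(ir_text.encode("utf-8")).hexdigest()


checksum = ir_sha1(ir)
print(len(checksum))  # 40: a SHA-1 hex digest is always 40 characters
```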
- make_benchmark(inputs: Union[str, Path, ClangInvocation, List[Union[str, Path, ClangInvocation]]], copt: Optional[List[str]] = None, system_includes: bool = True, timeout: int = 600) Benchmark [source]
Create a benchmark for use with this environment.
This function takes one or more inputs and uses them to create an LLVM bitcode benchmark that can be passed to compiler_gym.envs.LlvmEnv.reset().
The following input types are supported:
File Suffix | Treated as | Converted using
.bc | LLVM IR bitcode | No conversion required.
.ll | LLVM IR text format | Assembled to bitcode using llvm-as.
.c, .cc, .cpp, .cxx | C / C++ source | Compiled to bitcode using clang and the given copt.
Note
The LLVM IR format has no compatibility guarantees between versions (see LLVM docs). You must ensure that any .bc and .ll files are compatible with the LLVM version used by CompilerGym, which can be reported using env.compiler_version.
E.g. for single-source C/C++ programs, you can pass the path of the source file:
>>> env = gym.make("llvm-v0")
>>> benchmark = env.make_benchmark('my_app.c')
>>> env.reset(benchmark=benchmark)
The clang invocation used is roughly equivalent to:
$ clang my_app.c -O0 -c -emit-llvm -o benchmark.bc
Additional compile-time arguments to clang can be provided using the copt argument:
>>> benchmark = env.make_benchmark('/path/to/my_app.cpp', copt=['-O2'])
If you need more fine-grained control over the options, you can directly construct a ClangInvocation to pass a list of arguments to clang:
>>> benchmark = env.make_benchmark(
...     ClangInvocation(['/path/to/my_app.c'], system_includes=False, timeout=10)
... )
For multi-file programs, pass a list of inputs that will be compiled separately and then linked to a single module:
>>> benchmark = env.make_benchmark([
...     'main.c',
...     'lib.cpp',
...     'lib2.bc',
...     'foo/input.bc'
... ])
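The supported input types listed for make_benchmark() can be read as a dispatch on file suffix. The sketch below illustrates that reading only; classify_input and its return labels are hypothetical, and raising TypeError for an unknown suffix is an assumption rather than documented behavior.

```python
from pathlib import Path


def classify_input(path: str) -> str:
    """Map a file suffix to the conversion step listed for it."""
    suffix = Path(path).suffix
    if suffix == ".bc":
        return "none"     # LLVM IR bitcode, no conversion required
    if suffix == ".ll":
        return "llvm-as"  # LLVM IR text, assembled to bitcode
    if suffix in {".c", ".cc", ".cpp", ".cxx"}:
        return "clang"    # C / C++ source, compiled to bitcode
    # Assumption: unsupported suffixes are rejected.
    raise TypeError(f"Unsupported input type: {path}")


for name in ["main.c", "lib.cpp", "lib2.bc", "foo/input.bc"]:
    print(name, "->", classify_input(name))
```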
- Parameters
inputs – An input, or list of inputs.
copt – A list of command line options to pass to clang when compiling source files.
system_includes – Whether to include the system standard libraries during compilation jobs. This requires a system toolchain. See get_system_library_flags().
timeout – The maximum number of seconds to allow clang to run before terminating.
- Returns
A Benchmark instance.
- Raises
FileNotFoundError – If any input sources are not found.
TypeError – If the inputs are of unsupported types.
OSError – If a suitable compiler cannot be found.
BenchmarkInitError – If a compilation job fails.
TimeoutExpired – If a compilation job exceeds timeout seconds.
- make_benchmark_from_command_line(cmd: Union[str, List[str]], replace_driver: bool = True, system_includes: bool = True, timeout: int = 600) Benchmark [source]
Create a benchmark for use with this environment.
This function takes a command line compiler invocation as input, modifies it to produce an unoptimized LLVM-IR bitcode, and then runs the modified command line to produce a bitcode benchmark.
For example, the command line:
>>> benchmark = env.make_benchmark_from_command_line(
...     ["gcc", "-DNDEBUG", "a.c", "b.c", "-o", "foo", "-lm"]
... )
Will compile a.c and b.c to an unoptimized benchmark that can then be passed to reset().
The way this works is to change the first argument of the command line invocation to the version of clang shipped with CompilerGym, and to then append command line flags that cause the compiler to produce LLVM-IR with optimizations disabled. For example the input command line:
gcc -DNDEBUG a.c b.c -o foo -lm
Will be rewritten to be roughly equivalent to:
/path/to/compiler_gym/clang -DNDEBUG a.c b.c \
    -Xclang -disable-llvm-passes -Xclang -disable-llvm-optzns \
    -c -emit-llvm -o -
The generated benchmark then has a method compile() which completes the linking and compilation to executable. For the above example, this would be roughly equivalent to:
/path/to/compiler_gym/clang environment-bitcode.bc -o foo -lm
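The rewriting step described above can be approximated in a few lines. This is a simplified illustration and not the library's implementation: rewrite_command_line is a hypothetical helper, it only handles the separate "-o FILE" form and "-l" prefixed library flags, and the appended flags are copied from the rewritten command line shown above.

```python
import shlex
from typing import List, Union

# Flags copied from the rewritten command line in the text.
EMIT_UNOPTIMIZED_BITCODE = [
    "-Xclang", "-disable-llvm-passes",
    "-Xclang", "-disable-llvm-optzns",
    "-c", "-emit-llvm", "-o", "-",
]


def rewrite_command_line(cmd: Union[str, List[str]], clang: str) -> List[str]:
    """Swap the driver, drop output and link flags, then append flags
    that emit unoptimized LLVM-IR to stdout."""
    args = shlex.split(cmd) if isinstance(cmd, str) else list(cmd)
    if not args:
        raise ValueError("No command line provided")
    rewritten = [clang]
    skip_next = False
    for arg in args[1:]:
        if skip_next:
            skip_next = False  # this is the operand of "-o"; drop it
        elif arg == "-o":
            skip_next = True   # drop "-o" itself
        elif arg.startswith("-l"):
            continue           # drop linker library flags such as -lm
        else:
            rewritten.append(arg)
    return rewritten + EMIT_UNOPTIMIZED_BITCODE


result = rewrite_command_line(
    "gcc -DNDEBUG a.c b.c -o foo -lm", "/path/to/compiler_gym/clang"
)
print(" ".join(result))
```

The dropped "-o foo -lm" arguments are the ones that reappear in the later compile() step.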
- Parameters
cmd – A command line compiler invocation, either as a list of arguments (e.g. ["clang", "in.c"]) or as a single shell string (e.g. "clang in.c").
replace_driver – Whether to replace the first argument of the command with the clang driver used by this environment.
system_includes – Whether to include the system standard libraries during compilation jobs. This requires a system toolchain. See get_system_library_flags().
timeout – The maximum number of seconds to allow the compilation job to run before terminating.
- Returns
A BenchmarkFromCommandLine instance.
- Raises
ValueError – If no command line is provided.
BenchmarkInitError – If executing the command line fails.
TimeoutExpired – If a compilation job exceeds timeout seconds.
- render(mode='human') Optional[str] [source]
Render the environment.
ClientServiceCompilerEnv instances support two render modes: “human”, which prints the current environment state to the terminal and returns nothing; and “ansi”, which returns a string representation of the current environment state.
- Parameters
mode – The render mode to use.
- Raises
TypeError – If a default observation space is not set, or if the requested render mode does not exist.
- reset(*args, **kwargs)[source]
Reset the environment state.
This method must be called before step().
- Parameters
benchmark – The name of the benchmark to use. If provided, it overrides any value that was set during __init__(), and subsequent calls to reset() will use this benchmark. If no benchmark is provided, and no benchmark was provided to __init__(), the service will randomly select a benchmark to use.
action_space – The name of the action space to use. If provided, it overrides any value that was set during __init__(), and subsequent calls to reset() will use this action space. If no action space is provided, the default action space is used.
observation_space – Compute and return observations at each step() from this space. Accepts a string name or an ObservationSpaceSpec. If None, step() returns None for the observation value. If OptionalArgumentValue.UNCHANGED (the default value), the observation space remains unchanged from the previous episode. For available spaces, see env.observation.spaces.
reward_space – Compute and return reward at each step() from this space. Accepts a string name or a Reward. If None, step() returns None for the reward value. If OptionalArgumentValue.UNCHANGED (the default value), the reward space remains unchanged from the previous episode. For available spaces, see env.reward.spaces.
timeout – The maximum number of seconds to wait for reset to succeed.
- Returns
The initial observation.
- Raises
BenchmarkInitError – If the benchmark is invalid. This can happen if the benchmark contains code that the compiler does not support, or because of some internal error within the compiler. In this case, another benchmark must be used.
TypeError – If no benchmark has been set, and the environment does not have a default benchmark to select from.
- property runtime_observation_count: int
The number of runtimes to return for the Runtime observation space.
See the Runtime observation space reference for further details.
Example usage:
>>> env = compiler_gym.make("llvm-v0")
>>> env.reset()
>>> env.runtime_observation_count = 10
>>> len(env.observation.Runtime())
10
- Getter
Returns the number of runtimes that will be returned when a Runtime observation is requested.
- Setter
Set the number of runtimes to compute when a Runtime observation is requested.
- Type
int
- property runtime_warmup_runs_count: int
The number of warmup runs of the binary to perform before measuring the Runtime observation space.
See the Runtime observation space reference for further details.
Example usage:
>>> env = compiler_gym.make("llvm-v0")
>>> env.reset()
>>> env.runtime_observation_count = 10
>>> len(env.observation.Runtime())
10
- Getter
Returns the number of warmup runs that will be performed before a Runtime observation is measured.
- Setter
Set the number of warmup runs to perform when a Runtime observation is requested.
- Type
int
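The relationship between warmup runs and the number of returned runtimes can be illustrated with a generic timing loop. This is a sketch of the general technique only, not the measurement code used by the compiler service; measure_runtimes and its parameters are hypothetical names.

```python
import time
from typing import Callable, List


def measure_runtimes(
    run: Callable[[], None], warmup_runs: int, observation_count: int
) -> List[float]:
    """Discard warmup_runs executions, then time observation_count executions."""
    for _ in range(warmup_runs):
        run()  # warmup: the result is not recorded
    runtimes = []
    for _ in range(observation_count):
        start = time.perf_counter()
        run()
        runtimes.append(time.perf_counter() - start)
    return runtimes


calls = []
runtimes = measure_runtimes(lambda: calls.append(1), warmup_runs=2, observation_count=10)
print(len(runtimes))  # 10
print(len(calls))     # 12: two warmup runs plus ten measured runs
```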
GccEnv
- class compiler_gym.envs.GccEnv(*args, gcc_bin: Union[str, Path] = 'docker:gcc:11.2.0', benchmark: Union[str, Benchmark] = 'benchmark://chstone-v0/adpcm', datasets_site_path: Optional[Path] = None, connection_settings: Optional[ConnectionOpts] = None, **kwargs)[source]
A specialized ClientServiceCompilerEnv for GCC.
This class exposes the optimization space of GCC’s command line flags as an environment for reinforcement learning. For further details, see the GCC Environment Reference.
- property asm: str
Get the assembly code.
- property asm_hash: str
Get a hash of the assembly code.
- property asm_size: int
Get the assembly code size in bytes.
- property choices: List[int]
Get the current choices.
- commandline() str [source]
Return a string representing the command line options.
- Returns
A string.
- property instruction_counts: Dict[str, int]
Get a count of the instruction types in the assembly code.
Note that it will also count fields beginning with a ., like .bss and .align. Make sure to remove those if not needed.
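Removing the entries that begin with a dot is a one-line filter. The counts below are illustrative values only; in practice the dictionary would come from env.instruction_counts, and drop_directives is a local helper defined for this example.

```python
from typing import Dict


def drop_directives(counts: Dict[str, int]) -> Dict[str, int]:
    """Remove entries whose name starts with '.', such as .bss or .align."""
    return {name: n for name, n in counts.items() if not name.startswith(".")}


# Illustrative values only, not output from a real environment.
sample_counts = {".align": 4, ".bss": 1, "mov": 12, "add": 3, "ret": 1}
filtered = drop_directives(sample_counts)
print(filtered)  # {'mov': 12, 'add': 3, 'ret': 1}
```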
- property obj: bytes
Get the object code.
- property obj_hash: str
Get a hash of the object code.
- property obj_size: int
Get the object code size in bytes.
- reset(benchmark: Optional[Union[str, Benchmark]] = None, action_space: Optional[str] = None, observation_space: Union[OptionalArgumentValue, str, ObservationSpaceSpec] = OptionalArgumentValue.UNCHANGED, reward_space: Union[OptionalArgumentValue, str, Reward] = OptionalArgumentValue.UNCHANGED) Optional[ObservationType] [source]
Reset the environment state.
This method must be called before step().
- Parameters
benchmark – The name of the benchmark to use. If provided, it overrides any value that was set during __init__(), and subsequent calls to reset() will use this benchmark. If no benchmark is provided, and no benchmark was provided to __init__(), the service will randomly select a benchmark to use.
action_space – The name of the action space to use. If provided, it overrides any value that was set during __init__(), and subsequent calls to reset() will use this action space. If no action space is provided, the default action space is used.
observation_space – Compute and return observations at each step() from this space. Accepts a string name or an ObservationSpaceSpec. If None, step() returns None for the observation value. If OptionalArgumentValue.UNCHANGED (the default value), the observation space remains unchanged from the previous episode. For available spaces, see env.observation.spaces.
reward_space – Compute and return reward at each step() from this space. Accepts a string name or a Reward. If None, step() returns None for the reward value. If OptionalArgumentValue.UNCHANGED (the default value), the reward space remains unchanged from the previous episode. For available spaces, see env.reward.spaces.
timeout – The maximum number of seconds to wait for reset to succeed.
- Returns
The initial observation.
- Raises
BenchmarkInitError – If the benchmark is invalid. This can happen if the benchmark contains code that the compiler does not support, or because of some internal error within the compiler. In this case, another benchmark must be used.
TypeError – If no benchmark has been set, and the environment does not have a default benchmark to select from.
- property rtl: str
Get the final rtl of the program.
- property source: str
Get the source code.