compiler_gym.wrappers
The compiler_gym.wrappers module provides a set of classes that can be used to transform an environment in a modular way.
For example:
>>> env = compiler_gym.make("llvm-v0")
>>> env = TimeLimit(env, max_episode_steps=10)
>>> env = CycleOverBenchmarks(
... env,
... benchmarks=[
... "benchmark://cbench-v1/crc32",
... "benchmark://cbench-v1/qsort",
... ],
... )
Warning
CompilerGym environments are incompatible with the OpenAI Gym wrappers. This is because CompilerGym extends the environment API with additional arguments and methods. You must use the wrappers from this module when wrapping CompilerGym environments. We provide a set of base wrappers that are equivalent to those in OpenAI Gym that you can use to write your own wrappers.
Base wrappers
- class compiler_gym.wrappers.CompilerEnvWrapper(env: CompilerEnv)[source]
Wraps a CompilerEnv environment to allow a modular transformation.
This class is the base class for all wrappers. This class must be used rather than gym.Wrapper to support the CompilerGym API extensions such as the fork() method.
- __init__(env: CompilerEnv)[source]
Constructor.
- Parameters
env – The environment to wrap.
- Raises
TypeError – If env is not a CompilerEnv.
- class compiler_gym.wrappers.ActionWrapper(env: CompilerEnv)[source]
Wraps a CompilerEnv environment to allow an action space transformation.
- class compiler_gym.wrappers.ObservationWrapper(env: CompilerEnv)[source]
Wraps a CompilerEnv environment to allow an observation space transformation.
- observation()
A view of the available observation spaces that permits on-demand computation of observations.
- class compiler_gym.wrappers.RewardWrapper(env: CompilerEnv)[source]
Wraps a CompilerEnv environment to allow a reward space transformation.
- reward()
A view of the available reward spaces that permits on-demand computation of rewards.
Action space wrappers
- class compiler_gym.wrappers.CommandlineWithTerminalAction(env: CompilerEnv, terminal=CommandlineFlag(name='end-of-episode', flag='# end-of-episode', description='End the episode'))[source]
Creates a new action space with a special “end of episode” terminal action at the start. If step() is called with it, the “done” flag is set.
- __init__(env: CompilerEnv, terminal=CommandlineFlag(name='end-of-episode', flag='# end-of-episode', description='End the episode'))[source]
Constructor.
- Parameters
env – The environment to wrap.
terminal – The flag to use as the terminal action. Optional.
- class compiler_gym.wrappers.ConstrainedCommandline(env: CompilerEnv, flags: Iterable[str], name: Optional[str] = None)[source]
Constrains a Commandline action space to a subset of the original space’s flags.
- __init__(env: CompilerEnv, flags: Iterable[str], name: Optional[str] = None)[source]
Constructor.
- Parameters
env – The environment to wrap.
flags – A list of entries from env.action_space.flags denoting flags that are available in this wrapped environment.
name – The name of the new action space.
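The index translation that constraining an action space implies can be sketched in plain Python, without CompilerGym installed. The flag list and the `build_action_map` and `to_original_action` helpers below are illustrative assumptions, not the library's implementation.

```python
from typing import Iterable, List

# Hypothetical stand-in for env.action_space.flags; a real Commandline
# space exposes many more flags.
ORIGINAL_FLAGS: List[str] = ["-adce", "-gvn", "-instcombine", "-mem2reg"]


def build_action_map(original: List[str], subset: Iterable[str]) -> List[int]:
    """Map each index in the constrained space to an index in the original."""
    # list.index() raises ValueError if a flag is not in the original space.
    return [original.index(flag) for flag in subset]


# The constrained space offers only two of the four flags.
action_map = build_action_map(ORIGINAL_FLAGS, ["-mem2reg", "-gvn"])


def to_original_action(constrained_action: int) -> int:
    """Translate an action chosen in the constrained space."""
    return action_map[constrained_action]
```

In this sketch, action 0 of the constrained space corresponds to index 3 ("-mem2reg") of the original space.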
- class compiler_gym.wrappers.TimeLimit(env: CompilerEnv, max_episode_steps: Optional[int] = None)[source]
A step-limited wrapper that is compatible with CompilerGym.
Example usage:
>>> env = TimeLimit(env, max_episode_steps=3)
>>> env.reset()
>>> _, _, done, _ = env.step(0)
>>> _, _, done, _ = env.step(0)
>>> _, _, done, _ = env.step(0)
>>> done
True
- __init__(env: CompilerEnv, max_episode_steps: Optional[int] = None)[source]
Constructor.
- Parameters
env – The environment to wrap.
- Raises
TypeError – If env is not a CompilerEnv.
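The step-limiting behaviour can be illustrated with a minimal, self-contained sketch. `ToyEnv` and `ToyTimeLimit` are hypothetical stand-ins written for this example; they are not CompilerGym classes and make no claim about how TimeLimit is implemented.

```python
class ToyEnv:
    """Hypothetical stand-in environment; not a CompilerEnv."""

    def reset(self):
        return 0

    def step(self, action):
        # Returns (observation, reward, done, info) and never ends on its own.
        return 0, 0.0, False, {}


class ToyTimeLimit:
    """Sketch of a step limit: report done after max_episode_steps steps."""

    def __init__(self, env, max_episode_steps):
        self.env = env
        self.max_episode_steps = max_episode_steps
        self.elapsed_steps = 0

    def reset(self):
        # A new episode starts with a fresh step budget.
        self.elapsed_steps = 0
        return self.env.reset()

    def step(self, action):
        observation, reward, done, info = self.env.step(action)
        self.elapsed_steps += 1
        if self.elapsed_steps >= self.max_episode_steps:
            done = True
        return observation, reward, done, info


env = ToyTimeLimit(ToyEnv(), max_episode_steps=3)
env.reset()
dones = [env.step(0)[2] for _ in range(3)]
```

Only the third step reports done, mirroring the usage example above.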
- class compiler_gym.wrappers.ForkOnStep(env: CompilerEnv)[source]
A wrapper that creates a fork of the environment before every step.
This wrapper creates a new fork of the environment before every call to env.step(). Because of this, this environment supports an additional env.undo() method that can be used to backtrack.
Example usage:
>>> env = ForkOnStep(compiler_gym.make("llvm-v0"))
>>> env.step(0)
>>> env.actions
[0]
>>> env.undo()
>>> env.actions
[]
- Variables
stack (List[CompilerEnv]) – A fork of the environment before every previous call to env.step(), ordered oldest to newest.
- __init__(env: CompilerEnv)[source]
Constructor.
- Parameters
env – The environment to wrap.
- undo() CompilerEnv [source]
Undo the previous action.
- Returns
Self.
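The stack-of-forks idea behind undo() can be sketched with a toy environment. `ToyEnv` and `ToyForkOnStep` are hypothetical classes for illustration only. Making undo() a no-op on an empty stack is an assumption of this sketch, not documented behaviour.

```python
from typing import List, Optional


class ToyEnv:
    """Hypothetical stand-in with only the pieces the sketch needs."""

    def __init__(self, actions: Optional[List[int]] = None):
        self.actions: List[int] = list(actions or [])

    def fork(self) -> "ToyEnv":
        # A fork is an independent copy of the current state.
        return ToyEnv(self.actions)

    def step(self, action: int) -> None:
        self.actions.append(action)


class ToyForkOnStep:
    """Sketch: push a fork before each step, pop it to undo."""

    def __init__(self, env: ToyEnv):
        self.env = env
        self.stack: List[ToyEnv] = []  # Oldest fork first.

    def step(self, action: int) -> None:
        self.stack.append(self.env.fork())
        self.env.step(action)

    def undo(self) -> "ToyForkOnStep":
        if self.stack:  # Assumption: undo on an empty stack does nothing.
            self.env = self.stack.pop()
        return self

    @property
    def actions(self) -> List[int]:
        return self.env.actions


env = ToyForkOnStep(ToyEnv())
env.step(0)
after_step = list(env.actions)
env.undo()
after_undo = list(env.actions)
```

Each step costs one fork, so memory use in this sketch grows with episode length.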
Datasets wrappers
- class compiler_gym.wrappers.IterateOverBenchmarks(env: CompilerEnv, benchmarks: Iterable[Union[str, Benchmark]], fork_shares_iterator: bool = False)[source]
Iterate over a (possibly infinite) sequence of benchmarks on each call to reset(). Will raise StopIteration on reset() once the iterator is exhausted. Use CycleOverBenchmarks or RandomOrderBenchmarks for wrappers which will loop over the benchmarks.
- __init__(env: CompilerEnv, benchmarks: Iterable[Union[str, Benchmark]], fork_shares_iterator: bool = False)[source]
Constructor.
- Parameters
env – The environment to wrap.
benchmarks – An iterable sequence of benchmarks.
fork_shares_iterator – If True, the benchmarks iterator will be shared by a forked environment created by env.fork(). This means that calling env.reset() with one environment will advance the iterator in the other. If False, forked environments will use itertools.tee() to create a copy of the iterator so that each iterator may advance independently. However, this requires shared buffers between the environments which can lead to memory overheads if env.reset() is called many times more in one environment than the other.
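The difference between a shared iterator and a tee()-copied one can be seen with the standard library alone. The benchmark names below are placeholders, and the variable names are only for illustration.

```python
import itertools

benchmarks = iter(["crc32", "qsort", "sha"])  # Placeholder benchmark names.

# Shared iterator (as with fork_shares_iterator=True): both names refer to
# the same object, so advancing one advances the other.
parent_shared = benchmarks
fork_shared = benchmarks
first = next(parent_shared)
second = next(fork_shared)

# Independent copies (as with fork_shares_iterator=False): tee() buffers
# items so that each copy sees the full remaining sequence.
parent_copy, fork_copy = itertools.tee(benchmarks)
parent_next = next(parent_copy)
fork_next = next(fork_copy)
```

The buffering in tee() is where the memory overhead comes from: items consumed by one copy are held until the other copy has consumed them too.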
- class compiler_gym.wrappers.CycleOverBenchmarks(env: CompilerEnv, benchmarks: Iterable[Union[str, Benchmark]], fork_shares_iterator: bool = False)[source]
Cycle through a list of benchmarks on each call to reset(). Same as IterateOverBenchmarks except the list of benchmarks repeats once exhausted.
- __init__(env: CompilerEnv, benchmarks: Iterable[Union[str, Benchmark]], fork_shares_iterator: bool = False)[source]
Constructor.
- Parameters
env – The environment to wrap.
benchmarks – An iterable sequence of benchmarks.
fork_shares_iterator – If True, the benchmarks iterator will be shared by a forked environment created by env.fork(). This means that calling env.reset() with one environment will advance the iterator in the other. If False, forked environments will use itertools.tee() to create a copy of the iterator so that each iterator may advance independently. However, this requires shared buffers between the environments which can lead to memory overheads if env.reset() is called many times more in one environment than the other.
- class compiler_gym.wrappers.CycleOverBenchmarksIterator(env: CompilerEnv, make_benchmark_iterator: Callable[[], Iterable[Union[str, Benchmark]]])[source]
Same as CycleOverBenchmarks except that the user generates the iterator.
- __init__(env: CompilerEnv, make_benchmark_iterator: Callable[[], Iterable[Union[str, Benchmark]]])[source]
Constructor.
- Parameters
env – The environment to wrap.
make_benchmark_iterator – A callback that returns an iterator over a sequence of benchmarks. Once the iterator is exhausted, this callback is called to produce a new iterator.
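The "call the callback again once the iterator is exhausted" behaviour can be sketched as a plain generator. `cycle_with_factory` is a hypothetical helper for illustration, and the empty-iterator guard is a choice of this sketch, not documented behaviour.

```python
from typing import Callable, Iterable, Iterator


def cycle_with_factory(
    make_iterator: Callable[[], Iterable[str]],
) -> Iterator[str]:
    """Yield items forever, asking the callback for a new iterator each pass."""
    while True:
        produced = False
        for item in make_iterator():
            produced = True
            yield item
        if not produced:
            return  # Guard against an empty iterator looping forever.


calls = 0


def make_benchmark_iterator() -> Iterable[str]:
    """Hypothetical callback; counts how often it is invoked."""
    global calls
    calls += 1
    return iter(["crc32", "qsort"])


stream = cycle_with_factory(make_benchmark_iterator)
first_five = [next(stream) for _ in range(5)]
```

Because a fresh iterator is requested on every pass, the callback may return a different sequence each time, for example a reshuffled one.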
- class compiler_gym.wrappers.RandomOrderBenchmarks(env: CompilerEnv, benchmarks: Iterable[Union[str, Benchmark]], rng: Optional[Generator] = None)[source]
Select randomly from a list of benchmarks on each call to reset().
Note
Uniform random selection is provided by evaluating the input benchmarks iterator into a list and sampling randomly from the list. For very large or infinite iterables of benchmarks you must use the IterateOverBenchmarks wrapper with your own random sampling iterator.
- __init__(env: CompilerEnv, benchmarks: Iterable[Union[str, Benchmark]], rng: Optional[Generator] = None)[source]
Constructor.
- Parameters
env – The environment to wrap.
benchmarks – An iterable sequence of benchmarks. The entirety of this input iterator is evaluated during construction.
rng – A random number generator to use for random benchmark selection.
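The evaluate-then-sample strategy described in the note can be sketched with the standard library. Here `random.Random` stands in for the rng argument, whose exact type the signature above does not spell out, and `random_order` is a hypothetical helper.

```python
import random
from typing import Iterable, Iterator, List


def random_order(benchmarks: Iterable[str], rng: random.Random) -> Iterator[str]:
    """Materialize the input once, then sample uniformly forever."""
    # Evaluated eagerly, when random_order() is called. This is why very
    # large or infinite inputs are unsuitable.
    pool: List[str] = list(benchmarks)

    def sample_forever() -> Iterator[str]:
        while True:
            yield rng.choice(pool)

    return sample_forever()


source = (name for name in ["crc32", "qsort", "sha"])  # One-shot generator.
stream = random_order(source, random.Random(0))
samples = [next(stream) for _ in range(10)]
```

Seeding the generator makes the sampled order reproducible, which is useful when comparing runs.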
LLVM Environment wrappers
- class compiler_gym.wrappers.RuntimePointEstimateReward(env: LlvmEnv, runtime_count: int = 30, warmup_count: int = 0, estimator: Callable[[Iterable[float]], float] = <function median>)[source]
LLVM wrapper that uses a point estimate of program runtime as reward.
This class wraps an LLVM environment and registers a new runtime reward space. Runtime is estimated from one or more runtime measurements, after optionally running one or more warmup runs. At each step, reward is the change in runtime estimate from the runtime estimate at the previous step.
- __init__(env: LlvmEnv, runtime_count: int = 30, warmup_count: int = 0, estimator: Callable[[Iterable[float]], float] = <function median>)[source]
Constructor.
- Parameters
env – The environment to wrap.
runtime_count – The number of times to execute the binary when estimating the runtime.
warmup_count – The number of warmup runs of the binary to perform before measuring the runtime.
estimator – A function that takes a list of runtime measurements and produces a point estimate.
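How warmup runs, repeated measurements, and an estimator combine into one number can be sketched as follows. `estimate_runtime`, the fake measurements, and the sign of the reward (positive when the program got faster) are all assumptions of this example, not taken from the library.

```python
from statistics import median
from typing import Callable, Iterable, Iterator


def estimate_runtime(
    measure: Callable[[], float],
    runtime_count: int = 30,
    warmup_count: int = 0,
    estimator: Callable[[Iterable[float]], float] = median,
) -> float:
    """Discard warmup runs, then reduce the measured runs to one number."""
    for _ in range(warmup_count):
        measure()  # Result intentionally ignored.
    return estimator([measure() for _ in range(runtime_count)])


# Hypothetical measurements: one slow warmup run, then three timed runs.
fake_runs: Iterator[float] = iter([9.0, 2.0, 4.0, 3.0])
previous = 5.0  # Assumed estimate from the previous step.
current = estimate_runtime(lambda: next(fake_runs), runtime_count=3, warmup_count=1)

# Sign convention is an assumption: positive when the program got faster.
reward = previous - current
```

Swapping the estimator, for example for `min` or `statistics.mean`, changes how sensitive the estimate is to outlier measurements.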
- class compiler_gym.wrappers.SynchronousSqliteLogger(env: LlvmEnv, db_path: Path, commit_frequency_in_seconds: int = 300, max_step_buffer_length: int = 5000)[source]
A wrapper for an LLVM environment that logs all transitions to an sqlite database.
Wrap an existing LLVM environment and then use it as per normal:
>>> env = SynchronousSqliteLogger(
...     env=gym.make("llvm-autophase-ic-v0"),
...     db_path="example.db",
... )
Connect to the database file you specified:
$ sqlite3 example.db
There are two tables:
States: records every unique combination of benchmark + actions. For each entry, records an identifying state ID, the episode reward, and whether the episode is terminated:
sqlite> .mode markdown
sqlite> .headers on
sqlite> select * from States limit 5;
| benchmark_uri            | done | ir_instruction_count_oz_reward | state_id                                 | actions        |
|--------------------------|------|--------------------------------|------------------------------------------|----------------|
| generator://csmith-v0/99 | 0    | 0.0                            | d625b874e58f6d357b816e21871297ac5c001cf0 |                |
| generator://csmith-v0/99 | 0    | 0.0                            | d625b874e58f6d357b816e21871297ac5c001cf0 | 31             |
| generator://csmith-v0/99 | 0    | 0.0                            | 52f7142ef606d8b1dec2ff3371c7452c8d7b81ea | 31 116         |
| generator://csmith-v0/99 | 0    | 0.268005818128586              | d8c05bd41b7a6c6157b6a8f0f5093907c7cc7ecf | 31 116 103     |
| generator://csmith-v0/99 | 0    | 0.288621664047241              | c4d7ecd3807793a0d8bc281104c7f5a8aa4670f9 | 31 116 103 109 |
Observations: records pickled, compressed, and text observation values for each unique state.
Caveats of this implementation:
Only LlvmEnv environments may be wrapped.
The wrapped environment must have an observation space and reward space set.
The observation spaces and reward spaces that are logged to database are hardcoded. To change what is recorded, you must copy and modify this implementation.
Writing to the database is synchronous and adds significant overhead to the compute cost of the environment.
- __init__(env: LlvmEnv, db_path: Path, commit_frequency_in_seconds: int = 300, max_step_buffer_length: int = 5000)[source]
Constructor.
- Parameters
env – The environment to wrap.
db_path – The path of the database to log to. This file may already exist. If it does, new entries are appended. If the file does not exist, it is created.
commit_frequency_in_seconds – The maximum amount of time to elapse before writing pending logs to the database.
max_step_buffer_length – The maximum number of calls to step() before writing pending logs to the database.
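The interplay of the two flush triggers, elapsed time and buffer length, can be sketched with the standard sqlite3 module. `BufferedStateLog` and its two-column schema are hypothetical simplifications for illustration; they do not reproduce the wrapper's actual tables or code.

```python
import sqlite3
import time
from typing import List, Tuple


class BufferedStateLog:
    """Sketch of time- and size-triggered flushing; not the library's code."""

    def __init__(
        self,
        db_path: str,
        commit_frequency_in_seconds: float = 300,
        max_step_buffer_length: int = 5000,
    ):
        self.connection = sqlite3.connect(db_path)
        # Simplified, hypothetical schema with two columns only.
        self.connection.execute(
            "CREATE TABLE IF NOT EXISTS States (benchmark_uri TEXT, actions TEXT)"
        )
        self.commit_frequency = commit_frequency_in_seconds
        self.max_buffer = max_step_buffer_length
        self.buffer: List[Tuple[str, str]] = []
        self.last_commit = time.monotonic()

    def log(self, benchmark_uri: str, actions: str) -> None:
        self.buffer.append((benchmark_uri, actions))
        too_long = len(self.buffer) >= self.max_buffer
        too_old = time.monotonic() - self.last_commit >= self.commit_frequency
        if too_long or too_old:
            self.flush()

    def flush(self) -> None:
        # Write all pending rows in one transaction, then reset the timer.
        self.connection.executemany("INSERT INTO States VALUES (?, ?)", self.buffer)
        self.connection.commit()
        self.buffer.clear()
        self.last_commit = time.monotonic()

    def count(self) -> int:
        return self.connection.execute("SELECT COUNT(*) FROM States").fetchone()[0]


log = BufferedStateLog(":memory:", max_step_buffer_length=2)
log.log("generator://csmith-v0/99", "")
rows_before_flush = log.count()
log.log("generator://csmith-v0/99", "31")
rows_after_flush = log.count()
```

Larger buffers mean fewer commits and less overhead per step, at the cost of losing more pending rows if the process exits before a flush.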