compiler_gym

CompilerEnvState

class compiler_gym.CompilerEnvState(*, benchmark: str, commandline: str, walltime: float, reward: Optional[float] = None)[source]

The representation of a compiler environment state.

The state of an environment is defined as a benchmark and a sequence of actions that has been applied to it. For a given environment, the state contains the information required to reproduce the result.

benchmark: str

The URI of the benchmark used for this episode.

commandline: str

The list of actions that produced this state, as a commandline.

property has_reward: bool

Return whether the state has a reward value.

reward: Optional[float]

The cumulative reward for this episode. Optional.

walltime: float

The walltime of the episode in seconds. Must be non-negative.
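The field constraints above can be illustrated with a small stand-in. This is a minimal sketch, not CompilerGym's implementation: StateSketch is a hypothetical dataclass that mirrors the documented fields, the benchmark URI and command line are made up, and treating a reward of 0.0 as "has a reward" is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StateSketch:
    """Hypothetical stand-in mirroring the documented CompilerEnvState fields."""

    benchmark: str
    commandline: str
    walltime: float
    reward: Optional[float] = None

    def __post_init__(self) -> None:
        # The documentation requires a non-negative walltime.
        if self.walltime < 0:
            raise ValueError("walltime must be non-negative")

    @property
    def has_reward(self) -> bool:
        # Assumption: only None means "no reward"; 0.0 is a valid reward.
        return self.reward is not None


state = StateSketch(
    benchmark="benchmark://example-v0/foo",  # made-up URI
    commandline="opt -mem2reg input.bc -o output.bc",  # made-up command line
    walltime=1.5,
)
```

Passing every field by keyword, as here, matches the keyword-only constructor shown in the class signature.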

class compiler_gym.CompilerEnvStateWriter(f: TextIO, header: bool = True)[source]

Serialize compiler environment states to CSV.

Example use:

>>> with CompilerEnvStateWriter(open("results.csv", "w")) as writer:
...     writer.write_state(env.state)

__init__(f: TextIO, header: bool = True)[source]

Constructor.

Parameters
  • f – The file to write to.

  • header – Whether to include a header row.

write_state(state: CompilerEnvState, flush: bool = False) → None[source]

Write the state to file.

Parameters
  • state – A compiler environment state.

  • flush – Write to file immediately.
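Because f is typed as TextIO, a text-mode file object is expected. The following is a rough sketch of CSV serialization using only the standard library. It does not use CompilerEnvStateWriter itself: write_states is a hypothetical helper, and the column order is an assumption that may not match the real writer's layout.

```python
import csv
import io

# Assumed column order for illustration; the real writer's layout may differ.
COLUMNS = ["benchmark", "reward", "walltime", "commandline"]


def write_states(f, states, header=True):
    """Write dict-like states to a text-mode file object as CSV."""
    writer = csv.writer(f, lineterminator="\n")
    if header:
        writer.writerow(COLUMNS)
    for state in states:
        writer.writerow([state[name] for name in COLUMNS])
    # Push buffered rows to the underlying file, like flush=True would.
    f.flush()


buffer = io.StringIO()  # any text-mode file object works here
write_states(
    buffer,
    [
        {
            "benchmark": "benchmark://example-v0/foo",  # made-up URI
            "reward": 0.5,
            "walltime": 1.5,
            "commandline": "opt -mem2reg",  # made-up command line
        }
    ],
)
csv_text = buffer.getvalue()
```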

class compiler_gym.CompilerEnvStateReader(f: TextIO)[source]

Read states from a CSV file.

Example usage:

>>> with CompilerEnvStateReader(open("results.csv", "r")) as reader:
...     for state in reader:
...         print(state)

__init__(f: TextIO)[source]

Constructor.

Parameters

f – The file to read.

__iter__() → Iterable[CompilerEnvState][source]

Read the states from the file.

static read_paths(paths: Iterable[str]) → Iterable[CompilerEnvState][source]

Read states from a list of file paths.

Read states from stdin using the special path "-".

Parameters

paths – A list of paths.

Returns

A generator of compiler env states.
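The "-" convention can be sketched with the standard library alone. This is an illustration of the idea rather than the library's code: open_paths is a hypothetical helper that yields raw lines instead of parsed states, and stdin is simulated with an in-memory buffer.

```python
import io
import sys
from typing import Iterable, Iterator, TextIO


def open_paths(paths: Iterable[str], stdin: TextIO = sys.stdin) -> Iterator[str]:
    """Yield lines from each path, reading from stdin for the path "-"."""
    for path in paths:
        if path == "-":
            # "-" is the conventional placeholder for standard input.
            yield from stdin
        else:
            with open(path) as f:
                yield from f


# Simulate stdin with an in-memory buffer so the sketch runs anywhere.
fake_stdin = io.StringIO("first line\nsecond line\n")
lines = [line.rstrip("\n") for line in open_paths(["-"], stdin=fake_stdin)]
```

Returning a generator, as the real method is documented to do, means files are read lazily as the caller iterates.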

Validation

class compiler_gym.ValidationResult(*, state: CompilerEnvState, walltime: float, reward_validated: bool = False, actions_replay_failed: bool = False, reward_validation_failed: bool = False, benchmark_semantics_validated: bool = False, benchmark_semantics_validation_failed: bool = False, errors: List[ValidationError] = [])[source]

A tuple that represents the result of validating a compiler environment state.

actions_replay_failed: bool

Whether the commandline could not be reproduced.

benchmark_semantics_validated: bool

Whether the semantics of the benchmark were validated.

benchmark_semantics_validation_failed: bool

Whether the semantics of the benchmark were found to have changed.

property error_details: str

A summary description of the validation errors.

errors: List[ValidationError]

A list of ValidationError objects.

classmethod join(results: Iterable[ValidationResult])[source]

Create a validation result that is the union of multiple results.

okay() → bool[source]

Whether validation succeeded.

reward_validated: bool

Whether the reward that was recorded in the original state was validated.

reward_validation_failed: bool

Whether the validated reward differed from the original state.

state: CompilerEnvState

The compiler environment state that was validated.

walltime: float

The wall time in seconds that the validation took.
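One plausible reading of okay() and join() is sketched below. Both behaviors are assumptions, since the descriptions above do not spell them out: the sketch treats a result as okay when no failure flag is set, and treats a joined result as failing if any input failed. ResultSketch is a hypothetical stand-in that carries only the failure flags and plain-string errors.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class ResultSketch:
    """Hypothetical stand-in carrying only the documented failure flags."""

    actions_replay_failed: bool = False
    reward_validation_failed: bool = False
    benchmark_semantics_validation_failed: bool = False
    errors: List[str] = field(default_factory=list)

    def okay(self) -> bool:
        # Assumption: validation succeeded if no failure flag is set.
        return not (
            self.actions_replay_failed
            or self.reward_validation_failed
            or self.benchmark_semantics_validation_failed
        )

    @classmethod
    def join(cls, results: Iterable["ResultSketch"]) -> "ResultSketch":
        # Assumption: the union fails if any input failed and keeps all errors.
        joined = cls()
        for r in results:
            joined.actions_replay_failed |= r.actions_replay_failed
            joined.reward_validation_failed |= r.reward_validation_failed
            joined.benchmark_semantics_validation_failed |= (
                r.benchmark_semantics_validation_failed
            )
            joined.errors.extend(r.errors)
        return joined


passed = ResultSketch()
failed = ResultSketch(reward_validation_failed=True, errors=["Reward mismatch"])
combined = ResultSketch.join([passed, failed])
```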

class compiler_gym.ValidationError(*, type: str, data: Dict[str, Any] = {})[source]

A ValidationError describes an error encountered in a call to env.validate().

data: Dict[str, Any]

A JSON-serializable dictionary of data that further describes the error. This data dictionary can contain any information that may be relevant for diagnosing the underlying issue, such as a stack trace or an error line number. There is no specified schema for this data; validators are free to return whatever data they like. Setting this field is optional.

type: str

A short name describing the type of error that occurred. E.g. "Runtime crash".

compiler_gym.validate_states(make_env: Callable[[], CompilerEnv], states: Iterable[CompilerEnvState], nproc: Optional[int] = None, inorder: bool = False) → Iterable[ValidationResult][source]

A parallelized implementation of env.validate() for batched validation.

Parameters
  • make_env – A callback which instantiates a compiler environment.

  • states – A sequence of compiler environment states to validate.

  • nproc – The number of parallel worker processes to run.

  • inorder – Whether to return results in the order they were provided, or in the order that they are available.

Returns

An iterator over validation results. Unless inorder is set, the order of results may differ from the order of the input states.
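The trade-off behind inorder can be sketched with the standard library. This is an illustration of the two result orderings, not the library's implementation: validate and validate_all are hypothetical stand-ins, threads are used here instead of worker processes, and a short sleep simulates validation work.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def validate(delay: float) -> float:
    """Stand-in for validating one state; sleeps to simulate work."""
    time.sleep(delay)
    return delay


def validate_all(delays, nproc=None, inorder=False):
    """Yield results either in input order or in completion order."""
    with ThreadPoolExecutor(max_workers=nproc) as executor:
        if inorder:
            # map() yields results in input order, waiting on slow items.
            yield from executor.map(validate, delays)
        else:
            futures = [executor.submit(validate, d) for d in delays]
            # as_completed() yields each result as soon as it is ready.
            for future in as_completed(futures):
                yield future.result()


delays = [0.05, 0.0, 0.02]
ordered = list(validate_all(delays, nproc=3, inorder=True))
unordered = list(validate_all(delays, nproc=3, inorder=False))
```

In-order iteration is easier to correlate with the inputs, while completion-order iteration lets a consumer start on fast results without waiting for a slow one at the head of the queue.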

Filesystem Paths

compiler_gym.cache_path(relpath: str) → Path[source]

Return a path within the cache directory.

CompilerGym uses a directory to cache files in, such as downloaded content. The default location for this cache is ~/.local/cache/compiler_gym. Set the environment variable $COMPILER_GYM_CACHE to override this default location.

It is safe to delete this directory, so long as no CompilerGym environments are running.

No checks are made to ensure that the path, or the containing directory, exists.

Parameters

relpath – The relative path within the cache tree.

Returns

An absolute path.
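The override behavior described above can be sketched as follows. This is a minimal illustration, not the library's code: cache_path_sketch is a hypothetical helper, the environment is passed in as a mapping so the example is deterministic, and /tmp/cg is a made-up override location.

```python
import os
from pathlib import Path
from typing import Mapping


def cache_path_sketch(relpath: str, environ: Mapping[str, str] = os.environ) -> Path:
    """Resolve relpath under the cache root, honoring the override variable."""
    default = "~/.local/cache/compiler_gym"
    root = environ.get("COMPILER_GYM_CACHE", default)
    # expanduser() resolves "~"; nothing is created or checked on disk.
    return Path(root).expanduser() / relpath


overridden = cache_path_sketch("downloads", {"COMPILER_GYM_CACHE": "/tmp/cg"})
defaulted = cache_path_sketch("downloads", {})
```

As in the documented function, the sketch only computes a path; callers that need the directory to exist must create it themselves.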

compiler_gym.site_data_path(relpath: str) → Path[source]

Return a path within the site data directory.

CompilerGym uses a directory to store persistent site data files in, such as benchmark datasets. The default location is ~/.local/share/compiler_gym. Set the environment variable $COMPILER_GYM_SITE_DATA to override this default location.

No checks are made to ensure that the path, or the containing directory, exists.

Files in this directory are intended to be long lived (this is not a cache), but it is safe to delete this directory, so long as no CompilerGym environments are running.

Parameters

relpath – The relative path within the site data tree.

Returns

An absolute path.

compiler_gym.transient_cache_path(relpath: str) → Path[source]

Return a path within the transient cache directory.

The transient cache is a directory used to store files that do not need to persist beyond the lifetime of the current process. When available, the temporary filesystem /dev/shm will be used. Else, cache_path() is used as a fallback. Set the environment variable $COMPILER_GYM_TRANSIENT_CACHE to override the default location.

Files in this directory are not meant to outlive the lifespan of the CompilerGym environment that creates them. It is safe to delete this directory, so long as no CompilerGym environments are running.

No checks are made to ensure that the path, or the containing directory, exists.

Parameters

relpath – The relative path within the cache tree.

Returns

An absolute path.
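The precedence described above (explicit override, then the temporary filesystem, then the regular cache) can be sketched like this. It is an illustration under assumptions, not the library's logic: transient_root_sketch is a hypothetical helper, a temporary directory stands in for /dev/shm, and the override and fallback locations are made up.

```python
import tempfile
from pathlib import Path
from typing import Mapping


def transient_root_sketch(
    environ: Mapping[str, str], shm_dir: Path, fallback: Path
) -> Path:
    """Pick the transient cache root using the documented precedence."""
    override = environ.get("COMPILER_GYM_TRANSIENT_CACHE")
    if override:
        return Path(override)  # 1. explicit override
    if shm_dir.is_dir():
        return shm_dir  # 2. temporary filesystem, when available
    return fallback  # 3. fall back to the regular cache


with tempfile.TemporaryDirectory() as tmp:
    present = Path(tmp)  # stands in for an existing /dev/shm
    missing = present / "does-not-exist"  # stands in for a missing /dev/shm
    fallback = Path("/fallback/cache")  # made-up fallback location
    from_override = transient_root_sketch(
        {"COMPILER_GYM_TRANSIENT_CACHE": "/custom"}, present, fallback
    )
    from_shm = transient_root_sketch({}, present, fallback)
    from_fallback = transient_root_sketch({}, missing, fallback)
```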

compiler_gym.download(urls: Union[str, List[str]], sha256: Optional[str] = None, max_retries: int = 5) → bytes[source]

Download a file and return its contents.

If sha256 is provided and the download succeeds, the file contents are cached locally in $cache_path/downloads/$sha256. See compiler_gym.cache_path().

An inter-process lock ensures that only a single call to this function may execute at a time.

Parameters
  • urls – Either a single URL of the file to download, or a list of URLs to download.

  • sha256 – The expected sha256 checksum of the file.

Returns

The contents of the downloaded file.

Raises

IOError – If the download fails, or if the downloaded content does not match the expected sha256 checksum.
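The checksum step can be sketched without any network access. This is a minimal illustration of sha256 verification using hashlib, not the library's download code: verify_sha256 is a hypothetical helper and the payload is made up.

```python
import hashlib


def verify_sha256(data: bytes, expected: str) -> bytes:
    """Return data unchanged if its sha256 hex digest matches expected."""
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected:
        # Mirrors the documented behavior of raising IOError on a mismatch.
        raise IOError(f"Checksum mismatch: expected {expected}, got {actual}")
    return data


payload = b"hello"  # stands in for downloaded file contents
digest = hashlib.sha256(payload).hexdigest()
verified = verify_sha256(payload, digest)
```

Keying the local cache on the checksum, as the documentation describes, means a cached file can be trusted to hold the expected contents without downloading it again.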

Debugging

compiler_gym.get_debug_level() → int[source]

Get the debugging level.

The debug level is a non-negative integer that controls the verbosity of logging messages and other debugging behavior. At each level, the types of messages that are logged are:

  • 0 - only non-fatal errors are logged (default).

  • 1 - extra warning messages are logged.

  • 2 - enables purely informational logging messages.

  • 3 and above - extremely verbose logging messages are enabled that may be useful for debugging.

The debugging level can be set using the $COMPILER_GYM_DEBUG environment variable, or by calling set_debug_level().

Returns

A non-negative integer.

compiler_gym.get_logging_level() → int[source]

Returns the logging level.

The logging level is not set directly, but as a result of setting the debug level using set_debug_level().

Returns

An integer.
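The relationship between the two levels might look like the sketch below. The exact mapping is an assumption inferred from the level descriptions in get_debug_level(), and logging_level_for is a hypothetical helper rather than part of CompilerGym.

```python
import logging


def logging_level_for(debug_level: int) -> int:
    """Map a non-negative debug level to a stdlib logging level (assumed)."""
    if debug_level < 0:
        raise ValueError("debug level must be non-negative")
    levels = [logging.ERROR, logging.WARNING, logging.INFO, logging.DEBUG]
    # Levels of 3 and above all map to the most verbose setting.
    return levels[min(debug_level, 3)]


default_level = logging_level_for(0)
verbose_level = logging_level_for(5)
```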

compiler_gym.set_debug_level(level: int) → None[source]

Set a new debugging level.

See get_debug_level() for a description of the debug levels.

Set the debugging level before doing anything else with CompilerGym, because many CompilerGym objects check the debug level only at initialization time and not throughout their lifetime.

Setting the debug level affects the entire process and is not thread safe.

Parameters

level – The debugging level to use.