.. Generated from CHANGELOG.md. Do not edit!

Changelog
=========

Release 0.2.5 (2022-11-01)
--------------------------

CompilerGym v0.2.5 adds a new LLVM dataset and two new observation spaces, and includes numerous updates and bug fixes.

Summary of Changes
------------------

- [llvm] Added two new observation spaces, ``LexedIr`` and ``LexedIrTuple``, providing access to a sequence of IR tokens (`#742 `__, thanks @fivosts!).
- [llvm] Added the “Jotaibench” benchmark suite, providing 18,761 new executable C programs extracted from handwritten code on GitHub (`#705 `__, thanks @canesche!).
- Added support for Python 3.10.
- [llvm] Fixed a bug with non-terminating subprocesses (`#741 `__, thanks @thecoblack!).
- [llvm] Fixed a bug where an incorrect number of runtimes was reported by ``reset()`` (`#761 `__), and an incorrect number of warm-up runs was being performed (`#717 `__, thanks @lqwk!).
- [llvm] New leaderboard submission using GATv2 and DD-PPO (`#728 `__, thanks @anthony0727!).
- Added the ability to set a timeout on each individual environment operation (`#716 `__, thanks @ricardoprins!).
- Added support for loading URLs in ``CompilerEnvStateReader.read_paths()`` (`#692 `__, thanks @thecoblack!).
- Simplified Makefile rules: renamed ``install-test`` to ``test`` and deprecated bazel test rules.
- Fixed a bug where the ``TimeLimit`` wrapper would interfere with benchmark iterator wrappers (`#739 `__, thanks @nluu175!).
- [ci] Added CI test coverage of example services (`#695 `__, `#642 `__, `#699 `__, thanks @mostafaelhoushi!).
- [ci] Updated GitHub Actions to use Node v16.
- Reduced verbosity and wall time of CMake builds.
- Updated and fixed dependent package conflicts (fixes #771, #768).

Credits
-------

A huge thank you to all code contributors!

- @anthony0727
- @canesche made their first contribution in `#705 `__
- @fivosts made their first contribution in `#742 `__
- @jaopaulolc made their first contribution in `#738 `__
- @lqwk made their first contribution in `#717 `__
- @mostafaelhoushi
- @nluu175 made their first contribution in `#739 `__
- @ricardoprins made their first contribution in `#716 `__
- @ryanrussell made their first contribution in `#755 `__
- @sahirgomez1
- @thecoblack
- @youweiliang made their first contribution in `#751 `__

**Full Changelog**: `v0.2.4…v0.2.5 `__

Release 0.2.4 (2022-05-24)
--------------------------

This release adds a new compiler environment, new APIs, and a suite of backend improvements to improve the flexibility of CompilerGym environments. Many thanks to code contributors: @sogartar, @KyleHerndon, @SoumyajitKarmakar, @uduse, and @anthony0727!

Highlights of this release include:

- [mlir] Began work on a new environment for matrix multiplication using MLIR (`#652 `__, thanks @KyleHerndon and @sogartar!). Note that this environment is not yet included in the PyPI package and must be `compiled from source `__.
- [llvm] Added a new ``env.benchmark_from_clang_invocation()`` method (`#577 `__) that can be used for constructing LLVM environments automatically from C/C++ compiler invocations. This makes it much easier to integrate CompilerGym with your existing build scripts.
- Added three new wrapper classes: ``Counter``, which provides op counts for analysis (`#683 `__); ``SynchronousSqliteLogger``, which provides logging of environment interactions to a relational database (`#679 `__); and ``ForkOnStep``, which provides an ``undo()`` operation (`#682 `__).
- Added ``reward_space`` and ``observation_space`` parameters to ``env.reset()`` (`#659 `__, thanks @SoumyajitKarmakar!).

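The idea behind an ``undo()`` operation built on forking can be sketched without CompilerGym itself. The following is a minimal illustration only: ``ToyEnv`` and ``ForkOnStepSketch`` are hypothetical stand-ins written for this example, not the library's ``ForkOnStep`` implementation. It assumes only that the wrapped environment offers a ``fork()`` method returning an independent copy of its state.

```python
import copy


class ToyEnv:
    """Hypothetical stand-in environment; its state is the list of applied actions."""

    def __init__(self):
        self.actions = []

    def step(self, action):
        self.actions.append(action)
        return len(self.actions)  # toy observation: number of actions applied

    def fork(self):
        # Return an independent copy of the current state.
        return copy.deepcopy(self)


class ForkOnStepSketch:
    """Snapshot the wrapped environment before every step so steps can be undone."""

    def __init__(self, env):
        self.env = env
        self.stack = []  # snapshots taken before each step, oldest first

    def step(self, action):
        self.stack.append(self.env.fork())
        return self.env.step(action)

    def undo(self):
        if not self.stack:
            raise IndexError("no step to undo")
        # Restore the snapshot taken just before the most recent step.
        self.env = self.stack.pop()
        return self.env


env = ForkOnStepSketch(ToyEnv())
env.step("a")
env.step("b")
env.undo()
print(env.env.actions)  # ['a']
```

The cost of this approach is one snapshot per step, which trades memory for the ability to backtrack during search.
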
This release includes a number of improvements to the backend APIs that make it easier to write new CompilerGym environments:

- Refactored the backend to make ``CompilerEnv`` an abstract interface, and ``ClientServiceCompilerEnv`` the concrete implementation of this interface. This enables new environments to be implemented without using gRPC (`#633 `__, thanks @sogartar!).
- Extended the support for different types of action and observation spaces (`#641 `__, `#643 `__, thanks @sogartar!), including new ``Permutation`` and ``SpaceSequence`` spaces (`#645 `__, thanks @sogartar!).
- Added a new ``disk/`` subdirectory to compiler service’s working directories, which is symlinked to an on-disk location for devices which support in-memory working directories. This fixes a bug with leftover temporary directories from LLVM (`#672 `__).

This release also includes numerous bug fixes and improvements, many of which were reported or fixed by the community. For example, fixing a bug in cache file locations (`#656 `__, thanks @uduse!), and a missing flag definition in example code (`#684 `__, thanks @anthony0727!).

**Full Changelog**: https://github.com/facebookresearch/CompilerGym/compare/v0.2.3...v0.2.4

Release 0.2.3 (2022-03-18)
--------------------------

This release brings in deprecating changes to the core ``env.step()`` routine, and lays the groundwork for enabling new types of compiler optimizations to be exposed through CompilerGym. Many thanks to code contributors: @mostafaelhoushi, @sogartar, @KyleHerndon, @uduse, @parthchadha, and @xtremey!

Highlights of this release include:

- Added a new ``TextSizeInBytes`` observation space for LLVM (`#575 `__).
- Added a new PPO leaderboard entry (`#580 `__, thanks @xtremey!).
- Fixed a bug in which temporary directories created by the LLVM environment were not cleaned up (`#592 `__).
- **[Backend]** The function ``createAndRunCompilerGymService`` now returns an int, which is the exit return code (`#592 `__).
- Improvements to the examples documentation (`#548 `__) and FAQ (`#586 `__).

Deprecations and breaking changes:

- ``CompilerEnv.step`` no longer accepts a list of actions (`#627 `__). A new method, ``CompilerEnv.multistep``, provides this functionality. This is to provide compatibility with environments whose action spaces are lists. To update your code, replace any calls to ``env.step()`` that take a list of actions with calls to ``env.multistep()``. Thanks @sogartar!
- The arguments ``observations`` and ``rewards`` to ``step()`` have been renamed ``observation_spaces`` and ``reward_spaces``, respectively (`#627 `__).
- ``Reward.id`` has been renamed ``Reward.name`` (`#565 `__, `#612 `__). Thanks @parthchadha!
- The backend protocol buffer schema has been updated to natively support more types of observation and action, and to support nested spaces (`#531 `__). Thanks @sogartar!

**Full Changelog**: https://github.com/facebookresearch/CompilerGym/compare/v0.2.2...v0.2.3

Release 0.2.2 (2022-01-19)
--------------------------

Amongst the highlights of this release are support for building with CMake and a new compiler environment based on loop unrolling. Many thanks to @sogartar, @mostafaelhoushi, @KyleHerndon, and @yqtianust for code contributions!

- Added support for building CompilerGym from source on Linux using **CMake** (`#498 `__, `#478 `__). The new build system coexists with the bazel build and enables customization over the CMake configuration used to build the LLVM environment. See `INSTALL.md `__ for details. Credit: @sogartar, @KyleHerndon.
- Added an environment for loop optimizations in LLVM (`#530 `__, `#529 `__, `#517 `__). This new example environment provides control over loop unrolling factors and demonstrates how to build a standalone LLVM binary using the new CMake build system. Credit: @mostafaelhoushi.
- Added a new ``BenchmarkUri`` class and API for parsing URIs (`#525 `__). This enables benchmarks to have optional parameters that can be used by the backend services to modify their behavior.
- **[llvm]** Enabled runtime reward to be calculated on systems where ``/dev/shm`` does not permit executables (`#510 `__).
- **[llvm]** Added a new ``benchmark://mibench-v1`` dataset and deprecated ``benchmark://mibench-v0`` (`#511 `__). If you are using ``mibench-v0``, please update to the new version.
- **[llvm]** Enabled all 20 of the cBench runtime datasets to be used by the ``benchmark://cbench-v1`` dataset (`#525 `__).
- Made the ``site_data_base`` argument of the ``Dataset`` class constructor optional (`#518 `__).
- Added support for building CompilerGym from source on macOS Monterey (`#494 `__).
- Removed the legacy dataset scripts and APIs that were deprecated in v0.1.8. Please use the `new dataset API `__. The following have been removed:

  - The ``compiler_gym.bin.datasets`` script.
  - The properties ``CompilerEnv.available_datasets`` and ``CompilerEnv.benchmarks``.
  - The ``CompilerEnv.require_dataset()``, ``CompilerEnv.require_datasets()``, ``CompilerEnv.register_dataset()``, and ``CompilerEnv.get_benchmark_validation_callback()`` methods.

- Numerous other bug fixes and improvements.

**Full Changelog**: `v0.2.1…v0.2.2 `__

Release 0.2.1 (2021-11-17)
--------------------------

Highlights of this release include:

- **[Complex and composite action spaces]** Added a new schema for describing action spaces (`#369 `__). This complete overhaul enables a much richer set of actions to be exposed, such as composite action spaces, dictionaries, and continuous actions.
- **[State Transition Dataset]** We have released the first iteration of the state transition dataset, a large collection of (state, action, reward) tuples for the LLVM environments, suitable for large-scale supervised learning. We have added an example learned cost model using a graph neural network in ``examples/gnn_cost_model`` (`#484 `__, thanks @bcui19!).
- **[New examples]** We have added several new examples to the ``examples/`` directory, including a new loop unrolling demo based on LLVM (`#477 `__, thanks @mostafaelhoushi!), a loop tool demo (`#457 `__, thanks @bwasti!), micro-benchmarks for operations, and example reinforcement learning scripts (`#484 `__). See ``examples/README.md`` for details. We also overhauled the example compiler gym service (`#467 `__).
- **[New logo]** Thanks Christy for designing a great new logo for CompilerGym! (`#471 `__)
- **[llvm]** Added a new ``Bitcode`` observation space (`#442 `__).
- Numerous bug fixes and improvements.

Deprecations and breaking changes:

- **[Backend API change]** Out-of-tree compiler services will require updating to the new action space API (`#369 `__).
- The ``env.observation.add_derived_space()`` method has been deprecated and will be removed in a future release. Please use the new ``derived_observation_spaces`` argument to the ``CompilerEnv`` constructor (`#463 `__).
- The ``compiler_gym.utils.logs`` module has been deprecated. Use ``compiler_gym.utils.runfiles_path`` instead (`#453 `__).
- The ``compiler_gym.replay_search`` module has been deprecated and merged into the ``compiler_gym.random_search`` (`#453 `__).

Release 0.2.0 (2021-09-28)
--------------------------

This release adds two new compiler optimization problems to CompilerGym: GCC command line flag optimization and CUDA loop nest optimization.

- **[GCC]** A new ``gcc-v0`` environment, authored by @hughleat, exposes the command line flags of `GCC `__ as a reinforcement learning environment. GCC is a production-grade compiler for C and C++ used throughout industry. The environment provides several datasets and a large, high dimensional action space that works on several GCC versions. For further details check out the `reference documentation `__.
- **[loop_tool]** A new ``loop_tool-v0`` environment, authored by @bwasti, provides an experimental intermediate representation of *n*-dimensional data computation that can be lowered to both CPU and GPU backends. This provides a reinforcement learning environment for manipulating nests of loop computations to maximize throughput. For further details check out the `reference documentation `__.

Other highlights of this release include:

- **[Docker]** Published a `chriscummins/compiler_gym `__ docker image that can be used to run CompilerGym services in standalone isolated containers (`#424 `__).
- **[LLVM]** Fixed a bug in the experimental ``Runtime`` observation space that caused observations to slow down over time (`#398 `__).
- **[LLVM]** Added a new utility module to compute observations from bitcodes (`#405 `__).
- Overhauled the continuous integration services to reduce computational requirements by 59.4% while increasing test coverage (`#392 `__).
- Improved error reporting if computing an observation fails (`#380 `__).
- Changed the return type of ``compiler_gym.random_search()`` to a ``CompilerEnv`` (`#387 `__).
- Numerous other bug fixes and improvements.

Many thanks to code contributors: @thecoblack, @bwasti, @hughleat, and @sahirgomez1!

Release 0.1.10 (2021-09-08)
---------------------------

This release lays the foundation for several new exciting additions to CompilerGym:

- [LLVM] Added experimental support for **optimizing for runtime** and **compile time** (`#307 `__). This is still proof of concept and is not yet stable. For now, only the ``benchmark://cbench-v1`` and ``generator://csmith-v0`` datasets are supported.
- [CompilerGym Explorer] Started development of a **web frontend** for the LLVM environments. The work-in-progress Flask API and React website can be found in the ``www`` directory.
- [New Backend API] Added a mechanism for sending arbitrary data payloads to the compiler service backends (`#313 `__). This allows ad-hoc parameters that do not conform to the usual action space to be set for the duration of an episode. Add support for these parameters in the backend by implementing the optional `handle_session_parameter() `__ method, and then send parameters using the `send_params() `__ method.

Other highlights of this release include:

- [LLVM] The Csmith program generator is now shipped as part of the CompilerGym binary release, removing the need to compile it locally (`#348 `__).
- [LLVM] A new ``ProgramlJson`` observation space provides the JSON node-link data of a ProGraML graph without parsing to a ``nx.MultiDiGraph`` (`#332 `__).
- [LLVM] Added a leaderboard submission for a DQN agent (`#292 `__, thanks @phesse001!).
- [Backend API Update] The ``Reward.reset()`` method now receives an observation view that can be used to compute initial states (`#341 `__, thanks @bwasti!).
- [Datasets API] The size of infinite datasets has been changed from ``float("inf")`` to ``0`` (`#347 `__). This is a compatibility fix for ``__len__()``, which requires integer values.
- Prevented excessive growth of in-memory caches (`#299 `__).
- Multiple compatibility fixes for ``compiler_gym.wrappers``.
- Numerous other bug fixes and improvements.

Release 0.1.9 (2021-06-03)
--------------------------

This release of CompilerGym focuses on **backend extensibility** and adds a bunch of new features to make it easier to add support for new compilers:

- Adds a new ``CompilationSession`` class that encapsulates a single incremental compilation session (`#261 `__).
- Adds a common runtime for CompilerGym services that takes a ``CompilationSession`` subclass and handles all the RPC wrangling for you (`#270 `__).
- Ports the LLVM service and example services to the new runtime (`#277 `__). This provides a net performance win with fewer lines of code.

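The division of labor described above, where a session class holds per-episode compiler state and a shared runtime dispatches requests to it, can be sketched roughly as follows. This is a simplified illustration under stated assumptions: ``SessionSketch``, ``CounterSession``, and ``RuntimeSketch`` are hypothetical names invented for this example, no RPC is involved, and the real ``CompilationSession`` interface differs in its details.

```python
class SessionSketch:
    """Hypothetical base class: one instance per incremental compilation session."""

    def apply_action(self, action):
        raise NotImplementedError

    def get_observation(self, name):
        raise NotImplementedError


class CounterSession(SessionSketch):
    """Toy 'compiler' whose entire state is a single integer."""

    def __init__(self):
        self.value = 0

    def apply_action(self, action):
        self.value += action

    def get_observation(self, name):
        if name != "value":
            raise KeyError(name)
        return self.value


class RuntimeSketch:
    """Owns all live sessions and dispatches requests to them by session ID."""

    def __init__(self, session_type):
        self.session_type = session_type
        self.sessions = {}
        self.next_id = 0

    def start_session(self):
        session_id = self.next_id
        self.next_id += 1
        self.sessions[session_id] = self.session_type()
        return session_id

    def step(self, session_id, actions, observation):
        session = self.sessions[session_id]
        for action in actions:
            session.apply_action(action)
        return session.get_observation(observation)

    def end_session(self, session_id):
        del self.sessions[session_id]


runtime = RuntimeSketch(CounterSession)
sid = runtime.start_session()
result = runtime.step(sid, [1, 2, 3], "value")
runtime.end_session(sid)
print(result)  # 6
```

With this split, supporting a new compiler means writing only the session subclass; session bookkeeping stays in the shared runtime.
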
Other highlights of this release include:

- [Core API] Adds a new ``compiler_gym.wrappers`` module that makes it easy to apply modular transformations to CompilerGym environments without modifying the environment code (`#272 `__).
- [Core API] Adds a new ``Datasets.random_benchmark()`` method for selecting a uniform random benchmark from one or more datasets (`#247 `__).
- [Core API] Adds a new ``compiler_gym.make()`` function, equivalent to ``gym.make()`` (`#257 `__).
- [LLVM] Adds a new ``IrSha1`` observation space that uses a fast, service-side C++ implementation to compute a checksum of the environment state (`#267 `__).
- [LLVM] Adds 12 new C programs from the CHStone benchmark suite (`#284 `__).
- [LLVM] Adds the ``anghabench-v1`` dataset and deprecates ``anghabench-v0`` (`#242 `__).
- Numerous bug fixes and improvements.

Release 0.1.8 (2021-04-30)
--------------------------

This release introduces some significant changes to the way that benchmarks are managed, introducing a new dataset API. This enabled us to add support for millions of new benchmarks and a more efficient implementation for the LLVM environment, but this will require some migrating of old code to the new interfaces (see “Migration Checklist” below). Some of the key changes of this release are:

- **[Core API change]** We have added a Python `Benchmark `__ class (`#190 `__). The ``env.benchmark`` attribute is now an instance of this class rather than a string (`#222 `__).
- **[Core behavior change]** Environments will no longer select benchmarks randomly. ``env.reset()`` will now always select the last-used benchmark, unless the ``benchmark`` argument is provided or ``env.benchmark`` has been set. If no benchmark is specified, a default is used.
- **[API deprecations]** We have added a new `Dataset `__ class hierarchy (`#191 `__, `#192 `__). All datasets are now available without needing to be downloaded first, and a new `Datasets `__ class can be used to iterate over them (`#200 `__). We have deprecated the old dataset management operations and the ``compiler_gym.bin.datasets`` script, and removed the ``--dataset`` and ``--ls_benchmark`` flags from the command line tools.
- **[RPC interface change]** The ``StartSession`` RPC endpoint now accepts a list of initial observations to compute. This removes the need for an immediate call to ``Step``, reducing environment reset time by 15-21% (`#189 `__).
- [LLVM] We have added several new datasets of benchmarks, including the Csmith and llvm-stress program generators (`#207 `__), a dataset of OpenCL kernels (`#208 `__), and a dataset of compilable C functions (`#210 `__). See `the docs `__ for an overview.
- ``CompilerEnv`` now takes an optional ``Logger`` instance at construction time for fine-grained control over logging output (`#187 `__).
- [LLVM] The ModuleID and source_filename of LLVM-IR modules are now anonymized to prevent unintentional overfitting to benchmarks by name (`#171 `__).
- [docs] We have added a `Feature Stability `__ section to the documentation (`#196 `__).
- Numerous bug fixes and improvements.

Please use this checklist when updating code written for the previous CompilerGym release:

- Review code that accesses the ``env.benchmark`` property and update to ``env.benchmark.uri`` if a string name is required. Setting this attribute by string (``env.benchmark = "benchmark://a-v0/b"``) and comparison to string types (``env.benchmark == "benchmark://a-v0/b"``) still work.
- Review code that calls ``env.reset()`` without first setting a benchmark. Previously, calling ``env.reset()`` would select a random benchmark. Now, ``env.reset()`` always selects the last used benchmark, or a predetermined default if none is specified.
- Review code that relies on ``env.benchmark`` being ``None`` to select benchmarks randomly. Now, ``env.benchmark`` is always set to the previously used benchmark, or a predetermined default benchmark if none has been specified. Setting ``env.benchmark = None`` will raise an error. Select a benchmark randomly by sampling from the ``env.datasets.benchmark_uris()`` iterator.
- Remove calls to ``env.require_dataset()`` and related operations. These are no longer required.
- Remove accesses to ``env.benchmarks``. An iterator over available benchmark URIs is now available at ``env.datasets.benchmark_uris()``, but the list of URIs cannot be relied on to be fully enumerable (the LLVM environments have over 2^32 URIs).
- Review code that accesses ``env.observation_space`` and update to ``env.observation_space_spec`` where necessary (`#228 `__).
- Update compiler service implementations to support the updated RPC interface by removing the deprecated ``GetBenchmarks`` RPC endpoint and replacing it with ``Dataset`` classes. See the `example service `__ for details.
- [LLVM] Update references to the ``poj104-v0`` dataset to ``poj104-v1``.
- [LLVM] Update references to the ``cBench-v1`` dataset to ``cbench-v1``.

Release 0.1.7 (2021-04-01)
--------------------------

This release introduces `public leaderboards `__ to track the performance of user-submitted algorithms on compiler optimization tasks.

- Added a new ``compiler_gym.leaderboard`` package which contains utilities for preparing leaderboard submissions `(#161) `__.
- Added an LLVM instruction count leaderboard and seeded it with a random search baseline `(#117) `__.
- Added support for Python 3.9, extending the set of supported Python versions to 3.6, 3.7, 3.8, and 3.9 `(#160) `__.
- [llvm] Added a new ``InstCount`` observation space that contains the counts of each type of instruction `(#159) `__.

**Build dependencies update notice:** If you are building from source and upgrading from an older version of CompilerGym, your build environment will need to be updated. The easiest way to do that is to remove your existing conda environment using ``conda remove --name compiler_gym --all`` and to repeat the steps in `building from source `__.

Release 0.1.6 (2021-03-22)
--------------------------

This release focuses on hardening the LLVM environments, providing improved semantics validation, and improving the datasets. Many thanks to @JD-at-work, @bwasti, and @mostafaelhoushi for code contributions.

- [llvm] Added a new ``cBench-v1`` dataset which changes the function attributes of the IR to permit inlining. ``cBench-v0`` is deprecated and will be removed no earlier than v0.1.6.
- [llvm] Removed 15 passes from the LLVM action space: ``-bounds-checking``, ``-chr``, ``-extract-blocks``, ``-gvn-sink``, ``-loop-extract-single``, ``-loop-extract``, ``-objc-arc-apelim``, ``-objc-arc-contract``, ``-objc-arc-expand``, ``-objc-arc``, ``-place-safepoints``, ``-rewrite-symbols``, ``-strip-dead-debug-info``, ``-strip-nonlinetable-debuginfo``, ``-structurizecfg``. Passes are removed if they are irrelevant (e.g. used only for debugging), if they change the program semantics (e.g. inserting runtime bounds checking), or if they have been found to have nondeterministic behavior between runs.
- Extended ``env.step()`` so that it can take a list of actions that are all performed in a single batch. This improves efficiency.
- Added default reward spaces for ``CompilerEnv`` that are derived from scalar observations (thanks @bwasti!).
- Added a new Q learning example (thanks @JD-at-work!).
- *Deprecation:* The v0.1.8 release will introduce a new datasets API that is easier to use and more flexible. In preparation for this, the ``Dataset`` class has been renamed to ``LegacyDataset``, and the following dataset operations have been marked deprecated: ``activate()``, ``deactivate()``, and ``delete()``. The ``GetBenchmarks()`` RPC interface method has also been marked deprecated.
- [llvm] Improved semantics validation using LLVM’s memory, thread, address, and undefined behavior sanitizers.
- Numerous bug fixes and improvements.

Release 0.1.3 (2021-02-25)
--------------------------

This release adds numerous enhancements aimed at improving ease-of-use. Thanks to @broune, @hughleat, and @JD-ETH for contributions.

- Added a new ``env.validate()`` API for validating the state of an environment. Added semantics validation for some LLVM benchmarks.
- Added an ``env.fork()`` method to efficiently duplicate an environment state.
- The ``manual_env`` environment has been improved with new features such as hill climbing search and tab completion.
- Ease of use improvements for string observation space and reward space names: Added new getter methods such as ``env.observation.Autophase()`` and generated constants such as ``llvm.observation_spaces.autophase``.
- *Breaking change*: Calculation of environment reward has been moved to Python. Reward functions have been removed from backend service implementations and replaced with equivalent Python classes.
- Various bug fixes and improvements.

Release 0.1.2 (2021-01-25)
--------------------------

- Added a new ``compiler_gym.views.ObservationView.add_derived_space(...)`` API for constructing derived observation spaces.
- Added default reward and observation values for ``env.step()`` in case of service failure.
- Extended the public ``compiler_gym.datasets`` API for managing datasets.
- [llvm] Added ``-Norm``-suffixed rewards that are normalized to unoptimized cost.
- Extended documentation and example code.
- Numerous bug fixes and improvements.

Release 0.1.1 (2020-12-28)
--------------------------

- Expose the package version through ``compiler_gym.__version__``, and the compiler version through ``CompilerEnv.compiler_version``.
- Add a `notebook version `__ of the “Getting Started” guide that can be run in colab.
- [llvm] Reformulate reward signals to be cumulative.
- [llvm] Add a new reward signal based on the size of the ``.text`` section of compiled object files.
- [llvm] Add a ``LlvmEnv.make_benchmark()`` API for easily constructing custom benchmarks for use in environments.
- Numerous bug fixes and improvements.

Release 0.1.0 (2020-12-21)
--------------------------

Initial release.