HirGen is a fuzzer for deep learning (DL) compilers. It targets the high-level optimization stage, where it can expose bugs that traditional testing methods may miss.
HirGen currently supports 58 operators, 23 binary and 35 unary:
- Binary Operators: Add, Subtract, Multiply, Divide, Power, Mod, Floor Mod, Floor Divide, Logical And, Logical Or, Logical Xor, Bitwise And, Bitwise Or, Equal, Not Equal, Less, LessEqual, Greater, GreaterEqual, Maximum, Minimum, Right Shift, Left Shift.
- Unary Operators: Log, Log2, Log10, Tan, Tanh, Cos, Cosh, Sin, Sinh, Acos, Acosh, Asin, Asinh, Atan, Atanh, Exp, Erf, Sqrt, Rsqrt, Sigmoid, Floor, Ceil, Trunc, Round, Abs, Sign, Negative, Logical Not, Bitwise Not, Zeros Like, Ones Like, Copy, isNan, isFinite, isInf.
Proven Bug Detection in Deep Learning Compilers
HirGen has found 21 bugs to date; 17 of them have been confirmed and 12 have been fixed. The bugs are documented in the experiment branch of the HirGen repository: https://github.com/anonymousWork000/HirGen/tree/experiment.
Getting Started with HirGen for Compiler Fuzzing
To build HirGen and start fuzzing a deep learning compiler, follow these steps:
- Create a build directory: Create a directory named `build` directly inside the directory that contains HirGen's top-level `CMakeLists.txt`. The next step runs `cmake ..`, which expects the sources in the parent directory.
- Configure with CMake: In a terminal, change into the new `build` directory and run `cmake .. -G Ninja`. This configures the build using CMake and the Ninja build system.
- Compile with Ninja: When the CMake configuration finishes, run `ninja`. On success, the HirGen executable, named `hirgen`, is placed in the `build` directory.
- Run HirGen: From within the `build` directory, run `./hirgen`.
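The four steps above can be collected into a small shell function. This is a sketch, not a script shipped with HirGen: the `run` helper and the `DRY_RUN` variable are hypothetical conveniences introduced here so the commands can be previewed before they are executed, and the function assumes it is called from the root of the HirGen source tree.

```shell
#!/bin/sh
# Sketch of the build-and-run steps described above.
# `run` and DRY_RUN are illustrative helpers, not part of HirGen.

# Print the command when DRY_RUN=1, otherwise execute it in the current shell.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Call from the root of the HirGen source tree.
build_and_run_hirgen() {
    # Step 1: create the build directory (-p tolerates an existing one).
    run mkdir -p build &&
    # Step 2: enter it and configure with the Ninja generator.
    run cd build &&
    run cmake .. -G Ninja &&
    # Step 3: compile; the hirgen executable is placed in build/.
    run ninja &&
    # Step 4: start HirGen with its default options.
    run ./hirgen
}
```

Setting `DRY_RUN=1` before calling `build_and_run_hirgen` prints the five commands without touching the file system; leave it unset to execute them.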
For a smooth CMake configuration, specify the paths to your Clang++ and Clang compilers in the `CMakeLists.txt` file. Alternatively, if you prefer GCC/G++ and it is your system's default C/C++ compiler, remove the C/C++ compiler path specifications from `CMakeLists.txt`.
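As an alternative to editing `CMakeLists.txt`, CMake also accepts the compilers on the command line through its standard `CMAKE_C_COMPILER` and `CMAKE_CXX_COMPILER` variables. The sketch below only composes such a configure command. `configure_cmd` is a hypothetical helper, and it assumes that any hard-coded compiler paths have been removed from HirGen's `CMakeLists.txt`, since a path set there would likely take precedence.

```shell
# Compose a CMake configure command that selects the compilers explicitly.
# configure_cmd is an illustrative helper, not part of HirGen.
# Arguments: C compiler (default clang), C++ compiler (default clang++).
configure_cmd() {
    cc="${1:-clang}"
    cxx="${2:-clang++}"
    printf 'cmake .. -G Ninja -DCMAKE_C_COMPILER=%s -DCMAKE_CXX_COMPILER=%s\n' \
        "$cc" "$cxx"
}
```

For example, `configure_cmd gcc g++` prints the command for a GCC build; run the printed command from inside the `build` directory.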
HirGen Execution Options for Customized Fuzzing
HirGen offers several command-line options that allow you to tailor the computational graph generation process to your specific fuzzing needs:
- `-num`: Controls the number of operators included in the generated computational graphs. The default is `-num=100`. Adjust this value to generate graphs of varying complexity.
- `-testing`: Determines whether to activate test oracle 3, as detailed in the HirGen research paper. The default, `-testing=nodf`, skips test oracle 3 for faster execution but might overlook calculation differences across diverse hardware. To enable test oracle 3 for more comprehensive testing, run HirGen with `-testing=df`.
- `-clevel`: Sets the generation mode for computational graphs. The default, `-clevel=strict`, selects strict generation. Use `-clevel=disruptive` for disruptive generation, which explores different graph structures.
- `-coverage`: Enables or disables coverage guidance during graph generation. The default, `-coverage=yes`, uses coverage guidance to explore code paths more effectively. To disable coverage guidance and generate computational graphs randomly, use `-coverage=no`.
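The options above can be combined on one command line. The following sketch composes such an invocation and rejects values that are not listed above. `hirgen_cmd` is a hypothetical helper written for illustration, and the assumption that `-num` takes a non-negative integer is inferred from its default of 100.

```shell
# Compose a HirGen command line from the four documented options.
# hirgen_cmd is an illustrative helper, not part of HirGen.
# Arguments (all optional): num, testing, clevel, coverage.
hirgen_cmd() {
    num="${1:-100}"
    testing="${2:-nodf}"
    clevel="${3:-strict}"
    coverage="${4:-yes}"

    # Assumption: -num expects a non-negative integer.
    case "$num" in
        *[!0-9]*) echo "invalid -num: $num" >&2; return 1 ;;
    esac
    case "$testing" in
        df|nodf) ;;
        *) echo "invalid -testing: $testing" >&2; return 1 ;;
    esac
    case "$clevel" in
        strict|disruptive) ;;
        *) echo "invalid -clevel: $clevel" >&2; return 1 ;;
    esac
    case "$coverage" in
        yes|no) ;;
        *) echo "invalid -coverage: $coverage" >&2; return 1 ;;
    esac

    printf './hirgen -num=%s -testing=%s -clevel=%s -coverage=%s\n' \
        "$num" "$testing" "$clevel" "$coverage"
}
```

Called without arguments, `hirgen_cmd` prints the invocation with every option at its documented default; `hirgen_cmd 50 df disruptive no` prints a run with smaller graphs, test oracle 3 enabled, disruptive generation, and no coverage guidance.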
TVM Integration for Bug Reproduction
HirGen has been particularly effective at finding bugs in Apache TVM, a widely used deep learning compiler framework. To reproduce the bugs HirGen found, install TVM from source by following the instructions at https://tvm.apache.org/docs/install/from_source.html. After installation, run `git checkout 124813f` inside your TVM repository and rebuild TVM, so that you test against the same TVM version in which HirGen identified the reported issues.
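The checkout step can be sketched as a shell function. `pin_tvm` is a hypothetical helper name, and the submodule update is an assumption added here: TVM vendors third-party code as Git submodules, which generally need to be re-synchronized after moving to an older commit. Rebuilding afterwards follows the from-source instructions linked above.

```shell
# Pin an existing TVM clone to the commit used for HirGen's bug reports.
# pin_tvm is an illustrative helper, not part of HirGen or TVM.
# Usage: pin_tvm /path/to/tvm
pin_tvm() {
    tvm_dir="${1:-}"
    if [ -z "$tvm_dir" ]; then
        echo "usage: pin_tvm <path-to-tvm-clone>" >&2
        return 2
    fi
    # Switch the working tree to the commit named in the text above.
    git -C "$tvm_dir" checkout 124813f &&
    # Assumption: bring TVM's submodules in line with that commit.
    git -C "$tvm_dir" submodule update --init --recursive
}
```

The function stops at the first failing Git command, so a non-zero return value means the clone was not fully pinned and should not be rebuilt yet.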
With the options described above, developers and researchers can use HirGen to strengthen their deep learning compiler testing and help build more robust and reliable DL software stacks.