NvidiaBuilderConfig

class qlip.compiler.nvidia.NvidiaBuilderConfig(*, builder_flags=<factory>, io_dtype=None, io_dtype_per_tensor=<factory>, normalization_dtype=None, builder_optimization_level=3, profiling_verbosity=None, avg_timing_iterations=None, timing_cache=b'', output_timing_cache='', rewrite_timing_cache=False)

Bases: BuilderConfig

Holds configuration options for the Nvidia builder.

Variables

  • builder_flags (set[str]) – A set of flags that control the builder’s behavior. Each flag must name one of the trt.BuilderFlag enum options, such as “FP16” for mixed precision or “INT8” for integer precision mode, by default an empty set. This attribute is protected from external modification after initialization; use add_builder_flag() to add flags.

  • io_dtype (Optional[str]) – Specifies the data type of the input and output tensors. It supports “FP16” or “BF16”. When set, the I/O tensors will be cast to the specified type. When equal to base, the type is deduced from builder_flags, by default None.

  • io_dtype_per_tensor (Dict[str, str]) – Specifies the data type of the input and output tensors on a per-tensor basis. Each entry supports “FP16” or “BF16”. When an entry is set, that I/O tensor will be cast to the specified type. When an entry is equal to base, the type is deduced from builder_flags, by default an empty dictionary.

  • normalization_dtype (Optional[str]) – Specifies the precision of the normalization layer. It supports “FP16” or “BF16”. When set, the normalization layer will be cast to the specified type. When equal to base, the type is deduced from builder_flags, by default None (which means FP32).

  • builder_optimization_level (int) – Controls the optimization level of the engine build. Higher values can enable more aggressive optimizations for performance, but may result in longer build times, by default 3.

  • profiling_verbosity (Optional[str]) – Sets the verbosity of profiling information during engine building. Supported values are “none”, “default”, and “detailed”, which control the level of logging, by default None.

  • avg_timing_iterations (Optional[int]) – Specifies the number of iterations used to average the timing results when building the engine. More iterations can improve the accuracy of the timing results, by default None.

  • timing_cache (str | bytes) – Path to the timing cache file or serialized timing cache. If provided, the builder will use the timing cache to speed up engine building, by default b''.

  • output_timing_cache (str) – Path to save the timing cache file. If provided, the timing cache will be saved to the specified path after engine building, by default “”.

  • rewrite_timing_cache (bool) – If True, the timing cache file will be rewritten with the updated timing cache, by default False.
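The attributes above are passed to the constructor as keyword-only arguments. The following is a minimal sketch of assembling them, not part of qlip itself: make_builder_kwargs is a hypothetical helper, the accepted values are copied from the descriptions above, and treating base as the literal string "base" is an assumption. The constructor call is guarded so the snippet also runs where qlip is not installed.

```python
from typing import Any, Dict, Optional

# Values copied from the attribute descriptions above. Treating "base" as a
# literal string is an assumption based on the wording "when equal to base".
IO_DTYPES = {"FP16", "BF16", "base"}
PROFILING_VERBOSITY = {"none", "default", "detailed"}


def make_builder_kwargs(
    io_dtype: Optional[str] = None,
    profiling_verbosity: Optional[str] = None,
    builder_optimization_level: int = 3,
    output_timing_cache: str = "",
) -> Dict[str, Any]:
    """Hypothetical helper: validate values and collect keyword arguments."""
    if io_dtype is not None and io_dtype not in IO_DTYPES:
        raise ValueError(f"unsupported io_dtype: {io_dtype!r}")
    if (
        profiling_verbosity is not None
        and profiling_verbosity not in PROFILING_VERBOSITY
    ):
        raise ValueError(
            f"unsupported profiling_verbosity: {profiling_verbosity!r}"
        )
    return {
        "io_dtype": io_dtype,
        "profiling_verbosity": profiling_verbosity,
        "builder_optimization_level": builder_optimization_level,
        "output_timing_cache": output_timing_cache,
    }


kwargs = make_builder_kwargs(
    io_dtype="FP16",
    profiling_verbosity="detailed",
    output_timing_cache="timing.cache",
)

try:
    # The constructor is keyword-only, so the dictionary is unpacked with **.
    from qlip.compiler.nvidia import NvidiaBuilderConfig

    config = NvidiaBuilderConfig(**kwargs)
except ImportError:
    config = None  # qlip (or one of its dependencies) is not installed
```

Validating values before construction is only a convenience here; whether the class itself rejects unsupported strings is not stated in this reference.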

add_builder_flag(flag)

Add a builder flag if possible.

Parameters

  • flag (str) – The builder flag to add.

set_strongly_typed()

Set the STRONGLY_TYPED creation flag.
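The pattern described for builder_flags (readable from outside, changed only through add_builder_flag()) can be sketched with a stand-in class. This is an illustration under stated assumptions, not qlip's implementation: SketchBuilderConfig, KNOWN_FLAGS, the boolean return value, and the strongly_typed property are all hypothetical, and "if possible" is interpreted here as "the flag name is recognized".

```python
# Assumed subset of trt.BuilderFlag member names, for illustration only.
KNOWN_FLAGS = frozenset({"FP16", "BF16", "INT8", "TF32"})


class SketchBuilderConfig:
    """Hypothetical stand-in, not qlip's NvidiaBuilderConfig."""

    def __init__(self, builder_flags=()):
        self._builder_flags = set()
        self._strongly_typed = False
        for flag in builder_flags:
            self.add_builder_flag(flag)

    @property
    def builder_flags(self) -> frozenset:
        # Return an immutable copy so callers cannot mutate the set in place.
        # There is no setter, so assigning to the attribute raises AttributeError.
        return frozenset(self._builder_flags)

    @property
    def strongly_typed(self) -> bool:
        return self._strongly_typed

    def add_builder_flag(self, flag: str) -> bool:
        """Add the flag if its name is recognized; report whether it was added."""
        name = flag.upper()
        if name not in KNOWN_FLAGS:
            return False
        self._builder_flags.add(name)
        return True

    def set_strongly_typed(self) -> None:
        """Record that the STRONGLY_TYPED creation flag was requested."""
        self._strongly_typed = True


sketch = SketchBuilderConfig()
added = sketch.add_builder_flag("FP16")
skipped = sketch.add_builder_flag("NOT_A_FLAG")
sketch.set_strongly_typed()
```

With the real class, the calls documented above have the same shape: config.add_builder_flag("FP16") and config.set_strongly_typed().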