NvidiaCompileManager

class qlip.compiler.nvidia.NvidiaCompileManager(model, workspace, use_timing_cache=True, **kwargs)

Bases: CompileManager

Compilation manager for the NVIDIA backend.

Parameters

  • model (torch.nn.Module) – Model to compile.

  • workspace (Path) – Workspace to use for compilation.

  • use_timing_cache (bool) – Whether to use a timing cache for compilation. By default True.

Variables

  • backend (Backend) – Backend for compilation.

  • model (nn.Module) – Model for compilation.

  • workspace (Path) – Workspace for compilation.

  • modules (List[CompiledModule]) – Modules for compilation.

Examples

>>> cmanager = NvidiaCompileManager(model, workspace)
>>> cmanager.setup_modules(modules=["encoder", "decoder"])
>>> cmanager.compile()
backend

alias of NvidiaBackend

setup_modules(*, modules=None, module_types=None, builder_config=None, component=None, component_name=None, onnx_model=None, dtype=None, adapter_type='auto')

Set up modules for compilation.

Parameters

  • modules (Optional[Iterable[str]]) – Names of the modules to set up.

  • module_types (Optional[Iterable[type | str]]) – Types (or type names) of the modules to set up.

  • builder_config (Optional[BuilderConfig]) – Builder configuration.

  • component (Optional[str]) – Component name. Mandatory when the model is a DiffusionPipeline from diffusers, where it must be an actual pipeline component; optional for other models, where it may be the name of any submodule.

  • component_name (Optional[str]) – Custom pretty name for the component.

  • onnx_model (Optional[Path]) – ONNX model to use for compilation.

  • dtype (Optional[torch.dtype]) – Data type of the compiled modules.

  • adapter_type (str) – Type of adapter to use, by default ‘auto’. Possible values are: ‘auto’, ‘default’, ‘hf_adapter’, ‘hf_unet_adapter’.

Examples

>>> # The whole `text_encoder` component
>>> cmanager.setup_modules(component="text_encoder")
>>> # Submodules of type `FluxTransformerBlock` within the `transformer` component
>>> cmanager.setup_modules(module_types=["FluxTransformerBlock"], component="transformer")
>>> # Specific modules
>>> cmanager.setup_modules(modules=["encoder", "decoder"])
setup_model(*, builder_config=None, onnx_model=None, dtype=None, adapter_type='auto')

Set up the model for compilation.

Parameters

  • builder_config (Optional[BuilderConfig]) – Builder configuration.

  • onnx_model (Optional[Path]) – ONNX model to use for compilation.

  • dtype (Optional[torch.dtype]) – Data type of the compiled modules.

  • adapter_type (str) – Type of adapter to use, by default ‘auto’. Possible values are: ‘auto’, ‘default’, ‘hf_adapter’, ‘hf_unet_adapter’.
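
Examples

A usage sketch following the class example above (the dtype value is illustrative):

>>> cmanager.setup_model(dtype=torch.float16)
>>> cmanager.compile()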

compile(device='cuda', original_device='meta', cpu_offload=False, recompile_existing=False, dump_onnx=False, keep_compiled=True, save_compiled=True, dynamo=None, fuse_weights_quantizers=None)

Compile modules.

Parameters

  • device (str) – Device on which to compile the modules.

  • original_device (str) – Original device of the model.

  • cpu_offload (bool) – Whether to offload the model to the CPU.

  • recompile_existing (bool) – Whether to recompile the existing modules.

  • dump_onnx (bool) – Whether to dump the ONNX model.

  • keep_compiled (bool) – Whether to keep the compiled modules.

  • save_compiled (bool) – Whether to save the compiled modules.

  • dynamo (Optional[bool]) – Whether to use dynamo for the ONNX export. By default True for PyTorch version 2.9 or higher, False otherwise.

  • fuse_weights_quantizers (Optional[bool]) – Whether to fuse weight quantizers. By default True for the dynamo ONNX export.
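
The version-dependent default for dynamo described above can be sketched as a simple version check (a sketch for illustration; how qlip actually resolves the default is not shown here):

```python
def default_dynamo(torch_version: str) -> bool:
    # Sketch of the documented default: True on PyTorch 2.9+, False otherwise.
    major, minor = (int(part) for part in torch_version.split(".")[:2])
    return (major, minor) >= (2, 9)

print(default_dynamo("2.8.1"))   # False
print(default_dynamo("2.9.0"))   # True
print(default_dynamo("2.10.0"))  # True
```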

collect_inputs()

Create a manager that collects model inputs for export.

This is useful in fake mode, where shapes cannot be collected properly and real inputs are needed instead.

Returns

  • collect_inputs_manager (CollectInputsManager) – Collect inputs manager.
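
Examples

A usage sketch, assuming the returned manager is used as a context manager (the call inside the block is illustrative):

>>> with cmanager.collect_inputs():
...     model(example_input)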

enable(val=True)

Enable or disable compiled computation.

Parameters

  • val (bool) – Whether to enable or disable compiled computation.

remove()

Extract the original modules from the compiled modules.

set_axes_names(mappings)

Set axes names from a per-module mapping.

Parameters

  • mappings (Dict[type | str, Dict[str, str]]) – For each module, identified by name or type, a mapping from input name to axis name.

Examples

Mapping by module type:

{
    torch.nn.Linear: {
        "input_0": "batch_size"
    }
}

Mapping by module name:

{
    "decoder": {
        "sample_2": "width",
        "sample_3": "height",
    }
}
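
A lookup consistent with these mappings can be sketched in plain Python (a sketch: the name-before-type precedence and the stand-in FakeLinear class are assumptions for illustration, not qlip behavior):

```python
class FakeLinear:
    """Stand-in for a module class such as torch.nn.Linear (hypothetical)."""

def resolve_axes(mappings, module_name, module):
    """Pick the axes mapping for a module: by name first, then by type (sketch)."""
    if module_name in mappings:
        return mappings[module_name]
    for key, axes in mappings.items():
        if isinstance(key, type) and isinstance(module, key):
            return axes
    return {}

mappings = {
    FakeLinear: {"input_0": "batch_size"},
    "decoder": {"sample_2": "width", "sample_3": "height"},
}
print(resolve_axes(mappings, "decoder", object()))      # {'sample_2': 'width', 'sample_3': 'height'}
print(resolve_axes(mappings, "encoder", FakeLinear()))  # {'input_0': 'batch_size'}
```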
shape_profile(type='static', opt='mode', fake_mode=False)

Create a shape profile manager to control shape collection.

Parameters

  • type (str) – Profile type: create “static” or “dynamic” axes profiles from the collected shapes. Default is “static”.

  • opt (str) – Optimal shape for dynamic axes profile. Can be “mode”, “min” or “max”. Default is “mode”.

  • fake_mode (bool) – Whether to enable FakeTensorMode for shape inference without actual computation. Default is False.

Returns