NvidiaCompileManager
- class qlip.compiler.nvidia.NvidiaCompileManager(model, workspace, use_timing_cache=True, **kwargs)
Bases: CompileManager
Nvidia manager for compilation.
Parameters
model (torch.nn.Module) – Model to compile.
workspace (Path) – Workspace to use for compilation.
use_timing_cache (bool) – Whether to use a timing cache for compilation. Defaults to True.
Variables
backend (Backend) – Backend for compilation.
model (nn.Module) – Model for compilation.
workspace (Path) – Workspace for compilation.
modules (List[CompiledModule]) – Modules for compilation.
Examples
>>> cmanager = NvidiaCompileManager(model, workspace)
>>> cmanager.setup_modules(modules=["encoder", "decoder"])
>>> cmanager.compile()
- backend
alias of NvidiaBackend
- setup_modules(*, modules=None, module_types=None, builder_config=None, component=None, component_name=None, onnx_model=None, dtype=None, adapter_type='auto')
Set up modules for compilation.
Parameters
modules (Optional[Iterable[str]]) – Names of the modules to set up.
module_types (Optional[Iterable[type]]) – Types of the modules to set up.
builder_config (Optional[BuilderConfig]) – Builder configuration.
component (Optional[str]) – Component name. Mandatory for a DiffusionPipeline from diffusers, where it must be the actual component name. Optional for other models, where it can be the name of any submodule.
component_name (Optional[str]) – Custom display name for the component.
onnx_model (Optional[Path]) – ONNX model to use for compilation.
dtype (torch.dtype) – Data type of the compiled modules.
adapter_type (str) – Type of adapter to use. Possible values are 'auto', 'default', 'hf_adapter' and 'hf_unet_adapter'. Defaults to 'auto'.
Examples
>>> # The whole `text_encoder` component
>>> cmanager.setup_modules(component="text_encoder")
>>> # Submodules of type `FluxTransformerBlock` within the `transformer` component
>>> cmanager.setup_modules(module_types=["FluxTransformerBlock"], component="transformer")
>>> # Specific modules
>>> cmanager.setup_modules(modules=["encoder", "decoder"])
- setup_model(*, builder_config=None, onnx_model=None, dtype=None, adapter_type='auto')
Set up the whole model for compilation.
Parameters
builder_config (Optional[BuilderConfig]) – Builder configuration.
onnx_model (Optional[Path]) – ONNX model to use for compilation.
dtype (torch.dtype) – Data type of the compiled modules.
adapter_type (str) – Type of adapter to use. Possible values are 'auto', 'default', 'hf_adapter' and 'hf_unet_adapter'. Defaults to 'auto'.
- compile(device='cuda', original_device='meta', cpu_offload=False, recompile_existing=False, dump_onnx=False, keep_compiled=True, save_compiled=True, dynamo=None, fuse_weights_quantizers=None)
Compile modules.
Parameters
device (str) – Device on which to compile the modules.
original_device (str) – Original device of the model.
cpu_offload (bool) – Whether to offload the model to the CPU.
recompile_existing (bool) – Whether to recompile existing modules.
dump_onnx (bool) – Whether to dump the ONNX model.
keep_compiled (bool) – Whether to keep the compiled modules.
save_compiled (bool) – Whether to save the compiled modules.
dynamo (Optional[bool]) – Whether to use dynamo for ONNX export. Defaults to True for PyTorch 2.9 or higher and False otherwise.
fuse_weights_quantizers (bool) – Whether to fuse weights quantizers. Defaults to True for dynamo ONNX export.
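The version-dependent default for dynamo can be pictured with a small sketch. The helpers `default_dynamo` and `resolve_flags` below are hypothetical names made up for illustration and are not part of qlip. The sketch also assumes that fuse_weights_quantizers falls back to False when dynamo export is off, which the description above does not state.

```python
import re
from typing import Optional, Tuple


def default_dynamo(torch_version: str) -> bool:
    """Hypothetical helper: True when the PyTorch version is 2.9 or higher."""
    # Read only the leading "major.minor" so that local build suffixes
    # such as "2.9.0+cu128" do not get in the way.
    match = re.match(r"(\d+)\.(\d+)", torch_version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {torch_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # Compare as integers: as plain strings, "2.10" would sort before "2.9".
    return (major, minor) >= (2, 9)


def resolve_flags(
    torch_version: str,
    dynamo: Optional[bool] = None,
    fuse_weights_quantizers: Optional[bool] = None,
) -> Tuple[bool, bool]:
    """Hypothetical helper: replace None with the documented defaults."""
    if dynamo is None:
        dynamo = default_dynamo(torch_version)
    if fuse_weights_quantizers is None:
        # Documented: True for dynamo export. Assumed here: False otherwise.
        fuse_weights_quantizers = dynamo
    return dynamo, fuse_weights_quantizers
```

Passing an explicit True or False for either flag bypasses the version check, which matches the usual meaning of a None default.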
- collect_inputs()
Create a collect inputs manager to collect inputs for export.
This is useful in fake mode, where shapes cannot be collected properly but real inputs are still needed.
Returns
collect_inputs_manager (CollectInputsManager) – Collect inputs manager.
- enable(val=True)
Enable or disable compiled computation.
Parameters
val (bool) – True to enable compiled computation, False to disable it.
- remove()
Extract original modules from compiled modules.
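One way to picture enable() and remove() is a wrapper that holds both the original and the compiled callable. The sketch below is only a conceptual model: `ToggleWrapper` is a hypothetical class invented for illustration, and nothing here describes how qlip's CompiledModule is actually implemented.

```python
from typing import Any, Callable


class ToggleWrapper:
    """Hypothetical stand-in for a wrapper around a compiled module."""

    def __init__(self, original: Callable[..., Any], compiled: Callable[..., Any]):
        self.original = original
        self.compiled = compiled
        self.enabled = True  # compiled computation is on by default in this sketch

    def enable(self, val: bool = True) -> None:
        # Switch between the compiled and the original code path.
        self.enabled = bool(val)

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        fn = self.compiled if self.enabled else self.original
        return fn(*args, **kwargs)

    def remove(self) -> Callable[..., Any]:
        # Hand back the original callable so the wrapper can be discarded.
        return self.original


# Usage: the two lambdas stand in for an eager module and its compiled engine.
wrapper = ToggleWrapper(lambda x: ("eager", x), lambda x: ("compiled", x))
wrapper.enable(False)
result_when_disabled = wrapper(1)  # routed to the original callable
```

Under this model, disabling is reversible, while removal restores the original module and drops the wrapper altogether.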
- set_axes_names(mappings)
Set axes names from a mapping.
Parameters
mappings (Dict[type | str, Dict[str, str]]) – Axes-name mapping for each module, keyed by module name or module type.
Examples
Mapping by module type:
{
    torch.nn.Linear: {
        "input_0": "batch_size"
    }
}
Mapping by module name:
{
    "decoder": {
        "sample_2": "width",
        "sample_3": "height",
    }
}
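The following sketch shows how a mapping keyed by both types and names could be resolved for a single module. It is not qlip's lookup logic: `axes_names_for` is a hypothetical helper, the `Linear` and `Decoder` classes are plain stand-ins so the snippet runs without torch, and letting name entries override type entries is an assumption. The keys in the examples above look like an input name joined with an axis index (for instance, axis 2 of `sample`), but that reading is inferred from the examples rather than stated.

```python
from typing import Dict, Union


class Linear:  # stand-in for torch.nn.Linear
    pass


class Decoder:  # stand-in for a user-defined decoder module
    pass


Mappings = Dict[Union[type, str], Dict[str, str]]


def axes_names_for(name: str, module: object, mappings: Mappings) -> Dict[str, str]:
    """Hypothetical helper: collect the axes names that apply to one module."""
    resolved: Dict[str, str] = {}
    # Entries keyed by a type apply to every instance of that type.
    for key, axes in mappings.items():
        if isinstance(key, type) and isinstance(module, key):
            resolved.update(axes)
    # Entries keyed by name are applied last, so they win on conflicts (assumed).
    resolved.update(mappings.get(name, {}))
    return resolved


mappings: Mappings = {
    Linear: {"input_0": "batch_size"},
    "decoder": {"sample_2": "width", "sample_3": "height"},
}
decoder_axes = axes_names_for("decoder", Decoder(), mappings)
```

Keying by type is convenient when many layers share the same dynamic axis, while keying by name targets one specific submodule.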
- shape_profile(type='static', opt='mode', fake_mode=False)
Create a shape profile manager to control shape collection.
Parameters
type (str) – Whether to create static or dynamic axes profiles from the collected shapes. Defaults to 'static'.
opt (str) – Optimal shape for the dynamic axes profile. Can be "mode", "min" or "max". Defaults to "mode".
fake_mode (bool) – Whether to enable FakeTensorMode for shape inference without actual computation. Defaults to False.
Returns
shape_profile_manager (ShapeProfileManager) – Shape profile manager.
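To make the opt choices concrete, here is a rough sketch of how a dynamic axes profile (minimum, optimal and maximum shape) might be derived from a set of collected shapes. `dynamic_profile` is a hypothetical function, not qlip's implementation. In particular, taking "mode" as the most frequently observed whole shape, rather than a per-axis mode, is an assumption.

```python
from collections import Counter
from typing import Sequence, Tuple

Shape = Tuple[int, ...]


def dynamic_profile(shapes: Sequence[Shape], opt: str = "mode") -> Tuple[Shape, Shape, Shape]:
    """Hypothetical helper: return (min_shape, opt_shape, max_shape)."""
    if not shapes:
        raise ValueError("at least one collected shape is required")
    rank = len(shapes[0])
    if any(len(shape) != rank for shape in shapes):
        raise ValueError("all collected shapes must have the same rank")

    # Per-axis bounds over everything that was observed.
    min_shape = tuple(min(axis) for axis in zip(*shapes))
    max_shape = tuple(max(axis) for axis in zip(*shapes))

    if opt == "min":
        opt_shape = min_shape
    elif opt == "max":
        opt_shape = max_shape
    elif opt == "mode":
        # Most frequently observed complete shape (assumed interpretation).
        opt_shape = Counter(tuple(s) for s in shapes).most_common(1)[0][0]
    else:
        raise ValueError(f"opt must be 'mode', 'min' or 'max', got {opt!r}")
    return min_shape, opt_shape, max_shape


collected = [(1, 3, 512, 512), (4, 3, 512, 512), (1, 3, 1024, 768), (1, 3, 512, 512)]
profile = dynamic_profile(collected)
```

Note that the per-axis maximum can be a shape that was never observed as a whole, which is fine for a bound but is one reason "mode" is a natural default for the optimal shape.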