CompiledModule

class qlip.inference.module.CompiledModule(model, *, backend=None, session_config=None, adapter_type='auto')

Bases: Module

Compiled module for inference.

Parameters

  • model (nn.Module) – PyTorch model to compile.

  • backend (str | Backend) – Backend to use, given as a string or a Backend value. Defaults to None.

  • session_config (SessionConfig) – Configuration for the inference session. Defaults to None.

  • adapter_type (str) – Type of adapter to use. Possible values are 'auto', 'default', 'hf_adapter', and 'hf_unet_adapter'. Defaults to 'auto'.
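
A minimal sketch of constructing the module with keyword arguments, assuming the import path implied by the class signature above. The helper name build_compiled_module and the ADAPTER_TYPES tuple are hypothetical and not part of qlip; the early check only mirrors the adapter values listed in this section.

```python
# Adapter names copied from the parameter list in this section.
ADAPTER_TYPES = ("auto", "default", "hf_adapter", "hf_unet_adapter")


def build_compiled_module(model, adapter_type="auto", backend=None, session_config=None):
    """Hypothetical helper: validate adapter_type, then construct CompiledModule."""
    if adapter_type not in ADAPTER_TYPES:
        raise ValueError(
            f"adapter_type must be one of {ADAPTER_TYPES}, got {adapter_type!r}"
        )
    # Imported lazily so the helper can be defined without qlip installed.
    # The import path is assumed from the documented class signature.
    from qlip.inference.module import CompiledModule

    # Everything after `model` is keyword-only in the documented signature.
    return CompiledModule(
        model,
        backend=backend,
        session_config=session_config,
        adapter_type=adapter_type,
    )
```

Validating the adapter name before construction is only a convenience here; the library may perform its own checks.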

forward(*args, **kw)

Run a forward pass of the compiled module.

load(engine_path, device='cuda', original_device='meta', **kwargs)

Load an inference session from an engine file into the compiled module.

Parameters

  • engine_path (str | Path) – Path to the engine file.

  • device (str) – Device to load the session onto. Defaults to 'cuda'.

  • original_device (str) – Device to move the original model to. Defaults to 'meta'.

  • **kwargs (dict) – Additional backend-specific keyword arguments.

unload()

Unload the compiled engine.
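
The load and unload calls can be paired so that the engine is always released, even if inference raises. The following is a sketch only: loaded_engine is a hypothetical helper, not part of qlip, and it relies solely on the load(engine_path, device=..., **kwargs) and unload() signatures documented above.

```python
from contextlib import contextmanager


@contextmanager
def loaded_engine(compiled, engine_path, device="cuda", **load_kwargs):
    """Load an engine into `compiled`, yield it, and always unload afterwards."""
    compiled.load(engine_path, device=device, **load_kwargs)
    try:
        yield compiled
    finally:
        # Runs on normal exit and on exceptions raised inside the with-block.
        compiled.unload()


# Usage (the engine path and input are placeholders):
# with loaded_engine(compiled, "model.engine") as module:
#     output = module(example_input)
```

Calling the module directly is assumed to dispatch to forward, as it does for any torch.nn.Module subclass.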

set_inference_config(config)

Set the inference configuration used for lazy initialization. Call this before loading the engine.

Parameters

  • config (SessionConfig) – Configuration for inference session.
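
Because the configuration is meant to be set before the engine is loaded, the call order matters. A minimal sketch of that ordering follows; configure_then_load is a hypothetical helper, and config is assumed to be an already constructed SessionConfig (its constructor is not documented in this section).

```python
def configure_then_load(compiled, config, engine_path, device="cuda"):
    """Apply a session config first, then load the engine."""
    # Step 1: register the configuration while no engine is loaded yet.
    compiled.set_inference_config(config)
    # Step 2: load the engine; the configuration set above is expected to be
    # picked up at this point, per the description of set_inference_config.
    compiled.load(engine_path, device=device)
    return compiled
```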

property original_device: torch.device

Device of the original model.

extra_repr()

Return the extra information that torch.nn.Module includes when printing the model structure.