CompiledModule
- class qlip.inference.module.CompiledModule(model, *, backend=None, session_config=None, adapter_type='auto')
Bases: Module
Compiled module for inference.
Parameters
model (nn.Module) – PyTorch model to compile.
backend (str | Backend) – Backend.
session_config (SessionConfig) – Configuration for inference session.
adapter_type (str) – Type of adapter to use, by default 'auto'. Possible values are: 'auto', 'default', 'hf_adapter', 'hf_unet_adapter'.
- forward(*args, **kw)
Forward pass.
- load(engine_path, device='cuda', original_device='meta', **kwargs)
Load session into compiled module.
Parameters
engine_path (str | Path) – Path to engine file.
device (str) – Device to load session to.
original_device (str) – Move original model to this device.
**kwargs (dict) – Additional keyword arguments, backend specific.
- unload()
Unload compiled engine.
- set_inference_config(config)
Set inference configuration for lazy initialization before loading engine.
Parameters
config (SessionConfig) – Configuration for inference session.
- property original_device: torch.device
Device of the original model.
- extra_repr()
extra_repr is used by torch.nn.Module to print the model structure.