nip.language_model_server.types.VllmQuantization

nip.language_model_server.types.VllmQuantization

The quantization method to use for the vLLM server.

alias of Literal['bitsandbytes', 'none']
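
As a minimal sketch of how a `Literal` alias like this can be checked at runtime, the example below redefines the alias locally so that it runs on its own. The local definition is assumed to mirror the one documented above, and `parse_quantization` is a hypothetical helper, not part of `nip`.

```python
from typing import Literal, cast, get_args

# Local copy, assumed to mirror the documented alias; in real code you would
# import it from nip.language_model_server.types instead.
VllmQuantization = Literal["bitsandbytes", "none"]


def parse_quantization(value: str) -> VllmQuantization:
    """Hypothetical helper: validate a raw string against the alias."""
    allowed = get_args(VllmQuantization)  # the literal values, as a tuple
    if value not in allowed:
        raise ValueError(
            f"Unsupported quantization {value!r}; expected one of {allowed}"
        )
    # The membership check above justifies narrowing str to the Literal type.
    return cast(VllmQuantization, value)


if __name__ == "__main__":
    print(parse_quantization("bitsandbytes"))
```

Note that `'none'` here is the string literal, not Python's `None`. Static type checkers enforce the alias at analysis time, while a check like the one above is only needed for values that arrive as plain strings, for example from a config file or command line.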