nip.language_model_server.client.LanguageModelClient#
- class nip.language_model_server.client.LanguageModelClient(server_url: str = 'http://localhost:5000')[source]#
- A client for the language model server. - It provides methods for controlling the vLLM server and for running language model training tasks. - Parameters:
- server_url (str, default="http://localhost:5000") – The URL of the language model server. This should include the protocol (http or https) and the port number if applicable. 
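The class is typically used end to end: construct a client, check version compatibility, then drive the vLLM server. A minimal sketch of that sequence, assuming the `nip` package is installed and a server is reachable on the default URL (the model name is a placeholder):

```python
import asyncio

async def run_session() -> None:
    # Imported inside the coroutine so this sketch stays importable even
    # where the package is not installed.
    from nip.language_model_server.client import LanguageModelClient

    client = LanguageModelClient(server_url="http://localhost:5000")

    # Raise early on a major version mismatch; warn on minor/patch drift.
    await client.validate_server_version()

    # Bring up the model server and block until it answers requests.
    await client.start_vllm_server("my-org/my-model")  # placeholder name
    await client.wait_for_vllm_server(timeout=900)

    # ... create and monitor training jobs here ...

    await client.stop_vllm_server()

# With a live server: asyncio.run(run_session())
```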
- Methods Summary
- __init__([server_url])
- cancel_training_job(job_id) – Cancel a training job by its ID.
- check_server_version() – Check the server version against the client version.
- create_training_job(training_config, dataset) – Create a new training job with the specified configuration.
- get_server_version() – Get the version of the language model server.
- get_training_job(job_id) – Get the details of a specific training job by its ID.
- get_training_jobs() – Get the list of training jobs currently managed by the server.
- get_vllm_server_status() – Get the current status of the vLLM language model server.
- lm_server_accepting_connections() – Check if the language model server is accepting connections.
- start_vllm_server(model_name[, quantization]) – Start the vLLM language model server with the specified model.
- stop_vllm_server([ignore_not_running, timeout]) – Stop the vLLM language model server.
- validate_server_version() – Validate the server version against the client version.
- wait_for_lm_server_to_accept_connections([timeout]) – Wait for the language model server to start accepting connections.
- wait_for_vllm_server([timeout]) – Wait for the vLLM server to be online.
- Methods
- async cancel_training_job(job_id: str)[source]#
- Cancel a training job by its ID. - Parameters:
- job_id (str) – The ID of the training job to cancel. 
- Raises:
- HTTPStatusError – If the server returns an error status code while cancelling the training job. 
 
 - async check_server_version() Literal['ok', 'major', 'minor', 'patch'][source]#
- Check the server version against the client version. - Returns:
- status (str) – A string indicating the status of the server version: - "ok" if the server version matches the client version. 
- "major" if the server version differs by a major version. 
- "minor" if the server version differs by a minor version. 
- "patch" if the server version differs by a patch version. 
 
- Raises:
- BadResponseError – If the server returns an invalid response or if the response does not contain the expected ‘version’ field. 
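The four status strings correspond to the first semver component at which the two versions diverge. The classification can be sketched as follows (an illustrative reimplementation for clarity, not the client's actual code):

```python
from typing import Literal

def classify_version_diff(
    server: str, client: str
) -> Literal["ok", "major", "minor", "patch"]:
    """Compare two 'major.minor.patch' strings component by component,
    returning the first level at which they differ."""
    server_parts = [int(p) for p in server.split(".")]
    client_parts = [int(p) for p in client.split(".")]
    for level, (s, c) in zip(("major", "minor", "patch"),
                             zip(server_parts, client_parts)):
        if s != c:
            return level
    return "ok"

print(classify_version_diff("1.2.3", "1.2.3"))  # ok
print(classify_version_diff("2.0.0", "1.9.9"))  # major
print(classify_version_diff("1.3.0", "1.2.9"))  # minor
```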
 
 - async create_training_job(training_config: LmTrainingConfig, dataset: list[DpoDatasetItem], job_name: str | None = None) TrainingJobInfo[source]#
- Create a new training job with the specified configuration. - Parameters:
- training_config (LmTrainingConfig) – The configuration for the training job, including model name and training parameters. 
- dataset (list[DpoDatasetItem]) – The dataset to be used for training. This should be a list of dictionaries where each dictionary represents a single data point in the dataset. 
- job_name (Optional[str], default=None) – An optional name for the job, to make it more recognizable. 
 
- Returns:
- training_job (TrainingJobInfo) – An object containing the details of the created training job, including its ID, status, and configuration. 
- Raises:
- HTTPStatusError – If the server returns an error status code while creating the training job. 
- BadResponseError – If the server returns an invalid response or if the response does not contain the expected data. 
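Submitting a job means pairing a training config with a list of dataset items. The field names below (`prompt`, `chosen`, `rejected`) follow the usual DPO layout and are an assumption of this sketch; `DpoDatasetItem` defines the authoritative schema:

```python
import asyncio

# Hypothetical item layout: the usual DPO triple of a prompt with a
# preferred and a rejected completion. Consult DpoDatasetItem for the
# real field names.
dataset = [
    {
        "prompt": "What is the capital of France?",
        "chosen": "The capital of France is Paris.",
        "rejected": "I am not sure.",
    },
]

async def submit_job(training_config):
    # Imported lazily so the sketch stays importable without the package.
    from nip.language_model_server.client import LanguageModelClient

    client = LanguageModelClient()
    # Returns a TrainingJobInfo with the new job's ID, status and config.
    return await client.create_training_job(
        training_config, dataset, job_name="dpo-demo-run"
    )
```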
 
 
 - async get_server_version() str[source]#
- Get the version of the language model server. - Returns:
- version (str) – The version of the language model server, as a string. 
- Raises:
- BadResponseError – If the server returns an invalid response or if the response does not contain the expected ‘version’ field. 
 
 - async get_training_job(job_id: str) TrainingJobInfo[source]#
- Get the details of a specific training job by its ID. - Parameters:
- job_id (str) – The ID of the training job to retrieve. 
- Returns:
- training_job (TrainingJobInfo) – An object containing the details of the training job, including its ID, status, and configuration. 
- Raises:
- TrainingJobNotFoundClientError – If the training job with the specified ID does not exist on the server. 
- HTTPStatusError – If the server returns an error status code while retrieving the training job. 
- BadResponseError – If the server returns an invalid response or if the response does not contain the expected data. 
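A common pattern is to poll a job by ID until it leaves a running state. A sketch, where `client` is a connected `LanguageModelClient` and the terminal status names are illustrative rather than the package's actual values:

```python
import asyncio

async def wait_for_job(client, job_id: str, poll_interval: float = 10.0):
    """Poll get_training_job until the job reaches a terminal state.

    The terminal status names below are assumptions of this sketch;
    TrainingJobInfo defines the real status values.
    """
    while True:
        job = await client.get_training_job(job_id)
        if job.status in ("succeeded", "failed", "cancelled"):
            return job
        await asyncio.sleep(poll_interval)
```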
 
 
 - async get_training_jobs() list[TrainingJobInfo][source]#
- Get the list of training jobs currently managed by the server. - Returns:
- training_jobs (list[TrainingJobInfo]) – A list of TrainingJobInfo objects, each containing information about a training job, including its ID, status, and configuration.
- Raises:
- HTTPStatusError – If the server returns an error status code while retrieving the list of training jobs. 
- BadResponseError – If the server returns an invalid response or if the response does not contain the expected data. 
 
 
 - async get_vllm_server_status() Literal['online', 'not_started', 'crashed', 'not_accepting_connections', 'timeout', 'server_error', 'other_error'][source]#
- Get the current status of the vLLM language model server. - Returns:
- vllm_server_status (ServerStatus) – The current status of the vLLM server. See the documentation for ServerStatus for possible values.
- Raises:
- BadResponseError – If the server returns an invalid response or if the response does not contain the expected ‘status’ field, or if the status is not a valid ServerStatus. 
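Because the status is a closed set of strings, callers can dispatch on it directly. An illustrative handler (the recovery policy here is this sketch's own choice, not part of the API):

```python
VLLM_STATUSES = (
    "online", "not_started", "crashed", "not_accepting_connections",
    "timeout", "server_error", "other_error",
)

def needs_restart(status: str) -> bool:
    """Decide whether a status warrants (re)starting the vLLM server.

    Treats everything other than 'online' and
    'not_accepting_connections' (which may just mean the server is
    still warming up) as restartable. This policy is an example only.
    """
    if status not in VLLM_STATUSES:
        raise ValueError(f"unknown vLLM server status: {status!r}")
    return status not in ("online", "not_accepting_connections")

print(needs_restart("crashed"))  # True
print(needs_restart("online"))   # False
```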
 
 - async lm_server_accepting_connections() bool[source]#
- Check if the language model server is accepting connections. - This method will attempt to make a request to the server’s version endpoint. If the server is online and responds successfully, it returns True. If the server is not online or if there is a connection error, it returns False. - Returns:
- accepting_connections (bool) – True if the language model server is accepting connections, False otherwise. 
 
 - async start_vllm_server(model_name: str, quantization: Literal['bitsandbytes', 'none'] = 'none') str[source]#
- Start the vLLM language model server with the specified model. - Parameters:
- model_name (str) – The name of the model to be served by vLLM. This should match a model that is available in the vLLM installation. 
- quantization (VllmQuantization, default="none") – The quantization method to use for the model. 
 
- Returns:
- success_message (str) – A message indicating that the vLLM server has been started successfully, or was already running. 
- Raises:
- BadResponseError – If the server returns an invalid response or if the response does not contain the expected data. 
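Starting a quantized model and waiting for readiness might look like the sketch below; the model name is a placeholder, and `bitsandbytes` is the only non-default quantization option listed in the signature:

```python
import asyncio

async def launch_quantized(model_name: str) -> None:
    # Lazy import so the sketch does not require the package at import time.
    from nip.language_model_server.client import LanguageModelClient

    client = LanguageModelClient()
    message = await client.start_vllm_server(
        model_name, quantization="bitsandbytes"
    )
    print(message)
    # Block until the server reports itself online before sending requests.
    await client.wait_for_vllm_server(timeout=900)
```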
 
 - async stop_vllm_server(ignore_not_running: bool = False, timeout: float = 15.0)[source]#
- Stop the vLLM language model server. - Parameters:
- ignore_not_running (bool, default=False) – If True, the server will not raise an error if it is not running. Instead, it will log a warning and return a success message indicating that the server was not running and is being ignored. 
- timeout (float, default=15.0) – The maximum time to wait for the vLLM server to stop, in seconds. If the server does not stop within this time, a timeout error will be raised. The server will attempt to terminate gracefully for max(timeout - 5.0, 1.0) seconds, after which it will be forcefully killed if it is still running. 
 
- Raises:
- HTTPStatusError – If the server returns an error status code while stopping the vLLM server. 
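The graceful-shutdown window is derived from `timeout` exactly as described above: the server gets `max(timeout - 5.0, 1.0)` seconds to terminate cleanly before it is killed. The arithmetic:

```python
def graceful_window(timeout: float) -> float:
    """Seconds of graceful termination before a forceful kill,
    per the stop_vllm_server documentation."""
    return max(timeout - 5.0, 1.0)

print(graceful_window(15.0))  # 10.0 — the default timeout
print(graceful_window(3.0))   # 1.0 — floor for very short timeouts
```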
 
 - async validate_server_version()[source]#
- Validate the server version against the client version. - This method checks if the server version matches the client version. If they differ by a major version, it raises a RuntimeError. If they differ by a minor or patch version, it issues a warning. - Raises:
- RuntimeError – If the server version differs from the client version by a major version. 
- UserWarning – If the server version differs from the client version by a minor or patch version. 
 
 
 - async wait_for_lm_server_to_accept_connections(timeout: float = 300)[source]#
- Wait for the language model server to start accepting connections. - This method will repeatedly check if the server is online by making a request to the server’s version endpoint. If the server is not online, it will raise a ClientTimeoutError after the specified timeout period. - Parameters:
- timeout (float, default=300) – The maximum time to wait for the language model server to start accepting connections, in seconds. 
- Raises:
- ClientTimeoutError – If the language model server does not become online within the specified timeout. 
 
 - async wait_for_vllm_server(timeout: float = 900)[source]#
- Wait for the vLLM server to be online. - Parameters:
- timeout (float, default=900) – The maximum time to wait for the vLLM server to be online, in seconds. 
- Raises:
- ClientTimeoutError – If the vLLM server does not become online within the specified timeout.