nip.language_model_server.server

A server for controlling vLLM and running language model training.

The server is a FastAPI application with the following endpoints:

  • /version: Returns the nip package version.

  • /vllm/start: Starts a vLLM server with the specified model.

  • /vllm/stop: Stops the vLLM server.

  • /vllm/status: Returns the status of the vLLM server.

  • /training/jobs: Create or list fine-tuning jobs.

  • /training/jobs/{job_id}: Get info about a fine-tuning job or cancel it.

Example

>>> from fastapi import FastAPI
>>> import uvicorn
>>> from nip.language_model_server.server import LanguageModelServer
>>> app = FastAPI()
>>> async with LanguageModelServer(app, vllm_port=8000):
...     uvicorn.run(app, port=8080)

Functions

_raise_server_error_as_http_exception(error)

Raise an HTTPException based on a LanguageModelServerError.

cancel_training_job(job_id)

Cancel a training job.

create_training_job(request)

Create a new training job.

get_package_version()

Get the version of the language model server.

get_training_job(job_id)

Get info about a training job.

get_training_jobs()

List all training jobs managed by the server.

get_vllm_server_status()

Get the status of the vLLM server.

lifespan(app)

Lifespan context manager for the FastAPI application.

start_vllm_server(request)

Start the vLLM server with the specified model.

stop_vllm_server(request)

Stop the vLLM server.