nip.language_model_server.server

A server for controlling vLLM and running language model training.

The server is a FastAPI application with the following endpoints:

  • /version: Returns the nip package version.

  • /vllm/start: Starts a vLLM server with the specified model.

  • /vllm/stop: Stops the vLLM server.

  • /vllm/status: Returns the status of the vLLM server.

  • /training/jobs: Create or list fine-tuning jobs.

  • /training/jobs/{job_id}: Get info about a fine-tuning job or cancel it.

Example

>>> from fastapi import FastAPI
>>> import uvicorn
>>> from nip.language_model_server.server import LanguageModelServer
>>> app = FastAPI()
>>> async with LanguageModelServer(app, vllm_port=8000):
...     uvicorn.run(app, port=8080)

Functions

_raise_server_error_as_http_exception(error)

Raise an HTTPException based on a LanguageModelServerError.

cancel_training_job(job_id)

Cancel a training job.

create_training_job(request)

Create a new training job.

get_package_version()

Get the version of the language model server.

get_training_job(job_id)

Get info about a training job.

get_training_jobs()

List all training jobs managed by the server.

get_vllm_server_status()

Get the status of the vLLM server.

lifespan(app)

Lifespan context manager for the FastAPI application.

start_vllm_server(request)

Start the vLLM server with the specified model.

stop_vllm_server(request)

Stop the vLLM server.