nip.code_validation.agents.NonFinetunableSharedModelGroup#

class nip.code_validation.agents.NonFinetunableSharedModelGroup(hyper_params: HyperParameters, settings: ExperimentSettings, protocol_handler: ProtocolHandler, agent_wholes: Iterable[PureTextWholeAgent], group_name: str)[source]#

A group of code validation agents sharing a model which can’t be fine-tuned.

This class is used for models accessed through an API which does not allow fine-tuning.

Methods Summary

__init__(hyper_params, settings, ...)

_raise_not_implemented_error()

Raise a NotImplementedError because fine-tuning is not supported.

agent_ids_and_names()

Get an iterable of agent IDs and names.

create_dpo_fine_tune_job(*args, **kwargs)

Create a DPO fine-tune job for the agent group given sampled timesteps.

create_supervised_fine_tune_job(*args, **kwargs)

Create a supervised fine-tune job for the agent.

eval()

Set the agent group to evaluation mode.

fine_tune_job_failed()

Check if the fine-tune job has failed.

get_fine_tune_job_error_repr()

Get a string representation of the error for the fine-tune job.

get_fine_tune_job_status()

Get the status of the fine-tune job.

get_state()

Get the state of the shared model group.

get_state_dict()

Get the state of the shared model group as a dict.

set_state(checkpoint)

Set the state of the shared model group from a checkpoint.

switch_to_next_model()

Switch to the next model after fine-tuning.

train()

Set the agent group to training mode.

wait_for_ready([timeout])

Wait for the agent group to be ready.

Attributes

is_trainable

lora_alpha

The computed LoRA alpha value for the group.

max_message_rounds

The maximum number of message rounds in the protocol.

model_name

The current model name, which may be the base model or a fine-tuned model.

num_epochs

The number of epochs to train the model for.

rl_learning_rate

The learning rate for this group when using reinforcement learning.

Methods

__init__(hyper_params: HyperParameters, settings: ExperimentSettings, protocol_handler: ProtocolHandler, agent_wholes: Iterable[PureTextWholeAgent], group_name: str)[source]#
_raise_not_implemented_error()[source]#

Raise a NotImplementedError because fine-tuning is not supported.

This helper is called by the fine-tuning entry points, which are not supported for non-fine-tunable models.
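A minimal sketch of this guard pattern (class and message text are illustrative assumptions, not the library's actual implementation): every fine-tuning entry point delegates to a single helper that raises the error.

```python
# Hypothetical sketch: fine-tuning entry points delegate to one helper
# that raises NotImplementedError, so the unsupported-operation message
# lives in a single place.
class NonFinetunableSketch:
    def _raise_not_implemented_error(self):
        raise NotImplementedError(
            "Fine-tuning is not supported for models accessed through this API."
        )

    async def create_supervised_fine_tune_job(self, *args, **kwargs):
        self._raise_not_implemented_error()

    async def create_dpo_fine_tune_job(self, *args, **kwargs):
        self._raise_not_implemented_error()
```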

agent_ids_and_names() → Iterable[tuple[int, str]][source]#

Get an iterable of agent IDs and names.

Yields:
  • agent_id (int) – The ID of the agent.

  • agent_name (str) – The name of the agent.
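A stand-in sketch of the documented yield shape (the generator body is an assumption; only the `(agent_id, agent_name)` pair structure comes from the docs above):

```python
from typing import Iterable, Tuple

# Illustrative stand-in, not the library's code: yields
# (agent_id, agent_name) pairs as agent_ids_and_names is documented to.
def agent_ids_and_names(agent_names: list) -> Iterable[Tuple[int, str]]:
    for agent_id, agent_name in enumerate(agent_names):
        yield agent_id, agent_name
```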

async create_dpo_fine_tune_job(*args, **kwargs)[source]#

Create a DPO fine-tune job for the agent group given sampled timesteps.

This method is not supported for non-fine-tunable models.

async create_supervised_fine_tune_job(*args, **kwargs)[source]#

Create a supervised fine-tune job for the agent.

This method is not supported for non-fine-tunable models.

async eval()[source]#

Set the agent group to evaluation mode.

This method may be overridden by subclasses if anything needs to be done when the agent group is set to evaluation mode.

async fine_tune_job_failed() → bool[source]#

Check if the fine-tune job has failed.

Returns:

failed (bool) – True if the fine-tune job has failed, False otherwise.

async get_fine_tune_job_error_repr() → str[source]#

Get a string representation of the error for the fine-tune job.

This method is not supported for non-fine-tunable models.

async get_fine_tune_job_status() → Literal['pending', 'running', 'succeeded', 'failed', 'cancelled', 'not_found'][source]#

Get the status of the fine-tune job.

This method is not supported for non-fine-tunable models.
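The status literal above suggests how a failure check like fine_tune_job_failed could relate to the job status; the sketch below assumes that relationship and is not the library's implementation:

```python
from typing import Literal

# The status values documented for get_fine_tune_job_status.
FineTuneJobStatus = Literal[
    "pending", "running", "succeeded", "failed", "cancelled", "not_found"
]

# Assumed relationship (illustrative only): a job has failed exactly
# when its reported status is "failed".
def job_failed(status: str) -> bool:
    return status == "failed"
```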

get_state() → PureTextSharedModelGroupState[source]#

Get the state of the shared model group.

get_state_dict() → dict[source]#

Get the state of the shared model group as a dict.

This method returns an empty dictionary because the model cannot be fine-tuned and therefore has no state to save.

Returns:

state_dict (dict) – The state of the shared model group.
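A sketch of why an empty dict is a sensible return value here (class and helper names are illustrative assumptions): when checkpoints aggregate state per group, a group with no fine-tuned weights simply contributes nothing.

```python
# Hypothetical sketch: a non-fine-tunable group holds no trainable
# parameters or job state, so its state dict is empty.
class NonFinetunableGroupSketch:
    def get_state_dict(self) -> dict:
        # Nothing to persist: no fine-tuned weights, no fine-tune jobs.
        return {}

# Illustrative checkpoint aggregation over named model groups.
def build_checkpoint(groups: dict) -> dict:
    return {name: group.get_state_dict() for name, group in groups.items()}
```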

set_state(checkpoint: PureTextSharedModelGroupState)[source]#

Set the state of the shared model group from a checkpoint.

This method should be overridden by subclasses to restore the state of the shared model group from a checkpoint.

Parameters:

checkpoint (PureTextSharedModelGroupState) – The checkpoint to restore the state from.

async switch_to_next_model()[source]#

Switch to the next model after fine-tuning.

This method is not supported for non-fine-tunable models.

async train()[source]#

Set the agent group to training mode.

This method may be overridden by subclasses if anything needs to be done when the agent group is set to training mode.

async wait_for_ready(timeout: float = 300.0)[source]#

Wait for the agent group to be ready.

Parameters:

timeout (float, default=300.0) – The maximum time to wait for the agent group to be ready, in seconds.

Raises:

TimeoutError – If the agent group is not ready within the timeout period.
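A common shape for this kind of readiness wait, sketched under the assumption of an asyncio-based implementation (the polling helper and its parameters are illustrative, not the library's code):

```python
import asyncio

# Illustrative sketch: poll a readiness check until it passes or the
# timeout elapses, raising TimeoutError in the latter case, matching
# the documented behaviour of wait_for_ready.
async def wait_for_ready(is_ready, timeout: float = 300.0,
                         poll_interval: float = 0.01) -> None:
    async def _poll():
        while not is_ready():
            await asyncio.sleep(poll_interval)

    try:
        await asyncio.wait_for(_poll(), timeout=timeout)
    except asyncio.TimeoutError:
        raise TimeoutError(
            "Agent group was not ready within the timeout period."
        )
```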