nip.code_validation.agents.NonFinetunableSharedModelGroup#
- class nip.code_validation.agents.NonFinetunableSharedModelGroup(hyper_params: HyperParameters, settings: ExperimentSettings, protocol_handler: ProtocolHandler, agent_wholes: Iterable[PureTextWholeAgent], group_name: str)[source]#
A group of code validation agents sharing a model which can’t be fine-tuned.
This class is used for models accessed through an API which does not allow fine-tuning.
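The pattern this class implements can be sketched in a few lines: the fine-tuning entry points exist so the group is interchangeable with trainable groups, but they all delegate to a helper that raises `NotImplementedError`, and the checkpoint state is empty because there are no fine-tuned weights to save. The sketch below is illustrative only, under those assumptions; the class name `NonFinetunableGroupSketch` and its members are hypothetical stand-ins, not the real `nip` implementation.

```python
class NonFinetunableGroupSketch:
    """Illustrative stand-in for a shared model group that cannot be fine-tuned."""

    # API-only models cannot be trained, so the group is never trainable.
    is_trainable = False

    def _raise_not_implemented_error(self):
        raise NotImplementedError(
            "Fine-tuning is not supported for models accessed through an API."
        )

    async def create_supervised_fine_tune_job(self, *args, **kwargs):
        # Present for interface compatibility; always raises.
        self._raise_not_implemented_error()

    async def create_dpo_fine_tune_job(self, *args, **kwargs):
        # Present for interface compatibility; always raises.
        self._raise_not_implemented_error()

    def get_state_dict(self) -> dict:
        # Nothing to checkpoint: the model has no fine-tuned state.
        return {}
```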
Methods Summary

__init__(hyper_params, settings, ...)
_raise_not_implemented_error() – Raise a NotImplementedError because fine-tuning is not supported.
agent_ids_and_names() – Get an iterable of agent IDs and names.
create_dpo_fine_tune_job(*args, **kwargs) – Create a DPO fine-tune job for the agent group given sampled timesteps.
create_supervised_fine_tune_job(*args, **kwargs) – Create a supervised fine-tune job for the agent.
eval() – Set the agent group to evaluation mode.
fine_tune_job_failed() – Check if the fine-tune job has failed.
get_fine_tune_job_error_repr() – Get a string representation of the error for the fine-tune job.
get_fine_tune_job_status() – Get the status of the fine-tune job.
get_state() – Get the state of the shared model group.
get_state_dict() – Get the state of the shared model group as a dict.
set_state(checkpoint) – Set the state of the shared model group from a checkpoint.
switch_to_next_model() – Switch to the next model after fine-tuning.
train() – Set the agent group to training mode.
wait_for_ready([timeout]) – Wait for the agent group to be ready.
Attributes

is_trainable
lora_alpha – The computed LoRA alpha value for the group.
max_message_rounds – The maximum number of message rounds in the protocol.
model_name – The current model name, which may be the base model or a fine-tuned model.
num_epochs – The number of epochs to train the model for.
rl_learning_rate – The learning rate for this group when using reinforcement learning.
Methods
- __init__(hyper_params: HyperParameters, settings: ExperimentSettings, protocol_handler: ProtocolHandler, agent_wholes: Iterable[PureTextWholeAgent], group_name: str)[source]#
- _raise_not_implemented_error()[source]#
Raise a NotImplementedError because fine-tuning is not supported.
This helper is called by the methods that are not supported for non-fine-tunable models.
- agent_ids_and_names() → Iterable[tuple[int, str]][source]#
Get an iterable of agent IDs and names.
- Yields:
agent_id (int) – The ID of the agent.
agent_name (str) – The name of the agent.
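The `(agent_id, agent_name)` iteration pattern above can be sketched with a simple generator. This is a hypothetical illustration, assuming agent IDs are positional indices within the group; `AgentStub` and `GroupSketch` are invented names, not part of the `nip` API.

```python
from typing import Iterable


class AgentStub:
    """Minimal stand-in for an agent with a name."""

    def __init__(self, name: str):
        self.agent_name = name


class GroupSketch:
    """Minimal stand-in for an agent group."""

    def __init__(self, agents: Iterable[AgentStub]):
        self.agents = list(agents)

    def agent_ids_and_names(self) -> Iterable[tuple[int, str]]:
        # Assumption: the ID is the agent's position in the group.
        for agent_id, agent in enumerate(self.agents):
            yield agent_id, agent.agent_name


group = GroupSketch([AgentStub("prover"), AgentStub("verifier")])
```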
- async create_dpo_fine_tune_job(*args, **kwargs)[source]#
Create a DPO fine-tune job for the agent group given sampled timesteps.
This method is not supported for non-fine-tunable models.
- async create_supervised_fine_tune_job(*args, **kwargs)[source]#
Create a supervised fine-tune job for the agent.
This method is not supported for non-fine-tunable models.
- async eval()[source]#
Set the agent group to evaluation mode.
This method may be overridden by subclasses if anything needs to be done when the agent group is set to evaluation mode.
- async fine_tune_job_failed() → bool[source]#
Check if the fine-tune job has failed.
- Returns:
failed (bool) – True if the fine-tune job has failed, False otherwise.
- async get_fine_tune_job_error_repr() → str[source]#
Get a string representation of the error for the fine-tune job.
This method is not supported for non-fine-tunable models.
- async get_fine_tune_job_status() → Literal['pending', 'running', 'succeeded', 'failed', 'cancelled', 'not_found'][source]#
Get the status of the fine-tune job.
This method is not supported for non-fine-tunable models.
- get_state() → PureTextSharedModelGroupState[source]#
Get the state of the shared model group.
- get_state_dict() → dict[source]#
Get the state of the shared model group as a dict.
This method returns an empty dictionary because the model cannot be fine-tuned and therefore has no state.
- Returns:
state_dict (dict) – The state of the shared model group.
- set_state(checkpoint: PureTextSharedModelGroupState)[source]#
Set the state of the shared model group from a checkpoint.
This method should be overridden by subclasses to restore the state of the shared model group from a checkpoint.
- Parameters:
checkpoint (PureTextSharedModelGroupState) – The checkpoint to restore the state from.
- async switch_to_next_model()[source]#
Switch to the next model after fine-tuning.
This method is not supported for non-fine-tunable models.
- async train()[source]#
Set the agent group to training mode.
This method may be overridden by subclasses if anything needs to be done when the agent group is set to training mode.
- async wait_for_ready(timeout: float = 300.0)[source]#
Wait for the agent group to be ready.
- Parameters:
timeout (float, default=300.0) – The maximum time to wait for the agent group to be ready, in seconds.
- Raises:
TimeoutError – If the agent group is not ready within the timeout period.
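A wait-with-timeout method like `wait_for_ready` can be sketched with `asyncio.wait_for`: poll a readiness condition and convert the cancellation into a `TimeoutError`. This is a minimal sketch, not the actual implementation; the `_is_ready` flag and the poll interval are assumptions made for illustration.

```python
import asyncio


class ReadySketch:
    """Illustrative stand-in for an agent group with a readiness check."""

    def __init__(self):
        # Assumption: readiness is tracked by a simple boolean flag.
        self._is_ready = False

    async def wait_for_ready(self, timeout: float = 300.0):
        async def _poll():
            # Poll until the group reports ready.
            while not self._is_ready:
                await asyncio.sleep(0.01)

        try:
            await asyncio.wait_for(_poll(), timeout=timeout)
        except asyncio.TimeoutError:
            # Re-raise as the builtin TimeoutError, matching the documented
            # behaviour (on Python 3.11+ the two are the same class).
            raise TimeoutError("Agent group not ready within timeout")
```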