nip.code_validation.dataset_generation
Module for generating datasets used in code validation systems.
A code validation dataset is generated by taking solutions from the APPS dataset and using language models to modify them into buggy solutions.
A CodeValidationDatasetConfig class is provided to configure the generation of buggy solutions for a dataset of problems. The generate_and_save_cv_dataset function is used to generate buggy solutions for a given dataset of problems and save the combined dataset to disk.
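The workflow described above (pick a fraction of correct solutions, modify them, keep only modifications that really change behaviour) can be sketched as follows. This is a minimal illustration, not the module's implementation: SketchConfig, run_solution, is_buggy, toy_mutator and generate_buggy are hypothetical names, the row layout is assumed, and a simple string replacement stands in for the language model.

```python
import random
from dataclasses import dataclass


@dataclass
class SketchConfig:
    """Hypothetical stand-in for a generation config (not the real class)."""

    fraction_to_modify: float = 0.5
    seed: int = 0


def run_solution(source: str, text: str) -> str:
    """Execute source that defines solve(text) -> str and return its output."""
    namespace: dict = {}
    exec(source, namespace)
    return namespace["solve"](text)


def is_buggy(correct_src: str, candidate_src: str, inputs: list[str]) -> bool:
    """A candidate is buggy if it crashes or disagrees with the correct solution."""
    for text in inputs:
        expected = run_solution(correct_src, text)
        try:
            actual = run_solution(candidate_src, text)
        except Exception:
            return True
        if actual != expected:
            return True
    return False


def toy_mutator(source: str) -> str:
    """Stand-in for the language model: drop the first number from the sum."""
    return source.replace("sum(nums)", "sum(nums[1:])")


def generate_buggy(solutions, inputs, config, mutate=toy_mutator):
    """Modify a fraction of the solutions and keep the verified buggy ones."""
    rng = random.Random(config.seed)
    count = max(1, round(len(solutions) * config.fraction_to_modify))
    rows = []
    for source in rng.sample(solutions, count):
        candidate = mutate(source)
        if is_buggy(source, candidate, inputs):
            rows.append({"solution": candidate, "is_buggy": True})
    return rows


SAMPLE_SOLUTION = (
    "def solve(text):\n"
    "    nums = [int(x) for x in text.split()]\n"
    "    return str(sum(nums))\n"
)
sample_rows = generate_buggy([SAMPLE_SOLUTION], ["1 2 3", "4 5"], SketchConfig())
```

Checking each modified solution against the original on concrete inputs guards against modifications that leave the behaviour unchanged, which would otherwise be mislabelled as buggy.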
Functions
Create an empty code validation dataset with the required columns.
Extract the modified solution and problematic inputs from the model output.
Generate buggy solutions for a given datum and append them to the result list.
Get completions from the OpenAI API for a chat model.
Send a POST request to the OpenRouter API to get responses from a chat model.
Load an existing code validation dataset or create an empty one.
Test a buggy solution against a correct solution using provided inputs and datum.
Generate buggy solutions by modifying a fraction of the provided solutions.
Generate a code validation dataset and save it to disk.
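For the OpenRouter entry in the list above, a POST request to a chat model can be sketched with the standard library alone. This is an illustrative sketch, not the module's code: build_chat_request and send_chat_request are hypothetical names, the model identifier is a placeholder supplied by the caller, and the endpoint and payload follow OpenRouter's OpenAI-compatible chat completions format.

```python
import json
import urllib.request

# OpenRouter's chat completions endpoint (OpenAI-compatible request format).
OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"


def build_chat_request(api_key: str, model: str, messages: list[dict]):
    """Build (but do not send) a POST request for a chat completion."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        OPENROUTER_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat_request(request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating request construction from sending keeps the payload easy to inspect and test without network access or a real API key.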
Classes
CodeValidationDatasetConfig: A configuration class for generating datasets used in code validation systems.