nip.utils.plotting.decision_spectrum.get_thresholded_performance

nip.utils.plotting.decision_spectrum.get_thresholded_performance(rollouts: NestedArrayDict, hyper_params: HyperParameters) → DataFrame

Compute the performance of the verifier at different thresholds.

When the verifier outputs a decision on a scale, we can threshold it to get a binary decision at different levels. This function computes the performance of the verifier at different thresholds.

Parameters:
  • rollouts (NestedArrayDict) – The rollouts to be analysed. Each rollout is a NestedArrayDict containing the verifier decisions.

  • hyper_params (HyperParameters) – The hyperparameters of the experiment. This is used to determine the decision scale used by the verifier.

Returns:

performance (pd.DataFrame) – The performance of the verifier at different thresholds. This is a pandas DataFrame with the following columns:

  • "threshold_text": The text value of the threshold used to compute the performance.

  • "threshold_float": The threshold used to compute the performance as a float between -1 and 1.

  • "accuracy": The accuracy of the verifier at this threshold.

  • "true_positive_rate": The true positive rate at this threshold.

  • "false_positive_rate": The false positive rate at this threshold.

  • "true_negative_rate": The true negative rate at this threshold.

  • "false_negative_rate": The false negative rate at this threshold.

  • "precision": The precision of the verifier at this threshold.
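The metrics above can be sketched as follows. This is a minimal, hypothetical illustration of thresholding scaled decisions, not the library's implementation: the stand-in arrays `scores` (decisions on a [-1, 1] scale) and `labels` (ground-truth binary labels) are assumptions, and the "threshold_text" column, which the real function derives from the experiment's decision scale, is omitted here.

```python
import numpy as np
import pandas as pd


def thresholded_performance(
    scores: np.ndarray, labels: np.ndarray, thresholds: np.ndarray
) -> pd.DataFrame:
    """Compute binary classification metrics at each threshold.

    A score at or above the threshold counts as a positive decision.
    """
    rows = []
    for threshold in thresholds:
        predictions = (scores >= threshold).astype(int)
        tp = int(np.sum((predictions == 1) & (labels == 1)))
        fp = int(np.sum((predictions == 1) & (labels == 0)))
        tn = int(np.sum((predictions == 0) & (labels == 0)))
        fn = int(np.sum((predictions == 0) & (labels == 1)))
        rows.append(
            {
                "threshold_float": float(threshold),
                "accuracy": (tp + tn) / len(labels),
                "true_positive_rate": tp / (tp + fn) if tp + fn else 0.0,
                "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
                "true_negative_rate": tn / (tn + fp) if tn + fp else 0.0,
                "false_negative_rate": fn / (fn + tp) if fn + tp else 0.0,
                "precision": tp / (tp + fp) if tp + fp else 0.0,
            }
        )
    return pd.DataFrame(rows)


# Illustrative data: five verifier decisions and their true labels.
scores = np.array([-0.8, -0.2, 0.1, 0.6, 0.9])
labels = np.array([0, 0, 1, 1, 1])
performance = thresholded_performance(scores, labels, np.array([-0.5, 0.0, 0.5]))
```

Sweeping the threshold like this traces out the verifier's ROC curve: each row of the returned DataFrame is one (false positive rate, true positive rate) point.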