Class: Google::Apis::AiplatformV1beta1::GoogleCloudAiplatformV1beta1SchemaModelevaluationMetricsPairwiseTextGenerationEvaluationMetrics
- Inherits:
- Object
- Object
- Google::Apis::AiplatformV1beta1::GoogleCloudAiplatformV1beta1SchemaModelevaluationMetricsPairwiseTextGenerationEvaluationMetrics
- Includes:
- Core::Hashable, Core::JsonObjectSupport
- Defined in:
- lib/google/apis/aiplatform_v1beta1/classes.rb,
lib/google/apis/aiplatform_v1beta1/representations.rb
Overview
Metrics for general pairwise text generation evaluation results.
Instance Attribute Summary
- #accuracy ⇒ Float
  Fraction of cases where the autorater agreed with the human raters.
- #baseline_model_win_rate ⇒ Float
  Percentage of time the autorater decided the baseline model had the better response.
- #cohens_kappa ⇒ Float
  A measurement of agreement between the autorater and human raters that takes the likelihood of random agreement into account.
- #f1_score ⇒ Float
  Harmonic mean of precision and recall.
- #false_negative_count ⇒ Fixnum
  Number of examples where the autorater chose the baseline model, but humans preferred the model.
- #false_positive_count ⇒ Fixnum
  Number of examples where the autorater chose the model, but humans preferred the baseline model.
- #human_preference_baseline_model_win_rate ⇒ Float
  Percentage of time humans decided the baseline model had the better response.
- #human_preference_model_win_rate ⇒ Float
  Percentage of time humans decided the model had the better response.
- #model_win_rate ⇒ Float
  Percentage of time the autorater decided the model had the better response.
- #precision ⇒ Float
  Fraction of cases where the autorater and humans thought the model had a better response out of all cases where the autorater thought the model had a better response.
- #recall ⇒ Float
  Fraction of cases where the autorater and humans thought the model had a better response out of all cases where the humans thought the model had a better response.
- #true_negative_count ⇒ Fixnum
  Number of examples where both the autorater and humans decided that the model had the worse response.
- #true_positive_count ⇒ Fixnum
  Number of examples where both the autorater and humans decided that the model had the better response.
Instance Method Summary
- #initialize(**args) ⇒ GoogleCloudAiplatformV1beta1SchemaModelevaluationMetricsPairwiseTextGenerationEvaluationMetrics
  constructor
  A new instance of GoogleCloudAiplatformV1beta1SchemaModelevaluationMetricsPairwiseTextGenerationEvaluationMetrics.
- #update!(**args) ⇒ Object
  Update properties of this object.
Constructor Details
#initialize(**args) ⇒ GoogleCloudAiplatformV1beta1SchemaModelevaluationMetricsPairwiseTextGenerationEvaluationMetrics
Returns a new instance of GoogleCloudAiplatformV1beta1SchemaModelevaluationMetricsPairwiseTextGenerationEvaluationMetrics.
# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 25860
def initialize(**args)
  update!(**args)
end
Instance Attribute Details
#accuracy ⇒ Float
Fraction of cases where the autorater agreed with the human raters.
Corresponds to the JSON property accuracy
# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 25788
def accuracy
  @accuracy
end
#baseline_model_win_rate ⇒ Float
Percentage of time the autorater decided the baseline model had the better
response.
Corresponds to the JSON property baselineModelWinRate
# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 25794
def baseline_model_win_rate
  @baseline_model_win_rate
end
#cohens_kappa ⇒ Float
A measurement of agreement between the autorater and human raters that takes
the likelihood of random agreement into account.
Corresponds to the JSON property cohensKappa
# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 25800
def cohens_kappa
  @cohens_kappa
end
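The description of cohens_kappa does not spell out a formula. As a rough illustration only, the sketch below applies the standard textbook definition of Cohen's kappa to the four confusion counts exposed by this class, treating "the model had the better response" as the positive label. The helper name `cohens_kappa` and the sample counts are hypothetical, and the service may compute its value differently.

```ruby
# Cohen's kappa for a 2x2 agreement table (standard textbook formula).
# Assumption: positive = "the model had the better response".
def cohens_kappa(tp:, fp:, fn:, tn:)
  total = (tp + fp + fn + tn).to_f
  return nil if total.zero?

  # Observed agreement: autorater and humans picked the same side.
  observed = (tp + tn) / total
  # Chance agreement: both pick "model" by chance, plus both pick
  # "baseline" by chance, based on each rater's marginal totals.
  expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (total * total)
  return nil if expected == 1.0 # kappa is undefined in this case

  (observed - expected) / (1.0 - expected)
end

# Made-up counts, for illustration only.
puts cohens_kappa(tp: 40, fp: 10, fn: 5, tn: 45).round(3) # prints 0.7
```

A value of 1.0 means perfect agreement, while 0.0 means agreement no better than chance.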
#f1_score ⇒ Float
Harmonic mean of precision and recall.
Corresponds to the JSON property f1Score
# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 25805
def f1_score
  @f1_score
end
#false_negative_count ⇒ Fixnum
Number of examples where the autorater chose the baseline model, but humans
preferred the model.
Corresponds to the JSON property falseNegativeCount
# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 25811
def false_negative_count
  @false_negative_count
end
#false_positive_count ⇒ Fixnum
Number of examples where the autorater chose the model, but humans preferred
the baseline model.
Corresponds to the JSON property falsePositiveCount
# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 25817
def false_positive_count
  @false_positive_count
end
#human_preference_baseline_model_win_rate ⇒ Float
Percentage of time humans decided the baseline model had the better response.
Corresponds to the JSON property humanPreferenceBaselineModelWinRate
# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 25822
def human_preference_baseline_model_win_rate
  @human_preference_baseline_model_win_rate
end
#human_preference_model_win_rate ⇒ Float
Percentage of time humans decided the model had the better response.
Corresponds to the JSON property humanPreferenceModelWinRate
# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 25827
def human_preference_model_win_rate
  @human_preference_model_win_rate
end
#model_win_rate ⇒ Float
Percentage of time the autorater decided the model had the better response.
Corresponds to the JSON property modelWinRate
# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 25832
def model_win_rate
  @model_win_rate
end
#precision ⇒ Float
Fraction of cases where the autorater and humans thought the model had a
better response out of all cases where the autorater thought the model had a
better response. Equivalently, true positives divided by all cases the
autorater marked positive (true positives plus false positives).
Corresponds to the JSON property precision
# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 25839
def precision
  @precision
end
#recall ⇒ Float
Fraction of cases where the autorater and humans thought the model had a
better response out of all cases where the humans thought the model had a
better response.
Corresponds to the JSON property recall
# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 25846
def recall
  @recall
end
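The descriptions of accuracy, precision, recall, and f1_score can be restated in terms of the four confusion counts. The following is a minimal sketch using the standard definitions. The helper name `agreement_metrics` and the sample counts are hypothetical, and nothing here is taken from the service's own implementation.

```ruby
# Standard confusion-matrix ratios, with "model had the better response"
# as the positive label. Returns nil for a ratio whose denominator is zero.
def agreement_metrics(tp:, fp:, fn:, tn:)
  ratio = ->(num, den) { den.zero? ? nil : num.to_f / den }

  {
    # Agreements out of all examples.
    accuracy: ratio.call(tp + tn, tp + fp + fn + tn),
    # Out of the cases where the autorater chose the model.
    precision: ratio.call(tp, tp + fp),
    # Out of the cases where humans chose the model.
    recall: ratio.call(tp, tp + fn),
    # Algebraically equal to the harmonic mean of precision and recall.
    f1_score: ratio.call(2 * tp, 2 * tp + fp + fn)
  }
end

# Made-up counts, for illustration only.
metrics = agreement_metrics(tp: 40, fp: 10, fn: 5, tn: 45)
metrics.each { |name, value| puts "#{name}: #{value.round(4)}" }
```

Returning nil for an empty denominator is a design choice of this sketch; the source does not say how the service reports such cases.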
#true_negative_count ⇒ Fixnum
Number of examples where both the autorater and humans decided that the model
had the worse response.
Corresponds to the JSON property trueNegativeCount
# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 25852
def true_negative_count
  @true_negative_count
end
#true_positive_count ⇒ Fixnum
Number of examples where both the autorater and humans decided that the model
had the better response.
Corresponds to the JSON property truePositiveCount
# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 25858
def true_positive_count
  @true_positive_count
end
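To make the four count attributes concrete, the sketch below tallies them from a list of per-example choices, following the attribute descriptions above. The input shape (one `[autorater_choice, human_choice]` pair per example, each `:model` or `:baseline`) and the helper name `tally_pairwise` are assumptions made for illustration. The source does not describe how ties or missing human labels are handled, so they are not modeled.

```ruby
# Hypothetical input: one [autorater_choice, human_choice] pair per example,
# each choice being :model or :baseline.
def tally_pairwise(pairs)
  counts = { true_positive: 0, false_positive: 0, false_negative: 0, true_negative: 0 }

  pairs.each do |autorater, human|
    key =
      case [autorater, human]
      when [:model, :model] then :true_positive       # both preferred the model
      when [:model, :baseline] then :false_positive   # autorater chose model, humans chose baseline
      when [:baseline, :model] then :false_negative   # autorater chose baseline, humans chose model
      when [:baseline, :baseline] then :true_negative # both preferred the baseline
      else raise ArgumentError, "unexpected choices: #{autorater.inspect}, #{human.inspect}"
      end
    counts[key] += 1
  end

  counts
end

# Made-up data, for illustration only.
pairs = [
  [:model, :model],
  [:model, :baseline],
  [:baseline, :model],
  [:baseline, :baseline],
  [:model, :model]
]
p tally_pairwise(pairs)
```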
Instance Method Details
#update!(**args) ⇒ Object
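Update properties of this object. Only attributes whose keys are present in args are assigned; all other attributes keep their current values.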
# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 25865
def update!(**args)
  @accuracy = args[:accuracy] if args.key?(:accuracy)
  @baseline_model_win_rate = args[:baseline_model_win_rate] if args.key?(:baseline_model_win_rate)
  @cohens_kappa = args[:cohens_kappa] if args.key?(:cohens_kappa)
  @f1_score = args[:f1_score] if args.key?(:f1_score)
  @false_negative_count = args[:false_negative_count] if args.key?(:false_negative_count)
  @false_positive_count = args[:false_positive_count] if args.key?(:false_positive_count)
  @human_preference_baseline_model_win_rate = args[:human_preference_baseline_model_win_rate] if args.key?(:human_preference_baseline_model_win_rate)
  @human_preference_model_win_rate = args[:human_preference_model_win_rate] if args.key?(:human_preference_model_win_rate)
  @model_win_rate = args[:model_win_rate] if args.key?(:model_win_rate)
  @precision = args[:precision] if args.key?(:precision)
  @recall = args[:recall] if args.key?(:recall)
  @true_negative_count = args[:true_negative_count] if args.key?(:true_negative_count)
  @true_positive_count = args[:true_positive_count] if args.key?(:true_positive_count)
end
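The `args.key?` guard in update! has a few practical consequences that are easy to miss. The sketch below reproduces the same pattern in a small stand-in class so it runs without the gem. `MetricsSketch` is hypothetical and carries only two of the attributes, but its `initialize` and `update!` bodies follow the listings above.

```ruby
# Hypothetical stand-in that copies the initialize/update! pattern
# for two attributes only; it is not the generated class itself.
class MetricsSketch
  attr_reader :accuracy, :precision

  def initialize(**args)
    update!(**args)
  end

  def update!(**args)
    @accuracy = args[:accuracy] if args.key?(:accuracy)
    @precision = args[:precision] if args.key?(:precision)
  end
end

metrics = MetricsSketch.new(accuracy: 0.85)
p metrics.precision            # nil: never supplied

metrics.update!(precision: 0.8)
p metrics.accuracy             # 0.85: omitted keys are left untouched

metrics.update!(accuracy: nil)
p metrics.accuracy             # nil: an explicit nil overwrites the old value

metrics.update!(unknown_key: 1) # keys with no matching attribute are ignored
```

Because the guard checks for the key rather than for a non-nil value, passing `nil` explicitly clears an attribute, whereas leaving the key out preserves it.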