evaluate.py

1 F821: undefined name 'get_metrics'
- 103 metrics: List[str] = get_metrics(dataset_name)
22 E501: line too long (96 > 79 characters) (and 13 similar)
- Line too long (96 > 79 characters):
  20 python examples/evaluate.py <Path to directory with predictions> <Path to output directory>
- Line too long (124 > 79 characters):
  21 python examples/evaluate.py <Path to directory with predictions> <Path to output directory> --dataset <A WILDS dataset>
- Line too long (82 > 79 characters):
  26 def evaluate_all_benchmarks(predictions_dir: str, output_dir: str, root_dir: str):
- Line too long (83 > 79 characters):
  31 predictions_dir (str): Path to the directory with predictions. Can be a URL
- Line too long (85 > 79 characters):
  39 dataset, os.path.join(predictions_dir, dataset), output_dir, root_dir
- Line too long (99 > 79 characters):
  57 dataset_name (str): Name of the dataset. See datasets.py for the complete list of datasets.
- Line too long (84 > 79 characters):
  58 predictions_dir (str): Path to the directory with predictions. Can be a URL.
- Line too long (88 > 79 characters):
  63 Metrics as a dictionary with metrics as the keys and metric values as the values
- Line too long (83 > 79 characters):
  88 f"Could not find CSV or pth prediction file that starts with {run_id}."
- Line too long (106 > 79 characters):
  116 f"Processing split={split}, replicate={replicate}, predictions_file={predictions_file}..."
- Line too long (102 > 79 characters):
  120 # GlobalWheat's predictions are a list of dictionaries, so it has to be handled separately
- Line too long (86 > 79 characters):
  122 metric_results: Dict[str, float] = evaluate_replicate_for_globalwheat(
- Line too long (80 > 79 characters):
  131 replicates_results[split][metric].append(metric_results[metric])
- Line too long (85 > 79 characters):
  139 replicates_metric_values: List[float] = replicates_results[split][metric]
- Line too long (81 > 79 characters):
  143 aggregated_results[split][metric] = np.mean(replicates_metric_values)
- Line too long (82 > 79 characters):
  147 with open(os.path.join(output_dir, f"{dataset_name}_results.json"), "w") as f:
- Line too long (88 > 79 characters):
  165 Metrics as a dictionary with metrics as the keys and metric values as the values
- Line too long (81 > 79 characters):
  191 path (str): Path to the file that has the predicted labels. Can be a URL.
- Line too long (86 > 79 characters):
  203 predicted_labels = [literal_eval(line.rstrip()) for line in data if line.rstrip()]
- Line too long (82 > 79 characters):
  224 print("A dataset was not specified. Evaluating for all WILDS datasets...")
- Line too long (85 > 79 characters):
  225 evaluate_all_benchmarks(args.predictions_dir, args.output_dir, args.root_dir)
- Line too long (113 > 79 characters):
  253 help="The directory where the datasets can be found (or should be downloaded to, if they do not exist).",
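
Note on E501: several of the flagged lines (88, 116, 253) are long string literals. A minimal sketch of one way to wrap them, using line 116 as the model. The three values assigned at the top are hypothetical placeholders; the report does not show how the real variables are set.

```python
# Hypothetical values standing in for the variables used on line 116.
split = "test"
replicate = "seed:0"
predictions_file = "preds.csv"

# Adjacent string literals inside parentheses are joined into one string,
# so the message is unchanged while each source line stays short.
message = (
    f"Processing split={split}, replicate={replicate}, "
    f"predictions_file={predictions_file}..."
)
print(message)
```

The same parenthesized form works for long argument lists and signatures such as line 26, with one parameter per line.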
1 E303: too many blank lines (3)
- 93 # Dataset will only be downloaded if it does not exist
1 E722: do not use bare 'except'
- 214 except:
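
Note on E722: the report does not show what the try block ending at line 214 guards, so the following is only a sketch of the general fix, which is to name the exceptions that are actually expected. It assumes the guarded call is `literal_eval`, as on line 203; `parse_labels` and its fallback behavior are hypothetical and are not taken from evaluate.py.

```python
from ast import literal_eval
from typing import Any, List


def parse_labels(lines: List[str]) -> List[Any]:
    """Hypothetical helper: parse one Python literal per non-empty line."""
    labels = []
    for line in lines:
        text = line.rstrip()
        if not text:
            continue
        try:
            labels.append(literal_eval(text))
        except (ValueError, SyntaxError):
            # Malformed literal: keep the raw text. Naming the exceptions
            # avoids a bare `except:`, which would also swallow
            # KeyboardInterrupt and SystemExit.
            labels.append(text)
    return labels


print(parse_labels(["1\n", "\n", "[1, 2]\n", "abc\n"]))
```

If the fallback is not wanted, re-raising with a message that names the offending file is an equally valid choice; the point is only that the handler lists specific exception types.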