fix format problems in evaluation code; update ceval extraction rules

2026-05-21 00:45:48 +08:00 · 2023-08-25 22:44:07 +08:00
parent 1a9a04a91e
commit 4864f7b278
11 changed files with 1507 additions and 808 deletions
--- a/eval/EVALUATION.md
+++ b/eval/EVALUATION.md
@@ -34,6 +34,19 @@ pip install thefuzz
 python evaluate_chat_mmlu.py -d data/mmlu/data/
 ```

+- CMMLU
+
+```Shell
+wget https://huggingface.co/datasets/haonan-li/cmmlu/resolve/main/cmmlu_v1_0_1.zip
+mkdir data/cmmlu
+mv cmmlu_v1_0_1.zip data/cmmlu
+cd data/cmmlu; unzip cmmlu_v1_0_1.zip
+cd ../../
+
+# Qwen-7B
+python evaluate_cmmlu.py -d data/cmmlu/
+```
+
 - HumanEval

 Get the HumanEval.jsonl file from [here](https://github.com/openai/human-eval/tree/master/data)