mirror of
https://github.com/QwenLM/Qwen.git
synced 2026-05-21 00:45:48 +08:00
284 lines
8.8 KiB
Plaintext
{
"cells": [
{
"cell_type": "markdown",
"id": "6e6981ab-2d9a-4280-923f-235a166855ba",
"metadata": {},
"source": [
"# QLoRA Fine-Tuning Qwen-Chat Large Language Model (Single GPU)\n",
"\n",
"Tongyi Qianwen is a large language model developed by Alibaba Cloud. It is based on the Transformer architecture and pre-trained on a large, diverse corpus that includes internet text, specialized books, and code. Qwen-Chat is an AI assistant built on the pre-trained model using alignment techniques.\n",
"\n",
"This notebook uses Qwen-1.8B-Chat as an example to show how to fine-tune the Qwen model with QLoRA using DeepSpeed.\n",
"\n",
"## Environment Requirements\n",
"\n",
"Please refer to **requirements.txt** to install the required dependencies.\n",
"\n",
"## Preparation\n",
"\n",
"### Download Qwen-1.8B-Chat\n",
"\n",
"First, download the model files. You can download them directly from ModelScope.\n",
"\n",
"Note that QLoRA training uses the Int4 version of the model."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "248488f9-4a86-4f35-9d56-50f8e91a8f11",
"metadata": {
"ExecutionIndicator": {
"show": true
},
"tags": []
},
"outputs": [],
"source": [
"from modelscope.hub.snapshot_download import snapshot_download\n",
"model_dir = snapshot_download('Qwen/Qwen-1_8B-Chat-Int4', cache_dir='.', revision='master')"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "7b2a92b1-f08e-4413-9f92-8f23761e6e1f",
"metadata": {},
"source": [
"### Download Example Training Data\n",
"\n",
"Download the data required for training. Here, we provide a tiny example dataset sampled from [Belle](https://github.com/LianjiaTech/BELLE).\n",
"\n",
"Disclaimer: the dataset may be used for research purposes only."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ce195f08-fbb2-470e-b6c0-9a03457458c7",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!wget https://atp-modelzoo-sh.oss-cn-shanghai.aliyuncs.com/release/tutorials/qwen_recipes/Belle_sampled_qwen.json"
]
},
{
"cell_type": "markdown",
"id": "7226bed0-171b-4d45-a3f9-b3d81ec2bb9f",
"metadata": {},
"source": [
"You can prepare your own dataset in the same format. Below is a simple example list containing one sample:\n",
"\n",
"```json\n",
"[\n",
"  {\n",
"    \"id\": \"identity_0\",\n",
"    \"conversations\": [\n",
"      {\n",
"        \"from\": \"user\",\n",
"        \"value\": \"你好\"\n",
"      },\n",
"      {\n",
"        \"from\": \"assistant\",\n",
"        \"value\": \"我是一个语言模型,我叫通义千问。\"\n",
"      }\n",
"    ]\n",
"  }\n",
"]\n",
"```\n",
"\n",
"You can also use multi-turn conversations as training data. Here is a simple example:\n",
"\n",
"```json\n",
"[\n",
"  {\n",
"    \"id\": \"identity_0\",\n",
"    \"conversations\": [\n",
"      {\n",
"        \"from\": \"user\",\n",
"        \"value\": \"你好,能告诉我遛狗的最佳时间吗?\"\n",
"      },\n",
"      {\n",
"        \"from\": \"assistant\",\n",
"        \"value\": \"当地最佳遛狗时间因地域差异而异,请问您所在的城市是哪里?\"\n",
"      },\n",
"      {\n",
"        \"from\": \"user\",\n",
"        \"value\": \"我在纽约市。\"\n",
"      },\n",
"      {\n",
"        \"from\": \"assistant\",\n",
"        \"value\": \"纽约市的遛狗最佳时间通常在早晨6点至8点和晚上8点至10点之间,因为这些时间段气温较低,遛狗更加舒适。但具体时间还需根据气候、气温和季节变化而定。\"\n",
"      }\n",
"    ]\n",
"  }\n",
"]\n",
"```\n",
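"\n",
"Before training on your own data, it can help to check that the file follows the format shown above. The snippet below is a minimal sketch and not part of the Qwen repository: the helper name `validate_dataset` is hypothetical, and the rule that turns alternate between `user` and `assistant`, starting with `user`, is an assumption inferred from the two examples.\n",
"\n",
"```python\n",
"import json\n",
"\n",
"# Assumed turn order, inferred from the examples above.\n",
"EXPECTED_ROLES = ('user', 'assistant')\n",
"\n",
"def validate_dataset(samples):\n",
"    '''Return a list of problems found in a list of conversation samples.'''\n",
"    problems = []\n",
"    for i, sample in enumerate(samples):\n",
"        turns = sample.get('conversations') if isinstance(sample, dict) else None\n",
"        if not isinstance(turns, list) or not turns:\n",
"            problems.append(f'sample {i}: missing or empty conversations')\n",
"            continue\n",
"        for j, turn in enumerate(turns):\n",
"            if not isinstance(turn, dict):\n",
"                problems.append(f'sample {i}, turn {j}: turn must be an object')\n",
"                continue\n",
"            if turn.get('from') != EXPECTED_ROLES[j % 2]:\n",
"                problems.append(f'sample {i}, turn {j}: expected role {EXPECTED_ROLES[j % 2]}')\n",
"            if not isinstance(turn.get('value'), str):\n",
"                problems.append(f'sample {i}, turn {j}: value must be a string')\n",
"    return problems\n",
"\n",
"# Check the example file downloaded earlier in this notebook.\n",
"with open('Belle_sampled_qwen.json', encoding='utf-8') as f:\n",
"    print(validate_dataset(json.load(f)))\n",
"```\n",
"\n",
"An empty list means no problems were found under these assumed rules.\n",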
"\n",
"## Fine-Tune the Model\n",
"\n",
"You can directly run the prepared training script to fine-tune the model."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7ab0581e-be85-45e6-a5b7-af9c42ea697b",
"metadata": {
"ExecutionIndicator": {
"show": true
},
"tags": []
},
"outputs": [],
"source": [
"!python ../../finetune.py \\\n",
"    --model_name_or_path \"Qwen/Qwen-1_8B-Chat-Int4/\" \\\n",
"    --data_path \"Belle_sampled_qwen.json\" \\\n",
"    --bf16 \\\n",
"    --output_dir \"output_qwen\" \\\n",
"    --num_train_epochs 5 \\\n",
"    --per_device_train_batch_size 1 \\\n",
"    --per_device_eval_batch_size 1 \\\n",
"    --gradient_accumulation_steps 16 \\\n",
"    --evaluation_strategy \"no\" \\\n",
"    --save_strategy \"steps\" \\\n",
"    --save_steps 1000 \\\n",
"    --save_total_limit 10 \\\n",
"    --learning_rate 1e-5 \\\n",
"    --weight_decay 0.1 \\\n",
"    --adam_beta2 0.95 \\\n",
"    --warmup_ratio 0.01 \\\n",
"    --lr_scheduler_type \"cosine\" \\\n",
"    --logging_steps 1 \\\n",
"    --report_to \"none\" \\\n",
"    --model_max_length 512 \\\n",
"    --gradient_checkpointing \\\n",
"    --lazy_preprocess \\\n",
"    --use_lora \\\n",
"    --q_lora \\\n",
"    --deepspeed \"../../finetune/ds_config_zero2.json\""
]
},
{
"cell_type": "markdown",
"id": "0a50941d-3c3c-4ed2-9185-d4fe6172da2f",
"metadata": {},
"source": [
"## Merge Weights\n",
"\n",
"Both LoRA and Q-LoRA training save only the adapter parameters. Note that you cannot merge the adapter weights into a quantized model. Instead, merge them into the original chat model.\n",
"\n",
"You can load the fine-tuned model and merge the weights as shown below:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "909ff537-f851-488e-b1e8-1046f6852202",
"metadata": {
"ExecutionIndicator": {
"show": true
},
"tags": []
},
"outputs": [],
"source": [
"from modelscope.hub.snapshot_download import snapshot_download\n",
"from transformers import AutoModelForCausalLM\n",
"from peft import PeftModel\n",
"import torch\n",
"\n",
"# Download the original (non-quantized) chat model to merge the adapter into.\n",
"snapshot_download('Qwen/Qwen-1_8B-Chat', cache_dir='.', revision='master')\n",
"\n",
"# Load the base model, then attach the adapter saved in output_qwen/.\n",
"model = AutoModelForCausalLM.from_pretrained(\"Qwen/Qwen-1_8B-Chat/\", torch_dtype=torch.float16, device_map=\"auto\", trust_remote_code=True)\n",
"model = PeftModel.from_pretrained(model, \"output_qwen/\")\n",
"\n",
"# Fold the adapter weights into the base weights and save the merged model.\n",
"merged_model = model.merge_and_unload()\n",
"merged_model.save_pretrained(\"output_qwen_merged\", max_shard_size=\"2048MB\", safe_serialization=True)"
]
},
{
"cell_type": "markdown",
"id": "7969df6e-ba8a-45f5-8b44-e1cbe74a8ef6",
"metadata": {},
"source": [
"The tokenizer files are not saved in the new directory in this step. You can copy the tokenizer files or use the following code:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c01b6a3f-036f-4b7c-b5a6-76a7b6894d4e",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from transformers import AutoTokenizer\n",
"\n",
"tokenizer = AutoTokenizer.from_pretrained(\n",
"    \"Qwen/Qwen-1_8B-Chat-Int4/\",\n",
"    trust_remote_code=True\n",
")\n",
"\n",
"tokenizer.save_pretrained(\"output_qwen_merged\")"
]
},
{
"cell_type": "markdown",
"id": "c2944b9b-89c7-4fb5-bd08-941d4706e943",
"metadata": {},
"source": [
"## Test the Model\n",
"\n",
"After merging the weights, we can test the model as follows:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b77abbb1-5b29-4eb1-8a6c-e2e146b8d33d",
"metadata": {},
"outputs": [],
"source": [
"from transformers import AutoModelForCausalLM, AutoTokenizer\n",
"from transformers.generation import GenerationConfig\n",
"\n",
"tokenizer = AutoTokenizer.from_pretrained(\"output_qwen_merged\", trust_remote_code=True)\n",
"model = AutoModelForCausalLM.from_pretrained(\n",
"    \"output_qwen_merged\",\n",
"    device_map=\"auto\",\n",
"    trust_remote_code=True\n",
").eval()\n",
"\n",
"response, history = model.chat(tokenizer, \"你好\", history=None)\n",
"print(response)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}