init commit of recipes (#1027)

Add recipes
This commit is contained in:
chywang
2024-01-30 01:57:09 -06:00
committed by GitHub
parent d275e5b91a
commit ee01f36ed9
30 changed files with 5146 additions and 0 deletions

View File

@@ -0,0 +1,198 @@
## Introduction
[SWIFT](https://github.com/modelscope/swift) (Scalable lightWeight Infrastructure for Fine-Tuning) is an extensible framwork designed to faciliate lightweight model fine-tuning and inference. It integrates implementations for various efficient fine-tuning methods, by embracing approaches that is parameter-efficient, memory-efficient, and time-efficient. SWIFT integrates seamlessly into ModelScope ecosystem and offers the capabilities to finetune various models, with a primary emphasis on LLMs and vision models. Additionally, SWIFT is fully compatible with PEFT, enabling users to leverage the familiar Peft interface to finetune ModelScope models.
## Installation
```shell
# Set the global pip mirror
pip config set global.index-url https://mirrors.aliyun.com/pypi/simple/
# Install ms-swift
git clone https://github.com/modelscope/swift.git
cd swift
pip install -e .[llm]
# If you want to use deepspeed
pip install deepspeed -U
# If you want to use qlora training based on auto_gptq (recommended, performs better than bnb)
# Models supporting auto_gptq: `https://github.com/modelscope/swift/blob/main/docs/source/LLM/支持的模型和数据集.md#模型`
# There's a version correspondence between auto_gptq and cuda; refer to `https://github.com/PanQiWei/AutoGPTQ#quick-installation` for selecting the appropriate version
pip install auto_gptq -U
# If you want to use qlora training based on bnb
pip install bitsandbytes -U
# Environment alignment (run the following commands if you encounter errors; the repository is tested with the latest environment)
pip install -r requirements/framework.txt -U
pip install -r requirements/llm.txt -U
```
## WebUI Usage
Run the following command to start the webui and conduct model training and inference through the graphical interface:
```shell
swift web-ui
```
A screenshot example can be found at:
![image](https://modelscope.oss-cn-beijing.aliyuncs.com/resource/swift_webui.jpg)
## Fine-tuning
```python
# Experimental environment: A10, 3090, V100, ...
# GPU memory requirement: 20GB
CUDA_VISIBLE_DEVICES=0 \
swift sft \
--model_id_or_path qwen/Qwen-7B-Chat \
--dataset blossom-math-zh \
--output_dir output \
# Use your own dataset
CUDA_VISIBLE_DEVICES=0 \
swift sft \
--model_id_or_path qwen/Qwen-7B-Chat \
--custom_train_dataset_path chatml.jsonl \
--output_dir output \
# Using DDP (Distributed Data Parallel)
# Experimental environment: 2 * 3090
# GPU memory requirement: 2 * 23GB
CUDA_VISIBLE_DEVICES=0,1 \
NPROC_PER_NODE=2 \
swift sft \
--model_id_or_path qwen/Qwen-7B-Chat \
--dataset blossom-math-zh \
--output_dir output \
# Multi-machine multi-GPU setup
# node0
CUDA_VISIBLE_DEVICES=0,1,2,3 \
NNODES=2 \
NODE_RANK=0 \
MASTER_ADDR=127.0.0.1 \
NPROC_PER_NODE=4 \
swift sft \
--model_id_or_path qwen/Qwen-7B-Chat \
--dataset blossom-math-zh \
--output_dir output \
# node1
CUDA_VISIBLE_DEVICES=0,1,2,3 \
NNODES=2 \
NODE_RANK=1 \
MASTER_ADDR=xxx.xxx.xxx.xxx \
NPROC_PER_NODE=4 \
swift sft \
--model_id_or_path qwen/Qwen-7B-Chat \
--dataset blossom-math-zh \
--output_dir output \
```
For more fine-tuning methods, please refer to [here](https://github.com/modelscope/swift/blob/main/docs/source/LLM/LLM%E5%BE%AE%E8%B0%83%E6%96%87%E6%A1%A3.md#%E5%BE%AE%E8%B0%83).
Examples
| 模型名称 | 训练方法 |
|:-------------------|:---------------------------------------------------------------------------------------------------------------------------|
| qwen_14b | [lora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_14b/lora_ddp_ds) |
| qwen_14b | [qlora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_14b/qlora) |
| qwen_14b | [qlora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_14b/qlora_ddp_ds) |
| qwen_14b_chat | [lora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_14b_chat/lora_ddp_ds) |
| qwen_14b_chat | [qlora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_14b_chat/qlora) |
| qwen_14b_chat | [qlora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_14b_chat/qlora_ddp_ds) |
| qwen_14b_chat_int4 | [qlora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_14b_chat_int4/qlora) |
| qwen_14b_chat_int4 | [qlora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_14b_chat_int4/qlora_ddp_ds) |
| qwen_14b_chat_int8 | [qlora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_14b_chat_int8/qlora) |
| qwen_14b_chat_int8 | [qlora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_14b_chat_int8/qlora_ddp_ds) |
| qwen_1_8b_chat | [full](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_1_8b_chat/full) |
| qwen_1_8b_chat | [full_ddp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_1_8b_chat/full_ddp) |
| qwen_72b_chat | [lora_mp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_72b_chat/lora_mp) |
| qwen_72b_chat | [lora_mp_ddp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_72b_chat/lora_mp_ddp) |
| qwen_72b_chat | [qlora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_72b_chat/qlora) |
| qwen_72b_chat_int4 | [qlora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_72b_chat_int4/qlora_ddp_ds) |
| qwen_72b_chat_int8 | [qlora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_72b_chat_int8/qlora_ddp_ds) |
| qwen_7b | [lora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b/lora_ddp_ds) |
| qwen_7b | [qlora_ddp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b/qlora_ddp) |
| qwen_7b_chat | [full](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat/full) |
| qwen_7b_chat | [full_freeze_ddp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat/full_freeze_ddp) |
| qwen_7b_chat | [full_mp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat/full_mp) |
| qwen_7b_chat | [full_mp_ddp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat/full_mp_ddp) |
| qwen_7b_chat | [lora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat/lora) |
| qwen_7b_chat | [lora_ddp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat/lora_ddp) |
| qwen_7b_chat | [lora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat/lora_ddp_ds) |
| qwen_7b_chat | [lora_mp_ddp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat/lora_mp_ddp) |
| qwen_7b_chat | [qlora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat/qlora) |
| qwen_7b_chat | [qlora_ddp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat/qlora_ddp) |
| qwen_7b_chat | [qlora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat/qlora_ddp_ds) |
| qwen_7b_chat_int4 | [qalora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat_int4/qalora) |
| qwen_7b_chat_int4 | [qlora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat_int4/qlora) |
| qwen_7b_chat_int4 | [qlora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat_int4/qlora_ddp_ds) |
| qwen_7b_chat_int8 | [qlora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat_int8/qlora) |
| qwen_7b_chat_int8 | [qlora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat_int8/qlora_ddp_ds) |
| qwen_audio_chat | [full_mp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_audio_chat/full_mp) |
| qwen_audio_chat | [full_mp_ddp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_audio_chat/full_mp_ddp) |
| qwen_audio_chat | [lora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_audio_chat/lora) |
| qwen_audio_chat | [lora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_audio_chat/lora_ddp_ds) |
| qwen_vl | [lora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_vl/lora_ddp_ds) |
| qwen_vl_chat | [full_mp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_vl_chat/full_mp) |
| qwen_vl_chat | [full_mp_ddp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_vl_chat/full_mp_ddp) |
| qwen_vl_chat | [lora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_vl_chat/lora) |
| qwen_vl_chat | [lora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_vl_chat/lora_ddp_ds) |
| qwen_vl_chat | [qlora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_vl_chat/qlora) |
| qwen_vl_chat_int4 | [qlora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_vl_chat_int4/qlora) |
| qwen_vl_chat_int4 | [qlora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_vl_chat_int4/qlora_ddp_ds) |
## Inference
```python
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
from swift.llm import (
get_model_tokenizer, get_template, inference, ModelType, get_default_template_type,
)
from swift.utils import seed_everything
model_type = ModelType.qwen_7b_chat
template_type = get_default_template_type(model_type)
print(f'template_type: {template_type}') # template_type: qwen
kwargs = {}
# kwargs['use_flash_attn'] = True # Use flash_attn if desired
model, tokenizer = get_model_tokenizer(model_type, model_kwargs={'device_map': 'auto'}, **kwargs)
# Modify max_new_tokens
model.generation_config.max_new_tokens = 128
template = get_template(template_type, tokenizer)
seed_everything(42)
query = 'What is the provincial capital of Zhejiang?'
response, history = inference(model, template, query)
print(f'query: {query}')
print(f'response: {response}')
query = 'What delicious food can be found here?'
response, history = inference(model, template, query, history)
print(f'query: {query}')
print(f'response: {response}')
print(f'history: {history}')
"""Output[0]:
query: What is the provincial capital of Zhejiang?
response: The provincial capital of Zhejiang is Hangzhou.
query: What delicious food can be found here?
response: Hangzhou has many famous delicacies, such as West Lake Vinegar Fish, Longjing Shrimp, Sweet and Sour Spare Ribs, and Maodu. Additionally, there are unique Hangzhou-style pastries like Osmanthus Cake, Lotus Paste Pastry, and Aiwo Steamed Rice Cakes.
history: [('What is the provincial capital of Zhejiang?', 'The provincial capital of Zhejiang is Hangzhou.'), ('What delicious food can be found here?', 'Hangzhou has many famous delicacies, such as West Lake Vinegar Fish, Longjing Shrimp, Sweet and Sour Spare Ribs, and Maodu. Additionally, there are unique Hangzhou-style pastries like Osmanthus Cake, Lotus Paste Pastry, and Aiwo Steamed Rice Cakes.')]
"""
# Streaming dialogue output with verbose mode
inference(model, template, 'What was the first question?', history, verbose=True, stream=True)
"""Output[1]:
[PROMPT]
You asked your first question, "What is the provincial capital of Zhejiang?"
[OUTPUT] Your first question was “What is the provincial capital of Zhejiang?”
"""
For more on inference usage, please refer to [here](https://github.com/modelscope/swift/blob/main/docs/source/LLM/LLM_Inference_Guide.md).

View File

@@ -0,0 +1,203 @@
## 介绍
[SWIFT](https://github.com/modelscope/swift)Scalable lightWeight Infrastructure for Fine-Tuning是一个可扩展的轻量级一站式训练、推理深度学习框架。它集成了各种高效的微调方法如LoRA、QLoRA、阿里云自研的ResTuning-Bypass等以及开箱即用的训练推理脚本使开发者可以在单张商业级显卡上微调推理LLM&AIGC模型。此外SWIFT与PEFT完全兼容使开发者可以在ModelScope模型体系中使用PEFT的能力。
## 安装
```shell
# 设置pip全局镜像
pip config set global.index-url https://mirrors.aliyun.com/pypi/simple/
# 安装ms-swift
git clone https://github.com/modelscope/swift.git
cd swift
pip install -e .[llm]
# 如果你想要使用deepspeed.
pip install deepspeed -U
# 如果你想要使用基于auto_gptq的qlora训练. (推荐, 效果优于bnb)
# 支持auto_gptq的模型: `https://github.com/modelscope/swift/blob/main/docs/source/LLM/支持的模型和数据集.md#模型`
# auto_gptq和cuda版本有对应关系请按照`https://github.com/PanQiWei/AutoGPTQ#quick-installation`选择版本
pip install auto_gptq -U
# 如果你想要使用基于bnb的qlora训练.
pip install bitsandbytes -U
# 环境对齐 (如果你运行错误, 可以跑下面的代码, 仓库使用最新环境测试)
pip install -r requirements/framework.txt -U
pip install -r requirements/llm.txt -U
```
## webui使用
执行如下命令启动webui通过界面方式进行模型训练推理
```shell
swift web-ui
```
界面示例如下
![image](https://modelscope.oss-cn-beijing.aliyuncs.com/resource/swift_webui.jpg)
## 微调
```python
# Experimental environment: A10, 3090, V100, ...
# 20GB GPU memory
CUDA_VISIBLE_DEVICES=0 \
swift sft \
--model_id_or_path qwen/Qwen-7B-Chat \
--dataset blossom-math-zh \
--output_dir output \
# 使用自己的数据集
CUDA_VISIBLE_DEVICES=0 \
swift sft \
--model_id_or_path qwen/Qwen-7B-Chat \
--custom_train_dataset_path chatml.jsonl \
--output_dir output \
# 使用DDP
# Experimental environment: 2 * 3090
# 2 * 23GB GPU memory
CUDA_VISIBLE_DEVICES=0,1 \
NPROC_PER_NODE=2 \
swift sft \
--model_id_or_path qwen/Qwen-7B-Chat \
--dataset blossom-math-zh \
--output_dir output \
# 多机多卡
# node0
CUDA_VISIBLE_DEVICES=0,1,2,3 \
NNODES=2 \
NODE_RANK=0 \
MASTER_ADDR=127.0.0.1 \
NPROC_PER_NODE=4 \
swift sft \
--model_id_or_path qwen/Qwen-7B-Chat \
--dataset blossom-math-zh \
--output_dir output \
# node1
CUDA_VISIBLE_DEVICES=0,1,2,3 \
NNODES=2 \
NODE_RANK=1 \
MASTER_ADDR=xxx.xxx.xxx.xxx \
NPROC_PER_NODE=4 \
swift sft \
--model_id_or_path qwen/Qwen-7B-Chat \
--dataset blossom-math-zh \
--output_dir output \
```
更多微调方法参考[这里](https://github.com/modelscope/swift/blob/main/docs/source/LLM/LLM%E5%BE%AE%E8%B0%83%E6%96%87%E6%A1%A3.md#%E5%BE%AE%E8%B0%83)
已有微调代码示例
| 模型名称 | 训练方法 |
|:-------------------|:---------------------------------------------------------------------------------------------------------------------------|
| qwen_14b | [lora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_14b/lora_ddp_ds) |
| qwen_14b | [qlora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_14b/qlora) |
| qwen_14b | [qlora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_14b/qlora_ddp_ds) |
| qwen_14b_chat | [lora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_14b_chat/lora_ddp_ds) |
| qwen_14b_chat | [qlora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_14b_chat/qlora) |
| qwen_14b_chat | [qlora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_14b_chat/qlora_ddp_ds) |
| qwen_14b_chat_int4 | [qlora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_14b_chat_int4/qlora) |
| qwen_14b_chat_int4 | [qlora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_14b_chat_int4/qlora_ddp_ds) |
| qwen_14b_chat_int8 | [qlora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_14b_chat_int8/qlora) |
| qwen_14b_chat_int8 | [qlora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_14b_chat_int8/qlora_ddp_ds) |
| qwen_1_8b_chat | [full](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_1_8b_chat/full) |
| qwen_1_8b_chat | [full_ddp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_1_8b_chat/full_ddp) |
| qwen_72b_chat | [lora_mp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_72b_chat/lora_mp) |
| qwen_72b_chat | [lora_mp_ddp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_72b_chat/lora_mp_ddp) |
| qwen_72b_chat | [qlora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_72b_chat/qlora) |
| qwen_72b_chat_int4 | [qlora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_72b_chat_int4/qlora_ddp_ds) |
| qwen_72b_chat_int8 | [qlora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_72b_chat_int8/qlora_ddp_ds) |
| qwen_7b | [lora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b/lora_ddp_ds) |
| qwen_7b | [qlora_ddp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b/qlora_ddp) |
| qwen_7b_chat | [full](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat/full) |
| qwen_7b_chat | [full_freeze_ddp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat/full_freeze_ddp) |
| qwen_7b_chat | [full_mp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat/full_mp) |
| qwen_7b_chat | [full_mp_ddp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat/full_mp_ddp) |
| qwen_7b_chat | [lora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat/lora) |
| qwen_7b_chat | [lora_ddp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat/lora_ddp) |
| qwen_7b_chat | [lora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat/lora_ddp_ds) |
| qwen_7b_chat | [lora_mp_ddp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat/lora_mp_ddp) |
| qwen_7b_chat | [qlora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat/qlora) |
| qwen_7b_chat | [qlora_ddp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat/qlora_ddp) |
| qwen_7b_chat | [qlora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat/qlora_ddp_ds) |
| qwen_7b_chat_int4 | [qalora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat_int4/qalora) |
| qwen_7b_chat_int4 | [qlora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat_int4/qlora) |
| qwen_7b_chat_int4 | [qlora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat_int4/qlora_ddp_ds) |
| qwen_7b_chat_int8 | [qlora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat_int8/qlora) |
| qwen_7b_chat_int8 | [qlora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat_int8/qlora_ddp_ds) |
| qwen_audio_chat | [full_mp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_audio_chat/full_mp) |
| qwen_audio_chat | [full_mp_ddp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_audio_chat/full_mp_ddp) |
| qwen_audio_chat | [lora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_audio_chat/lora) |
| qwen_audio_chat | [lora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_audio_chat/lora_ddp_ds) |
| qwen_vl | [lora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_vl/lora_ddp_ds) |
| qwen_vl_chat | [full_mp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_vl_chat/full_mp) |
| qwen_vl_chat | [full_mp_ddp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_vl_chat/full_mp_ddp) |
| qwen_vl_chat | [lora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_vl_chat/lora) |
| qwen_vl_chat | [lora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_vl_chat/lora_ddp_ds) |
| qwen_vl_chat | [qlora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_vl_chat/qlora) |
| qwen_vl_chat_int4 | [qlora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_vl_chat_int4/qlora) |
| qwen_vl_chat_int4 | [qlora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_vl_chat_int4/qlora_ddp_ds) |
## 推理
```python
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
from swift.llm import (
get_model_tokenizer, get_template, inference, ModelType, get_default_template_type,
)
from swift.utils import seed_everything
model_type = ModelType.qwen_7b_chat
template_type = get_default_template_type(model_type)
print(f'template_type: {template_type}') # template_type: qwen
kwargs = {}
# kwargs['use_flash_attn'] = True # 使用flash_attn
model, tokenizer = get_model_tokenizer(model_type, model_kwargs={'device_map': 'auto'}, **kwargs)
# 修改max_new_tokens
model.generation_config.max_new_tokens = 128
template = get_template(template_type, tokenizer)
seed_everything(42)
query = '浙江的省会在哪里?'
response, history = inference(model, template, query)
print(f'query: {query}')
print(f'response: {response}')
query = '这有什么好吃的?'
response, history = inference(model, template, query, history)
print(f'query: {query}')
print(f'response: {response}')
print(f'history: {history}')
"""Out[0]
query: 浙江的省会在哪里?
response: 浙江省的省会是杭州。
query: 这有什么好吃的?
response: 杭州市有很多著名的美食,例如西湖醋鱼、龙井虾仁、糖醋排骨、毛血旺等。此外,还有杭州特色的点心,如桂花糕、荷花酥、艾窝窝等。
history: [('浙江的省会在哪里?', '浙江省的省会是杭州。'), ('这有什么好吃的?', '杭州市有很多著名的美食,例如西湖醋鱼、龙井虾仁、糖醋排骨、毛血旺等。此外,还有杭州特色的点心,如桂花糕、荷花酥、艾窝窝等。')]
"""
# 流式输出对话模板
inference(model, template, '第一个问题是什么', history, verbose=True, stream=True)
"""Out[1]
[PROMPT]<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
浙江的省会在哪里?<|im_end|>
<|im_start|>assistant
浙江省的省会是杭州。<|im_end|>
<|im_start|>user
这有什么好吃的?<|im_end|>
<|im_start|>assistant
杭州市有很多著名的美食,例如西湖醋鱼、龙井虾仁、糖醋排骨、毛血旺等。此外,还有杭州特色的点心,如桂花糕、荷花酥、艾窝窝等。<|im_end|>
<|im_start|>user
第一个问题是什么<|im_end|>
<|im_start|>assistant
[OUTPUT]你的第一个问题是“浙江的省会在哪里?”<|im_end|>
"""
```
更多推理使用请参考[这里](https://github.com/modelscope/swift/blob/main/docs/source/LLM/LLM%E6%8E%A8%E7%90%86%E6%96%87%E6%A1%A3.md)