add french readme

This commit is contained in:
JustinLin610
2023-10-17 20:28:36 +08:00
parent 93963f8d1f
commit e6d8deb975
4 changed files with 1150 additions and 16 deletions

View File

@@ -1,5 +1,5 @@
<p align="left">
<a href="README_CN.md">中文</a>&nbsp &nbspEnglish&nbsp &nbsp<a href="README_JA.md">日本語</a>
<a href="README_CN.md">中文</a>&nbsp &nbspEnglish&nbsp &nbsp<a href="README_JA.md">日本語</a> &nbsp<a href="README_FR.md">Français</a>
</p>
<br><br>
@@ -29,10 +29,11 @@ In brief, we have strong base language models, which have been stably pretrained
In this repo, you can figure out:
* Quickstart with Qwen, and enjoy the simple inference.
* Details about the quantization models, including usage, memory, inference speed. For comparison, we also provide the statistics of the BF16 models.
* Details about the quantization models, including GPTQ and KV cache quantization.
* Statistics of inference performance, including speed and memory.
* Tutorials on finetuning, including full-parameter tuning, LoRA, and Q-LoRA.
* Instructions on building demos, including WebUI, CLI demo, etc.
* Instructions on building an OpenAI-style API for your model.
* Introduction to DashScope API service, as well as the instructions on building an OpenAI-style API for your model.
* Information about Qwen for tool use, agent, and code interpreter
* Statistics of long-context understanding evaluation
* License agreement
@@ -341,7 +342,7 @@ We illustrate the model performance of both BF16, Int8 and Int4 models on the be
| Qwen-7B-Chat (Int8) | 55.4 | 59.4 | 48.3 | 34.8 |
| Qwen-7B-Chat (Int4) | 55.1 | 59.2 | 49.7 | 29.9 |
| Qwen-14B-Chat (BF16) | 64.6 | 69.8 | 60.1 | 43.9 |
| Qwen-14B-Chat (Int8) | 63.6 | 68.6 | 60.0 | 48.2 |
| Qwen-14B-Chat (Int8) | 63.6 | 68.6 | 60.0 | 48.2 |
| Qwen-14B-Chat (Int4) | 63.3 | 69.0 | 59.8 | 45.7 |
### Quantization of KV cache
@@ -556,14 +557,15 @@ The finetuning scripts allow you to perform:
- LoRA
- Q-LoRA
Full-parameter parameter finetuning requires updating all parameters in the whole training process. To launch your training, run the following script:
Full-parameter finetuning requires updating all parameters in the whole training process. To launch your training, run the following script:
```bash
# Distributed training. We do not provide single-GPU training script as the insufficient GPU memory will break down the training.
sh finetune/finetune_ds.sh
```
Remember to specify the correct model name or path, the data path, as well as the output directory in the shell scripts. Another thing to notice is that we use DeepSpeed ZeRO 3 in this script. If you want to make changes, just remove the argument `--deepspeed` or make changes in the DeepSpeed configuration json file based on your requirements. Additionally, this script supports mixed-precision training, and thus you can use `--bf16 True` or `--fp16 True`. Empirically we advise you to use bf16 to make your training consistent with our pretraining and alignment if your machine supports bf16, and thus we use it by default.
Remember to specify the correct model name or path, the data path, as well as the output directory in the shell scripts. Another thing to notice is that we use DeepSpeed ZeRO 3 in this script. If you want to make changes, just remove the argument `--deepspeed` or make changes in the DeepSpeed configuration json file based on your requirements. Additionally, this script supports mixed-precision training, and thus you can use `--bf16 True` or `--fp16 True`. Remember to use DeepSpeed when you use fp16 due to mixed precision training.
Empirically we advise you to use bf16 to make your training consistent with our pretraining and alignment if your machine supports bf16, and thus we use it by default.
Similarly, to run LoRA, use another script to run as shown below. Before you start, make sure that you have installed `peft`. Also, you need to specify your paths to your model, data, and output. We advise you to use absolute path for your pretrained model. This is because LoRA only saves the adapter and the absolute path in the adapter configuration json file is used for finding out the pretrained model to load. Also, this script support both bf16 and fp16.
@@ -831,7 +833,7 @@ If you suffer from lack of GPU memory and you would like to run the model on mor
## Tool Usage
Qwen-Chat has been optimized for tool usage and function calling capabilities. Users can develop agents, LangChain applications, and even agument Qwen with a Python Code Interpreter.
Qwen-Chat has been optimized for tool usage and function calling capabilities. Users can develop agents, LangChain applications, and even augment Qwen with a Python Code Interpreter.
We provide documentation on how to implement tool calls based on the principle of ReAct Prompting, please refer to [the ReAct example](examples/react_prompt.md). Based on this principle, we provide support for function calling in [openai_api.py](openai_api.py).