Update README.md

Junyang Lin
2023-10-20 21:50:15 +08:00
committed by GitHub
parent 3e63f107fa
commit d082c2c926

@@ -683,7 +683,8 @@ We profile the GPU memory and training speed of both LoRA (LoRA (emb) refers to
 ### vLLM
 For deployment and fast inference, we suggest using vLLM with FastChat. Install the packages first:
 ```bash
-pip install vllm fastchat
+pip install vllm
+pip install "fschat[model_worker,webui]"
 ```
 Or you can install them from source by `git clone` and `pip install -e .`. We advise you to read their documents if you meet problems in installation.