
Deep Learning: LLM Decoding and Distributed Inference with MindSpore NLP

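The LLM inference pipeline

1. The user enters a prompt

Suppose the user types: "从前,有一只小猫,它喜欢……" ("Once upon a time there was a kitten, and it liked ...")

The goal is for the model to generate a complete story.

2. The model processes the input

2.1 Tokenization: the prompt is split into tokens (words or subwords) that the model can process.

For example:

"从前,有一只小猫,它喜欢……" might be tokenized as:

["从前", ",", "有", "一只", "小猫", ",", "它", "喜欢", "……"]

Each token is then mapped to its index (ID) in the model's vocabulary. These IDs are the input_ids returned by the tokenizer.

2.2 Converting IDs to embeddings

The embedding layer maps each token ID to a high-dimensional vector (an embedding) that carries semantic information.

For this prompt, input_ids has shape (1, 9): one sample in the batch, with a sequence length of 9.

The dimension of each embedding vector is hidden_size, the model's hidden dimension. hidden_size is a hyperparameter chosen by the model designer.

After the embedding layer, the shape changes from (batch_size, seq_length) to (batch_size, seq_length, hidden_size). For example, (1, 9) becomes (1, 9, hidden_size).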
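The shape change can be checked with a small NumPy sketch. It is not the model's actual embedding layer: the vocabulary size, hidden size, and token IDs below are hypothetical values chosen only for illustration.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
vocab_size, hidden_size = 100, 8

rng = np.random.default_rng(0)
# The embedding layer is a lookup table with one row per vocabulary entry.
embedding_table = rng.normal(size=(vocab_size, hidden_size))

# One sample with nine made-up token IDs, matching the (1, 9) example.
input_ids = np.array([[11, 3, 27, 45, 8, 3, 19, 62, 5]])

# Indexing the table with the IDs performs the lookup.
embeddings = embedding_table[input_ids]
print(input_ids.shape, "->", embeddings.shape)
```

Because the lookup depends only on the ID, the two occurrences of ID 3 (the two commas in the example) receive identical vectors at this stage.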

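2.3 Adding positional encodings

To preserve the order of the input sequence, the model adds a positional encoding to each token. The encodings are summed with the token embeddings to form the final input representation.

The positional encoding has the same hidden_size as the input embedding, so the two can be added element-wise.

  • Shape of the input embedding
    (batch_size, seq_length, hidden_size), where hidden_size is the embedding dimension of each token.

  • Shape of the positional encoding
    Also (batch_size, seq_length, hidden_size) once broadcast over the batch, matching the input embedding exactly.

Because position information is injected directly into the representation, the model can tell tokens apart by where they occur in the sentence.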
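The addition can be sketched as follows. The article does not say which encoding scheme is meant, so this sketch assumes the fixed sinusoidal encoding; other models use learned or rotary position embeddings instead. All sizes are hypothetical, and hidden_size must be even for this particular formula.

```python
import numpy as np

def sinusoidal_positions(seq_length: int, hidden_size: int) -> np.ndarray:
    """Return a (seq_length, hidden_size) table of sinusoidal position encodings."""
    positions = np.arange(seq_length)[:, None]        # (seq_length, 1)
    dims = np.arange(0, hidden_size, 2)[None, :]      # (1, hidden_size / 2)
    angles = positions / np.power(10000.0, dims / hidden_size)
    table = np.zeros((seq_length, hidden_size))
    table[:, 0::2] = np.sin(angles)  # even dimensions
    table[:, 1::2] = np.cos(angles)  # odd dimensions
    return table

batch_size, seq_length, hidden_size = 1, 9, 8  # hypothetical sizes
token_embeddings = np.ones((batch_size, seq_length, hidden_size))

position_table = sinusoidal_positions(seq_length, hidden_size)
# Broadcasting adds the same table to every sample in the batch.
model_input = token_embeddings + position_table[None, :, :]
print(model_input.shape)
```

The table itself has shape (seq_length, hidden_size); broadcasting over the batch dimension is what makes it line up with the embedding tensor.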

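3. Forward pass

The processed input tensor is fed through the model for the forward computation.

4. Generating output

In autoregressive generation, the model produces one token per step, so the shape of the output changes as generation proceeds.

  • Input shape: (1, 9)

  • Shape of the output probability distribution: (1, 9, vocab_size)

  • Shape of the next generated token: (1, 1)

4.1 Output probability distribution

The output of the last Transformer layer passes through a linear layer and a softmax function, producing a probability for every token in the vocabulary. Only the distribution at the last position is used to choose the next token. For example, the model might predict that the next token is "玩耍" ("play") with probability 0.4, "睡觉" ("sleep") with probability 0.3, and so on.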
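The shapes listed above can be traced in a minimal NumPy sketch. The hidden states and the linear layer weights are random placeholders, and the sizes are hypothetical; only the shapes and the softmax step mirror the description.

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over the last axis."""
    shifted = x - x.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

batch_size, seq_length, hidden_size, vocab_size = 1, 9, 8, 20  # hypothetical sizes
rng = np.random.default_rng(0)

# Placeholder for the output of the last Transformer layer.
hidden_states = rng.normal(size=(batch_size, seq_length, hidden_size))
# Placeholder for the weights of the final linear layer.
lm_head = rng.normal(size=(hidden_size, vocab_size))

logits = hidden_states @ lm_head        # (1, 9, vocab_size)
probs = softmax(logits)                 # one distribution per position
next_token_probs = probs[:, -1, :]      # the last position predicts the next token
next_token = next_token_probs.argmax(axis=-1)[:, None]  # greedy choice, shape (1, 1)
print(logits.shape, next_token_probs.shape, next_token.shape)
```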

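4.2 Decoding strategy

The model picks the next token based on the probability distribution. Common decoding strategies include:

  • Greedy search
    Pick the token with the highest probability, for example "玩耍" as the next token.

  • Beam search
    Keep several candidate sequences and choose the one with the highest overall probability.

  • Sampling
    Randomly sample the next token according to the probability distribution.

Sampling draws directly from the probability distribution that the model outputs:

  • Basis of sampling
    The distribution determines how likely each token is to be drawn.

  • Role of the distribution
    It reflects the model's "confidence" in, or "preference" for, each token. High-probability tokens are drawn more often, but low-probability tokens can still be drawn (especially with settings that favor diversity).

  • Uncertainty of the result
    Because sampling is random, the same distribution can yield different tokens on different runs. Greedy search, by contrast, always picks the highest-probability token.

Top-K and Top-P sampling can both be combined with a temperature setting.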

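5. Iterative generation

5.1 Feeding tokens back in

The model appends each generated token to the input and uses the extended sequence to generate the next token. For example:

  1. Input prompt: "从前,有一只小猫,它喜欢……"

  2. The model generates: "玩耍"

  3. New input: "从前,有一只小猫,它喜欢玩耍"

  4. The model continues with: ",每天……"

Generation continues until the maximum generation length is reached or the model emits a special end-of-sequence token (such as <EOS>).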
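The loop and its two stopping conditions can be sketched with a toy stand-in for the model. Everything here is hypothetical: toy_model, the vocabulary size, and the EOS ID exist only to make the control flow runnable, and greedy selection is used for simplicity.

```python
from typing import List

import numpy as np

VOCAB_SIZE = 10
EOS_ID = 0  # hypothetical end-of-sequence token ID

def toy_model(input_ids: List[int]) -> np.ndarray:
    """Stand-in for a real LLM: returns next-token logits for the sequence so far."""
    logits = np.zeros(VOCAB_SIZE)
    # Toy rule: prefer (last token + 1), which eventually wraps around to EOS.
    logits[(input_ids[-1] + 1) % VOCAB_SIZE] = 5.0
    return logits

def generate(prompt_ids: List[int], max_new_tokens: int) -> List[int]:
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        next_id = int(np.argmax(toy_model(ids)))  # greedy choice for simplicity
        ids.append(next_id)                       # feed the new token back in
        if next_id == EOS_ID:                     # stop at the end-of-sequence token
            break
    return ids

print(generate([6, 7], max_new_tokens=20))  # stops early when EOS is produced
print(generate([1], max_new_tokens=3))      # stops at the length limit
```

The first call ends because the toy model emits the EOS ID; the second ends because max_new_tokens is exhausted.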

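6. Final output

The complete story generated by the model might be:
"从前,有一只小猫,它喜欢玩耍,每天都会在花园里追逐蝴蝶。有一天,它遇到了一只小鸟……" ("Once upon a time there was a kitten that liked to play and chased butterflies in the garden every day. One day it met a little bird ...")

LLMs generally do not rely on greedy decoding alone (always outputting the highest-probability token). With greedy decoding, the model returns the same reply every time for the same input, because the parameters are fixed at inference time and nothing is random. For this reason, a range of other decoding strategies is used in practice.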

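Decoding strategies compared

Suppose the model is predicting the token after "The cat" and returns:

• sat  (0.5)

• jumped  (0.3)

• is  (0.1)

• slept  (0.05)

• runs  (0.05)

1. Top-k sampling

Top-k sampling introduces randomness into decoding while restricting the candidates to the k most probable tokens. The next token is sampled at random from these k tokens.

In the example, with k = 2, Top-k sampling selects sat (0.5) and jumped (0.3) and then samples the next token from these two.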
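A minimal NumPy sketch of this example follows. It is an illustration of the idea, not the filtering code used by any particular library; the helper name top_k_filter is hypothetical, and the renormalization of the surviving probabilities is an assumption about how the two tokens are weighted.

```python
import numpy as np

tokens = ["sat", "jumped", "is", "slept", "runs"]
probs = np.array([0.5, 0.3, 0.1, 0.05, 0.05])

def top_k_filter(probs: np.ndarray, k: int) -> np.ndarray:
    """Zero out everything except the k most likely tokens, then renormalize."""
    keep = np.argsort(probs)[::-1][:k]   # indices of the k largest probabilities
    filtered = np.zeros_like(probs)
    filtered[keep] = probs[keep]
    return filtered / filtered.sum()

filtered = top_k_filter(probs, k=2)
print(dict(zip(tokens, filtered.round(3))))

# Draw ten next tokens from the filtered distribution.
rng = np.random.default_rng(0)
samples = rng.choice(tokens, size=10, p=filtered)
print(samples)
```

After renormalization, sat has probability 0.5 / 0.8 = 0.625 and jumped has 0.3 / 0.8 = 0.375, so only these two tokens can ever be drawn.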

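2. Top-p sampling

Top-p sampling sets a threshold P, then takes tokens in descending order of probability until their cumulative probability reaches P. The next token is sampled from the n tokens selected this way.

In the example, with P = 0.9, Top-p sampling selects sat (0.5), jumped (0.3), and is (0.1), whose probabilities sum to 0.9, and then samples the next token from these three.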
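The selection rule can be sketched in NumPy as below. The helper name top_p_filter is hypothetical, the small tolerance is only there to guard against floating-point rounding in the cumulative sum, and library implementations may handle the boundary case differently.

```python
import numpy as np

tokens = ["sat", "jumped", "is", "slept", "runs"]
probs = np.array([0.5, 0.3, 0.1, 0.05, 0.05])

def top_p_filter(probs: np.ndarray, p: float) -> np.ndarray:
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    order = np.argsort(probs)[::-1]          # most likely first
    cumulative = np.cumsum(probs[order])
    # Number of tokens needed for the cumulative probability to reach p.
    cutoff = int(np.searchsorted(cumulative, p - 1e-9)) + 1
    keep = order[:cutoff]
    filtered = np.zeros_like(probs)
    filtered[keep] = probs[keep]
    return filtered / filtered.sum()

filtered = top_p_filter(probs, p=0.9)
print(dict(zip(tokens, filtered.round(3))))

# Draw ten next tokens from the filtered distribution.
rng = np.random.default_rng(0)
samples = rng.choice(tokens, size=10, p=filtered)
print(samples)
```

Unlike Top-k, the number of surviving tokens is not fixed: with p = 0.5 only sat would remain, while p = 0.9 keeps three tokens.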

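3. Temperature sampling

Temperature is a hyperparameter that reshapes the distribution tokens are sampled from. The logits are scaled by a factor of 1/T before the softmax, where T is the temperature.

  • T = 1: the distribution is unchanged.
  • T > 1: the output becomes more random, and low-probability tokens appear more often.
  • T < 1: the output becomes more deterministic, and high-probability tokens are chosen more often.

At high temperature the model becomes "more creative"; at low temperature it becomes "more precise".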
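The three cases can be checked against the "The cat" distribution. This sketch recovers logits as the logarithm of the probabilities, which is valid up to an additive constant because softmax(log p) = p; the helper name apply_temperature is hypothetical.

```python
import numpy as np

tokens = ["sat", "jumped", "is", "slept", "runs"]
probs = np.array([0.5, 0.3, 0.1, 0.05, 0.05])

def apply_temperature(probs: np.ndarray, temperature: float) -> np.ndarray:
    """Rescale a distribution by dividing its logits by the temperature."""
    logits = np.log(probs)        # logits recovered up to an additive constant
    scaled = logits / temperature
    scaled -= scaled.max()        # subtract the max for numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum()

for t in (0.5, 1.0, 2.0):
    print(t, apply_temperature(probs, t).round(3))
```

With T = 0.5 the probability of sat rises from 0.5 to about 0.70; with T = 2 it falls to about 0.35, and the two rarest tokens roughly double their share.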

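4. Beam search

Beam search is a more refined version of greedy search: instead of a single sequence, it keeps the top-k sequences and extends them in parallel.

  • At each step, the model extends each of the k sequences with its most probable next tokens.
  • The beam width (k) determines how many candidate sequences are kept at each step.
  • After each step, the candidates are ranked by cumulative probability and only the k most probable are kept for further expansion.

In the example, suppose the beam width is 2. The two most probable tokens are kept for further generation:

"The cat sat" (cumulative probability: 0.5)

"The cat jumped" (cumulative probability: 0.3)

The model then keeps extending both sequences, for example:

"The cat sat on the mat"

"The cat jumped over the fence"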
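The expand-rank-prune cycle can be sketched with a tiny lookup table in place of a model. Only the probabilities after "cat" come from the example; the continuations of "sat" and "jumped" are made-up values, and conditioning on the last token alone is a simplification that a real model does not make.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical next-token probabilities, keyed by the last token of the sequence.
# Only the "cat" row comes from the example; the other rows are made up.
NEXT_TOKEN_PROBS: Dict[str, Dict[str, float]] = {
    "cat": {"sat": 0.5, "jumped": 0.3, "is": 0.1, "slept": 0.05, "runs": 0.05},
    "sat": {"on": 0.6, "down": 0.4},
    "jumped": {"over": 0.9, "up": 0.1},
}

def beam_search(start: List[str], beam_width: int, steps: int) -> List[Tuple[List[str], float]]:
    """Return the beam_width best sequences with their cumulative log-probabilities."""
    beams = [(start, 0.0)]
    for _ in range(steps):
        candidates = []
        for seq, score in beams:
            # Expand every kept sequence with every possible next token.
            for token, prob in NEXT_TOKEN_PROBS.get(seq[-1], {}).items():
                candidates.append((seq + [token], score + math.log(prob)))
        if not candidates:
            break
        # Rank by cumulative log-probability and keep the best beam_width.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams

for seq, score in beam_search(["The", "cat"], beam_width=2, steps=2):
    print(" ".join(seq), round(math.exp(score), 2))
```

Log-probabilities are summed rather than multiplying raw probabilities, which avoids numerical underflow on long sequences. In this toy table, "The cat jumped over" (0.3 × 0.9 = 0.27) survives the second step ahead of "The cat sat down" (0.5 × 0.4 = 0.2), a candidate that greedy search would never have considered.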

Later developments of beam search include Diverse Beam Search.

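When to use each decoding strategy

  • Greedy decoding
    A simple, computationally cheap choice when text must be generated quickly and the quality requirements are modest. It picks the token with the largest logit as the next output, which suits scenarios that need a fast response, such as a chatbot's first-pass reply.

  • Beam search
    Suited to scenarios that need tight control over output quality, such as machine translation or question answering. By considering several candidate sequences at once, beam search can improve the accuracy and fluency of translations.

  • Sampling
    Suited to scenarios that need diverse output, such as creative writing or answers to open-ended questions. Tokens are drawn from the vocabulary according to the probability distribution, and parameters such as temperature control the degree of randomness.

  • Top-K / Top-P
    Suited to scenarios where sampled output must stay high in quality. By restricting the set of candidate tokens, Top-K and Top-P improve coherence and reduce repetition.

  • Temperature sampling
    Suited to scenarios that call for more randomness during generation, such as creative writing or exploratory tasks. The temperature adjusts how random the output is: lower values bring sampling closer to deterministic decoding, and higher values increase randomness.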

Decoding and inference with MindSpore

Create a Notebook

mindspore==2.3.0, cann==8.0

Upgrade mindspore

pip install --upgrade mindspore

Clone mindnlp

git clone https://github.com/mindspore-lab/mindnlp.git

Update mindnlp

cd mindnlp
bash scripts/build_and_reinstall.sh

Uninstall mindformers

pip uninstall mindformers

Loading the model and converting the input

import mindspore
from mindnlp.transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "LLM-Research/Meta-Llama-3-8B-Instruct"

# Download the Llama 3 tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id, mirror="modelscope")

# Download the Llama 3 model
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    ms_dtype=mindspore.float16,
    mirror="modelscope"
)

# Input messages
messages = [
    {"role": "system", "content": "You are a psychological counsellor, who is good at emotional comfort"},
    {"role": "user", "content": "I don't sleep well for a long time."}
]

# Convert the messages to input_ids
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="ms"
)

# Token IDs that terminate generation
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

# Generate the model output
outputs = model.generate(
    input_ids,                  # input tokens
    max_new_tokens=50,          # limit the output length
    eos_token_id=terminators,   # terminator tokens
    do_sample=True,             # sample from the output probability distribution
    top_p=1.0                   # top-p value
)

Greedy decoding

# Greedy decoding
# Generate the model output
outputs = model.generate(
    input_ids,                  # input tokens
    max_new_tokens=1000,        # limit the output length
    eos_token_id=terminators,   # terminator tokens
    do_sample=False,            # disable sampling, i.e. decode greedily
)
response = outputs[0][input_ids.shape[-1]:]
tokenizer.decode(response, skip_special_tokens=True)

Model output:

"I'm so sorry to hear that you're struggling with sleep. It can be really frustrating and affect many aspects of your daily life. Can you tell me a bit more about what's been going on? What's been keeping you awake at night? Is it stress, anxiety, or something else?\n\nAlso, have you noticed any patterns or triggers that might be contributing to your insomnia? For example, do you find yourself lying awake for hours, or do you wake up multiple times during the night?\n\nRemember, I'm here to listen and support you, and I want you to feel comfortable sharing as much or as little as you'd like."

Repeating the call several times produces the same output every time.

The temperature parameter

temperature controls the randomness and diversity of the generated text by reshaping the probability distribution over output tokens.

import mindspore
from mindspore import Tensor
import numpy as np
import mindspore.ops as ops

logits = Tensor(np.array([[0.5, 1.2, -1.0, 0.1]]), mindspore.float32)

probs = ops.softmax(logits, axis=-1)
# low temp = 0.5
# the distribution becomes more concentrated (sharper)
probs_low = ops.softmax(logits / 0.5, axis=-1)
# high temp = 2
# the distribution becomes more spread out (flatter)
probs_high = ops.softmax(logits / 2, axis=-1)

probs, probs_low, probs_high
(Tensor(shape=[1, 4], dtype=Float32, value=
 [[ 2.55937576e-01,  5.15394986e-01,  5.71073927e-02,  1.71560094e-01]]),
 Tensor(shape=[1, 4], dtype=Float32, value=
 [[ 1.80040166e-01,  7.30098903e-01,  8.96367151e-03,  8.08972642e-02]]),
 Tensor(shape=[1, 4], dtype=Float32, value=
 [[ 2.69529819e-01,  3.82481009e-01,  1.27316862e-01,  2.20672339e-01]]))

As the output shows, the higher the temperature, the flatter the distribution; the lower the temperature, the more concentrated it is.

temperature=1

# Generate the model output
outputs = model.generate(
    input_ids,                  # input tokens
    max_new_tokens=1000,        # limit the output length
    eos_token_id=terminators,   # terminator tokens
    do_sample=True,             # sample from the output probability distribution
    temperature=1.0
)
# Output at the standard temperature
response = outputs[0][input_ids.shape[-1]:]
tokenizer.decode(response, skip_special_tokens=True)

Output 1:

"I'm so sorry to hear that you're struggling with sleep. It can be really tough to deal with insomnia or disrupted sleep patterns. Can you tell me a bit more about what's been going on? What's been on your mind lately that might be keeping you awake? Has anything changed in your life that could be contributing to this difficulty?"

Output 2:

"I'm so sorry to hear that you're struggling with sleep. It can be such a frustrating and debilitating experience. Can you tell me a bit more about what's been going on for you? What's been making it hard for you to fall asleep or stay asleep? Is it racing thoughts, stress, anxiety, or something else?\n\nAlso, how long have you been experiencing this sleep difficulty? Has it been a recent development or has it been going on for a while?"

temperature=0.1

Output 1:

"I'm so sorry to hear that you're struggling with sleep. It can be really frustrating and affect many aspects of your daily life. Can you tell me a bit more about what's been going on? What are some of the things that make it hard for you to fall asleep or stay asleep? Is it stress, anxiety, physical discomfort, or something else?\n\nAlso, have you noticed any patterns or triggers that seem to make it worse? For example, do you tend to have trouble sleeping on certain nights of the week, or after certain events or activities?\n\nRemember, I'm here to listen and support you, and I want you to feel comfortable sharing as much or as little as you'd like."

Output 2:

"I'm so sorry to hear that you're struggling with sleep. It can be really frustrating and affect many aspects of your daily life. Can you tell me a bit more about what's been going on? What's been on your mind lately, and how have you been feeling when you wake up in the morning?"

Output 3:

"I'm so sorry to hear that you're struggling with sleep. It can be really frustrating and affect many aspects of your daily life. Can you tell me a bit more about what's been going on? What's been on your mind lately that might be making it hard for you to fall asleep or stay asleep?"

temperature=2

# Generate the model output
outputs = model.generate(
    input_ids,                  # input tokens
    max_new_tokens=1000,        # limit the output length
    eos_token_id=terminators,   # terminator tokens
    do_sample=True,             # sample from the output probability distribution
    temperature=2.0
)
# High temperature -> flatter probability distribution
response = outputs[0][input_ids.shape[-1]:]
tokenizer.decode(response, skip_special_tokens=True)

Output 1:

"I'm so sorry to hear that. Not getting proper sleep can be really wearing on your emotional and physical well-being. Can you tell me a little bit more about how this lack of sleep is affecting you? Are you feeling constantly exhausted, irritable, or struggling to concentrate? Have you noticed any changes in your relationships or daily routine because of it?\n\nMost importantly, I'm here for you, and I believe that by exploring this together, we can find ways to improve your sleep and improve your overall well-being.\n\nIt might be helpful for me to share that sometimes, lack of sleep can be a sign of underlying anxiety, stress, or even unprocessed emotions. If we can identify the root cause, I may have some suggestions on how to ease your path to better sleep.\n\nWould you like me to offer you some coping strategies to help you relax and unwind before bedtime? Sometimes, a simple change in routine or relaxation techniques can make a world of difference."

Output 2:

"It can be really frustrating and worrying when sleep evade you, making it hard to wake up feeling refreshed and energized. I'm listening, and I want you to know that I'm here to support you. It's important to recognize that this is a tough and normal experience, even if it can be tough to bear right now.\n\nWould you like to talk more about what's going on when you have trouble sleeping? Is there anything in particular that bothers you or stress you out?"

Output 3:

"It can be really distressing to deal with chronic sleep issues, not getting the rest you need and feeling tired and exhausted all the time. Can you tell me a little bit more about how you've been feeling? Have you noticed any patterns or triggers that might be contributing to the issue? And how has it been affecting other aspects of your life?\n\nAlso, I want you to know that as your listener, my main goal right now is just to support and provide comfort. Whatever you share, I'm here for you. No judgments, no critiques, just a gentle and compassionate space for you to express yourself.\n\nRemember, it takes a lot of courage to share vulnerable thoughts and feelings with someone like me, and I want to assure you that your feelings are completely normal and valid. Okay?"

Beam search

# Beam search
# Note: in Hugging Face-style generate() APIs, beam search is enabled with
# num_beams > 1; this call, as published, does not set num_beams.
beam_outputs = model.generate(
    input_ids,                  # input tokens
    max_new_tokens=100,         # limit the output length
    eos_token_id=terminators,   # terminator tokens
    num_return_sequences=5,
    early_stopping=True,
)
for i, beam_output in enumerate(beam_outputs):
    print("{}: {}".format(i, tokenizer.decode(beam_output, skip_special_tokens=True)))

Output 1:
0: systemYou are a psychological counsellor, who is good at emotional comfortuserI don't sleep well for a long time.assistantI'm so sorry to hear that you're struggling with sleep. It can be really frustrating and affect so many areas of your life. Can you tell me a bit more about what's been going on? When did you start having trouble sleeping, and what do you usually do when you try to go to bed? Do you find yourself lying awake for hours, or do you fall asleep but then wake up multiple times throughout the night?

Output 2:
1: systemYou are a psychological counsellor, who is good at emotional comfortuserI don't sleep well for a long time.assistantI'm so sorry to hear that you're struggling with sleep. It can be such a frustrating and unsettling experience, affecting not just your physical health but also your emotional well-being.First of all, please know that you're not alone in this struggle. Many people face difficulties with sleep from time to time, and it's not uncommon for it to be a persistent issue for some. It's okay to acknowledge that you're struggling, and it takes a lot of courage to reach out and talk about

Output 3:
2: systemYou are a psychological counsellor, who is good at emotional comfortuserI don't sleep well for a long time.assistantI'm so sorry to hear that you're struggling with sleep. It can be such a frustrating and isolating experience. It's like your body is refusing to cooperate with your mind, and it can leave you feeling exhausted, irritable, and just plain miserable.First of all, let me just acknowledge that it's okay to not be okay. It's okay to struggle with sleep, and it's okay to feel overwhelmed and stuck. I'm here to listen, and I want you to know

Output 4:
3: systemYou are a psychological counsellor, who is good at emotional comfortuserI don't sleep well for a long time.assistantI'm so sorry to hear that you're struggling with sleep. It can be really frustrating and affect many aspects of your daily life. Can you tell me a bit more about what's been going on? What are some common things that happen when you try to fall asleep, and what do you usually do when you wake up during the night?

Output 5:
4: systemYou are a psychological counsellor, who is good at emotional comfortuserI don't sleep well for a long time.assistantI'm so sorry to hear that you're struggling with sleep. It can be such a frustrating and exhausting experience. Can you tell me a bit more about what's been going on for you? When did you first start noticing that your sleep was affected, and what do you think might be contributing to it? Is it stress, anxiety, or something else entirely?Remember, everything we discuss is completely confidential and a safe space for you to express yourself. I'm here to listen and support you,
# Beam search
beam_outputs = model.generate(
    input_ids,                  # input tokens
    max_new_tokens=100,         # limit the output length
    eos_token_id=terminators,   # terminator tokens
    num_return_sequences=5,
    early_stopping=True,
    no_repeat_ngram_size=2      # prevents any 2-gram from appearing twice in a generated sequence
)
for i, beam_output in enumerate(beam_outputs):
    print("{}: {}".format(i, tokenizer.decode(beam_output, skip_special_tokens=True)))

Output 1:
0: systemYou are a psychological counsellor, who is good at emotional comfortuserI don't sleep well for a long time.assistantIt can be really tough to deal with sleepless nights, and it's completely normal to feel frustrated, anxious, or even a bit hopeless. Can you tell me a little bit more about what's been going on? What's making it hard for you to fall asleep, do you think? Is it stress, worries, physical discomfort, something else, a combination of things?Also, how have you been coping with the lack of sleep? Have you noticed any changes in your daily life, mood

Output 2:
1: systemYou are a psychological counsellor, who is good at emotional comfortuserI don't sleep well for a long time.assistantSweetheart, I'm so sorry to hear that you're struggling with sleep. It can be really tough to deal with, both physically and emotionally. Can you tell me a bit more about what's been going on for you? What's making it hard for you to fall asleep or stay asleep? Is it stress, anxiety, or something else entirely?Remember, everything we discuss is completely confidential and a safe space for us to explore your feelings. I want you to know that I believe in your

Output 3:
2: systemYou are a psychological counsellor, who is good at emotional comfortuserI don't sleep well for a long time.assistantSweetheart, I'm so sorry to hear that you're struggling with sleep. It can be such a frustrating and exhausting experience. Can you tell me a bit more about what's been going on? When did you start noticing that your sleep was affected, and what are some of the things that make it hard for you to fall asleep or stay asleep?

Output 4:
3: systemYou are a psychological counsellor, who is good at emotional comfortuserI don't sleep well for a long time.assistantIt sounds like you're struggling with insomnia or difficulty sleeping, and that's really tough. Not getting enough sleep can affect so many aspects of our lives, from our mood to our energy levels to even our physical health.First of all, I want you to know that you don’t have to go through this alone. I'm here to listen and support you. Can you tell me a bit more about what's been going on? When did you start noticing trouble sleeping? Is it a sudden change

Output 5:
4: systemYou are a psychological counsellor, who is good at emotional comfortuserI don't sleep well for a long time.assistantSweetheart, I'm so sorry to hear that you're struggling with sleep. It can be such a frustrating and debilitating experience, feeling like you can't get a good night's rest. Can you tell me a bit more about what's been going on for you? When did you first start noticing that your sleep was affected, and what are some of the things that keep you awake at night?

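Top-K sampling

# top-k sampling
outputs = model.generate(
    input_ids,                  # input tokens
    max_new_tokens=100,         # limit the output length
    eos_token_id=terminators,   # terminator tokens
    do_sample=True,
    top_k=5,
    num_return_sequences=3
)
for i, output in enumerate(outputs):
    # Drop the prompt tokens and keep only the newly generated part
    response = output[input_ids.shape[-1]:]
    print("{}: {}".format(i+1, tokenizer.decode(response, skip_special_tokens=True)))

With k=5, sampling is restricted to the five most probable tokens, so the results are less creative.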

Output 1:

"I'm so sorry to hear that you're struggling with sleep. It can be really frustrating and affect many aspects of your daily life. Can you tell me a bit more about what's been going on? When did you start having trouble sleeping, and what's been making it hard for you to fall asleep or stay asleep? Is it stress, anxiety, physical discomfort, or something else entirely?\n\nAlso, how have you been feeling during the day? Are you feeling tired, irritable, or just"

Output 2:

"I'm so sorry to hear that you're struggling with sleep. It can be really frustrating and affect so many aspects of your daily life. Can you tell me a bit more about what's been going on? What are some of the things that are making it hard for you to fall asleep or stay asleep? Is it stress, anxiety, or something else entirely?\n\nAlso, have you noticed any patterns or triggers that seem to make it worse? For example, do you tend to have trouble sleeping during"

Output 3:

"I'm so sorry to hear that you're struggling with sleep. It can be really frustrating and affect so many areas of your life. Can you tell me a bit more about what's been going on? Have you noticed any patterns or triggers that might be contributing to your insomnia? And how have you been feeling during the day when you're not getting a good night's sleep?"

With k = 1000, sampling is more random.

# top-k sampling
outputs = model.generate(
    input_ids,                  # input tokens
    max_new_tokens=100,         # limit the output length
    eos_token_id=terminators,   # terminator tokens
    do_sample=True,
    top_k=1000
)
response = outputs[0][input_ids.shape[-1]:]
tokenizer.decode(response, skip_special_tokens=True)

Output 1:

"I'm so sorry to hear that you're struggling with sleep. It can be really frustrating and affect many areas of your life. Can you tell me a bit more about what's been going on? What's been keeping you awake at night? Is it stress, anxiety, or something else entirely?\n\nRemember, I'm here to listen and offer support. I'm not here to judge or try to fix the problem right away. Just talking about it can sometimes help you feel a bit better.\n\nAlso"

Output 2:

"I'm so sorry to hear that you're struggling with sleep. It can be really frustrating and affect many aspects of your daily life. Can you tell me a bit more about what's been going on? Is it just difficulty falling asleep, or are you having trouble staying asleep or experiencing restless nights? And have you noticed any patterns or triggers that might be contributing to your sleep issues?\n\nAlso, I want you to know that it's completely normal to struggle with sleep from time to time, and it"

Output 3:

"I'm so sorry to hear that you're struggling with sleep. It can be such a frustrating and exhausting experience. Can you tell me a bit more about what's been going on? What's been keeping you awake at night? Is it stress, anxiety, or something else entirely?\n\nRemember, everything we discuss is confidential and a safe space for you to share your feelings. I'm here to listen and offer support.\n\nAlso, I want you to know that you're not alone in this struggle."

Top-P sampling

# top-p sampling
outputs = model.generate(
    input_ids,                  # input tokens
    max_new_tokens=100,         # limit the output length
    eos_token_id=terminators,   # terminator tokens
    do_sample=True,
    top_p=0.5
)
response = outputs[0][input_ids.shape[-1]:]
tokenizer.decode(response, skip_special_tokens=True)

p = 0.5 

Output 1:

"I'm so sorry to hear that you're struggling with sleep. It can be really frustrating and affect many aspects of your daily life. Can you tell me a bit more about what's been going on? What are some of the things that make it hard for you to fall asleep or stay asleep? Is it stress, anxiety, physical discomfort, or something else?\n\nAlso, have you noticed any patterns or triggers that seem to make it worse? For example, do you tend to have trouble sleeping on"

Output 2:

"I'm so sorry to hear that you're struggling with sleep. It can be really frustrating and affect many aspects of your daily life. Can you tell me a bit more about what's been going on? What are some of the things that are making it hard for you to fall asleep or stay asleep? Is it stress, anxiety, physical discomfort, or something else?\n\nAlso, have you noticed any patterns or triggers that seem to make it worse? For example, do you tend to have trouble sleeping"

Output 3:

"I'm so sorry to hear that you're struggling with sleep. It can be really frustrating and affect many aspects of your daily life. Can you tell me a bit more about what's been going on? What are some of the things that make it hard for you to fall asleep or stay asleep? Is it stress, anxiety, physical discomfort, or something else?\n\nAlso, have you noticed any patterns or triggers that seem to make it worse? For example, do you tend to have trouble sleeping on"

p = 0.95

Output 1:

"I'm so sorry to hear that you're struggling with sleep. It can be such a frustrating and exhausting experience. Can you tell me a bit more about what's been going on? When did you first start noticing that you weren't sleeping well, and what are some of the things that you've tried to help you get a good night's rest?\n\nIt's also important to acknowledge that it's okay to not be okay. It takes a lot of courage to admit when we're struggling, and"

Output 2:

"I'm so sorry to hear that you're struggling with sleep. It can be really frustrating and affect many aspects of your daily life. Can you tell me a bit more about what's been going on? Have you noticed any patterns or triggers that might be contributing to your insomnia?"

Output 3:

"I'm so sorry to hear that you're struggling with sleep. It can be really frustrating and affect so many aspects of your daily life. Can you tell me a bit more about what's been going on? What are some of the things that are making it hard for you to fall asleep or stay asleep? Is it stress, anxiety, or something else?"

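Parallel inference with MindNLP: multiple processes, multiple devices

Example code:

https://github.com/mindspore-lab/mindnlp/tree/master/llm/inference/llama3

The README in the repository explains how to run multi-device inference.

Note: on ModelArts, choose the image in the Guiyang region with mindspore==2.3.0 and cann==8.0.0. Do not upgrade MindSpore after startup, otherwise the HCCL communication library will no longer be compatible.

Recommended: use the msrun command

msrun is the multi-process parallel launcher provided by MindSpore. Launching with it gives the best performance.

msrun --worker_num=2 --local_worker_num=2 --master_port=8118 --join=True run_llama3_distributed.py
# Set the worker counts according to how many devices you have
  1. --worker_num=2:

    The job uses 2 worker processes in total, counted across all participating machines.
  2. --local_worker_num=2:

    2 worker processes are launched on the current machine.
  3. --master_port=8118:

    The master (scheduler) node uses port 8118. The master coordinates communication and setup among the workers.
  4. --join=True:

    msrun waits for the worker processes to exit instead of returning right after launching them.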

The run_llama3_distributed.py file is as follows:

# Import the MindSpore framework
import mindspore
# init() from the communication module initializes the distributed environment
from mindspore.communication import init
# AutoTokenizer and AutoModelForCausalLM load the pretrained tokenizer and model
from mindnlp.transformers import AutoTokenizer, AutoModelForCausalLM

# Model ID: the Meta-Llama-3-8B-Instruct model
model_id = "LLM-Research/Meta-Llama-3-8B-Instruct"

# Initialize the distributed environment so that the processes and devices can communicate
init()

# Load the tokenizer from the pretrained model
# mirror='modelscope' downloads from the ModelScope platform
tokenizer = AutoTokenizer.from_pretrained(model_id, mirror='modelscope')

# Load the language model from the pretrained weights
# ms_dtype=mindspore.float16 runs the model in half precision (float16)
# mirror='modelscope' downloads from the ModelScope platform
# device_map="auto" automatically places the model on the available devices
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    ms_dtype=mindspore.float16,
    mirror='modelscope',
    device_map="auto"
)

# Conversation messages: a system prompt and the user input
messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]

# Convert the messages into the model's input tensor
# add_generation_prompt=True appends the generation prompt so the model knows to reply
# return_tensors="ms" returns a MindSpore tensor
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="ms"
)

# Terminator tokens that tell the model when to stop generating
# These are the end-of-sequence token (eos_token_id) and the custom terminator <|eot_id|>
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

# Generate text
# max_new_tokens=100 limits generation to at most 100 new tokens
# eos_token_id=terminators sets the terminator list
# do_sample=True enables sampling instead of greedy decoding
# temperature=0.6 controls randomness; lower values are more deterministic
# top_p=0.9 uses nucleus sampling, keeping the top tokens that cover 90% of the probability mass
outputs = model.generate(
    input_ids,
    max_new_tokens=100,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

# outputs[0] is the full sequence and input_ids.shape[-1] is the prompt length,
# so slicing keeps only the newly generated tokens
response = outputs[0][input_ids.shape[-1]:]

# Decode the generated tokens into readable text
# skip_special_tokens=True drops special tokens such as the terminators
print(tokenizer.decode(response, skip_special_tokens=True))

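Alternatively, you can use the mpirun command

mpirun -n 2 python run_llama3_distributed.py

For details on MindSpore's distributed launch methods, see:

Distributed Parallel Startup Methods (MindSpore master documentation)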
