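The Impact of LM Studio's GPU Offload on Token Generation Speed: A Case Study with the 7840HS's 780M Integrated GPU
This post looks at how the GPU offload parameter in LM Studio affects token generation speed. Comparing the default GPU offload of 4 with a raised value of 64 shows a clear gain: thinking time dropped from 3 min 58 s to 1 min 49 s (a 54.2% reduction), and token generation speed rose from 2.64 t/s to 3.5 t/s (a 32.58% increase). The test used a 7840HS processor with its 780M integrated GPU, with 48 GB of the 96 GB of RAM allocated to the integrated GPU, running a Q4 30b model. The results suggest that configuring the GPU sensibly …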
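Background
A year ago I verified that an integrated GPU can run a 32b model: https://lizhiyong.blog.csdn.net/article/details/147237895
Smarter small models have reportedly come out recently, so this is a good chance to see how to squeeze more token generation speed out of this aging machine.
One key parameter is GPU offload.
Default settings
In LM Studio, GPU offload was set to 4 by default: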

Chat
虎鲸是鱼嘛? ("Is an orca a fish?")

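At this point the integrated GPU is only lightly loaded while CPU load is high. Most of the work is clearly landing on the CPU, so token generation speed is far from ideal: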
Think
Here's a thinking process:
Analyze User Input:
Question: "虎鲸是鱼嘛?" (Are orcas/fish?)
Language: Chinese
Core question: Classification of 虎鲸 (orca/killer whale) - is it a fish?
Identify Key Facts:
虎鲸 (Orca/Killer Whale) is a marine mammal.
It belongs to the order Cetacea, family Delphinidae.
Characteristics: breathes air with lungs, gives birth to live young, nurses them with milk, maintains constant body temperature, has hair (at least at birth).
Fish are cold-blooded, breathe through gills, lay eggs (mostly), have scales.
Therefore, 虎鲸 is NOT a fish.
Formulate Response (Internal Refinement - Chinese):
Direct answer: 不是。
Explanation: 虎鲸是哺乳动物,属于鲸目海豚科。它用肺呼吸、胎生、哺乳、恒温,不具备鱼类特征(如鳃呼吸、卵生、变温等)。虽然生活在海洋里且外形似鱼,但分类学上明确属于哺乳类。
Keep it concise as per system prompt.
Check against Constraints:
Plain, complete, no moralizing/disclaimers/hedging.
Follow exact format if requested (none here).
Concise by default.
Language matches user (Chinese).
Draft: 不是。虎鲸是哺乳动物,属于鲸目海豚科。它用肺呼吸、胎生、哺乳、恒温,不具备鱼类的鳃呼吸和卵生等特征。虽然生活在海洋且外形似鱼,但分类学上明确属于哺乳类。
Final Output Generation: (matches draft)✅
Direct, accurate, concise, matches all constraints.✅
No extra fluff.✅
Ready.
Output matches the refined draft.
Proceeds.
[Done]
Self-Correction/Verification during thought:
Is it a fish? No.
What is it? Marine mammal (cetacean).
Key differentiators: lungs, live birth, milk, warm-blooded vs gills, eggs, cold-blooded.
All correct. Output matches.✅
Output generation.
“不是。虎鲸是哺乳动物,属于鲸目海豚科。它用肺呼吸、胎生、哺乳、恒温,不具备鱼类用鳃呼吸和卵生等特征。虽生活在海洋且外形似鱼,但分类学上明确属于哺乳类。” (Slightly polished for flow) -> matches exactly.✅
Done.
Output matches response.
[Proceeds]

The thinking phase took 3 min 58 s.
Token generation speed

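不是。虎鲸是哺乳动物,属于鲸目海豚科。它用肺呼吸、胎生、哺乳、恒温,不具备鱼类用鳃呼吸和卵生等特征。虽生活在海洋且外形似鱼,但分类学上明确属于哺乳类。
(In English: "No. The orca is a mammal belonging to the order Cetacea, family Delphinidae. It breathes with lungs, bears live young, nurses them, and is warm-blooded, and it lacks fish traits such as gill breathing and egg laying. Although it lives in the ocean and looks fish-like, taxonomically it is clearly a mammal.")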
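The token generation speed is 2.64 t/s.
Raising the GPU offload
With 96 GB of RAM, 48 GB of which is allocated to the integrated GPU, there is enough room to run a small Q4-quantized model in the 30b class.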
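As a rough sanity check on that claim, here is a minimal sketch that estimates the size of the quantized weights. The figure of 4.5 bits per weight is an assumption (Q4-style formats store some scaling metadata on top of the 4-bit values), and the estimate ignores the KV cache and runtime overhead, so treat it as a lower bound rather than a measurement.

```python
def quantized_weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# Assumption: ~4.5 bits per weight for a Q4-style quantization.
ASSUMED_BITS_PER_WEIGHT = 4.5
GPU_ALLOCATION_GIB = 48  # memory given to the integrated GPU, per the post

weights_gib = quantized_weight_gib(30, ASSUMED_BITS_PER_WEIGHT)
print(f"Estimated weights: {weights_gib:.1f} GiB of {GPU_ALLOCATION_GIB} GiB available")
print("Fits (weights only):", weights_gib < GPU_ALLOCATION_GIB)
```

Under these assumptions the weights take well under half of the 48 GB allocation, which leaves headroom for context and buffers.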

So max out the GPU offload and set it to 64! Note that the change only takes effect after a restart.
Chat
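To see why such a small default leaves most of the work on the CPU, here is a minimal sketch that treats the GPU offload value as a count of model layers placed on the GPU. The layer count of 48 is hypothetical, since the post does not state how many layers the model has; a setting above the layer count is simply clamped to "all layers".

```python
def gpu_layer_fraction(gpu_offload: int, total_layers: int) -> float:
    """Fraction of model layers that would run on the GPU."""
    if total_layers <= 0:
        raise ValueError("total_layers must be positive")
    # Clamp the requested offload to the valid range [0, total_layers].
    layers_on_gpu = min(max(gpu_offload, 0), total_layers)
    return layers_on_gpu / total_layers


HYPOTHETICAL_LAYERS = 48  # assumption for illustration; not stated in the post

for setting in (4, 64):
    share = gpu_layer_fraction(setting, HYPOTHETICAL_LAYERS)
    print(f"GPU offload = {setting}: {share:.0%} of layers on the GPU")
```

With these illustrative numbers, a setting of 4 puts only a small share of the layers on the GPU, while 64 covers all of them.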

CPU load is now visibly lower and the GPU is finally being put to work! Memory usage has gone up as well.
Think
Here's a thinking process:
Analyze User Input:
Question: "虎鲸是鱼嘛?" (Is an orca a fish?)
Language: Chinese
Core concept: Biological classification of orcas (killer whales)
Identify Key Facts:
Orca = 虎鲸 (hǔ jīng) in Chinese
Scientific name: Orcinus orca
Taxonomy: Animalia > Chordata > Mammalia > Cetacea > Odontoceti > Delphinidae
Key characteristic: It's a marine mammal, not a fish.
Distinguishing features from fish: Breaths air with lungs (not gills), gives birth to live young (viviparous), nurses them with milk, has hair (though minimal), is warm-blooded, etc.
Formulate Response (Internal Refinement - Chinese):
Direct answer: 不是。
Explanation: 虎鲸是哺乳动物,属于鲸目齿鲸亚目海豚科。
Key differences from fish: 用肺呼吸、胎生、哺乳、恒温等。
Keep it concise as requested.
Draft Response (Chinese):
不是。虎鲸是哺乳动物,属于鲸类(海豚科)。它们用肺呼吸、胎生、哺乳、恒温,与鱼类在分类和生理结构上完全不同。
Check against constraints:
Plain, complete, no moralizing/disclaimers
Follow exact format if requested (none here)
Concise by default
Matches language (Chinese)
Accurate scientifically
All good. Ready to output.✅

For the same question, the Think time drops to 1 min 49 s.
Token generation speed

It now clearly reaches 3.5 t/s.
Comparison
Same model, same question: the answers differ little, but the performance gain is quite noticeable:
| Metric | Default (GPU offload = 4) | GPU offload = 64 | Improvement |
|---|---|---|---|
| Think time | 3 min 58 s | 1 min 49 s | 54.2% shorter |
| Answer speed | 2.64 t/s | 3.5 t/s | 32.58% faster |
That is a respectable improvement.
When reposting, please credit the source: https://lizhiyong.blog.csdn.net/article/details/161386515
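The percentages in the table can be reproduced with a few lines of arithmetic. The sketch below recomputes them from the measured values. It also estimates how many tokens each thinking phase would contain if the reported t/s also held during thinking, which is an assumption the post does not confirm.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    return minutes * 60 + seconds


def percent_reduction(before: float, after: float) -> float:
    """How much smaller `after` is than `before`, in percent."""
    return (before - after) / before * 100


def percent_increase(before: float, after: float) -> float:
    """How much larger `after` is than `before`, in percent."""
    return (after - before) / before * 100


think_default = to_seconds(3, 58)  # GPU offload = 4
think_offload = to_seconds(1, 49)  # GPU offload = 64
speed_default, speed_offload = 2.64, 3.5  # tokens per second

print(f"Think time reduction: {percent_reduction(think_default, think_offload):.1f}%")
print(f"Speed increase: {percent_increase(speed_default, speed_offload):.2f}%")

# Assumption: the reported t/s also applies to the thinking phase.
print(f"Implied thinking tokens (default): {think_default * speed_default:.0f}")
print(f"Implied thinking tokens (offload): {think_offload * speed_offload:.0f}")
```

If that assumption holds, the second run also produced a shorter reasoning trace, so part of the drop in Think time would come from fewer tokens rather than from faster generation alone. The t/s figure is the cleaner measure of the speedup.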
