Reproducing Wenxin Yige's Inspiration Mode with ChatGLM-6B and Stable Diffusion

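This project uses ChatGLM-6B to expand a short prompt and then generates an image from the result. It is only a sample application built on ChatGLM-6B, so the results may be underwhelming; thanks for your understanding.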

AI Studio · Published 2023-06-20 11:27:02

This article comes from a featured project in the AI Studio community.
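Wenxin Yige (文心一格) is an AI art and creative-assistance platform; put simply, it draws pictures from prompts. Its "Inspiration Mode" (灵感模式) appears to expand the user's prompt before generating the image. This project does something similar in a simple way: ChatGLM-6B expands the prompt, and the expanded prompt is then used to generate an image.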

Environment setup

# Update the environment
!pip install --pre --upgrade paddlenlp -f https://www.paddlepaddle.org.cn/whl/paddlenlp.html # install the latest PaddleNLP pre-release package
!python -m pip install paddlepaddle-gpu==0.0.0.post112 -f https://www.paddlepaddle.org.cn/whl/linux/gpu/develop.html
!pip install -U ppdiffusers safetensors --user
# Remember to restart the kernel after installing
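
After restarting the kernel, it can help to confirm which versions were actually picked up. The following is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name, and the distribution names are simply the ones passed to pip above.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Report what is installed in the current environment.
for name in ("paddlepaddle-gpu", "paddlenlp", "ppdiffusers", "safetensors"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```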

The ChatGLM-6B model

In this project, ChatGLM-6B expands a given phrase into a richer prompt.

Creating the model

import paddle
from paddlenlp.transformers import (
    ChatGLMConfig,
    ChatGLMForConditionalGeneration,
    ChatGLMTokenizer,
)

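# Load the original chatglm-6b model
model_name_or_path = 'THUDM/chatglm-6b' # with this name the model is downloaded and loaded automatically
# model_name_or_path = 'data/data217141' # local path; no download needed, so it runs faster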
tokenizer = ChatGLMTokenizer.from_pretrained(model_name_or_path)

config = ChatGLMConfig.from_pretrained(model_name_or_path)
paddle.set_default_dtype(config.paddle_dtype)

model = ChatGLMForConditionalGeneration.from_pretrained(
    model_name_or_path,
    tensor_parallel_degree=paddle.distributed.get_world_size(),
    tensor_parallel_rank=0,
    load_state_as_np=True,
    dtype=config.paddle_dtype,
)

model.eval()

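Dialogue inference

The function in this section is excerpted from the project "ChatGLM-6B Application Test: Complete Installation Steps, Wrapped Calling Code, and a Graphical Interface". For more wrapped functions, such as multi-turn dialogue, see that original project.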

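# Helper function
# Defines a single question-and-answer call
# Arguments: model, tokenizer, the initial prompt, max input length, max output length
def glm_single_QA(model,tokenizer,next_inputs,input_length,output_length):
    # Convert the input text into model tensors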
    inputs = tokenizer(
        next_inputs,
        return_tensors="np",
        padding=True,
        max_length=input_length,
        truncation=True,
        truncation_side="left",
    )
    input_map = {}
    for key in inputs:
        input_map[key] = paddle.to_tensor(inputs[key])

    # Generate the result
    infer_result = model.generate(
        **input_map,
        decode_strategy="sampling",
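        top_k=1,  # keeps only the most likely token, so sampling is effectively greedy
        # top_p =5,  # left disabled; top_p is a probability mass, so a usable value would be at most 1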
        max_length=output_length,
        use_cache=True,
        use_fast=True,
        use_fp16_decoding=True,
        repetition_penalty=1,
        temperature = 0.95,
        length_penalty=1,
    )[0]

    # Decode the generated token ids back into text
    output = ''
    for x in infer_result.tolist():
        res = tokenizer.decode(x, skip_special_tokens=True)
        res = res.strip("\n")
        output = output + res
    return output

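# User request
Q = "cat"
# Question template; in English: "Expand {Q} into sentences in English; no need to reply in Chinese"
Q_motif = f"对{Q}使用英语进行扩句，不需要回复中文"
print("Q:"+Q_motif)
# Get the result; it is used later as the prompt for the drawing model
result=glm_single_QA(model,tokenizer,Q_motif,2048,2048)
print("A:"+result)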

Q:对cat使用英语进行扩句，不需要回复中文
A:Sure, I'd be happy to expand on the phrase "cat" in English. Here are some possible sentences:

- The cat is sleeping on the bed.
- I saw a cat outside the window.
- My cat always loves to play with the dog.
- The cat is black and white.
- The cat is on the table, looking at us with its bright eyes.
- The cat is in the corner, looking sad.
- The cat is on the mat, meowing loudly.

I hope these sentences give you some ideas about how to use "cat" in English.
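
The reply above wraps the useful sentences in a greeting and a closing remark, neither of which helps the drawing model. A minimal sketch of pulling out only the bulleted sentences follows; `extract_sentences` is a hypothetical helper, and it assumes the reply keeps the `- ` bullet format seen in this sample, which the model does not guarantee.

```python
def extract_sentences(reply):
    """Collect the bulleted sentences from a ChatGLM-style reply."""
    sentences = []
    for line in reply.splitlines():
        line = line.strip()
        # Assumption: candidate sentences are the lines starting with "- ".
        if line.startswith("- "):
            sentences.append(line[2:].strip())
    return sentences


reply = """Sure, I'd be happy to expand on the phrase "cat" in English. Here are some possible sentences:

- The cat is sleeping on the bed.
- I saw a cat outside the window.

I hope these sentences give you some ideas about how to use "cat" in English."""

print(extract_sentences(reply))
```

If the list comes back empty, falling back to the raw reply is probably the safest choice.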

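Stable Diffusion

The drawing model is loaded through ppdiffusers.

Loading the model

This loads the pretrained model "Linaqruf/anything-v3.0". For more model-loading options, see PPDiffusers; to load a model you trained yourself, see my earlier projects.

Note that models such as Linaqruf/anything-v3.0 only support English prompts. If the GLM step above produces Chinese, a Chinese-language model is needed instead.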
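
Because the drawing model expects English, it may be worth checking the expanded prompt for Chinese characters before passing it on. The following is a minimal sketch, assuming that a check against the basic CJK Unified Ideographs range (U+4E00 to U+9FFF) is good enough; `contains_cjk` is a hypothetical helper and not part of ppdiffusers.

```python
def contains_cjk(text):
    """Return True if text has any character in the basic CJK Unified Ideographs block."""
    return any("\u4e00" <= ch <= "\u9fff" for ch in text)


# A mixed-language sentence is flagged; a plain English one is not.
print(contains_cjk("Cat听力非常灵敏, they can hear a lot of things."))
print(contains_cjk("The cat is black and white."))
```

A flagged prompt could be regenerated, filtered sentence by sentence, or routed to a Chinese-language model.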

from ppdiffusers import StableDiffusionPipeline
pipe = StableDiffusionPipeline.from_pretrained("Linaqruf/anything-v3.0")

Inference

Let's draw a simple picture.

# Generate
# Draw using the prompt expanded earlier
image = pipe(result, num_inference_steps=50,guidance_scale=7.5).images[0]
# Save
image.save("/home/aistudio/test.jpg")
# Display the image
image.show()

The following part of your input was truncated because CLIP can only handle sequences up to 77 tokens: ['. - the cat is in the corner, looking sad. - the cat is on the mat, meowing loudly. i hope these sentences give you some ideas about how to use " cat " in english.']



  0%|          | 0/50 [00:00<?, ?it/s]
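
The warning above shows that the CLIP text encoder only reads the first 77 tokens of the prompt, so the tail of a long reply is silently dropped. One option is to trim the prompt yourself so that only whole sentences are kept. The sketch below is a rough heuristic: it counts whitespace-separated words rather than real CLIP tokens, and both `fit_prompt` and the `max_words` budget are hypothetical. Pick a budget comfortably below 77, since punctuation and rare words take extra tokens.

```python
def fit_prompt(sentences, max_words=50):
    """Join whole sentences until adding another would exceed max_words."""
    kept, used = [], 0
    for sentence in sentences:
        n = len(sentence.split())  # word count as a rough stand-in for token count
        if used + n > max_words:
            break
        kept.append(sentence)
        used += n
    return " ".join(kept)


sentences = [
    "The cat is sleeping on the bed.",
    "I saw a cat outside the window.",
    "The cat is black and white.",
]
print(fit_prompt(sentences, max_words=15))
```

With a budget of 15 words, the first two sentences (14 words) fit and the third is left out.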

A budget version of Inspiration Mode

# User prompt
Q = "cat"

## ChatGLM part ##
# Question template; in English: "Expand {Q} into sentences in English; no need to reply in Chinese"
Q_motif = f"对{Q}使用英语进行扩句，不需要回复中文"
print("Q:"+Q_motif)
# Get the result; it is used as the prompt for the drawing model
result=glm_single_QA(model,tokenizer,Q_motif,2048,2048)
print("A:"+result)

## StableDiffusion part ##
image = pipe(result, num_inference_steps=50,guidance_scale=7.5).images[0]
# Save
image.save("/home/aistudio/test.jpg")
# Display the image
image.show()

Q:对cat使用英语进行扩句，不需要回复中文
A:Sure, I'd be happy to expand on the phrase "cat" in English. Here are some possible sentences:

- I have a cat named Max that I keep as a pet.
- My neighbor has a cat that always comes to visit her.
- The cat is a common pet that is found in many households.
- Cat eyes are known for their beautiful and mesmerizing colors.
- Cat behavior can be fascinating to observe, as they often have unique habits and preferences.
- Cat听力非常灵敏, they can hear a lot of things that we can't even imagine.
- The cat is a symbol of luxury and elegance in many cultures.


The following part of your input was truncated because CLIP can only handle sequences up to 77 tokens: [". - cat behavior can be fascinating to observe, as they often have unique habits and preferences. - cat听力非常灵敏, they can hear a lot of things that we can 't even imagine. - the cat is a symbol of luxury and elegance in many cultures."]
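
The two stages above can also be wrapped into one function so the flow is easy to reuse. This is a minimal sketch rather than the project's own code: `inspiration_mode`, `fake_expand` and `fake_draw` are hypothetical names, and the two stand-in functions exist only so the example runs without loading any model.

```python
def inspiration_mode(user_prompt, expand_fn, draw_fn,
                     template="对{Q}使用英语进行扩句，不需要回复中文"):
    """Expand user_prompt with expand_fn, then pass the expansion to draw_fn."""
    query = template.format(Q=user_prompt)
    expanded = expand_fn(query)
    image = draw_fn(expanded)
    return expanded, image


# Stand-ins for the language model and the drawing model.
def fake_expand(query):
    return "a fluffy cat sitting on a windowsill"


def fake_draw(prompt):
    return f"<image for: {prompt}>"


expanded, image = inspiration_mode("cat", fake_expand, fake_draw)
print(expanded)
print(image)
```

In the notebook, `expand_fn` could be `lambda q: glm_single_QA(model, tokenizer, q, 2048, 2048)` and `draw_fn` could be `lambda p: pipe(p, num_inference_steps=50, guidance_scale=7.5).images[0]`, both taken from the cells above.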
