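Stability.ai released the Stable Diffusion 2.0 model a little over a week ago. It is the biggest update since Stable Diffusion 1.4 in August. But with competition among AI image generation models this fierce, the community does not seem to be buying it. SD 2.0 has been roundly mocked on Reddit: people complain that prompts from older SD versions not only stop working in 2.0 but produce visibly worse results, with twisted, garbled anatomy and odd textures. Next to the crowd-pleasing, low-barrier Midjourney v4, it looks like a nightmare.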

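The community even has a "conspiracy" theory: the open-source 2.0 model released so far is only a very basic version put out by Emad and the SD team, who also hold an art-oriented hypernetwork/model set that will not be made public and will instead power their own commercial service, DreamStudio, or be sold as an API. If the community wants the good stuff, it will have to fine-tune for itself.

2a3725b0a15b07f0f91a8431d798d570.png

9d10e21021135b9a34e2a1134b54f14f.png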

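My first impression of SD2 was much the same as the community's: considerable frustration and disappointment. Few of my treasured old prompts produced anything worth looking at. But after dropping the old habits and running several sets of prompt experiments, my confidence came back, and I found many highlights and strengths in Stable Diffusion 2.0.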

975fc73d2553bcc77bc9b56e9041a605.png

fine-art photography of a Clear crystal cube, floating highly on the sky, the tumultuous sea, Arctic Ocean, sunset, magic time, HDR, Minimalism, artistic, atmospheric, Centered symmetrical composition, conceptual design, futuristic, cinematic, hyper-detailed, 8K wallpaper -H 960 -S 3033668822

ad9a496e875c997b60cd7eb080491e79.png

fine-art landscape and nature photography of ocean, Stunning Photos of breaking Ocean Wave, close-up view, High-speed photography, HDR, artistic, Minimalism Photography, cloudy sky, magic time, sunset, golden shining, atmospheric, depressing, masterpiece, golden ratio composition, 8K, wallpaper, -H 1024 -S 6820342731

55cdbcc54065274d5be1deb7ee7518c9.png

fine-art Photography a beautiful eye, blue and golden pupil, super close-up view, dark clear background, Minimalism, artistic, atmospheric, masterpiece, HDR, golden ratio composition, hyper-detailed, 500px, -W 960 -S 4972926877

70e722c9637660c4ea3dc6d7a0b7133d.png

fine-art underwater photography of swiming pool, Stunning Photos of running horse underwater, HDR, artistic, magic time, atmospheric, masterpiece, golden ratio composition, 500px, 8K, wallpaper, -H 1024 -S 9854093032

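Below are the results and lessons from roughly four hours of experiments. For generation I used the Discord bot of DFserver, a distributed backend AI pipeline server built on Huggingface Diffusers that I develop together with my partner @virushuo.

Every image in this article comes with its prompt and seed (see the image caption). All prompts are my own, and you are welcome to reproduce them and explore further. Note that I used diffusers with the 2.0 model, so the same seed may give a different result on DreamStudio.

All results are generated from the prompt alone: no init image, no post-processing, and no negative prompt (using one might make things even more fun).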
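To reuse the captions programmatically, the trailing flags can be split off from the prompt text. The sketch below is a minimal, hypothetical parser (`parse_caption` and `Caption` are names invented here). It assumes that `-H`, `-W` and `-S` stand for height, width and seed; the article does not document the bot's flags, so this reading is only inferred from the captions.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumption: -H = height, -W = width, -S = seed (inferred from the captions,
# not taken from DFserver documentation).
FLAG_RE = re.compile(r"\s-(H|W|S)\s+(\d+)")


@dataclass
class Caption:
    prompt: str
    height: Optional[int] = None
    width: Optional[int] = None
    seed: Optional[int] = None


def parse_caption(text: str) -> Caption:
    """Split a caption into the prompt text and its numeric flags."""
    flags = {name: int(value) for name, value in FLAG_RE.findall(text)}
    # Drop the flags, then trim the whitespace and trailing comma left behind.
    prompt = FLAG_RE.sub("", text).strip().rstrip(",").strip()
    return Caption(prompt, flags.get("H"), flags.get("W"), flags.get("S"))


caption = parse_caption(
    "liquid metal, dark, close-up view, hyper-detailed, photorealistic, "
    "studio light, amazing texture, -S 9368172487"
)
print(caption.prompt)
print(caption.seed)
```

A caption without a given flag simply leaves that field as `None`, so the caller can decide which default size to apply.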

Generation parameters for all images:

  • CFG scale (guidance scale): 9

  • Steps: 25

  • Size: 768 * 1024 or 768 * 960
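For readers who want to try these settings directly with Huggingface Diffusers rather than through the Discord bot, here is a rough sketch. Several things in it are assumptions rather than facts from the article: the checkpoint id `stabilityai/stable-diffusion-2`, half precision on a CUDA GPU, the default scheduler, and a 768 px default for whichever side a caption does not override. The helper names are hypothetical. Because the scheduler and precision that DFserver used are not stated, outputs may not match the article's images exactly even with the same seed.

```python
from typing import Optional

# Values from the parameter list in the article.
GUIDANCE_SCALE = 9.0
STEPS = 25
# Assumption: any side not given in a caption defaults to 768 px.
DEFAULT_SIDE = 768


def build_call_kwargs(prompt: str,
                      height: Optional[int] = None,
                      width: Optional[int] = None,
                      negative_prompt: Optional[str] = None) -> dict:
    """Collect the keyword arguments for one pipeline call."""
    kwargs = {
        "prompt": prompt,
        "height": height or DEFAULT_SIDE,
        "width": width or DEFAULT_SIDE,
        "num_inference_steps": STEPS,
        "guidance_scale": GUIDANCE_SCALE,
    }
    if negative_prompt:
        kwargs["negative_prompt"] = negative_prompt
    return kwargs


def generate(prompt: str, seed: int, **size_and_negative):
    """Run one generation; needs torch, diffusers and a CUDA GPU."""
    # Imported lazily so the helper above works without these packages.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "stabilityai/stable-diffusion-2",  # assumed checkpoint
        torch_dtype=torch.float16,
    ).to("cuda")
    # A seeded generator makes the initial noise repeatable.
    generator = torch.Generator(device="cuda").manual_seed(seed)
    call_kwargs = build_call_kwargs(prompt, **size_and_negative)
    return pipe(generator=generator, **call_kwargs).images[0]
```

A call such as `generate("liquid metal, dark, close-up view", seed=9368172487)` returns a PIL image that can be written out with `.save("out.png")`.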

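The biggest improvement in SD 2.0 is that the base model offers a higher resolution (up from 512 to 768 px) and reaches very good results in fewer steps (down from 50 to 25), with a marked improvement in image quality and richness of detail. Especially striking is its handling of light sources, shadows, cast shadows, diffuse and environmental reflections on object surfaces, and depth of field, which to my eye surpasses every other model currently on the market.

Take the three images below of transparent crystals on the sea: see how the orange sunset light forms beautiful reflections and refractions on the water and on and inside the crystal, how it acts differently on a highly transparent crystal and a translucent ice cube, and how accurately the spherical distortion on the transparent crystal ball is rendered.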

4b51a264ac6eebede344311aee8ee86e.png

fine-art photography of a Clear crystal cube, reflecting the seaface, floating on the tumultuous sea, Arctic Ocean, sunset, magic time, by Andreas Rocha, Minimalism, artistic, atmospheric, masterpiece, golden ratio composition, hyper-detailed, 8K wallpaper -H 960 -S 1717647526

63902f96b15cc666e2d7238a462aae4e.png

fine-art photography of a small Clear crystal ice cube, floating above the horizon, the tumultuous sea, Sea floating with broken ice, sunset, magic time, HDR, Minimalism, artistic, atmospheric, Centered symmetrical composition, conceptual design, futuristic, cinematic, hyper-detailed, 8K wallpaper -H 960 -S 3033668822

c1d1a31ccb69831bb05864d707530de3.png

fine-art photography of a Clear crystal ball, floating highly on the sky, the tumultuous sea, Arctic Ocean, sunset, magic time, HDR, Minimalism, artistic, atmospheric, Centered symmetrical composition, conceptual design, futuristic, cinematic, hyper-detailed, 8K wallpaper -H 960 -S 3033668822

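The next set of experiments generates underwater scenes. Underwater rendering and water simulation are crown-jewel-level difficult in CG, and I was astonished that AI generation can get this far. Setting aside the complex lighting and water-surface reflections, in the underwater running horse image you can even see the effect of buoyancy.

You may feel that the results so far tend toward oversaturation, a bit too HDR. That can be addressed by adjusting the prompt, using a negative prompt, or post-processing to pull down the curves or the saturation.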
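As one illustration of the post-processing option, the sketch below scales the saturation of a single RGB pixel using only the Python standard library. It is not the author's workflow (the images in this article are unprocessed), and `scale_saturation` is a hypothetical helper; in practice an image editor or an imaging library would do the same job over the whole picture.

```python
import colorsys
from typing import Tuple

RGB = Tuple[int, int, int]


def scale_saturation(rgb: RGB, factor: float) -> RGB:
    """Multiply a pixel's HSV saturation by `factor` (0 = gray, 1 = unchanged)."""
    r, g, b = (channel / 255.0 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    # Clamp so the result stays a valid saturation value.
    s = max(0.0, min(1.0, s * factor))
    r, g, b = colorsys.hsv_to_rgb(h, s, v)
    return (round(r * 255), round(g * 255), round(b * 255))


# Pull an intense orange back by 30 percent.
print(scale_saturation((255, 120, 0), 0.7))
```

A factor slightly below 1, applied to every pixel, tones down the HDR look while leaving hue and brightness alone.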

2cd54001f05d0adf7ccac31bf513ddb5.png

fine-art underwater photography of swiming pool, Stunning Photos of running horse underwater, HDR, artistic, magic time, atmospheric, masterpiece, golden ratio composition, 8K, wallpaper, -H 1024 -S 6548921907

b02109b6c7bbe8d9653a91590afff21f.png

fine-art underwater photography of swiming pool, Stunning Photos of a smiling baby swiming underwater, High-speed photography, HDR, artistic, Minimalism Photography, magic time, sunset, golden shining, atmospheric, depressing, masterpiece, golden ratio composition, 8K, wallpaper, -H 1024 -S 2625550821

c6027110f7296d766c11a0e32ef6a637.png

fine-art underwater photography of Sea Palace underwater, HDR, artistic, magic time, atmospheric, masterpiece, golden ratio composition, 8K, wallpaper, -S 8516022692

f7d0282a4f6a6c5354837af6cfac9e3d.png

fine-art underwater photography of swiming pool, Stunning Photos of Sea Palace underwater, HDR, artistic, magic time, atmospheric, masterpiece, golden ratio composition, 8K, wallpaper, -H 1024 -S 7506193104

13991b98069a4c79fe03220c0e95c2ce.png

stunning fine-art underwater photography of a sunken pirate ship, close-up view, the Tyndall effect,HDR, artistic, magic time, atmospheric, masterpiece, golden ratio composition, brown and Ultramarine color, 8K, wallpaper, Fantasy style -W 1024 -S 2567103212

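Few SD 1.5 prompts survive being copied straight into 2.0, so prompt engineering for SD2 likely needs a different approach. Clearly, prompts that are too short or too long both work poorly in SD2. You cannot get a beautiful result from something as short as "Fire fox chibi", as you can in Midjourney v4; nor do you need the previously common practice of piling up "modifiers" or "reference artists" and hoping the mix randomly yields something.

You can also drop "prayers to the AI gods" such as trending on artstation or 500px; in my tests, adding them or not made little difference to the result.

My impression from the experiments is that SD2 responds to modifiers more sensitively and accurately than earlier versions. That means greater and finer controllability. It makes goal-directed prompt design more feasible and better targeted, a step out of the era of blindfolded alchemy. For players who enjoy a challenge, that is certainly a gift.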

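The four images below are my experiments in generating a black liquid-metal texture (liquid metal, dark).

The first looks like glossy impasto acrylic under strong light, which is not quite what I expected.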

9e3a3ba6478ad09ca71cdadf7f8334df.png

liquid metal, dark, close-up view, hyper-detailed, photorealistic, studio light, amazing texture, -S 9368172487

For the second, I added the modifiers flowing and Ribbon-like shine, and it came out a little too silky.

58e576a92c1add346320475dccac7dad.png

liquid metal, flowing, dark, Ribbon-like shine, hyper-detailed, photorealistic, studio light, amazing texture -S 9363724119

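For the next two I also added the modifier Solidified lava, which got fairly close to the effect I wanted.

SD2 felt quite sensitive to these three rounds of added modifiers; the changes are plainly visible.

Also, I did not stack rendering-type modifiers or add a pile of 3D engine names.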

7dceec6403c59d786244b8ac11c661ce.png

liquid metal, flowing, dark, Solidified lava, Ribbon-like shine hyper-detailed, photorealistic, studio light, amazing texture -S 8857093629

5cb77f26dc1c3e1f94569e9480888be6.png

liquid metal, flowing, dark, Solidified lava, Ribbon-like shine hyper-detailed, photorealistic, studio light, amazing texture -S 2378293576

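The three images below are a progressive refinement of one black-and-white sand dune photography prompt, all with the same seed. I liked the composition of the first and wanted to keep it, but the contrast of the sand waves looked too fake. So I tried adding "perfect brightness and contrast balance", which worked surprisingly well (second image). But then the curves of the sand waves became jittery, so I added "Extremely artistic curve" (third image).

I did not run many trials, so the improvement from these two modifiers may owe something to luck. But it did show me that fine-grained editing is possible.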
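The refinement loop described here (same seed, one modifier added per round) can be sketched as a small helper. `progressive_prompts` is a hypothetical name, and the sketch always appends to the end of the prompt, whereas the third caption below places its new modifier before the previous one, so the exact word order differs.

```python
from typing import List

SEED = 6206530932  # the shared seed from the sand dune captions

BASE = ("wild sand dune, epic, sand wave, night, coast, "
        "black and white photography, by adam ansel, hyper-detailed, "
        "masterpiece, Golden ratio composition")

MODIFIERS = [
    "perfect brightness and contrast balance",
    "Extremely artistic curve",
]


def progressive_prompts(base: str, modifiers: List[str]) -> List[str]:
    """Return the base prompt, then one longer prompt per added modifier."""
    prompts = [base]
    for modifier in modifiers:
        prompts.append(f"{prompts[-1]}, {modifier}")
    return prompts


for step, prompt in enumerate(progressive_prompts(BASE, MODIFIERS), start=1):
    # Every round reuses SEED, so only the prompt text changes.
    print(step, SEED, prompt)
```

Holding the seed fixed is what makes the comparison meaningful: any visible change between rounds comes from the added words rather than from new starting noise.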

2f7a336643e3872c20b1af41488caecf.png

wild sand dune, epic, sand wave, night, coast, black and white photography, by adam ansel, hyper-detailed, masterpiece, Golden ratio composition, -S 6206530932

e03e8395168cb5b69987ba9f86bb5672.png

wild sand dune, epic, sand wave, night, coast, black and white photography, by adam ansel, hyper-detailed, masterpiece, Golden ratio composition, perfect brightness and contrast balance, -S 6206530932

33adfbb213fee3494fba2802edf26c95.png

wild sand dune, epic, sand wave, night, coast, black and white photography, by adam ansel, hyper-detailed, masterpiece, Golden ratio composition, Extremely artistic curve, perfect brightness and contrast balance, -S 6206530932

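The next set of experiments looks at the response to different artists' styles: six iceberg landscape paintings on the same theme, with only the artist changed.

  • Michael Whelan is a master fantasy illustrator with bright colors and clean compositions.

  • Bruce Pennington is a science-fiction illustrator with a retro style and a taste for rich, heavy color.

  • Chesley Bonestell is an illustrator of alien landscapes and space subjects, with bold brushwork.

  • Andreas Rocha is a digital painter in games and concept art, with a more modern, lighter style (I like using him a lot).

The new version is quite sensitive to artist styles, and which artist produces which effect has become more predictable, all of which makes purposeful prompt experiments and design feasible. So in SD2 I have not used more than three artists in a single prompt.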
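An artist comparison of this kind amounts to filling one slot in a fixed prompt while holding the seed constant. The sketch below uses the wording and seed shared by the first Michael Whelan and first Bruce Pennington captions as the template; applying that same template to all four artists is an illustration only, since the other captions in this set vary in medium, modifiers and seed. `artist_jobs` is a hypothetical helper.

```python
from typing import Dict, List, Union

TEMPLATE = (
    "fine-art oil landscape painting of Iceberg cliffs, Arctic Ocean, "
    "lonely island, sunset, magic time, by {artist}, Minimalism, "
    "epic perspective, artistic, atmospheric, masterpiece, vivid color, HDR, "
    "darker shadow, high contrast, golden ratio composition, hyper-detailed"
)

ARTISTS = [
    "Michael Whelan",
    "Bruce Pennington",
    "Chesley Bonestell",
    "Andreas Rocha",
]

SEED = 6753514390  # seed shared by the two matching captions

Job = Dict[str, Union[str, int]]


def artist_jobs(template: str, artists: List[str], seed: int) -> List[Job]:
    """Build one generation job per artist, all with the same seed."""
    return [
        {"prompt": template.format(artist=artist), "seed": seed}
        for artist in artists
    ]


for job in artist_jobs(TEMPLATE, ARTISTS, SEED):
    print(job["seed"], job["prompt"])
```

Each job can then be handed to whatever generation backend is in use; the only difference between the jobs is the artist name.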

ec8a477a0d77276b6d12eb3231f473bb.png

fine-art oil landscape painting of Iceberg cliffs, Arctic Ocean, lonely island, sunset, magic time, by Michael Whelan, Minimalism, epic perspective, artistic, atmospheric, masterpiece, vivid color, HDR, darker shadow, high contrast, golden ratio composition, hyper-detailed -W 960 -S 6753514390

bf297cb3973fa094e8f82a2487dfb11a.png

fine-art oil landscape painting of Iceberg cliffs, Arctic Ocean, sunset, magic time, by Michael Whelan, Minimalism, artistic, atmospheric, masterpiece, vivid color, HDR, darker shadow, high contrast, golden ratio composition, hyper-detailed -W 960 -S 3297248311

3829d97385afcf41ca216e4f225902ba.png

fine-art oil landscape painting of Iceberg cliffs, Arctic Ocean, lonely island, sunset, magic time, by Bruce_Pennington, Minimalism, epic perspective, artistic, atmospheric, masterpiece, vivid color, HDR, darker shadow, high contrast, golden ratio composition, hyper-detailed -W 960 -S 6753514390

4a67cb1d5690235449f6792fbb93fcc6.png

fine-art oil landscape painting of Iceberg cliffs, Arctic Ocean, sunset, magic time, by Bruce Pennington, artistic, atmospheric, masterpiece, vivid color, HDR, darker shadow, high contrast, golden ratio composition, hyper-detailed -W 960 -S 5695592645

881f60aafc11d306dc783dd5fba95448.png

fine-art oil landscape painting of Iceberg cliffs, Arctic Ocean, sunset, magic time, by Chesley Bonestell, artistic, atmospheric, masterpiece, vivid color, HDR, golden ratio composition, hyper-detailed -W 960 -S 3071271062

949dbbbf57e18290310382287b822866.png

fine-art watercolor landscape painting of Iceberg cliffs, Arctic Ocean, sunset, magic time, by Andreas Rocha, Minimalism, artistic, atmospheric, masterpiece, darker shadow, high contrast, golden ratio composition, hyper-detailed -W 960 -S 9252202207

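This set tests color-scheme modifiers, with pen portrait drawings by Kaethe Butcher as the artist reference throughout. I casually wrote a few contrasting color schemes (red and blue, yellow and blue, cyan vs. burnt sienna), and the results were unexpectedly accurate, with a strong artistic feel.

As portraits, the facial anatomy is fairly accurate, and the portrait-format frames did not come out with two faces stacked one above the other. The four results below were picked from fewer than 20 generations in total.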

3b6d8668d17a0ba2a8025d059efb77a3.png

fine-art pen portrait drawing by Kaethe Butcher, artistic, atmospheric, masterpiece, vivid color, HDR, golden ratio composition, hyper-detailed blue and red color, -H 960 -S 2295938736

5f42d97568071c4bfd622509f66052ff.png

fine-art pen portrait drawing by Kaethe Butcher, artistic, atmospheric, masterpiece, blue and yellow color, vivid color, HDR, golden ratio composition, hyper-detailed -H 960 -S 8374475260

3e6ee87f9f08fccb17445d04496fc36a.png

fine-art pen portrait drawing, side view, sad pretty face, young lady, by Kaethe Butcher, artistic, atmospheric, masterpiece, vivid color, HDR, golden ratio composition, hyper-detailed, -H 960 -S 8239847668

bd91102dbe6c50b85220ce479eae8492.png

fine-art pen portrait drawing, side view, sad pretty face, young lady, by Kaethe Butcher, artistic, atmospheric, masterpiece, vivid color, HDR, golden ratio composition, hyper-detailed, -H 960 -S 8239847668

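The next two sets test two painting media, one dry and one wet: oil and watercolor. The brushstroke qualities and edge rendering of each medium, and the simulation of the canvas or paper surface, are all stunning. I especially like the depiction of the transparent glassware and the copperware.

In oil, the lemon peel imitates the craquelure of oil paint. In watercolor, in the last image, both wet and dry techniques are simulated convincingly.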

6bceddfa6459bd663bfb29d2f6ecf61b.png

fine-art oil painting of still life, glass bowl and lemons, close-up view, clear dark background, by dan mumford, by james Jean, artistic, atmospheric, masterpiece, vivid color, dark shadow, high contrast, HDR, golden ratio composition, hyper-detailed -W 1024 -S 6786310330

7c00341fe720b43cd70d369ac01df6ac.png

fine-art oil painting of still life, glass bowl and lemons, close-up view, Minimalism, clear dark background, by dan mumford, by james Jean, artistic, atmospheric, masterpiece, vivid color, dark shadow, high contrast, golden ratio composition, hyper-detailed -H 1024 -S 2895457491

5bfbb15315ba69c482d120b0ecd849b5.png

fine-art watercolor painting of still life, glass goblets and copper teapot, lemons and dead flowers, Minimalism, clear dark background, by John Singer Sargent, by Sherree Valentine Daines, artistic, atmospheric, masterpiece, blue and yellow color, vivid color, HDR, golden ratio composition, hyper-detailed -H 1024 -S 8362409777

I paint watercolors myself, and for the image below I find it hard to tell whether it is a scan of an original or AI-generated.

72477e33f7cc316609e650d9e781154c.png

fine-art watercolor painting of still life, glass goblets and copper teapot, lemons and dead flowers, Minimalism, clear dark background, by John Singer Sargent, by Sherree Valentine Daines, artistic, atmospheric, masterpiece, blue and yellow color, vivid color, HDR, golden ratio composition, hyper-detailed -H 1024 -S 7401752247

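This set again compares the two classical fine-art media, oil and watercolor, this time with a landscape subject, even though the reference artist, Andreas Rocha, is a master who works only in digital painting.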

985c4c4ac1703e1f6ecf187353f681f6.png

fine-art oil landscape painting of Iceberg cliffs, Arctic Ocean, sunset, magic time, by Andreas Rocha, Minimalism, artistic, atmospheric, masterpiece, vivid color, HDR, darker shadow, high contrast, golden ratio composition, hyper-detailed -W 960 -S 8104947253

b31f930ee94b46dd51b0def64bdd13b4.png

fine-art watercolor landscape painting of Iceberg cliffs, Arctic Ocean, sunset, magic time, by Andreas Rocha, Minimalism, artistic, atmospheric, masterpiece,darker shadow, high contrast, golden ratio composition, hyper-detailed -W 960 -S 1950272985

7c63eeddf0e8345a889b9f477f45f765.png

fine-art watercolor landscape painting of Iceberg cliffs, Arctic Ocean, sunset, magic time, by Andreas Rocha, Minimalism, artistic, atmospheric, masterpiece, darker shadow, high contrast, golden ratio composition, hyper-detailed -W 960 -S 5369511481

5a62af78588adc13584bf656190358ce.png

fine-art watercolor landscape painting of Iceberg cliffs, Arctic Ocean, sunset, magic time, by Andreas Rocha, Minimalism, artistic, atmospheric, masterpiece, darker shadow, high contrast, golden ratio composition, hyper-detailed -W 960 -S 1950272985

af4b28c9800fab051cd934925ccdd5bd.png

fine-art watercolor landscape painting of Iceberg cliffs, Arctic Ocean, sunset, magic time, by Andreas Rocha, Minimalism, artistic, atmospheric, masterpiece, darker shadow, high contrast, golden ratio composition, hyper-detailed

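One controversy after the SD2 release is that the community found contentious celebrity portraits had been removed from its training set. Portraits generated with a celebrity's name as a keyword no longer show a clear likeness (yes, in 2.0 you may never again get an Emma Watson or Gal Gadot with a mermaid tail, though Obama still seems to work).

But my view is that if you do need to generate a particular person's likeness, it is easy to get through custom fine-tuning. For a foundational open model, I personally agree with SD's approach: give ethical controversies a bit more thought, and the earlier contentious data is excluded from the training set, the better.

I have no interest in reworking celebrities, but I did try a few famous faces from art history in a printmaking style. Their features are so distinct that you can tell at a glance who each one is.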

0b0bc5c7e28365f33de236f52f9c659f.png

fine-art woodcut colorful printmaking of portraint of Christ ,with the crown of thorns, close-up view, Minimalism, clear dark background, by dan mumford, by aaron horkey, artistic, atmospheric, masterpiece, vivid color, dark shadow, high contrast, golden ratio composition, hyper-detailed -H 960

98a0c4439b64ea13fd4bf7ddab952b7f.png

fine-art woodcut colorful pringmaking of close-up portraint of Frankenstein, Minimalism, clear dark background, by dan mumford, by aaron horkey, by bernie wrightson, artistic, atmospheric, masterpiece, vivid color, long shadow, high contrast, golden ratio composition, hyper-detailed -H 1024 -S 7356494580

43d7b574be4828d62863bd51243531c8.jpeg

fine-art woodcut colorful pringmaking of close-up portrait of van gogh, Minimalism, clear dark background, by dan mumford, by aaron horkey, artistic, atmospheric, masterpiece, vivid color, dark shadow, high contrast, golden ratio composition, hyper-detailed -H 960

e7549a7b91251137db7bc883d5f5021d.png

fine-art woodcut colorful pringmaking of close-up portrait of Mona Lisa, Minimalism, clear dark background, by dan mumford, by aaron horkey, artistic, atmospheric, masterpiece, vivid color, dark shadow, high contrast, golden ratio composition, hyper-detailed -H 960 -S 3140988994

This set tests how well the details of different natural materials are rendered: ice, snow, sand, ocean waves, and sea foam.

dc12670ac39fda2baa4cbb29f3b62990.png

fine-art photography of a small Clear crystal ice cube, floating above the horizon, snow field, sunset, magic time, HDR, Minimalism, artistic, atmospheric, Centered symmetrical composition, conceptual design, futuristic, cinematic, hyper-detailed, 8K wallpaper -H 960 -S 3033668822

600fa854104f86b04b6b6c4f38b4c813.png

fine-art landscape and nature photography of Undulating sand dunes in Death Valley, HDR, artistic, Minimalism Photography, sand wave, cloudy sky, magic time, golden shining, atmospheric, depressing, photography by Erez Marom, golden and Ultramarine color, masterpiece, golden ratio composition, 500px, 8K, wallpaper, -H 1024 -S 3097727685

865beb130f5c419585863d12c1ad6055.png

fine-art landscape and nature photography of ocean, Stunning Photos of breaking Ocean Wave, close-up view, High-speed photography, HDR, artistic, Minimalism Photography, cloudy sky, magic time, sunset, golden shining, atmospheric, depressing, masterpiece, golden ratio composition, 8K, wallpaper, -H 1024

aebbb000aaaf05c736c4fe7a9b44d175.png

fine-art landscape and nature photography of ocean, Stunning Photos of breaking Ocean Wave, bird view, High-speed photography, HDR, artistic, Minimalism Photography, magic time, sunset, golden shining, atmospheric, depressing, masterpiece, golden ratio composition, 8K, wallpaper, -H 1024 -S 6820342731

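Next I will keep experimenting with more styles in SD 2.0, as well as the depth2img and inpainting models and custom fine-tuning, and share the results.

For AI generative models to enter more serious applications as professional tools, maturity in three directions is an important condition: guiding color scheme and composition with a draft image, the fine-grained editing needed during iteration, and low-barrier model fine-tuning.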

934540376546cb7d5fb5a946267e45c0.png

Grand Canyon, moonrise, dead old trees, black and white photography, by adam ansel, hyper-detailed, masterpiece -S 9477247668

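I will close with a moonrise over the Grand Canyon in the manner of the immortal Ansel Adams (spelled "adam ansel" in the prompts). Thanks for viewing; this was my first successful result from SD 2.0.

In my last update I mentioned that I had just released Kalos.art, an app designed for AI artists and enthusiasts. See the article: "AI can finally earn money for me".

All the images published today are posted on my Kalos account. If you need a usage license for these works, click through to the original article. Or, to simply show support, come give me a boost and a like.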

f3421d67ad9761c5b948b6a8cf48c447.png


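Other useful links:

DFserver, the open-source distributed AI model pipeline backend service developed by my partner and me: https://github.com/huo-ju/dfserver

The open-sourced Stable Diffusion 2.0 model: https://github.com/Stability-AI/stablediffusion

Stability.ai's official paid online AI image generation service: https://beta.dreamstudio.ai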

Huggingface Diffusers: https://huggingface.co/docs/diffusers/index
