WangShuXian6 commented on May 26, 2024
DALL-E 3

Achieving Character Consistency in DALL-E 3 (99% Character Consistency)

In DALL-E 3, any modification based on a previous image is very likely to produce a new image that differs substantially from it.

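Many people assume that giving the character a very long description, adding a name, or specifying a seed value solves this problem. In practice, a match obtained this way may be a low-probability event, or the differences between the characters may simply not look obvious. It is even harder when the character is a real person, because any subtle difference in the face is noticeable.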

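Currently, 99.9% character consistency can only be achieved within a single image, which is then cropped and upscaled to split it into multiple images.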

[Medium] [Layout] [Top-left description] [Top-right description] [Bottom-left description] [Bottom-right description]

Medium: the type of image to generate, such as photo, watercolor, illustration, cartoon, and so on.

Layout: layout keywords make DALL-E 3 generate a collage of several pictures. The example uses the keyword montage, but other keywords also work, such as grid, arrangement, collage, quad-diptych, storyboard, panorama, split-screen, mosaic, film strip, and comic strip.

Descriptions: describe the character in each of the four collage panels: top-left, top-right, bottom-left, and bottom-right.
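
The template above can be filled in programmatically. The following is a minimal sketch, not the author's tooling: the function name `build_montage_prompt` and the separate `subject` field are assumptions introduced here for illustration, and the sentence pattern simply mirrors a four-panel photo montage prompt.

```python
def build_montage_prompt(medium, layout, subject, panels):
    """Assemble a single-image, four-panel prompt from the template fields.

    `subject` is a hypothetical extra field (the shared character description);
    `panels` holds the four per-panel descriptions in reading order.
    """
    positions = ["Top-left", "Top-right", "Bottom-left", "Bottom-right"]
    if len(panels) != len(positions):
        raise ValueError("expected exactly four panel descriptions")
    # [Medium] [Layout] come first, followed by the shared character description.
    opening = f"{medium} {layout} of {subject}."
    # One sentence per panel, each prefixed with its position in the collage.
    sentences = [f"{pos} {text}." for pos, text in zip(positions, panels)]
    return " ".join([opening] + sentences)


prompt = build_montage_prompt(
    medium="Photo",
    layout="montage",
    subject="a middle-aged man with short hair",
    panels=[
        "shows him laughing in casual attire",
        "portrays him reading a book in glasses and a sweater",
        "captures him jogging in sportswear with determination",
        "depicts him playing guitar in a relaxed environment",
    ],
)
print(prompt)
```

Swapping `layout` for another keyword such as "grid" or "storyboard" changes only the opening sentence, which makes it easy to compare layout keywords against the same character description.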

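Note that this is a single image, not the four separate images DALL-E 3 generates by default.
DALL-E 3 does not yet generate this kind of collage perfectly. In some results the character's hands are malformed, and in others the collage comes out with six panels instead of four.
With more than four panels, errors are likely to be more frequent. However, if only the character's pose or expression changes and no action is involved, six panels can still give fairly good results.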
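
Splitting the collage into separate pictures comes down to computing one crop rectangle per panel. The sketch below is an illustration under stated assumptions: it assumes the panels form an even grid with no borders or gutters (generated collages often deviate from this, so the boxes may need manual adjustment), and the image sizes used are only example values. The helper name `grid_crop_boxes` is hypothetical. Each box is a `(left, upper, right, lower)` tuple, which is the order Pillow's `Image.crop` expects.

```python
def grid_crop_boxes(width, height, rows, cols):
    """Return (left, upper, right, lower) boxes, row by row, for an even grid."""
    if rows <= 0 or cols <= 0:
        raise ValueError("rows and cols must be positive")
    boxes = []
    for r in range(rows):
        for c in range(cols):
            # Integer division keeps the boxes adjacent and covers the full image.
            left = c * width // cols
            upper = r * height // rows
            right = (c + 1) * width // cols
            lower = (r + 1) * height // rows
            boxes.append((left, upper, right, lower))
    return boxes


# Four panels in a square image (2 x 2 grid).
four_panel_boxes = grid_crop_boxes(1024, 1024, rows=2, cols=2)
# Six panels in a wide image (2 rows x 3 columns).
six_panel_boxes = grid_crop_boxes(1792, 1024, rows=2, cols=3)

for box in four_panel_boxes:
    print(box)
```

Each tuple can then be passed to an image library's crop call, and the resulting panel upscaled separately.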

Prompt

Prompt: Photo montage of a middle-aged man with short hair. Top-left shows him laughing in casual attire. Top-right portrays him reading a book in glasses and a sweater. Bottom-left captures him jogging in sportswear with determination. Bottom-right depicts him playing guitar in a relaxed environment.

Prompts that change the character's pose or expression without involving actions

Prompt: Wide photo arrangement featuring 6 frames of a 20-year-old Australian woman. She has platinum bob with dark roots. In each frame, she maintains consistent features but showcases different poses. Each frame is uniformly sized and evenly spaced for hassle-free cropping.

Prompt: Wide photo arrangement featuring 6 frames of a 20-year-old Chinese woman. She has tight, high ponytail. In each frame, she maintains consistent features but showcases different expressions, such as joyful, angry, sad, playful, worried, etc. Each frame is uniformly sized and evenly spaced for hassle-free cropping.
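
The two six-frame prompts share one sentence pattern and differ only in the character and in what varies between frames. A minimal sketch of that pattern follows; the function `build_frames_prompt` and its parameter names are assumptions made for illustration, and the wording is copied from the prompts above rather than from any official guidance.

```python
def build_frames_prompt(frame_count, subject, appearance, variation,
                        examples=(), pronoun="she"):
    """Build a multi-frame prompt where only poses or expressions vary."""
    detail = variation
    if examples:
        # Optional list of concrete variations, e.g. named expressions.
        detail += ", such as " + ", ".join(examples) + ", etc"
    return (
        f"Wide photo arrangement featuring {frame_count} frames of {subject}. "
        f"{pronoun.capitalize()} has {appearance}. "
        f"In each frame, {pronoun} maintains consistent features "
        f"but showcases different {detail}. "
        "Each frame is uniformly sized and evenly spaced for hassle-free cropping."
    )


poses_prompt = build_frames_prompt(
    6, "a 20-year-old Australian woman", "platinum bob with dark roots", "poses"
)
expressions_prompt = build_frames_prompt(
    6,
    "a 20-year-old Chinese woman",
    "tight, high ponytail",
    "expressions",
    examples=["joyful", "angry", "sad", "playful", "worried"],
)
print(poses_prompt)
print(expressions_prompt)
```

Keeping the closing sentence about uniform size and even spacing fixed is deliberate, since that sentence is what makes the result easier to crop.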

Medium

Photo of a real person (Photo)
Cartoon mosaic
Comic strip
Montage illustration
Watercolor panorama
Storyboard
Cartoon montage

Generating only one image

Prompt: Always generate only one image in DALL-E 3.
This does not always work.

@DM

Prefix the prompt with "@dm", since it already tells DALL-E 3 not to modify the prompt.
This does not always work either.
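
Both directives can be attached to a prompt mechanically. This is a small sketch only: the helper `with_directives` is hypothetical, and placing "@dm" at the front and the single-image instruction at the end is an assumption, since the notes specify only that "@dm" goes before the prompt. As noted, neither directive is reliable, so the output should still be checked by hand.

```python
SINGLE_IMAGE_NOTE = "Always generate only one image in DALL-E 3."


def with_directives(prompt, keep_verbatim=True, single_image=True):
    """Wrap a prompt with the optional "@dm" prefix and single-image request."""
    parts = []
    if keep_verbatim:
        # "@dm" asks DALL-E 3 not to rewrite the prompt (not guaranteed to work).
        parts.append("@dm")
    parts.append(prompt.strip())
    if single_image:
        # Position at the end is an assumption, not something the notes specify.
        parts.append(SINGLE_IMAGE_NOTE)
    return " ".join(parts)


wrapped = with_directives("Photo montage of a middle-aged man with short hair.")
print(wrapped)
```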
