# Image

Core functionality module for generating and processing image content.

clawhub install image

## Baoyu Image Generation

AI image generation tool based on the Baoyu model.

```bash
# Install the skill; after installation Claude Code detects and uses it automatically
npx skills add jimliu/baoyu-skills@baoyu-image-gen
# The same install command works with any AI coding tool that supports SKILL.md
```

Powered by the Baoyu AI engine; it calls external APIs and may incur costs.
{baseDir} = this SKILL.md file's directory; the script entry point is {baseDir}/scripts/main.ts.

${BUN_X} runtime resolution: if bun is installed → bun; if npx is available → npx -y bun; otherwise suggest installing bun.

Detect the active EXTEND.md configuration:

```bash
# macOS, Linux, WSL, Git Bash
test -f .baoyu-skills/baoyu-image-gen/EXTEND.md && echo "project"
test -f "${XDG_CONFIG_HOME:-$HOME/.config}/baoyu-skills/baoyu-image-gen/EXTEND.md" && echo "xdg"
test -f "$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md" && echo "user"
```

```powershell
# PowerShell (Windows)
if (Test-Path .baoyu-skills/baoyu-image-gen/EXTEND.md) { "project" }
$xdg = if ($env:XDG_CONFIG_HOME) { $env:XDG_CONFIG_HOME } else { "$HOME/.config" }
if (Test-Path "$xdg/baoyu-skills/baoyu-image-gen/EXTEND.md") { "xdg" }
if (Test-Path "$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md") { "user" }
```
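The three checks above can be folded into a single resolver. This is a sketch under the path priority shown above; find_extend is a hypothetical helper name, not part of the skill.

```shell
# Resolve the active EXTEND.md, first match wins
# (path priority: project > XDG config > user home, as above)
find_extend() {
  for p in \
    ".baoyu-skills/baoyu-image-gen/EXTEND.md" \
    "${XDG_CONFIG_HOME:-$HOME/.config}/baoyu-skills/baoyu-image-gen/EXTEND.md" \
    "$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md"
  do
    [ -f "$p" ] && { printf '%s\n' "$p"; return 0; }
  done
  return 1
}
find_extend || echo "no EXTEND.md found: run first-time setup"
```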
EXTEND.md state handling:

| EXTEND.md | Action |
|---|---|
| Found, but default_model.[provider] is null | Ask model only (Flow 2) |
| Not found | ⛔ Run first-time setup ([references/config/first-time-setup.md](references/config/first-time-setup.md)) → Save EXTEND.md → Then continue |

EXTEND.md locations:

| Path | Location |
|---|---|
| .baoyu-skills/baoyu-image-gen/EXTEND.md | Project directory |
| $HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md | User home |

The preference format is documented in references/config/preferences-schema.md.

```bash
# Basic
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image cat.png

# With aspect ratio
${BUN_X} {baseDir}/scripts/main.ts --prompt "A landscape" --image out.png --ar 16:9

# High quality
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --quality 2k

# From prompt files
${BUN_X} {baseDir}/scripts/main.ts --promptfiles system.md content.md --image out.png

# With reference images (Google, OpenAI, OpenRouter, Replicate, or Seedream 4.0/4.5/5.0)
${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --ref source.png

# With reference images (explicit provider/model)
${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --provider google --model gemini-3-pro-image-preview --ref source.png

# OpenRouter (recommended default model)
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider openrouter

# OpenRouter with reference images
${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --provider openrouter --model google/gemini-3.1-flash-image-preview --ref source.png

# Specific provider
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider openai

# DashScope (Alibaba Tongyi Wanxiang); prompt: "a cute cat"
${BUN_X} {baseDir}/scripts/main.ts --prompt "一只可爱的猫" --image out.png --provider dashscope

# DashScope Qwen-Image 2.0 Pro (recommended for custom sizes and text rendering);
# prompt: a 21:9 banner poster for a coffee brand with a clear Chinese headline
${BUN_X} {baseDir}/scripts/main.ts --prompt "为咖啡品牌设计一张 21:9 横幅海报,包含清晰中文标题" --image out.png --provider dashscope --model qwen-image-2.0-pro --size 2048x872

# DashScope legacy Qwen fixed-size model; prompt: "a cinematic poster"
${BUN_X} {baseDir}/scripts/main.ts --prompt "一张电影感海报" --image out.png --provider dashscope --model qwen-image-max --size 1664x928

# Replicate (google/nano-banana-pro)
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate

# Replicate with specific model
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana

# Batch mode with saved prompt files
${BUN_X} {baseDir}/scripts/main.ts --batchfile batch.json

# Batch mode with explicit worker count
${BUN_X} {baseDir}/scripts/main.ts --batchfile batch.json --jobs 4 --json
```
Example batch.json:

```json
{
  "jobs": 4,
  "tasks": [
    {
      "id": "hero",
      "promptFiles": ["prompts/hero.md"],
      "image": "out/hero.png",
      "provider": "replicate",
      "model": "google/nano-banana-pro",
      "ar": "16:9",
      "quality": "2k"
    },
    {
      "id": "diagram",
      "promptFiles": ["prompts/diagram.md"],
      "image": "out/diagram.png",
      "ref": ["references/original.png"]
    }
  ]
}
```
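Given that format, a batch file can also be generated mechanically from a directory of prompt files. This is an illustrative sketch: the prompts/ layout, the out/ target, and the derived IDs are assumptions for the example, not part of the skill.

```shell
# Sketch: emit a batch.json entry for every .md file in prompts/
# (creates a sample prompt file so the loop has input)
mkdir -p prompts out
printf 'A cat sitting on a windowsill\n' > prompts/hero.md
{
  echo '{ "jobs": 2, "tasks": ['
  first=1
  for f in prompts/*.md; do
    id=$(basename "$f" .md)
    [ "$first" -eq 1 ] || printf ',\n'
    first=0
    printf '  { "id": "%s", "promptFiles": ["%s"], "image": "out/%s.png" }' "$id" "$f" "$id"
  done
  printf '\n] }\n'
} > batch.json
cat batch.json
```

The result can then be run with the batch commands shown above.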
promptFiles, image, and ref are resolved relative to the batch file's directory. jobs is optional (overridden by CLI --jobs). A top-level array format (without the jobs wrapper) is also accepted.

CLI options:

| Option | Description |
|---|---|
| --prompt <text>, -p | Prompt text |
| --promptfiles <files...> | Read prompt from files (concatenated) |
| --image <path> | Output image path (required in single-image mode) |
| --batchfile <path> | JSON batch file for multi-image generation |
| --jobs <count> | Worker count for batch mode (default: auto, max from config, built-in default 10) |
| --provider google\|openai\|openrouter\|dashscope\|jimeng\|seedream\|replicate | Force provider (default: auto-detect) |
| --model <id>, -m | Model ID (Google: gemini-3-pro-image-preview; OpenAI: gpt-image-1.5; OpenRouter: google/gemini-3.1-flash-image-preview; DashScope: qwen-image-2.0-pro) |
| --ar <ratio> | Aspect ratio (e.g., 16:9, 1:1, 4:3) |
| --size <WxH> | Size (e.g., 1024x1024) |
| --quality normal\|2k | Quality preset (default: 2k) |
| --imageSize 1K\|2K\|4K | Image size for Google/OpenRouter (default: from quality) |
| --ref <files...> | Reference images. Supported by Google multimodal, OpenAI GPT Image edits, OpenRouter multimodal models, Replicate, and Seedream 5.0/4.5/4.0. Not supported by Jimeng, Seedream 3.0, or removed SeedEdit 3.0 |
| --n <count> | Number of images |
| --json | JSON output |

Environment variables:

| Variable | Description |
|---|---|
| OPENAI_API_KEY | OpenAI API key |
| OPENROUTER_API_KEY | OpenRouter API key |
| GOOGLE_API_KEY | Google API key |
| DASHSCOPE_API_KEY | DashScope API key (Alibaba Cloud) |
| REPLICATE_API_TOKEN | Replicate API token |
| JIMENG_ACCESS_KEY_ID | Jimeng (即梦) Volcengine access key |
| JIMENG_SECRET_ACCESS_KEY | Jimeng (即梦) Volcengine secret key |
| ARK_API_KEY | Seedream (豆包) Volcengine ARK API key |
| OPENAI_IMAGE_MODEL | OpenAI model override |
| OPENROUTER_IMAGE_MODEL | OpenRouter model override (default: google/gemini-3.1-flash-image-preview) |
| GOOGLE_IMAGE_MODEL | Google model override |
| DASHSCOPE_IMAGE_MODEL | DashScope model override (default: qwen-image-2.0-pro) |
| REPLICATE_IMAGE_MODEL | Replicate model override (default: google/nano-banana-pro) |
| JIMENG_IMAGE_MODEL | Jimeng model override (default: jimeng_t2i_v40) |
| SEEDREAM_IMAGE_MODEL | Seedream model override (default: doubao-seedream-5-0-260128) |
| OPENAI_BASE_URL | Custom OpenAI endpoint |
| OPENROUTER_BASE_URL | Custom OpenRouter endpoint (default: https://openrouter.ai/api/v1) |
| OPENROUTER_HTTP_REFERER | Optional app/site URL for OpenRouter attribution |
| OPENROUTER_TITLE | Optional app name for OpenRouter attribution |
| GOOGLE_BASE_URL | Custom Google endpoint |
| DASHSCOPE_BASE_URL | Custom DashScope endpoint |
| REPLICATE_BASE_URL | Custom Replicate endpoint |
| JIMENG_BASE_URL | Custom Jimeng endpoint (default: https://visual.volcengineapi.com) |
| JIMENG_REGION | Jimeng region (default: cn-north-1) |
| SEEDREAM_BASE_URL | Custom Seedream endpoint (default: https://ark.cn-beijing.volces.com/api/v3) |
| BAOYU_IMAGE_GEN_MAX_WORKERS | Override batch worker cap |
| BAOYU_IMAGE_GEN_<PROVIDER>_CONCURRENCY | Override provider concurrency, e.g. BAOYU_IMAGE_GEN_REPLICATE_CONCURRENCY |
| BAOYU_IMAGE_GEN_<PROVIDER>_START_INTERVAL_MS | Override provider start gap, e.g. BAOYU_IMAGE_GEN_REPLICATE_START_INTERVAL_MS |

.env files are loaded with precedence <cwd>/.baoyu-skills/.env > ~/.baoyu-skills/.env.

Model resolution priority: --model <id> > EXTEND.md default_model.[provider] > env <PROVIDER>_IMAGE_MODEL (e.g., GOOGLE_IMAGE_MODEL). For example, if EXTEND.md sets default_model.google: "gemini-3-pro-image-preview" and the env var GOOGLE_IMAGE_MODEL=gemini-3.1-flash-image-preview exists, EXTEND.md wins. The script reports the resolved choice as "Using [provider] / [model]". To switch models: --model <id> | EXTEND.md default_model.[provider] | env <PROVIDER>_IMAGE_MODEL.

DashScope models:

- qwen-image-2.0-pro, qwen-image-2.0-pro-2026-03-03, qwen-image-2.0, qwen-image-2.0-2026-03-03 — pass --model qwen-image-2.0-pro, or set default_model.dashscope / DASHSCOPE_IMAGE_MODEL, when the user wants official Qwen-Image behavior. These models accept size in width*height format:
  - Total pixels must stay between 512*512 and 2048*2048
  - Default size is approximately 1024*1024
  - Best choice for custom ratios such as 21:9 and for text-heavy Chinese/English layouts
- qwen-image-max, qwen-image-max-2025-12-30, qwen-image-plus, qwen-image-plus-2026-01-09, qwen-image — fixed sizes only: 1664*928, 1472*1104, 1328*1328, 1104*1472, 928*1664
  - Default size is 1664*928
  - qwen-image currently has the same capability as qwen-image-plus
- Other models: z-image-turbo, z-image-ultra, wanx-v1

Size selection rules:

- --size wins over --ar
- For qwen-image-2.0*, prefer an explicit --size; otherwise infer one from --ar using the official recommended resolutions below
- For qwen-image-max/plus/image, only use the five official fixed sizes; if the requested ratio is not covered, switch to qwen-image-2.0-pro
- --quality is a baoyu-image-gen compatibility preset, not a native DashScope API field. Mapping normal / 2k onto the qwen-image-2.0* table below is an implementation inference, not an official API guarantee.

qwen-image-2.0* sizes for common aspect ratios:

| Ratio | normal | 2k |
|-------|----------|------|
| 1:1 | 1024*1024 | 1536*1536 |
| 2:3 | 768*1152 | 1024*1536 |
| 3:2 | 1152*768 | 1536*1024 |
| 3:4 | 960*1280 | 1080*1440 |
| 4:3 | 1280*960 | 1440*1080 |
| 9:16 | 720*1280 | 1080*1920 |
| 16:9 | 1280*720 | 1920*1080 |
| 21:9 | 1344*576 | 2048*872 |

The DashScope API also supports negative_prompt, prompt_extend, and watermark, but baoyu-image-gen does not expose them as dedicated CLI flags today.

OpenRouter models:

- google/gemini-3.1-flash-image-preview (recommended, supports image output and reference-image workflows)
- google/gemini-2.5-flash-image-preview
- black-forest-labs/flux.2-pro

OpenRouter notes:

- Requests go through /chat/completions, not the OpenAI /images endpoints
- When --ref is used, choose a multimodal model that supports both image input and image output
- --imageSize maps to OpenRouter imageGenerationOptions.size; --size <WxH> is converted to the nearest OpenRouter size, with the aspect ratio inferred when possible

Replicate model formats:

- owner/name (recommended for official models), e.g. google/nano-banana-pro
- owner/name:version (community models pinned by version), e.g. stability-ai/sdxl:<version>

```bash
# Use Replicate default model
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate
# Override model explicitly
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana
```
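The DashScope aspect-ratio table above can be transcribed into a small lookup. qwen2_size is a hypothetical helper name, and (as noted above) the normal/2k mapping is an implementation inference, not an official DashScope guarantee.

```shell
# Map (aspect ratio, quality preset) to a qwen-image-2.0* size, per the table above
qwen2_size() {
  case "$1|$2" in
    "1:1|normal")  echo "1024*1024" ;;
    "1:1|2k")      echo "1536*1536" ;;
    "2:3|normal")  echo "768*1152"  ;;
    "2:3|2k")      echo "1024*1536" ;;
    "3:2|normal")  echo "1152*768"  ;;
    "3:2|2k")      echo "1536*1024" ;;
    "3:4|normal")  echo "960*1280"  ;;
    "3:4|2k")      echo "1080*1440" ;;
    "4:3|normal")  echo "1280*960"  ;;
    "4:3|2k")      echo "1440*1080" ;;
    "9:16|normal") echo "720*1280"  ;;
    "9:16|2k")     echo "1080*1920" ;;
    "16:9|normal") echo "1280*720"  ;;
    "16:9|2k")     echo "1920*1080" ;;
    "21:9|normal") echo "1344*576"  ;;
    "21:9|2k")     echo "2048*872"  ;;
    # Uncovered ratios: fall back to qwen-image-2.0-pro with an explicit --size
    *) echo "unsupported ratio" >&2; return 1 ;;
  esac
}
qwen2_size 16:9 2k
```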
Provider auto-selection:

- --ref provided + no --provider → auto-select Google first, then OpenAI, then OpenRouter, then Replicate (Jimeng and Seedream do not support reference images)
- --provider specified → use it (if --ref, must be google, openai, openrouter, or replicate)

Quality presets:

| Preset | Google | OpenAI | OpenRouter | Seedream | Use case |
|---|---|---|---|---|---|
| normal | 1K | 1024px | 1K | 1K | Quick previews |
| 2k (default) | 2K | 2048px | 2K | 2K | Covers, illustrations, infographics |

Aspect ratio and size handling:

- --imageSize accepts 1K|2K|4K
- Supported aspect ratios: 1:1, 16:9, 9:16, 4:3, 3:4, 2.35:1
- Google: --ar maps to imageConfig.aspectRatio
- OpenRouter: --ar maps to imageGenerationOptions.aspect_ratio; if only --size <WxH> is given, the aspect ratio is inferred automatically
- Replicate: passes aspect_ratio to the model; when --ref is provided without --ar, defaults to match_input_image

When --batchfile contains 2 or more pending tasks, the script automatically enables parallel generation.

| Situation | Approach | Why |
|---|---|---|
| Prompts are already finalized | Batch (--batchfile) | Reuses finalized prompts, applies shared throttling/retries, and gives predictable throughput |
| Each image still needs separate reasoning, prompt writing, or style exploration | Subagents | The work is still exploratory, so each image may need independent analysis before generation |
| Output comes from baoyu-article-illustrator with outline.md + prompts/ | Batch (build-batch.ts -> --batchfile) | That workflow already produces prompt files, so direct batch execution is the intended path |

Override the worker count with --jobs <count>.

```bash
# Install, then generate; prompt: "Lin Daiyu from Dream of the Red Chamber, ink painting style"
npx skills add jimliu/baoyu-skills@baoyu-image-gen
${BUN_X} {baseDir}/scripts/main.ts \
  --prompt "红楼梦林黛玉形象,水墨画风格" \
  --image out.png \
  --quality 2k \
  --n 4
```

## Image Hosting
Upload images to img402.dev and get a public URL. Free tier: 1MB max, 7-day retention, no auth. Use when the agent needs a hosted image URL — for sharing in messages, embedding in documents, posting to social platforms, or any context that requires a public link to an image file.
clawhub install image-hosting