Building an Automated E-commerce Product Image Generation System with Z-Image (造相): A Hands-on Guide - 深圳市維司達科技有限公司

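If you run an online store, you have probably been here: a new product is about to launch and needs a set of high-quality main images, with front, side, detail, and lifestyle shots all required. Hire a photographer? Expensive and slow. Shoot it yourself? Equipment, lighting, and retouching are all hurdles. And for SKUs that need multi-angle coverage, the photography alone can eat most of a day.

Recently, on an e-commerce project, I used Alibaba's Z-Image (造相) model to build an automated product image generation system, and the results were better than expected. This post shares what we learned in practice, from requirements analysis to code, as a reference for anyone with similar needs.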

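1. Why choose Z-Image for e-commerce image generation?

Before the technical details, a word on why we chose Z-Image. We compared several mainstream text-to-image models and settled on Z-Image for these reasons:

Low hardware barrier: Z-Image-Turbo has only 6B parameters and modest VRAM requirements. In our tests it ran smoothly on a consumer GPU with 16 GB of VRAM, which suits small and mid-sized e-commerce teams that cannot invest in expensive professional hardware.

Strong Chinese comprehension: as an Alibaba model, it understood Chinese prompts noticeably better in our comparison. We tested complex product descriptions such as “带有中国风刺绣的丝绸旗袍” (a silk qipao with Chinese-style embroidery); Z-Image correctly picked up the keywords “中国风” (Chinese style), “刺绣” (embroidery), and “丝绸” (silk) and generated images that matched expectations.

Fast generation: the Turbo version is advertised as sub-second, but in our tests a 1024x1024 product image took about 2-3 seconds. That speed still matters a great deal for batch work: generating multi-angle images for 100 SKUs is impractical with a slow model.

Good enough image quality: despite the small parameter count, the generated product images fully meet e-commerce platform requirements. Details are sharp and colors accurate, and materials in particular, such as the sheen of metal and the texture of fabric, are rendered well.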
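To put rough numbers on the hardware and speed points above, here is a back-of-the-envelope sketch. It assumes the weights are held in a 2-byte format (bf16/fp16) and counts only the 6B-parameter model itself; the text encoder, VAE, and activations come on top, so treat the result as a lower bound rather than a measurement. The 12 images per SKU (4 angles × 3 scenes) and the 2.5 s midpoint come from figures quoted elsewhere in this post.

```python
# Rough sizing only; constants are either quoted in the post or labeled as assumptions.
PARAMS = 6e9            # 6B parameters (from the post)
BYTES_PER_PARAM = 2     # assumption: bf16/fp16 weights


def weights_gib(params: float, bytes_per_param: int) -> float:
    """Memory needed for the model weights alone, in GiB."""
    return params * bytes_per_param / 2**30


def batch_minutes(skus: int, images_per_sku: int, seconds_per_image: float) -> float:
    """Wall-clock minutes for a sequential batch, ignoring retries and network waits."""
    return skus * images_per_sku * seconds_per_image / 60


if __name__ == "__main__":
    print(f"weights only: {weights_gib(PARAMS, BYTES_PER_PARAM):.1f} GiB")
    print(f"100 SKUs x 12 images at 2.5 s each: {batch_minutes(100, 12, 2.5):.0f} min")
```

Under these assumptions the weights alone take about 11.2 GiB, which is consistent with 16 GB being workable, and a 100-SKU batch is under an hour of pure generation time.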

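2. System architecture

Our system addresses three core needs: multi-angle display, scene adaptation, and a consistent style. The diagram below shows the overall architecture:

User input (product description) → Prompt optimization → Z-Image generation → Post-processing → Finished images
Supporting components feeding the pipeline: template library, scene library, quality control, batch export

The whole flow breaks down into a few key modules: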

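2.1 Prompt template system

This is one of the core parts of the system. We found that asking Z-Image for a product image directly gives unstable results, whereas a structured prompt template makes the output much more consistent.

We designed a set of e-commerce prompt templates. The base template looks like this:

[product], [material description], [color], [detail features], placed in [background/scene], [lighting], [camera angle], [quality requirements]

For example, to generate a product image of a ceramic mug, the prompt could be:

A white ceramic mug with a smooth, glossy surface and a simple blue geometric pattern, placed on a wooden tabletop next to an open book and a cup of coffee, natural light coming in from a window on the left, shot from a 45-degree high angle, high-definition photography, sharp details, blurred background

In code, we made the templates configurable: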

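    class ProductPromptTemplate:
        """Configurable prompt templates for product shots."""

        def __init__(self):
            self.templates = {
                'main': "{product}, {material}, {color}, {details}, placed in {scene}, {lighting}, {angle}, {quality}",
                'detail': "Close-up shot of {product} {detail_part}, showing {detail_feature}, {material} texture, {lighting}",
                'scene': "{product} in use in a {scene_type} setting, {activity_description}, natural lifestyle photo",
            }

        def generate(self, template_type, **kwargs):
            # Unknown template types fall back to the 'main' template
            template = self.templates.get(template_type, self.templates['main'])
            return template.format(**kwargs)


    # Usage example (field values are Chinese prompt fragments, kept as in the original)
    template = ProductPromptTemplate()
    prompt = template.generate(
        'main',
        product="陶瓷马克杯",                  # ceramic mug
        material="光面陶瓷",                   # glazed ceramic
        color="白色带蓝色几何图案",             # white with a blue geometric pattern
        details="简约现代设计",                # simple modern design
        scene="木质桌面，旁边有书和咖啡",        # wooden tabletop with a book and coffee
        lighting="自然侧光",                   # natural side light
        angle="45度俯拍",                      # 45-degree high angle
        quality="高清摄影，细节清晰，背景虚化",  # HD photo, sharp details, blurred background
    )
    print(prompt)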
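One practical wrinkle with `str.format` templates is that a missing field only surfaces as a `KeyError` at render time. A minimal sketch of checking fields up front, using the standard library's `string.Formatter`, is shown here; `required_fields`, `missing_fields`, and `render` are illustrative helper names, not part of the system described in this post.

```python
from string import Formatter

MAIN_TEMPLATE = (
    "{product}, {material}, {color}, {details}, placed in {scene}, "
    "{lighting}, {angle}, {quality}"
)


def required_fields(template: str) -> list[str]:
    """Return the placeholder names of a str.format-style template, in order."""
    return [name for _, name, _, _ in Formatter().parse(template) if name]


def missing_fields(template: str, values: dict) -> list[str]:
    """List placeholders that have no value, or only a blank one."""
    return [f for f in required_fields(template) if not str(values.get(f, "")).strip()]


def render(template: str, **values) -> str:
    """Format the template, failing early with a readable message."""
    missing = missing_fields(template, values)
    if missing:
        raise ValueError(f"missing prompt fields: {', '.join(missing)}")
    return template.format(**values)
```

Failing early like this is mostly useful in batch runs, where a half-filled prompt would otherwise cost an API call before anyone notices.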

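2.2 Multi-angle generation strategy

E-commerce product images usually need several angles: front, side, 45 degrees, detail close-ups, and so on. We implemented an angle control system: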

    import time


    class MultiAngleGenerator:
        def __init__(self, model_client):
            # model_client is any object exposing generate_image(...);
            # its definition is not shown in this post
            self.client = model_client
            self.angle_prompts = {
                'front': "front view, directly facing the product, centered composition",
                'side': "side view, showing the profile of the product",
                'angle_45': "45 degree angle view, showing front and side simultaneously",
                'top': "top-down view, looking directly down on the product",
                'detail': "extreme close-up, focusing on specific details",
            }

        def generate_all_angles(self, base_description, product_name):
            """Generate one image per angle for the same product."""
            results = {}
            for angle_name, angle_desc in self.angle_prompts.items():
                # Combine the base description with the angle description
                full_prompt = f"{base_description}, {angle_desc}, professional product photography"
                # Call Z-Image
                image = self.client.generate_image(
                    prompt=full_prompt,
                    negative_prompt="blurry, distorted, low quality, watermark, text",
                    size="1024x1024",
                    num_inference_steps=12,
                )
                results[f"{product_name}_{angle_name}"] = image
                # Pause between calls to avoid API rate limiting
                time.sleep(1)
            return results
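A fixed one-second pause is the simplest guard against rate limits. If calls still fail occasionally, one common alternative is to retry with exponential backoff. The sketch below is a generic helper, not something from our system: `call_with_backoff` and its parameters are hypothetical names, and it retries on any exception for brevity, whereas real code should retry only on the specific rate-limit or timeout errors your client raises.

```python
import random
import time


def call_with_backoff(fn, *, max_attempts=4, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call fn(); on failure wait base_delay * 2**attempt (plus up to 10% jitter) and retry.

    The last failure is re-raised once max_attempts calls have been made.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
```

Usage would look like `call_with_backoff(lambda: client.generate_image(prompt=full_prompt))`; the `sleep` argument exists only so the waiting can be swapped out in tests.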

2.3 Scene adaptation module

Different products suit different scenes. We built a scene library that matches scenes to the product type automatically:

    class SceneAdapter:
        def __init__(self):
            self.scene_mapping = {
                'home_appliance': ['modern kitchen', 'living room', 'home office'],
                'electronics': ['minimalist desk', 'tech workspace', 'dark background'],
                'clothing': ['studio lighting', 'lifestyle setting', 'mannequin display'],
                'food': ['restaurant table', 'kitchen counter', 'natural wood background'],
                'beauty': ['vanity table', 'bathroom shelf', 'luxury display case'],
            }

        def get_scene_for_product(self, product_type, style="standard"):
            """Return the scenes for a product type, adjusted for the requested style."""
            # Unknown product types fall back to a plain white background
            scenes = self.scene_mapping.get(product_type, ['plain white background'])
            if style == "premium":
                # Premium style adds extra modifiers
                scenes = [f"luxury {scene}, professional lighting" for scene in scenes]
            elif style == "lifestyle":
                # Lifestyle style
                scenes = [f"{scene} with natural elements, authentic feeling" for scene in scenes]
            return scenes

        def generate_scene_variations(self, product_desc, product_type, num_variations=3):
            """Build prompts for several scene variants of the same product."""
            scenes = self.get_scene_for_product(product_type)
            variations = []
            for i in range(min(num_variations, len(scenes))):
                scene_prompt = f"{product_desc}, placed in a {scenes[i]}, professional product photography"
                variations.append(scene_prompt)
            return variations

3. Hands-on code: batch-generating product images

Below is a complete example that batch-generates multi-angle, multi-scene images for a product line:

    import os
    import time
    from datetime import datetime

    import requests

    # SceneAdapter is the class from section 2.3


    class ZImageProductGenerator:
        def __init__(self, api_key, base_url="https://dashscope.aliyuncs.com/api/v1"):
            self.api_key = api_key
            self.base_url = base_url
            self.headers = {
                "Authorization": f"Bearer {api_key}",
                "Content-Type": "application/json",
            }

        def call_zimage_api(self, prompt, size="1024x1024", num_images=1):
            """Call the image API; return the URL of the first image, or None on failure."""
            # The request uses "width*height", so convert from "1024x1024"
            width, height = map(int, size.split('x'))
            payload = {
                # Kept from the original. "wan2.6-t2i" reads like a Wan-series model id
                # rather than a Z-Image one; confirm the id against the DashScope model list.
                "model": "wan2.6-t2i",
                "input": {
                    "messages": [{"role": "user", "content": [{"text": prompt}]}]
                },
                "parameters": {
                    "size": f"{width}*{height}",
                    "n": num_images,
                    "negative_prompt": "blurry, distorted, low quality, watermark, text, people, ugly",
                    "prompt_extend": True,
                    "watermark": False,
                },
            }
            try:
                response = requests.post(
                    f"{self.base_url}/services/aigc/multimodal-generation/generation",
                    headers=self.headers,
                    json=payload,
                    timeout=30,
                )
                if response.status_code == 200:
                    result = response.json()
                    return result['output']['choices'][0]['message']['content'][0]['image']
                print(f"API call failed: {response.status_code}, {response.text}")
                return None
            except Exception as e:
                print(f"Error while calling the API: {e}")
                return None

        def download_image(self, image_url, save_path):
            """Download an image to a local path; return True on success."""
            try:
                response = requests.get(image_url, timeout=10)
                if response.status_code == 200:
                    with open(save_path, 'wb') as f:
                        f.write(response.content)
                    return True
                print(f"Image download failed: {response.status_code}")
                return False
            except Exception as e:
                print(f"Error while downloading the image: {e}")
                return False

        def generate_product_set(self, product_config):
            """Generate the full image set for one product."""
            product_name = product_config['name']
            base_description = product_config['description']
            product_type = product_config.get('type', 'general')
            print(f"Generating image set for: {product_name}")

            # Create the output directory
            timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
            save_dir = f"./output/{product_name}_{timestamp}"
            os.makedirs(save_dir, exist_ok=True)

            # Look up the scenes for this product type and style
            scene_adapter = SceneAdapter()
            scenes = scene_adapter.get_scene_for_product(
                product_type, style=product_config.get('style', 'standard'))

            # Angles to generate (Chinese prompt fragments, kept from the original)
            angles = [
                ("front", "正面视角，产品居中，清晰展示整体设计"),   # front view, centered, whole design
                ("angle_45", "45度角，展示产品的立体感和深度"),      # 45 degrees, shows depth
                ("top", "俯视角度，展示产品顶部设计和布局"),         # top-down, shows top layout
                ("detail", "细节特写，突出材质和工艺细节"),          # close-up of material and finish
            ]

            results = []
            # Generate every angle for each scene (at most 3 scenes per product)
            for scene_idx, scene in enumerate(scenes[:3]):
                scene_dir = os.path.join(save_dir, f"scene_{scene_idx+1}")
                os.makedirs(scene_dir, exist_ok=True)

                for angle_name, angle_desc in angles:
                    # Build the full prompt
                    full_prompt = (f"{base_description}, {angle_desc}, placed in {scene}, "
                                   "professional product photography, high detail, sharp focus")
                    print(f"Generating: {scene} - {angle_name}")

                    image_url = self.call_zimage_api(full_prompt, size="1024x1024")
                    if image_url:
                        filename = f"{product_name}_{scene_idx+1}_{angle_name}.png"
                        save_path = os.path.join(scene_dir, filename)
                        if self.download_image(image_url, save_path):
                            results.append({
                                "product": product_name,
                                "scene": scene,
                                "angle": angle_name,
                                "path": save_path,
                                "prompt": full_prompt,
                            })
                            print(f"✓ Saved: {filename}")
                        else:
                            print(f"✗ Download failed: {filename}")
                    # Avoid sending requests too quickly
                    time.sleep(2)

            print(f"Image set finished: {len(results)} images")
            return {
                "product": product_name,
                "total_images": len(results),
                "save_dir": save_dir,
                "images": results,
            }


    def generate_report(results):
        """Write a Markdown summary of the generation run."""
        os.makedirs("./output", exist_ok=True)
        report_path = "./output/generation_report.md"
        with open(report_path, 'w', encoding='utf-8') as f:
            f.write("# Product Image Generation Report\n\n")
            f.write(f"Generated at: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}\n\n")
            total_images = 0
            for result in results:
                total_images += result['total_images']
                f.write(f"## {result['product']}\n")
                f.write(f"- Images generated: {result['total_images']}\n")
                f.write(f"- Output directory: {result['save_dir']}\n\n")
                f.write("### Generated images\n")
                for img in result['images']:
                    f.write(f"- `{os.path.basename(img['path'])}`: {img['scene']} - {img['angle']}\n")
                f.write("\n")
            f.write("## Summary\n")
            f.write(f"- Products: {len(results)}\n")
            f.write(f"- Images: {total_images}\n")
            if results:  # avoid dividing by zero on an empty run
                f.write(f"- Average per product: {total_images/len(results):.1f}\n")
        print(f"Report written: {report_path}")


    # Usage example
    def main():
        # Read the API key from an environment variable
        api_key = os.getenv("DASHSCOPE_API_KEY")
        if not api_key:
            print("Please set the DASHSCOPE_API_KEY environment variable")
            return

        generator = ZImageProductGenerator(api_key)

        # Product configuration (names and descriptions are Chinese prompts, kept as-is)
        products = [
            {
                "name": "智能咖啡机",  # smart coffee machine
                "type": "home_appliance",
                "style": "premium",
                "description": "一款现代简约风格的智能咖啡机，不锈钢机身，触摸屏控制，带有蒸汽打奶泡功能，设计优雅",
            },
            {
                "name": "无线蓝牙耳机",  # wireless Bluetooth earbuds
                "type": "electronics",
                "style": "lifestyle",
                "description": "入耳式无线蓝牙耳机，磨砂黑配色，人体工学设计，带有充电仓，支持主动降噪",
            },
        ]

        # Generate the image sets one product at a time
        all_results = []
        for product_config in products:
            result = generator.generate_product_set(product_config)
            all_results.append(result)
            # Longer pause between products
            time.sleep(5)

        generate_report(all_results)


    if __name__ == "__main__":
        main()
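Before spending API calls, it can help to list what a run would produce. The sketch below enumerates the scene × angle combinations and the filenames they would get, following the same `{product}_{scene index}_{angle}.png` naming as the script above. `plan_jobs` is a hypothetical helper for a dry run and is not part of the original system.

```python
from itertools import product

ANGLES = ["front", "angle_45", "top", "detail"]


def plan_jobs(product_name, scenes, angles=ANGLES, max_scenes=3):
    """List the images a run would generate, without calling any API."""
    jobs = []
    numbered_scenes = enumerate(scenes[:max_scenes], start=1)
    for (scene_idx, scene), angle in product(numbered_scenes, angles):
        jobs.append({
            "scene": scene,
            "angle": angle,
            "filename": f"{product_name}_{scene_idx}_{angle}.png",
        })
    return jobs


if __name__ == "__main__":
    jobs = plan_jobs("mug", ["modern kitchen", "living room", "home office"])
    print(f"{len(jobs)} images planned")
    for job in jobs[:3]:
        print(job["filename"], "-", job["scene"])
```

With three scenes and four angles this plans 12 images per product, which is also a quick way to estimate cost and run time for a batch.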

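4. Tips for better results

In day-to-day use we collected a few techniques that improve output quality:

4.1 Prompt optimization

Z-Image is fairly sensitive to prompt structure. We found these practices work better:

Specific beats abstract: instead of "a nice-looking cup", write "a white ceramic mug with a blue geometric pattern, simple modern design".

Add photography terms: words such as "professional product photography", "studio lighting", and "sharp focus" noticeably improve the professional look of the image.

Control negative prompts: tell the model explicitly what you do not want, for example "no text, no watermark, no people, no blurry".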
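The last two tips can be folded into a small helper. Note that the code in section 3 passes unwanted items through a separate `negative_prompt` field as a plain list ("blurry, distorted, ..."), without the word "no", so the sketch below strips a leading "no " when building that field. `build_prompts` and `PHOTO_TERMS` are illustrative names, not part of our system.

```python
PHOTO_TERMS = ["professional product photography", "studio lighting", "sharp focus"]


def build_prompts(subject, unwanted, photo_terms=PHOTO_TERMS):
    """Return (prompt, negative_prompt) for a product description.

    Photography terms are appended unless the subject already contains them;
    unwanted items are de-duplicated and stripped of a leading "no ".
    """
    parts = [subject.strip()]
    for term in photo_terms:
        if term.lower() not in subject.lower():
            parts.append(term)

    negatives = []
    for item in unwanted:
        cleaned = item.strip()
        if cleaned.lower().startswith("no "):
            cleaned = cleaned[3:].strip()
        if cleaned and cleaned not in negatives:
            negatives.append(cleaned)

    return ", ".join(parts), ", ".join(negatives)


if __name__ == "__main__":
    prompt, negative = build_prompts(
        "white ceramic mug with a blue geometric pattern",
        ["no text", "no watermark", "no people", "blurry"],
    )
    print(prompt)
    print(negative)
```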

4.2 Parameter tuning

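    # Recommended settings (the API call in section 3 converts the size to "1024*1024")
    optimal_params = {
        "size": "1024x1024",          # common size for e-commerce images
        "num_inference_steps": 12,    # 8-12 steps worked best for the Turbo version
        "guidance_scale": 3.5,        # guidance strength; too high oversaturates
        "prompt_extend": True,        # enable prompt enhancement
        "seed": 42,                   # a fixed seed gives repeatable results
    }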

4.3 Post-processing

Generated images may need some light post-processing:

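    from PIL import Image, ImageDraw, ImageEnhance, ImageFilter, ImageFont


    class ImagePostProcessor:
        @staticmethod
        def enhance_product_image(image_path, output_path):
            """Apply light contrast, sharpness, and smoothing adjustments."""
            img = Image.open(image_path)
            # 1. Raise contrast slightly
            img = ImageEnhance.Contrast(img).enhance(1.1)
            # 2. Raise sharpness slightly
            img = ImageEnhance.Sharpness(img).enhance(1.2)
            # 3. Mild smoothing to reduce noise while keeping detail
            img = img.filter(ImageFilter.SMOOTH_MILD)
            # 4. Save
            img.save(output_path, quality=95)
            return output_path

        @staticmethod
        def create_watermark(image_path, text, output_path):
            """Add a simple text watermark (brand mark) in the bottom-right corner."""
            img = Image.open(image_path).convert("RGBA")
            # Draw on a transparent overlay so the background box is really semi-transparent
            overlay = Image.new("RGBA", img.size, (255, 255, 255, 0))
            draw = ImageDraw.Draw(overlay)
            # Use the default font
            font = ImageFont.load_default()
            # textbbox replaces textsize, which was removed in Pillow 10
            left, top, right, bottom = draw.textbbox((0, 0), text, font=font)
            text_width, text_height = right - left, bottom - top
            margin = 20
            position = (img.width - text_width - margin, img.height - text_height - margin)
            # Semi-transparent background box
            draw.rectangle(
                [position[0] - 5, position[1] - 5,
                 position[0] + text_width + 5, position[1] + text_height + 5],
                fill=(255, 255, 255, 128),
            )
            # Text
            draw.text(position, text, fill=(0, 0, 0, 255), font=font)
            # Flatten back to RGB so the result can also be saved as JPEG
            Image.alpha_composite(img, overlay).convert("RGB").save(output_path)
            return output_path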

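5. A real-world case

We deployed this system in a home goods e-commerce project. How did it do? Here are the numbers:

Efficiency: a full image set for one product (4 angles × 3 scenes) used to take a photographer 2-3 hours. With the system, including time spent adjusting prompts, it takes about 15-20 minutes.

Cost: outsourced photography averaged 300-500 yuan per set. With the system, the main cost is API calls at a few fen per image, so a set of 12 images costs less than 1 yuan.

Quality: we ran a blind test that mixed AI-generated and photographed images and had 50 target users rate them on a 5-point scale. AI images averaged 4.2 and photographs 4.5. For non-premium products, that gap is acceptable.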
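The efficiency and cost figures above work out as follows. The time ranges and the 12-image set size are the ones quoted above; the per-image price of 5 fen is an assumption standing in for "a few fen", so substitute your actual API pricing.

```python
IMAGES_PER_SET = 4 * 3        # 4 angles x 3 scenes (from the post)
ASSUMED_FEN_PER_IMAGE = 5     # assumption: "a few fen" per image; 100 fen = 1 yuan


def speedup_range(manual_min, system_min):
    """Return (lowest, highest) speedup for (low, high) duration ranges in minutes."""
    return manual_min[0] / system_min[1], manual_min[1] / system_min[0]


def set_cost_fen(images, fen_per_image):
    """API cost of one image set, in fen (integers avoid float rounding)."""
    return images * fen_per_image


if __name__ == "__main__":
    low, high = speedup_range((120, 180), (15, 20))
    cost = set_cost_fen(IMAGES_PER_SET, ASSUMED_FEN_PER_IMAGE)
    print(f"speedup: {low:.0f}x to {high:.0f}x")
    print(f"API cost per set: {cost / 100:.2f} yuan (vs 300-500 yuan outsourced)")
```

Under these assumptions the system is roughly 6 to 12 times faster per set, and a set costs 0.60 yuan in API calls. This excludes the time spent reviewing and selecting images.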

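Scenarios where it fits especially well:

New product concept images: the product has not been manufactured yet, but the page has to be built first
Long-tail SKUs: low-volume products that do not justify a dedicated shoot
A/B testing: quickly generate images in different styles to test click-through rates
Holiday marketing: quickly generate scene images with seasonal elements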

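6. Pitfalls and how we handled them

Of course, it was not all smooth sailing. We ran into a few problems:

Problem 1: unstable results. Sometimes the same prompt produces very different images across two runs. Our fix is to pin the seed parameter, and for important product images we generate 3-5 variants and pick the best one by hand.

Problem 2: limited understanding of complex structures. For products with especially complex structure, such as tools with several moving parts, Z-Image sometimes generates malformed geometry. Our fix is to decompose the prompt, describing the whole first and then the parts, or to use terms such as "product diagram" or "exploded view".

Problem 3: handling brand elements. The first images were too generic and lacked brand character. We addressed this by tuning the prompts to include brand colors and design style, and by adding a brand watermark uniformly in post-processing.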
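For Problem 1, pinning a seed and generating several variants can be combined by deriving a small set of seeds deterministically from the product and angle, so that a rerun reproduces the same candidates. This is a sketch under assumptions: `variant_seeds` is a hypothetical helper, and the 31-bit seed range is a guess, so check which seed values your endpoint actually accepts.

```python
import hashlib


def variant_seeds(product_name, angle, n_variants=4, base_seed=42):
    """Derive n_variants reproducible seeds for one product/angle pair.

    The same inputs always yield the same seeds; the values are kept within
    0..2**31-1, which is an assumed range rather than a documented limit.
    """
    seeds = []
    for i in range(n_variants):
        key = f"{base_seed}:{product_name}:{angle}:{i}".encode("utf-8")
        digest = hashlib.sha256(key).digest()
        seeds.append(int.from_bytes(digest[:4], "big") & 0x7FFFFFFF)
    return seeds


if __name__ == "__main__":
    for seed in variant_seeds("智能咖啡机", "front"):
        print(seed)
```

Storing the chosen seed alongside the prompt in the generation report makes it possible to regenerate the hand-picked image later.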

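7. Summary and recommendations

Overall, a Z-Image-based product image generation system solves a lot of practical problems. For small and mid-sized e-commerce teams it offers good value: it is simple to deploy and the results are good enough.

If you want to try it, here is my advice:

Start small: pick a few non-core products first to test the waters, get familiar with the workflow, adjust the prompt templates, and find the style that suits your products.

Set up a review process: AI generation is not fully reliable, so important product images still need a human check, especially for obvious deformities or errors.

Combine it with real photography: AI generation and real shoots are not an either/or choice. Use photographs for flagship products, AI images for long-tail products, and AI for fast drafts at the concept stage.

Watch your costs: a single image is cheap, but API fees add up in batch runs. Set a monthly budget or cap the number of generations.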
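The budget advice can be enforced in code with a small counter that is checked before each API call. The sketch below tracks spend in integer fen to avoid float rounding; `GenerationBudget` is a hypothetical class, and the prices in the example are placeholders rather than real API rates.

```python
from dataclasses import dataclass


@dataclass
class GenerationBudget:
    """Tracks image spend against a monthly cap, in fen (100 fen = 1 yuan)."""
    monthly_limit_fen: int
    fen_per_image: int
    spent_fen: int = 0

    def can_afford(self, images: int = 1) -> bool:
        return self.spent_fen + images * self.fen_per_image <= self.monthly_limit_fen

    def charge(self, images: int = 1) -> None:
        """Record spend for `images` images, refusing to exceed the cap."""
        if not self.can_afford(images):
            raise RuntimeError("monthly image budget exceeded")
        self.spent_fen += images * self.fen_per_image

    @property
    def remaining_images(self) -> int:
        return (self.monthly_limit_fen - self.spent_fen) // self.fen_per_image


if __name__ == "__main__":
    budget = GenerationBudget(monthly_limit_fen=20_000, fen_per_image=5)  # placeholder prices
    budget.charge(12)  # one full image set
    print(f"{budget.remaining_images} images left this month")
```

In a batch script, calling `budget.can_afford(12)` before starting each product keeps a run from stopping halfway through an image set.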

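The technology is moving fast. Models like Z-Image let small teams do work that used to require a professional crew. There are still limitations, but as the models keep improving, I expect AI-generated product images to get closer to real photography and perhaps surpass it. If you run an online store, this approach is worth a try; it may save you a good deal of time and budget.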
