A Review of Progress in the Application of Generative Artificial Intelligence in Urban Design

HONG Qiyuan (洪齐远); XIA Junhao (夏俊豪); LONG Ying (龙瀛)

doi:10.3724/j.fjyl.LA20250329

Generative Artificial Intelligence in Urban Design: A Review of Recent Applications

Abstract (Chinese version, translated):
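Objective Generative artificial intelligence (GAI) is becoming deeply involved in the urban design process, providing technical support for complex task construction, multi-role participation, and efficient generation. This research aims to identify the main application types and intervention mechanisms of GAI in urban design, analyze its evolutionary trends and typical characteristics, and outline key technical challenges and future research directions.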
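Methods Through a systematic search of representative studies and cases in mainstream Chinese- and English-language databases over the past 10 years, 125 publications at the intersection of GAI and urban design were included. An analytical framework of "GAI technology type × urban design task stage" was constructed: models were divided into four categories (image-driven, language-driven, structure-driven, and feedback-optimized), and task stages into preliminary analysis, scheme generation, evaluation, optimization and decision-making, and outcome expression. Drawing on design thinking theory, the technology acceptance model (TAM) was adopted as a descriptive analytical lens to compare how different models perform on perceived usefulness and perceived ease of use, and how their intervention logics differ.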
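Results GAI has been widely embedded in task workflows at every stage of urban design, and different model types have formed synergistic coupling mechanisms across multi-stage tasks. Image-driven models were adopted earliest because of their intuitiveness and ease of use; language-driven, structure-driven, and feedback-optimized models show greater potential in strategy construction, rule embedding, and multi-objective optimization, but overall application remains uneven. Model combination, cross-modal collaboration, and process iteration are on the rise, and differences in perceived usefulness and ease of use among models affect their share of applications.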
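Conclusion At present most GAI models remain procedural combinations and do not yet possess the logic-construction capability of generative intelligence; their role in urban design is mainly task response and auxiliary generation. Poor controllability, insufficient interpretability, and weak adaptability to regional data are the core bottlenecks constraining wider application. Future research should focus on building structured local design data systems, strengthening models' control-transparency mechanisms and cross-model collaboration capabilities, and advancing GAI from tool-like intervention toward deep participation in design logic.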

Abstract:
Objective As urban design faces increasing demands for contextual responsiveness, iterative optimization, and data-informed reasoning, integrating artificial intelligence into the design process has gained renewed relevance. Among emerging technologies, generative artificial intelligence (GAI) shows strong potential for automating content creation and simulating spatial configurations. This research reviews recent developments in the application of GAI to urban design. It identifies representative technical pathways, their respective intervention stages, and the functional mechanisms by which generative models are reshaping the design workflow, and it presents a structured, theory-informed synthesis of how different generative models contribute to tasks such as intention modeling, spatial reasoning, and performance-driven design. Building on design thinking and a descriptive lens informed by the technology acceptance model (TAM), the research examines how model type, data modality, and task characteristics affect GAI's functional role, usability, and acceptance. Particular attention is given to mapping deployment forms, from isolated tools to coordinated multi-model workflows, and to characterizing cross-cutting challenges of controllability, transparency, and contextual adaptability in urban design settings.
Methods Following the PRISMA protocol, the research conducts a multi-stage literature review combining automated search and expert screening. A total of 125 peer-reviewed articles and high-impact preprints are drawn from Web of Science, CNKI, arXiv, and selected industry sources, covering the period from 2014 to July 2025. Search terms such as "generative AI", "AIGC", "GAN", "diffusion model", "variational autoencoder", "autoregressive model", and "large language model" are combined with urban-related keywords. Based on the collected literature, generative models are grouped into four types according to their application characteristics in urban design tasks: image-driven, language-driven, structure-driven, and feedback-optimized. These types are aligned with four stages of the design process: preliminary analysis, scheme generation, evaluation and decision-making, and outcome expression. Together, the two dimensions form a framework for examining how different GAI pathways intervene across tasks. To refine the mapping, each design stage is further broken down into three or four representative sub-tasks. Preliminary analysis includes public demand analysis, urban data enhancement, case/task framing, and spatial element recognition. Scheme generation covers design intention modeling, spatial layout generation, and 3D form construction. The evaluation and decision-making stage includes multi-objective optimization, scheme evaluation, and scenario prediction. The final expression stage involves textual documentation, 2D representation, and visual rendering. A quantitative analysis is also conducted to show the distribution of model types over design stages, identify common combinations, and trace the evolution of research focus over time. TAM informs a descriptive synthesis of perceived usefulness (PU) and perceived ease of use (PEU) across model types to illuminate adoption patterns.
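As an illustration only, the "model type × design stage" framework and the counting of model combinations described in the Methods can be sketched in a few lines of Python. The records below are hypothetical toy data, not the 125 reviewed publications, and the helper names `cross_tabulate` and `model_combinations` are assumptions introduced for this sketch rather than the authors' procedure.

```python
from collections import Counter
from itertools import combinations

# Category labels taken from the abstract.
MODEL_TYPES = ("image-driven", "language-driven",
               "structure-driven", "feedback-optimized")
STAGES = ("preliminary analysis", "scheme generation",
          "evaluation and decision-making", "outcome expression")

# Hypothetical coded records: each reviewed study is tagged with the
# model types it uses and the design stages it addresses.
records = [
    {"models": {"image-driven"}, "stages": {"outcome expression"}},
    {"models": {"image-driven", "language-driven"},
     "stages": {"scheme generation", "outcome expression"}},
    {"models": {"structure-driven"}, "stages": {"scheme generation"}},
    {"models": {"feedback-optimized", "structure-driven"},
     "stages": {"evaluation and decision-making"}},
    {"models": {"language-driven"}, "stages": {"preliminary analysis"}},
]


def cross_tabulate(studies):
    """Count studies per (model type, design stage) cell."""
    counts = Counter()
    for study in studies:
        for model in study["models"]:
            for stage in study["stages"]:
                counts[(model, stage)] += 1
    return counts


def model_combinations(studies):
    """Count how often two model types appear in the same study."""
    combos = Counter()
    for study in studies:
        for pair in combinations(sorted(study["models"]), 2):
            combos[pair] += 1
    return combos


matrix = cross_tabulate(records)
pairs = model_combinations(records)

# Print one row per model type, one column per design stage.
for model in MODEL_TYPES:
    print(model.ljust(20), [matrix[(model, stage)] for stage in STAGES])
```

A study that uses several model types or spans several stages contributes to every matching cell, so cell totals can exceed the number of studies; whether the review counts multi-stage studies this way is not stated in the abstract.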
Results The findings reveal that GAI models are increasingly integrated into urban design workflows but exhibit uneven adoption across task types and modalities. Image-driven models dominate in both early-stage analysis and final visual representation due to their high interpretability, usability, and compatibility with existing design practices. Language-driven models are commonly used in public demand analysis, participatory planning, and scenario scripting, enabled by the rise of large language models (LLMs) such as ChatGPT and DeepSeek. Structure-driven models, though less prevalent, show promise in generating street networks, land-parcel layouts, and spatial typologies using graph-based logic. Feedback-optimized models, which rely on reinforcement learning, evolutionary algorithms, and performance simulation, are the least adopted but demonstrate strong potential in multi-objective optimization and iterative decision-making. Recent research indicates an increasing use of multi-model workflows, such as text-to-image pipelines integrated with urban simulation or feedback loops. While GAI applications increasingly support design iteration, their adoption is heavily influenced by the controllability, explainability, and contextual adaptability of models. PU and PEU vary significantly by model type, with image-driven models rated highest and structure-driven and feedback-optimized models facing usability challenges due to complexity and low transparency.
Conclusion Although GAI has demonstrated broad applicability across the urban design process, current implementations are largely procedural and auxiliary in nature. Most models recombine existing inputs rather than construct original logic, and few possess autonomous reasoning or normative awareness. This limits their role to content augmentation rather than conceptual guidance in design development. Moreover, issues such as opaque decision logic, lack of domain-specific knowledge embedding, and poor adaptability to local planning norms hinder practical adoption. Addressing these challenges requires multi-level efforts: 1) construct structured, regionally grounded urban design datasets; 2) improve model interpretability, controllability, and responsiveness to professional input; and 3) develop modular, multi-model systems that support seamless interaction across design stages. Human-AI collaboration mechanisms, especially those based on iterative prompts and semantic feedback, must be enhanced so that AI can serve not just as a tool but as an active design partner. This review offers a comprehensive reference for scholars and practitioners seeking to understand how GAI is reshaping the logic, structure, and agency of urban design.
