Abstract:
Objective As urban design faces increasing demands for contextual responsiveness, iterative optimization, and data-informed reasoning, integrating artificial intelligence into the design process has gained renewed relevance. Among emerging technologies, generative artificial intelligence (GAI) shows strong potential for automating content creation and simulating spatial configurations. This research reviews recent developments in the application of GAI to urban design. It identifies representative technical pathways, the design stages at which they intervene, and the functional mechanisms by which generative models are reshaping the design workflow, and it presents a structured, theory-informed synthesis of how different generative models contribute to tasks such as intention modeling, spatial reasoning, and performance-driven design. Building on design thinking and a descriptive lens informed by the technology acceptance model (TAM), the research examines how model type, data modality, and task characteristics affect GAI’s functional role, usability, and acceptance. Particular attention is given to mapping deployment forms, from isolated tools to coordinated multi-model workflows, and to characterizing the cross-cutting challenges of controllability, transparency, and contextual adaptability in urban design settings.
Methods Following the PRISMA protocol, the research conducts a multi-stage literature review combining automated search and expert screening. A total of 125 peer-reviewed articles and high-impact preprints are drawn from Web of Science, CNKI, arXiv, and selected industry sources, covering the period from 2014 to July 2025. Search terms such as “generative AI”, “AIGC”, “GAN”, “diffusion model”, “variational autoencoder”, “autoregressive model”, “large language model”, and urban-related keywords are used in various combinations. Based on their application characteristics in urban design tasks, the generative models in the collected literature are grouped into four types: image-driven, language-driven, structure-driven, and feedback-optimized. These types are aligned with four stages of the design process: preliminary analysis, scheme generation, evaluation and decision-making, and outcome expression. Together, the model types and design stages form a two-dimensional framework for examining how different GAI pathways intervene across tasks. To refine the mapping, each design stage is further broken down into representative sub-tasks. Preliminary analysis includes public demand analysis, urban data enhancement, case/task framing, and spatial element recognition. Scheme generation covers design intention modeling, spatial layout generation, and 3D form construction. The evaluation and decision-making stage includes multi-objective optimization, scheme evaluation, and scenario prediction. The outcome expression stage involves textual documentation, 2D representation, and visual rendering. A quantitative analysis is also conducted to show the distribution of model types over design stages, identify common combinations, and trace the evolution of research focus over time. TAM informs a descriptive synthesis of perceived usefulness (PU) and perceived ease of use (PEU) across model types to illuminate adoption patterns.
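The two-dimensional framework described in the Methods (model type by design stage) can be pictured as a simple cross-tabulation. The following is a minimal sketch only, not the authors' coding scheme: the function name `cross_tabulate` and the `example_records` list are hypothetical, and the records are placeholders rather than data from the reviewed literature.

```python
from collections import Counter

# Model types and design stages as named in the Methods.
MODEL_TYPES = [
    "image-driven",
    "language-driven",
    "structure-driven",
    "feedback-optimized",
]

# Each stage maps to the sub-tasks listed in the Methods.
STAGES = {
    "preliminary analysis": [
        "public demand analysis",
        "urban data enhancement",
        "case/task framing",
        "spatial element recognition",
    ],
    "scheme generation": [
        "design intention modeling",
        "spatial layout generation",
        "3D form construction",
    ],
    "evaluation and decision-making": [
        "multi-objective optimization",
        "scheme evaluation",
        "scenario prediction",
    ],
    "outcome expression": [
        "textual documentation",
        "2D representation",
        "visual rendering",
    ],
}


def cross_tabulate(records):
    """Count studies per (model type, design stage) cell.

    `records` is an iterable of (model_type, stage) pairs, one per study.
    Unknown labels raise ValueError so coding mistakes surface early.
    """
    counts = Counter()
    for model_type, stage in records:
        if model_type not in MODEL_TYPES:
            raise ValueError(f"unknown model type: {model_type}")
        if stage not in STAGES:
            raise ValueError(f"unknown design stage: {stage}")
        counts[(model_type, stage)] += 1
    # Return a dense table so empty cells appear explicitly as 0.
    return {m: {s: counts[(m, s)] for s in STAGES} for m in MODEL_TYPES}


# Hypothetical placeholder records (assumption): NOT results from the review.
example_records = [
    ("image-driven", "preliminary analysis"),
    ("image-driven", "outcome expression"),
    ("image-driven", "outcome expression"),
    ("language-driven", "preliminary analysis"),
    ("feedback-optimized", "evaluation and decision-making"),
]

table = cross_tabulate(example_records)
for model_type, row in table.items():
    print(model_type, row)
```

A dense table is used here so that model-stage combinations with no studies are visible as zeros, which is the kind of gap the distribution analysis is meant to reveal.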
Results The findings reveal that GAI models are increasingly integrated into urban design workflows but exhibit uneven adoption across task types and modalities. Image-driven models dominate in both early-stage analysis and final visual representation due to their high interpretability, usability, and compatibility with existing design practices. Language-driven models are commonly used in public demand analysis, participatory planning, and scenario scripting, enabled by the rise of large language models (LLMs) such as ChatGPT and DeepSeek. Structure-driven models, though less prevalent, show promise in generating street networks, land-parcel layouts, and spatial typologies using graph-based logic. Feedback-optimized models, which rely on reinforcement learning, evolutionary algorithms, and performance simulation, are the least adopted but demonstrate strong potential in multi-objective optimization and iterative decision-making. Recent research indicates an increasing use of multi-model workflows, such as text-to-image pipelines integrated with urban simulation or feedback loops. While GAI applications increasingly support design iteration, their adoption is heavily influenced by the controllability, explainability, and contextual adaptability of the models. PU and PEU vary significantly by model type: image-driven models are rated highest, while structure-driven and feedback-optimized models face usability challenges due to their complexity and low transparency.
Conclusion Although GAI has demonstrated broad applicability across the urban design process, current implementations are largely procedural and auxiliary in nature. Most models recombine existing inputs rather than construct original logic, and few possess autonomous reasoning or normative awareness. This limits their role to content augmentation rather than conceptual guidance in design development. Moreover, issues such as opaque decision logic, lack of domain-specific knowledge embedding, and poor adaptability to local planning norms hinder practical adoption. Addressing these challenges requires multi-level efforts: 1) construct structured, regionally grounded urban design datasets; 2) improve model interpretability, controllability, and responsiveness to professional input; and 3) develop modular, multi-model systems that support seamless interaction across design stages. Human-AI collaboration mechanisms, especially those based on iterative prompts and semantic feedback, must be enhanced so that AI can serve not just as a tool but as an active design partner. This review offers a comprehensive reference for scholars and practitioners seeking to understand how GAI is reshaping the logic, structure, and agency of urban design.