Generative Text-to-Image Models in Architectural Design: A Study on Relationship of Language, Architectural Quality and Creativity

Emel Cantürk Alyıldız

doi:10.5281/zenodo.10057738

Authors

Emel Cantürk Alyıldız, Kocaeli University, Faculty of Architecture and Design, Kocaeli, Türkiye

DOI:

https://doi.org/10.5281/zenodo.10057738

Keywords:

Artificial Intelligence, architectural design, text-to-image generation models, design method, architectural quality, architectural creativity

Abstract

Text-guided image generation with deep learning has advanced significantly and has attracted increasing interest since 2021. With these mostly web-based models, users can synthesise photorealistic, high-quality digital images from natural language descriptions with little or no understanding of the underlying technology. Although these AI technologies are still in their early phases, there is already an explosion in AI-generated architectural activity. While generative AI technologies propose a new design method for designers and architects, they will undoubtedly redefine the skills, knowledge and competencies that designers should be equipped with. This research focuses on understanding the “artificial intelligence – architect” interaction as a design method, specifically “language as a design driver”, and interrogates the role of the designer in AI-driven design. Within the research, the textual inputs (“prompts”) and the outputs of architectural design studies that 36 subjects generated in Midjourney (a text-to-image latent diffusion model) were analysed for possible relationships between three features of prompt language, (1) prompt length, (2) descriptive language and (3) specific architecture-related indicators, and the quality of the outputs, assessed in terms of architectural quality and architectural creativity.

