Rethinking smart cities: Synthetic data must reflect human lives, not just structures
A new study warns that the next frontier of urban planning, powered by artificial intelligence (AI) and synthetic data, risks reproducing real-world inequities if it continues to focus on physical infrastructure rather than human lives. In a paper titled "Urban Synthetic Data for Whom? Shifting the Focus from Objects to People," published in the Journal of Planning Education and Research, researchers Xinyue Ye (University of Alabama), Wei Zhai (University of Texas at Arlington), and Xishuang Dong (Prairie View A&M University) call for a sweeping shift in how cities use synthetic data and digital twins.
The authors argue that while urban synthetic data has transformed city planning, it has largely mirrored a machine's-eye view of streets, vehicles, and buildings, while excluding the dynamic realities of the people who live and move among them. Their work proposes a human-centered model of digital twins (virtual city replicas powered by synthetic data) that ethically simulates human behaviors and social diversity.
Reframing digital twins around human experience
The study identifies a critical gap in modern urban informatics. Synthetic data and digital twin models, the authors note, have been instrumental in planning urban infrastructure, but they remain "object-oriented." Current systems use satellite and sensor data to build photorealistic city models that can simulate physical appearances but rarely integrate social dynamics or lived experiences.
Ye and his co-authors advocate for human-centric digital twins, which embed behavioral diversity, demographic variation, and cultural patterns into simulations. Such systems could move beyond static visualizations toward interactive, evolving models that better represent urban vibrancy.
Through reweighting techniques, scenario sampling for vulnerable groups, and co-design validation with community stakeholders, synthetic data could start to reflect not just buildings but the people who inhabit them. This evolution, they argue, will ensure that digital twins envision inclusive urban futures rather than replicating existing social biases.
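The reweighting idea can be illustrated with a minimal sketch. It assumes simple post-stratification, where each record is weighted by its group's target share divided by its share in the sample. This is one common reweighting approach and not necessarily the one the authors have in mind. The function name, the "mobility" field, and the 20% target share are all hypothetical.

```python
from collections import Counter


def reweight(records, group_key, target_shares):
    """Return one weight per record so weighted group shares match target_shares.

    Assumes every group present in `records` has an entry in `target_shares`.
    """
    counts = Counter(r[group_key] for r in records)
    total = len(records)
    weights = []
    for r in records:
        sample_share = counts[r[group_key]] / total
        # Groups that are under-represented in the sample get weights above 1.
        weights.append(target_shares[r[group_key]] / sample_share)
    return weights


# Hypothetical synthetic agents: wheelchair users are 1 of 10 records,
# but an assumed (made-up) target share for that group is 20%.
agents = [{"mobility": "walking"}] * 9 + [{"mobility": "wheelchair"}]
target = {"walking": 0.8, "wheelchair": 0.2}
weights = reweight(agents, "mobility", target)

# Weighted share of wheelchair users after reweighting.
wheelchair_share = weights[-1] / sum(weights)
```

In this toy case the single wheelchair record receives a weight of about 2, so the group counts for roughly 20% of the weighted population instead of 10%.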
Synthetic data as a bridge between privacy and inclusion
A major challenge in using real human data for urban research is privacy. Regulations such as GDPR and HIPAA limit the availability of detailed human mobility and behavioral datasets. Synthetic data, artificially generated datasets that replicate statistical patterns without revealing real identities, offers a solution.
The study shows how generative AI techniques, including GANs, diffusion models, and physics-based simulations, are now capable of producing lifelike depictions of human movement, crowd interactions, and demographic variation. These models enable cities to experiment with ethically safe, privacy-preserving urban scenarios while maintaining scientific rigor.
For instance, Virtual Singapore uses synthetic pedestrian data to simulate how elderly or disabled residents interact with public spaces. In Stockholm, synthetic traffic simulations help test infrastructure resilience under congestion or extreme weather. During the Paris 2024 Olympics, planners deployed digital twins built with synthetic pedestrian dynamics to manage crowd flows for hundreds of thousands of daily visitors, balancing safety, accessibility, and efficiency.
The authors emphasize that synthetic data not only enhances behavioral realism but also democratizes research collaboration. Because it can be shared without breaching privacy laws, it enables universities, local governments, and tech companies to co-develop models that are more inclusive and equitable.
From ethical design to participatory urban futures
The study takes a bold stance on governance, calling for participatory synthetic data design to ensure fairness and accountability. According to Ye, Zhai, and Dong, this means involving residents, advocacy groups, and community organizations at every stage, from model creation to validation.
They outline multiple real-world examples of how this participatory approach can reshape policy. In Los Angeles, combining cellphone data with simulated "shadow populations" revealed unmet transit needs for night-shift workers, influencing new bus routes. In Glasgow, wheelchair users co-programmed virtual agents to model realistic turning radii, ensuring accessibility compliance. In Barcelona, multilingual AI interfaces increased civic participation in zoning consultations by 37%.
The authors argue that transparency and co-design can turn digital twins from technocratic tools into living civic systems. By embedding privacy-preserving algorithms such as federated learning and differential privacy, cities can maintain accountability while empowering communities to shape their own urban futures.
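To give a sense of what differential privacy means in practice, here is a minimal sketch of the Laplace mechanism applied to a single count. It is an illustration under stated assumptions, not the study's method: the function name, the example count of 120, and the choice of epsilon are hypothetical, and a production system would rely on a vetted privacy library.

```python
import random


def dp_count(true_count, epsilon, sensitivity=1.0, rng=random):
    """Return the count plus Laplace noise with scale sensitivity / epsilon.

    The difference of two independent exponential draws with the same rate
    follows a Laplace distribution centred on zero with scale 1 / rate.
    """
    rate = epsilon / sensitivity
    noise = rng.expovariate(rate) - rng.expovariate(rate)
    return true_count + noise


# Hypothetical example: 120 residents used a crossing in one hour.
rng = random.Random(0)  # seeded only so the illustration is repeatable
noisy = dp_count(120, epsilon=1.0, rng=rng)

# Averaged over many releases the noise cancels out, although any single
# release hides whether one particular person was included in the count.
samples = [dp_count(120, epsilon=1.0, rng=rng) for _ in range(20000)]
mean_release = sum(samples) / len(samples)
```

A smaller epsilon adds more noise and gives stronger privacy, at the cost of less accurate published counts.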
However, they caution that synthetic data carries limitations. It can underrepresent minority mobility behaviors or oversmooth outliers critical for resilience planning. To mitigate these risks, the study recommends hybrid models combining synthetic and real-time sensor data, as well as equity audits to detect underrepresentation before simulations are deployed in real-world policy.
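An equity audit of this kind could be as simple as comparing each group's share in the synthetic population with a reference share. The sketch below is one hypothetical way to do that. The group labels, the reference shares, and the 0.8 threshold are invented for illustration and do not come from the study.

```python
from collections import Counter


def equity_audit(synthetic_groups, reference_shares, min_ratio=0.8):
    """Flag groups whose synthetic share is below min_ratio of the reference share.

    Assumes every reference share is greater than zero.
    """
    counts = Counter(synthetic_groups)
    total = len(synthetic_groups)
    flagged = {}
    for group, ref_share in reference_shares.items():
        synth_share = counts.get(group, 0) / total
        ratio = synth_share / ref_share
        if ratio < min_ratio:
            flagged[group] = round(ratio, 2)
    return flagged


# Hypothetical data: night-shift workers make up 5% of synthetic agents,
# against an assumed 15% in the reference population.
synthetic = ["day_worker"] * 95 + ["night_shift"] * 5
reference = {"day_worker": 0.85, "night_shift": 0.15}
flagged = equity_audit(synthetic, reference)
```

Here only the night-shift group is flagged, since it appears at about a third of its reference share. A real audit would also need to consider intersecting attributes and behavioral measures, not just headcounts.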
- FIRST PUBLISHED IN:
- Devdiscourse