Academic Talk

TIGP (SNHCC) -- Generative Data Science towards Trustworthy Data Collaboration

Speaker: Dr. 王啟樺 (Department of Statistics and Data Science, University of California, Los Angeles (UCLA))
Host: TIGP (SNHCC)
Time: 2025-09-15 (Mon.) 09:00 ~ 11:00
Venue: Google Meet

Online Streaming

Google Meet joining information
Join by video call, or dial in by phone: (US) +1 508-834-8543, PIN: 351 453 313#

Abstract

The modern digital economy increasingly depends on data collaboration, yet concerns about privacy, fairness, and regulatory compliance hinder the direct sharing of sensitive data. This talk revisits the role of generative data science in enabling trustworthy collaboration by focusing on synthetic tabular data generation and its integration with differential privacy. We begin by motivating why data collaboration has become essential for branding, marketing, and digital platforms under concurrent regulations such as the EU Digital Markets Act and GDPR. We then survey the progress of generative modeling for tabular data, highlighting representative approaches including GAN-based (CTGAN), diffusion-based (TabDDPM), and language-based (GReaT) methods, and discuss how fidelity, utility, and privacy jointly determine the quality of synthetic data. Finally, we examine differential privacy as a legal and technical standard for protecting personally identifiable information, and explore its application to statistics, machine learning models, and generative frameworks. Together, these perspectives outline a pathway towards building a principled framework of generative data science that supports trustworthy data collaboration in both academic and industrial contexts.

Institute of Information Science, Academia Sinica
