Sequential data integration under dataset shift

Published: 2024-12-10  Posted by: School of Statistics and Mathematics

Title: Sequential data integration under dataset shift

Speaker: Ying Sheng (盛赢), Assistant Researcher (Academy of Mathematics and Systems Science, Chinese Academy of Sciences)

Time: Friday, December 13, 2024, 10:00 a.m.

Venue: Conference Room 305, Zhuoyuan Building (卓远楼), North Campus

Organizer: School of Statistics and Mathematics

Abstract: With the rapidly increasing availability of large-scale and high-velocity streaming data, efficient algorithms that can process data in batches without requiring expensive storage and computation resources have drawn considerable attention. An emerging challenge in developing efficient batch processing techniques is dataset shift, where the joint distribution of the collected data varies across batches. If not recognized and addressed properly, dataset shift often leads to erroneous statistical inferences when integrating data from different batches. In this paper, two shift-adjusted estimation procedures are developed for updated estimation of the parameter in the presence of dataset shift. Under prior probability shift, we can estimate the parameter and assess the degree of dataset shift simultaneously. We study the asymptotic properties of the proposed estimators and evaluate their performance in numerical studies. The proposed methodologies are illustrated with an analysis of the Ford GoBike docked bike-sharing data. This is joint work with Jing Qin and Chiung-Yu Huang.

About the speaker:

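Assistant Researcher at the Academy of Mathematics and Systems Science, Chinese Academy of Sciences. Received a Ph.D. from the Academy of Mathematics and Systems Science, Chinese Academy of Sciences, in 2018, followed by postdoctoral research at the University of California, San Francisco. Main research interests include integrative analysis, renewable estimation, and survival analysis, with more than 10 papers published in journals including Biometrics, Technometrics, Statistics in Medicine, and Statistica Sinica. Principal investigator of a Young Scientists Fund project of the National Natural Science Foundation of China.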