July 22: “Digital+” and Zhijiang Statistics Forum Lecture Series Announcement (Lectures 105-107)

Posted by: Shi Yuting (施宇婷) | Posted on: 2025-07-22

Time: Tuesday, July 22, 2025, 14:00

Venue: Conference Room 644, Comprehensive Building (综合楼)

Lecture 1: Homogeneity Pursuit in Clustered Data Analysis When Cluster Sizes Are Small

About the speaker:

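Zhang Wenyang (张文扬) is Chair Professor of Business Intelligence and Analytics at the University of Macau. He is an Associate Editor of the Journal of the American Statistical Association and the Annals of Statistics, two of the three top international journals in statistics, and of the Journal of Business & Economic Statistics, a top international journal in business and economic statistics. His research covers big data analysis, financial data analysis, high-dimensional data analysis, nonparametric modelling, time series analysis, spatial data analysis, multilevel modelling, survival analysis, and structural equation models. He has published many influential papers in leading international journals; one paper on the ABC method has been cited more than 3,500 times. He previously taught at the London School of Economics and Political Science, the University of Kent, the University of Bath, and the University of York in the UK. He was a member of the research committee of the Royal Statistical Society (the third person of Chinese descent to serve on that committee) and is a panel member, appointed by the Hong Kong government, for Hong Kong's six-yearly research assessment exercise in 2020 and 2026.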

Abstract:

Clustered data analysis is an important topic in data science. A well established approach is to assume all clusters share the same unknown parameters of interest, and the difference between different clusters is formulated and accounted for by cluster effects. Whilst this approach works very well in many issues, such as exploring the global impact of an explanatory variable on the response variable, it does not provide much insight about individual attributes of each cluster. Assuming different clusters have completely different parameters would result in too many unknown parameters, which would lead to large variances of the final estimators. Following the idea of homogeneity pursuit, various modelling approaches are proposed in recent literature to group the unknown parameters and explore the individual attributes in clustered data analysis. However, most of them are either difficult to implement or require each cluster to have reasonably big cluster size. In this talk, I will present a new approach, which is easy to implement and does not require any cluster to have big size. I will also show its asymptotic properties without assuming the size of any cluster tends to infinity. I will also use intensive simulation studies to show the approach works very well when sample size is finite. Finally, I apply the approach to a well known financial dataset to show its superiority in exploring individual attributes in clustered data analysis.

Lecture 2: Predicting Future Change-points in Time Series

About the speaker:

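Chunyip Yau (邱俊业) is a Professor in the Department of Statistics at The Chinese University of Hong Kong and received a PhD from Columbia University in 2010. Main research interests include time series, spatial statistics, environmental statistics, and change-point analysis. Yau has published nearly 60 papers in journals including the four leading statistics journals (JASA, JRSSB, AOS, and Biometrika) and the top econometrics journal JOE, and serves as Associate Editor of the Journal of Time Series Analysis and Chief Editor of the International Journal of Mathematics and Statistics.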

Abstract:

Change-point detection and estimation procedures have been widely developed in the literature. However, commonly used approaches in change-point analysis primarily focus on detecting change-points within an entire time series (off-line methods), or the quickest detection of change-points in sequentially observed data (on-line methods). Both classes of methods are concerned with change-points that have already occurred. The arguably more important question of when future change-points may occur remains largely unexplored. In this paper, we develop a novel statistical model that describes the mechanism of change-point occurrence. Specifically, the model assumes a latent process in the form of a random walk driven by non-negative innovations, and an observed process which behaves differently when the latent process belongs to different regimes. By construction, an occurrence of a change-point is equivalent to crossing a regime threshold by the latent process. Therefore, by predicting when the latent process will cross the next regime threshold, future change-points can be forecasted. We establish probabilistic properties of the model such as stationarity and ergodicity, and develop a composite likelihood-based approach for parameter estimation and model selection. Moreover, we construct predictors and prediction intervals for future change-points based on the estimated model.
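For readers new to the idea, the mechanism described in the abstract can be illustrated with a toy simulation. This is only a sketch under assumptions that are not taken from the talk: the latent random walk uses exponential increments, the observed series is Gaussian noise around a regime-specific mean, and the forecast treats the current latent value and the rate as known (in the abstract the latent process is unobserved and the parameters are estimated). The names `simulate` and `steps_to_next_threshold` are hypothetical.

```python
import random


def simulate(n, thresholds, means, rate=1.0, noise_sd=1.0, seed=0):
    """Simulate a toy latent-threshold change-point model (illustrative only).

    Assumptions made for this sketch:
    - latent process: random walk with Exponential(rate) increments,
      so every innovation is non-negative;
    - regime at time t: number of thresholds the latent value has reached;
    - observed value: Gaussian noise around the mean of the current regime.
    """
    if len(means) != len(thresholds) + 1:
        raise ValueError("need one mean per regime: len(thresholds) + 1")
    rng = random.Random(seed)
    latent, observed, regimes, change_points = [], [], [], []
    z, prev_regime = 0.0, 0  # the walk starts at 0, in regime 0
    for t in range(n):
        z += rng.expovariate(rate)  # non-negative innovation
        regime = sum(z >= c for c in thresholds)
        if regime != prev_regime:
            change_points.append(t)  # the latent walk crossed a threshold
        latent.append(z)
        regimes.append(regime)
        observed.append(rng.gauss(means[regime], noise_sd))
        prev_regime = regime
    return {"latent": latent, "observed": observed,
            "regimes": regimes, "change_points": change_points}


def steps_to_next_threshold(z, threshold, rate, n_paths=10000, seed=1):
    """Monte Carlo forecast of the number of steps until the walk reaches threshold.

    Returns the mean number of steps and a rough 90% interval taken from
    empirical quantiles of the simulated crossing times.
    """
    rng = random.Random(seed)
    hits = []
    for _ in range(n_paths):
        level, steps = z, 0
        while level < threshold:
            level += rng.expovariate(rate)
            steps += 1
        hits.append(steps)
    hits.sort()
    mean = sum(hits) / n_paths
    lower = hits[int(0.05 * n_paths)]
    upper = hits[int(0.95 * n_paths) - 1]
    return mean, (lower, upper)
```

For example, `simulate(200, thresholds=[20.0, 60.0], means=[0.0, 3.0, -2.0])` produces a series with three regimes, and `steps_to_next_threshold(0.0, 5.0, rate=1.0)` forecasts how long the walk takes to close a gap of 5. With exponential increments the expected crossing time is 1 + rate × gap, which gives a quick sanity check on the Monte Carlo mean.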

Lecture 3: Functional Principal Component Analysis with Informative Observation Times

About the speaker:

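Sang Peijun (桑培俊) is an Associate Professor in the Department of Statistics and Actuarial Science at the University of Waterloo, Canada, and has published multiple papers in statistics journals including Annals of Statistics, Biometrika, and Biometrics. Main research interests are functional data and semi-supervised learning from positive and unlabeled data (PU learning), in particular statistical inference and online learning for regression models in functional data analysis, and semiparametric models for PU learning.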

Abstract:

Functional principal component analysis has been shown to be invaluable for revealing variation modes of longitudinal outcomes, which serves as important building blocks for forecasting and model building. Decades of research have advanced methods for functional principal component analysis often assuming independence between the observation times and longitudinal outcomes. Yet such assumptions are fragile in real-world settings where observation times may be driven by outcome-related reasons. Rather than ignoring the informative observation time process, we explicitly model the observational times by a general counting process dependent on time-varying prognostic factors. Identification of the mean, covariance function, and functional principal components ensues via inverse intensity weighting. We propose using weighted penalized splines for estimation and establish consistency and convergence rates for the weighted estimators. Simulation studies demonstrate that the proposed estimators are substantially more accurate than the existing ones in the presence of a correlation between the observation time process and the longitudinal outcome process. We further examine the finite-sample performance of the proposed method using the Acute Infection and Early Disease Research Program study.
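The weighting idea mentioned in the abstract can be illustrated at a single time point. The sketch below is not the speaker's estimator: it assumes the observation intensities are known rather than modelled by a counting process, it uses a plain weighted average instead of weighted penalized splines, and the data are constructed by hand. The function names are hypothetical.

```python
def naive_mean(outcomes):
    """Unweighted mean of the observed outcomes."""
    return sum(outcomes) / len(outcomes)


def iiw_mean(outcomes, intensities):
    """Inverse-intensity-weighted mean of the observed outcomes.

    Each observation is weighted by 1 / intensity, so outcomes that were
    unlikely to be observed count for more. Intensities are assumed known
    here; in practice they would have to be estimated.
    """
    if len(outcomes) != len(intensities):
        raise ValueError("outcomes and intensities must have equal length")
    if any(lam <= 0 for lam in intensities):
        raise ValueError("intensities must be positive")
    weights = [1.0 / lam for lam in intensities]
    return sum(w * y for w, y in zip(weights, outcomes)) / sum(weights)


def toy_observed_sample():
    """Hand-built sample where being observed depends on the outcome.

    Imagine 100 subjects: 50 with outcome 10 who are observed with
    probability 0.8, and 50 with outcome 0 who are observed with
    probability 0.2. The expected observed sample is 40 and 10 records.
    The population mean is 5.
    """
    outcomes = [10.0] * 40 + [0.0] * 10
    intensities = [0.8] * 40 + [0.2] * 10
    return outcomes, intensities
```

On this toy sample the naive mean is 8, because high outcomes are over-represented among the observations, while the weighted mean comes out at about 5, matching the population mean the sample was built from.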