
CANSSI Quebec's Inaugural Postdoc Day

September 7, 2023
By Christopher Plenzich


CANSSI (Canadian Statistical Sciences Institute) Quebec hosted its inaugural Postdoc Day on September 7th, 2023. The event took place at Concordia University's 4th Space and featured eight postdoctoral presenters from institutions across Quebec, showcasing their research in the statistical sciences.

The vision for this event came from CANSSI Quebec's Interim Regional Director, Mélina Mailhot, and project coordinator, Christopher Plenzich. Their efforts attracted a strong turnout of over 50 attendees, both in person and online. The event was a great success and has set the stage for more exciting events from CANSSI Quebec.

CANSSI Quebec thanks Concordia University and the 4th Space team for their collaboration.

If you missed out, don't worry! You can catch all the presentations on YouTube.


10:00am - 10:10am Opening Remarks  
10:10am - 10:40am Lenin Del Rio Amador, Université du Québec à Montréal  
10:45am - 11:15am Arthur Chatton, Université de Montréal  
11:20am - 11:50am Cong Jiang, Université de Montréal  
11:55am - 12:25pm Bhargob Deka, Polytechnique Montréal  
12:25pm - 1:30pm Lunch Break  
1:30pm - 2:00pm Laetitia Jeancolas, Concordia University  
2:05pm - 2:35pm Roxane Turcotte, Université du Québec à Montréal  
2:40pm - 3:10pm Marlon Moresco, Concordia University  
3:15pm - 3:45pm Rishikesh Yadav, HEC Montreal  
3:45pm - 4:00pm Closing Remarks
4:00pm - 6:00pm Reception (LB921-04)  

Presentation Abstracts

A global view of the ENSO-floods correlations and the future socioeconomic impact of floods based on hybrid simulations

Lenin del Rio Amador, Mathieu Boudreault and David Carozza

Department of Mathematics, Université du Québec à Montréal, Montreal, Quebec, Canada

Floods are generally considered one of the most significant extreme events in terms of casualties and losses. Although they are a direct consequence of weather and climate extremes, a full understanding of the influence of climate change on their socioeconomic impact requires considering the human component in terms of exposure and vulnerability. At the same time, a global view of flood risk based on large catalogues of physically consistent events is of key interest to environmental research, climate science, economics, and financial risk management. However, the lack of flexibility to account for socioeconomic variables and the high computational cost of producing large global event sets of flood impact for future climate scenarios make regional hydrological models impractical for this purpose. As an alternative, in this work we use the global flood modeling framework developed by Carozza and Boudreault (C&B) (Carozza & Boudreault, 2021).

The C&B model applies statistical and machine learning methods to relate historical flood occurrence and impact data with climatic, watershed, and socioeconomic factors for 4,734 basins at Pfafstetter level 5 globally. The model is climate-consistent, global, fast, flexible, and ideal for applications that do not necessarily require high-resolution flood mapping. After training with observational data over the period 1986-2017, the climate variables are replaced with bias-corrected output from the NCAR CESM Large Ensemble (40 members) to project flood impact over the period 1981-2060. The resulting 3,200 effective years of simulated data were used to determine the impacts of ENSO on flooding and to produce a variety of event sets to assess how climate change and future socioeconomic growth may affect flood impact.

Personalised longitudinal super learning: an application in hemodiafiltration

Dynamic prediction models provide predicted outcome values that can be updated over time for an individual as new measurements become available. Such models are crucial for giving users the most useful information available at the current time. Furthermore, personalised predictive models target a specific individual and use data from a "historical" cohort (other individuals with previously recorded data, but for whom no prediction will be made) to improve the accuracy of the predictions for the targeted individual. The personalised online super learner (POSL, Malenica et al., 2023) is an ensemble machine learning method for personalised dynamic predictions. It can combine multiple parametric and/or non-parametric approaches (referred to as candidate learners) to obtain the best (convex or non-convex) combination of their predictions through cross-validation. We illustrate the use of the personalised online super learner for predicting the convection volume of patients undergoing hemodiafiltration. Models trained on the full trajectory of the historical cohort performed better for predictions at earlier times, while models trained on the history of the targeted individual outperformed at later times. The non-convex POSL outperformed the candidate learners in terms of accuracy, while being slightly less well-calibrated. Furthermore, non-convex POSL achieved the best discrimination with an AUROC equal to 0.82 (95% CI from 0.80 to 0.83). We will end the presentation by discussing the choices and challenges of implementing POSL.
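The core idea behind a super learner, combining candidate learners with cross-validated weights, can be sketched on simulated data. This is a minimal illustration with hypothetical learners and a simple grid search over convex weights, not the POSL implementation of Malenica et al. (2023):

```python
import numpy as np

def convex_weights(cv_preds, y):
    """Grid-search the convex combination of two candidate learners'
    held-out predictions that minimizes mean squared error."""
    best_w, best_loss = 0.0, np.inf
    for w in np.linspace(0, 1, 101):
        ensemble = w * cv_preds[0] + (1 - w) * cv_preds[1]
        loss = np.mean((y - ensemble) ** 2)
        if loss < best_loss:
            best_w, best_loss = w, loss
    return best_w

rng = np.random.default_rng(0)
y = rng.normal(size=200)                       # outcome on held-out data
preds_a = y + rng.normal(scale=0.5, size=200)  # accurate candidate learner
preds_b = rng.normal(size=200)                 # uninformative candidate learner
w = convex_weights([preds_a, preds_b], y)
# Cross-validation places most of the weight on the accurate learner
```

In the actual POSL, the weights are re-estimated online as new measurements for the targeted individual arrive, and the library of candidates mixes learners trained on the historical cohort with learners trained on the individual's own history.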

Vaccine effectiveness estimation under the test-negative design: identifiability and efficiency theory for causal inference under conditional exchangeability

The test-negative design (TND) is routinely used for monitoring seasonal flu vaccine effectiveness and has recently become integral to COVID-19 vaccine surveillance. Distinct from the case-control study, it recruits participants with a common symptom presentation and tests them for the target infection. Positive tests are considered "cases," while negative tests are "controls." Logistic regression has traditionally been used to adjust for confounders when estimating vaccine effectiveness under the TND. However, it may be biased if effect modification by a confounder exists. We first review an inverse probability of treatment weighting estimator for the marginal risk ratio that is valid under effect modification but requires parametric modeling for vaccination probability. To address this limitation, we propose a novel doubly robust and efficient estimator of the marginal risk ratio. We theoretically and empirically demonstrate the parametric convergence rates achieved through machine learning of the nuisance functions.
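The inverse probability of treatment weighting (IPTW) step mentioned above can be illustrated on simulated data. This is a generic IPTW sketch of a marginal risk ratio with a hypothetical data-generating process (confounder `x`, vaccination `v`, infection `y`), not the doubly robust TND estimator proposed in the talk; for simplicity the propensity score is taken as known rather than estimated:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000
x = rng.normal(size=n)                     # confounder
p_v = 1 / (1 + np.exp(-0.5 * x))           # true vaccination probability
v = rng.binomial(1, p_v)                   # vaccination indicator
p_y = 1 / (1 + np.exp(-(-1.0 - 1.0 * v + 0.8 * x)))
y = rng.binomial(1, p_y)                   # infection outcome

# Inverse-probability weights (in practice the propensity would be
# estimated, e.g. by logistic regression of v on x)
w = v / p_v + (1 - v) / (1 - p_v)

# Weighted risks in each arm, then the marginal risk ratio
risk_vax = np.sum(w * y * v) / np.sum(w * v)
risk_unvax = np.sum(w * y * (1 - v)) / np.sum(w * (1 - v))
rr = risk_vax / risk_unvax
ve = 1 - rr                                # vaccine effectiveness
```

Because the weights balance the confounder across arms, `rr` targets the marginal risk ratio even when the vaccine effect varies with `x`, which is exactly where a misspecified logistic regression can be biased.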

Online Quantification of Aleatory Uncertainties in Probabilistic Models

Engineering problems rely on probabilistic models for decision-making tasks. State-space models and Bayesian neural networks are two such models, used for time series forecasting and regression tasks, respectively. These approaches involve unknown parameters, not only for modeling physical phenomena but also for quantifying the model's epistemic and aleatory uncertainties. Even though a closed-form solution may exist for estimating the mean and the epistemic uncertainties, it remains a challenge to develop a computationally efficient method for estimating the aleatory uncertainties; such a method would enable scaling probabilistic models to large-scale implementations. This seminar presents an analytical Bayesian inference method called approximate Gaussian variance inference (AGVI) in the context of time series and regression tasks. AGVI enables closed-form online estimation of the error covariance matrix quantifying the aleatory uncertainties. Two applied examples are included for time series modeling: the first case study compares the performance of AGVI with existing adaptive Kalman filtering (AKF) approaches, and the second shows its application on real datasets from a concrete gravity dam. The method is also thoroughly compared with other Bayesian approaches for small and large regression tasks. The proposed method can match or exceed the performance of existing approaches in terms of predictive capacity while being up to orders of magnitude faster.
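AGVI itself is beyond a short sketch, but the "online" aspect, updating an uncertainty estimate one observation at a time without storing the history, can be illustrated with Welford's streaming variance algorithm on simulated data. This is a generic stand-in for the idea of online error-variance estimation, not the AGVI method:

```python
import numpy as np

class OnlineVariance:
    """Welford's online algorithm: running mean and variance
    updated in O(1) per observation."""

    def __init__(self):
        self.n, self.mean, self.m2 = 0, 0.0, 0.0

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        return self.m2 / self.n if self.n else 0.0

rng = np.random.default_rng(4)
ov = OnlineVariance()
# Stream observations with true mean 2.0 and true variance 9.0
for obs in rng.normal(loc=2.0, scale=3.0, size=20_000):
    ov.update(obs)
# ov.variance converges toward 9.0 as observations stream in
```

AGVI extends this kind of recursive update to the full error covariance matrix inside a Bayesian filtering framework, which is what makes the closed-form online estimation of aleatory uncertainty possible.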

Does the amyloid status impact the neuro vascular coupling in subjective memory loss population?

A growing body of research suggests that vascular dysfunction contributes to the pathophysiology of Alzheimer's disease (AD). Several studies have reported reductions in cerebral blood flow (CBF) across AD clinical stages [1], but the task-related changes of microvascular perfusion (neurovascular coupling) remain unknown at the AD preclinical stage. The goal of this study is to investigate whether amyloidosis in cognitively normal individuals with subjective memory complaints impacts their neurovascular coupling, to better understand AD pathogenesis.

Semi‐parametric modeling of risk exposure with monotonicity constraints in automobile insurance

Generalized additive models (GAM) and generalized additive models for location, scale, and shape (GAMLSS) allow the inclusion of smoothing functions in the modeling of model parameters. This allows a much more flexible structure between an explanatory variable and a response variable, since linearity is no longer imposed as in a generalized linear model (GLM), making it easier to analyze the relationship between variables. However, when working with data from a practical context, such flexibility is not always desirable, since certain shape constraints may be necessary. In car insurance data, risk exposure should always be an increasing function: the greater the risk exposure, the higher the probability of an accident, and this increase should be reflected in the premium. In this work, shape constraints on risk exposure measured by mileage are included in GAM and GAMLSS models. We use data from a Canadian company to illustrate the proposed approaches, and we discuss the relevance of mileage as a measure of risk exposure and how it should be included in the modeling.
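The monotonicity idea can be illustrated in its simplest form with the pool-adjacent-violators algorithm (PAVA), which fits the best non-decreasing step function to noisy data. The shape-constrained smoothers used with GAM/GAMLSS in the talk are more flexible; this sketch, with made-up claim frequencies against mileage, shows only the monotone-fit principle:

```python
import numpy as np

def pava(y):
    """Pool-adjacent-violators: least-squares non-decreasing fit to y
    (unit weights). Adjacent blocks that violate monotonicity are
    pooled and replaced by their mean."""
    fit = y.astype(float).copy()
    blocks = [[i] for i in range(len(y))]   # indices currently pooled together
    i = 0
    while i < len(blocks) - 1:
        a, b = blocks[i], blocks[i + 1]
        if fit[a[0]] > fit[b[0]]:           # violation: pool the two blocks
            merged = a + b
            fit[merged] = np.mean(y[merged])
            blocks[i:i + 2] = [merged]
            i = max(i - 1, 0)               # re-check against the previous block
        else:
            i += 1
    return fit

# Hypothetical claim frequencies by increasing mileage band
freq = np.array([0.02, 0.05, 0.04, 0.06, 0.09, 0.08, 0.12, 0.11, 0.15, 0.18])
smooth = pava(freq)
# smooth is non-decreasing, as risk exposure should be in mileage
```

In a premium model, such a constraint guarantees that a policyholder who drives more is never charged less for the exposure component, which is the interpretability requirement the abstract describes.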

Uncertainty Propagation and Dynamic Robust Risk Measures

This paper introduces a novel framework for assessing uncertainty in risk measures and proposes dynamic robust risk measures as a solution. Uncertainty is a critical aspect in analyzing dynamic risks, particularly in domains such as insurance and finance. We present a new notion of dynamic uncertainty sets designed explicitly for discrete stochastic processes. These uncertainty sets capture the inherent uncertainty surrounding random losses or models, accounting for factors such as distributional ambiguity. Our framework provides a comprehensive representation of uncertainty by considering a range of viable candidates for the actual loss or model.

Within this framework, we define dynamic robust risk measures as the supremum of all candidates' risks in the uncertainty set. These measures offer a robust and reliable approach to quantifying risk, considering the evolving nature of uncertainty over time. Our approach provides a more accurate and realistic risk assessment by incorporating dynamic aspects, while the static case remains a particular case of our framework. The proposed framework has practical implications in various domains, such as insurance and finance, where accurate risk assessment is vital for decision-making. It enables practitioners to better understand and manage the uncertainties inherent in dynamic risk scenarios. By incorporating dynamic robust risk measures, decision-makers can make more informed choices that account for the evolving nature of uncertainty.
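The supremum construction can be shown in a static, Monte Carlo toy example: a robust risk measure evaluates a base risk measure (here expected shortfall) under each candidate model in a finite uncertainty set and takes the worst case. The candidate models below are hypothetical, and the dynamic version in the paper would condition on the information available at each time:

```python
import numpy as np

def expected_shortfall(losses, alpha=0.95):
    """Average of the worst (1 - alpha) fraction of simulated losses."""
    tail = np.sort(losses)[int(alpha * len(losses)):]
    return tail.mean()

rng = np.random.default_rng(2)
n = 100_000
# Finite uncertainty set: three candidate models for the same loss,
# capturing distributional ambiguity
candidates = [
    rng.normal(0.0, 1.0, n),      # baseline model
    rng.normal(0.2, 1.2, n),      # shifted, more volatile model
    rng.standard_t(4, n),         # heavier-tailed model
]

# Robust risk: supremum of the risk over the uncertainty set
robust_es = max(expected_shortfall(c) for c in candidates)
```

By construction the robust value is at least the risk under the baseline model, so a decision-maker using it is protected against any candidate in the set being the true loss model.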

The conditional spatial extremes model of Wadsworth and Tawn, which focuses on extreme events given threshold exceedance at a site, has garnered a lot of attention as a flexible way to model large-scale spatiotemporal events. We consider extensions that combine Gaussian Markov random field residual processes along with data augmentation schemes for dealing with left-censored realizations, exploiting the sparsity of the precision matrix obtained through the basis function approximation of the Gaussian process. Models are fitted using Markov chain Monte Carlo methods through a combination of Metropolis-within-Gibbs and Langevin steps, e.g., Metropolis-adjusted Langevin algorithm (MALA). We showcase the scalability of the approach using precipitation data from British Columbia.
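The Langevin steps mentioned above follow a standard recipe: propose by a gradient-informed drift plus Gaussian noise, then accept or reject with a Metropolis correction. The sketch below runs a single-chain MALA on a hypothetical target (a standard bivariate Gaussian), not the censored spatial-extremes posterior from the talk:

```python
import numpy as np

def mala_step(x, log_pi, grad_log_pi, eps, rng):
    """One Metropolis-adjusted Langevin (MALA) update with step size eps."""
    prop = x + 0.5 * eps**2 * grad_log_pi(x) + eps * rng.normal(size=x.shape)

    def log_q(b, a):
        # Log density (up to a constant) of proposing b from a
        mean = a + 0.5 * eps**2 * grad_log_pi(a)
        return -np.sum((b - mean) ** 2) / (2 * eps**2)

    log_alpha = log_pi(prop) - log_pi(x) + log_q(x, prop) - log_q(prop, x)
    if np.log(rng.uniform()) < log_alpha:
        return prop, True
    return x, False

# Toy target: standard bivariate Gaussian (log density up to a constant)
log_pi = lambda x: -0.5 * np.sum(x**2)
grad_log_pi = lambda x: -x

rng = np.random.default_rng(3)
x = np.zeros(2)
samples = []
for _ in range(5000):
    x, _ = mala_step(x, log_pi, grad_log_pi, eps=1.0, rng=rng)
    samples.append(x)
samples = np.array(samples)
# After burn-in, the samples approximate draws from the target
```

In the spatial-extremes setting, `grad_log_pi` would be the gradient of the log posterior, where the sparse precision matrix of the Gaussian Markov random field keeps each gradient evaluation cheap; that sparsity is what makes the approach scale to large precipitation datasets.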


© Concordia University