Asymptopia — Living in a world of big data; Spatial Statistics; Statistical learning for spectroscopic data; The use of survival analysis
Day 2 – Foundations of Data Science II (Tuesday 7th December)
9.30am: Prof. Brendan Murphy, UCD. Topic: Asymptopia — Living in a world of big data
Brief synopsis: This talk will review a selection of large-sample theory results from statistics that are relevant in the era of big data. Topics reviewed will include parameter estimation, density estimation, function estimation and Bayesian methods. Participants should have RStudio installed on their machines to run the code.
Speaker biography: Brendan Murphy is Full Professor and Head of School in the School of Mathematics and Statistics at University College Dublin. He has research interests in clustering, classification and latent variable modeling. He is interested in applications from social sciences, food science, sensors, medicine and biology. He is currently Editor for Social Sciences and Government for the Annals of Applied Statistics. He has recently co-authored a research monograph on Model-Based Clustering and Classification.
11.00am: Dr Shirin Moghaddam, UL. Topic: The use of survival analysis in prediction of biochemical recurrence in prostate cancer
Brief synopsis: Prostate cancer (PCa) represents a significant healthcare problem due to the dilemmas associated with its detection and treatment, especially given the projected increase in its incidence. The treatment choice lies between active surveillance, which spares unnecessary intervention, and active treatment, which is associated with considerable side effects and quality-of-life issues; knowing a patient's likely long-term outcome would better inform that choice. Biochemical recurrence (BCR) is the first sign of treatment failure, so predicting BCR before treatment would help guide the decision. This presentation will discuss the benefit of using survival models to predict time to BCR and will present a predictive model that could improve clinical decision making for physicians and patients.
Speaker biography: Shirin Moghaddam is a Lecturer in Statistics & Data Science at the University of Limerick. She obtained her PhD from NUI Galway on Bayesian Imputation of Right Censored Data in Time-To-Event Studies. Her research interests include survival analysis, Bayesian methods and machine learning, in particular their application in cancer research. Shirin is the vice-chair of the Young Statisticians’ Section of the Irish Statistical Association and also a member of Cancer Trials Ireland.
12.00pm: Dr Katarina Domijan, MU. Topic: Statistical learning for spectroscopic data
Brief synopsis: This lecture will introduce examples of data arising in chemometric applications. Classical methods as well as some state-of-the-art machine learning models will be presented.
Speaker biography: Dr. Domijan is a Lecturer/Assistant Professor in the Department of Mathematics and Statistics at Maynooth University. Her expertise lies in applying statistics and machine learning techniques to analyse data of complex structure that arise in a variety of applications. Along with modelling, her research also concerns model visualization and interpretability as well as statistical computing. In particular, she works on developing software tools that address the interpretability deficit of complex machine learning models fitted to high-dimensional datasets.
2.00-5.00pm: Prof. Chris Brunsdon, MU. Topic: Spatial Statistics
Brief synopsis: In this talk Chris will explain the key ideas of spatial statistics and provide a number of practical examples using R, in particular the ‘mgcv’ package for analysis and the ‘tmap’ and ‘sf’ packages to manipulate and visualise geographical data.
The talk is broadly divided into three sections:
1. Working with geographical data in R – including how to read in spatial data and use it to create maps via ‘sf’ and ‘tmap’
2. Spatial Statistical models with area-based data – including the use of Markov random fields to model broadband uptake in Ireland, based on census data for Irish electoral divisions
3. Spatial Statistical models with point-based data – including the analysis of historical rainfall data in Ireland.
Practical worked examples will be available in blog form for reference after the talk is finished.
R and all of the standard packages (these will be included in a standard installation of R) – with some further ones:
The SFI Centre for Research Training in Foundations of Data Science will train a cohort of PhD students with world-class foundational understanding in the horizontal themes of Applied Mathematics, Statistics, and Machine Learning.