4EU+ summer school: Modern probability theory and its application

Warsaw, Poland

September 24, 2021, 08:00

presentation part I (pdf)
presentation part II (pdf)
presentation part III (pdf)
R session (R)

M. Bogicevic: The Tukey median - State of the art algorithms, implementations and comparisons

Department of Probability and Math. Statistics, MFF
Praha, Czech Republic

December 3, 2019, 10:40, Praktikum KPMS


This presentation is a short survey of Tukey median algorithms. It covers the implemented algorithms in the chronological order of their appearance and compares their properties with regard to computational complexity and execution time. Each algorithm is tested on generated and real data sets.
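As a point of reference for what such algorithms compute, the following is a minimal sketch of the halfspace (Tukey) depth of a point in the plane, and of the deepest sample point. It is a naive quadratic-time illustration and not one of the algorithms surveyed in the talk; the function names are hypothetical. It relies on the fact that the count of points in a closed halfplane through p only changes when the boundary line passes through a data point, so it is enough to inspect lines through p and each data point, rotated slightly either way.

```python
def halfspace_depth(p, points):
    """Number of sample points in the emptiest closed halfplane through p."""
    px, py = p
    shifted = [(x - px, y - py) for x, y in points]
    at_p = sum(1 for d in shifted if d == (0, 0))   # points equal to p lie in every halfplane
    others = [d for d in shifted if d != (0, 0)]
    best = len(others)
    for dx, dy in others:
        left = right = front = back = 0
        for ex, ey in others:
            cross = dx * ey - dy * ex
            if cross > 0:
                left += 1
            elif cross < 0:
                right += 1
            elif dx * ex + dy * ey > 0:
                front += 1      # on the line, same side of p as (dx, dy)
            else:
                back += 1       # on the line, opposite side of p
        # A slight rotation of the line keeps one open side and one of the two rays.
        best = min(best, min(left, right) + min(front, back))
    return at_p + best


def deepest_sample_point(points):
    """Sample point of maximal depth (the Tukey median itself need not be a sample point)."""
    return max(points, key=lambda p: halfspace_depth(p, points))


square = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
print(halfspace_depth((1, 1), square), deepest_sample_point(square))
```

Dividing the returned count by the sample size gives the depth on the usual scale from 0 to 1.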

P. Laketa: The idea of random environment integer valued autoregressive models

Department of Probability and Math. Statistics, MFF
Praha, Czech Republic

December 3, 2019, 10:40, Praktikum KPMS


Recently, a new type of INAR model has been introduced that overcomes problems with data showing non-stationary characteristics. The new models are more flexible. They are constructed using a Markov chain, called the random environment process, which contains the states of the environment in which the main process is observed.
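To make the construction concrete, here is a small simulation sketch of a first-order model of this kind. The abstract does not specify the thinning operator or the innovation distribution, so binomial thinning and Poisson innovations are assumptions made only for illustration, as are all function and parameter names. A Markov chain selects the environment state, and the state selects the thinning probability and the innovation mean used at that time step.

```python
import math
import random


def thin(rng, alpha, count):
    """Binomial thinning: each of `count` units survives with probability alpha."""
    return sum(1 for _ in range(count) if rng.random() < alpha)


def poisson(rng, mean):
    """Poisson draw by Knuth's multiplication method (adequate for small means)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_re_inar1(n_steps, transition, alphas, means, seed=0):
    """Simulate X_t = alpha(z_t) o X_{t-1} + eps_t(z_t), with z_t a Markov chain."""
    rng = random.Random(seed)
    states = list(range(len(transition)))
    z, x = 0, 0
    env, series = [], []
    for _ in range(n_steps):
        z = rng.choices(states, weights=transition[z])[0]    # environment moves first
        x = thin(rng, alphas[z], x) + poisson(rng, means[z])  # then the count process
        env.append(z)
        series.append(x)
    return env, series


env, series = simulate_re_inar1(
    200,
    transition=[[0.95, 0.05], [0.10, 0.90]],
    alphas=[0.3, 0.6],
    means=[1.0, 5.0],
)
print(series[:20])
```

With two persistent states and clearly different innovation means, the simulated counts switch between a low and a high regime, which is the kind of non-stationary behaviour the abstract refers to.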

C. Schütt: Flag numbers and floating bodies

Noon Lecture, Department of Applied Mathematics, MFF
Praha, Czech Republic

September 4, 2018, 12:15, room S6


We establish asymptotic results for weighted floating bodies of polytopes, which show connections between the weighted volume of a polytope and its flags. This connection is established by introducing the notion of flag simplices, which translate between the metric and combinatorial structure.

E. Werner: Floating bodies and approximation of convex bodies by polytopes

Noon Lecture, Department of Applied Mathematics, MFF
Praha, Czech Republic

August 28, 2018, 12:15, room S8

presentation (pdf)


How well can a convex body be approximated by a polytope? This is a fundamental question in convex geometry, also in view of applications in many other areas of mathematics and related fields. It often involves side conditions like a prescribed number of vertices and a requirement that the body contains the polytope or vice versa. Accuracy of approximation is often measured in the symmetric difference metric, but other metrics can be and have been considered. We will present several results, mostly related to approximation by "random polytopes". We will introduce floating bodies and an affine invariant from affine differential geometry associated with them, the affine surface area. This affine invariant appears naturally as an important ingredient in approximation questions.
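For orientation, the two objects named in the abstract can be written as follows. This is a sketch in standard notation, which is an assumption since the abstract fixes none: K is a convex body in R^n, H^+ and H^- are the two closed halfspaces bounded by a hyperplane H, kappa is the Gauss curvature of the boundary, and c_n is a positive constant depending only on the dimension.

```latex
% Convex floating body: intersect all halfspaces whose complements cut off volume delta from K
\[
  K_\delta \;=\; \bigcap_{\operatorname{vol}_n(K \cap H^-) \,=\, \delta} H^+ .
\]
% Affine surface area of K
\[
  \operatorname{as}(K) \;=\; \int_{\partial K} \kappa(x)^{\frac{1}{n+1}} \, d\mu_{\partial K}(x) .
\]
% The volume lost when passing to the floating body recovers the affine surface area
\[
  \lim_{\delta \to 0} \frac{\operatorname{vol}_n(K) - \operatorname{vol}_n(K_\delta)}{\delta^{\frac{2}{n+1}}}
  \;=\; c_n \, \operatorname{as}(K) .
\]
```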

Workshop on Functional Data Analysis

July 12, 2018
A series of informal talks about the diverse aspects of functional data analysis. The workshop is open to everyone.

Thursday, July 12, 2018 - K6 - KPMS, MFF, Sokolovská 83, Praha

  • 13:00-14:00 Frédéric Ferraty (University of Toulouse):
     Estimation of temperature-dependent growth profiles of fly larvae with application to criminology
  • 14:00-14:30 Jiří Dvořák:
     Envelope tests in functional data analysis
  • 14:30-15:00 Veronika Římalová (Palacký university Olomouc):
     An inferential framework for the analysis of spatio-temporal geochemical data
  • 15:00-15:30 Zuzana Rošťáková (Slovak Academy of Sciences):
     Probabilistic modelling and functional data analysis of sleep structure
  • 15:30-16:00 Stanislav Nagy:
     Nonparametric analysis of the shape of random curves

Frédéric Ferraty: Estimation of temperature-dependent growth profiles of fly larvae with application to criminology

presentation (pdf)

It is not unusual in cases where a body is discovered that it is necessary to determine a time of death or, more formally, a post mortem interval (PMI). Forensic entomology can be used to estimate this PMI by examining evidence of insect larvae growth obtained from the body. In this talk, we propose a method to estimate the hatching time of larvae (or maggots) based on their lengths, the temperature profile at the crime scene and experimental data on larval development. This requires the estimation of a time-dependent growth curve from experiments where larvae have been exposed to a relatively small number of constant temperature profiles. Since the temperature influences the developmental speed, a crucial step is the time alignment of the curves at different temperatures. We then propose a model for time-varying temperature profiles based on the local growth rate estimated from the experimental data. This allows us to find the most likely hatching time for a sample of larvae from the crime scene. We explore via simulations the robustness of the method to errors in the estimated temperature profile and apply it to the data from two criminal cases from the United Kingdom. Asymptotic properties are also provided for the estimators of the growth curves and the hatching time.

Joint work with D. Pigoli, J.A.D. Aston, A. Mazumder, C. Richards and M.J.R. Hall.

Jiří Dvořák: Envelope tests in functional data analysis

presentation (pdf)

In spatial statistics it has been customary in recent decades to use functional summary statistics rather than numerical ones to describe observed data or theoretical properties of a model. This makes statistical inference for such data challenging, and this is where functional data analysis comes in. In this talk we will review the concept of envelope tests, which has been developed in recent years and has become popular in the field of spatial statistics. The envelope test deals with the situation where we have observed a single function (as is common in this field) and we want to decide whether it is extreme with respect to a given population of functions. The advantage of the envelope test is that it provides a graphical interpretation and indicates the reason for a (possible) rejection. This is important in applications as it facilitates coming up with interpretations and new hypotheses.
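The idea can be sketched in a few lines for functions evaluated on a common grid. The version below is a simplified rank-based variant written for illustration, assuming no ties among the function values; the function names and the parameter k are hypothetical, and the tests discussed in the talk may include refinements not shown here. Each function receives the smallest of its pointwise two-sided ranks, the observed function is compared with the simulated ones through these extreme ranks, and the grid points where it leaves the k-th envelope indicate the reason for rejection.

```python
def extreme_ranks(curves):
    """For each curve, the smallest two-sided pointwise rank over the grid."""
    n, m = len(curves), len(curves[0])
    ranks = [n] * n
    for r in range(m):
        column = [c[r] for c in curves]
        for i, v in enumerate(column):
            from_below = 1 + sum(1 for w in column if w < v)
            from_above = 1 + sum(1 for w in column if w > v)
            ranks[i] = min(ranks[i], from_below, from_above)
    return ranks


def rank_envelope_test(observed, simulated, k=2):
    """Return the p-value interval and the grid indices where `observed` leaves the envelope."""
    curves = [observed] + list(simulated)
    n, m = len(curves), len(observed)
    ranks = extreme_ranks(curves)
    p_lower = sum(1 for rk in ranks if rk < ranks[0]) / n    # liberal bound
    p_upper = sum(1 for rk in ranks if rk <= ranks[0]) / n   # conservative bound
    outside = []
    for r in range(m):
        column = sorted(c[r] for c in curves)
        lower, upper = column[k - 1], column[-k]   # k-th smallest and k-th largest value
        if observed[r] < lower or observed[r] > upper:
            outside.append(r)
    return p_lower, p_upper, outside


# Toy data: nine constant "simulated" functions and an observed one that jumps at the end.
simulated = [[float(j)] * 3 for j in range(1, 10)]
observed = [5.5, 5.5, 20.0]
print(rank_envelope_test(observed, simulated))
```

In the toy data the observed function is unremarkable at the first two grid points and leaves the envelope only at the last one, which is exactly the graphical information the envelope test is meant to provide.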

Veronika Římalová: An inferential framework for the analysis of spatio-temporal geochemical data

presentation (pdf)

In the broad framework of functional data analysis, the aim of this contribution is to introduce a functional regression framework for modelling space-time geochemical measurements. The motivating data set includes monthly measurements of potassium chloride pH taken from a site near Brno, Czech Republic. Sampling locations were selected with the purpose of testing whether the site can be divided into two parts, agricultural and forest soil, according to its chemical properties. We suggest treating measurements as functions of time distributed in space and propose a function-on-scalar spatial regression model to describe the temporal distribution of the geochemical elements. To test for possible differences between the two soil types, we propose a non-parametric functional testing procedure. The inference cannot be done directly on the original observations due to their dependence on spatial coordinates; instead, it is performed on the residuals of the spatial functional model. Several regression models were fitted to the data in order to derive spatially independent and thus permutable residuals. The proposed methodology will be demonstrated on the available geochemical dataset, and a geological interpretation of the results will be given.

Joint work with A. Menafoglio, A. Pini and E. Fišerová.

Zuzana Rošťáková: Probabilistic modelling and functional data analysis of sleep structure

presentation (pdf)

Sleep is a dynamical process which plays an important role in our lives. Its structure, quality and length influence humans’ daily behaviour, affectivity, mood and also health. Current studies dealing with the relationship between sleep structure and humans’ physiological state or well-being measures are based on the extraction of one-dimensional sleep characteristics and their correlation with variables representing daily life behaviour. However, we hypothesise that this can lead to the loss of important information about the sleep dynamics. The Probabilistic sleep model (PSM) provides an alternative way of sleep modelling. The PSM operates on three-second-long time segments of the EEG signal; for each segment, the probability that it belongs to each of the sleep states, called microstates, is computed. Considering the probability values of a sleep microstate as a function of time, we obtain a sleep probabilistic curve. In this presentation we provide an overview of selected techniques of functional data analysis which may be useful in the analysis of sleep structure and its relationships with measures representing daily behaviour (subjectively scored sleep quality, physiological factors, cognitive tests). We mainly focus on functional cluster analysis, time alignment of curves and multilevel functional principal component analysis.

Joint work with R. Rosipal.

Stanislav Nagy: Nonparametric analysis of the shape of random curves

presentation (pdf)

In many situations, the shape of functional observations is an important feature that must be taken into account in statistical analysis. The information about the shape properties can be extracted from the derivatives of the sample trajectories. However, this approach can be applied only if the curves are regular and smooth, and the derivatives must be estimated. We present a simple alternative to this methodology based on simultaneous evaluation of multivariate projections of the data. This technique does not require smoothness or continuity, yet provides fine recognition of shape traits of the curves. The idea is illustrated on - but not limited to - functional data depth.

Introductory Workshop PRIMUS

January 30 and 31, 2018
A series of informal talks of the members of the research team on their past and present research activities, and general research interests. The workshop is open to everyone, no registration is needed.

Tuesday, January 30, 2018 - "Praktikum" KPMS, MFF

  • 09:00-10:00 Daniel Hlubinka:
     Estimation of level sets
  • 10:00-11:00 Jiří Dvořák:
     On point processes, Monte Carlo testing and stochastic reconstruction
  • 11:00-12:00 Pavel Valtr:
     From geometry to data depth
  • 12:00-13:00 Jan Rataj:
     Curvature measures and integral-geometric formulae

Wednesday, January 31, 2018 - "Praktikum" KPMS, MFF

  • 16:00-17:00 Martin Balko:
     Ramsey-type problems on ordered hypergraphs and connections to discrete geometry

Jiří Dvořák: On point processes, Monte Carlo testing and stochastic reconstruction

presentation (pdf)

Point processes are stochastic models for random collections of points, i.e. in a fixed observation window the positions as well as the number of observed points are random. The points may represent e.g. positions of trees of a given species in a forest, home addresses of patients with a certain disease, nuclei of specific cells in a tissue etc. We discuss some basic characteristics of point processes that describe their distribution and show how they can be used to test statistical hypotheses about the observed data. Simulation-based (Monte Carlo) tests are often used in this context. Our attention will focus mainly on the case where the null hypothesis is not specific enough to enable simulations from the null model, and we will discuss the stochastic reconstruction approach - it can be used instead of simulations to provide independent replicates (reconstructions, not copies) of the data to be used in the Monte Carlo tests.
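The basic Monte Carlo test can be sketched for the simple case in which the null model can be simulated directly. The example below is an illustration only and does not implement stochastic reconstruction: the null model (points placed uniformly and independently in the unit square, with the observed number of points), the test statistic (mean nearest-neighbour distance) and all names are assumptions chosen for brevity. The p-value is the rank of the observed statistic among the simulated ones, with small values pointing to clustering.

```python
import math
import random


def mean_nn_distance(points):
    """Average distance from each point to its nearest neighbour."""
    total = 0.0
    for i, (x, y) in enumerate(points):
        total += min(
            math.hypot(x - u, y - v)
            for j, (u, v) in enumerate(points)
            if j != i
        )
    return total / len(points)


def monte_carlo_p_value(observed, n_sim=99, seed=1):
    """One-sided Monte Carlo p-value against clustering (small distances)."""
    rng = random.Random(seed)
    t_obs = mean_nn_distance(observed)
    n = len(observed)
    hits = 0
    for _ in range(n_sim):
        sim = [(rng.random(), rng.random()) for _ in range(n)]  # draw from the null model
        if mean_nn_distance(sim) <= t_obs:
            hits += 1
    return (1 + hits) / (n_sim + 1)


# A tightly clustered pattern of 20 points near the centre of the unit square.
clustered = [(0.5 + 0.001 * i, 0.5 + 0.001 * ((7 * i) % 20)) for i in range(20)]
p = monte_carlo_p_value(clustered)
print(p)
```

When the null model cannot be simulated, the loop that draws `sim` is the part that would be replaced by reconstructions of the observed pattern.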

Jan Rataj: Curvature measures and integral-geometric formulae

presentation (pdf)

Martin Balko: Ramsey-type problems on ordered hypergraphs and connections to discrete geometry

presentation (pdf)

We discuss Ramsey numbers of ordered hypergraphs. That is, for an ordered k-uniform hypergraph H, we ask for the minimum positive integer N such that every 2-coloring of the complete ordered k-uniform hypergraph on N vertices contains a monochromatic copy of H. We consider several estimates on ordered Ramsey numbers for various classes of ordered hypergraphs and we show that some of them admit natural geometric interpretations that yield new results for various extremal problems in discrete geometry. In particular, we mention connections between estimating ordered Ramsey numbers of monotone paths and the Erdős-Szekeres theorem.
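The simplest instance of this connection is the Erdős-Szekeres theorem on monotone subsequences: every sequence of (r-1)(s-1)+1 distinct numbers contains an increasing subsequence of length r or a decreasing one of length s. Colouring a pair of positions i < j red when the sequence increases from i to j and blue otherwise turns monotone subsequences into monochromatic monotone paths in an ordered graph. The sketch below, with hypothetical function names, checks the statement by brute force for small r and s.

```python
from itertools import permutations


def longest_monotone(seq):
    """Lengths of the longest increasing and longest decreasing subsequences."""
    n = len(seq)
    inc = [1] * n   # inc[j]: longest increasing subsequence ending at position j
    dec = [1] * n   # dec[j]: longest decreasing subsequence ending at position j
    for j in range(n):
        for i in range(j):
            if seq[i] < seq[j]:
                inc[j] = max(inc[j], inc[i] + 1)
            elif seq[i] > seq[j]:
                dec[j] = max(dec[j], dec[i] + 1)
    return max(inc, default=0), max(dec, default=0)


def erdos_szekeres_holds(r, s):
    """Check the theorem on all orderings of (r-1)(s-1)+1 distinct numbers."""
    n = (r - 1) * (s - 1) + 1
    return all(
        inc >= r or dec >= s
        for inc, dec in map(longest_monotone, permutations(range(n)))
    )


print(erdos_szekeres_holds(3, 3))
print(longest_monotone((2, 1, 4, 3)))
```

The sequence 2, 1, 4, 3 has no monotone subsequence of length 3, which shows that the bound (r-1)(s-1)+1 cannot be lowered for r = s = 3.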

S. Nagy: Statistical Data Depth and its Applications

Seminar, Department of Cybernetics, ČVUT
Praha, Czech Republic

February 22, 2018

presentation (pdf)


In nonparametric statistics the concept of quantiles is of paramount importance. In multivariate spaces, however, quantiles cannot be defined directly, due to the lack of a natural ordering of points. In the talk we focus on one possible solution to this problem, using a tool called data depth. Depth is a function that quantifies the "centrality" of points with respect to a given probability distribution. Points with high depth values form the "inner" quantile regions of the distribution; points of low depth lie on the outskirts of the data cloud. We discuss approaches to the definition of data depth, and illustrate these in a series of simple examples. The applications of this methodology include data visualisation, (robust) estimation, classification, clustering, or outlier detection for multivariate, high-dimensional, and even functional (infinite-dimensional) datasets.

S. Nagy: Theory of Functional Data Depth

Aalto Stochastics and Statistics Seminar
Helsinki, Finland

February 7, 2018


Depth has become quite a popular concept in functional data analysis. In the talk we discuss its general framework. We show that most known functional depths can be classified into a few groups, within which they share similar theoretical properties. We focus on uniform consistency results for the sample versions of these functionals, and demonstrate that some well-known approaches to depth assessment are hardly theoretically adequate.

S. Nagy: On Symmetry of Multivariate Random Variables (in Slovak)

20th Winter School ROBUST
Rybník, Czech Republic

January 26, 2018

presentation (pdf)


Unlike for distributions on the real line, in multivariate spaces there is no universally accepted definition of symmetry of a distribution. Several different approaches are categorised by Zuo and Serfling (2000), on which Serfling (2006) later builds when introducing these definitions into the statistical literature. In this contribution we examine some of the claims of Zuo and Serfling (2000) and show that the most interesting proofs in their paper are not complete. In further investigation of these problems we come across an unknown proof of Funk's characterisation of symmetry of convex bodies, a problem formulated in 1913 that was solved only in 1970. As far as we know, this is the first elementary proof of this important result in the literature.

S. Nagy: Data Depth and Its Place in Modern Mathematics (in Slovak)

Department of Probability and Math. Statistics, Charles University

October 2, October 30, and December 11, 2017
A series of lectures providing a broad introduction to the topic of statistical data depth, and its links to several related fields of advanced mathematics.
  1. Part I: Statistical depth function
  2. Part II: Measures of symmetry
  3. Part III: Floating bodies


In a short series of lectures we focus on the so-called statistical data depth and its theoretical properties. Depth, known in statistics since the 1970s, is a nonparametric tool of data analysis, intended as a generalisation of quantiles and ranks to complex (typically multivariate) observations. In the first part of the series we present various approaches to computing depth and give an overview of their basic properties and statistical applications. In the second part we reveal relations between data depth and the so-called measures of symmetry of sets known from convex analysis. Finally, in the third part we show that depth is also closely related to several other concepts intensively studied in modern mathematics. The main emphasis will be on historical context, little-known interdisciplinary connections, interesting open problems, and promising new directions of research.

S. Nagy: Geometry of Multivariate Quantiles (in Slovak)

49th Conference of Slovak Mathematicians (plenary talk)
Jasná pod Chopkom, Slovakia

November 24, 2017


Perhaps the best location estimator used in the statistical analysis of univariate data is the median: a point m that divides the data into two equally large sets of observations, i) those smaller than m, and ii) those larger than m. The median has a number of excellent properties: it always exists, it is easy to interpret, and it is only slightly affected by gross measurement errors. It is generalised by quantiles and other statistics based on the ranks of observations. For multivariate data, however, it is difficult to define these concepts meaningfully. One approach to doing so is by means of the so-called data depth. In this survey contribution we present the basic idea of measuring the depth of data and uncover its relations to some notions known in geometry. We show that geometry contains important results directly applicable to problems arising in the study of depth theory. With their help we partially solve some long-standing open statistical problems, and indicate possibilities for future research both for depth and for convex geometry.

S. Nagy: Halfspace Depth and the Geometry of Multivariate Quantiles

Noon lecture
Department of Applied Mathematics, Charles University

October 12, 2017


Statistical data depth is a non-parametric tool applicable to multivariate data, whose main goal is a reasonable generalisation of quantiles to multivariate datasets. We discuss the halfspace depth, the most important depth in statistics. This depth was first proposed in 1975; its rigorous investigation started in the 1990s, and an abundance of open problems still stimulates research in the area. We present several interesting links between the halfspace depth and some well-studied concepts from geometry. Using these relations we resolve some open problems concerning data depth, and outline perspectives for future research both in data depth and in geometry.
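For reference, the halfspace depth and the associated central regions can be written as follows. The notation is an assumption, since the abstract fixes none: P is a probability distribution on R^d, S^{d-1} is the unit sphere, and F is the distribution function of P when d = 1.

```latex
% Halfspace depth of a point x: the smallest probability of a closed halfspace containing x
\[
  HD(x; P) \;=\; \inf_{u \in S^{d-1}}
  P\bigl( \{ y \in \mathbb{R}^d : \langle u, y \rangle \ge \langle u, x \rangle \} \bigr) .
\]
% Central (depth) regions, playing the role of multivariate quantile regions
\[
  D_\alpha(P) \;=\; \{ x \in \mathbb{R}^d : HD(x; P) \ge \alpha \} , \qquad \alpha \in (0, 1) .
\]
% In dimension one the depth reduces to the smaller of the two tail probabilities,
% so its maximisers are the medians and D_alpha is an interquantile interval
\[
  HD(x; P) \;=\; \min\bigl\{ F(x),\; 1 - F(x^-) \bigr\} \qquad (d = 1) .
\]
```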