France-Mexico Meeting on Data Analysis
- Avner Bar-Hen (Conservatoire National des Arts et Métiers, Paris) “How mathematicians deal with big data: the example of biostatistics”
- Adeline Leclercq-Samson (Université de Grenoble-Alpes) “Some big data issues in Social and Human Sciences”
- Anatoli Iouditski (Université de Grenoble-Alpes) “Optimization and statistical learning”
- Sarah Cohen-Boulakia (Lab. de Recherche en Informatique, Orsay) “The Paris-Saclay Center for Data Science: interdisciplinary projects and collaborative research opportunities”
- Xavier Vigouroux (Centre Excellence en Programmation Parallèle du Groupe ATOS) “Contribution of High Performance Computers to Big Data: Challenges, State of the Art and Perspectives”
- Carlos Gershenson (IIMAS-UNAM) “When Slower is Faster”
- Natalia García Colín (INFOTEC, Aguascalientes) “Using machine learning techniques on Mexican data, explorations and early results”
- Miguel Nakamura (CIMAT, Guanajuato) “Relating extinction rate to the fossil record”
- Johan Van Horebeck (CIMAT, Guanajuato) “A look at radial basis kernel PCA in data space”
- Rolando Biscay (CIMAT, Guanajuato) “Face recognition in videos by Fisher vectors of binary features with spatial information”
- Eduardo Gutiérrez Peña (IIMAS-UNAM) “Measures of niche overlap in Ecology”
- Pablo Suarez-Serrato (Instituto de Matemáticas-UNAM) “Social Automation, Twitter Bots and Human Rights”
- Eric Bonnetier (Université de Grenoble-Alpes) “Localized large gradients in composite media and the Neumann-Poincaré operator”
- Anatoli Iouditski (Université de Grenoble-Alpes) “Ellipsoid algorithm, or why convex programming is ‘simple’”
|Time||THURSDAY 3||FRIDAY 4|
|10:00 – 10:25||Avner Bar-Hen||Pablo Suarez-Serrato|
|10:30 – 10:55||Carlos Gershenson||Anatoli Iouditski|
|11:00-11:30||Coffee break||Coffee break|
|11:30 – 11:55||Miguel Nakamura||Rolando Biscay|
| ||Public Lecture 1||Public Lecture 2|
|13:00-13:25||Adeline Leclercq-Samson||Sarah Cohen-Boulakia|
|13:30-13:55||Natalia García Colín||Johan Van Horebeck|
|14:00-15:30||Lunch Break||Lunch Break|
|15:30-15:55||Eduardo Gutiérrez Peña||Xavier Vigouroux|
| ||Discussion: perspectives (big group)||Discussion: perspectives (small group)|
Registration fees (including meals for the two days):
- $750 general public
- $350 students
- Scholarships available
For further information: www.data-analysis.matem.unam.mx
The rapid growth in the size and scope of datasets in science and technology has created a need for novel foundational perspectives on data analysis that blend the mathematical, statistical and computational sciences. We present some of the issues that arise in the biological sciences, with a special focus on health studies.
The Social and Human Sciences generate new kinds of data that are challenging to analyse, understand and interpret. In this talk, we will give an overview of some of them: social networks (Facebook, Twitter, etc.), human psychology (dynamical mouse-tracking data), speech and language development, etc.
Challenging new subjects of statistical inquiry have encouraged massive collaboration between statistics, computer science and optimization. The objective of this collaboration is to develop scalable algorithms for statistical inference. We discuss some of the most efficient techniques of large-scale convex optimization and their applications in statistical learning.
Principal component analysis (PCA) is at the core of many dimension reduction techniques. Its extension based on implicit transformations (a.k.a. kernels) is especially popular in the machine learning literature. Nevertheless, because one works only indirectly in the original data space, its interpretation is not always obvious or intuitive. In this talk we look at different kinds of characterizations and explore how linearizations in data space, based on random projections and random Fourier features, can be useful when n is large.
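As an illustration of the random Fourier feature idea mentioned in this abstract, the following is a minimal sketch and not the speaker's method. It assumes an RBF kernel k(x, y) = exp(-gamma * ||x - y||^2), maps the data to an explicit feature space whose inner products approximate that kernel, and then runs ordinary linear PCA on the features. The function names, the value of gamma and the synthetic data are all assumptions made for the example.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """Exact RBF kernel matrix, k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


def random_fourier_features(X, n_features, gamma, rng):
    """Explicit features Z such that Z @ Z.T approximates the RBF kernel."""
    d = X.shape[1]
    # Frequencies are drawn from N(0, 2 * gamma * I), the spectral
    # density of the RBF kernel; phases are uniform on [0, 2 * pi).
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)


def approximate_kernel_pca(X, n_components, n_features, gamma, rng):
    """Linear PCA on random Fourier features (approximate kernel PCA)."""
    Z = random_fourier_features(X, n_features, gamma, rng)
    Zc = Z - Z.mean(axis=0)  # centre the features
    _, _, Vt = np.linalg.svd(Zc, full_matrices=False)
    scores = Zc @ Vt[:n_components].T  # projections on the leading axes
    return scores, Z


# Synthetic data, purely for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
scores, Z = approximate_kernel_pca(
    X, n_components=2, n_features=2000, gamma=0.5, rng=rng
)
# Largest entrywise gap between the approximate and the exact kernel.
max_err = np.max(np.abs(Z @ Z.T - rbf_kernel(X, 0.5)))
```

The cost of the PCA step scales with the number of random features rather than with an n-by-n kernel matrix, which is why this kind of linearization is attractive for large n.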
In Ecology, the niche of a species is usually defined as a multidimensional hyper-volume in which a species maintains a viable population (Hutchinson 1957). The community structure may be shaped by resource partitioning between co-occurring species, so quantifying the degree of this partitioning (i.e. niche overlap) is very important when studying species co-existence (Geange et al. 2010). The niche space is often described by multiple axes or variables. When all such axes describe continuous measurements, the niche overlap may be quantified using a measure of similarity of two probability density functions, and is often estimated using non-parametric methods. Here we discuss a Bayesian approach to this problem based on Gaussian Dirichlet process mixture models. We also propose a simple exploratory (but more flexible) measure of niche overlap. Both ideas are illustrated with real data concerning three mammalian species inhabiting the 'El Triunfo' Biosphere Reserve in Chiapas, Mexico. Joint work with M. Mendoza, A. Contreras and E. Mendoza.
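To make "a measure of similarity of two probability density functions" concrete, here is a minimal non-parametric sketch for a single niche axis. It is not the Bayesian approach of the talk: it assumes Gaussian kernel density estimates with a normal-reference rule-of-thumb bandwidth, and it uses the overlap coefficient (the integral of the pointwise minimum of the two densities) as one possible similarity measure. All function names and the simulated samples are assumptions made for the example.

```python
import math
import random


def rule_of_thumb_bandwidth(sample):
    """Normal-reference rule of thumb: 1.06 * sd * n^(-1/5)."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((s - mean) ** 2 for s in sample) / (n - 1))
    return 1.06 * sd * n ** (-0.2)


def gaussian_kde(sample, bandwidth):
    """Return a function evaluating a Gaussian kernel density estimate."""
    norm = 1.0 / (len(sample) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample
        )

    return density


def overlap_coefficient(sample_a, sample_b, grid_size=400):
    """Integral of min(f, g) for the two estimated densities f and g."""
    h_a = rule_of_thumb_bandwidth(sample_a)
    h_b = rule_of_thumb_bandwidth(sample_b)
    f = gaussian_kde(sample_a, h_a)
    g = gaussian_kde(sample_b, h_b)
    # Integrate over the pooled data range, padded by a few bandwidths.
    pad = 4.0 * max(h_a, h_b)
    lo = min(min(sample_a), min(sample_b)) - pad
    hi = max(max(sample_a), max(sample_b)) + pad
    step = (hi - lo) / grid_size
    # Midpoint rule on a regular grid.
    return step * sum(
        min(f(lo + (i + 0.5) * step), g(lo + (i + 0.5) * step))
        for i in range(grid_size)
    )


# Simulated measurements on one niche axis, purely for illustration.
random.seed(1)
species_a = [random.gauss(0.0, 1.0) for _ in range(100)]
species_b = [random.gauss(0.5, 1.0) for _ in range(100)]
species_c = [random.gauss(10.0, 1.0) for _ in range(100)]

overlap_same = overlap_coefficient(species_a, species_a)
overlap_near = overlap_coefficient(species_a, species_b)
overlap_far = overlap_coefficient(species_a, species_c)
```

The coefficient lies between 0 (disjoint niches) and 1 (identical estimated densities). With several niche axes the same idea applies to multivariate densities, although grid integration quickly becomes impractical.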
In composite media, places where inhomogeneities touch or nearly touch are likely to be areas where the solutions of the governing elliptic differential equations present large gradients. This concentration phenomenon is of interest in many applications, such as medical imaging, bio-sensing and optoelectronics. It is intimately related to the properties of the Neumann-Poincaré operator, an integral operator that can be used as a tool to represent solutions to elliptic differential equations, and that also appears in the related phenomena of super-resolution and cloaking. In this talk, we describe how the blow-up of the gradients can be inferred from the spectral properties of the Neumann-Poincaré operator. This is joint work with Faouzi Triki (Université Grenoble-Alpes).