Master in Statistics, Data Intelligence, and the Foundations of the Sciences
Ancona, Italy
DURATION
1 Year
LANGUAGES
English
PACE
Full time
APPLICATION DEADLINE
Request application deadline
EARLIEST START DATE
Sep 2025
TUITION FEES
Request tuition fees
STUDY FORMAT
On-Campus
Introduction
The Master in Statistics, Data Intelligence, and the Foundations of the Sciences offers a unique opportunity to acquire not only technical proficiency in data analysis and processing techniques, through hands-on tutorials on some of the most popular platforms (Python, STATA, R, Matlab), but also an understanding of their epistemic rationale and grounding. The Master blends STEM courses (statistics, econometrics, game theory, machine learning, deep learning, AI and logic programming) with courses dedicated to the foundations of the scientific method, epistemology and philosophy of science, focused on the theoretical foundations that underlie these diverse inferential techniques and that, possibly, justify them.
This choice aims to put inferential methodologies into perspective and to examine and formalize them within the scientific ecosystem in which they are embedded: this implies a comprehensive look at the “data-generating process” as a web of complex dynamics underpinning data sampling, curation, interpretation, and disclosure.
The STEM courses present a rich panorama of inferential techniques and address specific research targets (forecasting, time-series analysis, biostatistics and epidemiology, deep learning, causal modelling, model selection, risk analysis, and sensitivity analysis) using the most recent methodological developments. This fosters a deep understanding of their rationale, power, and limits by allowing students to compare problems and toolsets across different contexts of investigation and data analysis.
The foundational courses focus on probability theory, imprecise probabilities, rational choice theory, theories of causality, foundations of statistics, the logic of scientific methods, and Bayesian and formal epistemology, and address meta-problems such as the demarcation problem (what is science, and according to which criteria), peer disagreement, judgment aggregation, belief polarization, types of inference (e.g., abduction, analogical inference), metascience, science lobbyism, research integrity, evidence-based policy, science regulation, and the economics of science.
At the end of the Master course, students will be able to evaluate the best scientific methodology for their investigation, analyze the data and studies of others in their specific sector of research, and offer consultancy services to policymakers. Journalists and political decision-makers will have acquired the critical tools to orient themselves in the supply of information produced by the various scientific sectors.
Admissions
Curriculum
First Semester, Part A
Tutorial: Introduction to STATA for Data Analysis by Riccardo Cappelli
STATA is a statistical software package widely used in data analysis and statistical research. This course aims to help students become familiar with the fundamentals of STATA. An overview of the main STATA techniques will be provided, as well as their application to real-world data.
Risk and Decision-Making for Data Science and AI by Norman Fenton
This module provides a comprehensive overview of risk assessment, prediction and decision-making challenges covering public health and medicine, the law, government strategy, transport safety and consumer protection. Students will learn how to see through much of the confusion about risk in public discourse and will be provided with methods and tools for improved risk assessment that can be directly applied to personal, group, and strategic decision-making.
The module also directly addresses the limitations of big data and machine learning for solving decision and risk problems. While classical statistical techniques for risk assessment are introduced (including hypothesis testing, p-values, and regression), the module exposes the severe limitations of these methods. In particular, it focuses on the need for causal modelling of problems and a Bayesian approach to probabilistic reasoning. Bayesian networks are used as a unifying theme throughout.
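As a minimal illustration of the kind of Bayesian updating the module advocates (a sketch with invented numbers, not taken from the course material), consider revising the probability of a hazard in the light of an imperfect test:

```python
# Bayes' theorem with illustrative numbers: H is a hazard, E a positive test.
p_h = 0.01              # prior P(H), assumed for illustration
p_e_given_h = 0.95      # sensitivity, P(E | H)
p_e_given_not_h = 0.10  # false positive rate, P(E | not H)

# Law of total probability: P(E) = P(E|H)P(H) + P(E|~H)P(~H)
p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)

# Posterior P(H | E): despite a "95% accurate" test, the posterior stays below 10%
p_h_given_e = p_e_given_h * p_h / p_e
print(f"P(H|E) = {p_h_given_e:.3f}")  # ~0.088
```

The counter-intuitive result (the base-rate fallacy) is exactly the kind of confusion about risk in public discourse that the module targets.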
Causation and Probabilities by Alexander Gebharter
This course provides a crash course in the basics of probability theory followed by an overview of accounts of causation related to probabilities. The general idea is that causal structure explains various kinds of probabilistic dependence. While knowledge of correlation is a useful tool for prediction, only causal information provides a reliable guide to control one’s environment.
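The contrast between prediction and control can be made concrete in a few lines of simulation (an illustrative sketch, not course material): X and Y are correlated through a common cause Z, yet setting X by intervention leaves Y untouched:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Common cause Z drives both X and Y; X has no causal effect on Y.
z = rng.normal(size=n)
x = z + rng.normal(size=n)
y = z + rng.normal(size=n)
print("observational corr(X, Y):", np.corrcoef(x, y)[0, 1])  # ~0.5

# Intervening on X (setting it independently of Z) breaks the dependence:
x_do = rng.normal(size=n)
y_do = z + rng.normal(size=n)
print("corr under do(X):", np.corrcoef(x_do, y_do)[0, 1])  # ~0
```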
Epistemology II by Alexander Gebharter
What is knowledge? How does it relate to truth and rationality? How can we justify our beliefs and how should we revise them in the light of new incoming evidence? These are some of the main questions raised within epistemology. “Epistemology I” and “Epistemology II” explore questions like these and how they are answered by the current accounts on the market as well as the new problems these answers give rise to.
Tutorial: R & Matlab by Federico Giri
This course aims to provide an introduction to programming techniques in R and Matlab.
Tutorial: PYTHON by Adriano Mancini
The course is structured to guide learners through Python programming from fundamental concepts to advanced data science techniques. It starts with an introduction to Python, covering core programming principles, including data structures. The latter part of the course introduces powerful libraries for data science: NumPy, SciPy, and scikit-learn.
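As a taste of where the tutorial ends up, a minimal example combining NumPy and scikit-learn (the data and model here are illustrative assumptions, not course exercises):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data: y = 2x + 1 plus noise
rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(50, 1))
y = 2 * X[:, 0] + 1 + rng.normal(scale=0.5, size=50)

# Fit and inspect a simple linear model
model = LinearRegression().fit(X, y)
print("slope:", model.coef_[0], "intercept:", model.intercept_)
```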
Epistemology I by Michał Sikorski
What is knowledge? How does it relate to truth and rationality? How can we justify our beliefs and how should we revise them in the light of new incoming evidence? These are some of the main questions raised within epistemology. “Epistemology I” and “Epistemology II” explore questions like these and how they are answered by the current accounts on the market as well as the new problems these answers give rise to.
The Philosophy of Evolutionary Theory by Elliott Sober
This course is based on Elliott Sober’s new book “The Philosophy of Evolutionary Theory”. It covers topics such as units of selection and common ancestry, all deeply related to probabilistic reasoning.
First Semester, Part B
Artificial Intelligence & Logic Programming I by Aldo Dragoni
Content:
- Artificial Intelligence: history and difference between the logical-symbolic approach and the neural approach.
- First-order logic: Syntax, Semantics, Formal system.
- Resolution method: Herbrand’s theorem. Conversion to the clausal form of a closed formula. The Resolution Principle for ground clauses. Unification (a small sketch follows this list).
- The Resolution Principle. Linear Resolution.
- Definite programs: Semantics. Correctness of SLD Resolution. The Occur-Check problem. Completeness of SLD Resolution. Independence from the Computation Rule. SLD Refutation Procedure. Computational adequacy of Definite Programs.
- Logic programming: PROLOG. Declarative programming.
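To give a flavour of the resolution machinery listed above, here is a minimal sketch of the unification step in Python (added for this overview; the course itself works in PROLOG, and the occur-check listed above is deliberately omitted for brevity):

```python
# A minimal sketch (not course material) of syntactic unification over
# first-order terms. Variables are strings starting with an uppercase letter;
# compound terms are (functor, args) tuples; constants are lowercase strings.

def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def walk(t, subst):
    # Follow variable bindings until a non-variable or unbound variable remains
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def unify(t1, t2, subst=None):
    subst = {} if subst is None else subst
    t1, t2 = walk(t1, subst), walk(t2, subst)
    if t1 == t2:
        return subst
    if is_var(t1):
        return {**subst, t1: t2}   # no occur-check, for brevity
    if is_var(t2):
        return {**subst, t2: t1}
    if (isinstance(t1, tuple) and isinstance(t2, tuple)
            and t1[0] == t2[0] and len(t1[1]) == len(t2[1])):
        for a, b in zip(t1[1], t2[1]):
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return None  # clash: different functors or constants

# unify father(X, bob) with father(tom, Y)  =>  {'X': 'tom', 'Y': 'bob'}
print(unify(("father", ["X", "bob"]), ("father", ["tom", "Y"])))
```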
Principles of epidemiology and biostatistics for Public Health Research by Rosaria Gesuita, Edlira Skrami, Andrea Faragalli, Marica Iommi
Main topics:
- Introduction to Epidemiology, Prof. Rosaria Gesuita (2 hours)
- Observational studies, frequency and association measures (a worked example follows this list), Prof. Rosaria Gesuita (6 hours) & Dr. Marica Iommi (4 hours)
- Descriptive study design, Analytical approaches, Experimental study designs, Prof. Edlira Skrami (8 hours)
- Study protocol, Dr. Andrea Faragalli (4 hours)
- Principles of sample size estimation, Dr. Andrea Faragalli (4 hours)
- Principles of systematic review and meta-analysis, Dr. Marica Iommi (4 hours)
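A worked example of the frequency and association measures mentioned above (with invented counts): computing the risk ratio and odds ratio from a 2×2 table:

```python
# Association measures from a 2x2 table (illustrative numbers only):
#                 disease   no disease
# exposed            30         70
# unexposed          10         90
a, b, c, d = 30, 70, 10, 90

risk_exposed = a / (a + b)                  # 0.30
risk_unexposed = c / (c + d)                # 0.10
risk_ratio = risk_exposed / risk_unexposed  # 3.0
odds_ratio = (a * d) / (b * c)              # ~3.86

print(f"RR = {risk_ratio:.2f}, OR = {odds_ratio:.2f}")
```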
Foundations of the Sciences by Barbara Osimani
Content: What is Science? Who says what science is, with what authority and according to which criteria? What justifies scientific knowledge? Are its foundations, if any, of a logical, metaphysical or practical nature? What are the grounds for acting on this basis? What are the principal tools allowing us to further our knowledge of reality? How do we evaluate their adequacy and reliability? What distinguishes a scientific method from other sources of knowledge? What distinguishes the different approaches to statistical inference (e.g., frequentist vs. Bayesian school vs. imprecise probabilities approach, and their respective subdivisions)?
What are the methodological and practical implications? How do the diverse paradigms deal with the relationship between theory/hypothesis and evidence? These are some of the questions that the course addresses by resorting to a large philosophical and methodological literature devoted to the foundations of science, scientific inference, and pragmatic dimensions in scientific practice.
In particular, the course will focus on the following themes:
- Epistemology and ontology of science: the demarcation problem;
- Scientific uncertainty: Probability and the Foundations of Statistics;
- (Formal) methods in the sciences.
Foundations of Econometrics I by Claudia Pigini
“Foundations of Econometrics I & II” provides an essential framework for understanding and applying econometric methods. Covering data exploration, regression analysis, prediction modelling, and causal inference, it gives students practical skills using RStudio. Suggested readings complement the theoretical concepts. Ideal for those seeking proficiency in data-driven decision-making in business, economics, and policy.
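For orientation (standard notation, not taken from the course materials), the workhorse of the regression part is the linear model and its ordinary least squares estimator:

```latex
y_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_k x_{ik} + \varepsilon_i,
\qquad
\hat{\beta} = (X^{\top} X)^{-1} X^{\top} y .
```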
Bayesian Inference by Eric-Jan Wagenmakers
This course will cover the theory and practice of “common sense expressed in numbers”, that is, Bayesian inference. In the first part of the course, I will use the binomial model to cover the theoretical building blocks (e.g., prior and posterior distributions, coherence, parameter estimation and Bayes factor hypothesis testing, vague vs. informed prior distributions, model-averaging, model misspecification, etc.). In the second part, I will showcase Bayesian inference in practice and feature Bayesian t-tests, regression, ANOVA, and other models.
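A minimal sketch of the binomial building block described above (the numbers and the uniform Beta(1, 1) prior are illustrative assumptions, not the instructor's):

```python
from scipy import stats

# Observed data: k successes in n trials (illustrative numbers)
k, n = 7, 10

# Uniform Beta(1, 1) prior on the rate theta; the posterior is Beta(1 + k, 1 + n - k)
prior = stats.beta(1, 1)
posterior = stats.beta(1 + k, 1 + n - k)

print("posterior mean:", posterior.mean())
print("95% credible interval:", posterior.interval(0.95))

# Savage-Dickey density ratio: Bayes factor for H0: theta = 0.5 vs H1: theta ~ Beta(1, 1)
bf01 = posterior.pdf(0.5) / prior.pdf(0.5)
print("BF01 =", bf01)
```

The Savage-Dickey ratio used here is one standard way to obtain the Bayes factor for a point null within the binomial model.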
Fundamentals of Machine Learning by Marco Piangerelli
The course aims to compactly present the main paradigms of machine learning (supervised, unsupervised, and reinforcement learning) while also presenting their statistical basis (statistical learning theory). The most recent developments in the explainability and interpretability of ML models will also be presented.
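In the notation of statistical learning theory (added here for orientation, not from the course), the supervised paradigm can be summarized as empirical risk minimization over a hypothesis class F:

```latex
\hat{f} = \arg\min_{f \in \mathcal{F}} \frac{1}{n} \sum_{i=1}^{n} L\bigl(y_i, f(x_i)\bigr),
\qquad
R(f) = \mathbb{E}\bigl[L(Y, f(X))\bigr],
```

with the theory bounding the gap between the empirical risk and the expected risk R(f).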
Statistical Schools: Concepts of Probability, Statistical Inference, and Data Analysis by Christian Hennig
The course will give a comparative overview of various concepts of probability, statistical inference, and data analysis. There will be a focus on the connection between statistical models and data in the real world, the role of model assumptions in analysing data, the limitations of objectivity, and the necessity of judgment and subjective decisions.
Second Semester, Part A
Time-series forecasting with Deep Learning by Alessandro Galdelli
Content:
- Introduction to Time-Series Analysis
- Fundamentals of Deep Learning for Time-Series
- Working with Time-Series Data (a minimal sketch follows this list)
- Deep Learning Models for Time-Series Forecasting
- Advanced Forecasting Techniques
- Evaluation Metrics and Model Optimization
- Case Studies and Applications
- Future Trends and Challenges in Time-Series Forecasting
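To make the data-preparation and evaluation steps concrete, a minimal NumPy sketch (a linear one-step-ahead baseline standing in for the deep models the course covers; all numbers invented):

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic series: trend + seasonality + noise (illustrative only)
t = np.arange(300)
series = 0.05 * t + np.sin(2 * np.pi * t / 12) + rng.normal(scale=0.3, size=t.size)

def make_windows(x, lookback):
    """Slice a 1-D series into (samples, lookback) inputs and next-step targets."""
    X = np.stack([x[i:i + lookback] for i in range(len(x) - lookback)])
    y = x[lookback:]
    return X, y

X, y = make_windows(series, lookback=24)
split = int(0.8 * len(y))

# Linear one-step-ahead baseline via least squares
coef, *_ = np.linalg.lstsq(X[:split], y[:split], rcond=None)
pred = X[split:] @ coef
print("test MAE:", np.mean(np.abs(pred - y[split:])))
```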
Causal Inference by Alexander Gebharter
This course builds on basic insights established in the course “Causation and Probabilities” and some of the formal tools introduced in the “Formal Epistemology” courses. It further advances topics from these courses and provides an introduction to causal models and causally interpreted Bayesian networks. These tools can be used to formulate complex causal hypotheses more precisely, to generate probabilistic predictions based on observation and hypothetical intervention, and to uncover causal structures from observational and experimental data. The course combines theoretical content with hands-on practice, allowing students to familiarize themselves with these tools by applying them to different tasks and toy examples.
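The observation/intervention distinction can be computed directly on a toy causally interpreted network (invented probabilities, not course material):

```python
# A three-node network: Z -> X, Z -> Y, X -> Y.
# Compare P(Y=1 | X=1) with P(Y=1 | do(X=1)) via the back-door adjustment formula.

p_z = 0.5                                  # P(Z=1)
p_x_given_z = {0: 0.2, 1: 0.8}             # P(X=1 | Z=z)
p_y_given_xz = {(0, 0): 0.1, (0, 1): 0.4,  # P(Y=1 | X=x, Z=z)
                (1, 0): 0.3, (1, 1): 0.8}

def pz(z): return p_z if z == 1 else 1 - p_z
def px(x, z): return p_x_given_z[z] if x == 1 else 1 - p_x_given_z[z]

# Observational: P(Y=1 | X=1) = sum_z P(Y=1|X=1,z) P(X=1|z) P(z) / P(X=1)
p_x1 = sum(px(1, z) * pz(z) for z in (0, 1))
obs = sum(p_y_given_xz[(1, z)] * px(1, z) * pz(z) for z in (0, 1)) / p_x1

# Interventional: P(Y=1 | do(X=1)) = sum_z P(Y=1|X=1,z) P(z)
do = sum(p_y_given_xz[(1, z)] * pz(z) for z in (0, 1))

print(f"P(Y=1 | X=1)     = {obs:.3f}")  # 0.700, confounded by Z
print(f"P(Y=1 | do(X=1)) = {do:.3f}")   # 0.550
```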
Formal Epistemology II by Alexander Gebharter
“Formal Epistemology I” and “Formal Epistemology II” build on the basis laid by the courses “Epistemology I” and “Epistemology II” and, in later parts, on basic concepts introduced at the beginning of the course “Causal Inference”. They explore the foundations and dynamics of knowledge and reasoning by utilizing formal tools, especially probability theory and simple graphical models.
Bayesian Philosophy of Science by Stephan Hartmann
This course aims to show how Bayesian methods can be used to answer central questions in the philosophy of science. In the first part of the course, students will learn to construct Bayesian models (in particular using the theory of Bayesian networks) and apply them to selected problems; two tutorial sessions will let students train their mathematical problem-solving skills. In the second part, we will first briefly discuss different theories of epistemic justification and then focus on the debate on probabilistic measures of coherence in formal epistemology.
We will then examine the possibilities of developing a coherentist Bayesian philosophy of science, focusing in particular on the extent to which this approach can shed light on current debates about scientific explanation and intertheoretical relations. Finally, we will discuss the (possible) limits of Bayesianism and coherentism.
Rationality in the Sciences by Barbara Osimani
What is scientific rationality? Are different sorts of rationality at play in scientific practice? If so, how do they intertwine and impact on scientific production? In particular, what role does strategic rationality play in scientific settings, especially those characterized by strong conflicts of interest?
How do we deal with scientific dissent (in these cases)? What are the forces that shape the collection, selection, production and disclosure/communication of scientific evidence in diverse scientific ecosystems (past and present)? This module will investigate these themes by drawing on a double-track approach: the “abductive” approach of metascience studies, which aims to develop tools for bias and fraud detection, and the theoretical approach of recent literature on (Bayesian) persuasion games.
Foundations of Econometrics II by Claudia Pigini
“Foundations of Econometrics I & II” provides an essential framework for understanding and applying econometric methods. Covering data exploration, regression analysis, prediction modelling, and causal inference, it gives students practical skills using RStudio. Suggested readings complement the theoretical concepts. Ideal for those seeking proficiency in data-driven decision-making in business, economics, and policy.
Formal Epistemology I by Michał Sikorski
“Formal Epistemology I” and “Formal Epistemology II” build on the basis laid by the courses “Epistemology I” and “Epistemology II” and, in later parts, on basic concepts introduced at the beginning of the course “Causal Inference”. They explore the foundations and dynamics of knowledge and reasoning by utilizing formal tools, especially probability theory and simple graphical models.
Beyond Inferential Statistics: Abduction and Q Methodology by Raffaele Zanoli
Main Topics:
- Introduction: statistical and methodological differences between inferential and non-inferential statistics
- Induction, Deduction and Abduction
- Objectivity vs Subjectivity: Epistemological and Statistical Considerations
- Q Methodology and the Scientific Study of Subjectivity
- Examples and practicals
Second Semester, Part B
Artificial Intelligence & Logic Programming II by Aldo Dragoni
Content:
- Artificial Intelligence: history and difference between the logical-symbolic approach and the neural approach.
- First-order logic: Syntax, Semantics, Formal system.
- Resolution method: Herbrand’s theorem. Conversion to the clausal form of a closed formula. The Resolution Principle for ground clauses. Unification.
- The Resolution Principle. Linear Resolution.
- Definite programs: Semantics. Correctness of SLD Resolution. The Occur-Check problem. Completeness of SLD Resolution. Independence from the Computation Rule. SLD Refutation Procedure. Computational adequacy of Definite Programs.
- Logic programming: PROLOG. Declarative programming.
Economics of Science and Technology by Nicola Matteucci
The course presents normative and positive (from Latin positum) topics of the economics of regulation and public policy, with a focus on science-based (high-tech) economic sectors, and on big societal challenges whose solution relies on scientific knowledge. Policy-making is meant in its widest definition, spanning from detailed sectoral norms and policy (e.g., health policy and regulation) to broader policy-making (e.g., development or environment policy). The course revolves around the two fundamental categories of “market” and “government failures”, to present a reasoned (non-systematic) review of influential works analyzing the causes, mechanisms and consequences of policy failure and/or capture. The main stepping stone of the course is scientific lobbyism.
Economics of Regulation in Science-Based Domains by Nicola Matteucci
The course presents normative and positive (from Latin positum) topics of the economics of regulation and public policy, with a focus on science-based (high-tech) economic sectors, and on big societal challenges whose solution relies on scientific knowledge. Policy-making is meant in its widest definition, spanning from detailed sectoral norms and policy (e.g., health policy and regulation) to broader policy-making (e.g., development or environment policy).
The course revolves around the two fundamental categories of “market” and “government failures”, to present a reasoned (non-systematic) review of influential works analysing the causes, mechanisms and consequences of policy failure and/or capture. The main stepping stone of the course is scientific lobbyism.
Questionnaire development: How to collect data from surveys. Do's and Don'ts by Simona Naspetti
This course provides an overview of questionnaire development and strategies for collecting data through surveys. Participants will learn how to design and implement surveys to gather accurate and meaningful data. Through lectures, case studies, and interactive activities, participants will gain practical skills and insights into the do’s and don’ts of questionnaire development.
Time Series Econometrics by Giulio Palomba
Main topics:
- Time series data and stochastic processes
- Dynamic models
- ARMA models (a minimal sketch follows this list)
- Unit roots
- VAR models
- Cointegration
- GARCH models
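As a flavour of the ARMA part of the list above, a minimal sketch (assuming the statsmodels library; parameters are invented) that simulates an AR(1) process and recovers its coefficient:

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)

# Simulate a stationary AR(1) process: y_t = 0.7 * y_{t-1} + eps_t
n, phi = 500, 0.7
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi * y[t - 1] + rng.normal()

# Fit an ARMA(1, 0) model; the estimated AR coefficient should be close to 0.7
result = ARIMA(y, order=(1, 0, 0)).fit()
print(result.params)
```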
The Integrity of Research by Andrea Saltelli
The various dimensions of research integrity are organized in terms of norms, functions, and unity. Norms refer to how science conforms or deviates from normative standards. Functions relate to how science and research are endowed with a functioning, non-damaged mechanism. The third meaning pertains to the notion of science as an unbroken and undivided entity. The course also serves as an introduction to the historical, philosophical, and sociological elements of science, mostly from the field of Science and Technology Studies (STS), and has a section on science and lobbying.
Ethics of Quantification by Andrea Saltelli
The course presents a mixture of statistical and sociological elements linked to various forms of statistical and mathematical quantification and their technical and normative quality. Sensitivity analysis and sensitivity auditing will be presented as methodologies relevant to the analysis of quality, with a discussion of the properties of the available methods. Other topics covered are the politics of modelling, participatory modelling, and the sociology of quantification.
Imprecise Probabilities by Serena Doria
Unlike classical probability theory, which deals with crisp probabilities, imprecise probability acknowledges the limitations of perfect knowledge. It provides a robust and versatile approach to situations where information is scarce, incomplete, or unreliable. We will begin by examining the motivations behind imprecise previsions and probabilities and contrasting them with classical probability theory. We will explore the necessary mathematical tools to represent imprecise probabilities and we will explore how this framework can be used in artificial intelligence and decision theory.
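In one standard formalization (added here for orientation, not from the course), an imprecise assessment replaces a single probability measure with a set M of candidate measures, so that events receive lower and upper probabilities:

```latex
\underline{P}(A) = \inf_{P \in \mathcal{M}} P(A),
\qquad
\overline{P}(A) = \sup_{P \in \mathcal{M}} P(A),
\qquad
\overline{P}(A) = 1 - \underline{P}(A^{c}) .
```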
Rational Choice Theory by Giacomo Sillari
This course delves into Rational Choice Theory, exploring decision-making in conditions of risk, ignorance, and uncertainty. It begins by examining how decisions are made when outcomes are unknown, with particular focus on philosophical applications such as maximin in Rawls's difference principle and the debate with Harsanyi.
From this, the course moves to different interpretations of probability, with particular attention devoted to subjective probability and the Dutch book theorem. The course then covers Expected Utility Theory from a foundational point of view, reviewing the machinery related to the representation theorem, and concludes with Strategic Rationality, focusing on how individuals make decisions in strategic environments where outcomes depend on the actions of others, particularly dealing with coordination and cooperation.
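A one-step worked example of the Dutch book argument mentioned above (invented numbers): suppose an agent prices a ticket paying 1 if A at P(A) = 0.4 and a ticket paying 1 if not-A at P(not-A) = 0.4, violating additivity. A bookie buys both tickets, and the agent's net is

```latex
\underbrace{0.4 + 0.4}_{\text{prices received}}
\;-\;
\underbrace{1}_{\text{payout: exactly one of } A,\ \neg A \text{ occurs}}
= -0.2 ,
```

a sure loss whatever happens, which is the sense in which incoherent credences are irrational.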
Program Outcome
The Master is aimed at students and scholars from both the human sciences and STEM disciplines, as well as professionals who want to enrich their skills in data analysis, the epistemology of science, and evidence-based policy. The profile that emerges is essentially that of a data analyst with a rich methodological and foundational background, but the Master can equally enrich the educational profile of journalists, politicians, and professionals in any sector (from the economic to the healthcare to the legal).
At the end of the Master, students will be able to evaluate the best scientific methodology for their investigation, analyze the data and studies of others in their specific sector of research, and offer consultancy services to policymakers. Journalists and political decision-makers will have acquired the critical tools to orient themselves in the supply of information produced in the various scientific sectors.