August 1, 2014
Similar papers 2
September 15, 2020
We present a rigorous thermodynamic treatment of irreversible binary aggregation. We construct the Smoluchowski ensemble as the set of discrete finite distributions generated from the same initial state of all monomers upon a fixed number of merging events and define a probability measure on this ensemble such that the mean distribution in the mean-field approximation is governed by the Smoluchowski equation. In the scaling limit this ensemble gives rise to a set of relationships ...
June 19, 2018
Modern statistical modeling is an important complement to the more traditional approach of physics where Complex Systems are studied by means of extremely simple idealized models. The Minimum Description Length (MDL) is a principled approach to statistical modeling combining Occam's razor with Information Theory for the selection of models providing the most concise descriptions. In this work, we introduce the Boltzmannian MDL (BMDL), a formalization of the principle of MDL w...
August 6, 2014
Despite its popularity, it is widely recognized that the investigation of some theoretical aspects of clustering has been relatively sparse. One of the main reasons for this lack of theoretical results is surely the fact that, whereas for other statistical problems the theoretical population goal is clearly defined (as in regression or classification), for some of the clustering methodologies it is difficult to specify the population goal to which the data-based clustering al...
February 4, 2009
The cosmological many-body problem is effectively an infinite system of gravitationally interacting masses in an expanding universe. Despite the interactions' long-range nature, an analytical theory of statistical mechanics describes the spatial and velocity distribution functions which arise in the quasi-equilibrium conditions that apply to many cosmologies. Consequences of this theory agree well with the observed distribution of galaxies. Further consequences such as thermo...
September 13, 2010
This paper addresses the statistical significance of structures in random data: Given a set of vectors and a measure of mutual similarity, how likely does a subset of these vectors form a cluster with enhanced similarity among its elements? The computation of this cluster p-value for randomly distributed vectors is mapped onto a well-defined problem of statistical mechanics. We solve this problem analytically, establishing a connection between the physics of quenched disorder...
April 7, 2020
Combining intuitive probabilistic assumptions with the basic laws of classical thermodynamics, using the latter to express probabilistic parameters in terms of the thermodynamic quantities, we get a simple unified derivation of the fundamental ensembles of statistical physics avoiding any limiting procedures, quantum hypothesis and even statistical entropy maximization. This point of view leads also to some related classes of correlated particle statistics.
April 4, 2020
Traditional Bayesian random partition models assume that the size of each cluster grows linearly with the number of data points. While this is appealing for some applications, this assumption is not appropriate for other tasks such as entity resolution, modeling of sparse networks, and DNA sequencing tasks. Such applications require models that yield clusters whose sizes grow sublinearly with the total number of data points -- the microclustering property. Motivated by these ...
August 11, 2023
This draft is intended to be used as class notes for a grad course on rigorous statistical mechanics at the math department of UFMG. It should be considered a very preliminary version and a work in progress. Several chapters lack references, exercises, and revision.
August 2, 2005
We propose a dynamical scheme for the combined processes of fragmentation and merging as a model system for cluster dynamics in nature and society displaying scale invariant properties. The clusters merge and fragment with rates proportional to their sizes, conserving the total mass. The total number of clusters grows continuously but the full time-dependent distribution can be rescaled over at least 15 decades onto a universal curve which we derive analytically. This curve i...
February 28, 2009
Thermodynamics of clusterized matter is studied in the framework of statistical models with non-interacting cluster degrees of freedom. At variance with the analytical Fisher model, exact Metropolis simulation results indicate that the transition from homogeneous to clusterized matter lies along the $\rho=\rho_0$ axis at all temperatures and the limiting point of the phase diagram is not a critical point even if the surface energy vanishes at this point. Sensitivity of the in...