April 23, 2015
Given the widespread use of lossless compression algorithms to approximate algorithmic (Kolmogorov-Chaitin) complexity, and the fact that lossless compression algorithms fall short of characterizing any patterns other than statistical ones, making them no different from entropy estimations, here we explore an alternative and complementary approach. We study formal properties of a Levin-inspired measure $m$ calculated from the output distribution of small Turing machines. We introduce and justify finite approximations $m_k$ that have been used in some applications as an alternative to lossless compression algorithms for approximating algorithmic (Kolmogorov-Chaitin) complexity. We provide proofs of the relevant properties of both $m$ and $m_k$ and compare them to Levin's Universal Distribution. We provide error estimations of $m_k$ with respect to $m$. Finally, we present an application to integer sequences from the Online Encyclopedia of Integer Sequences, which suggests that our AP-based measures may characterize non-statistical patterns, and we report interesting correlations with the lengths of the textual, functional and program descriptions of these sequences.
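For reference, here is a rough sketch of the standard quantities this abstract builds on; the notation below is illustrative and need not match the paper's exact definitions. Levin's Universal Distribution over a prefix-free universal machine $U$ and the algorithmic Coding theorem are usually written as
$$\mathfrak{m}(s) = \sum_{p\,:\,U(p)=s} 2^{-|p|}, \qquad K(s) = -\log_2 \mathfrak{m}(s) + O(1),$$
whereas a frequency-based measure in the spirit of $m_k$ is estimated from a finite enumeration of small Turing machines, roughly
$$m_k(s) \approx \frac{|\{T : T \text{ halts and outputs } s\}|}{|\{T : T \text{ halts}\}|},$$
with $-\log_2 m_k(s)$ then serving as the approximation to $K(s)$.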
Similar papers
November 6, 2012
Drawing on various notions from theoretical computer science, we present a novel numerical approach, motivated by the notion of algorithmic probability, to the problem of approximating the Kolmogorov-Chaitin complexity of short strings. The method is an alternative to the traditional lossless compression algorithms, which it may complement, the two being serviceable for different string lengths. We provide a thorough analysis for all $\sum_{n=1}^{11} 2^n$ binary strings of le...
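(For concreteness, the formula in the preceding abstract counts $\sum_{n=1}^{11} 2^n = 2^{12} - 2 = 4094$ binary strings of lengths 1 through 11.)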
November 20, 2012
We show that real-value approximations of Kolmogorov-Chaitin complexity (K_m), calculated via the algorithmic Coding theorem from the output frequency of a large set of small deterministic Turing machines with up to 5 states (and 2 symbols), are in agreement with the number of instructions used by the Turing machines producing a string s, which is consistent with strict integer-value program-size complexity. Nevertheless, K_m proves to be a finer-grained measure and a potential alternative ap...
January 25, 2011
We describe an alternative method (to compression) that combines several theoretical and experimental results to numerically approximate the algorithmic (Kolmogorov-Chaitin) complexity of all $\sum_{n=1}^{8} 2^n$ bit strings up to 8 bits long, and of some between 9 and 16 bits long. This is done by an exhaustive execution of all deterministic 2-symbol Turing machines with up to 4 states, for which the halting times are known thanks to the Busy Beaver problem, that is, 11019960576...
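The machine count quoted above is consistent with the standard enumeration of the $(n,2)$ Turing machine space used in the Busy Beaver literature, in which each of the $2n$ (state, symbol) entries of a transition table either writes one of 2 symbols, moves left or right, and goes to one of $n$ states ($4n$ options), or writes one of 2 symbols and halts (2 options):
$$(4n+2)^{2n}\big|_{n=4} = 18^8 = 11{,}019{,}960{,}576.$$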
November 6, 2017
Previously referred to as `miraculous' in the scientific literature because of its powerful properties and its wide application as an optimal solution to the problem of induction/inference, (approximations to) Algorithmic Probability (AP) and the associated Universal Distribution are (or should be) of the greatest importance in science. Here we investigate the emergence, the rates of emergence and convergence, and the Coding-theorem-like behaviour of AP in Turing-subuniversal mo...
April 8, 2007
A drawback of Kolmogorov-Chaitin complexity (K) as a function from s to the shortest program producing s is its noncomputability, which limits its range of applicability. Moreover, when strings are short, the dependence of K on a particular universal Turing machine U can be arbitrary. In practice one can approximate it by computable compression methods. However, such compression methods do not always provide meaningful approximations--for strings shorter, for example, than typ...
August 26, 2011
We discuss and survey a previously proposed numerical method that, as an alternative to the usual compression method, provides an approximation to algorithmic (Kolmogorov) complexity, one particularly useful for short strings, for which compression methods simply fail. The method proves stable enough, and useful for conceiving and comparing patterns in algorithmic models. (Article in Spanish.)
April 22, 2008
Although information content is invariant up to an additive constant, the range of possible additive constants applicable to programming languages is so large that in practice it plays a major role in the actual evaluation of K(s), the Kolmogorov-Chaitin complexity of a string s. Some attempts have been made to arrive at a framework stable enough for a concrete definition of K, independent of any constant under a programming language, by appealing to the "naturalness" of the ...
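As a reminder of the standard statement behind this remark, the invariance theorem guarantees that for any two universal machines (or programming languages) $U$ and $V$ there is a constant $c_{U,V}$, independent of $s$, with
$$|K_U(s) - K_V(s)| \le c_{U,V} \quad \text{for all } s,$$
a constant that is negligible for long strings but may dominate the value of $K(s)$ when $s$ is short.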
February 15, 2017
The Kolmogorov complexity of x, denoted C(x), is the length of the shortest program that generates x. For such a simple definition, Kolmogorov complexity has a rich and deep theory, as well as applications to a wide variety of topics including learning theory, complexity lower bounds and SAT algorithms. Kolmogorov complexity typically focuses on decompression, going from the compressed program to the original string. This paper develops a dual notion of compression, the map...
November 28, 2013
The problem is to identify a probability associated with a set of natural numbers, given an infinite data sequence of elements from the set. If the given sequence is drawn i.i.d. and the probability mass function involved (the target) belongs to a computably enumerable (c.e.) or co-computably enumerable (co-c.e.) set of computable probability mass functions, then there is an algorithm to almost surely identify the target in the limit. The technical tool is the strong law of ...
July 30, 2024
We study practical approximations to Kolmogorov prefix complexity (K) using IMP2, a high-level programming language. Our focus is on investigating the optimality of the interpreter for this language as the reference machine for the Coding Theorem Method (CTM), a method advanced, on the principles of algorithmic probability, to deal with applications of algorithmic complexity beyond the popular traditional lossless compression approach. The chosen model of computation is pr...