October 1, 2001
We present statistical analyses of the large-scale structure of three types of semantic networks: word associations, WordNet, and Roget's thesaurus. We show that they have a small-world structure, characterized by sparse connectivity, short average path-lengths between words, and strong local clustering. In addition, the distributions of the number of connections follow power laws that indicate a scale-free pattern of connectivity, with most nodes having relatively few connections joined together through a small number of hubs with many connections. These regularities have also been found in certain other complex natural networks, such as the world wide web, but they are not consistent with many conventional models of semantic organization, based on inheritance hierarchies, arbitrarily structured networks, or high-dimensional vector spaces. We propose that these structures reflect the mechanisms by which semantic networks grow. We describe a simple model for semantic growth, in which each new word or concept is connected to an existing network by differentiating the connectivity pattern of an existing node. This model generates appropriate small-world statistics and power-law connectivity distributions, and also suggests one possible mechanistic basis for the effects of learning history variables (age-of-acquisition, usage frequency) on behavioral performance in semantic processing tasks.
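The growth idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact rules, so everything specific here is an assumption made for illustration: the seed network of `m + 1` fully connected nodes, the choice of the node to differentiate with probability proportional to its degree, and the wiring of each new node to `m` randomly chosen members of that node's neighborhood. The function names (`grow_network`, `average_clustering`, `average_path_length`) are hypothetical. The two helper functions compute the small-world statistics mentioned above: local clustering and average shortest path length.

```python
import random
from collections import deque


def grow_network(n_nodes, m=3, seed=0):
    """Grow an undirected network by differentiating existing nodes.

    Assumed rules (not taken from the abstract): start from m + 1 fully
    connected nodes; each new node picks an existing node with probability
    proportional to its degree, then links to m random members of that
    node's neighborhood (the node itself plus its neighbors).
    """
    rng = random.Random(seed)
    adj = {i: set(range(m + 1)) - {i} for i in range(m + 1)}
    for new in range(m + 1, n_nodes):
        nodes = list(adj)
        weights = [len(adj[v]) for v in nodes]
        target = rng.choices(nodes, weights=weights, k=1)[0]
        # Every node has degree >= m, so the neighborhood has >= m + 1 members.
        neighborhood = sorted(adj[target] | {target})
        chosen = rng.sample(neighborhood, m)
        adj[new] = set()
        for v in chosen:
            adj[new].add(v)
            adj[v].add(new)
    return adj


def average_clustering(adj):
    """Mean fraction of each node's neighbor pairs that are themselves linked."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def average_path_length(adj):
    """Mean shortest-path length over all reachable ordered pairs (BFS)."""
    total = pairs = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


network = grow_network(200, m=3, seed=0)
degrees = sorted((len(nbrs) for nbrs in network.values()), reverse=True)
print("largest degrees:", degrees[:5])
print("average clustering:", round(average_clustering(network), 3))
print("average path length:", round(average_path_length(network), 3))
```

Because each new node attaches inside one existing neighborhood, it tends to close triangles, which is one plausible route to strong local clustering; the degree-weighted choice of target is what lets a few nodes accumulate many connections in this sketch.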
Similar papers
June 25, 2001
The lexicon consists of a set of word meanings and their semantic relationships. A systematic representation of the English lexicon based on psycholinguistic considerations has been put together in the WordNet database in a long-term collaborative effort. We present here a quantitative study of the graph structure of WordNet in order to understand the global organization of the lexicon. We find that semantic links follow power-law, scale-invariant behaviors typical of self-o...
June 27, 2002
We define two words in a language to be connected if they express similar concepts. The network of connections among the many thousands of words that make up a language is important not only for the study of the structure and evolution of languages, but also for cognitive science. We study this issue quantitatively, by mapping out the conceptual network of the English language, with the connections being defined by the entries in a Thesaurus dictionary. We find that this netw...
April 24, 2023
Interpreting natural language is an increasingly important task in computer algorithms due to the growing availability of unstructured textual data. Natural Language Processing (NLP) applications rely on semantic networks for structured knowledge representation. The fundamental properties of semantic networks must be taken into account when designing NLP algorithms, yet they remain to be structurally investigated. We study the properties of semantic networks from ConceptNet, ...
February 10, 2016
Recent empirical and modeling research has focused on the semantic fluency task because it is informative about semantic memory. An interesting interplay arises between the richness of representations in semantic memory and the complexity of algorithms required to process it. It has remained an open question whether representations of words and their relations learned from language use can enable a simple search algorithm to mimic the observed behavior in the fluency task. He...
February 2, 2011
In this paper we extract the topology of the semantic space in its encyclopedic acception, measuring the semantic flow between the different entries of the largest modern encyclopedia, Wikipedia, and thus creating a directed complex network of semantic flows. Notably at the percolation threshold the semantic space is characterised by scale-free behaviour at different levels of complexity and this relates the semantic space to a wide range of biological, social and linguistics...
November 29, 2017
Recent work has attempted to characterize the structure of semantic memory and the search algorithms which, together, best approximate human patterns of search revealed in a semantic fluency task. There are a number of models that seek to capture semantic search processes over networks, but they vary in the cognitive plausibility of their implementation. Existing work has also neglected to consider the constraints that the incremental process of language acquisition must plac...
May 4, 2001
Human language can be described as a complex network of linked words. In such a treatment, each distinct word in language is a vertex of this web, and neighboring words in sentences are connected by edges. It was recently found (Ferrer and Solé) that the distribution of the numbers of connections of words in such a network is of a peculiar form which includes two pronounced power-law regions. Here we treat language as a self-organizing network of interacting words. In the f...
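The network construction described in this entry (distinct words as vertices, adjacent words in a sentence joined by an edge) can be sketched as follows. This is only an illustration under simple assumptions: lowercase whitespace tokenization, undirected edges, and a hypothetical function name `cooccurrence_network`; the example sentences are invented and are not data from the paper.

```python
from collections import defaultdict


def cooccurrence_network(sentences):
    """Link every pair of words that appear next to each other in a sentence."""
    adj = defaultdict(set)
    for sentence in sentences:
        words = sentence.lower().split()  # assumed tokenization: split on whitespace
        for left, right in zip(words, words[1:]):
            if left != right:  # skip self-loops from repeated words
                adj[left].add(right)
                adj[right].add(left)
    return dict(adj)


sentences = ["the cat sat on the mat", "the dog sat on the rug"]
network = cooccurrence_network(sentences)
# Number of distinct neighbors per word, i.e. the "number of connections".
degrees = {word: len(neighbors) for word, neighbors in network.items()}
print(sorted(degrees.items(), key=lambda item: -item[1]))
```

Even in this toy input the function word "the" collects the most connections, which hints at how a degree distribution over a large corpus could be tabulated from the same structure.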
December 22, 2003
A thesaurus is one of many possible representations of term (or word) connectivity. The terms of a thesaurus are seen as the nodes and their relationships as the links of a directed graph. The directionality of the links retains all the thesaurus information and allows the measurement of several quantities. This has led to a new term classification according to the characteristics of the nodes, for example, nodes with no links in, no links out, etc. Using an electronic ...
August 5, 2022
Centrality, in some sense, captures the extent to which a vertex controls the flow of information in a network. Here, we propose Local Detour Centrality as a novel centrality-based betweenness measure that captures the extent to which a vertex shortens paths between neighboring vertices as compared to alternative paths. After presenting our measure, we demonstrate empirically that it differs from other leading centrality measures, such as betweenness, degree, closeness, and the ...
January 6, 2008
The classical forms of knowledge representation fail when a strong dynamical interconnection between system and environment comes into play. We propose here a model of information retrieval derived from the Kintsch-Ericsson scheme, based upon a long term memory (LTM) associative net whose structure changes in time according to the textual content of the analyzed documents. Both the theoretical analysis carried out by using simple statistical tools and the tests show the appea...