acm - an acm publication

Articles

P2P networks are inherently unstable

Ubiquity, Volume 2018 Issue June, June 2018 | BY Ted G. Lewis 


Full citation in the ACM Digital Library  | PDF


Ubiquity

Volume 2018, Number June (2018), Pages 1-10

P2P networks are inherently unstable
Ted G. Lewis
DOI: 10.1145/3223880

The P2P technology underlying file-sharing systems like Gnutella and distributed autonomous organizations like blockchain are inherently unstable because of self-organizing processes akin to Gause's competitive exclusion principle, and preferential attachment. To maintain an egalitarian P2P organization it is necessary to conserve the original network's entropy, defined as the random structure of the network and actions among peers.

Peer-to-peer (P2P) networks have been around for decades. Perhaps the most popular instantiations are Gnutella for sharing music and more recently, blockchain for recording crypto-currency transactions like Bitcoin. The goal of a P2P network is to democratize information and decentralize control. Every node of the network is assumed to be on an even level with all other nodes. Governance is democratized because there is no central control of the network. Because P2P networks are inherently unstable, these idealistic goals often fail. Over time, egalitarian P2P networks self-organize into non-egalitarian systems with a handful of dominant nodes, and many subservient nodes. They form hub-like structure because of Gause's competitive exclusion principle and preferential attachment. The emergence of unintended structure de-stabilizes the P2P network over time, leaving a few nodes in control of the many nodes. Bitcoin and its underlying distributed P2P blockchain, illustrates the inevitable self-organization that emerges from democratic and egalitarian networks, rendering systems like Bitcoin far less than egalitarian and decentralized.

Egalitarian Systems Tend to Self-Organize

British statistician George Udny Yule (1871-1951) was perhaps the first mathematician to observe preferential attachment in evolutionary systems in 1925 [1]. Dubbed the "Yule Distribution" by Herbert A. Simon, Yule's Distribution is now known as a power law of the form 1/xp, where p > 0. Power laws are long-tailed, indicating smaller values of x are much more likely to occur than large values. A heavy-tailed power law is one in which p is relatively small, and a short-tailed distribution is one in which p is relatively large. Power law distributions are distinctively non-random distributions, being biased toward small values of x.

Preferential attachment is a form of reinforcement learning whereby the likelihood of selecting an item increases with its frequency of being previously selected. Sidebar 1 illustrates how preferential attachment emerges from randomness to form a non-random power law distribution. The lopsided power law is long-tailed, meaning there are very few highly selected items, and many rarely selected items (see Figure 1b).

The topology of a network of unevenly connected nodes tends to form highly connected hubs versus uniformly random connections because small unevenness or inequality builds over time, leading to a preference of a few nodes over all others. Networks whose node connections obey a power law are called scale-free. When applied to networks such as the internet or virtual networks such as blockchain, scale-free structure emerges as a byproduct of socio-economic organizing principles such as lower cost, higher bandwidth, or preferred social connections. For one or more of these reasons, a handful of highly connected or preferred nodes arise, while less-favored nodes decline in popularity. In Gnutella, for example, higher bandwidth nodes that deliver music faster and with greater frequency are favored over less capable nodes. Eventually, the topological structure of Gnutella became a scale-free network with a few vary large hubs, and many small hubs [2, 3]. Similarly, the Bitcoin blockchain has become centralized around a handful of shifting dominant miners a number of times over its short life.

Systems that are completely democratic and egalitarian obey a binomial distribution and systems with dominant nodes tend to obey a power law. Thus, the shape of a network's distribution of connections is a test of its egalitarian structure, or lack of it. Research has shown that Gnutella followed a power-law-like distribution as it evolved over the years. Gnutella (unconsciously) self-organized, so it is no longer an egalitarian network [4, 5]. We claim self-organization always emerges from P2P networks unless additional work is inputted to prevent preferential attachment. Emergence of structure renders egalitarian networks unstable relative to their objective of being without a central controller or undemocratic principles of operation.

Competitive Exclusion Principle

Russian biologist Georgii Frantsevich Gause formulated the competitive exclusion principle by noting a slightly more capable species competing in a closed ecological system tends to dominate all other species, and perhaps even monopolizes resources, over time. The dominant species may initially possess a very minor advantage over its competitors, but if it is able to exploit that advantage, it will rise to the top echelon of the ecosystem. This idea, while not always true in all cases, may explain why egalitarian P2P networks evolve into non-egalitarian networks. Sooner or later, a node learns to exploit an advantage and leverages that advantage into a dominant position in the ecosystem. Figure 2 illustrates the application of the competitive exclusion principle in operation in an open market where consumers select competitors based on a preference. The details are explained in Sidebar 2.

Bitcoin and the underlying P2P blockchain is an illustration of Gause's law. Satoshi Nakamoto envisioned blockchain as a P2P network of equal nodes all incentivized to maintain equality and fairness across all nodes of the system. But over time, miners that invested in faster and more plentiful proof-of-work machines began to dominate because they were rewarded more than slower and less capable miners. Socio-economic-technical factors like the cost of electrical power, bandwidth, processing power, and the formation of mining cabals destroyed the egalitarian and democratic structure intended by Nakamoto. Over time, blockchain has self-organized into a network of thousands of miners, but only a handful are highly connected and highly successful. It is no longer a level playing field.

Conservation of Entropy

Truly equal P2P networks are randomly connected networks that obey binomial distributions of connectivity. Random connectivity means they are high-entropy networks. Self-organized networks that obey power laws are low-entropy networks because they have traded randomness for structure. For example, a random network of n = 100 nodes and m = 200 connections will contain approximately 2.5 bits of entropy. A scale-free network of the same number of nodes and connections will have approximately 1.6 bits of entropy, because of its power law structure. The bin selection experiments of Figure 1a and 1b contained roughly 3.0 bits and 2.5 bits of entropy, respectively. A hub-dominated scale-free network trades entropy for hub-like structure.

The key to maintaining egalitarian P2P networks is to guarantee conservation of entropy. That is, some effort must go into maintaining randomness. If we apply this idea to the Bitcoin blockchain, as an illustration, we might guarantee randomness by adding a randomizer to the delegation of proof-of-work. We also have to change the reward system of Bitcoin. Here is how a randomizer works.

Transactions are bundled together to form blocks, which must be added to the chain. In classic blockchain, the first miner to perform proof-of-work is rewarded with payment. This means miners with faster and cheaper computers will eventually dominate. In an entropy-conserving blockchain a miner is selected at random to perform the hash coding and Merkle Tree processing functions necessary to add a block to the chain. All other miners only need to verify the work of the chosen miner in order to insert the new block into their copies of the chain. Blocks are added faster and everyone eventually gets paid when they are selected frequently and at random.

Membership now becomes a problem, because classic blockchain does not limit the number of miners. Anyone can join the blockchain. A very large number of miners lead to infrequent rewards and a slower blockchain. Therefore, in addition to a randomizer, the modern blockchain needs a membership process. If the blockchain is truly democratic, then miners are elected and allowed to retain membership for an economically rewarding period of time, adequate to incentivize participation.

Non-Bitcoin blockchains, without proof-of-work and without open membership, are appearing on a daily basis. The trend seems to be toward community-based blockchains where membership is defined by a specific industry. For example, the food supply chain is populated only by supply chain participants and associated shippers. Financial institutions are building blockchains for a closed community of banks. Trading partners reduce friction on their trading platforms while restricting membership to trading partners, only. The problem of membership may not be a problem after all, but the even distribution of workload will remain problematic unless a randomizer is added to the algorithm. Otherwise, what is to prevent preferential attachment from setting it?

References

[1] Lewis, T.G. Network Science: Theory and Applications. John Wiley & Sons, 2009.

[2] Barabasi, A. and Albert, R. Emergence of scaling in random networks. Science 286, 5439 (1999), 509–512.

[3] Ripeanu, M., Foster, I. and Iamnitchi, A. Peer-to-Peer architecture case study: Gnutella network. University of Chicago Technical Report TR-2001-26. 2002; https://newtraell.cs.uchicago.edu/research/publications/techreports/TR-2001-26

[4] S. Saroiu, S. , Gummadi, P. K., and Gribble, S. D. A measurement study of peer-to-peer file sharing systems. SPIE Proceedings Vol. 4673. 2002; https://people.mpi-sws.org/~gummadi/papers/mmcn.pdf

[5] Xie, C. Analysis of large-scale peer-to-peer network topology. IEEE Globecom 2006 (San Francisco). IEEE, Washington D.C., 2007; https://grid.cs.gsu.edu/~cscyqz/courses/aos/sample-paper/sample-paper.pdf

Author

Ted Lewis is a retired professor of computer science interested in network science, social media, and emerging technologies, and has published over 30 books on topics ranging from personal computing to complexity theory.

Consider the following simple experiment. Arrange eight bins along an x-axis as shown in Figure 1, and select bins at random to produce the probability distribution shown in Figure 1a. After 1,000 trials, the distribution of selected bins is approximately uniform. However, notice there is unevenness due to statistical fluctuations in the trials.

Consider a slightly more elaborate experiment where bin selection is biased by the probability of each bin. In this experiment, the probability of selecting a bin is proportional to the probability of previous selections. We can sample from the cdf(x) to obtain bin number x, according to the pdf(x) established by prior selections. That is, the selection probability is reinforced and biased by the history of selections. As a bin is selected more often, it becomes more attractive as a future selection. A small unevenness in initial selections creates a bias in future selections. Future selections prefer to select high probability past selections, hence the term "preferential attachment." This preference is why a handful of dominant bins appear as a small unevenness grows into a larger unevenness over time.

Figure 1b shows the results of experiments where the probability of selecting a bin depends on the pdf(x), x = bin#. The entropy of the uneven distribution is lower, because of the structure that emerges from small unevenness in the initial distribution. Note that the dominant bin cannot be determined beforehand, because the small unevenness is random.

Figure 1c shows the results of analyzing the probability distributions of the four experiments shown in Figure 1b. The drop-off in pdf(x) approximates a power law. Thus, a long-tailed distribution emerges from preferential attachment, and entropy declines due to the increase in uneven structure.

In summary, the distribution of selecting one of eight bins 1,000 times at random is uniform with an entropy of approximately 3.0. (see Figure 1a). Preferential attachment of bins with entropy readings ranging from 2.51 to 2.69 is show in Figure 1b. Note that entropy declines as randomness declines, indicating the onset of structure in the form or preferred bins (see Figure 1c). Preferential attachment produces a long-tailed pdf(x).

F1-1AFigure 1a. Random selection of bins.

F1-1BFigure 1b. Preferred selection of bins.

F1-1CFigure 1c. Preferential attachment produces pdf(x) that approximates a power law. The results of four experiments plus the average over all four are shown in this bar chart. The power law is fit to the average. R-squared is 0.71 (low) because of the low number of data points used here: four experiments and eight bins and 1,000 trials.

Gause's competitive exclusion principle applies to P2P networks even when they are initially considered equals because of unevenness in socio-economic conditions such as the cost of electricity in the case of Bitcoin, and popularity or features of a product in a competitive open market as in the open internet. Figure 2 illustrates the formation of scale-free (centralized) networks due to preferential attachment. It is not important why consumers select one node over all others, only that they prefer one for some reason.

Gause's law tends to form a monopoly, although it also allows for oligopoly formation. Monopoly and oligopoly formation renders P2P networks unsuitable as decentralized and democratic networks. The very idea of a P2P network is to maintain equality among nodes. No subset should dominate others.

In Figure 2, the stages in the formation of biased network structure due to preferential attachment illustrate how consumers migrate to competitors (nodes) with one or more superior properties such as lower cost, better service, etc.

F2-2Figure 2. Snapshots a-f show the formation of a network of nodes connected by users that indicate preference by selecting one node over all others. Snapshot e contains three dominant nodes, while snapshot f shows only one remaining. Graph g shows the growth to a single dominant node as in f, while graph h shows three preferred nodes as in e.

2018 Copyright held by the Owner/Author.

The Digital Library is published by the Association for Computing Machinery. Copyright © 2018 ACM, Inc.

COMMENTS

POST A COMMENT
Leave this field empty

2018 Symposia

Ubiquity symposium is an organized debate around a proposition or point of view. It is a means to explore a complex issue from multiple perspectives. An early example of a symposium on teaching computer science appeared in Communications of the ACM (December 1989).

To organize a symposium, please read our guidelines.

 

Ubiquity Symposium: Big Data

Table of Contents

  1. Big Data, Digitization, and Social Change (Opening Statement) by Jeffrey Johnson, Peter Denning, David Sousa-Rodrigues, Kemal A. Delic
  2. Big Data and the Attention Economy by Bernardo A. Huberman
  3. Big Data for Social Science Research by Mark Birkin
  4. Technology and Business Challenges of Big Data in the Digital Economy by Dave Penkler
  5. High Performance Synthetic Information Environments: An integrating architecture in the age of pervasive data and computing By Christopher L. Barrett, Jeffery Johnson, and Madhav Marathe
  6. Developing an Open Source "Big Data" Cognitive Computing Platform by Michael Kowolenko and Mladen Vouk
  7. When Good Machine Learning Leads to Bad Cyber Security by Tegjyot Singh Sethi and Mehmed Kantardzic
  8. Corporate Security is a Big Data Problem by Louisa Saunier and Kemal Delic
  9. Big Data: Business, technology, education, and science by Jeffrey Johnson, Luca Tesei, Marco Piangerelli, Emanuela Merelli, Riccardo Paci, Nenad Stojanovic, Paulo Leitão, José Barbosa, and Marco Amador
  10. Big Data or Big Brother? That is the question now (Closing Statement) by Jeffrey Johnson, Peter Denning, David Sousa-Rodrigues, Kemal A. Delic