Peer-to-peer (abbreviated to P2P) refers to a computer network in which each computer can act as a client or server for the other computers in the network, allowing shared access to files and peripherals without the need for a central server. P2P networks can be set up in the home, in a business, or over the Internet. Each network type requires all computers in the network to use the same or a compatible program to connect to each other and access files and other resources found on the other computers. P2P networks can be used for sharing content such as audio, video, data, or anything in digital format.
P2P is a distributed application architecture that partitions tasks or workloads among peers. Peers are equally privileged participants in the application, and each computer in the network is referred to as a node. The owner of each computer on a P2P network sets aside a portion of its resources, such as processing power, disk storage, or network bandwidth, to be made directly available to other network participants, without the need for central coordination by servers or stable hosts. With this model, peers are both suppliers and consumers of resources, in contrast to the traditional client–server model, where only servers supply (send) and clients consume (receive).
The first prominent peer-to-peer application was the file sharing system Napster, originally released in 1999. The concept has inspired new structures and philosophies in many areas of human interaction. Peer-to-peer networking is not restricted to technology; it also covers social processes with a peer-to-peer dynamic. In this context, social peer-to-peer processes are currently emerging throughout society.
Architecture of P2P systems
Peer-to-peer systems often implement an abstract overlay network, built at the Application Layer,
on top of the native or physical network topology. Such overlays are
used for indexing and peer discovery and make the P2P system independent
from the physical network topology. Content is typically exchanged
directly over the underlying Internet Protocol (IP) network. Anonymous peer-to-peer systems are an exception, and implement extra routing layers to obscure the identity of the source or destination of queries.
In structured peer-to-peer networks, peers (and, sometimes,
resources) are organized following specific criteria and algorithms,
which lead to overlays with specific topologies and properties. They
typically use distributed hash table (DHT) based indexing, as in the Chord system developed at MIT.
Unstructured peer-to-peer networks do not impose any structure on the overlay networks. Peers in these networks connect in an ad-hoc fashion.
Ideally, unstructured P2P systems would have absolutely no centralized
system, but in practice there are several types of unstructured systems
with various degrees of centralization. Three categories can be distinguished:
- In pure peer-to-peer systems the entire network consists solely of equipotent peers. There is only one routing layer, as there are no preferred nodes with any special infrastructure function.
- Hybrid peer-to-peer systems allow such infrastructure nodes to exist, often called supernodes.
- In centralized peer-to-peer systems, a central server is used for indexing functions and to bootstrap the entire system. Although this has similarities with a structured architecture, the connections between peers are not determined by any algorithm.
The first prominent and popular peer-to-peer file sharing system, Napster, was an example of the centralized model. Freenet and early implementations of the gnutella protocol, on the other hand, are examples of the decentralized model, while modern gnutella implementations, Gnutella2, and the now-deprecated Kazaa network are examples of the hybrid model.
A pure P2P network does not have the notion of clients or servers but only equal peer
nodes that simultaneously function as both "clients" and "servers" to
the other nodes on the network. This model of network arrangement
differs from the client–server
model where communication is usually to and from a central server. A
typical example of a file transfer that does not use the P2P model is
the File Transfer Protocol
(FTP) service in which the client and server programs are distinct: the
clients initiate the transfer, and the servers satisfy these requests.
The P2P overlay network
consists of all the participating peers as network nodes. There are
links between any two nodes that know each other: i.e. if a
participating peer knows the location of another peer in the P2P
network, then there is a directed edge from the former node to the
latter in the overlay network. Based on how the nodes in the overlay
network are linked to each other, we can classify the P2P networks as
unstructured or structured.
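To make the overlay-graph idea concrete, the following minimal Python sketch (all names and the topology are invented for illustration, not taken from any particular protocol) represents peers as nodes and "knows the location of" as directed edges:

```python
# A P2P overlay modeled as a directed graph: an edge u -> v means that
# peer u knows peer v's network location. All names are illustrative.

class Peer:
    def __init__(self, address):
        self.address = address        # e.g. "203.0.113.7:6881"
        self.known_peers = set()      # outgoing edges in the overlay

    def learn_about(self, other):
        """Add a directed edge from this peer to `other`."""
        self.known_peers.add(other)

# Knowledge is asymmetric: c gains an edge to b, but b cannot reach c
# until it learns about c separately.
a, b, c = Peer("a:6881"), Peer("b:6881"), Peer("c:6881")
a.learn_about(b)
b.learn_about(a)
c.learn_about(b)
```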
Structured systems
Structured P2P networks employ a globally consistent protocol to
ensure that any node can efficiently route a search to some peer that
has the desired file, even if the file is extremely rare. Such a
guarantee necessitates a more structured pattern of overlay links. By
far the most common type of structured P2P network is the distributed hash table (DHT), in which a variant of consistent hashing is used to assign ownership of each file to a particular peer, in a way analogous to a traditional hash table's
assignment of each key to a particular array slot. Though the term DHT
is commonly used to refer to the structured overlay, in practice, DHT is
a data structure implemented on top of a structured overlay.
Distributed hash tables
Distributed hash tables (DHTs) are a class of decentralized distributed systems that provide a lookup service similar to a hash table: (key, value) pairs are stored in the DHT, and any participating node
can efficiently retrieve the value associated with a given key.
Responsibility for maintaining the mapping from keys to values is
distributed among the nodes, in such a way that a change in the set of
participants causes a minimal amount of disruption. This allows DHTs to scale to extremely large numbers of nodes and to handle continual node arrivals, departures, and failures.
DHTs form an infrastructure that can be used to build peer-to-peer networks. Notable distributed networks that use DHTs include BitTorrent's distributed tracker, the Kad network, the Storm botnet, YaCy, and the Coral Content Distribution Network.
Some prominent research projects include the Chord project, the PAST storage utility, P-Grid (a self-organized and emerging overlay network), and the CoopNet content distribution system.
DHT-based networks have been widely utilized for accomplishing efficient resource discovery for grid computing systems, as they aid in resource management and the scheduling of applications. Resource discovery involves searching for the resource types that match the user's application requirements. Recent advances in decentralized resource discovery have been based on extending existing DHTs with the capability of multi-dimensional data organization and query routing. The majority of these efforts have looked at embedding spatial database indices, such as space-filling curves (SFCs), including Hilbert curves and Z-curves, as well as k-d trees, MX-CIF quadtrees, and R*-trees, for managing, routing, and indexing complex Grid resource query objects over DHT networks.
Spatial indices are well suited to handling the complexity of Grid resource queries. Although some spatial indices have routing load-balance issues on skewed data sets, all of them scale well in terms of the number of hops traversed and messages generated while searching and routing Grid resource queries. More recent evaluations of P2P resource discovery solutions under real workloads have pointed out several issues in DHT-based solutions, such as the high cost of advertising and discovering resources, and static and dynamic load imbalance.
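As a concrete, deliberately simplified example of such an embedding, a Z-order (Morton) curve maps a multi-attribute resource description to a single one-dimensional key that an ordinary DHT can index. The attribute names and 8-bit quantization below are assumptions made for illustration:

```python
# Mapping a two-attribute Grid resource description to one DHT key by
# interleaving the attributes' bits (a Z-order / Morton curve).

def z_order_key(cpus: int, mem_gb: int, bits: int = 8) -> int:
    """Interleave the bits of two quantized attribute values."""
    key = 0
    for i in range(bits):
        key |= ((cpus >> i) & 1) << (2 * i)        # even bit positions
        key |= ((mem_gb >> i) & 1) << (2 * i + 1)  # odd bit positions
    return key

# Nearby points in (cpus, memory) space tend to receive nearby Z-keys,
# so a multi-dimensional range query maps to a small number of
# contiguous key intervals that the DHT can route to.
print(z_order_key(4, 16))    # a single key usable as a DHT lookup key
```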
Unstructured systems
An unstructured P2P network is formed when the overlay links are
established arbitrarily. Such networks can be easily constructed as a
new peer that wants to join the network can copy existing links of
another node and then form its own links over time. In an unstructured
P2P network, if a peer wants to find a desired piece of data in the
network, the query has to be flooded through the network to find as many
peers as possible that share the data. The main disadvantage of such networks is that queries may not always be resolved. Popular content is likely to be available at several peers, and any peer searching for it is likely to find it; but if a peer is looking for rare data shared by only a few other peers, the search is unlikely to be successful. Since there is no correlation
between a peer and the content managed by it, there is no guarantee
that flooding will find a peer that has the desired data. Flooding also
causes a high amount of signaling traffic in the network and hence such
networks typically have very poor search efficiency. Many of the popular
P2P networks are unstructured.
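A minimal sketch of TTL-limited flooding over an ad-hoc overlay (the topology and all names are invented) shows both behaviors: content within the TTL horizon is found, while rare or absent content simply produces no hits:

```python
# TTL-limited query flooding in an unstructured overlay.

class Peer:
    def __init__(self, name, files=()):
        self.name = name
        self.files = set(files)
        self.neighbors = []

def flood_search(peer, filename, ttl=3, seen=None):
    """Return names of peers holding `filename`, found by flooding."""
    seen = set() if seen is None else seen
    if peer in seen or ttl < 0:
        return set()
    seen.add(peer)                    # avoid re-flooding the same peer
    hits = {peer.name} if filename in peer.files else set()
    for n in peer.neighbors:
        hits |= flood_search(n, filename, ttl - 1, seen)
    return hits

# Ad-hoc topology: a - b, b - c, b - d
a, b = Peer("a"), Peer("b")
c, d = Peer("c", {"popular.mp3"}), Peer("d", {"popular.mp3", "rare.txt"})
a.neighbors = [b]; b.neighbors = [a, c, d]
c.neighbors = [b]; d.neighbors = [b]

print(flood_search(a, "popular.mp3"))      # {'c', 'd'}
print(flood_search(a, "rare.txt", ttl=1))  # set(): beyond the TTL horizon
```

Every query visits every reachable peer within the TTL, which is exactly the signaling overhead described above.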
In pure P2P networks, peers act as equals, merging the roles of client and server. In such networks there is no central server managing the network, nor is there a central router. Some examples of pure P2P Application Layer networks designed for peer-to-peer file sharing are gnutella (pre v0.4) and Freenet.
There also exist hybrid P2P systems, which distribute their clients into two groups: client nodes and overlay nodes. Typically, each client is able to act according to the momentary need of the network and can become part of the respective overlay network used to coordinate the P2P structure. This division between normal and 'better' nodes addresses the scaling problems of early pure P2P networks. Examples of such networks are modern implementations of gnutella (after v0.4) and Gnutella2.
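A minimal sketch of the supernode idea (all names invented for illustration): leaf peers register their shared-file lists with a supernode, which answers searches from its index so queries need not be flooded, while the actual transfer still happens directly between peers:

```python
# Hybrid (supernode) indexing: leaves register files, and queries hit
# the supernode's index instead of being flooded across the overlay.

class Supernode:
    def __init__(self):
        self.index = {}                     # filename -> set of leaf ids

    def register(self, leaf_id, filenames):
        for f in filenames:
            self.index.setdefault(f, set()).add(leaf_id)

    def query(self, filename):
        """Directed search: one hop to the supernode, no flooding."""
        return self.index.get(filename, set())

sn = Supernode()
sn.register("leaf-1", ["song.mp3", "doc.pdf"])
sn.register("leaf-2", ["song.mp3"])
print(sn.query("song.mp3"))   # {'leaf-1', 'leaf-2'}; transfer is then P2P
```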
Another type of hybrid P2P network uses central servers or bootstrapping mechanisms on the one hand, and P2P for its data transfers on the other. Such networks are generally called 'centralized networks' because they are unable to operate without their central server(s). An example of such a network is the eDonkey network (often also called eD2k).
Indexing and resource discovery
Older peer-to-peer networks duplicate resources across each node in
the network configured to carry that type of information. This allows
local searching, but at the cost of considerable traffic.
Modern networks use central coordinating servers and directed search
requests. Central servers are typically used for listing potential peers
(Tor), coordinating their activities (Folding@home), and searching (Napster, eMule).
Decentralized searching was first done by flooding search requests out
across peers. More efficient directed search strategies, including
supernodes and distributed hash tables, are now used.
Peer-to-peer-like systems
In modern definitions of peer-to-peer technology, the term implies
the general architectural concepts outlined in this article. However,
the basic concept of peer-to-peer computing was envisioned in earlier
software systems and networking discussions, reaching back to principles
stated in the first Request for Comments, RFC 1.
A distributed messaging system often likened to an early peer-to-peer architecture is the USENET network news system, which is in principle a client–server model from the user or client perspective when reading or posting news articles.
However, news servers communicate with one another as peers to propagate Usenet news articles over the entire group of network servers. The same consideration applies to SMTP email, in the sense that the core email-relaying network of mail transfer agents has a peer-to-peer character, while the periphery of e-mail clients and their direct connections is strictly a client–server relationship. Tim Berners-Lee's vision for the World Wide Web, as evidenced by his WorldWideWeb editor/browser, was close to a peer-to-peer design in that it assumed each user of the web would be an active editor and contributor, creating and linking content to form an interlinked web of links. This contrasts with the broadcasting-like structure of the web as it has developed over the years.
Advantages and weaknesses
In P2P networks, clients provide resources, which may include bandwidth,
storage space, and computing power. This property is one of the major
advantages of using P2P networks because it makes the setup and running
costs very small for the original content distributor. As nodes arrive
and demand on the system increases, the total capacity of the system
also increases, and the likelihood of failure decreases. If one peer on
the network fails to function properly, the whole network is not
compromised or damaged. In contrast, in a typical client–server
architecture, clients share only their demands with the system, but not
their resources. In this case, as more clients join the system, fewer
resources are available to serve each client, and if the central server
fails, the entire network is taken down. The decentralized nature of P2P
networks increases robustness because it removes the single point of failure that can be inherent in a client-server based system.
Another important property of peer-to-peer systems is the lack of a system administrator. This leads to a network that is easier and faster to set up and keep running, because a full staff is not required to ensure efficiency and stability. However, decentralized networks introduce new security issues, because they are designed so that each user is responsible for controlling their own data and resources. Peer-to-peer networks, like almost all network systems, are vulnerable to insecure and unsigned code that may allow remote access to files on a victim's computer or even compromise the entire network. A user may encounter harmful data by downloading a file that was originally uploaded as a virus disguised as an .exe, .mp3, .avi, or any other filetype. This type of security issue is due to the lack of an administrator maintaining the list of files being distributed.
Harmful data can also be distributed on P2P networks by modifying
files that are already being distributed on the network. This type of
security breach is created by the fact that users are connecting to
untrusted sources, as opposed to a maintained server. In the past this
has happened to the FastTrack network when the RIAA managed to introduce faked chunks into downloads and downloaded files (mostly MP3
files). Files infected with the RIAA virus were unusable afterwards or
even contained malicious code. The RIAA is also known to have uploaded
fake music and movies to P2P networks in order to deter illegal file
sharing.
Consequently, the P2P networks of today have seen an enormous increase in their security and file-verification mechanisms. Modern hashing, chunk verification, and various encryption methods have made most networks resistant to almost any type of attack, even when major parts of the respective network have been replaced by faked or nonfunctional hosts.
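A sketch of the chunk-verification idea, assuming per-chunk SHA-1 hashes published by a trusted source, as BitTorrent-style systems do; the chunk size and function names are illustrative:

```python
import hashlib

CHUNK_SIZE = 256 * 1024       # illustrative; real networks vary

def chunk_hashes(data: bytes) -> list:
    """Hash list published by the original (trusted) distributor."""
    return [hashlib.sha1(data[i:i + CHUNK_SIZE]).hexdigest()
            for i in range(0, len(data), CHUNK_SIZE)]

def verify_chunk(index: int, chunk: bytes, trusted: list) -> bool:
    """Reject tampered chunks, e.g. the faked chunks described above."""
    return hashlib.sha1(chunk).hexdigest() == trusted[index]

original = b"x" * (3 * CHUNK_SIZE)
trusted = chunk_hashes(original)
assert verify_chunk(0, original[:CHUNK_SIZE], trusted)
assert not verify_chunk(0, b"tampered" + original[8:CHUNK_SIZE], trusted)
```

Because every chunk is checked against the trusted hash list before being accepted and re-shared, a poisoned chunk is discarded at the first honest peer it reaches instead of propagating through the network.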
There are both advantages and disadvantages in P2P networks related
to the topic of data backup, recovery, and availability. In a
centralized network, the system administrators are the only forces
controlling the availability of files being shared. If the
administrators decide to no longer distribute a file, they simply have
to remove it from their servers, and it will no longer be available to
users. Along with leaving the users powerless in deciding what is
distributed throughout the community, this makes the entire system
vulnerable to threats and requests from the government and other large
forces. For example, YouTube has been pressured by the RIAA, MPAA, and the entertainment industry to filter out copyrighted content. Because server–client networks are able to monitor and manage content availability, they can offer more stability in the availability of the content they choose to host: a client should not have trouble accessing obscure content that is being shared on a stable centralized network.
P2P networks, however, are more unreliable in sharing unpopular files
because sharing files in a P2P network requires that at least one node
in the network has the requested data, and that node must be able to
connect to the node requesting the data. This requirement is
occasionally hard to meet because users may delete or stop sharing data
at any point.
In this sense, the community of users in a P2P network is completely
responsible for deciding what content is available. Unpopular files will
eventually disappear and become unavailable as more people stop sharing
them. Popular files, however, will be highly and easily distributed.
Popular files on a P2P network actually have more stability and availability than files on central networks. In a centralized network, a simple loss of connection between the server and its clients is enough to cause a failure, but in a P2P network, the connections between every node would have to be lost for sharing to fail. In a centralized system, the administrators are responsible for all data recovery and backups, while in P2P systems each node requires its own backup system.
Because of the lack of central authority in P2P networks, forces such
as the recording industry, RIAA, MPAA, and the government are unable to
delete or stop the sharing of content on P2P systems.
Social and economic impact
The concept of P2P is increasingly evolving to an expanded usage as the relational dynamic active in distributed networks, i.e., not just computer to computer, but human to human. Yochai Benkler has coined the term commons-based peer production to denote collaborative projects such as free and open source software and Wikipedia. Associated with peer production are the concepts of:
- peer governance (referring to the manner in which peer production projects are managed)
- peer property (referring to the new type of licenses which recognize individual authorship but not exclusive property rights, such as the GNU General Public License and the Creative Commons licenses)
- peer distribution (or the manner in which products, particularly peer-produced products, are distributed)
Some researchers have explored the benefits of enabling virtual
communities to self-organize and introduce incentives for resource
sharing and cooperation, arguing that the social aspect missing from
today's peer-to-peer systems should be seen both as a goal and a means
for self-organized virtual communities to be built and fostered.
Ongoing research efforts for designing effective incentive mechanisms
in P2P systems, based on principles from game theory are beginning to
take on a more psychological and information-processing direction.
Applications
There are numerous applications of peer-to-peer networks. The most commonly known is content distribution.
Content delivery
- Many file sharing networks, such as gnutella, G2 and the eDonkey network, popularized peer-to-peer technologies. From 2004 on, such networks have formed the largest contributor of network traffic on the Internet.
- Peer-to-peer content delivery networks (P2P-CDN). See: Giraffic, Kontiki, Ignite, RedSwoosh.
- Peer-to-peer content services, e.g. caches for improved performance such as Correli Caches
- Software publication and distribution (Linux, several games); via file sharing networks.
- Streaming media. P2PTV and PDTP. Applications include TVUPlayer, Joost, CoolStreaming, Cybersky-TV, PPLive, LiveStation, Giraffic and Didiom.
- Spotify uses a peer-to-peer network along with streaming servers to stream music to its desktop music player.
- Peercasting for multicasting streams. See PeerCast, IceShare, FreeCast, Rawflow
- Pennsylvania State University, MIT and Simon Fraser University are carrying out a project called LionShare, designed to facilitate file sharing among educational institutions globally.
- Osiris (Serverless Portal System) allows its users to create anonymous and autonomous web portals distributed via P2P network.
Exchange of physical goods, services, or space
- Peer-to-peer renting web platforms enable people to find and reserve goods, services, or space on the virtual platform, but carry out the actual P2P transaction in the physical world (for example: emailing a local footwear vendor to reserve for you that comfy pair of slippers which you've always had your eyes on, or contacting a neighbor who has listed their weedwacker for rent).
Networking
- Dalesa, a peer-to-peer web cache for LANs (based on IP multicasting).
- Voice Peering Fabric is a peer-to-peer interconnect system for routing VoIP traffic between organizations by utilizing BGP and ENUM technology.
Science
- In bioinformatics, drug candidate identification. The first such program was begun in 2001 by the Centre for Computational Drug Discovery at the University of Oxford in cooperation with the National Foundation for Cancer Research. There are now several similar programs running under the United Devices Cancer Research Project.
- The sciencenet P2P search engine.
Search
- Distributed search engine, a search engine where there is no central server
- YaCy, a free distributed search engine, built on principles of peer-to-peer networks.
- FAROO, a Peer-to-peer Web search engine
Communications networks
- Skype, one of the most widely used internet phone applications, uses P2P technology.
- VoIP (using application layer protocols such as SIP)
- Instant messaging and online chat
- Completely decentralized networks of peers: Usenet (1979) and WWIVnet (1987).
General
- Research like the Chord project, the PAST storage utility, the P-Grid, and the CoopNet content distribution system.
- JXTA, a framework for peer applications. See Collanos Workplace (teamwork software) and Sixearch.
Miscellaneous
- The U.S. Department of Defense has started research on P2P networks as part of its modern network warfare strategy. In May 2003, Dr. Tether, Director of the Defense Advanced Research Projects Agency, testified that the U.S. military is using P2P networks.
- Kato et al.'s studies indicate that over 200 companies, investing approximately US$400 million, are involved in P2P networks. Besides file sharing, companies are also interested in distributed computing and content distribution.
- Wireless community network, Netsukuku
- An earlier generation of peer-to-peer systems were called "metacomputing" or were classed as "middleware". These include Legion and Globus.
- Bitcoin is a peer-to-peer based digital currency.
Historical perspective
Some networks and channels, such as Napster, OpenNAP and IRC file-serving channels, use a client–server structure for some tasks (e.g., searching) and a P2P structure for others. Networks such as gnutella or Freenet use a P2P structure for nearly all tasks, with the exception of finding peers to connect to when first setting up.
P2P architecture embodies one of the key technical concepts of the Internet, described in the first Internet Request for Comments, RFC 1,
"Host Software" dated April 7, 1969. More recently, the concept has
achieved recognition in the general public in the context of the absence
of central indexing servers in architectures used for exchanging multimedia files.
Network neutrality controversy
Peer-to-peer applications present one of the core issues in the network neutrality controversy. Internet service providers (ISPs) have been known to throttle P2P file-sharing traffic due to its high-bandwidth usage.
Compared to Web browsing, e-mail, or many other uses of the Internet, where data is transferred only in short intervals and in relatively small quantities, P2P file sharing often involves relatively heavy bandwidth usage due to ongoing file transfers and swarm/network coordination packets. In October 2007, Comcast, one of the largest broadband Internet providers in the USA, started blocking P2P applications such as BitTorrent.
Their rationale was that P2P is mostly used to share illegal content,
and their infrastructure is not designed for continuous, high-bandwidth
traffic. Critics point out that P2P networking has legitimate uses, and
that this is another way that large providers are trying to control use
and content on the Internet, and direct people towards a client-server-based
application architecture. The client-server model provides financial
barriers-to-entry to small publishers and individuals, and can be less
efficient for sharing large files. As a reaction to this bandwidth throttling, several P2P applications started implementing protocol obfuscation, such as the BitTorrent protocol encryption.
Techniques for achieving "protocol obfuscation" involve removing otherwise easily identifiable properties of protocols, such as deterministic byte sequences and packet sizes, by making the data look as if it were random. The ISPs' solution to the high bandwidth is P2P caching, where an ISP stores the parts of files most accessed by P2P clients in order to reduce its Internet access traffic.
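To illustrate the obfuscation idea, the toy sketch below XORs a payload with a keystream derived from a per-connection random seed, so fixed plaintext signatures disappear from the wire. This demonstrates the principle only; it is not BitTorrent's actual MSE/PE scheme, which also negotiates keys and adds random-length padding so that packet sizes carry no signature:

```python
import hashlib, os

def keystream(seed: bytes, length: int) -> bytes:
    """Pseudorandom bytes derived from a per-connection seed."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def obfuscate(payload: bytes, seed: bytes) -> bytes:
    """XOR with the keystream; applying it twice restores the payload."""
    return bytes(p ^ k for p, k in zip(payload, keystream(seed, len(payload))))

seed = os.urandom(16)     # in practice exchanged during connection setup
wire = obfuscate(b"\x13BitTorrent protocol", seed)
assert obfuscate(wire, seed) == b"\x13BitTorrent protocol"
```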