A diskless node (or diskless workstation) is a workstation or personal computer without disk drives, which employs network booting to load its operating system from a server. (A computer may also be said to act as a diskless node, if its disks are unused and network booting is used.)
Diskless nodes (or computers acting as such) are sometimes known as network computers or hybrid clients. Hybrid client may either just mean diskless node, or it may be used in a more particular sense to mean a diskless node which runs some, but not all, applications remotely, as in the thin client computing architecture.
Advantages of diskless nodes can include lower production cost, lower
running costs, quieter operation, and manageability advantages (for
example, centrally managed software installation).
In many universities and in some large organizations, PCs are used in a similar configuration, with some or all applications stored remotely but executed locally—again, for manageability reasons. However, these are not diskless nodes if they still boot from a local hard drive.
Distinction between diskless nodes and centralized computing
Diskless nodes process data, thus using their own CPU and RAM to run software, but do not store data persistently—that task is handed off to a server. This is distinct from thin clients,
in which all significant processing happens remotely, on the server—the
only software that runs on a thin client is the "thin" (i.e. relatively
small and simple) client software, which handles simple input/output
tasks to communicate with the user, such as drawing a dialog box on the display or waiting for user input.
A collective term encompassing both thin client computing, and its technological predecessor, text terminals (which are text-only), is centralized computing. Thin clients and text terminals
can both require powerful central processing facilities in the servers,
in order to perform all significant processing tasks for all of the
clients.
Diskless nodes can be seen as a compromise between fat clients (such as ordinary personal computers) and centralized computing: they use central storage for efficiency but do not require centralized processing, and they make efficient use of the processing power of even the slowest contemporary CPUs, which would tend to sit idle for much of the time under the centralized computing model.
| | Centralized computing or thin client | Diskless node | Fat client |
|---|---|---|---|
| Local hard drives used | No | No | Yes |
| Local general-purpose processing used | No | Yes | Yes |
Principles of operation
The operating system (OS) for a diskless node is loaded from a server, using network booting. In some cases, removable storage may be used to initiate the bootstrap process, such as a USB flash drive, or other bootable media such as a floppy disk, CD or DVD. However, the firmware
in many modern computers can be configured to locate a server and begin
the bootup process automatically, without the need to insert bootable
media.
For network auto-booting, the Preboot Execution Environment (PXE) or Bootstrap Protocol (BOOTP) network protocols are commonly used to find a server with files for booting the device. Standard full-size desktop PCs can be network-booted in this manner with an add-on network card that includes a UNDI (Universal Network Device Interface) boot ROM. Network booting is also commonly a built-in feature of desktop and laptop PCs intended for business use, since it can be used on an otherwise disk-booted standard desktop computer to remotely run diagnostics, to install software, or to apply a disk image to the local hard drive.
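To make the discovery step more concrete, the sketch below builds the fixed-size request message that a BOOTP client broadcasts when it does not yet know its own IP address. The field layout follows RFC 951; the function name, the MAC address and the transaction ID are hypothetical values chosen for illustration, and no packet is actually sent.

```python
import struct

# BOOTP message layout from RFC 951, in network byte order:
# op, htype, hlen, hops, xid, secs, unused,
# ciaddr, yiaddr, siaddr, giaddr, chaddr, sname, file, vend
BOOTP_FORMAT = "!BBBBIHH4s4s4s4s16s64s128s64s"
BOOTREQUEST = 1      # op code for a client request
HTYPE_ETHERNET = 1   # hardware type: 10 Mb Ethernet


def build_bootp_request(mac: bytes, xid: int) -> bytes:
    """Build a BOOTREQUEST for a client that has no IP address yet."""
    if len(mac) != 6:
        raise ValueError("expected a 6-byte Ethernet address")
    zero_ip = bytes(4)  # all address fields start out as 0.0.0.0
    return struct.pack(
        BOOTP_FORMAT,
        BOOTREQUEST,
        HTYPE_ETHERNET,
        6,        # hlen: length of an Ethernet address
        0,        # hops
        xid,      # transaction ID used to match the reply
        0,        # secs since the client started booting
        0,        # unused
        zero_ip, zero_ip, zero_ip, zero_ip,
        mac,      # chaddr: struct pads it with zeros to 16 bytes
        b"",      # sname: no particular server requested
        b"",      # file: let the server choose the boot file
        b"",      # vend: no vendor-specific data
    )


packet = build_bootp_request(bytes.fromhex("020000000001"), xid=0x12345678)
print(len(packet))  # 300
```

A real client would broadcast this message over UDP to port 67 and wait for a reply on port 68; the reply carries the client's assigned address, the server address and the name of the boot file to fetch.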
After the bootstrapping process has been initiated, as described
above, bootstrapping will take place according to one of three main
approaches.
- In the first approach (used, for example, by the Linux Terminal Server Project), the kernel is loaded into memory and then the rest of the operating system is accessed via a network filesystem connection to the server. (A small RAM disk may be created to store temporary files locally.) This approach is sometimes called the "NFS root" technique when used with Linux or Unix client operating systems.
- In the second approach, the kernel of the OS is loaded and part of the system's memory is configured as a large RAM disk; the remainder of the OS image is then fetched and loaded into the RAM disk. This is the implementation that Microsoft has chosen for its Windows XP Embedded remote boot feature.
- In the third approach, disk operations are virtualized and translated into a network protocol. The data that would usually be stored on a disk drive are instead stored in virtual disk files hosted on a server. Disk operations, such as requests to read or write disk sectors, are translated into corresponding network requests and processed by a service or daemon running on the server side. This is the implementation used by Neoware Image Manager, Ardence, VHD and various "boot over iSCSI" products. The third approach differs from the first in that what is remote is not a file system but a disk device (or raw device), and the client OS is not aware that it is not running from a local hard disk. This is why this approach is sometimes named "Virtual Hard Disk" or "Network Virtual Disk".
This third approach makes it easier to run a client OS than keeping a complete disk image in RAM or using a read-only file system. In this approach, the system uses a "write cache" that stores all data that a diskless node has written. This write cache is usually a file stored on a server (or on client storage, if any). It can also be a portion of the client RAM. The write cache can be persistent or volatile. When it is volatile, all the data that a specific client has written to the virtual disk are discarded when that client is rebooted; user data can nevertheless remain persistent if it is recorded in user (roaming) profiles or home folders stored on remote servers. The two major commercial products (one from Hewlett-Packard, the other from Citrix Systems) that allow the deployment of diskless nodes booting a Microsoft Windows or Linux client OS use such write caches. The Citrix product cannot use a persistent write cache, but VHD and the HP product can.
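The write cache described above can be illustrated with a small in-memory sketch. It is not how any of the named products is implemented: the class names, the 512-byte sector size and the dictionary-based cache are all assumptions made for the example. The point is only that reads fall through to the shared image unless the client has already written that sector, and that a volatile cache is dropped on reboot.

```python
SECTOR_SIZE = 512  # bytes per sector; an assumption for this sketch


class SharedImage:
    """Read-only virtual disk image held on the server (hypothetical)."""

    def __init__(self, data: bytes):
        if len(data) % SECTOR_SIZE != 0:
            raise ValueError("image must be a whole number of sectors")
        self._data = bytes(data)

    def read_sector(self, lba: int) -> bytes:
        start = lba * SECTOR_SIZE
        if lba < 0 or start >= len(self._data):
            raise IndexError("sector out of range")
        return self._data[start:start + SECTOR_SIZE]


class ClientDisk:
    """One client's view of the disk: shared image plus a private write cache."""

    def __init__(self, image: SharedImage, persistent: bool = False):
        self._image = image
        self._persistent = persistent
        self._write_cache = {}  # maps sector number -> sector contents

    def read_sector(self, lba: int) -> bytes:
        # Sectors this client has written come from its own cache;
        # everything else comes from the untouched shared image.
        if lba in self._write_cache:
            return self._write_cache[lba]
        return self._image.read_sector(lba)

    def write_sector(self, lba: int, data: bytes) -> None:
        if len(data) != SECTOR_SIZE:
            raise ValueError("must write exactly one sector")
        self._image.read_sector(lba)  # raises if the sector does not exist
        self._write_cache[lba] = bytes(data)

    def reboot(self) -> None:
        # A volatile cache is discarded on reboot; a persistent one survives.
        if not self._persistent:
            self._write_cache.clear()
```

With two `ClientDisk` objects over the same `SharedImage`, a write by one client is invisible to the other, which is what allows a single image to be shared safely.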
Diskless Windows nodes
Windows 3.x and Windows 95 OSR1 supported Remote Boot operations from NetWare servers, Windows NT servers and even DEC Pathworks servers.
Third-party software vendors such as Qualystem (acquired by Neoware), LanWorks (acquired by 3Com), Ardence (acquired by Citrix), APCT and Xtreamining Technology have developed and marketed software products designed to remote-boot newer versions of the Windows product line: Windows 95 OSR2 and Windows 98 were supported by Qualystem and LanWorks, Windows NT was supported by APCT and Ardence (called VenturCom at that time), and Windows 2000/XP/2003/Vista/Windows 7 are supported by Hewlett-Packard (which acquired Neoware, which had previously acquired Qualystem) and Citrix Systems (which acquired Ardence).
Comparison with fat clients
[ Software installation and maintenance ]
With essentially a single OS image for an array of machines (with perhaps some customizations for differences in hardware configurations among the nodes), installing software and maintaining installed software can be more efficient. Furthermore, any system changes made during operation (due to user action, worms, viruses, etc.) can be either wiped out when the power is removed (if the image is copied to a local RAM disk, as with Windows XP Embedded remote boot) or prohibited entirely (if the image is a network filesystem). This allows use in public access areas (such as libraries) or in schools, where users might wish to experiment or attempt to "hack" the system.
However, network booting is not necessary to achieve either of the above advantages: ordinary PCs (with the help of appropriate software) can be configured to download and reinstall their operating systems on, for example, a nightly basis, although this takes extra work compared with using a shared disk image that diskless nodes boot from.
Modern diskless nodes can share the very same disk image, using a 1:N relationship (one disk image used simultaneously by N diskless nodes). This makes it very easy to install and maintain software applications: the administrator needs to install or maintain the application only once, and the clients get the new application as soon as they boot from the updated image. Disk image sharing is possible because each node uses a write cache: no client competes for writes to the shared disk image, because each client writes to its own cache.
Modern diskless node systems can also use a 1:1 client-to-disk-image relationship, where one client "owns" one disk image and writes directly into that disk image. No write cache is used in this case.
A shared disk image is usually modified as follows:
1. The administrator makes a copy of the shared disk image that
he/she wants to update (this can be done easily because the disk image
file is opened only for reading)
2. The administrator boots a diskless node in 1:1 mode (unshared mode) from the copy of the disk image he/she just made
3. The administrator makes any modification to the disk image (for
instance install a new software application, apply patches or
hotfixes...)
4. The administrator shuts down the diskless node that was using the disk image in 1:1 mode
5. The administrator shares the modified disk image
6. The diskless nodes use the shared disk image (1:N) as soon as they are rebooted.
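The update procedure above can be walked through with a toy model. Everything here is hypothetical: `ImageStore` and `Node` are illustrative names, and an image is represented as a short byte string rather than a real disk image file. The sketch only shows the ordering of the steps, in particular that running nodes keep the old image until they reboot.

```python
class ImageStore:
    """Server-side holder of the image served in 1:N mode (hypothetical)."""

    def __init__(self, shared_image: bytes):
        self.shared = shared_image

    def copy_shared(self) -> bytearray:
        # Step 1: copying is safe because the shared image is only read.
        return bytearray(self.shared)

    def publish(self, modified: bytearray) -> None:
        # Step 5: the modified copy becomes the new shared image.
        self.shared = bytes(modified)


class Node:
    """A diskless node that picks up the shared image at boot time."""

    def __init__(self, store: ImageStore):
        self.store = store
        self.image = None

    def boot(self) -> None:
        # Step 6: a node sees whichever image is shared when it boots.
        self.image = self.store.shared


store = ImageStore(b"os-image-v1")
nodes = [Node(store) for _ in range(3)]
for node in nodes:
    node.boot()

working_copy = store.copy_shared()   # step 1
working_copy[-1:] = b"2"             # steps 2-4: edit the copy in 1:1 mode
store.publish(working_copy)          # step 5

print(nodes[0].image)  # b'os-image-v1' (still running the old image)
nodes[0].boot()        # step 6
print(nodes[0].image)  # b'os-image-v2'
```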
[ Centralized storage ]
The use of central disk storage also makes more efficient use of disk storage. This can cut storage costs, freeing up capital to invest in more reliable, modern storage technologies, such as RAID arrays which support redundant operation, and storage area networks which allow hot-adding of storage without any interruption. Further, it means that losses of disk drives to mechanical or electrical failure, which are statistically highly probable events over a timeframe of years with a large number of disks involved, are often both less likely to happen (because there are typically fewer disk drives that can fail) and less likely to cause interruption (because the drives would likely be part of RAID arrays). This also means that the nodes themselves are less likely to have hardware failures than fat clients.
Diskless nodes share these advantages with thin clients.
Performance of centralized storage
However, as often happens in computing, this storage efficiency can come at the price of decreased performance.
Large numbers of nodes making demands on the same server
simultaneously can slow down everyone's experience. However, this can be
mitigated by installing large amounts of RAM on the server (which speeds up read operations by improving caching
performance), by adding more servers (which distributes the I/O
workload), or by adding more disks to a RAID array (which distributes
the physical I/O workload). In any case this is also a problem which can affect any client-server network to some extent, since, of course, fat clients also use servers to store user data.
Indeed, user data may be much more significant in size and may be
accessed far more frequently than operating systems and programs in some
environments, so moving to a diskless model will not necessarily cause a noticeable degradation in performance.
Greater network bandwidth
(i.e. capacity) will also be used in a diskless model, compared to a
fat client model. This does not necessarily mean that a higher capacity
network infrastructure will need to be installed—it could simply mean
that a higher proportion of the existing network capacity will be used.
Finally, the combination of network data transfer latencies
(physically transferring the data over the network) and contention
latencies (waiting for the server to process other nodes' requests
before yours) can lead to an unacceptable degradation in performance
compared to using local drives, depending on the nature of the
application and the capacity of the network infrastructure and the
server.
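A rough back-of-the-envelope model can show how contention scales with the number of nodes. The function below is a deliberately simplified sketch: it assumes the server's network link is shared evenly among all active nodes, and it ignores protocol overhead, server-side caching and disk speed. The image size and link speed in the usage lines are illustrative numbers, not measurements.

```python
def fetch_seconds(payload_bytes: int,
                  link_bits_per_second: float,
                  active_nodes: int,
                  server_wait_seconds: float = 0.0) -> float:
    """Estimate how long one node takes to fetch a payload from the server.

    Assumes the link is divided evenly among the active nodes, and adds an
    optional fixed wait for the server to get to this node's request.
    """
    if active_nodes < 1:
        raise ValueError("need at least one active node")
    if link_bits_per_second <= 0:
        raise ValueError("link speed must be positive")
    share = link_bits_per_second / active_nodes  # this node's bandwidth
    return payload_bytes * 8 / share + server_wait_seconds


# Hypothetical figures: 100 MB fetched over a 1 Gbit/s server link.
alone = fetch_seconds(100_000_000, 1e9, active_nodes=1)
crowded = fetch_seconds(100_000_000, 1e9, active_nodes=50)
print(alone, crowded)  # 0.8 seconds alone, 40.0 seconds with 50 nodes
```

Under these assumptions the transfer time grows linearly with the number of nodes fetching at once, which is why adding RAM, servers or disks on the server side is the usual mitigation.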
[ Other advantages ]
Another situation where a diskless node would be useful is a possibly hazardous environment where computers are likely to be damaged or destroyed, so that inexpensive nodes with minimal hardware are a benefit. Again, thin clients can also be used here.
Diskless machines may also consume little power and make little noise, which implies potential environmental benefits and makes them ideal for some computer cluster applications.
Comparison with thin clients
Major corporations tend instead to implement thin clients (using Microsoft Windows Terminal Server or other such software), since much lower specification hardware can be used for the client (which essentially acts as a simple "window" into the central server, which is actually running the user's operating system as a login session).
Of course, diskless nodes can also be used as thin clients. Moreover,
thin client computers are increasing in power to the point where they
are becoming suitable as fully-fledged diskless workstations for some
applications.
Both thin client and diskless node architectures employ diskless
clients which have advantages over fat clients (see above), but differ
with regard to the location of processing.
[ Advantages of diskless nodes over thin clients ]
- Distributed load. The processing load of diskless nodes is distributed. Each user gets their own isolated processing environment, barely affecting other users in the network, as long as their workload is not filesystem-intensive. Thin clients rely on the central server for the processing and thus require a fast server. When the central server is busy and slow, both kinds of clients will be affected, but thin clients will be slowed down completely, whereas diskless nodes will only be slowed down when accessing data on the server.
- Better multimedia performance. Diskless nodes have advantages over thin clients in multimedia-rich applications that would be bandwidth intensive if fully served. For example, diskless nodes are well suited for video gaming.
- Peripheral support. Diskless nodes are typically ordinary personal computers or workstations with no hard drives supplied, which means the usual large variety of peripherals can be added. By contrast, thin clients are typically very small, sealed boxes with no possibility for internal expansion, and limited or non-existent possibility for external expansion. Even if, for example, a USB device can be physically attached to a thin client, the thin client software might not support peripherals beyond the basic input and output devices; it may not be compatible with graphics tablets, digital cameras or scanners.
[ Advantages of thin clients over diskless nodes ]
- The hardware is cheaper on thin clients, since processing requirements on the client are minimal, and 3D acceleration and elaborate audio support are not usually provided. Of course, a diskless node can also be purchased with a cheap CPU and minimal multimedia support, if suitable. Thus, cost savings may be smaller than they first appear for some organizations. However, many large organizations habitually buy hardware with a higher than necessary specification to meet the needs of particular applications and uses, or to ensure future proofing. There are also less "rational" reasons for overspecifying hardware which quite often come into play: departments wastefully using up budgets in order to retain their current budget levels for next year; and uncertainty about the future, or lack of technical knowledge, or lack of care and attention, when choosing PC specifications. Taking all these factors into account, thin clients may bring the most substantial savings, as only the servers are likely to be substantially "gold-plated" and/or "future-proofed" in the thin client model.
- Future proofing is not much of an issue for thin clients, which are likely to remain useful for the entirety of their replacement cycle (one to four years, or even longer), as the burden is on the servers. Diskless nodes are more of a concern, as the processing load on each node is potentially much higher, so more consideration is required when purchasing. Thin client networks may require significantly more powerful servers in the future, whereas a diskless node network may in future need a server upgrade, a client upgrade, or both.
- Thin client networks potentially consume less network bandwidth, since much data is simply read by the server and processed there, and only transferred to the client in small pieces, as and when needed for display. Also, graphical data sent to the display is usually better suited to efficient data compression and optimisation technologies than arbitrary programs or user data. In many typical application scenarios, both total bandwidth consumption and "burst" consumption would be expected to be lower for an efficient thin client than for a diskless node.