Lexter

Internet History And Networks

This is the History so far:-

Recently, everyone seems to have heard about the Internet, but did you know that the net has been around since 1969? I always like to start Internet classes with a review of Internet history. Don't worry if you don't understand all the terms; the idea is to get a general picture of Internet history.
1969 - Birth of a Network
The Internet as we know it today, in the mid-1990s, traces its origins back to a Defence Department project in 1969. The subject of the project was wartime digital communications. At that time the telephone system was about the only theatre-scale communications system in use. A major problem had been identified in its design - its dependence on switching stations that could be targeted during an attack. Would it be possible to design a network that could quickly reroute digital traffic around failed nodes? A possible solution had been identified in theory. That was to build a "web" of datagram networks, called a "catenet", and use dynamic routing protocols to constantly adjust the flow of traffic through the catenet. The Defence Advanced Research Projects Agency (DARPA) launched the DARPA Internet Program.
1970s - Infancy
DARPA Internet, largely the plaything of academic and military researchers, spent more than a decade in relative obscurity. As Vietnam, Watergate, the Oil Crisis, and the Iranian Hostage Crisis rolled over the nation, several Internet research teams proceeded through a gradual evolution of protocols. In 1975, DARPA declared the project a success and handed its management over to the Defence Communications Agency. Several of today's key protocols (including IP and TCP) were stable by 1980, and adopted throughout ARPANET by 1983.
Mid 1980s - The Research Net
Let's outline key features, circa-1983, of what was then called ARPANET. A small computer was a PDP-11/45, and a PDP-11/45 does not fit on your desk. Some sites had a hundred computers attached to the Internet. Most had a dozen or so, probably with something like a VAX doing most of the work - mail, news, EGP routing. Users did their work using DEC VT-100 terminals. FORTRAN was the word of the day. Few companies had Internet access, relying instead on SNA and IBM mainframes. Rather, universities and military research sites dominated the Internet community. Its most popular service was the rapid email it made possible with distant colleagues. In August 1983, there were 562 registered ARPANET hosts (RFC 1296).
UNIX deserves at least an honourable mention, since almost all the initial Internet protocols were developed first for UNIX, largely due to the availability of kernel source (for a price) and the relative ease of implementation (relative to things like VMS or MVS). The University of California at Berkeley (UCB) deserves special mention, because their Computer Science Research Group (CSRG) developed the BSD variants of AT&T's UNIX operating system. BSD UNIX and its derivatives would become the most common Internet programming platform.
Many key features of the Internet were already in place, including the IP and TCP protocols. ARPANET was fundamentally unreliable in nature, as the Internet is still today. This principle of unreliable delivery means that the Internet only makes a best-effort attempt to deliver packets. The network can drop a packet without any notification to sender or receiver. Remember, the Internet was designed for military survivability. The software running on either end must be prepared to recognize data loss, retransmitting data as often as necessary to achieve its ultimate delivery.
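The recovery burden this places on the end hosts can be sketched with a toy simulation. This is only an illustration under stated assumptions: the names LossyChannel and send_reliably are made up for this sketch, and the boolean returned by send() stands in for an acknowledgement arriving before a timeout. Real TCP uses sequence numbers, timers and windows rather than this simple loop.

```python
import random


class LossyChannel:
    """Toy best-effort network that silently drops some packets (hypothetical model)."""

    def __init__(self, loss_rate, seed=0):
        self.loss_rate = loss_rate
        self.rng = random.Random(seed)   # seeded so that runs are repeatable
        self.delivered = []              # what the receiver actually got

    def send(self, packet):
        """Return True if the packet arrived; a dropped packet raises no error."""
        if self.rng.random() < self.loss_rate:
            return False                 # dropped without notification
        self.delivered.append(packet)
        return True


def send_reliably(channel, packets, max_tries=50):
    """Retransmit each packet until it gets through; return total transmissions."""
    transmissions = 0
    for packet in packets:
        for _ in range(max_tries):
            transmissions += 1
            if channel.send(packet):
                break
        else:
            raise RuntimeError(f"gave up on {packet!r}")
    return transmissions


channel = LossyChannel(loss_rate=0.3, seed=1)
message = ["pkt-0", "pkt-1", "pkt-2", "pkt-3"]
total = send_reliably(channel, message)
print(channel.delivered == message, total >= len(message))
```

The network itself never reports a loss; all of the recovery logic lives in the sender, which is the point of the design.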
Late 1980s - The PC Revolution
Driven largely by the development of the PC and LAN technology, subnetting was standardized in 1985 when RFC 950 was released. LAN technology made the idea of a "catenet" feasible - an internetwork of networks. Subnetting opened the possibilities of interconnecting LANs with WANs.
The National Science Foundation (NSF) started the Supercomputer Centres program in 1986. Until then, supercomputers such as Cray's were largely the playthings of large, well-funded universities and military research centres. NSF's idea was to make supercomputer resources available to those of more modest means by constructing five supercomputer centres around the country and building a network linking them with potential users. NSF decided to base their network on the Internet protocols, and NSFNET was born. For the next decade, NSFNET would be the core of the U.S. Internet, until its privatisation and ultimate retirement in 1995.
Domain naming was stable by 1987 when RFC 1034 was released. Until then, hostnames were mapped to IP addresses using static tables, but the Internet's exponential growth had made this practice infeasible.
In the late 1980s, important advances related poor network performance to poor TCP performance, and a string of papers by the likes of Nagle and Van Jacobson (RFC 896, RFC 1072, RFC 1144, RFC 1323) presented key insights into TCP performance.
The 1988 Internet Worm was the largest security failure in the history of the Internet. All things considered, it could happen again.
Early 1990s - Address Exhaustion and the Web
In the early 90s, the first address exhaustion crisis hit the Internet technical community. The present solution, CIDR, will sustain the Internet for a few more years by making more efficient use of IP's existing 32-bit address space. For a more lasting solution, IETF is looking at IPv6 and its 128-bit address space, but CIDR is here to stay.
Crisis aside, the World Wide Web (WWW) has been one of Internet's most exciting recent developments. The idea of hypertext has been around for more than a decade, but in 1989 a team at the European Centre for Particle Research (CERN) in Switzerland developed a set of protocols for transferring hypertext via the Internet. In the early 1990s it was enhanced by a team at the National Centre for Supercomputing Applications (NCSA) at the University of Illinois - one of NSF's supercomputer centres. The result was NCSA Mosaic, a graphical, point-and-click hypertext browser that made Internet easy. The resulting explosion in "Web sites" drove the Internet into the public eye.
Mid 1990s - The New Internet
Of at least as much interest as Internet's technical progress in the 1990s has been its sociological progress. It has already become part of the national vocabulary, and seems headed for even greater prominence. The business community, with a resulting explosion of service providers, consultants, books, and TV coverage, has accepted it. It has given birth to the Free Software Movement.
The Free Software Movement owes much to bulletin board systems, but really came into its own on the Internet, due to a combination of forces. The public nature of the Internet's early funding ensured that much of its networking software was non-proprietary. The emergence of anonymous FTP sites provided a distribution mechanism that almost anyone could use. Network newsgroups and mailing lists offered an open communication medium. Last but not least were individualists like Richard Stallman, who wrote EMACS, launched the GNU Project and founded the Free Software Foundation. In the 1990s, Linus Torvalds wrote Linux, the popular (and free) UNIX clone operating system.
\begin{soapbox}
The explosion of capitalist conservatism, combined with a growing awareness of Internet's business value, has led to major changes in the Internet community. Many of them have not been for the good.
First, there seems to be a growing departure from Internet's history of open protocols, published as RFCs. Many new protocols are being developed in an increasingly proprietary manner. IGRP, a trademark of Cisco Systems, has the dubious distinction as the most successful proprietary Internet routing protocol, capable only of operation between Cisco routers. Other protocols, such as BGP, are published as RFCs, but with important operational details omitted. The notoriously mis-named Open Software Foundation has introduced a whole suite of "open" protocols whose specifications are available - for a price - and not on the net. I am forced to wonder: 1) why do we need a new RPC? and 2) why won't OSF tell us how it works?
People forget that businesses have tried to run digital communications networks in the past. IBM and DEC both developed proprietary networking schemes that only ran on their hardware. Several information providers did very well for themselves in the 80s, including LEXIS/NEXIS, Dialog, and Dow Jones. Companies like Tymnet constructed public data networks that ran into every major US city. CompuServe and others built large bulletin board-like systems. Many of these services still offer a quality and depth of coverage unparalleled on the Internet (examine Dialog if you are sceptical of this claim). But none of them offered nudie GIFs that anyone could download. None of them let you read through the RFCs and then write a Perl script to tweak the one little thing you needed to adjust. None of them gave birth to a Free Software Movement. None of them caught people's imagination.
The very existence of the Free Software Movement is part of the Internet saga, because free software would not exist without the net. "Movements" tend to arise when progress offers us new freedoms and we find new ways to explore and, sometimes, to exploit them. The Free Software Movement has offered what would be unimaginable when the Internet was formed - games, editors, windowing systems, compilers, networking software, and even entire operating systems available for anyone who wants them, without licensing fees, with complete source code, and all you need is Internet access. It also offers challenges, forcing us to ask what changes are needed in our society to support these new freedoms that have touched so many people. And it offers chances at exploitation, from the businesses using free software development platforms for commercial code, to the Internet Worm and the security risks of open systems.
People wonder whether progress is better served through government funding or private industry. The Internet defies the popular wisdom of "business is better". Both business and government tried to build large data communication networks in the 1980s. Business depended on good market decisions; the government researchers based their system on openness, imagination and freedom. Business failed; Internet succeeded. Our reward has been its commercialisation.
\end{soapbox}
For the next few years, the Internet will almost certainly be content-driven. Although new protocols are always under development, we have barely begun to explore the potential of just the existing ones. Chief among these is the World Wide Web, with its potential for simple on-line access to almost any information imaginable. Yet even as the Internet intrudes into society, remember that over the last two decades "The Net" has developed a culture of its own, one that may collide with society's. Already business is making its pitch to dominate the Internet. Already Congress has deemed it necessary to regulate the Web. The big questions loom unanswered: How will society change the Internet... and how will the Internet change society?
Protocols
One of the more important networking concepts is the protocol.

Douglas Comer defines a protocol as "a formal description of message formats and the rules two or more machines must follow to exchange those messages."
Protocols usually exist in two forms. First, they exist in a textual form for humans to understand. Second, they exist as programming code for computers to understand. Both forms should ultimately specify the precise interpretation of every bit of every message exchanged across a network.
Protocols exist at every point where logical program flow crosses between hosts. In other words, we need protocols every time we want to do something on another computer. Every time we want to print something on a network printer we need protocols. Every time we want to download a file we need protocols. Every time we want to save our work on disk, we don't need protocols - unless the disk is on a network file server.
Usually multiple protocols will be in use simultaneously. For one thing, computers usually do several things at once, and often for several people at once. Therefore, most protocols support multitasking. Also, one operation can involve several protocols. For example, consider the NFS (Network File System) protocol. A write to a file is done with an NFS operation, that uses another protocol (RPC) to perform a function call on a remote host, that uses another protocol (UDP) to deliver a datagram to a port on a remote host, that uses another protocol to deliver a datagram on an Ethernet, and so on. Along the way we may need to look up host names (using the DNS protocol), convert data to a network standard form (using the XDR protocol), find a routing path to the host (using one or many of numerous protocols) - I think you get the idea.
\begin{soapbox}
One of the challenges facing network designers is to construct protocols that are as specific as possible to one function. For example, I consider NFS a good protocol design because one protocol does file transport (NFS), one protocol does procedure calls (RPC), etc. If you need to make a remote procedure call to print a file, you already have the RPC protocol that already does almost everything you need. Add one piece to the puzzle - a printing protocol, defined in terms of the RPC protocol, and your job is done.
On the other hand, I do not consider TCP a very good protocol, because it mixes two functions: reliable data delivery and connection-oriented streams. Consequently, the Internet lacks a good, reliable datagram delivery mechanism, because TCP's reliable delivery techniques, while effective, are specific to stream connections.
\end{soapbox}
Protocol Layering
Protocols define the format of the messages exchanged over the Internet. They are normally structured in layers, to simplify design and programming.

Protocol layering is a common technique to simplify networking designs by dividing them into functional layers, and assigning protocols to perform each layer's task.
For example, it is common to separate the functions of data delivery and connection management into separate layers, and therefore separate protocols. Thus, one protocol is designed to perform data delivery, and another protocol, layered above the first, performs connection management. The data delivery protocol is fairly simple and knows nothing of connection management. The connection management protocol is also fairly simple, since it doesn't need to concern itself with data delivery.
Protocol layering produces simple protocols, each with a few well-defined tasks. These protocols can then be assembled into a useful whole. Individual protocols can also be removed or replaced as needed for particular applications.
The most important layered protocol designs are the Internet's original DoD model, and the OSI Seven Layer Model. The modern Internet represents a fusion of both models.
DoD Networking Model
The first layered protocol model we will study is the 4-layer DoD Model. This is the model originally designed for the Internet, and is important because all of the Internet's core protocols adhere to it.

The Department of Defence Four-Layer Model was developed in the 1970s for the DARPA Internetwork Project that eventually grew into the Internet. The core Internet protocols adhere to this model, although the OSI Seven Layer Model is justly preferred for new designs.

The four layers in the DoD model, from bottom to top, are:
1. The Network Access Layer is responsible for delivering data over the particular hardware media in use. Different protocols are selected from this layer, depending on the type of physical network.
2. The Internet Layer is responsible for delivering data across a series of different physical networks that interconnect a source and destination machine. Routing protocols are most closely associated with this layer, as is the IP Protocol, the Internet's fundamental protocol.
3. The Host-to-Host Layer handles connection rendezvous, flow control, retransmission of lost data, and other generic data flow management. The mutually exclusive TCP and UDP protocols are this layer's most important members.
4. The Process Layer contains protocols that implement user-level functions, such as mail delivery, file transfer and remote login.
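As a quick reference, the four layers and some example members can be written down as a small lookup table. This is only a sketch: PPP, Ethernet, IP, TCP, UDP and HTTP are named in this course, while SMTP, FTP and Telnet are added here as familiar examples of mail delivery, file transfer and remote login. The helper name stack_order is made up for illustration.

```python
# Layers listed bottom to top, in the same order as the numbered list above.
DOD_LAYERS = ["Network Access", "Internet", "Host-to-Host", "Process"]

# Example protocols per layer (an illustrative selection, not a complete list).
PROTOCOL_LAYER = {
    "PPP": "Network Access",
    "Ethernet": "Network Access",
    "IP": "Internet",
    "TCP": "Host-to-Host",
    "UDP": "Host-to-Host",
    "HTTP": "Process",
    "SMTP": "Process",
    "FTP": "Process",
    "Telnet": "Process",
}


def stack_order(protocols):
    """Sort protocol names from the top of the stack down to the hardware."""
    return sorted(
        protocols,
        key=lambda name: DOD_LAYERS.index(PROTOCOL_LAYER[name]),
        reverse=True,
    )


print(stack_order(["IP", "HTTP", "PPP", "TCP"]))
```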
Encapsulation
Layered protocol models rely on encapsulation, which allows one protocol to be used for relaying another's messages.

Encapsulation, closely related to the concept of Protocol Layering, refers to the practice of enclosing data using one protocol within messages of another protocol.
To make use of encapsulation, the encapsulating protocol must be open-ended, allowing arbitrary data to be placed in its messages. Another protocol can then be used to define the format of that data.
Encapsulation Example
For example, consider an Internet host that requests a hypertext page over a dialup serial connection. The following scenario is likely:
First, the Hypertext Transfer Protocol (HTTP) is used to construct a message requesting the page. The message, the exact format of which is unimportant at this time, is represented as follows:

Next, the Transmission Control Protocol (TCP) is used to provide the connection management and reliable delivery that HTTP requires, but does not provide itself. TCP defines a message header format, which can be followed by arbitrary data. So, a TCP message is constructed by attaching a TCP header to the HTTP message, as follows:

Now TCP does not provide any facilities for actually relaying a message from one machine to another in order to reach its destination. This feature is provided by the Internet Protocol (IP), which defines its own message header format. An IP message is constructed by attaching an IP header to the combined TCP/HTTP message:

Finally, although IP can direct messages between machines, it can not actually transmit the message from one machine to the next. This function is dependent on the actual communications hardware. In this example, we're using a dialup modem connection, so it's likely that the first step in transmitting the message will involve the Point-to-Point Protocol (PPP):

Note that I've drawn the PPP encapsulation a little differently, by enclosing the entire message, not just attaching a header. This is because PPP may modify the message if it includes bytes that can't be transmitted across the link. The receiving PPP reverses these changes, and the message emerges intact. The point to remember is that the encapsulating protocol can do anything it wants to the message - expand it, encrypt it, compress it - so long as the original message is extracted at the other end.
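The nesting described in this example can be sketched in a few lines of code. Treat it as a toy under loud assumptions: the bracketed text headers are placeholders invented for this sketch (real TCP and IP headers are binary structures), the HTTP/1.0 request line is an assumed form of the GET message, and the PPP step shows only the escaping of the flag byte 0x7E and the escape byte 0x7D described in RFC 1662. It leaves out PPP's own header fields, its checksum, and the escaping of control characters.

```python
FLAG, ESC = 0x7E, 0x7D   # PPP frame delimiter and escape byte (RFC 1662)


def add_header(name, payload):
    """Prepend a toy text header; a stand-in for a real binary header."""
    return f"[{name}]".encode("ascii") + payload


def ppp_stuff(data):
    """Escape bytes that would be mistaken for framing, then add the flags."""
    out = bytearray([FLAG])
    for byte in data:
        if byte in (FLAG, ESC):
            out += bytes([ESC, byte ^ 0x20])   # escaped byte is XORed with 0x20
        else:
            out.append(byte)
    out.append(FLAG)
    return bytes(out)


def ppp_unstuff(frame):
    """Reverse ppp_stuff: drop the two flags and undo the escapes."""
    out = bytearray()
    body = iter(frame[1:-1])
    for byte in body:
        if byte == ESC:
            byte = next(body) ^ 0x20
        out.append(byte)
    return bytes(out)


http = b"GET /Connected/index.html HTTP/1.0\r\n\r\n"
tcp = add_header("TCP", http)     # TCP header + HTTP message
ip = add_header("IP", tcp)        # IP header + TCP header + HTTP message
frame = ppp_stuff(ip)             # PPP encloses (and may alter) the whole thing

print(ip)
print(ppp_unstuff(frame) == ip)
```

Each layer treats everything handed to it as opaque data, and only the PPP step changes the bytes in transit, undoing the change at the far end.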
Standards
Protocols must be consistent to be effective. Therefore, standards are agreed upon and published.

Standards are the things that make the Internet work. Almost always they take the form of protocols that everyone has agreed on.
Role of Standards
Standardized protocols provide a common meeting ground for software designers. Without standards, it is unlikely that an IBM computer could transfer files from a Macintosh, or print to a NetWare server, or login to a Sun. The technical literature of the Internet consists primarily of standard protocols that define how software and hardware from wildly divergent sources can interact on the net.
Sources of Standards
Standards come in two flavours - de facto and de jure. De facto standards are common practices; de jure standards have been "blessed" by some official standards body. In the Internet, many different organizations try to play the standards game. IETF, the Internet Engineering Task Force, is chief among them. IETF issues the RFCs that define Internet Standards, and it is IETF's working groups that do the real work of developing new and enhanced Internet standards. ISO, the International Organization for Standardization, issues the OSI standards. IEEE, the Institute of Electrical and Electronics Engineers, issues key LAN standards such as Ethernet and Token-Ring. ANSI, the American National Standards Institute, issues FDDI. As the common oxymoron goes, "The nice thing about standards is that there's so many to choose from."
Requests For Comments (RFCs)
IETF's standards deserve special mention, since it is these standards, more than any other, that make the Internet work. IETF issues its standards as Requests For Comments (RFCs), but not all RFCs are standards. To understand IETF's standardization process, start with Internet Standard 1, "Official Internet Protocol Standards", which discusses the process and lists the current status of various Internet standards. Since RFCs, once issued, do not change, Standard 1 is periodically updated and reissued as a new RFC. At the time of this writing (October 1998), the most recent Standard 1 is RFC 2400.
The Internet Society (ISOC), IETF's parent organization, has a long-standing commitment to open standards. RFC 1602, "Internet Standards Process", includes the following statement:
Except as otherwise provided under this section, ISOC will not accept, in connection with standards work, any idea, technology, information, document, specification, work, or other contribution, whether written or oral, that is a trade secret or otherwise subject to any commitment, understanding, or agreement to keep it confidential or otherwise restrict its use or dissemination; and, specifically, ISOC does not assume any confidentiality obligation with respect to any such contribution.
Example: Hypertext Page Transfer
The encapsulation essay presented an example of transferring a hypertext page over a serial link. Let's take another look at the example, from the standpoint of layered, standard protocols.
· A Web browser requests this URL: http://www.FreeSoft.org/Connected/index.html
A URL (Universal Resource Locator) is a name that identifies a hypertext page. This URL identifies the home page of Connected: An Internet Encyclopaedia. I'll explain URLs in more detail later, but for now let's just say that there are three main parts to it. The http prefix indicates that the Hypertext Transfer Protocol (HTTP) is to be used to obtain the page. The host name www.FreeSoft.org identifies the Internet host that should be contacted to obtain the Web page. Finally, /Connected/index.html identifies the page itself.
· The DNS protocol converts www.FreeSoft.org into the 32-bit IP address 205.177.42.129
The Domain Name System (DNS) doesn't fit neatly into our layered protocol model, but it is a very important protocol. The lower levels of the protocol stack all use 32-bit numeric addresses. Therefore, one of the first steps is to translate the textual host name into a numeric IP address, written as four decimal numbers, separated by periods.
· The HTTP protocol constructs a GET /Connected/index.html message, that will be sent to host 205.177.42.129 to request the Web page.
The HTTP protocol also specifies that TCP will be used to send the message, and that TCP port 80 is used for HTTP operations.
In the DoD model, this is a Process Layer operation.
· The TCP protocol opens a connection to 205.177.42.129, port 80, and transmits the HTTP GET /Connected/index.html message.
The TCP protocol specifies that IP will be used for message transport.
In the DoD model, this is a Host-to-Host Layer operation.
· The IP protocol transmits the TCP packets to 205.177.42.129
The IP protocol also selects a communication link to perform the first step of the transfer, in this case a modem.
In the DoD model, this is an Internet Layer operation.
· The PPP protocol encodes the IP/TCP/HTTP packets and transmits them across the modem line.
In the DoD model, this is a Network Access Layer operation.
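The first few steps of this walk-through can be sketched with the standard library. The function name plan_request is made up for this sketch, and the HTTP/1.0 request line is an assumption about the exact form of the GET message. The network calls are left as comments, since the host in the example may no longer answer.

```python
from urllib.parse import urlsplit


def plan_request(url, default_port=80):
    """Split a URL and build the HTTP message that is handed down to TCP."""
    parts = urlsplit(url)
    host = parts.hostname               # the name DNS must resolve (lower-cased)
    port = parts.port or default_port   # HTTP uses TCP port 80 unless the URL says otherwise
    path = parts.path or "/"
    request = f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii")
    return host, port, request


host, port, request = plan_request("http://www.FreeSoft.org/Connected/index.html")
print(host, port, request)

# Handing the message to the lower layers would look roughly like this:
#
#   import socket
#   with socket.create_connection((host, port)) as conn:  # DNS lookup + TCP connect
#       conn.sendall(request)                              # TCP -> IP -> link layer
```

Everything below the Process Layer is hidden behind the socket call, which is exactly what the layering is meant to achieve.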
Section 1 Review
Congratulations! You've completed the first section of the Programmed Instruction Course. Let's summarize the topics covered in this section:
· Protocol
o A protocol is "a formal description of message formats and the rules two or more machines must follow to exchange those messages."
o Protocols let us perform operations on other computers over a network.
o Many protocols can be in use at once.
o Protocols should be as specific to one task as possible.
· Standard
o Standards are protocols that everyone has agreed upon.
o Standard organizations exist to develop, discuss and enhance protocols.
o The most important Internet standard organization is the Internet Engineering Task Force (IETF).
o The most important Internet standard documents are the Requests For Comments (RFCs).
· Protocol Layering
o Protocols are usually organized by layering them atop one another.
o Protocol layers should have specific, well-defined functions.
o The most important protocol layering designs are the 4-layer Department of Defense (DoD) Model, and the 7-layer Open System Interconnect (OSI) Model.
· 4-Layer DoD Model
o The 4-layer DoD layered protocol model consists of the Process, Host-to-Host, Internet, and Network Access Layers.
o The DoD Model was developed for the Internet.
o The core Internet protocols adhere to the DoD Model.
· Encapsulation
o Encapsulation happens when one protocol's message is embedded into another protocol's message.
o Protocol layering is implemented through encapsulation.
Now, let's proceed into Section 2...
Section 2 - Domain Naming
At the end of Section 1, we examined a simple Internet operation - a hypertext page transfer. Remember these key points?
· A Web browser requested this URL: http://www.FreeSoft.org/Connected/index.html
· The DNS protocol was used to convert www.FreeSoft.org into the 32-bit IP address 205.177.42.129
· The HTTP protocol was used to construct a GET /Connected/index.html message
· A table lookup in /etc/services revealed that HTTP uses TCP port 80
· The TCP protocol was used to open a connection to 205.177.42.129, port 80, and transmit the GET /Connected/index.html message
· The IP protocol was used to transmit the TCP packets to 205.177.42.129
· Some media-dependent protocols were used to actually transmit the IP packets across the physical network
Over the next few sections in this course, we'll be looking at each step of this procedure at a high level of detail.
· In this section, we'll examine the DNS protocol, used in the second step to convert the hostname www.FreeSoft.org into 205.177.42.129, the 32-bit numeric address used by the TCP and IP protocols.
Naming
In an informal way, we've already looked at a lot of the different names used for identification in the Internet. You should already have seen most of these terms.

Several types of names exist in the Internet design model. An understanding of each is critical to the engineer.
· Domain Names are alphanumeric strings used by users to identify Internet hosts. For example, www.FreeSoft.org is a domain name. Domain names are converted into IP addresses by DNS.
· IP Addresses are 32-bit numbers used to identify Internet hosts by the IP Protocol. Sometimes IP addresses must be written in a human-readable format; dotted quad notation is used, with each of the four bytes written as a decimal number, separated by periods. 205.216.34.7 is a dotted quad IP address.
· Service Names are short strings that identify particular services on an Internet host. They must be converted to port numbers before use, which is commonly done using a services table, /etc/services on UN*X machines. Examples of service names are telnet, smtp, and http.
· Port Numbers identify particular services on an Internet host to the TCP and UDP Protocols. They are 16-bit numbers, usually written in decimal, and known by convention. For example, port 25 is used for SMTP mail transfers, and port 80 for HTTP Web transfers.
· Universal Resource Locators (URLs) are used by the World Wide Web to locate and identify Web documents and other resources. URLs typically contain service names, domain names and sometimes port numbers. URLs also include a string, usually a file system path, to distinguish between different documents available through a single server.
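The relationship between dotted quad notation and the underlying 32-bit number can be shown with a short sketch. The function name dotted_quad_to_int is made up for illustration; the address is the one used for www.FreeSoft.org earlier in this course, and Python's standard ipaddress module is used only as a cross-check.

```python
import ipaddress


def dotted_quad_to_int(text):
    """Combine four decimal bytes into one 32-bit number, most significant byte first."""
    value = 0
    for part in text.split("."):
        value = (value << 8) | int(part)
    return value


address = "205.177.42.129"
number = dotted_quad_to_int(address)
print(number, hex(number))

# The standard library performs the same conversion in both directions.
print(number == int(ipaddress.IPv4Address(address)))
print(str(ipaddress.IPv4Address(number)) == address)
```

In the hexadecimal form each pair of digits is one of the four bytes, which makes the byte-by-byte structure of the address easy to see.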
DNS Theory
DNS uses a distributed database to maintain its worldwide tree of names.
DNS uses a distributed database protocol to delegate control of domain name hierarchies among zones, each managed by a group of name servers. For example, *.cnn.com, where * is anything, is completely the responsibility of CNN (Turner Broadcasting, as they say). CNN is responsible for constructing name servers to handle any domain name ending in cnn.com, referred to as their Zone of Authority (ZOA). A zone takes its name from its highest point, so this zone is simply called cnn.com. CNN registers their zone with InterNIC, who loads their name server IP addresses into the root name servers, which makes this information available to the global Internet. CNN can also make subdelegations, like delegating news.cnn.com to their news division. This can be as simple as creating new name server entries with the longer names, but mechanisms exist if the delegee wants to operate an independent name server (see RFC 1034 §4.2).
Of course, CNN doesn't actually maintain their own name server. Like most people, they let their Internet service provider do it for them. In their case, that means ANSnet, so nis.ans.net is their primary name server, and ns.ans.net their backup name server. How do I know this? I accessed InterNIC's Whois service and retrieved cnn.com's domain information record. Follow the link to try this yourself.
So, name servers contain pointers to other name servers, which can be used to traverse the entire domain naming hierarchy. You may be wondering how Internet hosts find an entry point to this system. Currently, it can be done in three major ways, all of which depend on preloading the IP address of at least one name server. One way is to preconfigure addresses of the root name servers. This method is typically used by Internet service providers on their name servers, typically in the UNIX file /etc/namedb/named.root. Another way is to preload the address of a name server that supports recursive queries, and send any name server lookups to it. This method is common among dial-up Internet subscribers. The user preloads the address of the service provider's name server, which processes all queries and returns the answer to the client. The final method is to automatically configure the address of a recursive name server, perhaps using a PPP extension (RFC 1877) that is not yet widely supported.
Once a host has been configured with initial name server addresses, it can use the DNS protocols to locate the name servers responsible for any part of the DNS naming hierarchy, and retrieve the resource records (RRs) that match DNS names to IP addresses and control Internet mail delivery.
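The way referrals lead from a preloaded root server down to the responsible zone can be sketched with a toy in-memory model. Everything here is illustrative: the SERVERS table, the server labels "root" and "com-ns", the host www.cnn.com and the address 192.0.2.10 (taken from a range reserved for documentation) are assumptions made for this sketch. Only cnn.com and nis.ans.net come from the text, and no real DNS messages are exchanged.

```python
# Each toy server knows some addresses and some delegations to other servers.
SERVERS = {
    "root": {"delegations": {"com": "com-ns"}, "addresses": {}},
    "com-ns": {"delegations": {"cnn.com": "nis.ans.net"}, "addresses": {}},
    "nis.ans.net": {"delegations": {}, "addresses": {"www.cnn.com": "192.0.2.10"}},
}


def resolve(name, start="root"):
    """Follow referrals from the starting server until one holds the address.

    Returns the address and the list of servers consulted along the way.
    """
    server, path = start, [start]
    while True:
        data = SERVERS[server]
        if name in data["addresses"]:
            return data["addresses"][name], path
        # Pick the longest delegated zone that the name falls under.
        matches = [zone for zone in data["delegations"]
                   if name == zone or name.endswith("." + zone)]
        if not matches:
            raise LookupError(f"{server} knows nothing about {name}")
        server = data["delegations"][max(matches, key=len)]
        path.append(server)


address, path = resolve("www.cnn.com")
print(address, path)
```

A client that preloads a recursive name server instead simply asks that server to perform this walk on its behalf.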
RFC 1034
Recall from Section 1 that Internet standards are specified in documents called Requests For Comments (RFCs). Although RFCs can be terse and technical, reading them is critical to understanding Internet operation.
The first RFC we'll look at is RFC 1034, Domain Names - Concepts and Facilities.
I've reformatted a number of important RFCs to make them more suitable for Web presentation. If you'd like, click here to see RFC 1034 as it is distributed by InterNIC. This is a large document (123KB), and we're not going to view the RFCs in this format, but you may wish to see what the original documents look like.
In the Encyclopaedia's Web format, each RFC has a top page containing a table of contents that leads to Web pages containing the rest of the document.
Let's take a look at RFC 1034's top page.

DOMAIN NAMES - CONCEPTS AND FACILITIES
1. STATUS OF THIS MEMO
This RFC is an introduction to the Domain Name System (DNS), and omits many details, which can be found in a companion RFC, "Domain Names - Implementation and Specification" [RFC-1035]. That RFC assumes that the reader is familiar with the concepts discussed in this memo.
A subset of DNS functions and data types constitutes an official protocol. The official protocol includes standard queries and their responses and most of the Internet class data formats (e.g., host addresses).
However, the domain system is intentionally extensible. Researchers are continuously proposing, implementing and experimenting with new data types, query types, classes, functions, etc. Thus while the components of the official protocol are expected to stay essentially unchanged and operate as a production service, experimental behaviour should always be expected in extensions beyond the official protocol. Experimental or obsolete features are clearly marked in these RFCs, and such information should be used with caution.
The reader is especially cautioned not to depend on the values which appear in examples to be current or complete, since their purpose is primarily pedagogical. Distribution of this memo is unlimited.
Table of Contents
DNS Design Goals
To get a better understanding of DNS, we'll read some parts of RFC 1034, starting with its Introduction.

The design goals of the DNS influence its structure. They are:
· The primary goal is a consistent name space, which will be used for referring to resources. In order to avoid the problems caused by ad hoc encoding, names should not be required to contain network identifiers, addresses, routes, or similar information as part of the name.
· The sheer size of the database and frequency of updates suggest that it must be maintained in a distributed manner, with local caching to improve performance. Approaches that attempt to collect a consistent copy of the entire database will become more and more expensive and difficult, and hence should be avoided. The same principle holds for the structure of the name space, and in particular mechanisms for creating and deleting names; these should also be distributed.
· Where there are tradeoffs between the cost of acquiring data, the speed of updates, and the accuracy of caches, the source of the data should control the tradeoff.
· The costs of implementing such a facility dictate that it be generally useful, and not restricted to a single application. We should be able to use names to retrieve host addresses, mailbox data, and other as yet undetermined information. All data associated with a name is tagged with a type, and queries can be limited to a single type.
· Because we want the name space to be useful in dissimilar networks and applications, we provide the ability to use the same name space with different protocol families or management. For example, host address formats differ between protocols, though all protocols have the notion of address. The DNS tags all data with a class as well as the type, so that we can allow parallel use of different formats for data of type address.
· We want name server transactions to be independent of the communications system that carries them. Some systems may wish to use datagrams for queries and responses, and only establish virtual circuits for transactions that need the reliability (e.g., database updates, long transactions); other systems will use virtual circuits exclusively.
· The system should be useful across a wide spectrum of host capabilities. Both personal computers and large timeshared hosts should be able to use the system, though perhaps in different ways.
Elements of DNS
The DNS has three major components:
· The DOMAIN NAME SPACE and RESOURCE RECORDS, which are specifications for a tree structured name space and data associated with the names. Conceptually, each node and leaf of the domain name space tree names a set of information, and query operations are attempts to extract specific types of information from a particular set. A query names the domain name of interest and describes the type of resource information that is desired. For example, the Internet uses some of its domain names to identify hosts; queries for address resources return Internet host addresses.
· NAME SERVERS are server programs which hold information about the domain tree's structure and set information. A name server may cache structure or set information about any part of the domain tree, but in general a particular name server has complete information about a subset of the domain space, and pointers to other name servers that can be used to lead to information from any part of the domain tree. Name servers know the parts of the domain tree for which they have complete information; a name server is said to be an AUTHORITY for these parts of the name space. Authoritative information is organized into units called ZONEs, and these zones can be automatically distributed to the name servers which provide redundant service for the data in a zone.
· RESOLVERS are programs that extract information from name servers in response to client requests. Resolvers must be able to access at least one name server and use that name server's information to answer a query directly, or pursue the query using referrals to other name servers. A resolver will typically be a system routine that is directly accessible to user programs; hence no protocol is necessary between the resolver and the user program.
These three components roughly correspond to the three layers or views of the domain system:
· From the user's point of view, the domain system is accessed through a simple procedure or OS call to a local resolver. The domain space consists of a single tree and the user can request information from any section of the tree.
· From the resolver's point of view, the domain system is composed of an unknown number of name servers. Each name server has one or more pieces of the whole domain tree's data, but the resolver views each of these databases as essentially static.
· From a name server's point of view, the domain system consists of separate sets of local information called zones. The name server has local copies of some of the zones. The name server must periodically refresh its zones from master copies in local files or foreign name servers. The name server must concurrently process queries that arrive from resolvers.
In the interests of performance, implementations may couple these functions. For example, a resolver on the same machine as a name server might share a database consisting of the zones managed by the name server and the cache managed by the resolver.
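The way a resolver pursues a query through referrals can be sketched in a few lines of Python. Everything here is a toy: the SERVERS table, the server labels (root, com-server, adnc-server) and the resolve function are hypothetical names made up for illustration, and no network traffic is involved. Each "server" either knows the answer or refers the resolver to a server closer to the name.

```python
# Hypothetical toy data: each server holds answers for its own zone and
# referrals (zone name -> server label) for zones it has delegated.
SERVERS = {
    "root": {"referrals": {"com": "com-server"}},
    "com-server": {"referrals": {"adnc.com": "adnc-server"}},
    "adnc-server": {"answers": {"ns.adnc.com": "205.216.138.22"}},
}


def resolve(name, start="root", max_hops=10):
    """Follow referrals from `start` until an answer or a dead end is found.

    Returns (answer or None, list of servers consulted).
    """
    server = start
    path = [server]
    for _ in range(max_hops):
        data = SERVERS[server]
        answer = data.get("answers", {}).get(name)
        if answer is not None:
            return answer, path
        # Pick the most specific delegated zone that contains the name.
        matches = [zone for zone in data.get("referrals", {})
                   if name == zone or name.endswith("." + zone)]
        if not matches:
            return None, path
        server = data["referrals"][max(matches, key=len)]
        path.append(server)
    raise RuntimeError("too many referrals")
```

Calling resolve("ns.adnc.com") walks root, com-server and adnc-server in turn. A real resolver would send DNS messages at each step and would cache what it learns along the way.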
Domain Names
In this section, we'll be concentrating our attention on domain names.

Internet domains form the basis of the common Internet naming scheme. For example, www.cnn.com is a domain name, and cnn.com is a domain.
Domains are structured in the form of an inverted tree. Each branch or leaf on the tree is labelled with a simple alphanumeric string, and a complete domain name is written by stringing all the labels together, separated by periods. Thus, www.cnn.com is a third-level domain name. The top-level label is com, the second-level label is cnn, and the third-level label is www. Incidentally, there is no standard way to visually distinguish a branch from a leaf. In fact, the Internet domain system makes no distinction between the two, since branches can have any attributes of a leaf, and leaves can have children added to them and become branches.
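As a rough illustration of the labelling scheme, the following Python sketch splits a domain name into its labels and lists them from the top of the tree downwards. The function name split_labels is hypothetical, chosen only for this example.

```python
def split_labels(name):
    """Return the labels of a domain name, top of the tree first."""
    # A trailing period marks a fully qualified name; drop it before splitting.
    labels = name.rstrip(".").split(".")
    # Names are written leaf-first, so reverse them to walk down the tree.
    return list(reversed(labels))


if __name__ == "__main__":
    for level, label in enumerate(split_labels("www.cnn.com"), start=1):
        print(level, label)
```

The number of labels gives the level of the name, which is why www.cnn.com counts as a third-level domain name.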
The diagram below illustrates the novell.com domain. The interpretation of domain names ending in novell.com is solely at the discretion of Novell. They manage the shaded area of the domain name space.

As documented in RFC 1591, top-level domain names take one of two forms. First, they can be generic domains, all of which are populated predominantly by American domains. Alternatively, a top-level domain can be a two-letter country code, as listed in ISO 3166, the most common form for non-American domains.
Generic Domains                            Country Domains (partial list)

com - Commercial                           uk - United Kingdom
edu - Educational                          fr - France
org - Non-profit Organizations             de - Germany
net - Networking Providers                 nl - Netherlands
mil - US Military                          us - United States
gov - US Government                        au - Australia
int - International Organizations          aq - Antarctica
To make use of domain names, they must be converted into 32-bit IP Addresses. This is done using the DNS Protocol.
Domain name registrations are handled by InterNIC in North America, RIPE in Europe, and APNIC in Asia. Domain name assignment is completely distinct from IP address assignment.
Resource Records
Resource records are the data elements that define the structure and content of the domain name space. All DNS operations are ultimately formulated in terms of resource records.

Resource Records (RRs) are the DNS data records. Their precise format is defined in RFC 1035 §3.2.1. The most important fields in a resource record are Name, Class, Type, and Data. Name is a domain name, Class and Type are two-byte integers, and Data is a variable-length field to be interpreted in the context of Class and Type. Almost all Internet applications use Class 1, the Internet Class. For the Internet Class, many standard Types have been defined. The complete list can be found in the current Assigned Numbers RFC. Only those most important to DNS operation are shown here.
Address (A) RRs
Address (A) records match domain names to IP addresses, and are both the most important and the most mundane aspect of DNS. See RFC 1035 §3.4.1 for a more detailed description of the A RR, though there is really very little to describe. The data section consists entirely of a 32-bit IP address. Most DNS operations are queries for A records matching a given domain name. Since hosts can have multiple IP addresses, corresponding to multiple physical network interfaces, it is permissible for multiple A records to match a given domain name. Normally, only the first one is used, so choose a host's most reliable IP address and put it first when constructing name server databases.
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| ADDRESS |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

where:

ADDRESS A 32 bit Internet address.
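Since the data field of an A record is nothing more than four bytes, it is easy to show in code. The Python sketch below converts between dotted-quad text and the 4-byte field, and also shows the same bytes as a single 32-bit integer. The function names are hypothetical; socket.inet_aton, socket.inet_ntoa and struct.unpack are standard library calls.

```python
import socket
import struct


def pack_a_rdata(dotted):
    """Encode a dotted-quad address as the 4-byte data field of an A RR."""
    return socket.inet_aton(dotted)


def unpack_a_rdata(rdata):
    """Decode the 4-byte data field of an A RR back to dotted-quad text."""
    if len(rdata) != 4:
        raise ValueError("A record data must be exactly 4 bytes")
    return socket.inet_ntoa(rdata)


def as_integer(rdata):
    """View the 4 bytes as one unsigned 32-bit integer."""
    # "!I" means network byte order (big-endian), unsigned 32 bits.
    return struct.unpack("!I", rdata)[0]
```

For example, pack_a_rdata("205.216.138.22") gives the four bytes 205, 216, 138 and 22 in that order.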
Canonical Name (CNAME) RR
Canonical Names (CNAMEs) are the DNS equivalent of aliases or symbolic links. The data field contains another fully qualified DNS name, which should be used as the target of another DNS operation to acquire the desired information. However, a second lookup is rarely required, since most name servers will provide the additional records as part of the reply. See RFC 1035 §3.3.1 for a more detailed description of the CNAME RR.
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
/ CNAME /
/ /
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

where:

CNAME A <domain-name> which specifies the canonical or primary
name for the owner. The owner name is an alias.
Pointer (PTR) RR
Pointers (PTRs) are like CNAMEs in their format - the data area contains a domain name. The difference between CNAMEs and PTRs is purely one of semantics. A CNAME specifies an alias; a PTR merely points to another location in the domain name space. The most important use of PTRs is to construct the in-addr.arpa domain, used to convert IP addresses to DNS names (the reverse of the normal process). See RFC 1035 §3.3.12 for a more detailed description of the PTR RR, and RFC 1035 §3.5 for an explanation of the in-addr.arpa domain.
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
/ PTRDNAME /
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

where:

PTRDNAME A <domain-name> which points to some location in the
domain name space.
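To show how the in-addr.arpa domain is laid out, here is a small Python sketch that builds the name under which the PTR record for an address would be stored. The octets are reversed so that the most significant part of the address sits closest to the top of the tree, as RFC 1035 §3.5 describes. The function name reverse_name is hypothetical.

```python
def reverse_name(dotted):
    """Return the in-addr.arpa name that holds the PTR record for an address."""
    octets = dotted.split(".")
    valid = len(octets) == 4 and all(
        o.isdigit() and 0 <= int(o) <= 255 for o in octets
    )
    if not valid:
        raise ValueError("expected a dotted-quad IP address")
    # Domain names are written leaf-first, so the last octet comes first.
    return ".".join(reversed(octets)) + ".in-addr.arpa."


if __name__ == "__main__":
    print(reverse_name("205.216.138.22"))
```

A PTR query for the resulting name is what turns an address back into a host name.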
Start of Authority (SOA) RR
A Start of Authority (SOA) RR marks the beginning of a DNS zone, and is typically seen as the first record in a name server for that domain. The Encyclopaedia's discussion of name servers explains the various fields. See RFC 1035 §3.3.13 for a more detailed description of the SOA RR.
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
/ MNAME /
/ /
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
/ RNAME /
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| SERIAL |
| |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| REFRESH |
| |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| RETRY |
| |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| EXPIRE |
| |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| MINIMUM |
| |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

where:

MNAME The <domain-name> of the name server that was the
original or primary source of data for this zone.

RNAME A <domain-name> which specifies the mailbox of the
person responsible for this zone.

SERIAL The unsigned 32 bit version number of the original copy
of the zone. Zone transfers preserve this value. This
value wraps and should be compared using sequence space
arithmetic.

REFRESH A 32 bit time interval before the zone should be
refreshed.

RETRY A 32 bit time interval that should elapse before a
failed refresh should be retried.

EXPIRE A 32 bit time value that specifies the upper limit on
the time interval that can elapse before the zone is no
longer authoritative.

MINIMUM The unsigned 32 bit minimum TTL field that should be
exported with any RR from this zone.
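The phrase "sequence space arithmetic" in the SERIAL description deserves a short example. Because the value wraps around at 2^32, a plain greater-than test gives the wrong answer near the wrap point. The Python sketch below uses the usual half-the-number-space rule (the one later written down in RFC 1982); that choice of rule and the name serial_is_newer are assumptions of this example, not something RFC 1035 spells out.

```python
SERIAL_MODULUS = 2 ** 32


def serial_is_newer(candidate, current):
    """True if `candidate` is ahead of `current` in 32-bit sequence space."""
    # Distance travelled going forward from current to candidate, with wrap.
    distance = (candidate - current) % SERIAL_MODULUS
    # Ahead by less than half the space counts as newer; equal is not newer.
    return 0 < distance < SERIAL_MODULUS // 2
```

With this rule, serial 5 counts as newer than serial 4294967293, because going forward from the larger number wraps around to 5 after only eight steps.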
Name Server (NS) RR
An NS RR marks the beginning of a DNS zone and supplies the domain name of a name server for that zone. It is typically seen in two places - at the top of a zone, just after the SOA; and at the start of a sub zone, where an NS (and often a paired A) are all that is required to perform delegation. See RFC 1035 §3.3.11 for a more detailed description of the NS RR.
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
/ NSDNAME /
/ /
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

where:

NSDNAME A <domain-name> which specifies a host which should be
authoritative for the specified class and domain.
Introduction to DNS
The Domain Name System (DNS) is used to convert domain names into IP addresses.

Domain naming, and its most visible component, the Domain Name Service (DNS), is critical to the operation of the Internet. The average American phone number, with area code, is 10 digits in length and encodes 10^10, or 10,000,000,000 possibilities. The Internet IP address, at 32 bits, encodes 2^32 or 4,294,967,296 possibilities. For human engineering purposes, how can we build an effective directory of these difficult large numbers?
The telephone company solves this problem with lots of large paper directories, and operators you call and ask about numbers not in your directory. Internet solves this problem with a hierarchy of simple, mnemonic names, called domain names. Instead of remembering 205.216.138.22, all I need to know is the host's domain name - ns.adnc.com. Some people think the dots in a domain name correspond to the dots in the numeric address. This is not the case. There are always three periods in an IP address, separating its four constituent bytes. There are a variable number of periods in a domain name.
The crucial DNS documentation is provided in RFC 1034 and RFC 1035. The Encyclopaedia's Programmed Instruction Course has a DNS Section, and the Encyclopaedia's software section has a Dig page, discussing use of this free software diagnostic tool. DNS also plays an important role in Internet mail delivery.

Name Servers
Name servers are the hosts and programs that answer DNS protocol queries.

A name server is an Internet host running software capable of processing DNS requests. A popular free software name server is BIND Named, for UN*X hosts.
Primary and Secondary Name Servers
Typically, a single name server will be configured as the primary name server for a domain. For backup purposes, a number of other name servers may be configured as secondary name servers. From the standpoint of DNS, there is no difference between primary and secondary name servers, since the resolving algorithm simply uses a domain's NS records in the order provided. Typically, the primary name server is listed first, followed by the secondary, but this is not a requirement. In fact, if a group of domains is served by a set of name servers, the ordering of the name servers may be mixed among the domains, to facilitate load balancing.
A domain's primary name server will have a file on disk containing the RR definitions for that domain. Typically, secondary name servers do not have to be known to the primary. However, some sites, not wishing to publicly distribute copies of their entire domain, restrict zone transfers to preconfigured hosts. Secondary name servers depend on zone transfers for their operation.
Typically, a secondary name server will perform a zone transfer to acquire a complete copy of the primary's RR database, often saving this copy on disk. Periodically, the primary's SOA record for the domain is checked for changes in its SERIAL field. Upon detecting a change, the secondary performs another zone transfer to acquire the updated information. Therefore, the SERIAL field in a domain's SOA record must be changed every time a change is made within the domain.
The timing of secondary updates is governed by several fields in the domain's SOA record. The secondary checks the primary's SOA record every REFRESH seconds. If it cannot perform a scheduled check, it retries every RETRY seconds. If a check can't be performed for EXPIRE seconds, then all the secondary's records for that domain are discarded, and it begins to return errors to lookup requests.
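The interplay of REFRESH, RETRY and EXPIRE is easier to see as a small decision function. The Python sketch below is only a model of the rules in the paragraph above, not the logic of any real name server; the function name secondary_action and its bookkeeping arguments (last_success, last_attempt) are hypothetical. All times are in seconds.

```python
def secondary_action(now, last_success, last_attempt, refresh, retry, expire):
    """Decide what a secondary should do at time `now`.

    last_success: time of the last check that reached the primary
    last_attempt: time of the most recent check, successful or not
    Returns "expire", "check" or "wait".
    """
    if now - last_success >= expire:
        # Too long without contact: discard the zone and return errors.
        return "expire"
    if last_attempt == last_success:
        # The most recent check worked, so wait out the REFRESH interval.
        return "check" if now - last_success >= refresh else "wait"
    # The most recent check failed, so try again every RETRY seconds.
    return "check" if now - last_attempt >= retry else "wait"
```

With REFRESH of 3600 and RETRY of 600, a secondary whose check at time 3600 failed would wait until time 4200 before trying again, and would keep answering from its old copy until EXPIRE seconds had passed since the last successful check.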
Recursion, Caching, and Authoritative Replies
If a name server receives a query for a domain it does not serve, two options are available. The name server may return a referral to the client citing better name servers. Such replies have empty answer sections, and NS records in the authority section pointing to the other servers. Alternatively, the server may recurse by attempting to completely resolve the request through a series of exchanges with other name servers, delaying a reply to the original requester until it is complete.
Most name servers will recurse, since this permits them to cache the various resource records used to access the foreign domain, in anticipation of further similar requests. Every resource record has a Time To Live (TTL) field (distinct from the IP TTL field) which specifies the number of seconds the record may be cached before it must be discarded. Although an explicit TTL can be set on any resource record, most records default to the TTL specified in the MINIMUM field of their SOA. Clients may also cache, according to the same rules.
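Here is a minimal Python sketch of TTL-based caching as just described. It is a toy, not how BIND or any other name server stores records; the class name RRCache and its methods are hypothetical. The clock is passed in so that the ageing of a record can be demonstrated without waiting.

```python
import time


class RRCache:
    """A toy cache keyed by (name, type) whose entries expire after their TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}

    def put(self, name, rrtype, data, ttl):
        # Domain names compare case-insensitively, so normalise the key.
        self._entries[(name.lower(), rrtype)] = (data, self._clock() + ttl)

    def get(self, name, rrtype):
        """Return (data, remaining TTL in seconds), or None if absent or expired."""
        key = (name.lower(), rrtype)
        entry = self._entries.get(key)
        if entry is None:
            return None
        data, expires_at = entry
        remaining = expires_at - self._clock()
        if remaining <= 0:
            del self._entries[key]  # the record has outlived its TTL
            return None
        return data, int(remaining)
```

A record stored with a TTL of 86400 and read back 31746 seconds later reports 54654 seconds remaining. This falling TTL is how a cached answer can be recognised when reading name server replies.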
Part of the DNS message header is the Authoritative Answer (AA) bit. This bit is set in replies that come direct from a primary or secondary name server. This bit is clear in replies that come from a cache.
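To make the AA bit concrete, the Python sketch below unpacks the fixed 12-byte header that starts every DNS message and picks out the individual flags, following the layout in RFC 1035 §4.1.1. The function name decode_header and the dictionary keys are hypothetical; only the bit positions come from the RFC.

```python
import struct


def decode_header(message):
    """Decode the 12-byte DNS header at the start of `message`."""
    if len(message) < 12:
        raise ValueError("a DNS message starts with a 12-byte header")
    # Six unsigned 16-bit fields in network byte order.
    ident, flags, qd, an, ns, ar = struct.unpack("!6H", message[:12])
    return {
        "id": ident,
        "qr": bool(flags & 0x8000),      # 1 = response, 0 = query
        "opcode": (flags >> 11) & 0xF,   # 0 = standard QUERY
        "aa": bool(flags & 0x0400),      # Authoritative Answer
        "tc": bool(flags & 0x0200),      # truncated
        "rd": bool(flags & 0x0100),      # recursion desired
        "ra": bool(flags & 0x0080),      # recursion available
        "rcode": flags & 0x000F,         # 0 = NOERROR
        "counts": (qd, an, ns, ar),      # question, answer, authority, additional
    }
```

A reply whose flags word is 0x8580 has qr, aa, rd and ra set; the same reply served from a cache would carry 0x8180, with the AA bit clear.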
\begin{soapbox}
In my opinion, authority is one of the most confusing aspects of DNS. First, it would be better to invert the sense of the AA bit and rename it "Cached". Second, it doesn't tell you what you really want to know - is this the most reliable information possible? This is because secondary name servers set the AA bit in their replies, and a common DNS misconfiguration is inaccurate secondaries. So call the present bit "Cached" and add another one - "Primary". Permit clients to set the bit in questions to demand forwarding to a primary name server. Use this option after receiving a suspicious or vacuous DNS answer. Then this concept would become useful.
\end{soapbox}
The DNS Protocol
The DNS protocol is used to request resource records from name servers.

Part of the confusion associated with the DNS protocol is that it lacks a special name. Thus DNS can refer either to the entire system, or to the protocol that makes it work. This page documents the protocol, which operates in one of two basic modes - lookups or zone transfers.
DNS Lookups
Normal resource record lookups are done with UDP. An "intelligent retransmission" strategy is to be used, though none is specified in the protocol, resulting in a mix of poor strategies with good ones. The protocol itself is stateless; all the information needed is contained in a single message, fully documented in RFC 1035 §4.1, and having the following format:
+---------------------+
| Header |
+---------------------+
| Question | the question for the name server
+---------------------+
| Answer | RRs answering the question
+---------------------+
| Authority | RRs pointing toward an authority
+---------------------+
| Additional | RRs holding additional information
+---------------------+
· Questions are always Name, Type, Class tuples. For Internet applications, the Class is IN, the Type is a valid RR type, and the Name is a fully qualified domain name, stored in a standard format. Names can't be wildcarded, but Types and Classes can be. In addition, special Types exist to wildcard mail records and to trigger zone transfers. The question is the only section included in a query message; the remaining sections are used for replies.
· Answers are RRs that match the Name, Type, Class tuple. If any of the matching records are CNAME pointers leading to other records, the target records should also be included in the answer. There may be multiple answers, since there may be multiple RRs with the same labels.
· Authority RRs are type NS records pointing to name servers closer to the target name in the naming hierarchy. This field is completely optional, but clients are encouraged to cache this information if further requests may be made in the same name hierarchy.
· Additional RRs are records that the name server believes may be useful to the client. The most common use for this field is to supply A (address) records for the name servers listed in the Authority section.
However, cleverer name servers are feasible. For example, if the question is for an MX record for FreeSoft.org, the answer will currently point to mail.adnc.com. The name server can infer that the client's next request will be an A query for mail.adnc.com, which will be answered with a CNAME record, the DNS equivalent of a symbolic link, and the target of that link, an A record for gemini.adnc.com. The name server can avoid all this extra traffic by just including the CNAME and A records as additional RRs in the original reply. Not all name servers do this, however. Use the Dig program to watch what really happens.
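To show what a query message looks like on the wire, here is a Python sketch that builds one: a 12-byte header followed by a single question. The name is stored as a series of length-prefixed labels ending in a zero byte, which is the "standard format" mentioned above (RFC 1035 §3.1 and §4.1.2). The helper names encode_name and build_query are hypothetical, and the QTYPE table lists only a few of the assigned type codes.

```python
import struct

# A few type codes from RFC 1035; AXFR (252) is the zone transfer request.
QTYPE = {"A": 1, "NS": 2, "CNAME": 5, "SOA": 6, "PTR": 12, "MX": 15, "AXFR": 252}
CLASS_IN = 1


def encode_name(name):
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError("each label must be 1 to 63 bytes long")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name, qtype="A", ident=0, recursion_desired=True):
    """Build a DNS query message containing exactly one question."""
    flags = 0x0100 if recursion_desired else 0  # only the RD bit is set
    # Header: id, flags, then question/answer/authority/additional counts.
    header = struct.pack("!6H", ident, flags, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!2H", QTYPE[qtype], CLASS_IN)
    return header + question
```

A query built this way for the MX records of FreeSoft.org is 30 bytes long, which agrees with the "MSG SIZE sent: 30" figure in the Dig example later in this document.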
Zone Transfers
Sometimes, it is necessary to efficiently transfer the resource records of an entire DNS zone. This is most commonly done by a secondary name server having determined the need to update its database.
The operation of a zone transfer is almost identical to a normal DNS query, except that TCP is used (due to the large quantity of reply records) and a special Type exists to trigger a zone transfer. A DNS query with Name=FreeSoft.org, Class=IN, Type=AXFR will trigger a zone transfer for FreeSoft.org. The end of a zone transfer is marked by duplicating the SOA RR that started the zone.
Zone transfers are discussed in more detail in RFC 1034 §4.3.5.
Lower-Level Transport
Either TCP or UDP can be used to transport DNS protocol messages, connecting to server port 53 for either. Ordinary DNS requests can be made with TCP, though convention dictates the use of UDP for normal operation. TCP must be used for zone transfers, however, because of the danger of dropping records with an unreliable delivery protocol such as UDP.
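One practical difference between the two transports is not described above: over TCP, each DNS message is preceded by a two-byte length field (RFC 1035 §4.2.2), because a TCP stream has no message boundaries of its own. The Python sketch below shows that framing. The function names are hypothetical, and no sockets are opened; the functions only add and remove the length prefix.

```python
import struct


def frame_for_tcp(message):
    """Prefix a DNS message with its length as a 16-bit big-endian integer."""
    if len(message) > 0xFFFF:
        raise ValueError("a DNS message over TCP is limited to 65535 bytes")
    return struct.pack("!H", len(message)) + message


def split_tcp_stream(stream):
    """Split received bytes into the complete DNS messages they contain."""
    messages = []
    offset = 0
    while offset + 2 <= len(stream):
        (length,) = struct.unpack_from("!H", stream, offset)
        end = offset + 2 + length
        if end > len(stream):
            break  # the last message is incomplete; more data is needed
        messages.append(stream[offset + 2:end])
        offset = end
    return messages
```

A zone transfer reply typically arrives as many such framed messages on one connection, which is why the receiver has to split the stream this way.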
Dig I
One of the most useful software tools for studying name server operation is called Dig, which stands for Domain Information Groper. Basically, this program submits DNS queries and presents the results in a human-readable format. Let's examine several examples of Dig usage.

Dig output begins with information about the command issued and the name server(s) used, then prints the resolver flags in use, then decodes the DNS message received back as an answer. After printing the header fields and flags, the question is printed, followed by the answer, authority records, and additional records sections. Each of these sections contains zero or more resource records, which are printed in a human-readable format, beginning with the domain name, then the Time To Live, then the type code, and finally the data field. Finally, summary information is printed about how long the exchange required.
tower:~$ dig @ns.adnc.com FreeSoft.org mx

[1] ; <<>> DiG 2.1 <<>> @ns.adnc.com FreeSoft.org mx
[2] ; (1 server found)
[3] ;; res options: init recurs defnam dnsrch
[4] ;; got answer:
[5] ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 10
[6] ;; flags: qr aa rd ra; Ques: 1, Ans: 1, Auth: 2, Addit: 2
[7] ;; QUESTIONS:
[8] ;; FreeSoft.org, type = MX, class = IN
[9]
[10] ;; ANSWERS:
[11] FreeSoft.org. 86400 MX 100 mail.adnc.com.
[12]
[13] ;; AUTHORITY RECORDS:
[14] FreeSoft.org. 86400 NS ns.adnc.com.
[15] FreeSoft.org. 86400 NS ns2.adnc.com.
[16]
[17] ;; ADDITIONAL RECORDS:
[18] ns.adnc.com. 86400 A 205.216.138.22
[19] ns2.adnc.com. 86400 A 205.216.138.24
[20]
[21] ;; Total query time: 464 msec
[22] ;; FROM: tower to SERVER: ns.adnc.com 205.216.138.22
[23] ;; WHEN: Tue Mar 19 20:31:58 1996
[24] ;; MSG SIZE sent: 30 rcvd: 126
The main argument in this Dig request is FreeSoft.org, the domain name we are going to look up. The first argument, @ns.adnc.com, is optional and specifies a name server to use (normally the system default is chosen). The last argument specifies a Query Type, in this case for mail exchanger (MX) records. This argument is also optional, and defaults to address (A) RRs. Dig has numerous other options; see its man page for details.
Let's go through this line by line, as they've been numbered for your convenience (normally no numbers appear). The first two lines repeat the arguments back to us and tell us that the standard DNS lookup on the server succeeded. If Dig hangs after printing the first line, there may be a failure in your local DNS configuration; try replacing ns.adnc.com with an IP address.
Next we see the resolver options, which are documented in BIND's resolver(3) man page. Starting on line 5, we see the various header fields from the reply. The fields and flags are documented in RFC 1035 §4.1.1.
Now come the various resource records. The answer (line 11) is what we asked for. The authority records (lines 14-15) inform us that this name server (ns.adnc.com) is authoritative for this record, as is ns2.adnc.com. Not surprisingly, we note that the AA bit was set as a flag (line 6). The additional records (lines 18-19) give the IP addresses of the name servers.
Finally, we get timing stats (line 21), an indication of the foreign DNS and IP addresses (line 22), a timestamp (line 23), and the sizes of the request and reply (line 24).
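If you want to process listings like this one in a script, the record lines are easy to pull apart. The Python sketch below assumes the layout shown in this listing (name, TTL, type, then data, separated by white space, with no class column) and also strips the bracketed line numbers added to the listings on this page. The function name parse_rr_line is hypothetical, and other Dig versions print records differently.

```python
import re


def parse_rr_line(line):
    """Split one record line from the listing into name, TTL, type and data."""
    # Drop a leading "[n]" marker such as the ones added to these listings.
    line = re.sub(r"^\[\d+\]\s*", "", line.strip())
    fields = line.split()
    if len(fields) < 4:
        raise ValueError("expected: name ttl type data...")
    return {
        "name": fields[0],
        "ttl": int(fields[1]),   # seconds the record may be cached
        "type": fields[2],
        "data": fields[3:],      # e.g. preference and host for an MX record
    }


if __name__ == "__main__":
    print(parse_rr_line("[11] FreeSoft.org. 86400 MX 100 mail.adnc.com."))
```

For the MX record on line 11, the data comes back as two items: the preference value 100 and the mail exchanger's name.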
Knowing from the previous request that mail.adnc.com is the only mail exchanger for the FreeSoft.org domain, we now request the IP address of this host. Note that no type argument is required, as IP address lookup (A RRs) is the default.
tower:~$ dig @ns.adnc.com mail.adnc.com

[1] ; <<>> DiG 2.1 <<>> @ns.adnc.com mail.adnc.com
[2] ; (1 server found)
[3] ;; res options: init recurs defnam dnsrch
[4] ;; got answer:
[5] ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 10
[6] ;; flags: qr aa rd ra; Ques: 1, Ans: 2, Auth: 3, Addit: 3
[7] ;; QUESTIONS:
[8] ;; mail.adnc.com, type = A, class = IN
[9]
[10] ;; ANSWERS:
[11] mail.adnc.com. 86400 CNAME gemini.adnc.com.
[12] gemini.adnc.com. 86400 A 205.216.138.22
[13]
[14] ;; AUTHORITY RECORDS:
[15] adnc.com. 86400 NS gemini.adnc.com.
[16] adnc.com. 86400 NS taurus.adnc.com.
[17] adnc.com. 86400 NS ns.mci.net.
[18]
[19] ;; ADDITIONAL RECORDS:
[20] gemini.adnc.com. 86400 A 205.216.138.22
[21] taurus.adnc.com. 86400 A 205.216.138.24
[22] ns.mci.net. 161123 A 204.70.128.1
[23]
[24] ;; Total query time: 310 msec
[25] ;; FROM: tower to SERVER: ns.adnc.com 205.216.138.22
[26] ;; WHEN: Tue Mar 19 20:33:00 1996
[27] ;; MSG SIZE sent: 31 rcvd: 183
I read this output as follows: mail.adnc.com is an alias for gemini.adnc.com, whose IP address is 205.216.138.22. The authority section is somewhat interesting, since ns.adnc.com does not appear to be authoritative for its own domain! But line 6 shows the Authoritative Answer (AA) bit set in the header, indicating that this is not a cached entry, but one that comes directly from an authoritative name server! Since we asked ns.adnc.com, and line 25 confirms that that's the server we were talking to, something seems a bit weird here. Let's investigate further.
5 vyger> dig ns.adnc.com

[1] ; <<>> DiG 2.0 <<>> ns.adnc.com
[2] ;; ->>HEADER<<- opcode: QUERY , status: NOERROR, id: 6
[3] ;; flags: qr rd ra ; Ques: 1, Ans: 1, Auth: 9, Addit: 9
[4] ;; QUESTIONS:
[5] ;; ns.adnc.com, type = A, class = IN
[6]
[7] ;; ANSWERS:
[8] ns.adnc.com. 54654 A 205.216.138.22
[9]
[10] ;; AUTHORITY RECORDS:
[11] . 453829 NS A.ROOT-SERVERS.NET.
[12] . 453829 NS H.ROOT-SERVERS.NET.
[13] . 453829 NS B.ROOT-SERVERS.NET.
[14] . 453829 NS C.ROOT-SERVERS.NET.
[15] . 453829 NS D.ROOT-SERVERS.NET.
[16] . 453829 NS E.ROOT-SERVERS.NET.
[17] . 453829 NS I.ROOT-SERVERS.NET.
[18] . 453829 NS F.ROOT-SERVERS.NET.
[19] . 453829 NS G.ROOT-SERVERS.NET.
[20]
[21] ;; ADDITIONAL RECORDS:
[22] A.ROOT-SERVERS.NET. 520206 A 198.41.0.4
[23] H.ROOT-SERVERS.NET. 520206 A 128.63.2.53
[24] B.ROOT-SERVERS.NET. 520206 A 128.9.0.107
[25] C.ROOT-SERVERS.NET. 520206 A 192.33.4.12
[26] D.ROOT-SERVERS.NET. 520206 A 128.8.10.90
[27] E.ROOT-SERVERS.NET. 520206 A 192.203.230.10
[28] I.ROOT-SERVERS.NET. 520206 A 192.36.148.17
[29] F.ROOT-SERVERS.NET. 520206 A 192.5.5.241
[30] G.ROOT-SERVERS.NET. 520210 A 192.112.36.4
[31]
[32] ;; Sent 1 pkts, answer found in time: 330 msec
[33] ;; FROM: vyger to SERVER: default -- 127.0.0.1
[34] ;; WHEN: Fri Sep 6 16:38:22 1996
[35] ;; MSG SIZE sent: 29 rcvd: 340
A few things are different about this last exchange. First, instead of specifying a name server in the command, I used the default local name server. Second, the answer I got back was not authoritative (no AA bit in line 3), so it was cached. Line 8 shows a lower TTL value than the 86400s in the authoritative answer. Repeating the query will show the TTL value dropping as the cached record ages. We can't tell from this listing exactly where the entry was cached. If I wanted to know this, I would repeat the query with recursion disabled and step through the name server tree manually. Note, however, that my local name server has almost certainly cached the record itself at this point - as has every name server that handled the record during the exchange.
This reply answers the original question - why isn't ns.adnc.com listed as a name server for its own domain? The answer is that it is listed - in a way. Note that ns's IP address of 205.216.138.22 is the same as gemini.adnc.com, which was listed as a name server. The two are, in fact, the same host, since they have the same IP address.
We also got nine authority records for the Internet root name servers, and nine additional address records to go along with them. This is a product of my local name server configuration, which sends all requests first to my Internet service provider's name server. In general, I suggest configuring leaf name servers in this manner - just pass the request on to the next name server up, hoping for a cache hit. However, since adnc.com is in a different domain than my ISP, its name server decided that mine should have asked the root name servers for information about adnc.com (since that's what it had to do), and passed along authority information with that "suggestion".
Just for informative value, I had a TCPdump running on my PPP link during the last Dig exchange. Here's part of what I saw (note that I increased the snarf length with the -s flag to decode as much of the message as possible):
39 vyger# tcpdump -s512 -i ppp0 udp port domain
tcpdump: listening on ppp0
16:38:21.817066 ppp-blt-1-03.netrail.net.domain > skipper.netrail.net.domain:
21213+ A? ns.adnc.com. (29)
16:38:22.072951 skipper.netrail.net.domain > ppp-blt-1-03.netrail.net.domain:
21213 1/9/9 A 205.216.138.22 (160) (frag 36540:168@0+)
My local name server (which handled the Dig request, remember?) passed the question on to skipper.netrail.net, my Internet service provider's name server. The query ID is 21213, and the + indicates that recursion was requested. All this is documented in the TCPdump Manual Page. A? indicates that the packet is a question for an address (A) record for ns.adnc.com. The next packet is the reply - or actually, the first packet of the reply, which was so big it had to be fragmented (because of the 18 extra RRs for the various root name servers). 1/9/9 indicates one answer, nine authority records, and nine additional records. The answer is also shown - an A record containing 205.216.138.22.
Dig II
Try out Dig with the Web form below. Use some of the demonstration queries, and then make up some of your own. Look up the IP address of your local host. Track how this information is obtained by disabling recursion and demanding authoritative answers (Dig options +norecurse and +aa), which renders caches effectively invisible.

Dig, the domain information groper, is a tool that sends DNS queries to servers and prints the replies. I think most network engineers find it more useful than nslookup, which has now been deprecated by ISC, anyway.
The Web form below uses a Perl script to make DNS queries. Its output is similar to Dig's, but not identical, since it's purely Perl-based, so that it works on Windows systems that don't have Dig. If it doesn't work, try the original version at freesoft.org. Dig is available as part of the Bind distribution.

$ dig @localhost www.freesoft.org A IN
;; query(www.freesoft.org, A, IN)
;; send_udp(127.0.0.1:53)
;; answer from 127.0.0.1:53 : 154 bytes
;; HEADER SECTION
;; id = 428
;; qr = 1 opcode = QUERY aa = 1 tc = 0 rd = 1
;; ra = 1 rcode = NOERROR
;; qdcount = 1 ancount = 2 nscount = 2 arcount = 2

;; QUESTION SECTION (1 record)
;; www.freesoft.org. IN A

;; ANSWER SECTION (2 records)
www.freesoft.org. 21600 IN CNAME ars.freesoft.org.
ars.freesoft.org. 21600 IN A 64.7.33.254

;; AUTHORITY SECTION (2 records)
freesoft.org. 0 IN NS dns.arsnet.com.
freesoft.org. 0 IN NS freesoft.org.

;; ADDITIONAL SECTION (2 records)
dns.arsnet.com. 75141 IN A 64.7.33.200
freesoft.org. 21600 IN A 64.7.33.254
;; query status: NOERROR
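The header fields printed above all come from the fixed 12-byte DNS header described in RFC 1035. As a rough sketch (decode_header is a name I made up, and the sample bytes are reconstructed from the values in the listing rather than captured off the wire), they could be unpacked in Python like this:

```python
import struct

def decode_header(message):
    """Split the 12-byte DNS header into its named fields."""
    ident, flags, qd, an, ns, ar = struct.unpack("!HHHHHH", message[:12])
    return {
        "id": ident,
        "qr": (flags >> 15) & 1,        # 0 = query, 1 = response
        "opcode": (flags >> 11) & 0xF,  # 0 = standard QUERY
        "aa": (flags >> 10) & 1,        # authoritative answer
        "tc": (flags >> 9) & 1,         # truncated
        "rd": (flags >> 8) & 1,         # recursion desired
        "ra": (flags >> 7) & 1,         # recursion available
        "rcode": flags & 0xF,           # 0 = NOERROR
        "qdcount": qd,
        "ancount": an,
        "nscount": ns,
        "arcount": ar,
    }

# Header rebuilt from the listing: id=428, qr=aa=rd=ra=1, counts 1/2/2/2.
sample = struct.pack("!HHHHHH", 428, 0x8580, 1, 2, 2, 2)
fields = decode_header(sample)
print(fields)
```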
Zones
One of the most confusing aspects of DNS is its subdivision of the naming tree into Zones of Authority. It's really not that difficult to understand. The top node of each zone has an SOA (Start of Authority) resource record, along with NS (Name Server) records to identify its name servers. The parent zone also has the same set of NS records to identify servers for the sub zone. All of these resource records have the same domain name - the top name of the zone. The parent zone may also need address (A) records for the sub zone's name servers.
Now read what RFC 1034 says about zone division, then read both the subsections. The discussion of class division is largely irrelevant - the Internet Class is the only one we're really interested in.

The domain database is partitioned in two ways: by class, and by "cuts" made in the name space between nodes.
The class partition is simple. The database for any class is organized, delegated, and maintained separately from all other classes. Since, by convention, the name spaces are the same for all classes, the separate classes can be thought of as an array of parallel namespace trees. Note that the data attached to nodes will be different for these different parallel classes. The most common reasons for creating a new class are the necessity for a new data format for existing types or a desire for a separately managed version of the existing name space.
Within a class, "cuts" in the name space can be made between any two adjacent nodes. After all cuts are made, each group of connected name space is a separate zone. The zone is said to be authoritative for all names in the connected region. Note that the "cuts" in the name space may be in different places for different classes, the name servers may be different, etc.
These rules mean that every zone has at least one node, and hence domain name, for which it is authoritative, and all of the nodes in a particular zone are connected. Given, the tree structure, every zone has a highest node which is closer to the root than any other node in the zone. The name of this node is often used to identify the zone.
It would be possible, though not particularly useful, to partition the name space so that each domain name was in a separate zone or so that all nodes were in a single zone. Instead, the database is partitioned at points where a particular organization wants to take over control of a subtree. Once an organization controls its own zone it can unilaterally change the data in the zone, grow new tree sections connected to the zone, delete existing nodes, or delegate new subzones under its zone.
If the organization has substructure, it may want to make further internal partitions to achieve nested delegations of name space control. In some cases, such divisions are made purely to make database maintenance more convenient.
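One way to picture the cuts: given the top names of a set of zones, the zone that is authoritative for a name is the one whose top node is the name's closest enclosing ancestor. The Python sketch below illustrates that idea only. The zone list is hypothetical (I have no idea where the real cuts in these domains fall), and find_zone is a made-up helper, not part of any DNS library.

```python
def labels(name):
    """Split a domain name into lowercase labels, ignoring the trailing dot."""
    return [part.lower() for part in name.rstrip(".").split(".") if part]

def find_zone(name, zone_tops):
    """Return the zone top that most closely encloses the given name."""
    target = labels(name)
    best = None
    for top in zone_tops:
        t = labels(top)
        # The root zone (no labels) encloses every name.
        encloses = len(t) <= len(target) and (not t or target[-len(t):] == t)
        if encloses and (best is None or len(t) > len(labels(best))):
            best = top
    return best

# Hypothetical cuts in the name space, for illustration only.
zones = [".", "edu.", "mit.edu.", "lcs.mit.edu."]

for host in ["gw.lcs.mit.edu.", "multics.mit.edu.", "a.isi.edu.", "adnc.com."]:
    print(host, "is in zone", find_zone(host, zones))
```

With these assumed cuts, gw.lcs.mit.edu lands in the lcs.mit.edu zone, while a.isi.edu stays in the edu zone because nothing was cut at isi.edu.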
The in-addr.arpa Domain
DNS has a few special cases you need to be aware of. Probably the most important of these is the in-addr.arpa domain, which is used to convert 32-bit numeric IP addresses back into domain names. This is used, for example, by Internet web servers, which receive connections from IP addresses and wish to obtain domain names to record in log files. Remember that IP addresses are written as four decimal numbers (one for each byte), separated by periods.

The Internet uses a special domain to support gateway location and Internet address to host mapping. Other classes may employ a similar strategy in other domains. The intent of this domain is to provide a guaranteed method to perform host address to host name mapping, and to facilitate queries to locate all gateways on a particular network in the Internet.
Note that both of these services are similar to functions that could be performed by inverse queries; the difference is that this part of the domain name space is structured according to address, and hence can guarantee that the appropriate data can be located without an exhaustive search of the domain space.
The domain begins at IN-ADDR.ARPA and has a substructure which follows the Internet addressing structure.
Domain names in the IN-ADDR.ARPA domain are defined to have up to four labels in addition to the IN-ADDR.ARPA suffix. Each label represents one octet of an Internet address, and is expressed as a character string for a decimal value in the range 0-255 (with leading zeros omitted except in the case of a zero octet which is represented by a single zero).
Host addresses are represented by domain names that have all four labels specified. Thus data for Internet address 10.2.0.52 is located at domain name 52.0.2.10.IN-ADDR.ARPA. The reversal, though awkward to read, allows zones to be delegated which are exactly one network of address space. For example, 10.IN-ADDR.ARPA can be a zone containing data for the ARPANET, while 26.IN-ADDR.ARPA can be a separate zone for MILNET. Address nodes are used to hold pointers to primary host names in the normal domain space.
Network numbers correspond to some non-terminal nodes at various depths in the IN-ADDR.ARPA domain, since Internet network numbers are either 1, 2, or 3 octets. Network nodes are used to hold pointers to the primary host names of gateways attached to that network. Since a gateway is, by definition, on more than one network, it will typically have two or more network nodes which point at it. Gateways will also have host level pointers at their fully qualified addresses.
Both the gateway pointers at network nodes and the normal host pointers at full address nodes use the PTR RR to point back to the primary domain names of the corresponding hosts.
For example, the IN-ADDR.ARPA domain will contain information about the ISI gateway between net 10 and 26, an MIT gateway from net 10 to MIT's net 18, and hosts A.ISI.EDU and MULTICS.MIT.EDU. Assuming that ISI gateway has addresses 10.2.0.22 and 26.0.0.103, and a name MILNET-GW.ISI.EDU, and the MIT gateway has addresses 10.0.0.77 and 18.10.0.4 and a name GW.LCS.MIT.EDU, the domain database would contain:
10.IN-ADDR.ARPA. PTR MILNET-GW.ISI.EDU.
10.IN-ADDR.ARPA. PTR GW.LCS.MIT.EDU.
18.IN-ADDR.ARPA. PTR GW.LCS.MIT.EDU.
26.IN-ADDR.ARPA. PTR MILNET-GW.ISI.EDU.
22.0.2.10.IN-ADDR.ARPA. PTR MILNET-GW.ISI.EDU.
103.0.0.26.IN-ADDR.ARPA. PTR MILNET-GW.ISI.EDU.
77.0.0.10.IN-ADDR.ARPA. PTR GW.LCS.MIT.EDU.
4.0.10.18.IN-ADDR.ARPA. PTR GW.LCS.MIT.EDU.
103.0.3.26.IN-ADDR.ARPA. PTR A.ISI.EDU.
6.0.0.10.IN-ADDR.ARPA. PTR MULTICS.MIT.EDU.
Thus a program which wanted to locate gateways on net 10 would originate a query of the form QTYPE=PTR, QCLASS=IN, QNAME=10.IN-ADDR.ARPA. It would receive two RRs in response:
10.IN-ADDR.ARPA. PTR MILNET-GW.ISI.EDU.
10.IN-ADDR.ARPA. PTR GW.LCS.MIT.EDU.
The program could then originate QTYPE=A, QCLASS=IN queries for MILNET-GW.ISI.EDU. and GW.LCS.MIT.EDU. to discover the Internet addresses of these gateways.
A resolver which wanted to find the host name corresponding to Internet host address 10.0.0.6 would pursue a query of the form QTYPE=PTR, QCLASS=IN, QNAME=6.0.0.10.IN-ADDR.ARPA, and would receive:
6.0.0.10.IN-ADDR.ARPA. PTR MULTICS.MIT.EDU.
Several cautions apply to the use of these services:
· Since the IN-ADDR.ARPA special domain and the normal domain for a particular host or gateway will be in different zones, the possibility exists that the data may be inconsistent.
· Gateways will often have two names in separate domains, only one of which can be primary.
· Systems that use the domain database to initialize their routing tables must start with enough gateway information to guarantee that they can access the appropriate name server.
· The gateway data only reflects the existence of a gateway in a manner equivalent to the current HOSTS.TXT file. It doesn't replace the dynamic availability information from GGP or EGP.
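The octet reversal described in the RFC text above is easy to get backwards, so here is a small Python sketch of it. The function name reverse_name is my own, and I've used the standard ipaddress module only to validate the input. The two addresses are the ones used in the RFC's examples.

```python
import ipaddress

def reverse_name(address):
    """Build the IN-ADDR.ARPA domain name for a dotted-decimal IPv4 address."""
    # IPv4Address raises ValueError on malformed input.
    octets = str(ipaddress.IPv4Address(address)).split(".")
    # Reverse the octets so the network part sits nearest the suffix.
    return ".".join(reversed(octets)) + ".IN-ADDR.ARPA."

print(reverse_name("10.2.0.52"))  # 52.0.2.10.IN-ADDR.ARPA.
print(reverse_name("10.0.0.6"))   # 6.0.0.10.IN-ADDR.ARPA.
```

A resolver would then send a PTR query for the resulting name, as in the MULTICS.MIT.EDU example.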
Section 2 Conclusion
Congratulations! You've completed the DNS component of the Programmed Instruction Course. By now, you should have a good basic understanding of DNS theory and operation.
I suggest you review the DNS page in the Topical Core, skim through RFC 1034 and RFC 1035, then use the Web form on the Encyclopedia's Dig Page to explore the structure of the Internet DNS tree.
Now, proceed into Section 3...
IP Protocol Overview
The Internet Protocol (IP) provides all of the Internet's data transport services. Every other Internet protocol is ultimately either layered atop IP, or used to support IP from below.

IP is the Internet's most basic protocol. In order to function in a TCP/IP network, a network segment's only requirement is to forward IP packets. In fact, a TCP/IP network can be defined as a communication medium that can transport IP packets. Almost all other TCP/IP functions are constructed by layering atop IP. IP is documented in RFC 791, and IP broadcasting procedures are discussed in RFC 919. The Encyclopedia's Programmed Instruction Course includes an IP Section.
IP is a datagram-oriented protocol, treating each packet independently. This means each packet must contain complete addressing information. Also, IP makes no attempt to determine if packets reach their destination or to take corrective action if they do not. Nor does IP checksum the contents of a packet, only the IP header.
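Since the header checksum is the only integrity check IP itself performs, it may help to see how small it is. The following Python sketch computes the 16-bit one's complement sum that RFC 791 specifies for the header. The function name header_checksum and the sample header bytes are illustrative assumptions, not taken from any trace in this course.

```python
import struct

def header_checksum(header):
    """One's complement of the one's complement sum of the 16-bit header words."""
    if len(header) % 2:
        header += b"\x00"  # pad to a whole number of 16-bit words
    words = struct.unpack("!%dH" % (len(header) // 2), header)
    total = sum(words)
    # Fold any carries above 16 bits back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# A sample 20-byte IPv4 header with the checksum field (bytes 10-11) zeroed.
sample = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
checksum = header_checksum(sample)
print(hex(checksum))  # 0xb861

# A receiver runs the same sum over the header including the checksum;
# an undamaged header yields zero.
filled = sample[:10] + struct.pack("!H", checksum) + sample[12:]
print(header_checksum(filled))  # 0
```

Note that nothing here touches the packet's payload, which is why protocols such as TCP and UDP carry checksums of their own.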
IP provides several services:
· Addressing. IP headers contain 32-bit addresses which identify the sending and receiving hosts. These addresses are used by intermediate routers to select a path through the network for the packet.
· Fragmentation. IP packets may be split, or fragmented, into smaller packets. This permits a large packet to travel across a network which can only handle smaller packets. IP fragments and reassembles packets transparently.
· Packet timeouts. Each IP packet contains a Time To Live (TTL) field, which is decremented every time a router handles the packet. If TTL reaches zero, the packet is discarded, preventing packets from running in circles forever and flooding a network.
· Type of Service. IP supports traffic prioritization by allowing packets to be labeled with an abstract type of service.
· Options. IP provides several optional features, allowing a packet's sender to set requirements on the path it takes through the network (source routing), trace the route a packet takes (record route), and label packets with security features.
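To give a feel for the fragmentation service listed above, here is a rough Python sketch of how a payload might be divided for a link with a smaller maximum packet size. It assumes a 20-byte header with no options, and relies on the RFC 791 rule that fragment offsets are counted in 8-byte units. The function name fragment and the sizes in the example are my own choices for illustration.

```python
def fragment(payload_len, mtu, header_len=20):
    """Return (offset_in_8_byte_units, data_length, more_fragments) tuples."""
    # Every fragment except the last must carry a multiple of 8 data bytes.
    max_data = (mtu - header_len) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")
    fragments = []
    offset = 0
    while offset < payload_len:
        size = min(max_data, payload_len - offset)
        more = offset + size < payload_len  # the More Fragments flag
        fragments.append((offset // 8, size, more))
        offset += size
    return fragments

# A 4000-byte payload crossing a link with a 1500-byte MTU.
for piece in fragment(4000, 1500):
    print(piece)
```

The receiving host uses the offsets and the More Fragments flag to put the pieces back together; routers along the way don't reassemble.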