
How client/server systems work

Now, let's talk a bit about how networking programs work. When we write a networking program like the HTMLLinkFinder or the SpamFilter from lab, our program communicates with another program over the network. Our program is a client. The other program is a server. In the case of HTMLLinkFinder, we are communicating with a Web server. In the case of the SpamFilter, we are communicating with a POP (Post Office Protocol) server, which is one type of server that can be used to read mail.

Let's look more closely at how a Web client/server pair works. First, we'll look at URLs to get a better understanding of what they mean. A simple URL may look like:

http://www.pomona.edu

A more complicated URL is:

http://www.pomona.edu:80/events/news/home.shtml

What are all these pieces?

http
This specifies the protocol that will be used in communication. "http" means HyperText Transfer Protocol. This is the protocol used by Web servers. More on protocols later.

www.pomona.edu
This identifies the name of the machine where the Web server is running.

80
This is the port number the Web server is running on. Think of the port as your SU Box number: the machine name gets a message to the right building, and the port number gets it to the right box. One machine may run several servers, so we identify the specific server we want by identifying its port. Common servers, like Web servers, have default port numbers. The default port number for a Web server is 80. If a server is running at its default port, we can omit the port number in the URL.

everything else
The rest is information that is passed to the Web server to identify the Web page we want to view.
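To make the pieces concrete, here is a small Java sketch that splits a URL into the four parts described above. It is only an illustration: the class name UrlParts and the method split are made up for this example, it relies on the standard java.net.URI class to do the parsing, and it assumes the protocol is http when it fills in the default port.

```java
import java.net.URI;

// Hypothetical helper for these notes; not part of any lab code.
class UrlParts {
    // Default port for a Web server, as described above.
    static final int DEFAULT_HTTP_PORT = 80;

    // Returns { protocol, machine, port, everything else }.
    static String[] split(String url) {
        URI uri = URI.create(url);

        // getPort() returns -1 when the URL does not name a port,
        // so we fall back to the default (assuming the protocol is http).
        int port = uri.getPort();
        if (port == -1) {
            port = DEFAULT_HTTP_PORT;
        }

        // The part after the machine and port names the page we want.
        // A URL with nothing after the machine name asks for "/".
        // (This sketch ignores any query string after a '?'.)
        String rest = uri.getRawPath();
        if (rest == null || rest.isEmpty()) {
            rest = "/";
        }

        return new String[] { uri.getScheme(), uri.getHost(), String.valueOf(port), rest };
    }

    public static void main(String[] args) {
        String[] urls = {
            "http://www.pomona.edu",
            "http://www.pomona.edu:80/events/news/home.shtml"
        };
        for (String url : urls) {
            String[] parts = split(url);
            System.out.println(url);
            System.out.println("  protocol:        " + parts[0]);
            System.out.println("  machine:         " + parts[1]);
            System.out.println("  port:            " + parts[2]);
            System.out.println("  everything else: " + parts[3]);
        }
    }
}
```

Both sample URLs name the same machine and the same port. The first one simply leaves the port out and relies on the default.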

To connect to a server, we need to identify the machine and the port to use. Once connected, the protocol defines a set of commands that we can use to talk to the server. The protocol also specifies the arguments that the command requires and the return values that it produces.

HTTP is very simple, at least in the form we will use here. We need only one command:

GET
The GET command takes an argument which is the name of the content we want the Web server to return. It returns a long string, which is the Web page content.

For example, if I am connected to cortland.cs.williams.edu, I can get the home page for CS 134 there by sending the string "GET /~cs134/f03/" to the Web server.
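In Java, the same exchange can be sketched with a Socket: connect to the machine and port, send the GET line, and read whatever comes back. This is a minimal sketch rather than the way HTMLLinkFinder does it. The class name SimpleGet and its methods are invented for this example, the machine named in these notes may no longer be reachable, and many current Web servers expect a longer request than this bare GET line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical example class; the names here are not from the lab code.
class SimpleGet {
    // Builds the one-line request described above, e.g. "GET /~cs134/f03/".
    // The line ends with carriage return + line feed, like hitting return.
    static String buildRequest(String path) {
        return "GET " + path + "\r\n";
    }

    // Connects to the server, sends the GET line, and returns the reply.
    static String fetch(String host, int port, String path) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            OutputStream out = socket.getOutputStream();
            out.write(buildRequest(path).getBytes(StandardCharsets.US_ASCII));
            out.flush();

            BufferedReader in = new BufferedReader(
                new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
            StringBuilder page = new StringBuilder();
            String line;
            // readLine() returns null once the server closes the connection.
            while ((line = in.readLine()) != null) {
                page.append(line).append('\n');
            }
            return page.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            // Without arguments we only show the request; no network is used.
            System.out.println("usage: java SimpleGet <machine> <port> <path>");
            System.out.print("request that would be sent: " + buildRequest("/~cs134/f03/"));
            return;
        }
        System.out.print(fetch(args[0], Integer.parseInt(args[1]), args[2]));
    }
}
```

For example, running it with the arguments cortland.cs.williams.edu 80 /~cs134/f03/ would attempt the request from the paragraph above.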

The easiest way to get a feeling for protocols is to run a program called telnet from a command line (like a UNIX or DOS shell, or Mac Terminal). [Unfortunately, many computers no longer respond to telnet commands for security reasons. Even more unfortunately, Pomona computers don't accept telnet. Hence all my examples will be from Williams College.] With telnet, we can send commands like the GET line above to the server just by typing them on the keyboard and hitting return. The responses made by the server appear on the terminal.

Here is a sample session using telnet to download a page from a Web server:

First we connect to the Web server. Using telnet we must explicitly state what the port number is.

-> telnet cortland.cs.williams.edu 80

The telnet program reports that it has connected to the Web server:

Trying 137.165.8.5...
Connected to cortland.cs.williams.edu.
Escape character is '^]'.

Now we request the home page for CS 134:

GET /~cs134/f03/

The server responds with the Web page:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
   "DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<!-- XML file produced from file: 134page.tex
     using Hyperlatex v 2.6 (c) Otfried Cheong
     on Emacs 21.1.1, Wed Nov 19 01:12:59 2003 -->
<head>
<title>Computer Science 134</title>

<style type="text/css">
.maketitle { align : center }
div.abstract { margin-left: 20h3.abstract  { align : center }
div.verse, div.quote, div.quotation {
  margin-left : 10  margin-right : 10}
</style>


</head>
<body BGCOLOR="FFFFFF">




<table><tbody><tr><td colspan="1" align="LEFT">

<img alt="" src="http://www.cs.williams.edu/~tom/courses/CSlogo.gif">
</td><td colspan="1" align="LEFT">
<H1> CS 134<BR> Introduction to Computer Science</H1>

</td><td colspan="1" align="RIGHT">
<ul>
<li><a href="134page_1.html">Instructors</a>
<li><a href="134page_2.html">Lectures and Readings</a>
<li><a href="134page_3.html">Programming Assignments and Laboratories</a>
<li><a href="134page_4.html">Documentation and Handouts</a>
<li><a href="134page_6.html">Exams</a>
<li><a href="134page_7.html">Text</a>
</ul>

</td></tr></tbody></table>

<hr />

<p>Computer Science 134 is an introduction to algorithm development
emphasizing structured, object-oriented design.  Algorithms will be
implemented as programs in the Java programming language.  We will
introduce data structures and recursion as tools to construct correct,
understandable, and efficient algorithms.  These topics will be
developed further in <a href="http://www.cs.williams.edu/ifg/courses/cs136.html">Computer Science 136</a>.  We highly
recommend the combination of Computer Science 134 and Computer Science
136 for those who wish a good introduction to the science of
computing.
<p>This course is a prerequisite for all upper level <a href="http://www.cs.williams.edu/ifg/allcourses.html">Computer
Science courses</a>.  In
Computer Science 134 <em>we do not assume that you have had any
previous computer programming experience</em>.  
If you have had extensive previous experience you might consider
<a href="http://www.cs.williams.edu/ifg/courses/cs136.html">CS 136</a>.  Please discuss this with
any member of the department's faculty
if you feel you fall into this category.
<p><hr />To simplify printing of the information about CS 134 found
on these pages, a single page on which all the 
<a href="full134page.html">information about CS 134</a> is grouped
together is also available.
<hr />
</body></html>
Connection closed by foreign host.

With this simple form of request, the Web server closes the connection as soon as it has returned the Web page. If I wanted another Web page, I would need to create another connection to get the second page.
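The one-page-per-connection behavior can be tried out without a real Web server. The sketch below is an assumption-laden toy: the class OnePagePerConnection, its methods, and the made-up reply text are all invented for this example. It starts a tiny stand-in server on the local machine that answers one GET line and then closes the connection, and a client that has to open a fresh connection for each page.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demonstration; the toy server is not a real Web server.
class OnePagePerConnection {
    // Toy server: for each connection, read one request line, reply, close.
    static void serve(ServerSocket server, int connections) {
        for (int i = 0; i < connections; i++) {
            try (Socket client = server.accept()) {
                BufferedReader in = new BufferedReader(
                    new InputStreamReader(client.getInputStream()));
                PrintWriter out = new PrintWriter(client.getOutputStream());
                String request = in.readLine();
                out.print("<html>reply to: " + request + "</html>\n");
                out.flush();
            } catch (IOException e) {
                return; // give up quietly; this is only a demonstration
            }
            // Leaving the try block closes the connection to this client.
        }
    }

    // Client: one new connection per page, read until the server closes it.
    static String get(int port, String path) throws IOException {
        try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), port)) {
            PrintWriter out = new PrintWriter(socket.getOutputStream());
            out.print("GET " + path + "\r\n");
            out.flush();
            BufferedReader in = new BufferedReader(
                new InputStreamReader(socket.getInputStream()));
            StringBuilder page = new StringBuilder();
            String line;
            while ((line = in.readLine()) != null) { // null means "connection closed"
                page.append(line);
            }
            return page.toString();
        }
    }

    // Fetches two pages from the toy server, using two separate connections.
    static String[] fetchTwoPages() {
        // Port 0 asks the system for any free port on this machine.
        try (ServerSocket server = new ServerSocket(0, 2, InetAddress.getLoopbackAddress())) {
            Thread toyServer = new Thread(() -> serve(server, 2));
            toyServer.setDaemon(true);
            toyServer.start();
            int port = server.getLocalPort();
            return new String[] { get(port, "/first"), get(port, "/second") };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String page : fetchTwoPages()) {
            System.out.println(page);
        }
    }
}
```

Each call to get ends only when readLine reports that the connection has been closed, which is the same thing telnet tells us with "Connection closed by foreign host."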

