| Reading from a URL |
How different are Web pages from files? There are two obvious differences: the text in a Web page is HTML, and the Web page is probably on a different machine.
The fact that the text in a Web page is HTML is only an issue if we want to display it nicely, like a Web browser does. HTML is, in fact, just text but with a funny syntax, in the same way that a Java file is just text with a funny syntax.
What we care about is the second point: the Web page is probably on a different machine. Because of this, when we read a Web page in Java, we can't use a FileReader as above. Fortunately, Java makes it easy for us to read Web pages with a URL object. A URL object can be constructed using new URL(urlString) or new URL(currentURL, urlString), where currentURL is the URL of the current Web page (this second form is used when urlString may be a relative URL).
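To make the two constructors concrete, here is a minimal sketch of resolving a link against the page it appears on. The class name UrlConstructionSketch, the helper resolve, and the example.com addresses are assumptions made up for illustration; only the two URL constructors come from the text above. Both constructors can throw MalformedURLException, so the sketch declares it.

```java
import java.net.MalformedURLException;
import java.net.URL;

public class UrlConstructionSketch {
    // Hypothetical helper: resolves urlString against currentURL.
    // urlString may be relative ("notes/a.html") or absolute ("http://...").
    static URL resolve(URL currentURL, String urlString) throws MalformedURLException {
        return new URL(currentURL, urlString);
    }

    public static void main(String[] args) throws MalformedURLException {
        // Placeholder address, used only as the "current page"
        URL currentURL = new URL("http://www.example.com/courses/index.html");

        // A relative link is resolved against the current page's directory
        System.out.println(resolve(currentURL, "notes/lecture1.html"));

        // An absolute link ignores the current page entirely
        System.out.println(resolve(currentURL, "http://www.example.org/other.html"));
    }
}
```

With these inputs, the relative link should resolve to http://www.example.com/courses/notes/lecture1.html, while the absolute link is returned unchanged. No network access happens here; constructing a URL only parses the string.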
We need to construct the stream reader differently, but after we do that the code to read from the stream is identical. Here it is:
// Note the different way of constructing a stream to read a URL
// over the network
pageReader = new BufferedReader(new InputStreamReader(url.openStream()));
// Read the first line
nextLine = pageReader.readLine();
// Loop until all the lines are read
while (nextLine != null) {
    // Code to process next line omitted
    nextLine = pageReader.readLine();
}
// Close the stream
pageReader.close();
HTMLLinkFinder w/ BufferedReader shows a complete example reading a Web page in this way.
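The fragment above leaves out the variable declarations and the exception handling that a compilable program needs. As a rough, self-contained sketch (not the HTMLLinkFinder example itself), the same loop might be wrapped up as follows. The class name UrlLineReaderSketch, the method readLines, and the default address are assumptions for illustration, and try-with-resources is used here in place of the explicit close() call so the stream is closed even if reading fails.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

public class UrlLineReaderSketch {
    // Hypothetical helper: reads every line available at the URL into a list.
    static List<String> readLines(URL url) throws IOException {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes pageReader automatically
        try (BufferedReader pageReader =
                 new BufferedReader(new InputStreamReader(url.openStream()))) {
            // Read the first line
            String nextLine = pageReader.readLine();
            // Loop until all the lines are read
            while (nextLine != null) {
                lines.add(nextLine);
                nextLine = pageReader.readLine();
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder address; pass a different URL as the first argument
        URL url = new URL(args.length > 0 ? args[0] : "http://www.example.com/");
        for (String line : readLines(url)) {
            System.out.println(line);
        }
    }
}
```

Because openStream() works for any URL scheme Java supports, the same method can also be tried on a file: URL, which is a convenient way to experiment without a network connection.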