Note: If you get a warning about "blocked active content", don't worry. It is because of the JavaScript drag and drop elements in Section 1.8.5. You can allow the content, or block it, as you see fit; if you block it though, the drag and drop demo won't work.
Sebesta has great tutorials for several useful topics that aren't covered in HFSJ, including practical details of XHTML, CSS, JavaScript, and MySQL (all technologies we'll definitely be using this semester).
In addition, there is coverage of Perl, PHP, ASP.NET, and Java applets, which we may or may not get to, but which you are likely to encounter in the real world, and might decide to use in your projects.
Finally, Sebesta also provides coverage of Java Servlets and JSP that can help summarize and reinforce what we learn in HFSJ.
Having said that, my advice to you as you read Chapter 1 is: don't judge a book by its first chapter.
I have some advice to Sebesta at the end of this document on how the book could be improved; I'd be interested to hear opinions from CISC474 students on that advice! But read the chapter and the reading notes first, so that you'll have an informed opinion.
If you are an impatient reader you might want to skip pages 1-13 (i.e. up through Section 1.5) on your first reading. Start with Section 1.6, and come back to pages 1-13 afterwards.
However, be sure to read through these reading notes, even for Sections 1.1 through 1.5. There may be details that are only in these notes that you are responsible for on the exams.
These notes will tell you what I think is important in the readings (and since I'm designing and grading your exams, you probably will think those things are important too). I'll also fill in gaps, offer contrary points of view, refer you to related material in your HFSJ textbook, and point out online resources that relate to what you are reading.
Be particularly vigilant about anything that appears in a box such as this one. These boxes contain material specifically set aside as important (e.g. for your exams) and often not covered in the textbook.
On the other hand, notes that are in beige shaded boxes like this one are ones that are provided for your background information only. They are unlikely to appear on an exam.
You may also see the following background color used for code listings, or transcripts of terminal sessions. That color is just to help you find them more easily, and separate them from the text that surrounds them.
<html>
<head>
<title>P. Conrad's Web Page</title>
</head>
<body>
...
Ok, let's start reading!
Gack. A textbook that seems, on the whole, quite good, gets off to an inauspicious start by reinforcing one of the most common misconceptions about the origin of the Internet.
It doesn't have a whole lot to do with the central themes of the course, but you might as well know:
"Surviving a nuclear war" was not, per se, a design goal of the early ARPAnet. This is an urban legend.
To be fair, Sebesta doesn't say that, but by emphasizing the following point, he does tend to lend support to that wrong view:
"One fundamental requirement was that the network be sufficiently robust so that even if some network nodes were lost due to sabotage, war, or some more benign reason, the network could continue to function." (Sebesta, p. 2)
It is true that the designers of the Internet wanted it to be robust to failure. However, as for being worried about sabotage or war—while there is some dispute about this—the best sources indicate that the designers of the Internet had no such idea in mind.
Probably the best source that debunks this canard is a history of the Internet from the Internet Society's own web site, written by nine co-authors, several of whom are well known to have been present at the very creation of the Internet, including:
According to the most reliable accounts, the real purpose of the Internet was to enable the DoD, specifically the Advanced Research Projects Agency (ARPA), to better exchange data with the various universities that were doing DoD sponsored research, and to enable those universities to exchange data with one another. Although it was military money that paid for the network, the military itself wasn't even a primary user of the network—the users were academic scientists, and DoD civilians that oversaw their work.
Anyway, for purposes of this course, this is all just an aside, so let's move on. While the material in Section 1.1.1 is probably good background for your general knowledge, the only details from Section 1.1.1 I want you to know for this course are:
The key idea in this section is that the Internet is a network of networks, and that it is the TCP/IP protocol suite that allows all the computers on the Internet to communicate.
Again, though, Sebesta is slightly misleading on a key detail:
TCP/IP is not a "single low-level protocol", as the book suggests (p. 3). Rather, the name TCP/IP (with the slash between the TCP and the IP) refers to the entire suite of protocols used on the Internet.
The TCP/IP protocol suite includes the protocols TCP and IP, but also includes protocols such as HTTP, DHCP, UDP, and others. The details of the Domain Name System (DNS) are part of the TCP/IP protocol suite as well.
Another name for the TCP/IP protocol suite is the "Internet Protocol Suite". We can say that HTTP is an "Internet protocol" or that it is a "TCP/IP protocol" to indicate that it is part of that protocol suite.
A few facts you should know about the TCP/IP protocol suite:
The TCP/IP protocol suite is standardized by the Internet Engineering Task Force (IETF). Documents called RFCs (Requests For Comments) specify Internet Standards. Their web site is www.ietf.org.
An organization called the Internet Society (ISOC) oversees both the IETF, and the Internet as a whole. Their web site is www.isoc.org
The TCP/IP protocol suite can be divided into a number of layers. The number of layers differs from author to author. For our purposes, a brief overview of five layers will be enough (see box below).
If you are interested in more details than are provided here, you can take CISC450 (Computer Networks)
Finally, while HTTP is a TCP/IP protocol, and as such is standardized in RFC2616 by the IETF, languages such as HTML and CSS are not part of the TCP/IP protocol suite. Those languages are standardized by the World Wide Web Consortium (W3C) (web site: www.w3.org). We'll revisit the topic of the World Wide Web Consortium in Section 2.1.1.
The five layers we'll concern ourselves with are (from top to bottom):
application (e.g. HTTP)
transport (e.g. TCP)
network (e.g. IP)
data-link (e.g. Ethernet)
physical (e.g. CAT 5 twisted pair cable)
We'll now look at each of those layers in more detail (this time from bottom to top).
The lowest layer is the physical layer, which concerns the representation of bits (1s and 0s) as voltages, light, or radio waves.
The data-link layer sits directly on top of the physical layer. Its main purpose is framing: turning a sequence of bits into a sequence of packets. (What is a packet? See the bullet point below.)
Examples of data-link layers include:
Ethernet, which typically uses "CAT 5 Twisted Pair Cable" as its physical layer.
Wireless LANs, where data-link layer standards such as 802.11b and 802.11g use radio waves as the physical layer (physical-layer standards include DSSS and OFDM).
A packet is a sequence of bytes with a well-defined beginning and end, clearly separated into a header (containing address, sequence number, and other information) and a payload (containing the user data to be transmitted).
The network layer sits on top of the data-link layer and provides a way to move packets from any node in the network to any other node in the network. The network layer protocol in the Internet is IP.
The current version of IP is version 4 (IPv4).
A new version, IPv6 is about 3-4 years away, and has been for about a decade. Ten years from now, it will likely still be 3-4 years away. (That was a joke, but not entirely). I can say more about this if you are interested, but it isn't really a topic for this course; we'll only deal with IPv4 in CISC474.
There are at least three separate concerns at the network layer:
addressing (making sure each node has a unique address, i.e., an IP address; see also Sebesta, Section 1.1.3),
routing (figuring out how to get from point A to point B, for every (A,B) pair in the network), and
packet forwarding (actually moving the packets along the calculated routes).
The transport layer sits at the end hosts and takes care of the fact that the IP network layer sometimes can lose, corrupt, duplicate, or reorder packets. TCP is the main transport layer protocol in the Internet.
Other transport layers include UDP, which is sometimes used for streaming of multimedia, and SCTP, a new protocol that was developed primarily for voice-over-IP signalling. We'll probably have no occasion to encounter either of those in CISC474.
TCP provides a reliable byte-stream service to application layer protocols that sit on top of it. TCP does error checking and resequencing on network-layer packets, manages retransmission of lost or corrupted packets, and throws out duplicates. These functions of TCP will be mostly invisible to us in CISC474.
The main transport-layer function we have to be directly aware of in CISC474 is multiplexing. TCP allows multiple applications (e.g. web browsing, email, file transfer, remote login) to communicate between two IP addresses by maintaining separate logical connections. TCP does this by providing port numbers. (HFSJ p. 21 has more detail on port numbers.)
A TCP connection is defined by four numbers (illustrated in the sketch after this list):
local IP address
local port number
remote IP address
remote port number
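To make those four numbers concrete, here is a small sketch of my own (not from Sebesta or HFSJ) that opens a TCP connection from Java and prints the four numbers that identify it. The host name and port are just illustrative choices; any web server you can reach would do.

import java.io.IOException;
import java.net.Socket;

public class ConnectionTuple {
    public static void main(String[] args) throws IOException {
        // Open a TCP connection to port 80 (the well-known HTTP port) on some host.
        Socket s = new Socket("www.udel.edu", 80);

        // The four numbers that define this TCP connection:
        System.out.println("local IP address:   " + s.getLocalAddress().getHostAddress());
        System.out.println("local port number:  " + s.getLocalPort());
        System.out.println("remote IP address:  " + s.getInetAddress().getHostAddress());
        System.out.println("remote port number: " + s.getPort());

        s.close();
    }
}

Notice that the remote port is the well-known port for the service (80 for HTTP), while the local port is typically an "ephemeral" port picked by your operating system, different each time you run the program.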
A few facts to know from this section: IP addresses are 32 bits long, and are usually written in "dotted-decimal" form. For example, the IP address of strauss.udel.edu is 128.175.13.74. This corresponds to 32 bits as follows:
128. | 175. | 13. | 74 |
1000 0000 | 1010 1111 | 0000 1101 | 0100 1010 |
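If you'd like to see that conversion for yourself, here is a tiny sketch of my own (not from the book) that prints each byte of a dotted-decimal address in binary.

public class DottedDecimalToBits {
    public static void main(String[] args) {
        String dotted = "128.175.13.74";   // the address of strauss.udel.edu

        for (String part : dotted.split("\\.")) {
            int octet = Integer.parseInt(part);   // one byte, in the range 0-255
            // Pad to 8 bits so that leading zeros show up.
            String bits = String.format("%8s", Integer.toBinaryString(octet)).replace(' ', '0');
            System.out.println(part + " = " + bits);
        }
    }
}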
The book is mostly right when it says that the four parts (i.e. four bytes) of an IP address are used separately to route messages. Back in the day, this was literally true. Now, something called Classless Inter-Domain Routing (CIDR) is being used to route messages on portions of an IP address that don't necessarily fall on 8-bit boundaries. But the basic idea is still the same.
CIDR has helped to slow the demand for IPv6 by allowing organizations to use blocks of addresses that don't necessarily correspond to 8-bit boundaries.
Another technology that has slowed the need for IPv6 is the use of Network Address Translation (NAT), where a boundary router converts between public IP addresses and private IP addresses (for example, those that start with the first byte being 10, such as 10.0.0.1.) If you use a wireless router with a cable or DSL modem to share a single ISP connection among multiple computers, you are probably doing it by using NAT. Thus, you are using a single "public" IP address (the one registered by your cable modem or DSL modem) to connect multiple computers to the Internet. Many small and medium size businesses do this as well, but on a larger scale.
Both NAT and CIDR have had the effect that you should be a bit skeptical of predictions that "IPv6 is soon to be essential because the number of unused IP addresses is diminishing rapidly". The larger point is valid, but the pace of change is liable to be years, not months. There is even doubt among some as to whether IPv6 will ever take firm hold in the marketplace.
The Unix nslookup utility provides one way to look up IP addresses (this is a transcript of a terminal session from strauss.udel.edu):
> /usr/sbin/nslookup
Default Server: localhost.udel.edu
Address: 127.0.0.1

> strauss.udel.edu
Server: localhost.udel.edu
Address: 127.0.0.1

Name: strauss.udel.edu
Address: 128.175.13.74

> www.mit.edu
Server: localhost.udel.edu
Address: 127.0.0.1

Name: www.mit.edu
Address: 18.7.22.83

> www.microsoft.com
Server: localhost.udel.edu
Address: 127.0.0.1

Non-authoritative answer:
Name: lb1.www.ms.akadns.net
Addresses: 207.46.20.30, 207.46.19.30, 207.46.19.60, 207.46.20.60
           207.46.18.30, 207.46.199.30, 207.46.225.60, 207.46.198.30
Aliases: www.microsoft.com, toggle.www.ms.akadns.net
         g.www.ms.akadns.net

> www.gnu.org
Server: localhost.udel.edu
Address: 127.0.0.1

Name: gnu.org
Address: 199.232.41.10
Aliases: www.gnu.org

> exit
>
Be sure to know the following terminology from this section:
Another thing to point out: the book suggests that telnet is a good way to determine the IP address of a fully-qualified domain name (as illustrated in Section 1.7.1). This is true. Just be advised that telnet isn't necessarily a good way to connect to a system if you are going to be typing in a password—in those cases, use ssh instead. (The example in Section 1.7.1 does not involve typing in any passwords, so it is fine.)
One other correction: in the second-to-last paragraph on page 5, Sebesta indicates that telnet and ftp are protocols. This is true.
However, mailto is not a protocol. Sebesta's error is understandable; mailto appears in the "spot" in a URL where a protocol name normally goes. However, strictly speaking, mailto is a "URI scheme" (see Section 3.5 of RFC1738, as well as Section 1.5.1 of Sebesta itself).
Be sure to know the terms
Know the name Tim Berners-Lee and the significance of the date 1989 (twenty years after 1969, when the ARPAnet first came to life).
Also know that the terms document, page, and resource are used more or less interchangeably to talk about items available on the web.
The Web is not the Internet, and the Internet is not the Web.
Hopefully, you could explain that statement on an exam.
Particular concepts to pay attention to in this section
You should also know that Mosaic, released in 1993, was the first graphical browser for the Web.
The last paragraph on p. 7 contains a very nice overview of web architecture, particularly the part that starts "However, more complicated situations are common", so I commend this to your attention.
Sebesta mentions Internet Explorer and Netscape as the main browsers. It is likely that the Firefox browser came to prominence after the text of Sebesta's book was already finalized.
Our focus in CISC474 will be primarily on the current versions of
As the introduction to Section 1.4 points out, the most common web servers are Apache and IIS.
For our part, we'll be using a web server called Tomcat, which comes from the Apache project. It is a special web server that is designed to be a Java Servlet Container. (We'll read more about Servlet Containers in HFSJ, in particular, pages 39 through 43.) However, it can also do all the "basic" functions of a web server as well (as described in Section 1.4).
All the stuff in this box is stuff that is covered elsewhere in the course in much more detail. This is just a quick summary to help you tie all the pieces together as you read about Web Servers in Sebesta.
The web server we'll be using is called Tomcat. Tomcat allows us to serve not only static web pages (e.g. .html files), but also web pages where the content is the result of running some Java code. That Java code is specified in one of two forms: a Servlet or a Java Server Page (JSP).
Servlets are Java classes with methods that can take a request for a web page, and turn that request into a response.
Behind the scenes, Tomcat translates each JSP into a servlet. So whether you write a servlet directly, or write a JSP, either way your page ends up being generated by a Servlet.
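To give you an early taste of what that looks like, here is a minimal servlet sketch of my own (not from either book). Don't worry about the details yet; just notice the shape: a request comes in, and the method writes out a response.

import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class HelloServlet extends HttpServlet {
    // Tomcat calls doGet() when an HTTP GET request arrives for this servlet.
    public void doGet(HttpServletRequest request, HttpServletResponse response)
            throws IOException {
        response.setContentType("text/html");   // the MIME type of the response
        PrintWriter out = response.getWriter();
        out.println("<html><body>");
        out.println("<h1>Hello from a servlet!</h1>");
        out.println("</body></html>");
    }
}

We'll see in HFSJ how a servlet like this gets packaged and deployed so that Tomcat can find it.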
Another technology for generating web pages as a result of some calculation on the server side is PHP, a scripting language whose syntax borrows heavily from Perl.
Tomcat is not designed to be able to serve PHP pages "out of the box". However, Tomcat can be configured to serve PHP pages.
Serving PHP with Tomcat is perhaps not the best architecture for a production environment—there are probably other servers that are more efficient or effective at serving PHP. For a learning/testing environment though, serving PHP with Tomcat might be "good enough". We may end up using Tomcat as a PHP server as well to avoid the overhead of having to install yet another piece of software.
Some key ideas from this section
Which takes more resources: serving a file, or displaying it? Justify your answer.
There is also a nice summary on p. 9 of the details of how browsers and web servers interact. Compare this summary with the five layers discussed in the box on the Five Layers of the TCP/IP protocol suite in Section 1.1.2 of these reading notes, and see if you can find the parallels.
Sebesta describes the document root and the server root. Read to find out what these terms mean.
Then explore the concept for yourself.
How could you check whether the assertions above are still accurate, and if they are, show evidence that they still hold?
You'll also see these concepts when we work with Tomcat. I might ask you on an exam to relate the definitions from this section to our work with Tomcat.
This section also includes mention of virtual document trees. Broadly speaking, virtual document trees allow you to serve documents from places other than subdirectories that are under the document root.
If you've ever maintained a personal web site on copland, you are aware of another mapping from URL to file system that is not mentioned in this section. What mapping am I referring to?
Also be familiar with the terms: virtual host and proxy server.
Two things to know from this section:
The rest is both too much and not enough to be useful: too much detail about a program we probably aren't going to use this semester, and way too little detail about Apache if we were going to use it! So once you've answered the two questions above, you can skip the rest of this section for purposes of CISC474.
Two things to know from this section:
The rest is just like Section 1.4.3: both too much and not enough to be useful (see details there). So once you've answered the two questions above, you can skip the rest of this section for purposes of CISC474.
See also, p. 20 in HFSJ.
You'll encounter these various terms in the W3C literature all over the place. So you should have some idea of the differences between and among these terms.
However, this time, I'm not going to deprive you of the opportunity to research this on your own. Search engines such as Google are your friend. See what you can find.
Extra credit points for the best postings to the WebCT discussion board about what these three stand for, the subtle differences in meaning among the three, and an explanation of why the confusion exists in the first place, and how the meanings of these have changed over time. You'll probably also discover whether the "U" in fact stands for "Uniform" or "Universal".
(Use the board marked URL vs. URI vs. URN). Points will be given for both the best summaries in your own words, as well as the best links to web sites that explain the difference.
Once these postings have been made, I'll summarize on that WebCT discussion board what you should know for the exam(s).
A few things to know from this section:
One thing to note: the book says that ampersand ("&") is a character that cannot be part of a URL. In general, this is true.
However, there is a circumstance where "&" characters have a particular meaning in a URL: they are used to separate name/value pairs for parameters in a query string. For example, the URL below can be used to look up information about CISC474 for Spring 2006. Note the & characters that separate the name/value pairs term=06S and course_sec=CISC474010. (Try clicking on it; it uses a Java Server Page to return the information!)
http://chico.nss.udel.edu/CoursesSearch/courseInfo.jsp?&term=06S&course_sec=CISC474010
For more information on parameters and query strings, see pp. 110-111 in HFSJ, and Section 10.3 in Sebesta.
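As a preview of what those name/value pairs look like from the server side, here is a hypothetical servlet sketch of my own (the class name and URL path are made up). The point is that the container splits the query string at the & characters for us, and we just ask for each parameter by name.

import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Handles requests such as  /courseInfo?term=06S&course_sec=CISC474010
public class CourseInfoServlet extends HttpServlet {
    public void doGet(HttpServletRequest request, HttpServletResponse response)
            throws IOException {
        // Each name/value pair from the query string is available by name;
        // a parameter that wasn't supplied comes back as null.
        String term      = request.getParameter("term");        // e.g. "06S"
        String courseSec = request.getParameter("course_sec");  // e.g. "CISC474010"

        response.setContentType("text/html");
        PrintWriter out = response.getWriter();
        out.println("<html><body><p>term = " + term
                + ", course_sec = " + courseSec + "</p></body></html>");
    }
}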
Here also, for your convenience, is the URL cited at the end of Section 1.5.1 in Sebesta as a hyperlink:
http://www.w3.org/Addressing/URL/URI_Overview.html
In the first paragraph of this section, Sebesta says something surprising about the direction of slashes in a URL path—if what he says is true, it is news to me. Extra credit for anyone who can find an independent source to verify (or authoritatively refute) Sebesta's assertions here, and/or show an example where his claims check out in practice.
The rest of this section is stuff you probably already know, but read it over to be sure. In particular, know what gets served up if the URL you specify maps to a directory name, and not to a specific file (there are two possible cases; know what happens in each case.)
If you've ever sent or received an email with an "attachment", you can thank MIME. MIME is the "under-the-hood", "behind-the-scenes" technology that makes email attachments work.
So what is MIME doing in a web technologies course? Well, it turns out that the format used to specify how attachments are handled in email was "repurposed" to serve as the way that Web content is identified.
So, even though the second M in MIME stands for Mail (the original purpose of MIME), today MIME types are just a "way to identify the type of some content".
On the Unix operating system, a file is just a sequence of bytes, so unless you know the type of the file, it is not possible to correctly "interpret" the contents. At some point you've probably accidentally opened a Microsoft Word document or a JPEG image as a text file (e.g. in vi or emacs); as you know, you just get a screen full of nonsense. (If you've never had that experience, do it once just so you can say that you have done it.)
Most of us are used to identifying the type of files by their file extensions—e.g., a web file ends in .htm or .html, an image file ends in .jpg, .jpeg, .gif, or .png, a sound file ends in .wav or .au, and a Microsoft Word document ends in .doc, etc. Most software is pre-programmed to only open files with the right kind of file extension, and/or to interpret the contents based on the extension.
For example, programs like Photoshop can typically open both .gif and .jpeg files; such programs look at the file extension when choosing which algorithm to use to decode the file and load it into a buffer for editing: one algorithm for .gif, and a different algorithm for .jpeg.
As it turns out, that creates some problems on the web, since some browsers (e.g. Internet Explorer) tend to follow that convention, while others (e.g. Firefox) tend to rely strictly on the MIME type set in the HTTP headers by the server. The problem is that the server doesn't always set the MIME type correctly, so sometimes we poor dumb users have to give the web server a little help.
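As a small illustration of that extension-to-MIME-type guessing, here is a sketch of my own that uses a lookup table shipped with the Java standard library (the file names are made up, and the exact results depend on your JDK's built-in table).

import java.net.URLConnection;

public class GuessMimeType {
    public static void main(String[] args) {
        // guessContentTypeFromName() maps a file extension to a MIME type
        // using a table built into the JDK.
        String[] names = { "index.html", "photo.jpeg", "logo.gif", "notes.txt" };
        for (String name : names) {
            System.out.println(name + " -> " + URLConnection.guessContentTypeFromName(name));
        }
    }
}

A web server does essentially the same kind of lookup when it decides what Content-Type header to send along with a file.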
MIME types turn up over and over again in working with web technologies. A few examples:
Anyway, all of this is just to say: MIME types are important. So read the subsections of Section 1.6 carefully, and see also this article on character sets and encodings: http://www.joelonsoftware.com/articles/Unicode.html
Before I tell you what you should know from this section, you need to read this Blue Box:
Sebesta makes a technical distinction among
However Sebesta's terminology differs from that used in RFC2045/RFC2046 (the Internet Standard for MIME), and, if search engine results and our HFSJ textbook are any indication, from common practice.
Here's the more common practice, and the one we'll use in this course:
In fact, compare the use of the term "MIME type" on p. 17 of HFSJ.
So, we'll use this more common terminology.
In fact, if you type "MIME specification" (in quotes) into a search engine, chances are that if you look at where that phrase appears in context, most often "MIME specification" is referring to RFC2045/RFC2046 themselves—that is, the "specification for MIME", the standards documents in which MIME is "specified."
Having said that, from this section, know the following:
From here on out, I'll usually omit the (in Sebesta: foo) stuff and just assume you've made the switch to the common terminology.
From this section, know
Sebesta refers you to the W3C web site (http://www.w3.org) for the HTTP protocol spec, RFC2616.
While RFC2616 is available at the W3C, it would be more appropriate to go to the IETF web site for that particular spec.
Remember:
Before proceeding into Sections 1.7.1 and 1.7.2, take a moment to familiarize yourself with the following terms defined in the intro to Section 1.7:
Before I tell you what you should know from this section, you need to read this Blue Box:
Because we'll be spending lots of time with both HTTP and Object Oriented Programming in Java, we'll use the word "method" frequently.
First, make sure you are clear that the word method has two meanings in this course:
The most common HTTP methods that you'll deal with are GET and POST. You'll hardly ever need to know about any others. Most requests for web pages use GET.
POST is only needed when you are sending information to the server along with your request (e.g. you've filled out some fields in a form on a web page.) Even then, you sometimes use GET, and sometimes POST. HFSJ has pages and pages (13-19, and 110-118) about when to use GET and when to use POST, so we'll leave that discussion for the HFSJ reading notes.
So, to review, altogether there are eight HTTP methods. Sebesta mentions only five of them, while HFSJ (on p. 109) mentions all eight. Which ones did Sebesta leave out?
This section discusses all the parts of an HTTP request. While this is a nice detailed discussion of all the pieces of an HTTP request object, one thing that is missing is the "big picture".
So you might find it helpful to refer to pages 15 and 16 of HFSJ while you read this section, where you can see a complete GET request (p. 15) and a complete POST request (p. 16).
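To whet your appetite, here is a rough sketch of my own (modeled on the kind of requests HFSJ shows, not copied from it) of what a minimal GET request and a minimal POST request look like on the wire. The host name is made up; the parameters are the ones from the course-lookup URL shown earlier in these notes. First, a GET request, where the parameters ride in the URL's query string:

GET /courseInfo.jsp?term=06S&course_sec=CISC474010 HTTP/1.1
Host: www.example.edu

And a POST request, where the same two parameters travel in the body of the request instead:

POST /courseInfo.jsp HTTP/1.1
Host: www.example.edu
Content-Type: application/x-www-form-urlencoded
Content-Length: 30

term=06S&course_sec=CISC474010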
I strongly encourage you to try the experiment that starts at the bottom of p. 17, where you use telnet to talk directly to a web server. Note that this is NOT a security problem, since you are never sending a password. Here, rather than using telnet as a login client, you are using telnet as a "general purpose TCP connection client", to establish a text-only connection directly with the web server. Essentially you are "pretending to be a web browser", and sending what the web browser "would send".
Instead of typing "telnet blanca.uccs.edu http", though, try this. The part you type is in bold.
It doesn't matter what system you type this on, as long as you have a telnet client and Internet access. Note, however, that with HTTP/1.1 your request must include the header line Host: copland.udel.edu, followed by a blank line. The part that starts with HTTP/1.1 200 OK is the response from the server, which is covered in the next section (Section 1.7.2).
$ telnet copland.udel.edu 80
Trying 128.175.13.92...
Connected to copland.udel.edu.
Escape character is '^]'.
GET /~pconrad/index.html HTTP/1.1
Host: copland.udel.edu

HTTP/1.1 200 OK
Date: Thu, 19 Jan 2006 01:46:46 GMT
Server: Apache/1.3.26 (Unix) mod_ssl/2.8.10 OpenSSL/0.9.6g
Last-Modified: Sun, 05 Sep 2004 13:52:13 GMT
ETag: "619e7b-29d-413b1a0d"
Accept-Ranges: bytes
Content-Length: 669
Content-Type: text/html
X-Pad: avoid browser bug

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd">
<html>
<head>
<title>Phillip Conrad, udel.edu home page</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<frameset rows="*" cols="133,*" frameborder="YES" border="1" framespacing="0">
  <frame src="index2c.html" name="leftFrame" scrolling="YES" noresize>
  <frameset rows="115,*" cols="*" framespacing="0" frameborder="YES" border="1">
    <frame src="index2b.html" name="topFrame" scrolling="NO" noresize>
    <frame src="index2a.html" name="mainFrame">
  </frameset>
</frameset>
<noframes><body>
</body></noframes>
</html>
Connection closed by foreign host.
$
Now try this same experiment with some other servers and pages.
Read through the description of the HTTP response, and compare it with the responses shown on
Sebesta notes that in HTTP/1.1 the default is to leave the connection open, and that this results in significant increases in efficiency.
These changes to HTTP were based on the work of Jeff Mogul, and were first published in an ACM SIGCOMM conference paper in 1995.
This intro is a great overview of the main topics that you should become familiar with this semester, and a good starting point for your Concept Map assignment.
Even if you already think you know a lot about HTML, this section is important to read, because it puts XHTML in context.
Some things you should get from this section:
There is more to say about XHTML, but we'll save that for Chapter 2 (which is entirely devoted to XHTML.)
From this section, you should know what a WYSIWYG editor is (you probably know already).
A couple of updates to this section, and things that may be helpful to know about particular WYSIWYG editors:
Adobe PageMill was discontinued in March 2000. Their newer product is called GoLive.
Adobe bought out Macromedia (former makers of Dreamweaver) this past year. Look for some realignment in this product area over the next year or two.
Dreamweaver MX is available on computers in a lab in Memorial 028. This lab is used to teach some classes in the English department (including one called "Designing Online Information" that covers Dreamweaver.) During times when the lab is not being used for classes, you can use Dreamweaver there if you like.
While Dreamweaver MX retails for $400, with academic discount it goes for around $200 through the UD Bookstore's software vendor (JourneyEd) as of January 2006.
You can also download Dreamweaver (fully functional) and try it free on any given computer for 30 days.
Dreamweaver is what I use. Although Sebesta says that Dreamweaver cannot handle all the tags of XHTML, I have yet to find any feature of XHTML that it does not support; in fact, I use it to create 100% XHTML 1.1 compliant documents. (This document itself was created in Dreamweaver MX, as a matter of fact.)
There is nothing in this section that you are responsible for for CISC474; feel free to skip it.
This is a good overview of XML, but without any examples it will probably seem hopelessly abstract.
The point is that in XHTML, you are given a set of tags to work with (e.g. <strong>, <p>, <h1>, <head>, <body>) but in XML, you come up with your own tags depending on what kind of data you are working with.
For medical data, your tags are things like <patient><diagnosis><condition><blood-pressure>.
For ski resorts, your tags are things like <lift-ticket-price><number-of-trails><snow-depth>
After you tag the data with its actual structure, you can write applications that format it into web pages, do queries on it (similar to database queries), format it for printing, or a variety of other transformations.
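For instance, a (completely made up) ski resort report might look something like this; the element names are the ones mentioned above, wrapped in a root element and attributes I invented for the example:

<?xml version="1.0" encoding="UTF-8"?>
<ski-resort name="Example Mountain">
  <lift-ticket-price currency="USD">59.00</lift-ticket-price>
  <number-of-trails>87</number-of-trails>
  <snow-depth units="inches">42</snow-depth>
</ski-resort>

Notice that the tags describe what the data is, not how to display it; deciding how to turn this into a web page, a printout, or the answer to a query is the job of a separate program or stylesheet.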
Taking a look at some examples may help you. You don't need to read these pages in detail, but at least glance at the code examples on the following pages:
Here are some more examples of XML files from various web sites. Note that in IE and Firefox, you'll see these files in a web browser with little minus signs in front of the elements. Clicking on these will change them into plus signs. Clicking again will change them back to minus signs. Observe what is happening. (See the explanation in Section 8.7, p. 322 through p. 324 if you aren't sure).
Example XML files from w3schools.com:
There are some nice examples on that same w3schools.com page of more interesting things you can do with XML files.
But beware; only about half of these things are cross-browser. Many of them work only in IE and only on Windows. As far as I know, these IE pages use Microsoft-specific extensions and not W3C standards, so it is not that Firefox is broken, but rather that Microsoft is trying to get "out ahead" of the standards bodies.
I could be mistaken about this, and will offer extra credit to anyone who can shed light on this, either by documenting that my assumption here is correct, or documenting that it is not correct.
Some things to know about JavaScript
The point about JavaScript being dynamically typed is important, but we'll cover that in much more detail when we cover Sebesta Chapter 4, which focuses on the JavaScript language.
You may have heard of Dynamic HTML (DHTML). It turns out that DHTML isn't really a separate technology at all, but is rather a set of techniques that involve using JavaScript, Cascading Style Sheets, HTML or XHTML, and something called the Document Object Model (DOM). DHTML allows you to do really cool things like drag-and-drop (for example, try dragging the pretty little Blue and Gold balls shown here around the page!). Chapters 5 and 6 in Sebesta will get us into that material.
JavaScript is also the basis for a hot new approach to building web applications called AJAX. AJAX is the basis of lots of hot web sites such as Google Maps and Gmail. The J in AJAX stands for JavaScript, and the X stands for XML.
The article that introduced AJAX to the world points out that AJAX isn't really a separate technology either. Just like DHTML, it is a particular way of using a combination of existing technologies to achieve a really cool result.
We may or may not have time to cover AJAX in detail, but I hope to at least introduce you to some of the basics. If you are interested, I encourage you to pursue applications of AJAX in your projects.
Given that CISC370 is a strict pre-requisite for CISC474, pretty much everything in this section should be review. In any case, be sure you know the following terms:
As long as we are on the subject, there is a longer list of things you should already know about Java in the file topics/java/thingsYouShouldKnow.txt on the course web site.
Also, somehow Sebesta mentions ASP.NET in the Java section, which is ironic, since with ASP.NET, Microsoft really seems to be trying to steer folks away from Java (and towards their own languages called Visual Basic and C#). We probably won't spend a lot of time on ASP.NET this semester; I was starting to move in the direction of including more ASP.NET in the course, but my contacts at Microsoft kind of dried up.
This section is probably useful for anyone who has "heard of Perl", but doesn't really know what it is.
This section will probably be more interesting if you read it in combination with pages 28 and 29 of HFSJ, where CGI programs are discussed, and the Kung Fu masters debate CGI/Perl vs. Java Servlets.
Two things you should know from this section
Beyond that, I don't plan to ask exam questions about the details of 1.8.7; if I ask you about CGI and/or Perl, it will only be after we actually do something with it. In that case, I'll probably look more to Sebesta Chapters 9 and 10 for questions.
However, I might ask you exam questions about p. 28 and p. 29 of HFSJ, so reading this section to help you understand those pages better might be very useful. This section might also help you with your Concept Mapping assignment.
Like the previous section, this section is probably useful for anyone who has "heard of PHP", but doesn't really know what it is.
As with Section 1.8.7, I want you to know
Beyond that, I don't plan to ask exam questions about the details of 1.8.8; if I ask you about PHP, it will only be after we actually do something with it. In that case, I'll probably look more to Sebesta Chapter 12 for questions.
This section may help you with your Concept Mapping assignment, though.
A nice summary. But beware of some inaccuracies that have already been covered in the detailed sections above. I'll leave it as an exercise to you to find them (I found at least three!)
Comments on these questions:
file as used here is not a protocol, but rather a URI scheme. Having said that, we can ask: what does "file" at the beginning of a URL signify?
I won't be using the exercises from this chapter. Don't worry; you'll be plenty busy without worrying about these.
Ok, before we get started, here's that promised advice for Sebesta.
I'd like to hear your opinions (those of CISC474 students) on this advice too—use the discussion board on WebCT with the title: "Reading Notes for Sebesta Questions/Discussions"
Here's my advice. Instead of the "yada yada yada" about how profound the Internet is, and the dry overviews of basic concepts, start the book this way: with a chapter that takes the reader on a tour of what various technologies can do.
This chapter would be an illustrated elaboration on the material currently in Section 1.8 ("The Web Programmer's Toolbox"). But instead of just describing what the different technologies can do, point the reader to example web sites that use the various technologies. Show some pictures in the book of the web sites described, but mostly, let the web sites speak for themselves. The text in the book would just be a brief description of how the technology is used to make that particular site "do what it does".
Get Addison-Wesley to host the sites so you'll know they'll stay up. Start with simple XHTML, and work your way through the technologies, so that the reader is motivated to learn more. This would get the book off to a much more exciting start!
You could even include on the site links to other "real-world" sites that use the technologies being described (since those links might change frequently, you probably wouldn't want to include them in the book, but on the web site, they could be updated as needed).
Then provide a second chapter (a separate chapter) of "background and basic concepts" with the stuff that is currently in Sections 1.1 through 1.7. Separating this out, and putting it after the "tour" would make it easier to swallow. You can point out that all of these things are necessary background before we can get on to the rewarding task of building interesting web sites (like the ones we saw in Chapter 1.)