This document is targeted at biologists who want to build a web site or just want to understand how one is created. We've assumed that you are an experienced web surfer, but a novice at authoring web pages. This document can be printed and referred to later as all URLs are indicated in the text.
There's a wealth of information available both on and off the Web for most protein families. The challenge in building a protein family web site is finding that information and presenting it effectively. Like a review article, the information should be concise and general enough to be understood by scientists outside of the specialized field. Unlike a review, however, a web site is easily interconnected to other resources on the net, easily updated, and the flow of information may be embedded rather than linear. To get an idea of how protein family information can be presented, explore our lists of links to protein family web pages (http://www.proweb.org/other.html).
A protein family web site could easily develop from materials gathered to write a review article. Collect text and images which document interesting features of your favorite protein family. Collect links to other sites on the web (the home pages of researchers are one up-to-date source of how the field is being studied). Collect sequence names and accession numbers for all proteins in the family and references to the family in other databases on the net. (We've automated some of this process for you; see the Table-Making Tools Page.) Formulate a plan of how this info should be presented. Find a dedicated server on the net (usually a UNIX box) where the web page files will be stored. You're ready to begin.
A good first step in creating a web site is to make a flow chart. Plan how you will organize your files and images on your server. Plan the structure of the site: do you want links from the top of each page back to the main page? Links to sub-pages at the bottom? Once you have a clear picture of how your site will look, you can get down and start on the dirty work.
All web pages are written in HTML, the HyperText Markup Language. Fortunately for all of us, HTML has relatively simple syntax and is easily learned by example. If you feel so inclined, view the source for this page; the command to do so is under the "View" menu in most web browsers. Most web browsers are kind enough to highlight the syntax for you.
As you look at the source, here are some things to keep in mind:
The last few years have also seen the proliferation of so-called WYSIWYG ("What You See Is What You Get") HTML editors and export filters for word processors. While these programs, such as Adobe PageMill and Netscape Composer (a component of Netscape Communicator), can make it very easy to compose web pages, their HTML is often less concise and more poorly written than a page written by hand. It often ends up being as much work to get the program to do what you want as it would be to write the page out by hand. Caveat emptor!
Whether you write your page in Notepad or PageMill, it is still your responsibility to maintain good style. Nothing is more of a turn-off to the viewer than a poorly constructed page that takes forever to load! Simplicity is almost always superior to flashiness.
A picture is often worth a thousand words. That does not necessarily mean that five pictures are worth five thousand words, nor that a big picture is worth more than a small one. Far too many web pages today are full of big pictures placed indiscriminately on the page. Not only is this distracting, but it also slows things down: often, the user must wait for the images to load before seeing any of the page. If you must use large images, consider placing a "thumbnail," a small copy (generally no more than 100 x 100 pixels) of your image, on the page and linking it to the larger version. Then the user can decide personally whether the image is worth a 30 second wait. Some other ideas:
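The thumbnail suggestion above can be sketched in a few lines of Python. This is only an illustration, not part of the proWeb tools: the function names and the example file names are made up, and the sketch assumes you already know the pixel dimensions of the full-size image. It computes a size that fits inside 100 x 100 pixels without distorting the image, then writes the HTML for a small image that links to the large one.

```python
from html import escape


def thumbnail_size(width, height, max_side=100):
    """Scale (width, height) to fit inside max_side x max_side, keeping the aspect ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    # Use the tighter of the two constraints; the 1.0 prevents enlarging small images.
    scale = min(max_side / width, max_side / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


def thumbnail_link(full_src, thumb_src, width, height, alt):
    """Return HTML for a small inline image that links to the full-size file."""
    w, h = thumbnail_size(width, height)
    return (
        f'<a href="{escape(full_src)}">'
        f'<img src="{escape(thumb_src)}" width="{w}" height="{h}" alt="{escape(alt)}">'
        f"</a>"
    )


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(thumbnail_link("structure_large.gif", "structure_small.gif", 800, 600,
                         "Ribbon diagram of the protein"))
```

Stating the width and height in the tag lets the browser lay out the text before the image arrives. The reduced copy itself still has to be made in an image editor; the sketch only works out its size and the link.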
Links are what make the World Wide Web such a revolutionary way to distribute information. Never before has it been so easy to access different people's interpretations of the same ideas, or to provide access to background information or further in-depth discussion. Still, one must be careful not to use links in a distracting way: linking every other word makes your page look busy, and people are less likely to use any links at all. Here are some pointers:
An image map is an image in which certain areas are designated as links to other pages. Since the "hot spots," as they are known, are defined by their coordinates, the maps can be a bit of a pain to write. While Netscape Composer cannot create image maps for you, Adobe PageMill and other freeware and shareware products can (http://dir.yahoo.com/Computers_and_Internet/Internet/World_Wide_Web/Imagemaps/Software/).
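To show what "defined by their coordinates" means in practice, here is a minimal Python sketch that writes out a client-side image map (the HTML `<map>` and `<area>` elements) from a list of rectangles. It assumes rectangular hot spots given as left, top, right, bottom pixel positions measured from the top-left corner of the image; the function name, file names, and coordinates are invented for the example.

```python
from html import escape


def image_map(name, image_src, areas, alt=""):
    """Build the HTML for an image plus its client-side map.

    areas is a list of (left, top, right, bottom, href, label) tuples,
    with coordinates in pixels from the top-left corner of the image.
    """
    lines = [
        f'<img src="{escape(image_src)}" usemap="#{escape(name)}" alt="{escape(alt)}">',
        f'<map name="{escape(name)}">',
    ]
    for left, top, right, bottom, href, label in areas:
        if left >= right or top >= bottom:
            raise ValueError(f"empty or inverted rectangle for {label!r}")
        lines.append(
            f'  <area shape="rect" coords="{left},{top},{right},{bottom}" '
            f'href="{escape(href)}" alt="{escape(label)}">'
        )
    lines.append("</map>")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical domain diagram with two clickable regions.
    print(image_map(
        "domains",
        "domain_diagram.gif",
        [(0, 0, 120, 40, "domain1.html", "First domain"),
         (121, 0, 300, 40, "domain2.html", "Second domain")],
        alt="Domain diagram",
    ))
```

You still need to read the coordinates off the image in a graphics program, which is the tedious part the dedicated image map editors take care of.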
Tables are one of HTML's most powerful features. Once mastered, the table tags allow precise control of text layout and document structure. One important consideration, however, should be the amount of time it takes to render a table. In general, try to limit images inside tables, never put one table inside another, and, if possible, break up a big table into several smaller tables. This will speed up loading time. Even on a very fast computer, a large table takes an unacceptably long time to load. A few tips from an experienced table-maker:
Most web users (and web authors!) have something of a love-hate relationship with frames. Used properly, they can make web site navigation much easier. Used poorly, they can make a window a confusing mess of teeny display areas. Here are some general tips for frame use:
Good organization is essential for any web site. This refers not only to how your site appears to a visitor, but also to how the files are laid out on the server. While it may seem tempting, at first, just to dump every single file into the same directory, try not to do this. Instead, organize your pages into subdirectories in a manner consistent with your overall organizational scheme. This will make it much easier to update your site and remove unnecessary files.
You've finally completed your pièce de résistance. But, wait! The fun is just beginning! Given the rapid pace of science today, and your desire (we hope) to keep your page up to date, continual revision is necessary. Here are some ideas on the best way to do this.
A word from the wise: Back up early, back up often! Before making any major changes to your web site, make a backup copy. This is a good habit in case you do the unthinkable and accidentally delete a critical page. On a UNIX machine, this is commonly done using the tar program; for instructions, ask your administrator or read the tar(1) manpage by typing "man tar" at the command line.
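If you would rather script the backup than type tar commands, Python's standard tarfile module writes the same kind of archive. The sketch below is one possible approach, not a prescribed procedure: the function name and directory names are placeholders, and it assumes the backup directory lives outside the site directory so the archive does not end up inside itself.

```python
import tarfile
import time
from pathlib import Path


def backup_site(site_dir, backup_dir):
    """Write a timestamped .tar.gz copy of site_dir into backup_dir and return its path."""
    site = Path(site_dir).resolve()
    if not site.is_dir():
        raise NotADirectoryError(str(site))
    dest = Path(backup_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # A timestamp in the name keeps earlier backups from being overwritten.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest / f"{site.name}-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # arcname stores paths relative to the site directory name,
        # not the full path on the server.
        tar.add(site, arcname=site.name)
    return archive


if __name__ == "__main__":
    # Placeholder paths; substitute the locations used on your own server.
    print(backup_site("public_html", "site_backups"))
```

The resulting file can be unpacked with the ordinary tar program, so you are not tied to Python when you need to restore a page.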
You should show your audience what you've changed. This is commonly done in several ways. First, include the date you last made changes to the content of the page at the bottom of the page. It's also a nice touch to list briefly what you changed or added, e.g. "June 12, 1999. Added section of mutants in C. elegans." Second, create or borrow little "new" and "updated" graphics, which you can place next to links to the updated pages. You might also keep a log of all the changes you've made to any pages on your site, and make it accessible from your home page.
The downside of the dynamic nature of the Web, with sites being constantly updated and changed, is that links go bad with depressing regularity. Nothing is more frustrating for your audience than finding a link they're really excited about, and then getting a "Status 404: Not Found." Thus it's a good idea to go through and check your links frequently, to make sure that they still work. Often the easiest way to do this is to use one of the many HTML Validators (http://dir.yahoo.com/Computers_and_Internet/Information_and_Documentation/Data_Formats/HTML/Validation_and_Checkers/).
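For a small site, a home-made check can also do the job. The following Python sketch, built only on the standard library, pulls the external links out of a page and asks each server whether the target still exists. It is a rough illustration rather than a replacement for a real validator: the class and function names are our own, only absolute http and https links in `<a>` tags are examined, and some servers refuse the lightweight HEAD request used here even though the page is fine.

```python
from html.parser import HTMLParser
from urllib.error import HTTPError
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collect absolute http(s) URLs from the href attributes of <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.startswith(("http://", "https://")):
                self.links.append(value)


def extract_links(html_text):
    """Return the external links found in a page, in document order."""
    parser = LinkCollector()
    parser.feed(html_text)
    return parser.links


def check_link(url, timeout=10):
    """Return the HTTP status code, or None if the server could not be reached."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        return err.code  # e.g. 404 for a page that has disappeared
    except OSError:
        return None  # DNS failure, refused connection, timeout, ...


if __name__ == "__main__":
    page = '<a href="http://www.proweb.org/other.html">Other protein family sites</a>'
    for link in extract_links(page):
        print(link, check_link(link))
```

Any link that reports 404, or no status at all, deserves a look by hand before you remove it; the server may simply be down for the day.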
NCBI has instructions for creating links to Entrez (http://www.ncbi.nlm.nih.gov/Entrez/linking.html) and to OMIM (http://www.ncbi.nlm.nih.gov/Omim/mimlink.html).
To link to the protein sequence databases, you'll need an accession number for each sequence entry.
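Once you have a list of names and accession numbers, turning it into links is mechanical. The Python sketch below shows the idea under a loud assumption: the URL template is a placeholder on example.org, not the address of any real database, and the protein names and accession numbers are made up. Take the actual link format from the instructions published by the database you are linking to.

```python
from html import escape
from urllib.parse import quote

# PLACEHOLDER template: replace with the link format documented by the
# database you want to point at. example.org is not a sequence database.
URL_TEMPLATE = "https://example.org/protein/{accession}"


def accession_link(name, accession, template=URL_TEMPLATE):
    """Return an HTML link for one sequence entry, labeled with its name."""
    url = template.format(accession=quote(accession, safe=""))
    return f'<a href="{escape(url)}">{escape(name)}</a> ({escape(accession)})'


def accession_list(entries, template=URL_TEMPLATE):
    """Return an HTML bulleted list with one link per (name, accession) pair."""
    items = [f"  <li>{accession_link(name, acc, template)}</li>" for name, acc in entries]
    return "\n".join(["<ul>", *items, "</ul>"])


if __name__ == "__main__":
    # Made-up names and accession numbers, for illustration only.
    print(accession_list([("Example protein A", "X00001"),
                          ("Example protein B", "X00002")]))
```

Keeping the names and accession numbers in one list and regenerating the HTML from it means a change in a database's link format requires editing a single line.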
Be sure that the community of biologists can find your web site. Contact us at proweb@fhcrc.org so that your site will be included in our proWeb network. Announce your site in the newsgroup bionet.software.www and in other appropriate bionet newsgroups. To be sure that your site is indexed by the major web search engines, register your site with Submit-It (http://www.submit-it.com/). The FAQ:How to Announce Your WebSite (http://ep.com/faq/webannounce.html) contains more useful links and suggestions.
This document is provided by Liz Greene and Nick Taylor for the proWeb network. It was developed to encourage the creation of proWeb sites. Contact us at proweb@fhcrc.org if you would like assistance creating a protein family web site.
We welcome your comments and suggestions for improving this how-to page. Contact Liz Greene at proweb@fhcrc.org.
Created 18 September 1996 23:35 GMT
Modified 29 December 1999 3:38 PST