
How to Create a Protein Family Web Site

Getting Started
Assembling the Information
Maintaining and Updating Your Site
Special Links for Protein Family Web Sites
Announcing Your Site
Contact Us

This document is targeted at biologists who want to build a web site or just want to understand how one is created. We've assumed that you are an experienced web surfer, but a novice at authoring web pages. This document can be printed and referred to later as all URLs are indicated in the text.


Getting Started

There's a wealth of information available both on and off the Web for most protein families. The challenge in building a protein family web site is finding that information and presenting it effectively. As in a review article, the information should be concise and general enough to be understood by scientists outside of the specialized field. Unlike a review, however, a web site is easily connected to other resources on the net and easily updated, and its flow of information may be embedded rather than linear. To get an idea of how protein family information can be presented, explore our lists of links to protein family web pages (http://www.proweb.org/other.html).

A protein family web site could easily develop from materials gathered to write a review article. Collect text and images which document interesting features of your favorite protein family. Collect links to other sites on the web (the home pages of researchers are one up-to-date source of how the field is being studied). Collect sequence names and accession numbers for all proteins in the family and references to the family in other databases on the net. (We've automated some of this process for you; see the Table-Making Tools Page.) Formulate a plan for how this information should be presented. Find a dedicated server on the net (usually a UNIX box) where the web page files will be stored. You're ready to begin.

Assembling the Information

A good first step in creating a web site is to make a flow chart. Make plans as to how you will organize your files and images on your server. Plan the structure of the site: do you want links from the top of each page back to the main page? Links to sub-pages at the bottom? Once you have a clear picture of how your site will look, you can get down to the dirty work.

Writing HTML

All web pages are written in HTML, the HyperText Markup Language. Fortunately for all of us, HTML has relatively simple syntax and is easily learned by example. If you feel so inclined, view the source for this page; the command to do so is under the "View" menu in most web browsers. Most web browsers are kind enough to highlight the syntax for you.

As you look at the source, here are some things to keep in mind:

Most of the other tags are easily learned by example. For those who want to use some of HTML's more advanced features, an excellent book on the subject is O'Reilly's HTML: The Definitive Guide (http://www.oreilly.com/catalog/html3/). Any of the following web sites will also provide more in-depth information:

The last few years have also seen the proliferation of so-called WYSIWYG ("What You See Is What You Get") HTML editors and export filters for word processors. Programs such as Adobe PageMill and Netscape Composer (a component of Netscape Communicator) can make it very easy to compose web pages, but their HTML is often less concise and more poorly written than a page written by hand. It often ends up being as much work to get the program to do what you want as it would be to write the page out by hand. Caveat emptor!

Some Thoughts on Style

Whether you write your page in Notepad or PageMill, it is still your responsibility to maintain good style. Nothing is more of a turn-off to the viewer than a poorly constructed page that takes forever to load! Simplicity is almost always superior to flashiness.

Working with Images

A picture is often worth a thousand words. That does not necessarily mean that five pictures are worth five thousand words, or that one big picture is. Far too many web pages today are full of big pictures placed indiscriminately on the page. Not only is this distracting, but it slows things down: often, the user must wait for the images to load before seeing any of the page. If you must use large images, consider placing a "thumbnail," a small copy of your image (generally no more than 100 x 100 pixels), on the page and linking it to the larger version. Then the user can decide personally whether the image is worth a 30 second wait. Some other ideas:

Working with Links

Links are what make the World Wide Web such a revolutionary way to distribute information. Never before has it been so easy to access different people's interpretations of the same ideas, or to provide access to background information or further in-depth discussion. Still, one must be careful not to use links in a distracting way: linking every other word makes your page look busy, and people are less likely to use any links at all. Here are some pointers:

Working with Maps (clickable images)

A map is an image, certain areas of which are designated as links to other pages. Since the "hot spots," as they are known, are defined by their coordinates, the maps can be a bit of a pain to write. While Netscape Composer cannot create image maps for you, Adobe PageMill and other freeware and shareware products can (http://dir.yahoo.com/Computers_and_Internet/Internet/World_Wide_Web/Imagemaps/Software/).
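
As a rough illustration of how hot spots are defined by their coordinates, the sketch below uses Python to write out the markup for an image map with rectangular hot spots. It assumes the client-side form of image map (a map element holding area elements) rather than a server-side one, and the function name, file names, and coordinates are all made up for this example.

```python
from html import escape


def image_map(name, image_src, hot_spots):
    """Return HTML for an image plus a client-side map of rectangular hot spots.

    hot_spots is a list of ((left, top, right, bottom), href, label) tuples,
    with coordinates in pixels measured from the image's top-left corner.
    All names and values used here are hypothetical examples.
    """
    lines = [
        f'<img src="{escape(image_src)}" usemap="#{escape(name)}" alt="">',
        f'<map name="{escape(name)}">',
    ]
    for (left, top, right, bottom), href, label in hot_spots:
        lines.append(
            f'  <area shape="rect" coords="{left},{top},{right},{bottom}" '
            f'href="{escape(href)}" alt="{escape(label)}">'
        )
    lines.append("</map>")
    return "\n".join(lines)


# Two side-by-side hot spots on a hypothetical overview image.
print(image_map(
    "overview",
    "overview.gif",
    [((0, 0, 50, 40), "domains.html", "Domains"),
     ((50, 0, 120, 40), "structures.html", "Structures")],
))
```

Each area line names one rectangle by its corner coordinates, which is the part that is tedious to work out by hand and that the map-making programs do for you.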

Working with Tables

Tables are one of HTML's most powerful features. Once mastered, the table tags allow precise control of text layout and document structure. One important consideration, however, is the amount of time it takes to render a table. Even on a very fast computer, a large table can take an unacceptably long time to load. In general, try to limit images inside tables, never put one table inside another, and, if possible, break up a big table into several smaller tables. This will speed up loading time. A few tips from an experienced table-maker:

Working with Frames

Most web users (and web authors!) have something of a love-hate relationship with frames. Used properly, they can make web site navigation much easier. Used poorly, they can make a window a confusing mess of teeny display areas. Here are some general tips for frame use:

Organization

Good organization is essential for any web site. This refers not only to how your site appears to a visitor, but also to how the files are laid out on the server. While it may seem tempting at first to dump every single file into the same directory, try not to do this. Instead, organize your pages into subdirectories in a manner consistent with your overall organizational scheme. This will make it much easier to update your site and remove unnecessary files.

Maintaining and Updating Your Site

You've finally completed your pièce de résistance. But, wait! The fun is just beginning! Given the rapid pace of science today, and your desire (we hope) to keep your page up to date, continual revision is necessary. Here are some ideas on the best way to do this.

Back up your site frequently

A word from the wise: Back up early, back up often! Before making any major changes to your web site, make a backup copy. This is a good habit in case you do the unthinkable or accidentally delete a critical page. On a UNIX machine, this is commonly done using the tar program. For instructions, ask your administrator or read the tar(1) manpage by typing "man tar" at the command line.
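
The same kind of backup can also be scripted. The sketch below is one possible approach using Python's standard tarfile module rather than the tar command itself. The function name and the directory names are placeholders chosen for this example, and it assumes the backup directory lives outside the site directory so that the archive does not try to include itself.

```python
import tarfile
import time
from pathlib import Path


def backup_site(site_dir, backup_dir):
    """Write a dated, gzip-compressed tar archive of site_dir into backup_dir.

    Returns the path of the new archive. Keep backup_dir outside site_dir.
    """
    site_dir = Path(site_dir).resolve()
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    # A timestamp in the file name keeps older backups from being overwritten.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = backup_dir / f"{site_dir.name}-{stamp}.tar.gz"

    with tarfile.open(str(archive), "w:gz") as tar:
        # arcname stores paths relative to the site directory's own name,
        # so the archive unpacks into a single folder.
        tar.add(str(site_dir), arcname=site_dir.name)
    return archive
```

Calling backup_site("public_html", "backups") before a round of edits would leave a dated archive in the backups directory; both directory names here are hypothetical.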

Note Changes

You should show your audience what you've changed. This is commonly done in several ways. First, include the date you last made changes to the content of the page at the bottom of the page. It's also a nice touch to list briefly what you changed or added, e.g. "June 12, 1999. Added section of mutants in C. elegans." Second, create or borrow little "new" and "updated" graphics, which you can place next to links to the updated pages. You might also keep a log of all the changes you've made to any pages on your site, and make it accessible from your home page.
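
If you keep your change notes as dated entries, the footer line and the change log can both be generated from the same list. The following is only a sketch of that idea in Python. The function names are invented for this example, and the second log entry is a made-up placeholder.

```python
from datetime import date
from html import escape


def change_note(when, summary):
    """Format one dated note in the style 'June 12, 1999. Added ...'."""
    return f"{when.strftime('%B')} {when.day}, {when.year}. {summary}"


def change_log_html(changes):
    """Render (date, summary) pairs as an HTML list, newest entry first."""
    newest_first = sorted(changes, key=lambda entry: entry[0], reverse=True)
    items = [
        f"  <li>{escape(change_note(when, summary))}</li>"
        for when, summary in newest_first
    ]
    return "\n".join(["<ul>"] + items + ["</ul>"])


changes = [
    (date(1999, 6, 12), "Added section of mutants in C. elegans."),
    (date(1999, 7, 1), "Fixed broken links."),  # hypothetical entry
]
print(change_log_html(changes))
```

The newest note can double as the "last modified" line at the bottom of the page, while the full list goes on a separate change log page.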

Check links

The downside of the dynamic nature of the Web, with sites being constantly updated and changed, is that links go bad with depressing regularity. Nothing is more frustrating for your audience than finding a link they're really excited about, and then getting a "Status 404: Not Found." Thus it's a good idea to go through and check your links frequently, to make sure that they still work. Often the easiest way to do this is to use one of the many HTML Validators (http://dir.yahoo.com/Computers_and_Internet/Information_and_Documentation/Data_Formats/HTML/Validation_and_Checkers/).
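
For a small site, a basic link check can also be scripted. The sketch below is a minimal example using only Python's standard library, not a replacement for a full validator. It pulls the link targets out of a page and asks each server for its status code. The function names are invented for this example, only absolute http and https links are checked, and some servers refuse the lightweight HEAD request used here, so a reported failure is worth confirming in a browser.

```python
from html.parser import HTMLParser
from urllib.error import HTTPError
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html_text):
    """Return the link targets found in a page, in document order."""
    collector = LinkCollector()
    collector.feed(html_text)
    return collector.links


def link_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the server is unreachable."""
    request = Request(url, method="HEAD")
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        return err.code  # e.g. 404 for a page that has moved or vanished
    except OSError:
        return None  # DNS failure, refused connection, timeout, ...


def broken_links(html_text, fetch_status=link_status):
    """Return (url, status) pairs for absolute links that look broken."""
    broken = []
    for url in extract_links(html_text):
        if not url.startswith(("http://", "https://")):
            continue  # skip relative links, mailto: addresses, and so on
        status = fetch_status(url)
        if status is None or status >= 400:
            broken.append((url, status))
    return broken
```

Reading a page's HTML into a string and passing it to broken_links gives a list of links to investigate. The fetch_status argument exists so the status lookup can be swapped out, for example to test the function without touching the network.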

Special Links for Protein Family Web Sites

NCBI has instructions for creating links to Entrez (http://www.ncbi.nlm.nih.gov/Entrez/linking.html) and to OMIM (http://www.ncbi.nlm.nih.gov/Omim/mimlink.html).
To link to the sequence databases, you'll need an accession number for each sequence entry.

When linking to SwissProt use
http://www.expasy.ch/cgi-bin/sprot-search-ac?P04406
When linking to GenBank use
http://www.ncbi.nlm.nih.gov/htbin-post/Entrez/query?db=n&form=6&uid=Z12345&Dopt=g
When linking to GenPept use
http://www.ncbi.nlm.nih.gov/htbin-post/Entrez/query?db=p&form=6&uid=2911084&Dopt=g
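
If you have a long list of accession numbers, these links can be generated rather than typed. The sketch below builds URLs from the three patterns shown above and wraps them in anchor tags. It assumes those patterns are still the ones the databases accept, which is worth checking against the current documentation for each database, and the function names are invented for this example.

```python
from html import escape
from urllib.parse import urlencode

# Base addresses copied from the link patterns quoted in the text above.
SWISSPROT_BASE = "http://www.expasy.ch/cgi-bin/sprot-search-ac"
ENTREZ_BASE = "http://www.ncbi.nlm.nih.gov/htbin-post/Entrez/query"


def swissprot_url(accession):
    """Link to a SwissProt entry by accession number, e.g. P04406."""
    return f"{SWISSPROT_BASE}?{accession}"


def entrez_url(uid, db):
    """Link to an Entrez entry: db is 'n' for GenBank or 'p' for GenPept."""
    query = urlencode([("db", db), ("form", "6"), ("uid", uid), ("Dopt", "g")])
    return f"{ENTREZ_BASE}?{query}"


def anchor(url, text):
    """Wrap a URL in an HTML link, escaping '&' and quotes for the page source."""
    return f'<a href="{escape(url)}">{escape(text)}</a>'


# The example identifiers are the ones used in the patterns above.
print(anchor(swissprot_url("P04406"), "P04406"))
print(anchor(entrez_url("Z12345", "n"), "Z12345"))
print(anchor(entrez_url("2911084", "p"), "2911084"))
```

Keeping the base addresses in one place means that if a database changes its link format, only one line needs to be edited before the links are regenerated.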

Announcing Your Site

Be sure that the community of biologists can find your web site. Contact us at proweb@fhcrc.org so that your site will be included in our proWeb network. Announce your site in the newsgroup bionet.software.www and in other appropriate bionet newsgroups. To be sure that your site is indexed by the major web search engines, register your site with Submit-It (http://www.submit-it.com/). The FAQ:How to Announce Your WebSite (http://ep.com/faq/webannounce.html) contains more useful links and suggestions.

Contact Us

This document is provided by Liz Greene and Nick Taylor for the proWeb network. It was developed to encourage the creation of proWeb sites. Contact us at proweb@fhcrc.org if you would like assistance creating a protein family web site.

We welcome your comments and suggestions for improving this how-to page. Contact Liz Greene at proweb@fhcrc.org.

Created 18 September 1996 23:35 GMT
Modified 29 December 1999 3:38 PST