Much of the content on the web lies hidden on servers. Web "spiders" can only find files that are reachable through hyperlinks. In other words, if a human can click it, then a spider can find it. Otherwise, it remains hidden. To make hidden content visible to search engine spiders, many information managers construct site maps that assist users and spiders alike. Listed below are some tools that may be useful in constructing site maps for your web site. The list is minimal at present, but resources will be added based on recommendations from site managers. To recommend a web tool, please email Mark or Terry.
A superb web resource that summarizes the magic of search engines is available at http://www.searchenginewatch.com/webmasters/features.html
This web script can be used to create a robots.txt file to ban unwanted robots from your site. With this tool it takes just two steps to create an effective robots.txt file.
The generated text should be saved in a "robots.txt" file placed in the web root of your domain (that is, in the same location as your index page). Nearly every site benefits from a robots.txt file: well-behaved search engine robots read it before crawling, and it lets you specify whether they may spider your whole site or only parts of it. In general, it is good practice to have this file in place.
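As a rough illustration of what such a generator produces, here is a minimal Python sketch. The function name `build_robots_txt`, the robot name "BadBot", and the `/private/` path are made up for this example and are not taken from the tool itself. The result is checked with Python's standard `urllib.robotparser` module before it would be published.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(rules):
    """Build robots.txt text from {user_agent: [disallowed_path, ...]}.

    An empty list means the robot may crawl everything.
    """
    lines = []
    for user_agent, disallowed in rules.items():
        lines.append(f"User-agent: {user_agent}")
        if disallowed:
            lines.extend(f"Disallow: {path}" for path in disallowed)
        else:
            lines.append("Disallow:")
        lines.append("")  # a blank line separates records
    return "\n".join(lines)


# Hypothetical policy: ban one robot entirely, keep all others out of /private/.
robots_txt = build_robots_txt({
    "BadBot": ["/"],
    "*": ["/private/"],
})

# Parse the generated text to confirm it expresses the intended policy.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
```

Calling `parser.can_fetch("BadBot", "/index.html")` afterwards is one way to confirm that a given robot is blocked where you expect.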
This script generates a tree view of all directories and web documents on your server. It recursively reads all subdirectories, and it automatically detects whether the script itself was placed in the main directory of the server or in a subdirectory.
The script that is available on the server above has been customized to allow a wider variety of files to be displayed.
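The recursive idea behind a tree view can be sketched in a few lines of Python. This is only an illustration of the technique, not the script described above; `render_tree` is a hypothetical name, and the choice to skip symbolic links (to avoid endless loops) is an assumption of this sketch.

```python
import os


def render_tree(root, indent="  "):
    """Return an indented text listing of every directory and file under root."""
    lines = [os.path.basename(os.path.abspath(root)) + "/"]

    def walk(directory, depth):
        # Sort by name so the listing is stable from run to run.
        for name in sorted(os.listdir(directory)):
            path = os.path.join(directory, name)
            if os.path.isdir(path) and not os.path.islink(path):
                lines.append(f"{indent * depth}{name}/")
                walk(path, depth + 1)  # recurse into the subdirectory
            else:
                lines.append(f"{indent * depth}{name}")

    walk(root, 1)
    return "\n".join(lines)
```

For example, `print(render_tree("."))` lists the current directory, with each level of nesting indented two spaces further.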
Xenu's Link Sleuth is a link verification utility that checks your web site for broken links. With many previously free online link validation services now asking for payment, Link Sleuth is a great free alternative.
Link Sleuth verifies normal links, images, frames, backgrounds, and even local image maps. The generated report includes a listing of all pages, the specific pages on which bad links appear, and even a list of external sites that timed out.
One of the nice features of this software is that it will create a site map of your server's contents. It is a Windows application.
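To give a feel for what a link checker does, the following Python sketch collects the links on a page and tests whether each one responds. It is not how Link Sleuth itself works; the names `LINK_ATTRS`, `LinkCollector`, `extract_links`, and `check_link` are invented for this example, and the tag list covers only a small subset of what a full checker would handle.

```python
from html.parser import HTMLParser
from urllib.error import HTTPError
from urllib.parse import urljoin
from urllib.request import Request, urlopen

# Which attribute carries the URL for each tag (assumed subset for this sketch).
LINK_ATTRS = {"a": "href", "img": "src", "frame": "src",
              "area": "href", "body": "background"}


class LinkCollector(HTMLParser):
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        wanted = LINK_ATTRS.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


def check_link(url, timeout=10):
    """Return None if the URL responds, else a short description of the failure."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout):
            return None
    except HTTPError as err:  # the server answered, but with an error status
        return f"HTTP {err.code}"
    except OSError as err:  # DNS failure, refused connection, timeout, ...
        return f"unreachable: {err}"
```

A report could then be built by calling `check_link` on every URL returned by `extract_links` and keeping those that do not return `None`. Some servers reject HEAD requests, so a real checker would likely fall back to GET.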
Metty presents you with simple forms that allow you to input the information you want, then generate all of your meta tags. Once generated, you can copy and paste the meta tags into one of your HTML files or insert them into a new or imported HTML file. No knowledge of meta tags is necessary to use Metty.
Features
Program Name: DIR2HTML 1.1.0
URL: http://www.pc-tools.net/files/win32/freeware/d2htm110.exe
DIR2HTML creates an HTML index from a file system directory. This is useful for building file lists, cataloging the contents of CD-ROMs, or creating recursive site maps. It creates an HTML file in each directory that you tell it to index. The program can index either a single directory or multiple directories at once, recursively. DIR2HTML is only 58KB and is a Windows application.
Program Name: sitemap.pl and sitemap-win.pl
URL: http://www.edseek.org/sitemap-scripts.tar.gz
These Perl scripts can be used by site managers to create "site maps" of all the files that are available within the server's document space. The scripts are highly configurable and currently look for htm, html, pdf, asp, php, php3, and txt file extensions. To learn more about these scripts, read the sitemap-script-howto tutorial (coming soon!).
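The general approach (walk the document space and keep only files with the listed extensions) can be sketched in Python as follows. This is not a port of the Perl scripts; `build_site_map`, `EXTENSIONS`, and the `base_url` parameter are hypothetical names chosen for this illustration.

```python
import os
from html import escape
from urllib.parse import quote

# The file extensions named in the description above.
EXTENSIONS = {".htm", ".html", ".pdf", ".asp", ".php", ".php3", ".txt"}


def build_site_map(doc_root, base_url="/"):
    """Return an HTML list linking every matching file under doc_root."""
    items = []
    for directory, subdirs, files in os.walk(doc_root):
        subdirs.sort()  # makes the traversal order deterministic
        for name in sorted(files):
            if os.path.splitext(name)[1].lower() not in EXTENSIONS:
                continue  # skip images, archives, and other non-document files
            relative = os.path.relpath(os.path.join(directory, name), doc_root)
            url = base_url + quote(relative.replace(os.sep, "/"))
            items.append(f'<li><a href="{escape(url)}">{escape(relative)}</a></li>')
    return "<ul>\n" + "\n".join(items) + "\n</ul>"
```

The returned markup can be pasted into a site map page. Editing the `EXTENSIONS` set is the simplest way to widen or narrow which files are listed.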
Script Name: metatag-gen.php
URL: http://www.edseek.org/metatag-gen.php
This is a very simple JavaScript/PHP script that generates most of the meta tags that are useful for search engine robots and spiders.
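For readers curious about what a meta tag generator emits, here is a small Python sketch. It does not reproduce metatag-gen.php; the function name `build_meta_tags` and the choice of the three tags (description, keywords, robots) are assumptions made for this example.

```python
from html import escape


def build_meta_tags(description, keywords, robots="index,follow"):
    """Return HTML meta tags for the given page details."""
    fields = {
        "description": description,
        "keywords": ", ".join(keywords),
        "robots": robots,
    }
    # Escape quotes and angle brackets so the values are safe inside attributes.
    return "\n".join(
        f'<meta name="{name}" content="{escape(content, quote=True)}">'
        for name, content in fields.items()
    )
```

The output is meant to be pasted into the `<head>` section of an HTML page.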