Building a Site Submission Program, Continued
Defining the Search Engines
Line 37 defines the list of search engines we'll submit to. If you copy this program and add more search engines, add their names to this list. Before adding a search engine, email its operators and ask whether they have a policy against remote programs submitting to their site. (We leave Infoseek off our submission list because it has a policy against automated submissions.) Yahoo has no policy against such programs, but its submission process requires navigating through several pages, which is beyond the scope of this simple script.
37 @site_list = qw (altavista excite hotbot lycos webcrawler);
Lines 39-65 define one hash for each search engine. Each hash has the same name as the corresponding entry in @site_list on line 37.
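Because each hash shares its name with an entry in @site_list, a program can get from a name in the list to the matching hash. The rest of the script is not shown in this section, so the following is only a sketch of one way that lookup might work. It assumes the hashes are package variables and reaches them through a symbolic reference. The helper name engine_for is hypothetical, and only two engines and two keys are copied from the listing to keep it short.

```perl
use strict;
use warnings;

# Two of the hashes from the listing, declared as package variables so
# that they can be found by name. (Assumption: the original script may
# declare them differently.)
our %altavista = (
    submission_page => "http://add-url.altavista.digital.com/cgi-bin/newurl",
    success         => "has been recorded by our robot",
);
our %excite = (
    submission_page => "http://www.excite.com/cgi/add_url.cgi",
    success         => "Thank you!",
);

my @site_list = qw(altavista excite);

# Hypothetical helper: return a reference to the hash named after $name.
sub engine_for {
    my ($name) = @_;
    no strict 'refs';    # symbolic references are not allowed under strict
    return \%{ __PACKAGE__ . "::$name" };
}

for my $site (@site_list) {
    my $engine = engine_for($site);
    print "$site submits to $engine->{submission_page}\n";
}
```

A single hash of hashes keyed by engine name would give the same lookup without symbolic references, at the cost of changing how the listing declares each engine.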
Each hash includes a key/value pair describing the URL of the CGI that accepts submissions for the search engine. The key for the pair is the same in each hash: submission_page.
Each hash also includes a key/value pair describing text which appears on the search engine's response page when a successful submission has been made. The key here is success.
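To illustrate how the success value might be used, here is a minimal sketch that looks for that text in a response page. The helper name submission_succeeded and both sample pages are made up for illustration; the real response pages are not reproduced in this guide.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the engine's success text appears
# anywhere in the body of its response page, and 0 otherwise.
sub submission_succeeded {
    my ($engine, $response_body) = @_;
    return index($response_body, $engine->{success}) >= 0 ? 1 : 0;
}

my %excite = (success => "Thank you!");

# Made-up response pages, for illustration only.
my $accepted = "<html><body><h1>Thank you!</h1></body></html>";
my $rejected = "<html><body>Please enter a valid URL.</body></html>";

print "accepted page: ", submission_succeeded(\%excite, $accepted), "\n";
print "rejected page: ", submission_succeeded(\%excite, $rejected), "\n";
```

The check is a plain, case-sensitive substring match, so the success text has to be copied exactly as it appears on the engine's page.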
The other elements of each hash vary from search engine to search engine. To figure out what information each search engine requires, and the name of the variable under which that information should be sent, view the source of the engine's submission form. The name of each variable becomes a key in the hash; the value assigned to the key comes from hidden variables on the form page, or from the URL and email address submitted.
For example, viewing the source of Altavista's submission form reveals that the field in which you enter the URL is named q, and that the form has a hidden input named ad with a value of 1. So we assign the value of $input_url (which we got from our own form) to $altavista{"q"}, and assign 1 to $altavista{"ad"}.
39 $altavista{"submission_page"} = "http://add-url.altavista.digital.com/cgi-bin/newurl";
40 $altavista{"success"} = "has been recorded by our robot";
41 $altavista{"q"} = "$input_url";
42 $altavista{"ad"} = "1";
43
44 $excite{"submission_page"} = "http://www.excite.com/cgi/add_url.cgi";
45 $excite{"success"} = "Thank you!";
46 $excite{"url"} = "$input_url";
47 $excite{"email"} = "$input_email";
48 $excite{"look"} = "excite";
49
50 $hotbot{"submission_page"} = "http://www.hotbot.com/addurl.html";
51 $hotbot{"success"} = "Got it!";
52 $hotbot{"newurl"} = "$input_url";
53 $hotbot{"email"} = "$input_email";
54 $hotbot{"ip"} = "$our_ip";
55 $hotbot{"redirect"} = "http://www.hotbot.com/addurl2.html";
56
57 $lycos{"submission_page"} = "http://www.lycos.com/cgi-bin/spider_now.pl";
58 $lycos{"success"} = "We successfully spidered your page.";
59 $lycos{"query"} = "$input_url";
60 $lycos{"email"} = "$input_email";
61
62 $webcrawler{"submission_page"} = "http://webcrawler.com/cgi-bin/addURL.cgi";
63 $webcrawler{"success"} = "has been scheduled for indexing.";
64 $webcrawler{"url"} = "$input_url";
65 $webcrawler{"action"} = "add";
For Hotbot, the field in which you would normally type the URL is named newurl, and the field for the email address is named email. Hotbot also has a hidden field named redirect, whose value is http://www.hotbot.com/addurl2.html. Another hidden field, ip, is generated dynamically each time you access Hotbot's submission form. This is where we supply the $our_ip variable defined earlier.
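Reading field names out of a form's source by eye works well for a handful of engines. The same step can also be sketched in code. The helper list_form_fields below is hypothetical, and the sample form is made up to mirror the Altavista fields described earlier; it is not a copy of any engine's real page.

```perl
use strict;
use warnings;

# Hypothetical helper: list the name, type, and value attributes of each
# named <input> tag in a page. A regex scan is fragile on real-world HTML
# and ignores <select> and <textarea>; it is enough for a quick look.
sub list_form_fields {
    my ($html) = @_;
    my @fields;
    while ($html =~ /<input\b([^>]*)>/gi) {
        my $attrs = $1;
        my %field;
        while ($attrs =~ /\b(name|type|value)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))/gi) {
            my $value = defined $2 ? $2 : defined $3 ? $3 : $4;
            $field{lc $1} = $value;
        }
        # Inputs without a name (such as a plain submit button) are not sent.
        push @fields, \%field if defined $field{name};
    }
    return @fields;
}

# Made-up sample form, for illustration only.
my $sample_form = <<'HTML';
<form action="/cgi-bin/newurl" method="get">
  <input type="text" name="q" size="40">
  <input type="hidden" name="ad" value="1">
  <input type="submit" value="Submit URL">
</form>
HTML

for my $field (list_form_fields($sample_form)) {
    my $type  = defined $field->{type}  ? $field->{type}  : "text";
    my $value = defined $field->{value} ? $field->{value} : "";
    print "$field->{name} ($type) = '$value'\n";
}
```

Each hidden field's name and value can be copied straight into the engine's hash. A dynamically generated value, such as Hotbot's ip, has to be filled in at run time instead.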
The hashes for Excite, Lycos, and Webcrawler were filled out in the same way. To add another search engine to this program, create a new hash for it using this procedure.
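Every key other than submission_page and success names a form variable, so an engine hash already holds everything a request needs. The submitting code is not shown in this section. As a rough sketch only, the fields could be joined into a URL-encoded string as follows; form_body and url_encode are hypothetical helpers and the URL is a placeholder.

```perl
use strict;
use warnings;

# Percent-encode everything except unreserved characters.
sub url_encode {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-_.~])/sprintf("%%%02X", ord($1))/ge;
    return $text;
}

# Hypothetical helper: join every form field in an engine hash into a
# URL-encoded string, skipping the two bookkeeping keys.
sub form_body {
    my ($engine) = @_;
    my @pairs;
    for my $key (sort keys %$engine) {
        next if $key eq "submission_page" || $key eq "success";
        push @pairs, url_encode($key) . "=" . url_encode($engine->{$key});
    }
    return join "&", @pairs;
}

my $input_url = "http://www.example.com/";    # placeholder URL
my %altavista = (
    "submission_page" => "http://add-url.altavista.digital.com/cgi-bin/newurl",
    "success"         => "has been recorded by our robot",
    "q"               => $input_url,
    "ad"              => "1",
);

print form_body(\%altavista), "\n";    # prints ad=1&q=http%3A%2F%2Fwww.example.com%2F
```

Whether an engine expects this string in the query of a GET request or in the body of a POST depends on the method attribute of its form, so check that when viewing the form's source.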