The text and programs here are from the first edition of CGI Programming 101. This has been replaced by the 2nd edition; please see the updated material from the 2nd edition.
Environment variables are a series of hidden values that the web server sends to every CGI you run. Your CGI can parse them, and use the data they send. Environment variables are stored in a hash called %ENV.
Variable Name    Value
DOCUMENT_ROOT    The root directory of your server
HTTP_COOKIE      The visitor's cookie, if one is set
HTTP_HOST        The hostname of your server
HTTP_REFERER     The URL of the page that called your script
HTTP_USER_AGENT  The browser type of the visitor
HTTPS            "on" if the script is being called through a secure server
PATH             The system path your server is running under
QUERY_STRING     The query string (see GET, below)
REMOTE_ADDR      The IP address of the visitor
REMOTE_HOST      The hostname of the visitor (if your server has reverse-name-lookups on; otherwise this is the IP address again)
REMOTE_PORT      The port on the visitor's machine that the request was sent from
REMOTE_USER      The visitor's username (for .htaccess-protected pages)
REQUEST_METHOD   GET or POST
REQUEST_URI      The interpreted pathname of the requested document or CGI (relative to the document root)
SCRIPT_FILENAME  The full pathname of the current CGI
SCRIPT_NAME      The interpreted pathname of the current CGI (relative to the document root)
SERVER_ADMIN     The email address for your server's webmaster
SERVER_NAME      Your server's fully qualified domain name (e.g. www.cgi101.com)
SERVER_PORT      The port number your server is listening on
SERVER_SOFTWARE  The server software you're using (such as Apache 1.3)

Some servers set other environment variables as well; check your server documentation for more information. Notice that some environment variables give information about your server, and will never change from CGI to CGI (such as SERVER_NAME and SERVER_ADMIN), while others give information about the visitor, and will be different every time someone accesses the script.
Not all environment variables get set for every CGI. REMOTE_USER is only set for pages in a directory or subdirectory that's password-protected via a .htaccess file. (See Appendix D to learn how to password protect a directory.) And even then, REMOTE_USER will be the username as it appears in the .htaccess file; it's not the person's email address. There is no reliable way to get a person's email address, short of asking them outright for it (with a form).
The %ENV hash is automatically set for every CGI, and you can use any or all of it as needed. For example, if you wanted to print out the URL of the page that called your CGI, you'd do:
    print "Caller = $ENV{'HTTP_REFERER'}\n";

It's very simple to print out all of the environment variables, and some of the values in the %ENV hash will be useful to you later, so let's try it. Create a new file, and name it env.cgi. Edit it as follows:
    #!/usr/bin/perl
    print "Content-type:text/html\n\n";
    print <<EndOfHTML;
    <html><head><title>Print Environment</title></head>
    <body>
    EndOfHTML

    # Print every environment variable, sorted by name
    foreach $key (sort(keys %ENV)) {
        print "$key = $ENV{$key}<br>\n";
    }
    print "</body></html>";

Source code: http://www.cgi101.com/class/ch3/env.txt
Working example: http://www.cgi101.com/class/ch3/env.cgi
Save the above CGI, chmod it, and call it up in your web browser. Remember, if you get a server error, you'll want to go back and try running the script at the command line in the Unix shell, to see just where the problem is. (But note, if you run env.cgi in the shell, you'll get an entirely different set of environment variables.)
In this example we've sorted the keys for the ENV hash so they'll print out alphabetically, using the sort function. Perl's sort function, by default, compares the string value of each element of an array - which means it doesn't work properly for sorting numbers. Fortunately, sorting can be customized. We'll cover numeric and custom sorting in Chapter 8.
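To make the difference concrete, here is a small sketch comparing the default string sort with a numeric one. The sample list is made up for illustration, and the { $a <=> $b } comparison block is only a preview of the custom sorting covered in Chapter 8:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data, chosen to show the difference.
my @numbers = (10, 9, 2, 33);

# Default sort compares elements as strings, so "10" sorts before "2".
my @as_strings = sort @numbers;

# A comparison block using <=> compares elements as numbers instead.
my @as_numbers = sort { $a <=> $b } @numbers;

print "String order:  @as_strings\n";   # 10 2 33 9
print "Numeric order: @as_numbers\n";   # 2 9 10 33
```

For the keys of %ENV this doesn't matter, since they are all names rather than numbers.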
A Simple Query Form
There are two ways to send data from an HTML form to a CGI: GET and POST. These methods determine how the form data is sent to the server. In the GET method, the input values from the form are sent as part of the URL, and saved in the QUERY_STRING environment variable. With POST, data is sent as an input stream to the program. We'll cover POST in the next chapter, but for now, let's look at the GET method.

You can set the QUERY_STRING value in a number of ways. For example, here are a number of direct links to the env.cgi script:
http://www.cgi101.com/class/ch3/env.cgi?test1
http://www.cgi101.com/class/ch3/env.cgi?test2
http://www.cgi101.com/class/ch3/env.cgi?test3

Try opening each of these in your web browser. Notice that the QUERY_STRING is set to whatever appears after the question mark in the URL itself. In the above examples, it's set to "test1", "test2", and "test3", respectively. This can be carried one step further, by setting up a simple form, using the GET method. Here's the HTML for such an example:
    <form action="env.cgi" method="GET">
    Enter some text here:
    <input type="text" name="sample_text" size=30><input type="submit"><p>
    </form>

Create the above form (call it form.html), and call it up in your browser. Type something into the field and hit return. You'll get the same env.cgi output, but this time you'll notice that the query string has two parts. It should look something like:
    sample_text=whatever+you+typed

The value on the left is the actual name of the form field. The value on the right is whatever you typed into the input box, but you may notice that any spaces in the string you typed have been replaced with +. Similarly, punctuation and other special non-alphanumeric characters are replaced with a % sign followed by a two-digit hexadecimal code. This is called URL-encoding, and it happens with data submitted through either the GET or POST method.

Your Perl script can convert this information back, but it's often easier to use the POST method when sending long or complex data. GET is mainly useful for short, one-field queries, especially for things like database searches.
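As a rough illustration of converting URL-encoded data back, the sketch below turns each + into a space and each %XX code into the character it stands for. The url_decode name and the sample string are made up for this example, and it uses the pattern-replacement form of =~ that the next chapter introduces:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Decode one URL-encoded value.
# url_decode is a hypothetical helper name, not a Perl built-in.
sub url_decode {
    my ($text) = @_;
    $text =~ tr/+/ /;                               # "+" becomes a space
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # %21 becomes "!", etc.
    return $text;
}

print url_decode("whatever+you+typed%21"), "\n";    # whatever you typed!
```

The + signs are converted first, so a literal plus sign that arrived encoded as %2B is not mistaken for a space.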
You can also send multiple input data values with GET:
    <form action="env.cgi" method="GET">
    First Name: <input type="text" name="fname" size=30><p>
    Last Name: <input type="text" name="lname" size=30><p>
    <input type="submit">
    </form>

This will be passed to the env.cgi script as follows:
    $ENV{'QUERY_STRING'} = fname=joe&lname=smith

The values are separated by an & sign. To parse this, you'll want to split the query string with Perl's split function:
    @values = split(/&/, $ENV{'QUERY_STRING'});
    foreach $i (@values) {
        ($varname, $mydata) = split(/=/, $i);
        print "$varname = $mydata\n";
    }

split lets you break up a string into an array of different strings, breaking on a specific character. In the first case, we've split on the & sign. This gives us two values: "fname=joe" and "lname=smith", which are stored in the array named @values. Then, with a foreach loop, we further split each string on the = sign, and print out the field name and the data that was entered into that field in the form.

Some warnings about GET: it is not at all a secure method of sending data, so don't use it for sending password info, credit card data or other sensitive information. Since the data is passed through as part of the URL, it'll show up in the web server's logfile (complete with all the data), and if that logfile is readable by any user (as most are), you're giving the info away to anyone who might happen to be looking. Private information should always be sent with the POST method, which we'll cover in the next chapter. (Of course, if you're asking visitors to send sensitive information like credit card numbers, you should also use a secure server, in addition to the POST method.)
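If you'd rather look fields up by name than print them in a loop, the same two splits can fill a hash. This is only a sketch: parse_query and %form are names invented for this example, the fallback string stands in for a real query so the script also runs from the shell, and the values are left URL-encoded:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a query string into a hash of field name => value.
# parse_query is a hypothetical helper name; values stay URL-encoded.
sub parse_query {
    my ($query) = @_;
    my %form;
    foreach my $pair (split(/&/, $query)) {
        next unless length $pair;                    # skip empty pieces
        # Limit the split to 2 parts so an "=" inside the value survives.
        my ($name, $value) = split(/=/, $pair, 2);
        $form{$name} = defined $value ? $value : '';
    }
    return %form;
}

# Fall back to a sample string when QUERY_STRING isn't set.
my %form = parse_query($ENV{'QUERY_STRING'} || 'fname=joe&lname=smith');

foreach my $field (sort keys %form) {
    print "$field = $form{$field}\n";
}
```

With the data in a hash, the rest of the script can refer to $form{'fname'} directly.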
GETs are most useful because they can be embedded in a link without needing a form element. This is often used in conjunction with databases, or instances where you want a single CGI to handle a clearly defined set of options. For example, you might have a database of articles, each with a unique article ID. You could write a single article.cgi to serve up the article, and the CGI would simply look at the query string to figure out which article to display. For example, clicking on
    <a href="article.cgi?22">Article Header</a>

would display article #22.
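Here is a minimal sketch of how such an article.cgi might read its query string. The article_id helper and the placeholder messages are assumptions made for illustration; a real script would go on to look the article up wherever it is stored:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the article ID if the query string is made up only of digits,
# or undef otherwise. article_id is a hypothetical helper name.
sub article_id {
    my ($query) = @_;
    return undef unless defined $query;
    return $query =~ /^(\d+)\z/ ? $1 : undef;
}

my $id = article_id($ENV{'QUERY_STRING'});

print "Content-type:text/html\n\n";
if (defined $id) {
    # Placeholder: a real script would fetch and print article $id here.
    print "<html><body>Displaying article #$id</body></html>\n";
} else {
    print "<html><body>Sorry, no such article.</body></html>\n";
}
```

Since anyone can type whatever they like after the question mark, the script accepts the value only when it looks like an article number.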
Remote Host ID
You've probably seen web pages that greet you with a message like "Hello, visitor from (yourhost)!", where (yourhost) is your actual hostname or IP address. Here is an example of how to do that:
    #!/usr/bin/perl
    print "Content-type:text/html\n\n";
    print <<EndHTML;
    <html><head><title>Hello!</title></head>
    <body>
    <h2>Hello!</h2>
    Welcome, visitor from $ENV{'REMOTE_HOST'}!<p>
    </body></html>
    EndHTML

Source code: http://www.cgi101.com/class/ch3/rhost.txt
Working example: http://www.cgi101.com/class/ch3/rhost.cgi
This particular CGI creates a new page, but you'll probably want to use a server-side include (SSI), instead, to embed the information in another page. See Chapter 9 for more on SSIs.
One caveat: this won't work if your server isn't configured to do host name lookups. An alternative would be to display the visitor's IP address:
    Welcome, visitor from $ENV{'REMOTE_ADDR'}!<p>

Working example: http://www.cgi101.com/class/ch3/rhostip.cgi
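The two approaches can also be combined, so the script uses the hostname when the server provides one and falls back to the IP address otherwise. A minimal sketch, assuming a made-up helper named visitor_label and a made-up final fallback string:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pick the best available label for the visitor.
# visitor_label is a hypothetical helper name, not part of Perl.
sub visitor_label {
    my ($env) = @_;
    # || moves on to the next choice when a value is missing or empty.
    return $env->{'REMOTE_HOST'} || $env->{'REMOTE_ADDR'} || 'parts unknown';
}

my $visitor = visitor_label(\%ENV);

print "Content-type:text/html\n\n";
print "<html><head><title>Hello!</title></head>\n";
print "<body>\n<h2>Hello!</h2>\n";
print "Welcome, visitor from $visitor!<p>\n";
print "</body></html>\n";
```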
Last Page Visited
This is a variation on the remote host ID script - only here, we show the last page you visited.
    #!/usr/bin/perl
    print "Content-type:text/html\n\n";
    print <<EndHTML;
    <html><head><title>Hello!</title></head>
    <body>
    <h2>Hello!</h2>
    I see you've just come from $ENV{'HTTP_REFERER'}!<p>
    </body>
    </html>
    EndHTML

Source code: http://www.cgi101.com/class/ch3/refer.txt
Working example: http://www.cgi101.com/class/ch3/refer.cgi
The HTTP_REFERER value only gets set when a visitor actually clicks on a link to your page - if they type the URL directly, then HTTP_REFERER is blank.
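Because the value may be blank, a script can check for it before printing the greeting. The sketch below is one way to do that; referer_message and the wording of the fallback message are invented for this example:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the greeting, allowing for a missing or empty HTTP_REFERER.
# referer_message is a hypothetical helper name.
sub referer_message {
    my ($referer) = @_;
    if (defined $referer && $referer ne '') {
        return "I see you've just come from $referer!";
    }
    return "I can't tell which page you came from.";
}

print "Content-type:text/html\n\n";
print "<html><head><title>Hello!</title></head>\n";
print "<body>\n<h2>Hello!</h2>\n";
print referer_message($ENV{'HTTP_REFERER'}), "<p>\n";
print "</body></html>\n";
```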
Checking Browser Type
This script does some pattern-checking to see what browser the visitor is using, and displays a different message depending on browser type.
    #!/usr/bin/perl
    print "Content-type:text/html\n\n";
    print "<html><head><title>Welcome</title></head>\n";
    print "<body>\n";
    print "Browser: $ENV{'HTTP_USER_AGENT'}<p>\n";

    # Check for MSIE first, since IE also identifies itself as Mozilla
    if ($ENV{'HTTP_USER_AGENT'} =~ /MSIE/) {
        print "You seem to be using <b>Internet Explorer!</b><p>\n";
    } elsif ($ENV{'HTTP_USER_AGENT'} =~ /Mozilla/) {
        print "You seem to be using <b>Netscape!</b><p>\n";
    } else {
        print "You seem to be using a browser other than Netscape or IE.<p>\n";
    }
    print "</body></html>\n";

Source code: http://www.cgi101.com/class/ch3/browser.txt
Working example: http://www.cgi101.com/class/ch3/browser.cgi
This is a tricky example because IE actually includes "Mozilla" in the browser type line, so we have to try matching "MSIE" first, before matching "Mozilla". The =~ is a pattern matching operator; it checks to see if /pattern/ is contained somewhere in the string. You can also use the =~ operator to replace patterns; we'll see an example of that in the next chapter.
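To see why the order of the tests matters, the sketch below runs the same two pattern matches against a couple of sample browser strings. The browser_name helper is a made-up name, and the sample strings are only illustrative of what browsers of that era sent:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Classify a browser string using the same checks as browser.cgi.
# browser_name is a hypothetical helper name.
sub browser_name {
    my ($agent) = @_;
    $agent = '' unless defined $agent;
    return 'Internet Explorer' if $agent =~ /MSIE/;      # must come first
    return 'Netscape'          if $agent =~ /Mozilla/;
    return 'another browser';
}

# Illustrative sample strings (assumptions, not captured from real visitors).
my @samples = (
    'Mozilla/4.0 (compatible; MSIE 5.0; Windows 98)',
    'Mozilla/4.7 [en] (Win98; I)',
    'Lynx/2.8.3',
);

foreach my $agent (@samples) {
    print "$agent => ", browser_name($agent), "\n";
}
```

If the two tests were swapped, the first sample would be reported as Netscape, because its string contains "Mozilla" as well as "MSIE".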
Resources
Visit http://www.cgi101.com/class/ch3/ for source code and links from this chapter.