First, let’s begin with a very simple form that asks the visitor to supply an email address. A real-world example could be a form used to subscribe to or unsubscribe from your newsletter. Since newsletters are delivered to an email address, we do not want to collect anything but a valid email address.
The only real disadvantage of the method I am about to describe is that it does not verify whether the email address actually exists. It only checks the formatting, which works well in over 90% of all cases.
To check whether an email address really exists, you can query the mail server, though many servers do not answer such queries because doing so would also open the door for spammers. The more popular method is called “double opt-in”: you send the subscriber an email that requires an action, for example clicking a link or sending a reply, to confirm the subscription to or unsubscription from a service. Honoring unsubscribe requests is a requirement of the CAN-SPAM Act.
For the purposes of this tutorial, newsletters and CAN-SPAM are not the objective, so let’s get started with the form.
<html>
<head></head>
<body>
  <form action="handler.php" method="post">
    <label for="the_email_id">Email</label>:<br />
    <input type="text" name="the_email" id="the_email_id" size="20" maxlength="60" /><br />
    <input type="submit" name="submit_btn" value="check email address" />
  </form>
</body>
</html>
The form is fairly self-explanatory: a single text field and a submit button. The script that handles the validation is named “handler.php”, as the form’s “action” attribute suggests.
Here is the script:
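The listing below is a minimal sketch reconstructed from the walkthrough that follows, not the original code. The regular expression, the variable names and the messages are assumptions; only the field name “the_email” comes from the form above. It needs PHP 7 or later because of the `??` operator.

```php
<?php
// handler.php: a reconstruction sketch, not the author's original listing.

// Assumed pattern: letters, digits, dot, underscore and hyphen on both sides
// of the "@", followed by a dot and an extension of two to four letters.
// i = case-insensitive, D = "$" only matches at the very end of the string.
$pattern = '/^[a-z0-9._-]+@[a-z0-9._-]+\.[a-z]{2,4}$/iD';

// REQUEST_METHOD is not set on the command line, hence the fallback.
$method = $_SERVER['REQUEST_METHOD'] ?? '';

// Only accept a plain string from the form field; anything else counts as empty.
$raw   = $_POST['the_email'] ?? '';
$email = is_string($raw) ? trim($raw) : '';

if ($method !== 'POST') {
    // Step 1: the form must have been submitted via POST.
    $message = 'Please use the form to submit your email address.';
} elseif (empty($email)) {
    // Step 2: cheap early exit when nothing was entered.
    $message = 'Please enter an email address.';
} elseif (preg_match($pattern, $email) !== 1) {
    // Step 3: the value does not match the expected format.
    $message = 'The email address does not look valid.';
} else {
    $message = 'The email address is formatted correctly.';
}

// Escape the output, as a habit for anything sent to the browser.
echo htmlspecialchars($message);
```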
Now let’s walk through this piece of code step by step.
First, we verify that the form was submitted via “post” (remember the HTML?). Why is this important? Well, we do not want people to tinker with our code. Tinkering leads to exploiting, and since we expect the email address to be in PHP’s $_POST (which also hints at the required “post” method), this is a good way to start.
If we pass the “post” check, we continue by checking whether anything was entered at all. This step is not strictly necessary, because the regular expression in the next step would reject an empty string as well, but a check with empty() is a lot faster than a regular expression. It lets us exit early and save resources.
(On a side note: this is also a good habit when you deal with databases and more critical data in other layers. You always want to verify what you received, and whether you received anything at all, to prevent malicious code from reaching further layers of your application.)
Last but not least we use a regular expression to test the format of the string/email supplied by the user.
A closer look reveals that we allow the characters a to z, 0 to 9, an underscore, a hyphen, and a dot in front of the “@” symbol. Following the “@” we basically allow the same, but force an extension at the end. The extension of an email address is supposed to consist of letters only, with a minimum length of two characters and a maximum length of four. Longer top-level domains such as .museum exist and would be rejected by this limit, so you may want to raise it.
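To make the description above concrete, here is one way such a pattern could be written, annotated piece by piece. The exact expression and the sample addresses are assumptions for illustration, not the original code.

```php
<?php
// Assumed pattern, spelled out with the "x" flag so that whitespace and
// "#" comments inside the expression are ignored.
$pattern = '/
    ^                # start of the string
    [a-z0-9._-]+     # before the at sign: letters, digits, dot, underscore, hyphen
    @                # the at sign itself
    [a-z0-9._-]+     # after the at sign: the same set of characters
    \.               # a literal dot in front of the extension
    [a-z]{2,4}       # extension: letters only, two to four of them
    $                # end of the string
/ixD';

// Hypothetical sample addresses.
$samples = array(
    'jane.doe@example.com',
    'jane_doe@mail.example.co.uk',
    'jane@example',          // no extension
    'jane doe@example.com',  // a space is not allowed
    'jane@example.museum',   // extension longer than four letters
);

foreach ($samples as $sample) {
    $result = preg_match($pattern, $sample) === 1 ? 'accepted' : 'rejected';
    echo $sample . ' => ' . $result . "\n";
}
```

With this pattern the first two samples are accepted and the last three are rejected.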
Using this method, we have email validation up and running in virtually no time. The code is small and could be wrapped into a function that takes the email address and returns true or false, which makes it easy to reuse inside your existing projects.
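As a rough sketch of that refactoring, the checks could be wrapped like this. The function name is_valid_email_format() and the pattern are assumptions for illustration.

```php
<?php
/**
 * Returns true when $email is a non-empty string that matches the assumed
 * format, false otherwise. Name and pattern are illustrative.
 */
function is_valid_email_format($email)
{
    // Reject non-strings and empty input before running the regular expression.
    if (!is_string($email) || $email === '') {
        return false;
    }

    // i = case-insensitive, D = "$" only matches at the very end of the string.
    return preg_match('/^[a-z0-9._-]+@[a-z0-9._-]+\.[a-z]{2,4}$/iD', $email) === 1;
}

// Usage: prints bool(true), then bool(false).
var_dump(is_valid_email_format('jane.doe@example.com'));
var_dump(is_valid_email_format('not-an-email'));
```

Returning a plain boolean keeps the function independent of the form, so the handler decides which message to show while other parts of a project can reuse the same check.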