Defeating the web form spambots
Automated form spam makes having a contact form (or any other feedback form) on your website extremely frustrating. Bots will find your form and post unwanted crap to it unless you use a paid-for service or user-intrusive features like CAPTCHA to stop them.
Without requiring a login or user authentication of some kind, it is not possible to stop the bots from using your forms without also making your users' lives more difficult. However, if you follow the steps in this article, the form posts you do see should at least come from a real human!
In 2024, my (admittedly fairly low traffic) servers received over 1000 spam form submissions for every single genuine communication from a customer or site user. By using the following approach, virtually all automated form submissions can be identified and blocked without interfering with the real users who interact with your forms.
Consider the following HTML:
<form class="contact" method="post">
<div>
<label for="contact-name">Name</label>
<input name="contact-name" id="contact-name" placeholder="enter your name">
</div>
<div>
<label for="contact-email">Email</label>
<input name="contact-email" id="contact-email" placeholder="please provide your email address">
</div>
<div>
{error-message}
<label for="contact-message">Message</label>
<textarea name="contact-message" id="contact-message" placeholder="enter your message"></textarea>
</div>
<div class="form-buttons">
<button name="contact-button">Send</button>
</div>
</form>
1. Create a honeypot.
Adding a special honeypot field to your form is the simplest thing you can do to catch most automated bots. The idea is to create a form field that the bot will try to complete but that your users will never see. You can then reject any submission that has content in the honeypot field (or fields).
Because most automated form fillers look at the HTML and review the label and other code hints to work out what they should put into the form, we can trick them into filling in something that looks like a normal field. In the above case, let's trick the bot into providing a phone number. Let's add this as a field to our form after the "contact-email" field:
<div>
<label for="contact-phone">Phone number</label>
<input name="contact-phone" id="contact-phone" placeholder="please provide your phone number">
</div>
We now need to make sure that our users don't see the field. We can do this in our CSS by targeting the third div in the form, which is where the phone field now sits:
form.contact div:nth-child(3) {
pointer-events: none;
position: absolute;
opacity: 0;
}
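Finally, we need to make sure that the honeypot doesn't interfere with usability for normal users, in particular that they cannot reach it with the Tab key. Use something like the following script in your page:
<script>
// Keep the hidden honeypot out of the keyboard tab order.
document.addEventListener('DOMContentLoaded', () => {
  const fm = document.querySelector('form.contact');
  if (!fm) return;
  // The honeypot input lives in the third div of the form (see the CSS).
  const el = fm.querySelector(':scope > div:nth-child(3) > input');
  if (el) el.tabIndex = -1;
});
</script>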
If there is any content in the contact-phone field, we can be fairly sure that the form was filled in by a bot and not a real human being.
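The article does not say what the server-side handler is written in, so the following is only a minimal sketch of the honeypot check, assuming a JavaScript (Node.js) handler that has already parsed the POST body into a plain object. The names `HONEYPOT_FIELDS`, `isHoneypotTripped` and `fields` are hypothetical; only the `contact-phone` field name comes from the form above.

```javascript
// Assumption: `fields` is the parsed POST body as a plain object,
// e.g. { 'contact-name': 'Ann', 'contact-phone': '' }.
// HONEYPOT_FIELDS and isHoneypotTripped are illustrative names.
const HONEYPOT_FIELDS = ['contact-phone'];

// Returns true when any honeypot field arrived with non-blank content.
function isHoneypotTripped(fields) {
  return HONEYPOT_FIELDS.some((name) => {
    const value = fields[name];
    return typeof value === 'string' && value.trim() !== '';
  });
}

const genuine = {
  'contact-name': 'Ann',
  'contact-email': 'ann@example.com',
  'contact-phone': '',
};
const spam = { ...genuine, 'contact-phone': '555 0100' };

console.log(isHoneypotTripped(genuine)); // false
console.log(isHoneypotTripped(spam));    // true
```

Keeping the honeypot names in a list makes it easy to add more trap fields later without touching the check itself.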
2. Detect human interaction.
If you've implemented a honeypot as above but are still getting automated spam, then you need to detect actual user interaction with your form. Replace the script above with:
<script>
document.addEventListener('DOMContentLoaded', () => {
  // c = input events seen, f = focus events seen, t = page-ready timestamp
  let c = 0, f = 0, t = Date.now();
  const fm = document.querySelector('form.contact');
  if (!fm) return;
  // Keep the hidden honeypot out of the keyboard tab order.
  const el = fm.querySelector(':scope > div:nth-child(3) > input');
  if (el) el.tabIndex = -1;
  // Count every change and every focus on the form's fields.
  fm.querySelectorAll('input, textarea').forEach(field => {
    field.addEventListener('input', () => { c++; });
    field.addEventListener('focus', () => { f++; });
  });
  // On submit, store "changes,focuses,elapsed-ms" in the button's value.
  fm.addEventListener('submit', () => {
    t = Date.now() - t;
    fm.querySelector('button').value = `${c},${f},${t}`;
  });
});
</script>
This will record:
- A count of changes to the form fields (e.g. each time a user presses a key)
- A count of how many times the user moves focus to a field
- The time the user took to complete the form
Our original form has a named button element, and the name and value of the button used to submit a form are sent along with the other fields. This script sets a value on the button just before the form is submitted, but no bot I have encountered actually runs the JavaScript. In the server-side form handler, we can now check whether the button value has any content. If it does not, the form was not filled in by a human using a web browser and can be discarded as spam.
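As a rough sketch of that first check, assuming a JavaScript (Node.js) handler that has already parsed the POST body into a plain object: the function name `submittedWithScript` and the `fields` parameter are hypothetical, while `contact-button` is the button name from the form above.

```javascript
// Assumption: `fields` is the parsed POST body as a plain object.
// submittedWithScript is an illustrative name, not part of any library.
// Returns true only when the button value was populated by the page script.
function submittedWithScript(fields) {
  const value = fields['contact-button'];
  return typeof value === 'string' && value.trim() !== '';
}

console.log(submittedWithScript({ 'contact-button': '42,5,18350' })); // true
console.log(submittedWithScript({ 'contact-button': '' }));           // false
console.log(submittedWithScript({}));                                 // false
```

Note that this check also rejects genuine visitors who have JavaScript disabled, which is a trade-off to weigh for your own audience.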
If the button field does contain content, it should contain three comma-delimited numeric parameters:
- The number of input events (roughly, key presses) used to complete the form. For typed text there should be at least as many as there are characters in the message submitted, although pasted or autofilled text produces fewer.
- The number of focus events used to complete the form (there should be at least as many as there are visible fields in the form)
- The number of milliseconds it took the user to complete the form before submitting it.
No genuine human interaction would produce a zero for any of these parameters, so this should be simple to parse and check in the server-side handler code.
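The parsing step might look something like the sketch below, again assuming a JavaScript (Node.js) handler. The names `parseInteraction` and `looksHuman` are hypothetical, and the `minElapsedMs` parameter is an added assumption: its default of 1 only enforces the "no zero" rule described above, and any stricter threshold is something you would need to tune yourself.

```javascript
// Illustrative helpers; none of these names come from a library.
// Parses "changes,focuses,elapsedMs" and returns null if it is malformed.
function parseInteraction(value) {
  const parts = String(value ?? '').split(',');
  if (parts.length !== 3) return null;
  // Each part must be a plain non-negative integer.
  if (!parts.every((p) => /^\d+$/.test(p))) return null;
  const [changes, focuses, elapsedMs] = parts.map(Number);
  return { changes, focuses, elapsedMs };
}

// Treats a submission as human only if all three counters are non-zero.
// minElapsedMs is an assumed, tunable threshold (default: just non-zero).
function looksHuman(value, minElapsedMs = 1) {
  const stats = parseInteraction(value);
  if (!stats) return false;
  return stats.changes > 0 && stats.focuses > 0 && stats.elapsedMs >= minElapsedMs;
}

console.log(looksHuman('42,5,18350')); // true
console.log(looksHuman('0,0,0'));      // false
console.log(looksHuman('not,a,number')); // false
```

Rejecting anything that does not match the exact three-number format also catches bots that fill the button value with arbitrary text.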