Filtering user input in web applications: the basics

SQL Injection. Cross-Site Scripting. These are just two of the web application security flaws that can be prevented by effectively filtering user input. Web developers can filter user-supplied input in two ways: with white-list or with black-list input validation. Each method has its own pros and cons, so I will go through them one at a time.

Black-list input sanitization

Black-list input validation is one of the most common ways to validate user-supplied input. It works in a simple way: the developer creates a list of disallowed values, and any request that contains one of those values gets blocked. The problem with this approach is that web developers, especially those who are not well versed in web security, are likely to block only one or a few attack vectors, which leaves a potential attacker with many other options when crafting a payload. Fortunately, there is another option: white-list input validation.
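
To illustrate the weakness described above, here is a minimal sketch of a black-list filter. The function name isBlockedByBlacklist() and the two list entries are hypothetical, chosen only for this example; a real list would be longer, and it would still miss some payloads.

```php
<?php
// Hypothetical black-list filter: reports whether the input contains
// any of the disallowed substrings.
function isBlockedByBlacklist(string $input, array $blacklist): bool
{
    foreach ($blacklist as $forbidden) {
        // stripos() performs a case-insensitive substring search.
        if (stripos($input, $forbidden) !== false) {
            return true;
        }
    }
    return false;
}

// An illustrative and deliberately incomplete list of disallowed values.
$blacklist = ['<script', 'UNION SELECT'];

// The obvious payload is caught...
var_dump(isBlockedByBlacklist('<script>alert(1)</script>', $blacklist));    // bool(true)

// ...but a payload that avoids the listed values passes straight through.
var_dump(isBlockedByBlacklist('<img src=x onerror=alert(1)>', $blacklist)); // bool(false)
```

The second call shows the problem: the filter only knows about the vectors its author thought of.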

White-list input sanitization

White-list input validation is similar to black-list input validation in that it also uses a list of values to decide which requests should be blocked, but it works the opposite way: developers provide a list of allowed values, and anything not on the list is rejected. In most cases, white-list input sanitization is much more effective than black-list input sanitization. In some cases, though, an effective white-list filter can be very difficult to build, because the approach only works well when all good values are known in advance.
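
A white-list check might be sketched as follows. The function names, the column names, and the username rule are all assumptions made for this example. The second function extends the idea from a fixed list of values to a pattern of allowed characters, which is one common way to apply a white-list when the good values cannot be enumerated one by one.

```php
<?php
// Hypothetical white-list for a "sort" parameter: only known-good values pass.
function isAllowedSortColumn(string $input): bool
{
    $allowed = ['name', 'price', 'created_at']; // assumed column names
    // The third argument makes in_array() use strict (type-safe) comparison.
    return in_array($input, $allowed, true);
}

// White-listing by pattern: an assumed username rule of 3 to 20
// letters, digits, or underscores. \A and \z anchor the whole string.
function isValidUsername(string $input): bool
{
    return preg_match('/\A[A-Za-z0-9_]{3,20}\z/', $input) === 1;
}

var_dump(isAllowedSortColumn('price'));               // bool(true)
var_dump(isAllowedSortColumn('price; DROP TABLE x')); // bool(false)
var_dump(isValidUsername('alice_01'));                // bool(true)
var_dump(isValidUsername('<script>'));                // bool(false)
```

Unlike the black-list approach, an unexpected payload is rejected by default here, because it is simply not on the list.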

Sanitizing input in PHP

Here are some functions and filters that can be useful when sanitizing input in PHP:

  • htmlspecialchars() or htmlentities() – protects against Cross-Site Scripting attacks by converting characters to HTML entities. It is worth noting that htmlspecialchars() converts only the characters that have special meaning in HTML (&, <, > and, depending on the flags, quotes), while htmlentities() converts every character that has an HTML entity equivalent;
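  • FILTER_SANITIZE_STRING – removes tags from a string (used with the filter_var() function). This filter is deprecated as of PHP 8.1, so htmlspecialchars() is the better choice in current code;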
  • FILTER_VALIDATE_EMAIL – checks if an email address is valid (used with the filter_var() function);
  • FILTER_SANITIZE_EMAIL – removes all illegal characters from an email address (used with the filter_var() function);
  • FILTER_SANITIZE_URL – removes all illegal characters from a given URL (used with the filter_var() function);
  • FILTER_SANITIZE_SPECIAL_CHARS – HTML-encodes special characters (used with the filter_var() function);
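  • (int) cast on a $_GET / $_POST value – forces a given parameter to an integer. FILTER_VALIDATE_INT (used with the filter_var() function) rejects non-integer input instead of converting it. is_numeric() can also be used, but it accepts any numeric string, including floats such as "1.5";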
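  • mysql_real_escape_string() – escapes special characters in a string that is used within a query. The mysql_* functions were deprecated in PHP 5.5 and removed in PHP 7.0, so current code should use mysqli_real_escape_string() instead;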
  • strip_tags() – strips HTML and PHP tags from a string;
  • PHP Data Objects (PDO) – while PDO is not a function, its prepared statements are one of the best ways to protect against SQL Injection attacks.
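
The short sketch below shows how a few of the items from this list might be used together. The sample values, the users table, and the in-memory SQLite database are assumptions made only so the example is self-contained. It also assumes the pdo_sqlite extension, and skips the database part when that driver is not available.

```php
<?php
// 1. Output escaping: encode HTML special characters before echoing user input.
$comment = '<script>alert("xss")</script>';
$safeComment = htmlspecialchars($comment, ENT_QUOTES, 'UTF-8');
echo $safeComment, PHP_EOL; // &lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;

// 2. filter_var() returns the filtered value on success and false on failure.
$email    = filter_var('user@example.com', FILTER_VALIDATE_EMAIL); // 'user@example.com'
$badEmail = filter_var('not an email', FILTER_VALIDATE_EMAIL);     // false
$age      = filter_var('42', FILTER_VALIDATE_INT);                 // int 42
$badAge   = filter_var('42abc', FILTER_VALIDATE_INT);              // false

// An (int) cast never fails: it keeps the leading digits and drops the rest.
$castAge = (int) '42abc'; // int 42

// 3. strip_tags() removes the markup and keeps the text.
$plain = strip_tags('<b>Hello</b> world'); // 'Hello world'

// 4. PDO prepared statement against an assumed in-memory SQLite database.
$rows = null; // stays null when the SQLite driver is not installed
if (extension_loaded('pdo_sqlite')) {
    $pdo = new PDO('sqlite::memory:');
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');
    $pdo->exec("INSERT INTO users (name) VALUES ('alice')");

    // A classic injection attempt, passed as a bound parameter.
    $userInput = "alice' OR '1'='1";
    $stmt = $pdo->prepare('SELECT id, name FROM users WHERE name = :name');
    $stmt->execute([':name' => $userInput]);

    // The input is compared as a plain string, so no row matches.
    $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
    var_dump($rows); // empty array
}
```

Note the difference between the cast and FILTER_VALIDATE_INT: the cast quietly turns '42abc' into 42, while the filter reports the input as invalid, which is usually the safer behavior when the value should be rejected.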