Redirect Roulette

In this week's article I deep-dive into how to find Open Redirects in modern web applications, how to manipulate them to bypass validation checks, and briefly touch on the difficulties in preventing them.

What is an Open Redirect?

An open redirect attack occurs if an adversary is able to trick a user into visiting an external site by providing them with a URL from a legitimate site that instead redirects them elsewhere. This works because the URL looks valid and as though it would lead to a legitimate site, but in reality it automatically redirects to a malicious page.

Types of Open Redirect:

URL-Based:

Sites will often use URL parameters or HTTP messages to redirect a user to a specified URL without any additional action from the user. When an adversary is able to manipulate the value of this parameter to redirect the user offsite, we have what's known as an Open Redirect.
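
In code, this pattern often amounts to copying a query parameter straight into the Location header. A minimal sketch in Python (the next parameter name and handler are hypothetical, not from any particular framework):

```python
from urllib.parse import parse_qs, urlsplit

def handle_login_redirect(request_url):
    # Pull the redirect target straight out of the query string.
    params = parse_qs(urlsplit(request_url).query)
    destination = params.get("next", ["/account"])[0]
    # Vulnerable: the user-supplied value is copied into the Location
    # header with no validation, so any absolute URL is accepted.
    return 302, {"Location": destination}

status, headers = handle_login_redirect(
    "https://example.com/login?next=https://attacker.com"
)
print(status, headers["Location"])  # 302 https://attacker.com
```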

Referer-Based:

Another common redirect technique, adversaries can host a page that links to the victim site to ensure the request's referer header is under their control. This is useful in the event that the application uses the referer to determine where it should redirect a user to.
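
The vulnerable server-side pattern looks something like this sketch, with a hypothetical logout handler that trusts the Referer header:

```python
def handle_logout(headers):
    # Vulnerable pattern: send the user "back where they came from" by
    # trusting the Referer header, which the linking page controls.
    destination = headers.get("Referer", "/")
    return 302, {"Location": destination}

# An attacker-hosted page linking to the logout endpoint controls Referer:
print(handle_logout({"Referer": "https://attacker.com/lure.html"}))
```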

Why Should we Care?

Open redirects can be used to trick a user into thinking they have visited a legitimate site. With some clever social engineering on the adversary's behalf, they can:

  • Conduct phishing operations to steal credentials
  • Send a user to a malicious page that hosts malware

It's worth mentioning that open redirects can also be used to trick the sites themselves. Sometimes, an open redirect can even be used to bypass SSRF protections.


Finding Open Redirects:

A few usual tricks in our recon phase can assist us in the discovery of Open Redirect vulnerabilities. Let's dive into these here:

Looking for URL Parameters:

By searching for parameters used to redirect, we'll often find interesting targets in URLs. Using a proxy to record traffic as we explore a site, we can later analyze it for any parameters that contain absolute or relative URLs. From there, we can identify points of interest to look into more closely, and begin figuring out how a site implements its redirects.
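
One way to sift the recorded traffic is a small script that flags any query parameter whose value looks like an absolute or relative URL. A sketch, run here against hypothetical captured traffic:

```python
import re
from urllib.parse import parse_qsl, urlsplit

# Flag parameter values that look like absolute URLs, protocol-relative
# URLs, or rooted relative paths, all of which are likely redirect targets.
URL_LIKE = re.compile(r"^(https?:|//|/)", re.IGNORECASE)

def find_redirect_candidates(captured_urls):
    hits = []
    for url in captured_urls:
        for name, value in parse_qsl(urlsplit(url).query):
            if URL_LIKE.match(value):
                hits.append((url, name, value))
    return hits

# Hypothetical traffic recorded through our proxy:
traffic = [
    "https://example.com/login?next=https%3A%2F%2Fexample.com%2Faccount",
    "https://example.com/search?q=open+redirect",
    "https://example.com/out?dest=/profile",
]
for url, name, value in find_redirect_candidates(traffic):
    print(f"{name} -> {value}")
```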

In addition, we should take note of pages that don't contain URL parameters, but still redirect users. These are likely candidates for Referer-Based redirects, and can be identified by 3XX status codes like 301 and 302.

Google Dorking:

Google Dorks are an efficient way of finding additional redirect parameters. To look for these on a target site using Google, we should start with a simple:

site:example.com

%3D

We can then look for pages that contain URLs in their parameters by making use of %3D, the URL-encoded version of "=". By adding this as a search term, we can look for terms such as =http and =https, which are indicators of URLs in a parameter:

inurl:%3Dhttp site:example.com
inurl:%3Dhttps site:example.com

%2F

Another thing to try is using %2F, the URL-encoded version of "/". This lets us look for URL parameters containing relative paths, rather than absolute ones:

inurl:%3D%2F site:example.com

Common Parameters:

Alternatively, we can search for the names of common URL redirect parameters:

inurl:redir site:example.com
inurl:next site:example.com
inurl:forward site:example.com
inurl:view site:example.com
inurl:url site:example.com

Testing Parameter-Based Open Redirects:

We should pay attention to the functionality of each redirect parameter we've found, and test each for an open redirect. Start by inserting a random hostname, or one we own, as the redirect parameter's value. We can then confirm whether the target site automatically redirects to the hostname we've specified by monitoring traffic on the page we control.

Some sites will redirect to the destination immediately when the URL is visited, without requiring any user interaction. That said, a lot of pages won't trigger a redirect until after some form of user action. Things like:

  • Registering an Account
  • Logging in
  • Logging out

In these cases, we need to be sure to carry out the required user interaction before confirming the redirect as valid.
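
Programmatically, confirming a redirect comes down to checking the response for a 3XX status whose Location header points at a hostname we control. A sketch, using a hypothetical canary hostname:

```python
from urllib.parse import urlsplit

# Hypothetical canary hostname we own and monitor for inbound traffic.
CANARY = "canary.attacker.test"

def confirms_open_redirect(status, headers):
    # Confirmed when the response is a 3XX and the Location header
    # points at the hostname we control.
    if not 300 <= status < 400:
        return False
    return urlsplit(headers.get("Location", "")).hostname == CANARY

print(confirms_open_redirect(302, {"Location": f"https://{CANARY}/"}))    # True
print(confirms_open_redirect(302, {"Location": "https://example.com/"}))  # False
print(confirms_open_redirect(200, {}))                                    # False
```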

Testing Referer-Based Open Redirects:

Finally, we should test for referer-based open redirects on the pages we found that redirect users despite not containing a redirect URL parameter. To test these, we'll need to host a page like this on a domain we own:

<html>
	<a href="https://example.com/login">Click Me</a>
</html>

When we use our malicious page to visit the suspected target, we can check whether, after logging in, we're automatically returned to our hosted page as a result of the Referer header.


Preventing Open Redirects:

To prevent open redirects, the server needs to ensure that it doesn't redirect users to malicious locations.

URL Validators:

To this effect, sites will often implement URL Validators to ensure that the user-provided redirect points to a legitimate location. These validators will typically rely on either a blocklist or an allowlist.

Blocklists:

When a validator implements a blocklist, it checks whether the redirect URL contains certain indicators of a malicious redirect, and will then block those requests accordingly.

For example, a site may block known malicious hostnames or special URL characters often used in open-redirect attacks.

Allowlists:

When a validator implements an allowlist, it checks the hostname portion of the URL against a predetermined list of allowed hosts. If the hostname matches a known host, the redirect is allowed to go through; otherwise, the server blocks the redirect.
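
A minimal sketch of such an allowlist validator (the host list here is illustrative):

```python
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"example.com", "www.example.com"}  # illustrative allowlist

def is_allowed_redirect(url):
    # Parse the URL and compare the hostname against the allowlist.
    # Relative URLs (no hostname) are treated as same-site and allowed.
    hostname = urlsplit(url).hostname
    return hostname is None or hostname in ALLOWED_HOSTS

print(is_allowed_redirect("https://example.com/account"))      # True
print(is_allowed_redirect("/account"))                         # True
print(is_allowed_redirect("https://attacker.com/fake_login"))  # False
```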

Why's This so Hard?

While these defense mechanisms sound straightforward, the reality is that parsing and decoding a URL is extremely difficult to do properly.

Validators often have a hard time identifying the hostname portion of the URL, making open redirects one of the most common vulnerabilities in modern web applications.


Bypassing Protections:

Sites often prevent open redirects by validating the URL used to redirect the user, which makes failed URL validation the root cause of most open redirects. Unfortunately, URL validation is extremely difficult to get right.

Difficult Validation:

Here, we can see the components of a URL:

scheme://userinfo@hostname:port/path?query#fragment

The URL validator needs to be able to predict how the browser will redirect the user and reject URLs that redirect offsite. Browsers will typically redirect users to the location indicated by the hostname provided in the redirect, however, URLs don't always follow the strict format listed above. Instead, they can:

  • Be malformed
  • Have their components out of order
  • Contain characters that the browser doesn't know how to decode
  • Even have extra or missing components altogether

For example, how would the browser redirect the following URL:

https://user:pass:8080/example.com@attacker.com

When you visit this link in different browsers, you'll see that each one handles it a bit differently. Sometimes, validators don't account for all the edge cases that can cause the browser to behave unexpectedly. In cases like these, you can try to bypass the protection using the following strategies.

Abusing Browser Autocorrect:

Depending on the browser, we can abuse its autocorrect features to construct alternative URLs that redirect offsite. Modern browsers often autocorrect URLs with missing or malformed components in order to save users from typos. For example, Chrome will interpret all of the following as pointing to https://attacker.com:

https:attacker.com
https;attacker.com
https:\/\/attacker.com
https:/\/\attacker.com

These quirks can help us bypass URL validation blocklists. If the validator is using string matching and is looking for strings that contain https:// or http://, the above alternatives will all bypass that filter.

Most modern browsers also automatically correct backslashes to forward slashes, meaning they'll treat these URLs as the same:

https:\\example.com
https://example.com

If the validator doesn't recognize this behavior, the inconsistency could lead to other issues. Let's take the following example:

https://attacker.com\@example.com

If the validator doesn't treat the backslash as a path separator, it will interpret the hostname to be example.com, and treat attacker.com\ as the username portion of the URL. But if the browser corrects the backslash to a forward slash, the result is a redirect to attacker.com, with /@example.com as the path, forming the valid URL:

https://attacker.com/@example.com
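
This parser disagreement is easy to reproduce. Python's urllib, for example, follows RFC 3986 and doesn't treat the backslash as a separator, while WHATWG-style browser parsers convert it to a forward slash:

```python
from urllib.parse import urlsplit

url = "https://attacker.com\\@example.com"  # one literal backslash

# An RFC 3986-style parser keeps the backslash inside the authority,
# so everything before the "@" is treated as userinfo.
print(urlsplit(url).hostname)  # example.com

# WHATWG-style browser parsers correct "\" to "/", ending the authority
# at attacker.com and turning "/@example.com" into the path.
print(urlsplit(url.replace("\\", "/")).hostname)  # attacker.com
```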

Flawed Validator Logic:

As a common defense against open redirects, the URL validator often checks whether the redirect URL starts with, contains, or ends with the site's domain name. This type of protection can be bypassed by creating a subdomain or directory with the target's domain name on our adversary-controlled server:

https://example.com/login?redir=http://example.com.attacker.com
https://example.com/login?redir=http://attacker.com/example.com

To prevent attacks like these from succeeding, the validator might only allow URLs that both start and end with a domain listed on the allowlist. However, we can easily create a URL that fulfills both of these conditions, such as:

https://example.com/login?redir=https://example.com.attacker.com/example.com

This URL redirects to attacker.com despite the redirect parameter both beginning and ending with the original domain. This is because the browser parses the first example.com as a subdomain of attacker.com, and the second as the path.
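
A sketch of this kind of flawed check, showing how the payload above satisfies it while actually pointing offsite (naive_validator is illustrative, not from any real codebase):

```python
from urllib.parse import urlsplit

def naive_validator(redirect):
    # Flawed logic: checks where the trusted domain appears in the
    # string, not what the hostname actually is.
    return (redirect.startswith("https://example.com")
            and redirect.endswith("example.com"))

payload = "https://example.com.attacker.com/example.com"
print(naive_validator(payload))    # True: both checks pass
print(urlsplit(payload).hostname)  # example.com.attacker.com
```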

We could also get around this by making use of the "@" symbol, to make the first example.com a part of the username portion of the URL:

https://example.com/login?redir=https://example.com@attacker.com/example.com

Other special characters can be applied in similar ways in redirect URLs, often allowing us to bypass other filtering mechanisms, or to exploit discrepancies in how the browser and the site interpret our redirect:

https://example.com/login?redir=https://attacker.com\@example.com
https://example.com/login?redir=https://attacker.com?@example.com
https://example.com/login?redir=https://attacker.com#@example.com

The key takeaway is that custom-built URL validators are prone to attacks like these because developers often don't consider all the possible edge cases.

Using Data URLs:

Similarly, we can also manipulate the scheme portion of the URL to fool a validator. By using the data: scheme to embed a base64-encoded redirect URL, we can bypass the validation process entirely in some cases:

data:text/html;base64,PHNjcmlwdD5sb2NhdGlvbj0iaHR0cHM6Ly9hdHRhY2tlci5jb20iPC9zY3JpcHQ-

Where the encoded data represents:

<script>location="https://attacker.com"</script>

Using this with our open redirect would yield a payload that looks something like the following:

https://example.com/login?redir=data:text/html;base64,PHNjcmlwdD5sb2NhdGlvbj0iaHR0cHM6Ly9hdHRhY2tlci5jb20iPC9zY3JpcHQ-
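
Generating the encoded payload is straightforward; a sketch using Python's standard base64 module:

```python
import base64

script = b'<script>location="https://attacker.com"</script>'

# base64-encoding hides strings like "://" from naive filters; the
# urlsafe alphabet avoids "+" and "/" appearing in the URL itself.
encoded = base64.urlsafe_b64encode(script).rstrip(b"=").decode()
print(f"data:text/html;base64,{encoded}")

# Round-trip check: pad back out to a multiple of 4 and decode.
padded = encoded + "=" * (-len(encoded) % 4)
assert base64.urlsafe_b64decode(padded) == script
```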

URL Decoding:

When validators parse URLs, or when browsers redirect users, they first have to decode any URL-encoded characters to understand what the URL contains. If there is any inconsistency between how the validator and the browser decode these characters, we can try to exploit it to our advantage.

Validator Double Encoding:

We can try to double, or even triple, URL-encode certain special characters in our payload. For example, we can turn:

https://example.com/@attacker.com

into:

https://example.com%2f@attacker.com
https://example.com%252f@attacker.com
https://example.com%25252f@attacker.com

Whenever a mismatch exists between how the validator and the browser decode special characters, we can try to exploit it to induce an open redirect. Some validators, for example, might decode these URLs completely and then assume they redirect to example.com, since @attacker.com sits in the path portion of the URL.

The browser, however, might decode the URL incompletely, instead treating example.com%25252f as the username portion of the URL and redirecting us to attacker.com.

Browser Double Encoding:

On the other hand, if the validator doesn't double decode URLs, but the browser does, we can use a payload like this:

https://attacker.com%252f@example.com

Flipping the pieces around, we build the payload so that the validator sees example.com as the hostname, while the browser, after double decoding, redirects to attacker.com and treats /@example.com as the path, like this:

https://attacker.com/@example.com
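
The mismatch is easy to reproduce with Python's urllib: decoding once leaves example.com looking like the hostname, while decoding a second time reveals attacker.com:

```python
from urllib.parse import unquote, urlsplit

url = "https://attacker.com%252f@example.com"

# A validator that decodes at most once still sees %2f as opaque text,
# so the "@" splits the authority and example.com looks like the host.
once = unquote(url)              # https://attacker.com%2f@example.com
print(urlsplit(once).hostname)   # example.com

# A browser that decodes again ends up with a real slash, which
# terminates the authority at attacker.com.
twice = unquote(once)            # https://attacker.com/@example.com
print(urlsplit(twice).hostname)  # attacker.com
```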

Non-ASCII Characters:

We can also exploit inconsistencies in the way validators and browsers decode non-ASCII characters. For the upcoming examples, let's say that the following URL has passed through validation already:

https://attacker.com%ff.example.com

Non-ASCII:

%ff is the character ÿ, which is a non-ASCII character. The validator has determined that example.com is the domain name, and attacker.comÿ is the subdomain name. From here, several things could happen:

  • Sometimes, browsers decode non-ASCII characters into question marks. ÿ becomes ?, leaving .example.com as part of the URL query rather than the hostname:
https://attacker.com?.example.com
  • Other times, the browser attempts to find a "most alike" character. For example, the character ╱ (%E2%95%B1) may be converted to a regular slash, ending the hostname at attacker.com:
https://attacker.com/.example.com

Character Sets:

Browsers often normalize URLs in this way in an attempt to be user-friendly. In addition to similar symbols, we can use different character sets to bypass filters. The Unicode standard is a set of codes developed to represent all of the world's languages on computers.

Using a Unicode chart to find look-alike characters, we can insert them into our payloads to attempt to bypass filters. The Cyrillic character set is especially useful since it contains many characters that are extremely similar to ASCII characters.
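
A concrete, reproducible example of such normalization: the fullwidth solidus (U+FF0F) looks like a slash, and becomes a real one under NFKC normalization:

```python
import unicodedata

# U+FF0F FULLWIDTH SOLIDUS looks like "/" but is a distinct code point,
# so a string filter searching for "/" won't match it.
lookalike = "https://attacker.com\uff0f.example.com"
print("\uff0f" == "/")  # False

# Under NFKC normalization it becomes a real slash, ending the
# hostname at attacker.com.
normalized = unicodedata.normalize("NFKC", lookalike)
print(normalized)  # https://attacker.com/.example.com
```

Notably, CPython's own urlsplit was patched (CVE-2019-9636) to reject hostnames that change under NFKC normalization, precisely because of this class of attack.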

Combining Exploit Techniques:

To defeat more sophisticated validators, we'll need to combine multiple strategies to bypass their layered defenses. The following is a good example of how some of the techniques we've covered can be combined:

https://example.com%252f@attacker.com/example.com

This URL bypasses a validator that checks both that the redirect starts with example.com and that it ends with example.com:

  • Most browsers will interpret example.com%252f as the username portion of the URL, and redirect to attacker.com
  • If the validator over-decodes the URL, it will mistake example.com for the hostname

Escalating a Redirect:

Adversaries can use open redirects to make their phishing attacks more credible. For example, see the following, valid link:

https://example.com/login?next=https://attacker.com/fake_login.html

Though this URL would first lead a user to the legitimate website, it would redirect them to the attacker-controlled page after login. The attacker could then host a fake login page on a malicious site that mirrors the appearance of the legitimate login, and prompt the user to log in again with a message like:

Sorry! The username/password provided are incorrect. Please try again.

Thinking that they've potentially just typoed when trying to log in, the user would then provide their credentials to the adversary's site, after which the adversary can redirect them back to the legitimate site without them ever realizing their credentials have been stolen.

google[.]com/amp/

A great real-world redirect, which I learned of from the Critical Thinking Bug Bounty Podcast, can be found in Google's Accelerated Mobile Pages (AMP). Created to help mobile devices access and load modern web applications in low-signal areas, Google AMP pages can be used to redirect to any page that has been indexed by Google. This means that requests such as:

https://google.com/amp/bing.com
https://google.com/amp/forfoxsake.dev

work as valid redirections. This also gives us the potential to exploit targets that have integrated with Google, as they'll likely have included redirects from Google in their allowlists; an AMP redirect can then serve as our ticket to bypassing validation.

Impact from a Security Perspective:

Since organizations can't fully prevent phishing with a technical solution, bug bounty platforms will often dismiss open redirects as trivial or low-severity when reported on their own. That said, open redirects often serve as part of a bug chain and can be used to achieve a bigger impact.

Chaining Open Redirects:

Because open redirects can be used to bypass even well-implemented URL validators, they help us maximize the impact of vulnerabilities like Server-Side Request Forgery (SSRF). If a site relies on an allowlist to prevent SSRF and only allows requests to a list of predefined URLs, an adversary can use an open redirect on one of the allowlisted pages to route the request wherever they choose.
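
A sketch of why this works: the allowlist check runs against the outer URL only, while the redirect happens afterward. The hostnames here are illustrative; 169.254.169.254 is the link-local address commonly used for cloud instance metadata:

```python
from urllib.parse import urlsplit

# Hypothetical allowlist of hosts the server is permitted to fetch.
ALLOWED = {"api.example.com"}

def ssrf_check(url):
    return urlsplit(url).hostname in ALLOWED

# The check only sees the first URL. If the fetcher follows redirects,
# an open redirect on the allowlisted host re-routes the request after
# validation has already passed.
first_hop = "https://api.example.com/redir?to=http://169.254.169.254/"
print(ssrf_check(first_hop))                  # True: outer URL passes
print(ssrf_check("http://169.254.169.254/"))  # False: would be blocked directly
```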