6 digit OTP for Two Factor Auth (2FA) is brute-forceable in 3 days

by Luke Plant

Posted in:

— May 11, 2019 11:50

It is common these days to use “TOTP” as an additional factor in 2FA (Two Factor Auth) / multi-factor auth. If you have used Google Authenticator to log in to a site (you can do this with GitHub, for example), then you have used it, and many other apps and sites use the same scheme, and some SMS based 2FA systems may be based on the same concept. TOTP stands for Time-Based One-Time Password, and is specified in RFC 6238. It is based on HOTP, HMAC-Based One-Time Password.

What the RFC for TOTP does not mention at all, and the RFC for HTOP mentions with very little detail, is that the security of these methods depends critically on how you throttle request attempts and/or lock user accounts for repeated failures.

Some systems already have adequate throttling/locking in place, but some certainly don't, and this post is aimed at the latter. Getting the throttling right can be quite tricky too.

(I should mention that this post is not really original. The insight here I got from Why You don't Need 2 Factor Authentication, I am just presenting part of that page in a more detailed way, doing the maths for you, and discussing the consequences, without necessarily agreeing with the conclusions of that page).

To put it simply, with conservative assumptions and common defaults, without account locking (or something similar) an attacker can brute-force a TOTP password in just 3 days. In fact quite a bit faster might be possible.

The attack scenario here is that you have set up 2FA using Google Authenticator (or similar), and an attacker already has your username and password. After getting past the username/password dance they are presented with a screen asking for an OTP.

(If you had set up SMS instead, you will at least get an unexpected text that will alert you that someone has your password, but not with Google Authenticator.)

The whole point of 2FA is that it is supposed to stop an attacker from getting any further. For a high value account, a motivated attacker can and will continue at this point. (And if you don't consider your accounts high value, why are you bothering with 2FA?).

Now the attacker has to try to guess your OTP. How likely is that to succeed? Well, Google Authenticator provides a 6 digit code, giving one million possibilities, and it has a 30 second timeout. Let's assume the attacker can make 10 requests per second. (This is completely reasonable in many scenarios, and significantly higher might be possible). Since we don't have time to try all the possibilities, the chance of success is (30 × 10)/1000000 = 0.0003 = 0.03%, which seems pretty good. Right? Wrong.

We must remember that an attacker does not need to have 100% guarantee of success to attempt something. An attacker will try it if they think they have a 'good' chance of success. Let's assume that is 90%.

Without a timeout, the time to get to 90% chance of success is 0.9 × 1000000 / 10 requests/second = 90000 seconds = 1.04 days.

Now we add a timeout, say an hour, or 90 seconds, or whatever. What happens when the first password times out? According to the TOTP scheme, you can just try the next one. The timeout therefore stops an attacker from being able to try all the possibilities, and rules out a 100% effective attack. But they don't care about that, they just care about having a good chance of success.

Guessing randomly is pretty much our best strategy now. The probabilities look like this:

\begin{equation*} chanceOfSuccess = 1 - chanceOfFailure \end{equation*}

\begin{equation*} chanceOfFailure = {chanceOfFailingOnce}^{numberOfAttempts} \end{equation*}

This last step is a critical part – if you succeed once, you succeed, so you have to fail every time to fail overall. The chance of failing N times is the chance of failing the first time, times the chance of failing the second time.... etc. times the chance of failing the Nth time.

\begin{equation*} chanceOfFailingOnce = 1 - \frac{1}{numberOfPossibilities} \end{equation*}

\begin{equation*} numberOfAttempts = timeInSeconds \times requestsPerSecond \end{equation*}

Re-arranging for \(timeInSeconds\) and substituting:

\begin{equation*} timeInSeconds = \frac{ln(1 - chanceOfSuccess)}{ln(1 - \frac{1}{numberOfPossibilities}) \times requestsPerSecond} \end{equation*}

Putting in \(chanceOfSuccess = 0.9\), \(numberOfPossibilities = 1000000\) (6 digit code) and 10 requests per second we get 230258 seconds or 2.67 days. (To check, put that number of seconds back into the formulas and you'll see the probability of success does come out at 90%).

Note 1: The timeout does not appear in that formula! Reducing your timeout could make a big difference to usability, but makes zero difference to security. That may be counter-intuitive, but consider:

Reducing the timeout from infinity to a few minutes only increased the attack time from 1 day to 2.67 days (aiming for 90% success rate). Clearly the timeout isn't that critical.
Say you are thinking of a number between 1 and 10,000 and give me one thousand attempts to guess it. To make it harder, you change the number every 100 guesses. To make it harder still, you are thinking of changing it every 50 guesses. Would that work? Well, in the first case I get 100 guesses at 10 different numbers, in the second I get 50 guesses at 200 different numbers, but that makes no (practical) difference – I get the same number of guesses, all of them unlikely whether I guess randomly or in sequence, and I only have to guess correctly once to succeed. Mathematically, it boils down to the fact that \({(x^a)}^b = x^{ab}\).

Security is often counter-intuitive, and some security policies can often be nothing more than security theatre. Timeouts are a common target for “tightening” measures, because they seem to be easily understandable by the lay-person.

A while back there was a Django ticket filed that asked for the ability to reduce the password reset timeout to less than 1 day, because “In many applications a day is far too long and doesn't meet security requirements”. I explained that due to the way our password reset is implemented (very differently from HOTP/TOTP), changing the timeout makes precisely zero difference to the ability of an attacker to brute force, and with no timeout at all, or throttling, the mechanism is many millions of times stronger than many of the mechanisms that do indeed need timeouts for security. But I don't think anyone believed me, a short timeout seems better.

Note 2: 3 days is not very long, and entirely feasible for many attackers. If you don't have mitigation measures in place, your 2FA is broken.

In fact, TOTP also has a tolerance factor to allow for delay in transmission, that allows n previous tokens to be used, with a recommend default of n == 1. This effectively doubles your request rate (you are guessing two numbers at once, either will count), reducing the time required to less than 36 hours.

If you go for even odds (50%) rather than 90%, it comes down even further. Using an out-of-the-box installation of django-two-factor-auth, which builds on django-otp, on my development machine I was able to get 20 requests/sec for the 2FA handler without trying hard. I set up a Google Authenticator device for an account and achieved a brute-force in under 8 hours. An attacker could start after you went to bed and might be done by the time you were out of the shower.

One mitigation technique is attempt throttling. However, this has got to be done carefully. It clearly can't be done globally for the 2FA handlers or you could easily DOS yourself. It has to be done carefully on a per-user basis, which takes up storage and would be easy to get wrong. Doing it by IP address also doesn't work well — attackers can easily hire large number of IP addresses these days. Even if we reduced the number of attempts per second by a factor of 10, that attack time would only go up to 30 days, which would still be worthwhile for some attackers.

An alternative is account locking after a number of failures, which is much better. However it also brings problems. It means that your 2FA must only be accessible for people who already have passed one level of security, otherwise you have a denial of service vulnerability. Plus you need all the account unlocking procedures etc, and need to make sure they are secure, and not actually effectively another attack route.

Another option is to use some kind of back-off for failure attempts, which is what the HOTP RFC recommends briefly. For example you could use exponential back-off – you add enforced delays for attempting a token check, requiring the user to wait 1, 2, 4, 8, 16s etc. after each successive failure. This has a number of advantages: it doesn't slow down a legitimate user who just mis-typed once or twice; it requires very little storage; it is highly effective in terms of throttling, to the point of being a kind of account locking [1]; accompanied with appropriate error messages (“You've been temporarily blocked due to X successive failures”) it could alert the real account owner to the presence of an attacker; the soft “account locking” automatically expires, rather than requiring manual intervention of any kind, so that you don't get DOS for the case of unresponsive support staff.

(For the curious/worried regarding django-two-factor-auth/django-otp, a few weeks ago I implemented exactly this for the HOTP and TOTP backends in django-otp, and the fix is availabile in version 0.6.0).

SMS-based 2FA may or may not be better than TOTP, depending on how they are implemented, throttled etc. – some SMS systems just use TOTP and use an SMS message to send the current token, in which case they are equivalent.

SMS does have the advantage that at least the genuine account holder will probably realise that something is going on (although that isn't how the security of 2FA is supposed to work primarily). However, SMS does also have a ton of problems, so maybe it's not any better overall.

Increasing the length of the token does help, as does increasing the alphabet of characters used (although apparently that may have usability issues on phones). Every factor of 10 in the number of possibilities for the token results in a factor of 10 in the time required to brute force. But some apps (e.g. Google Authenticator) only supports 6 digit tokens.

Given the move towards 2FA, the disappointing thing is how little info there is about this. I found this stackexchange question complete with some misguided answers, and some good advice, but little by way of rigorous best practice.

The RFC's have plenty of details about the crypto and algorithms, possibly because those are pretty easy to define and implement in a small chunk of code, but little on the other security critical requirements which are much harder to pin down. This is a security problem in itself – programmers are attracted to something like TOTP because it looks like a properly thought out, defined solution, and the core of it is a nice programming exercise. You can get it ‘working’ relatively easily – but without the critical and more fiddly parts that need to be in place.

[Update 2019-06-01 - added footnote calculation showing effective rate for exponential backoff throttling]

[1]

Exponential increases very rapidly. The limiting factor for an attacker is how often the throttling can get reset by a successful attempt. Let's assume everything is in the attacker's favour:

The legitimate user does a successful 2FA login every day, at a predictable time.
The attacker times their guesses so that the backoff time expires just before this point, so the legitimate user doesn't see an error message and just enters the 2FA code successfully, resetting the backoff for the attacker.

Since we are starting with a 1 second delay and doubling for each failure, this gives the attacker approx \(log_2(secondsInADay)\) \(= log_2(60 \times 60 \times 24)\) \(\approx 16\) attempts per day.

This is an effective requests per second of \(0.000185\). At this rate, using our formula above (and accounting for the TOTP tolerance factor which doubles your effective rate, as above), it would require 59 years to get to even odds of achieving a brute force on a 6 digit token, which is probably OK.

By contrast, if we go with straight “N requests per second” type rate limiting, for the sake of usability for legitimate users you probably wouldn't want to throttle more aggressively than 1 request every 10 seconds per user i.e. 0.1 requests per second. In this scenario it takes just 40 days to get to even odds of a brute force, which is certainly realistic for some attackers.

You may also like: §

Comments §