Reddit authentication

Preparation

Note

Before digging in, check the Prerequisites page first.

Once the traffic has been captured, there will probably be many HTTP requests that are irrelevant to the authentication. Start by removing all static files (.png, .js, .pdf, etc.). When you’re left with fewer requests to deal with, it’s time to dive deeper and understand how the authentication works.
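As a rough illustration of that first filtering pass, the sketch below drops requests for static files from a list of captured URLs. It is plain Python and not part of Raider; the extension list, the helper names and the sample URLs are assumptions made up for this example.

```python
from urllib.parse import urlparse

# Assumed list of extensions to treat as static content; extend as needed.
STATIC_EXTENSIONS = (".png", ".js", ".pdf", ".css", ".jpg", ".gif", ".svg")


def is_static(url: str) -> bool:
    """Return True when the URL path ends with a static file extension."""
    path = urlparse(url).path.lower()
    return path.endswith(STATIC_EXTENSIONS)


def drop_static(urls):
    """Keep only the requests that may matter for authentication."""
    return [url for url in urls if not is_static(url)]


# Hypothetical sample of captured URLs.
captured = [
    "https://www.reddit.com/login/",
    "https://www.reddit.com/static/runtime.js",
    "https://www.reddit.com/static/icon.png?v=2",
    "https://s.reddit.com/api/v1/sendbird/unread_message_count",
]

remaining = drop_static(captured)
print(remaining)
```

Matching on the parsed path rather than on the raw URL keeps query strings such as ?v=2 from hiding the extension.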

The easiest way to start is to work backwards from one authenticated request. This should be a request that only works when the user is already authenticated. I chose the unread_message_count one for Reddit, and the request looks like this:

GET https://s.reddit.com/api/v1/sendbird/unread_message_count HTTP/1.1
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0
Accept: application/json
Accept-Language: en-US,en;q=0.5
Content-Type: application/json
Origin: https://www.reddit.com
DNT: 1
Authorization: Bearer [REDACTED TOKEN]
Referer: https://www.reddit.com/
Connection: keep-alive
Host: s.reddit.com

As you can see, the only authentication-related information sent to this URL is the Bearer token.

We define a new Flow in Hy that will check for unread messages:

(setv get_unread_messages
      (Flow
         (Request.get
         "https://s.reddit.com/api/v1/sendbird/unread_message_count"
         :headers [(Header.bearerauth access_token)])))

In Hy, setv is used to set up new variables. Here we created the variable get_unread_messages that will hold the information about this Flow.

The only required parameter for Flow objects is the request, which contains the actual HTTP request definition as a Request object.

The Request object requires only the method and url. Other parameters are optional. We translate the original request into Raider config format, and to use the access token we need to define it in the request header. Since this is a bearer header, we use Header.bearerauth with the access_token which we will create later on.
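For readers unfamiliar with bearer authentication, the captured request above shows the header format involved: Authorization: Bearer followed by the token. The sketch below builds that header by hand in plain Python. The function name bearer_header and the placeholder token are assumptions for illustration, and this is not Raider's own implementation of Header.bearerauth.

```python
def bearer_header(token: str) -> dict:
    """Build an Authorization header in the form seen in the captured request."""
    return {"Authorization": f"Bearer {token}"}


# Placeholder value; in the real setup the token comes from access_token.
headers = bearer_header("PLACEHOLDER_TOKEN")
print(headers)
```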

Getting the access token

The next step is to find out where this token is generated and how we can extract it. Searching for the token in previous responses, we can see it first appeared in the response to a request for the main Reddit page. It’s located inside the <script id="data"> part of the response, and it looks like this:

[...] "session":{"accessToken":"[REDACTED_TOKEN]","expires":"2021-06-23T19:30:10.000Z" [...]

The easiest way to extract the token with Raider is to use the Regex module. This module searches for the regex you supply and returns the value of the first matching group. A group is the part of the pattern between the ( and ) characters. The final object I configured looks like this:

(setv access_token
      (Regex
        :name "access_token"
        :regex "\"accessToken\":\"([^\"]+)\""))

We set the variable access_token to a Regex object with the internal name access_token. It returns the string between the double quotes that follow the “accessToken” key.

Now we need to define the actual request that will get us this access token. To do this, we take a closer look at the request that produced this response:
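To check what this pattern captures, the same regex can be tried outside Raider with Python's re module. This is only a sketch: the response body is a shortened stand-in for the snippet shown earlier, and the token value is a placeholder.

```python
import re

# Same pattern as in the Regex object above.
ACCESS_TOKEN_RE = re.compile(r'"accessToken":"([^"]+)"')

# Shortened stand-in for the response body; the token is a placeholder.
body = (
    '"session":{"accessToken":"PLACEHOLDER_TOKEN",'
    '"expires":"2021-06-23T19:30:10.000Z"'
)

match = ACCESS_TOKEN_RE.search(body)
# group(1) is the first group, i.e. the text between ( and ) in the pattern.
access_token = match.group(1) if match else None
print(access_token)
```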

GET https://www.reddit.com/ HTTP/1.1
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
DNT: 1
Upgrade-Insecure-Requests: 1
Connection: keep-alive
Cookie: csv=1; edgebucket=PPJTEvVRvoolrqFkYw; G_ENABLED_IDPS=google; loid=[REDACTED]; eu_cookie={%22opted%22:true%2C%22nonessential%22:false}; token_v2=[REDACTED]; reddit_session=[REDACTED]
Host: www.reddit.com

Now we can see there are several cookies being sent with this request. Most of them are irrelevant here. To see which ones are required for the request to succeed, we remove them one by one and check whether the response still contains the information we need. By doing this, I found out that the only cookie we need is reddit_session. As long as we supply it in the request, we do get the access_token in the response. With this information, we can now write the definition of the request:

(setv get_access_token
      (Flow
         (Request.get "https://www.reddit.com/"
                   :cookies [reddit_session])
        :outputs [access_token]
        :operations [(Print access_token)
                     (Next "get_unread_messages")]))

Here we can see that we specified the reddit_session cookie to be sent with the request, and access_token as the only output generated from the response.

Now we define the cookie like this:

(setv reddit_session (Cookie "reddit_session"))

When the get_access_token stage is complete, two operations will be executed. The first prints the value of access_token on the command line, and the second tells Raider to go to the get_unread_messages Flow that we defined previously.
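The cookie elimination described earlier in this section can also be automated. The sketch below is one possible way to do it in plain Python, not a Raider feature. The names find_required_cookies and fake_send are hypothetical, and fake_send is an offline stand-in for a real HTTP request that simply assumes reddit_session is the only cookie that matters.

```python
def find_required_cookies(cookies, send, marker):
    """Drop cookies one at a time and keep only those the response needs.

    cookies: dict of cookie name -> value taken from the captured request
    send:    callable taking a cookie dict and returning the response body
    marker:  string that must appear in a successful response
    """
    required = dict(cookies)
    for name in list(cookies):
        trial = {k: v for k, v in required.items() if k != name}
        if marker in send(trial):
            # The response still has what we need, so this cookie is optional.
            required = trial
    return sorted(required)


def fake_send(cookies):
    """Offline stand-in for the request; assumes only reddit_session matters."""
    if "reddit_session" in cookies:
        return '"session":{"accessToken":"PLACEHOLDER_TOKEN"'
    return '"session":{}'


# Cookie names from the captured request, with placeholder values.
captured_cookies = {
    "csv": "1",
    "edgebucket": "x",
    "loid": "x",
    "token_v2": "x",
    "reddit_session": "x",
}

needed = find_required_cookies(captured_cookies, fake_send, '"accessToken"')
print(needed)
```

Against a live site, send would issue the GET request with the given cookies; testing one cookie at a time keeps the number of requests equal to the number of cookies.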

Multi-factor authentication

To show how Raider works with multi-factor authentication, I have enabled it on my reddit account, and added this step to the configuration. In the web proxy, the request looks like this:

POST https://www.reddit.com/login HTTP/1.1
User-agent: digeex_raider/0.0.1
Accept: */*
Connection: keep-alive
Cookie: session=[REDACTED]
Content-Length: 154
Content-Type: application/x-www-form-urlencoded
Host: www.reddit.com

password=[REDACTED]&username=[REDACTED]&csrf_token=[REDACTED]&otp=566262&dest=https%3A%2F%2Fwww.reddit.com
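The body above is ordinary URL-encoded form data, which is why the dest value appears as https%3A%2F%2Fwww.reddit.com. The sketch below reproduces that encoding with Python's standard library, using placeholder values for the redacted fields. It only illustrates the wire format and says nothing about how Raider encodes the data internally.

```python
from urllib.parse import parse_qs, urlencode

# Placeholder values standing in for the redacted fields.
form = {
    "password": "PLACEHOLDER_PASSWORD",
    "username": "PLACEHOLDER_USER",
    "csrf_token": "PLACEHOLDER_CSRF",
    "otp": "566262",
    "dest": "https://www.reddit.com",
}

# Encode the fields the way application/x-www-form-urlencoded expects.
body = urlencode(form)
print(body)

# Decoding the body gives back the original fields.
decoded = {key: values[0] for key, values in parse_qs(body).items()}
```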

Now we translate the request into a Raider Request:

(Request.post
   "https://www.reddit.com/login"
   :cookies [session_id]
   :data
   {"password" password
    "username" username
    "csrf_token" csrf_token
    "otp" mfa_code
    "dest" "https://www.reddit.com"})

Here we use the new cookie called session_id that we define as:

(setv session_id (Cookie "session"))

To use the username and password of the active user, we create two new inputs of type Variable:

(setv username (Variable "username"))
(setv password (Variable "password"))

The nickname can be extracted with a Regex:

(setv nickname
    (Regex
      :name "nickname"
      :regex "href=\"/user/([^\"]+)"))
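The nickname pattern can be tried the same way with Python's re module. The HTML fragments and the nickname example_user below are made up for illustration. Note that the group captures everything up to the closing quote, so a trailing slash in the link would end up in the extracted value.

```python
import re

# Same pattern as in the Regex object above.
NICKNAME_RE = re.compile(r'href="/user/([^"]+)')

# Hypothetical page fragment; "example_user" is a made-up nickname.
html = '<a class="profile" href="/user/example_user">profile</a>'

match = NICKNAME_RE.search(html)
nickname = match.group(1) if match else None
print(nickname)

# With a trailing slash in the link, the slash is part of the capture.
with_slash = NICKNAME_RE.search('<a href="/user/example_user/">').group(1)
print(with_slash)
```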

The multi-factor authentication code will be given as an input to the CLI manually, so we define the mfa_code as a Prompt plugin:

(setv mfa_code (Prompt "MFA"))

The csrf_token value will be defined later on.

I defined the multi_factor stage as shown below:

(setv multi_factor
      (Flow
         (Request.post
            "https://www.reddit.com/login"
            :cookies [session_id]
            :data
            {"password" password
             "username" username
             "csrf_token" csrf_token
             "otp" mfa_code
             "dest" "https://www.reddit.com"})
        :outputs [reddit_session]
        :operations [(Print reddit_session csrf_token)
                     (Http
                       :status 200
                       :action
                       (Next "get_access_token"))
                     (Http
                       :status 400
                       :action
                       (Grep
                         :regex "WRONG_OTP"
                         :action
                         (Next "initialization")
                         :otherwise
                         (Error "Multi-factor authentication error")))]))

The only useful output that this stage will generate is the reddit_session cookie.

Looking at the operations, several things are happening here. The first operation just prints the values of reddit_session and csrf_token to the CLI output.

The second operation will instruct Raider to go to the get_access_token Flow if the HTTP response code is 200.

The third operation will run only if the status code is 400, which means the authentication failed. Inside the response body of a failed request will be a message indicating why it failed. Raider will then Grep the response for the string “WRONG_OTP” in case we gave the wrong multi-factor authentication code. If it matches, Raider will go to the initialization Flow starting the authentication from a clean state again.

We will define the initialization Flow later in this tutorial. If the string “WRONG_OTP” isn’t found, Raider will quit with the error message “Multi-factor authentication error”.
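The branching performed by these operations can be summarised as a small decision function. The sketch below is plain Python written for this tutorial, not Raider code. The function name is hypothetical, and returning None for status codes other than 200 and 400 is an assumption, since the configuration defines no operation for them.

```python
from typing import Optional


def multi_factor_next(status: int, body: str) -> Optional[str]:
    """Return the name of the next Flow, or raise on an unrecoverable error."""
    if status == 200:
        return "get_access_token"
    if status == 400:
        if "WRONG_OTP" in body:
            # Wrong code: restart the authentication from a clean state.
            return "initialization"
        raise RuntimeError("Multi-factor authentication error")
    # Assumption: no operation matches other status codes.
    return None


print(multi_factor_next(200, ""))
print(multi_factor_next(400, "WRONG_OTP"))
```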

Login

On reddit, the login request looks similar to the multi-factor one, so the Flow definition is pretty similar:

(setv login
      (Flow
        (Request.post
           "https://www.reddit.com/login"
           :cookies [session_id]
           :data
           {"password" password
            "username" username
            "csrf_token" csrf_token
            "otp" ""
            "dest" "https://www.reddit.com"})
        :outputs [session_id reddit_session]
        :operations [(Print session_id reddit_session)
                     (Http
                       :status 200
                       :action
                       (Grep
                         :regex "TWO_FA_REQUIRED"
                         :action
                         (Next "multi_factor")
                         :otherwise
                         (Next "get_access_token"))
                       :otherwise
                       (Error "Login error"))]))

Getting the CSRF token

The only piece of information we’re missing at this point is the CSRF token.

For the csrf_token, we need to find out where it was created. Searching inside the web proxy for the value of the token, we find it in a previous response. The relevant part of the HTML code looks like this:

<input type="hidden" name="csrf_token" value="8309984e972e6608475765db68e25ffb8c0bedc9">

So we have its value inside an input tag of type hidden, with the name csrf_token. The actual value is a 40-character string made up of lowercase hexadecimal characters. We define this as an Html plugin:

(setv csrf_token
      (Html
        :name "csrf_token"
        :tag "input"
        :attributes
        {:name "csrf_token"
         :value "^[0-9a-f]{40}$"
         :type "hidden"}
        :extract "value"))

This object will extract the csrf_token value, and use it as an input where necessary.
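To make the matching rules concrete, the sketch below performs a comparable extraction with Python's built-in html.parser: it looks for an input tag whose name, type and value attributes match, then returns the value. It is an independent illustration and not how the Html plugin is implemented; the class name CsrfTokenParser is hypothetical.

```python
import re
from html.parser import HTMLParser


class CsrfTokenParser(HTMLParser):
    """Collect the value of the first matching hidden csrf_token input."""

    def __init__(self):
        super().__init__()
        self.token = None

    def handle_starttag(self, tag, attrs):
        if tag != "input" or self.token is not None:
            return
        attributes = dict(attrs)
        value = attributes.get("value") or ""
        if (
            attributes.get("name") == "csrf_token"
            and attributes.get("type") == "hidden"
            and re.fullmatch(r"[0-9a-f]{40}", value)
        ):
            self.token = value


# The input tag from the response shown above, wrapped in a form.
page = (
    '<form><input type="hidden" name="csrf_token" '
    'value="8309984e972e6608475765db68e25ffb8c0bedc9"></form>'
)

parser = CsrfTokenParser()
parser.feed(page)
csrf_token = parser.token
print(csrf_token)
```

Requiring all three attributes to match, including the 40-character hexadecimal value, guards against picking up an unrelated input that happens to share the same name.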

The token can be found by multiple means. The simplest way I found is by sending a simple GET request to https://www.reddit.com/login/ with no additional information. Now we can define this Flow:

(setv initialization
      (Flow
         (Request.get
           "https://www.reddit.com/login/")
         :outputs [csrf_token session_id]
         :operations [(Print session_id csrf_token)
                      (Next "login")]))

Finishing configuration

After adding one more Flow, get_nickname, the complete configuration file for Reddit looks like this:

(print "Reddit")
(setv base_url "https://www.reddit.com/")

(setv username (Variable "username"))
(setv password (Variable "password"))
(setv mfa_code (Prompt "MFA"))

(setv csrf_token
  (Html
    :name "csrf_token"
    :tag "input"
    :attributes
    {:name "csrf_token"
     :value "^[0-9a-f]{40}$"
     :type "hidden"}
    :extract "value"))

(setv access_token
  (Regex
     :name "access_token"
     :regex "\"accessToken\":\"([^\"]+)\""))

(setv session_id (Cookie "session"))
(setv reddit_session (Cookie "reddit_session"))


(setv initialization
  (Flow
    (Request.get
       "https://www.reddit.com/login/")
     :outputs [csrf_token session_id]
     :operations
     [(Print session_id csrf_token)
      (Next "login")]))

(setv login
  (Flow
     (Request.post
       "https://www.reddit.com/login"
       :cookies [session_id]
       :data
       {"password" password
        "username" username
        "csrf_token" csrf_token
        "otp" ""
        "dest" "https://www.reddit.com"})
     :outputs [session_id reddit_session]
     :operations
      [(Print session_id reddit_session)
       (Http
        :status 200
        :action
         (Grep
          :regex "TWO_FA_REQUIRED"
             :action
              [(Print "Multi-factor authentication required")
               (Next "multi_factor")]
             :otherwise (Next "get_access_token"))
        :otherwise (Error "Login error"))]))

(setv multi_factor
  (Flow
    (Request.post
       "https://www.reddit.com/login"
       :cookies [session_id]
       :data
       {"password" password
        "username" username
        "csrf_token" csrf_token
        "otp" mfa_code
        "dest" "https://www.reddit.com"})
   :outputs [reddit_session]
   :operations [(Print reddit_session)
                (Print csrf_token)
                (Http
                  :status 200
                  :action
                  (Next "get_access_token"))
                (Http
                  :status 400
                  :action
                  (Grep
                    :regex "WRONG_OTP"
                    :action
                    (Next "initialization")
                    :otherwise
                    (Error "Multi-factor authentication error")))]))


(setv get_access_token
  (Flow
    (Request.get
       "https://www.reddit.com/"
       :cookies [reddit_session])
  :outputs [access_token]
  :operations [(Print access_token)
               (Next "get_unread_messages")]))

(setv get_unread_messages
  (Flow
    (Request.get
    :headers [(Header.bearerauth access_token)]
    :url "https://s.reddit.com/api/v1/sendbird/unread_message_count")))

(setv nickname
      (Regex
        :name "nickname"
        :regex "href=\"/user/([^\"]+)"))

(setv get_nickname
      (Flow
        :name "get_nickname"
        :request (Request.get base_url
                   :cookies [session_id reddit_session])
        :outputs [nickname]
        :operations [(Print nickname)]))


(setv users
  (Users
   [{"user1" "s3cr3tP4ssWrd1"}]))

Running Raider

Now, with the configuration finished, we can run Raider:

$ raider run reddit
Reddit
INFO:root:Running flow initialization
session = [REDACTED]
csrf_token = [REDACTED]
INFO:root:Running flow login
WARNING:root:Couldn't extract output: session
WARNING:root:Couldn't extract output: reddit_session
session = [REDACTED]
reddit_session = None
Multi-factor authentication enabled
INFO:root:Running flow multi_factor
reddit_session = [REDACTED]
csrf_token = [REDACTED]
INFO:root:Running flow get_access_token
access_token = [REDACTED]
INFO:root:Running flow get_nickname
nickname = [REDACTED]