Intro – GraphQL

GraphQL (GQL) is a popular data query language that makes it easier to get data from a server to a client via an API call. GQL is commonly deployed as a critical piece of the technology stack for modern web and mobile applications, and as a result, Carve has worked with GQL in numerous security assessment and security engineering engagements. Carve consultant Aidan has put together Carve’s GQL Security Pack to support engineers and security researchers, and it includes:

  1. This post highlighting the top 5 issues we see with GraphQL implementations
  2. An intentionally vulnerable “GQLGoat” application to demonstrate vulnerabilities
  3. Tooling for security engineers to evaluate a GQL implementation

What is GraphQL?

GraphQL is a standardized language for describing and making queries to APIs. Originally developed at Facebook for use in their mobile applications and publicly released in 2015, GraphQL provides a number of benefits to application developers when compared to a traditional REST API:

  • Client applications are able to request only the information they need, minimizing the amount of data sent.
  • GraphQL allows for more complicated queries to be represented, reducing the number of API requests that must be made.
  • All input data is type-checked against a schema defined by the developer, assisting with data validation.

As a result of these benefits, GraphQL has become increasingly popular for building APIs since its initial public release. However, these benefits also come with added complexity. This complexity leads to several security pitfalls we have seen developers fall into.

Demo API

In order to demonstrate each of the common vulnerabilities in GraphQL APIs, we’ve built a simple API featuring each of these issues. The API provides a very minimal notes service, allowing API clients to create their own notes (public and private) and view the public posts belonging to other users.

The source code for this sample API is located here. If you’d like to run your own instance and follow along, you can run it locally with Node or with a Docker container available at carvesystems/vulnerable-graphql-api on Docker Hub. The demo application exposes a webserver with an instance of the GraphiQL IDE for experimentation, which is available on port 3000.

1. Inconsistent Authorization Checks

When assessing GraphQL-based applications, flaws in authorization logic are by far the issue we see most often. While GraphQL helps implement proper data validation, API developers are left on their own to implement authentication and authorization methods on top. Worse, the “layers” of resolvers typical to a GraphQL API make doing this properly more complicated – authorization checks have to be present not only at the query-level resolvers, but also for resolvers that load additional data (for example, to load all of the posts for a given user.)

We generally see two types of authorization flaw in GraphQL APIs. The first, and more common, occurs when authorization functionality is handled directly by resolvers at the GraphQL API layer. When this is done, authorization checks must be performed separately in each location, and any instance where this is forgotten could lead to an exploitable authorization flaw. The likelihood of this occurring increases as the complexity of the API schema increases, and there are more distinct resolvers responsible for controlling access to the same data.

In our demo API, there are multiple ways to retrieve a listing of Post objects – a client can retrieve a list of users, then retrieve all their public posts, or simply retrieve a post by its numeric ID. For example, the following query might be used to read all of the posts that belong to the currently logged-in user:

query ReadMyPosts {
  # "me" returns the current user
  me {
    # then, resolve the posts
    posts {
      # finally, return the content
      # and whether this is a public post or not.
      public
      content
    }
  }
}

However, each of these different paths to retrieve a post has its own set of logic to check accessibility. In particular, examining the code that retrieves a post by ID (the GetPostById function in lib/gql/types/post.ts of the source repository) shows that there are no authorization checks in place at all – they have been forgotten. This allows an attacker to perform the GraphQL equivalent of a traditional insecure direct object reference attack and retrieve any post they’d like, public or private. Conveniently for an attacker, our database assigns Post object IDs in ascending order, so valid IDs are easy to guess:

query ReadPost {
   # we shouldn't be able to read post "1"
   post(id: 1) {
       public
       content
   }
}

This example may seem simple, but similar issues are regularly found in real-world GraphQL deployments. For example, a similar issue was recently disclosed to the HackerOne bug bounty program – allowing for an attacker to read the email addresses belonging to users they send an invitation to by username. (The intended behavior is to only allow access to the email address if that was originally used to create the invitation object.)

GraphQL documentation provides guidance on performing authorization securely. The advice is relatively simple – instead of performing authorization logic inside of resolver functions, all authorization logic should be performed by the business-logic layer living underneath. That way, all necessary authorization checks can be performed in one location – making it easier to apply constraints consistently.
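As a rough illustration of that advice, the sketch below keeps the visibility rule in a single function that every data-access path calls. The User and Post shapes, the in-memory posts array, and the function names are hypothetical stand-ins for illustration, not the demo API's actual code.

```typescript
// Hypothetical types for illustration; the demo API's real models may differ.
interface User { id: number; }
interface Post { id: number; authorId: number; public: boolean; content: string; }

// In-memory stand-in for the database.
const posts: Post[] = [
  { id: 1, authorId: 1, public: false, content: "private note" },
  { id: 2, authorId: 1, public: true, content: "public note" },
];

// The one place where the visibility rule is defined.
function canView(viewer: User | null, post: Post): boolean {
  return post.public || (viewer !== null && viewer.id === post.authorId);
}

// Business-logic layer: resolvers would call these instead of the database.
function getPostById(viewer: User | null, id: number): Post | null {
  for (const post of posts) {
    if (post.id === id) {
      // Return null for both "missing" and "forbidden" posts.
      return canView(viewer, post) ? post : null;
    }
  }
  return null;
}

function getPostsByAuthor(viewer: User | null, authorId: number): Post[] {
  return posts.filter((p) => p.authorId === authorId && canView(viewer, p));
}
```

With this layout, a resolver for post(id: 1) and a resolver for a user's posts field would both pass the current user through to the same check, so a newly added query path cannot silently skip it.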

2. REST Proxies Allow Attacks on Underlying APIs

When adapting an existing API for use by GraphQL clients, it’s common to begin the transition by implementing the new GraphQL interface as a thin proxy layer on top of internal REST APIs (for example, acting as a unified front-end interface to a set of microservice-based internal APIs). A very simple implementation of this will have API resolver endpoints simply “translate” requests to the REST API format, and format the response in a way that the GraphQL client can understand. For example, the resolver for user(id: 1) could be implemented in the GraphQL proxy layer by making a request to GET /api/users/1 on the backend API. If implemented unsafely, an attacker may be able to modify the path or parameters passed to the backend API, presenting a limited form of server-side request forgery. For example, by providing the ID 1/delete, the GraphQL proxy layer might instead access GET /api/users/1/delete with its credentials… a far more destructive result than originally intended. Though this is not ideal REST API design, similar scenarios are not uncommon in real-world implementations, often allowing for modification or retrieval of unintended data.

In our demo API, the getAsset resolver is implemented as follows:

        getAsset: {
            type: GraphQLString,
            args: {
                name: {
                    type: GraphQLString
                }
            },
            resolve: async (_root, args, _context) => {
                let filename = args.name;
                let results = await axios.get(`http://localhost:8081/assets/${filename}`);
                return results.data;
            }
        }

This resolver function simply takes in the name of a desired asset (in this case representing a file, analogous to a file attachment service). Notice that the type of the name input parameter is a string, and that it is used directly to build the path on the backend service being accessed. Because of this, we can modify the path passed to the backend however we want – we can add ../ to the beginning of the name we pass in to exit the assets folder, and if we wanted to, we could add to the end of the path as well. In this case, there is a secret file accessible in the root directory of the backend service, which can be accessed with the following query:

query ReadSecretFile {
   getAsset(name: "../secret")
}

In order to protect against this type of vulnerability, it is necessary to properly validate (and encode – in this case, the filename parameter should be URL encoded) any parameter that is passed to another service. One solution to fix this would be to leverage the GraphQL schema type validator to require a number for the file name, as all valid inputs for this request are numbers. A more general solution is to implement validation of input values – GraphQL will validate the types for you, but leaves format validation to you. A custom scalar type (for example, an AssetId scalar) could be used to consistently apply any custom validation rules that apply for a commonly-used type.
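A minimal sketch of that validation, assuming (as stated above) that every valid asset name is numeric. The buildAssetUrl helper and the 10-digit length cap are hypothetical; only the backend URL comes from the resolver shown earlier.

```typescript
// Backend base URL taken from the getAsset resolver above.
const ASSET_BASE = "http://localhost:8081/assets/";

// Hypothetical helper: validates the asset name, then builds the backend URL.
function buildAssetUrl(assetName: string): string {
  // Allow-list check: digits only, with an assumed maximum length of 10.
  if (!/^[0-9]{1,10}$/.test(assetName)) {
    throw new Error("Invalid asset name");
  }
  // Encoding alone would not be enough, because encodeURIComponent leaves
  // "." untouched; it is kept here as a second layer after the allow-list.
  return ASSET_BASE + encodeURIComponent(assetName);
}
```

The resolver would then request buildAssetUrl(args.name) instead of interpolating args.name into the URL itself, so a value such as "../secret" is rejected before any backend request is made.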

3. Missing Validation of Custom Scalars

When using GraphQL, a scalar type is a type used to represent raw data. Ultimately, the data passed as input or returned as output data by an API is of a scalar type. There are five built-in scalar types – Int, Float, Boolean, String, and ID (which is really just a string). This basic set of scalar types is sufficient for many simple APIs, but for scenarios where additional raw datatypes are useful, GraphQL includes support for application developers to define their own scalar types. For example, an API might include its own DateTime scalar type, or an extended scalar type that provides extended input validation, such as “odd integers” or “alphanumeric strings”.

If an API developer implements their own scalar type, they are responsible for performing any input sanitization and type validation themselves. In the JavaScript implementations, this is done by implementing the parseValue and parseLiteral functions, which deserialize the input from a JSON representation and the GraphQL abstract syntax tree representation, respectively. Safely implementing these functions to reject invalid input is critical to maintaining the type safety provided by GraphQL. In particular – it may be tempting to reach for a library like graphql-type-json, which adds a pair of new scalar types that allow for API clients to pass any object that can be represented in JSON. In effect, the use of a similar library sacrifices the type-safety guarantees provided by GraphQL in favor of convenience.
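For a sense of what strict versions of these two functions might look like, here is a sketch for the “alphanumeric strings” example mentioned earlier. It is written as plain functions so that it runs without the graphql package; in a real API they would be supplied to the scalar type definition. ValueNodeLike is a simplified, hypothetical stand-in for the AST node that parseLiteral receives.

```typescript
// Simplified stand-in for the AST node passed to parseLiteral (an assumption;
// the real node type comes from the GraphQL library in use).
interface ValueNodeLike { kind: string; value?: string; }

const ALPHANUMERIC = /^[A-Za-z0-9]+$/;

// Shared check: reject anything that is not a non-empty alphanumeric string.
function checkAlphanumeric(value: unknown): string {
  if (typeof value !== "string" || !ALPHANUMERIC.test(value)) {
    throw new TypeError("Expected a non-empty alphanumeric string");
  }
  return value;
}

// Handles values supplied as query variables (already-parsed JSON).
function parseValue(value: unknown): string {
  return checkAlphanumeric(value);
}

// Handles values written inline in the query text.
function parseLiteral(ast: ValueNodeLike): string {
  if (ast.kind !== "StringValue") {
    throw new TypeError("Expected a string literal");
  }
  return checkAlphanumeric(ast.value);
}
```

Both entry points funnel into the same check, so an object or list is rejected whether it arrives as a variable or as an inline literal.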

In our demo application, we’ve pulled in the graphql-type-json library and used it as the input type for our password reset mutation. The source code for our mutation is as follows:

export const PasswordReset: GraphQLFieldConfig<any,any,any> = {
    type: UserType,
    args: {
        input: {
            type: GraphQLJSON
        },
    },
    resolve: async(_root, args, context) => {
        console.log(args);
        if (args.input.username === undefined || args.input.reset_token === undefined || args.input.new_password === undefined) {
            throw new Error("Must provide username, new_password, and reset_token.")
        }
        let user = await db.User.findOne({where: {username: args.input.username, resetToken: args.input.reset_token}})
        if (user) {
            // Update the user in the database first.
            user.password = await argon2.hash(args.input.new_password);
            user.save();

            // Now, return it.
            context.user = user;
            context.session.user_id = user.id;
            return user;
        }
        else {
            throw new Error('The password reset token you submitted was incorrect.')
        }
    }
}

Our password reset function takes in a JSON object containing a username, a new password, and a password reset token that is checked to ensure validity. (In a real application, we might send an email containing a token to the user upon reset attempt.) To check if our token was correct, the API backend queries the database – directly passing the username and reset token values from the input that haven’t been properly type checked. As our application is using the Sequelize ORM, which allows for complex operators to be embedded in queries, providing an object instead of a string allows us to modify the query in a fashion similar to common NoSQL injection techniques. We may send the object {"gt": ""} as our password reset token – and the query generated will return the user, regardless of the fact that we don’t know the user’s reset token. The following mutation will reset a user’s password to “CarveSystems!”:

mutation ResetPassword {
  passwordReset(input: {username:"Helena_Simonis", new_password: "CarveSystems!", reset_token:{gt:""}}) {
    username
  }
}
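One way the mutation above could defend itself, sketched under the assumption that the JSON input type is kept: check the runtime type of every field before it reaches the database query. The parseResetInput helper and ResetInput interface are hypothetical names, not part of the demo API.

```typescript
interface ResetInput {
  username: string;
  new_password: string;
  reset_token: string;
}

// Hypothetical helper: accepts the untyped JSON input and returns it only if
// every expected field is a plain string.
function parseResetInput(input: unknown): ResetInput {
  if (typeof input !== "object" || input === null) {
    throw new Error("input must be an object");
  }
  const record = input as { [key: string]: unknown };
  const username = record["username"];
  const newPassword = record["new_password"];
  const resetToken = record["reset_token"];
  if (
    typeof username !== "string" ||
    typeof newPassword !== "string" ||
    typeof resetToken !== "string"
  ) {
    throw new Error("username, new_password, and reset_token must all be strings");
  }
  return { username: username, new_password: newPassword, reset_token: resetToken };
}
```

Because an operator object such as {gt: ""} is not a string, it would be rejected here and never passed into the ORM's where clause. Declaring a proper input type with String fields in the schema would likely achieve the same result without hand-written checks.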

4. Failure to Appropriately Rate-limit

The increased complexity of GraphQL APIs makes the implementation of rate-limiting and other denial-of-service protections much more difficult. Whereas with a REST API each HTTP request performs exactly one action, a GraphQL query can take arbitrarily many actions, and thus take an arbitrarily large amount of server resources. Because of this, the same rate-limiting strategies used for REST APIs – simply limiting the number of HTTP requests received – are generally not adequate for protecting a GraphQL API.

One common source of high-complexity queries is a consequence of the graph-based nature of the GraphQL specification. If there is a loop in the relationships between two types of object, it is usually possible to craft short queries that quickly balloon in execution complexity. For example, in our test API, a User has a set of Posts, each of which has an Author, who in turn has a set of Posts, and so on for as many iterations as desired. A small, but very complex query could be formed as such:

query Recurse {
  allUsers {
    posts {
      author {
        posts {
          author {
            posts {
              author {
                posts {
                  id
                }
              }
            }
          }
        }
      }
    }
  }
}

Adding an additional layer of recursion further increases the complexity. Though this is a difficult problem to solve well, there are generally two strategies that are used to defend against this type of denial of service attack. The simpler of the two is to simply put a limit on the recursion depth, so that nested queries returning such massive sets of results would be rejected. This solution, however, ignores the possibility of an expensive query that doesn’t require a large depth. The other popular solution, adding a complexity score system, addresses those concerns. In a complexity scoring system, each portion of a query is assigned a complexity score, and any request with a total complexity greater than a chosen maximum value would be rejected. For example, allUsers could be assigned 100 points, while the posts field of a user could be assigned 10. In this system, the scores of nested queries would multiply, for a combined query score of 1000. In this type of system, accurately assigning scores to each type of query is important, but may be a difficult task to perform in practice.
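Both strategies can be sketched over a simplified query tree. The FieldNode shape below is a hypothetical stand-in for a parsed query, the costs for allUsers and posts are the ones from the example above, and the default cost of 1 for every other field is an assumption.

```typescript
// Hypothetical, simplified representation of a parsed selection.
interface FieldNode { field: string; children: FieldNode[]; }

// Strategy 1: nesting depth of the query (a leaf field has depth 1).
function queryDepth(node: FieldNode): number {
  let deepest = 0;
  for (const child of node.children) {
    const d = queryDepth(child);
    if (d > deepest) deepest = d;
  }
  return 1 + deepest;
}

// Strategy 2: per-field costs; unlisted fields are assumed to cost 1.
const FIELD_COST: { [field: string]: number } = { allUsers: 100, posts: 10 };

function queryComplexity(node: FieldNode): number {
  const own = Object.prototype.hasOwnProperty.call(FIELD_COST, node.field)
    ? FIELD_COST[node.field]
    : 1;
  if (node.children.length === 0) return own;
  // Nested scores multiply: a field's cost times the sum of its children.
  let nested = 0;
  for (const child of node.children) nested += queryComplexity(child);
  return own * nested;
}
```

A server using either check would compute the value before execution and reject any query above its chosen maximum. For allUsers { posts { id } }, this sketch gives a depth of 3 and a complexity of 100 × 10 × 1 = 1000, matching the figure in the text.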

Returning to our demo API, in an attempt to protect the password reset mutation from attempts to brute-force the password reset token, it performs IP-address based rate-limiting on the number of requests. For a traditional API, this may be sufficient to discourage attackers, but once again, GraphQL allows queries to include many actions in one HTTP request. As the reset token is simply a 6-digit number, we can make a mutation which passes a large number of guesses to the server at once:

mutation BruteForce {
  p000000: passwordReset(input: {username:"Helena_Simonis", new_password: "CarveSystems!", reset_token:"000000"}) {
    username
  }
  p000001: passwordReset(input: {username:"Helena_Simonis", new_password: "CarveSystems!", reset_token:"000001"}) {
    username
  }

...

  p999999: passwordReset(input: {username:"Helena_Simonis", new_password: "CarveSystems!", reset_token:"999999"}) {
    username
  }
}

In this case, rate-limiting on individual mutation types may be a useful mitigation, along with using harder-to-guess password reset tokens.
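A rough sketch of one such per-mutation limit: count how many times each underlying field is invoked within a single operation, regardless of alias. The TopLevelField shape and the findOverusedField helper are hypothetical, and the sketch assumes the operation has already been parsed into its list of top-level fields.

```typescript
// Hypothetical view of one top-level selection: its alias and the real field.
interface TopLevelField { alias: string; field: string; }

// Returns the first field invoked more than maxPerField times, or null.
function findOverusedField(fields: TopLevelField[], maxPerField: number): string | null {
  // Object.create(null) avoids collisions with inherited property names.
  const counts: { [field: string]: number } = Object.create(null);
  for (const f of fields) {
    counts[f.field] = (counts[f.field] || 0) + 1;
    if (counts[f.field] > maxPerField) return f.field;
  }
  return null;
}
```

Applied to the BruteForce mutation above with a limit of one passwordReset per request, the second alias would already trip the check, forcing the attacker back to one guess per HTTP request, where the IP-based limit applies.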

5. Introspection Reveals Non-public Information

Oftentimes, it’s tempting to add so-called “hidden” API endpoints providing functionality that isn’t supposed to be accessible to the general public. This could be hidden administrative functionality, or an API endpoint for facilitating server-to-server communications. Leaving such endpoints accessible without proper authorization is poor practice even in REST-based APIs, but a GraphQL feature called introspection makes discovery of hidden endpoints trivially easy. As part of an effort to be developer-friendly, the introspection feature, which is enabled by default in most GraphQL implementations, allows API clients to dynamically query information about the schema, including documentation and the types for every query and mutation defined in the schema. This is used by development tools, like the GraphiQL IDE, to dynamically retrieve the schema if not provided one. When applied to a public API, introspection can greatly improve developer experience, but may also reveal functionality that was never intended to be public.

The demo API includes a hidden mutation that will execute a shell command on the server. As a result of introspection, it should be easy to find it using the GraphiQL IDE – simply entering mutation { will cause it to pop up in the list of suggestions.

Conclusion

GraphQL, as a new standard for interacting with APIs, includes some protections against data validation issues commonly seen in REST APIs. However, its added complexity makes certain weaknesses more likely. Using a sample API, we demonstrated some of the most common findings from our testing of these APIs.