Categorizing GraphQL Directives into Schema or Query Type

Here at StepZen we've built a number of custom GraphQL directives (both schema-type and query-type) that enable stitching or linking types, sequencing queries, connecting various backends, and more. We've invited Leo to follow up on a previous guest post where he explored custom directives. Here he digs deeper into the categories of this powerful aspect of the GraphQL spec and the pros and cons of query-type directives.

Some time ago I wrote the article Exploring GraphQL Directives in the Wild, which surveyed the custom directives created by several GraphQL servers and tools.

That article explored the landscape of custom directives, but it did not categorize them as schema-type directives, which are used for building the GraphQL schema, or query-type directives, which are used for modifying the output in the response at runtime. This oversight was pointed out by a reader on Reddit:

An excellent compilation of directive use cases! I just wish directives provided by the clients (as a part of a query) and directives used in developing the schema were clearly separated.

Furthermore, a few weeks ago Andrew Ingram posited in GraphQL's Discord channel that query-type directives which modify the value of the response should not be allowed, because they break Relay's caching behavior, which does not take directives into account. He specifically pointed at the directives from the GraphQL API for WordPress (which is my server) as examples of inappropriate query-type directives (thanks Andrew! 😂).

Andrew has a point: the GraphQL spec does not unambiguously define what a query-type directive is. In my opinion, this is more virtue than defect, as it provides room for experimentation that helps improve the GraphQL spec. As I expressed in the article GraphQL directives are underrated:

. . . the reason why I believe directives are a wonderful (and largely unappreciated) feature is that they’re unregulated. Other than describing their syntax, the spec doesn’t say much about directives, giving each GraphQL server implementer free rein to design their architecture and decide what features they can support and how powerful they can become.

In fact, directives are a playground for both GraphQL server implementers and end users alike. GraphQL server implementers can develop features not currently supported by the spec, and users can develop features not yet implemented by the GraphQL server.

However, due to this ambiguity, tools such as Relay and servers such as the GraphQL API for WP could have a different interpretation of the spec, making their implementations incompatible with each other, and that's a problem that we should attempt to solve. So I do agree that improving the GraphQL spec with a clearer definition is worth pursuing, as it will enable all clients and servers to talk to each other.

To contribute to this discussion, I've decided to categorize the directives from my previous article, indicating whether they are query-type directives, schema-type directives, or both. If we can understand which directives belong to one group or the other, and how important their functionalities are, we can then decide whether banning query-type directives that modify return values is sensible or not.

Categorizing the directives in the wild

The following directives, from several GraphQL servers and tools, were described in my previous article. In this table, I have grouped them by their common goal, and categorized them as being schema-type, query-type, or both.

| Type | Goal | Directives |
| --- | --- | --- |
| Schema | Building the schema | @dbquery, @materializer, @sequence, @search, @uniqueID, @computed, @sdl, @secret, @id, @unique |
| Schema | Fetching data from external services | @rest |
| Schema | Federation | @external, @requires, @provides, @key, @extends |
| Schema | User authorization | @auth, @isAuthenticated, @hasScope, @hasRole |
| Schema | Rate limiting | @portara, @rateLimiting |
| Schema | Validating field input constraints | @length, @constraint |
| Schema | Caching | @cacheControl, @cache, @cdn, @ttl |
| Schema | Handling errors | @principalField |
| Schema | Query complexity configuration | @complexity |
| Schema | Event notifications | @broadcast, @event |
| Schema | Mocking data for testing | @mock |
| Schema | Tracing tools | @traceExecutionTime |
| Query | Improving performance by returning data in batches | @defer, @stream |
| Query | Improving performance by combining queries | @export, @sequence |
| Query | Modify the shape of the query | @_, @normalize, @removeIfNull |
| Schema & Query | Provide fallback value to the response of a field | @default |
| Schema & Query | Format or modify the response of the query | @lowerCase, @upperCase, @titleCase, @camelCase, @trim, @formatCurrency, @formatDate, @formatNumber, @formatPhoneNumber, @convertLength, @convertSurfaceArea, @convertVolume, @convertLiquidVolume, @convertAngle, @convertTime, @convertMass, @convertTemperature |
| Schema & Query | Format or modify the response of the query by interacting with a 3rd-party service | @translate |

Let's see how and why these directives are placed into one category or another.

Schema-type directives

The directives in this category are all concerned, in one way or another, with building or configuring the schema, and as such they make no sense being provided via the query. After all, if there's no schema, then what will the query query? Schema-type directives are satisfied within the context of the server, never the client.

Even if it were technically possible to send the schema-type directive via the query, it would be a terrible idea.

As an example of a directive building the schema, StepZen's @dbquery directive indicates from what table to retrieve the data:

type Query {
  customerById (id: ID!): Customer
    @dbquery (
      type: "mysql"
      table: "customers"
    )
}

Imagine if the client could indicate the name of the table. How dangerous would that be?

As an example of a directive configuring the schema, GraphQL Tools' @auth directive makes sure that only users with a certain role can access selected types or fields:

directive @auth(requires: Role = ADMIN) on OBJECT | FIELD_DEFINITION

enum Role {
  ADMIN
  REVIEWER
  USER
  UNKNOWN
}

type User @auth(requires: USER) {
  name: String
  banned: Boolean @auth(requires: ADMIN)
  canPost: Boolean @auth(requires: REVIEWER)
}

If this directive could be provided via the query, then malicious actors could gain access to the field by providing their own value:

{
  users {
    name
    banned @auth(requires: UNKNOWN)
  }
}

This is clearly a security risk, and it can also cause the server to malfunction when it runs into cases it was never designed to handle.
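To make the point concrete, here is a minimal Python sketch of an access check that reads the required role from the schema only. It is not how GraphQL Tools implements @auth; the names `ROLE_RANK`, `SCHEMA_AUTH` and `can_access` are hypothetical, and the ordering of roles is an assumption, since the schema above does not define one.

```python
# Hypothetical ranking of roles (an assumption; the schema does not define one).
ROLE_RANK = {"UNKNOWN": 0, "USER": 1, "REVIEWER": 2, "ADMIN": 3}

# Requirements as declared via @auth in the schema. They live on the server only.
SCHEMA_AUTH = {
    ("User", "name"): "USER",        # inherited from the type-level @auth
    ("User", "banned"): "ADMIN",
    ("User", "canPost"): "REVIEWER",
}


def can_access(type_name, field_name, user_role, query_directives=None):
    """Decide access from the schema alone.

    query_directives is accepted only to show that it is deliberately
    ignored: a client cannot lower the requirement by sending its own @auth.
    """
    # Fall back to ADMIN, mirroring the directive's default argument.
    required = SCHEMA_AUTH.get((type_name, field_name), "ADMIN")
    return ROLE_RANK[user_role] >= ROLE_RANK[required]
```

Because the requirement never comes from the query, sending `@auth(requires: UNKNOWN)` alongside the `banned` field changes nothing for a user with the USER role.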

Query-type directives

The directives in this category are all concerned, in one way or another, with manipulating the execution of the query at runtime, and as such they only make sense being provided via the query.

Query-type directives are the ones that, according to Andrew, should be better regulated, if not outright banned. So let's explore how useful, and how contentious, these directives are.

These directives perform one of the following functions:

Modifying the behavior of the server, as with @defer, @stream or @export
Changing the shape of the response, as with @_, @normalize or @removeIfNull
Modifying the returned values, as with @default, @titleCase or @convertLength

Let's see some examples. Concerning the modification of the server's behavior, the GraphQL API for WordPress' @export directive allows us to execute two queries in the same request, while having them share data with each other:

query GetUserName {
  me {
   name @export(as: "_authorName")
  }
}

query GetPostsContainingUserName($_authorName: String = "") {
  posts(searchfor: $_authorName) {
   id
   title
  }
}

This functionality is allowed (possibly even encouraged) since it's the expected intention for directives, i.e. to help explore how the GraphQL spec can be improved with new features.
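The idea behind @export can be sketched in a few lines of Python: operations run in order, and a value exported by one becomes a variable for the next. This is only an illustration under my own assumptions, not the implementation in the GraphQL API for WordPress; `run_operations` and `fake_resolve` are hypothetical names, and the resolver returns canned data.

```python
def run_operations(operations, resolve):
    """Run operations in order, feeding exported values into later ones.

    operations: list of (name, exports, variable_defaults) tuples, where
    exports maps a field path to the variable name given in @export(as: ...).
    resolve: callable(name, variables) returning a dict of field path -> value.
    """
    exported = {}
    results = {}
    for name, exports, defaults in operations:
        # Start from the declared defaults, then override with exported values.
        variables = dict(defaults)
        for var_name in defaults:
            if var_name in exported:
                variables[var_name] = exported[var_name]
        data = resolve(name, variables)
        results[name] = data
        for field_path, var_name in exports.items():
            exported[var_name] = data[field_path]
    return results


def fake_resolve(name, variables):
    """Stand-in for the server's resolvers, returning canned data."""
    if name == "GetUserName":
        return {"me.name": "Leo"}
    return {"posts": ["Post mentioning " + variables["_authorName"]]}


operations = [
    ("GetUserName", {"me.name": "_authorName"}, {}),
    ("GetPostsContainingUserName", {}, {"_authorName": ""}),
]
results = run_operations(operations, fake_resolve)
```

The second operation declares `$_authorName` with an empty default, and the value exported by the first operation replaces it before the operation runs.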

Concerning the change of shape of the response, GraphQL Lodash's @_ directive allows us to arrange the results as a map of key => value, such as person => list of films:

{
  peopleToFilms: allPeople @_(get: "people") {
   people @_(
    keyBy: "name"
    mapValues: "filmConnection.films"
   ) {
    name
    filmConnection {
      films @_(map: "title") {
       title
      }
    }
   }
  }
}

...producing this response:

{
  "data": {
   "peopleToFilms": {
    "Luke Skywalker": [
      "A New Hope",
      "The Empire Strikes Back",
      "Return of the Jedi",
      "Revenge of the Sith"
    ],
    "C-3PO": [
      "A New Hope",
      "The Empire Strikes Back",
      "Return of the Jedi",
      "The Phantom Menace",
      "Attack of the Clones",
      "Revenge of the Sith"
    ]
   }
  }
}
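To see what the directives are doing to the data, the same reshaping can be written as a small Python function. This is a sketch of the transformation only, not of GraphQL Lodash itself; `people_to_films` and the sample data are made up for the illustration.

```python
def people_to_films(all_people):
    """Reshape the allPeople result into a map of person name -> film titles."""
    people = all_people["people"]  # @_(get: "people")
    return {
        # keyBy: "name", mapValues: "filmConnection.films"
        person["name"]: [
            film["title"]  # @_(map: "title")
            for film in person["filmConnection"]["films"]
        ]
        for person in people
    }


# Sample data in the shape the query returns before any reshaping.
raw = {
    "people": [
        {
            "name": "Luke Skywalker",
            "filmConnection": {
                "films": [{"title": "A New Hope"}, {"title": "Return of the Jedi"}]
            },
        },
        {
            "name": "C-3PO",
            "filmConnection": {"films": [{"title": "The Phantom Menace"}]},
        },
    ]
}
reshaped = people_to_films(raw)
```

The list of objects described by the schema has become a map keyed by data values, which is a shape the schema never declared.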

Altering the shape of the response is not allowed, hence these directives will most likely never become part of the GraphQL spec, and we should be aware of the consequences of adding them to our GraphQL server.

There are potential exceptions though, such as flat chain syntax (still in Strawman, the lowest RFC stage). If someone comes up with a decent proposal for the spec and a viable implementation for graphql-js, and users find it valuable, it may be considered for inclusion. But these are exceptions; by default, these directives are not spec-compatible.

Concerning the directives modifying the returned values, I've deemed these directives to belong to both Schema and Query categories, so we'll explore these in the next section.

Schema & Query-type directives

The directives in this section are those that make sense both within the context of the server, to help configure the schema, and the client, to decide at runtime how to customize the values in the response.

When executed from the client as part of the query, any of these directives breaks Relay's caching behavior; when the same directive is used in the server to configure the schema, its behavior is already inherent to the GraphQL server, and as such it works well with Relay.

Let's see, with an example, why these directives could be either schema-type or query-type.

The GraphQL API for WP's @default directive provides a configurable fallback value when the response is null. This behavior makes sense when configuring the schema, such as assigning image "default.jpg" (with ID "1505") whenever the post has no featured image:

type Post {
  id: ID!
  hasFeaturedImage: Boolean!
  featuredImage: Image! @default(value: 1505) # id for "default.jpg"
}

However, this behavior also makes sense for customizing the response for a specific layout. For instance, posts can have the categories "Politics" and "Sports", and if we are displaying posts from one section or the other, we may want to fetch a different default featured image for each, such as "politics.jpg" and "sports.jpg".

In this case, we can provide the default value in the query itself:

query GetFeaturedImages {
  posts(limit: 3) {
   id
   hasFeaturedImage
   featuredImage @default(value: 1505) {
    id
    src
   }
  }
}

Sure, we could also add extra fields to the schema, one field per category, like this:

type Post {
  featuredImage: Image! @default(value: 1505) # id for "default.jpg"
  featuredImageForSports: Image! @default(value: 35) # id for "sports.jpg"
  featuredImageForPolitics: Image! @default(value: 47) # id for "politics.jpg"
}

But I find this solution less elegant, since each section would also require its own custom query to fetch its specific field. When the site has 20 categories, I certainly wouldn't want the schema to have 20 extra fields that are all basically the same field. Nor would I want to copy/paste the same query 20 times, changing just one field, or write one big query with all 20 fields and @include(if: $isThisOrThatCategory) to select the one to fetch.

Alternatively, @default could be replaced by fetching the default image through an additional field in the query (plus some extra logic in the client: if post.featuredImage is null, then use defaultFeaturedImage), like this:

query GetFeaturedImages {
  defaultFeaturedImage: image(id: 1505) {
   id
   src
  }
  posts(limit: 10) {
   id
   hasFeaturedImage
   featuredImage {
    id
    src
   }
  }
}

But I'm also not convinced by this solution, which involves a bigger query, potentially sending unneeded data (such as when all posts have a featured image), and extra logic in the client (that must be replicated across all clients executing the GraphQL query, so there could also be code duplication).
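For reference, the extra client logic would look roughly like this. It is a Python sketch with a hypothetical function name (`with_fallback_images`), operating on the "data" entry of the response to the query above, and every client consuming that query would need its own copy of it.

```python
def with_fallback_images(data):
    """Replace null featured images with the separately fetched default image."""
    default_image = data["defaultFeaturedImage"]
    return [
        {
            **post,
            "featuredImage": (
                default_image
                if post["featuredImage"] is None
                else post["featuredImage"]
            ),
        }
        for post in data["posts"]
    ]


# Example response data: the second post has no featured image.
response_data = {
    "defaultFeaturedImage": {"id": 1505, "src": "default.jpg"},
    "posts": [
        {"id": 1, "hasFeaturedImage": True, "featuredImage": {"id": 7, "src": "a.jpg"}},
        {"id": 2, "hasFeaturedImage": False, "featuredImage": None},
    ],
}
posts = with_fallback_images(response_data)
```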

Alternatively, field featuredImage could itself have a fallbackID argument to replace the logic from @default, like this:

query GetFeaturedImages {
  posts(limit: 10) {
   id
   hasFeaturedImage
   featuredImage(fallbackID: 1505) {
    id
    src
   }
  }
}

But I'm not a fan of this solution, since it mixes different concerns in the same piece of code. @default is a generic functionality that can be applied across the schema, to any field. It is implemented once, works everywhere, and its use is consistent across the schema. It can also be injected into the schema by a 3rd-party component. The fallbackID field argument, by contrast, would need to be implemented on a field-by-field basis; it may be given different names (fallback, defaultValue, default), so the schema could become inconsistent; and, unlike a directive, it cannot be injected into a field that does not already have this functionality.

What's more, using field arguments to replace the directives @titleCase, @formatCurrency and @translate is even less compelling.

Take @translate for instance. If we want to translate the response from any field to a specific language, then we'd need to add an additional translateTo: String field argument to every single field in the schema:
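The "implemented once, works everywhere" property is easy to picture if we think of a directive as a wrapper around a resolver. The following Python sketch is an analogy under my own assumptions, not the code of any GraphQL server; `default_directive` and the two resolvers are hypothetical.

```python
def default_directive(resolver, value):
    """Wrap any resolver so that a null (None) result becomes `value`."""
    def wrapped(source, **args):
        result = resolver(source, **args)
        return value if result is None else result
    return wrapped


# Two unrelated resolvers that know nothing about fallback values.
def resolve_featured_image(post):
    return post.get("featuredImage")


def resolve_excerpt(post):
    return post.get("excerpt")


# The same wrapper is applied to both, without touching their code.
featured_image = default_directive(resolve_featured_image, 1505)
excerpt = default_directive(resolve_excerpt, "(no excerpt)")
```

Neither resolver had to grow a fallback argument of its own, which is the separation of concerns that a per-field argument would give up.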

type Post {
  title(translateTo: String): String
  content(translateTo: String): String
  excerpt(translateTo: String): String
}

type Category {
  name(translateTo: String): String
}

type Tag {
  name(translateTo: String): String
}

type User {
  description(translateTo: String): String
}

#...

That would bloat the schema and break the separation of concerns across functionalities.

Moreover, we can only add additional field arguments when the API is fully under our control, but when the schema is composed of pluggable parts, such as when doing stitching or federation, or having types injected by 3rd-party components, then it can't be done anymore.

So I believe that using a directive for this functionality is justified, which makes query-type directives useful enough that they should not be banned.

Conclusion

I agree with Andrew that breaking Relay's caching behavior is a problem, and it should be solved. But instead of banning query-type directives, which are indeed useful, I'd rather have the GraphQL spec state explicitly that query-type directives are allowed to modify the value of the field in the response.

With such a change, Relay could take the directive into account when storing entries into its cache, thus solving the conflict.
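As a rough illustration of what "taking the directive into account" could mean, here is a Python sketch of a cache key for a field. It does not describe Relay's actual store; `cache_key` and its parameters are hypothetical, and the only point is that two requests for the same field stop colliding once the directives are part of the key.

```python
import json


def cache_key(field, args=None, directives=None, include_directives=True):
    """Build a cache key for a field from its name, arguments and directives."""
    parts = {"field": field, "args": args or {}}
    if include_directives:
        parts["directives"] = directives or {}
    # sort_keys makes the key independent of argument order.
    return json.dumps(parts, sort_keys=True)


# Ignoring directives: `title` and `title @upperCase` share one cache entry.
plain = cache_key("title", include_directives=False)
upper = cache_key("title", directives={"upperCase": {}}, include_directives=False)
collides = plain == upper

# Including directives: the two values are stored under different keys.
distinct = cache_key("title") != cache_key("title", directives={"upperCase": {}})
```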

To learn about StepZen's custom directives and features that help you build GraphQL APIs, see the StepZen docs.

About the Author

Leonardo Losoviz

Freelance Developer
