Writing Better Practical Tests For The Games Industry

This is a repost of my article from the Tiny Hydra blog, posted May 31st, 2021.

I have been doing design and development tests for over a decade now (I even did some art tests early in my career) and I have come to the conclusion that most tests are bad. The idea of testing people isn’t bad; in fact, it’s imperative in the games industry. The truth is that most tests fail to accomplish what they set out to do, which is to find out whether a potential recruit has the skills required for the role they applied for. Out of the dozens of tests I have completed in my career, I can count on one hand the tests that were well crafted. I’ve written this article in the hope that it will help you (the test giver) to write better tests for almost any discipline.

When I refer to tests I do not mean the multiple-choice exams you might expect from school, or even the short evaluations with minutes-long exercises often given as a preliminary screen to see if people have a baseline understanding of a knowledge base. Those are the types of tests that might be used to see if people actually know how to code in a language they claim on their resume, or understand universal concepts of UX design. When I talk about tests in this article, I am referring to practical exercises given to candidates to evaluate their broader skill set. These tests often require candidates to plan or build something based on an existing project, or even create a wholly original experience.

What do I mean when I talk about a ‘well crafted’ test? Oh, I’m so glad you asked. By well crafted I mean that the test was well scoped, the turnaround requirements were respectful, the deliverables were well defined, and the test had been well vetted.

So what does it take to make a good design test? Typically it’s a fairly straightforward exercise, related to the work the potential hire will be doing, that is primarily geared toward testing the way a person approaches a problem and communicates solutions, and that gives you an opportunity to assess how a person handles feedback. You will typically also want to validate a person’s ability to use a specific tool or language. In this article I’ll expose some common issues with test exercises and the kinds of instructions you tend to see out there, and discuss what we can do better as an industry. I’ll also walk you through what I think makes a good test for a professional in the games industry.

Why Should You Listen To Me?

The common question when reading something like this is, “Well, what does he know?” In the last ten years I have been in 3 separate management roles on game development projects where I was responsible for writing tests (and one on an IoT project). I was also a professor of game design and development, and during my tenure in that gig I wrote a ton of tests and created other forms of evaluations. I have made ALL the mistakes I talk about in this blog, and have outlined some things we do differently here at Tiny Hydra. I’ll admit, providing a good candidate testing experience does take more time and effort. But I think you’ll find that it’s really worth it. It leaves the candidate feeling valued (whether they passed the test or not), gives you meaningful insights into your own process, and helps build better connections even with the people you don’t hire.

When Should The Test Be Due?

This is actually a really hard question to answer. You want to be fair and make sure you give people equal time to do the test, right? You want them to have ample time to turn the test around, and you want to be sure you are testing enough.

The problem is that each case is different, and you don’t know what a person’s schedule is like. If you tell your applicant, “Here is the test, please return it by this date and within so many hours after downloading the files,” you’re failing to take their schedule into account. Why?

Let me walk you through a scenario. You’re working full-time but for whatever reason you’ve been applying around and have a couple of jobs you’ve been interviewing for. Because this is a highly skilled industry, every position has a test. Between fielding tests, interviewing, and doing your full-time job, you already have a lot on your plate. Another company you are very interested in gets back to you several weeks after you applied. They request you sign an NDA so they can give you a technical evaluation. ‘Great,’ you think to yourself. You sign the NDA on a Thursday and they send you the test at the end of that day. They don’t interview you first (a problem we will talk about later) and when you open up the test it has the following requirements:

Complete a task that will take a full day of work, +/- 2 to 4 hours.
Complete the task within 24 hours of starting it.
You need to return it by Wednesday the following week.

No big deal, right? I mean, you were already spending your nights and weekends working on tests, so you can slot in one more… right? Given your busy schedule, what would you do in this case?

This example demonstrates how important it is to discuss the deadline with your applicant. Telling someone when a test is due without a discussion disregards and disrespects their other responsibilities and timelines. The thing is: you don’t know other people’s schedules.

How Long Should The Test’s Exercise Take?

Most of the time tests are written by people working within that discipline on the team. But because they are written by people who already know the everyday way of things, I would say that almost all of them are poorly scoped. They don’t plan time for discovery and they rarely allow for learning the intricacies of someone else’s project or technology. Also, we tend to forget how much value comes from working with a team in contrast to working alone. When you are on a team and unsure about something you are doing, you can turn to a peer and ask, “Hey, how would you do this?” Someone working alone has to do extensive research instead, and in an innovative field like games there may be more than one way to do something, with none of the common practices especially well documented.

And then there are tests with ridiculous time requirements. It is never OK to expect an applicant to spend 30 to 40 hours (of free labour, ahem) working on a test. Let’s go back to the previous example where you are interviewing with multiple companies, and all the positions are pretty exciting to you at this stage. How would a 30-40 hour test fit? What if you’re still at another full-time job? If you believe that a job at your company is well worth the 30-40 hour test investment (on top of the time dedicated to applying, interviewing, and repeating those steps at X number of other companies), well, have I got news for you: your company is not the center of another person’s universe.

A good rule of thumb is to choose a small task that will allow a person to showcase their competency. You want to give them a task that should take a MAXIMUM of 8 hours. If you can’t tell from 8 hours’ worth of work whether someone is competent then there is probably something wrong with your task. My suggestion is to figure out what you (or a proficient member of your team) could do in a day’s worth of work. Then, cut the task in half to account for complexity. If this sounds unreasonable to you, ask yourself, “Why don’t programmers actually program 40 hours a week?” or, “Why can’t concept artists spend all their time drawing?”

How Do You Define Test Deliverables?

One of the biggest problems I see with tests today is a lack of clarity in the deliverables. For example, an applicant might receive the following task:

Create a “research plan” and a presentation of recommendations to the team based on your preliminary findings from playing through the first time user experience.

But what counts as a research plan and a presentation? How much information is enough? One applicant might put in over 20 hours of effort to create an extensive research plan that accounts for conducting the research over the course of production for the game, and a 15-minute presentation aimed at peers breaking down all the findings from the preliminary research and going through their recommended design changes based on those findings. Another applicant might put in 6 hours of work and come back with a simple outline for a research plan and a 5-minute presentation with a plan of attack suitable for an executive-level audience. Both would fulfill the requirement, but they demand very different amounts of time and levels of involvement. In the first case, an interviewer might even say the applicant has done too much. But how is an applicant supposed to know from the instructions that were given?

Be very clear about what you want from a deliverable. If you are asking someone to build a level for a game, give them a maximum and minimum size, and tell them how long it should take a player to complete the experience. If you are asking for a presentation, clearly define how long it should be and who the audience will be. Also, be very specific about what you are looking for. Where should a person focus their efforts?

Here’s an example of a poor deliverable description:

Using simple mechanics, produce a pitch document for a minigame.

Maximum one page of text.
Theme the game to a sci-fi setting.
Include a short visual pitch on a separate page.
Include a list of required assets to build the game on a separate page.
Explain your thought-process and reasoning on a separate ‘designer notes’ page.

Let’s set aside the fact that this could easily take several days’ worth of work to do well. What else is wrong with the deliverables description?

What are they looking for?
- Minigames could be interpreted in many ways. Check out this video on the 7 Best Minigames That Deserve Their Own Game. Are these good examples of the type of thing the test is asking for? Arguably most of these have very “simple mechanics”. But maybe the writer of the test is looking for something much narrower in scope, like the Mass Effect 2 hacking minigame. Also, is this minigame part of a 3D Action RPG? Or maybe a 2D Visual Novel? How often will the player see the minigame? Do you need to be able to produce a bunch of different versions of it, or will this be a one-off experience? This task lacks specificity and thus is way too open to interpretation.
What is the scope of the visual pitch?
- Who is the audience for this pitch? Is it the design team? Maybe it’s a product manager? Should this be a slide deck or a one-pager? A test taker might read this and decide they know exactly what they need to do, then find that the test’s authors expected something completely different.
The list of required assets is itself a huge task.
- The scope of this task is directly related to how the test taker interprets the definition of minigame. If the author asks the test taker to create a minigame for an existing game, then at least they have a reference they can use to figure out what resources are already available to them. But when designing in a vacuum with no frame of reference, the test taker might do a great job on their due diligence for a fairly ambitious design that wasn’t even close to what the authors were looking for.

There are some good things about the above deliverables description. It tells the designer that they have to be succinct enough to fit their design documentation into a single page of text. It also solicits designer notes and asks again for brevity.

Here is one way I would retool this description, keeping the same scope:

Using simple mechanics, produce a pitch document for a 2D minigame, using puzzle mechanics, for our game <NAME OF RPG GAME>. We are looking for an experience that takes less than 5 minutes on average to complete, requires minimal instructions to learn, and uses a non-diegetic UI. We want to be able to present this minigame multiple times within the larger game’s experience and still have it feel fairly fresh each time. Good references include the hacking minigames from Mass Effect 2, or Prey.

Please keep your written description of the minigame to one page or less.
Include a visual pitch in the form of a slide deck. This pitch is meant to get buy-in from department heads and should allow them to quickly understand the concept from a high level. Your pitch should take about 5 minutes; please be prepared to present it to the team during your test review.
Include a list of required assets (above and beyond those that already exist in the game) to build the game on a separate page. If you are able to repurpose existing game assets please give a brief explanation of how you intend to do this.
Explain your thought-process and reasoning on a separate ‘designer notes’ page.

Still, this isn’t perfect, and some people will interpret these instructions in different ways. This brings me to my next point…

Should You Encourage Questions About The Test?

Even the most experienced people write tests poorly. As a professor I revised every test I gave every semester based on feedback from my students. Still, every year I had students come to me asking for specifics about one exercise or another. Honestly, I don’t believe that you can give a perfectly clear test. One of the problems is personal interpretation.

It’s a mistake to think that you will get the best applicant by giving them a test and having them perform in isolation. In reality, the better applicant might be the one who asks questions and looks for opportunities to collaborate, and to investigate. Often I find companies putting up a wall between those evaluating the test and those taking the test. Many companies don’t encourage applicants to ask questions, and when they do there is often an involved process where the test taker asks the HR representative, who asks the team, and sometimes the answer applicants get back isn’t to the question they originally asked.

My suggestion for a great candidate experience is to encourage candidates to ask questions. If you can afford it, setting up a quick meeting to onboard someone to the test exercise you are giving them is a great way to do this. That way the test taker has an opportunity to ask and follow up on questions, and you find out whether your applicant has a collaborative mindset. You also get something out of this – you get feedback on your test.

If you don’t make it clear that questions are welcome, many applicants will assume that their ability to interpret the questions is part of the test. This might well be the case, but is usually a bad measure of someone’s critical thinking ability. The flaw inherent in this way of thinking, again, is the assumption that you have written the instructions well.

A test is just like a product. It needs iteration if you want to achieve a really good result. The first exercise you put together is very unlikely to be perfect. The level design test you made last week should probably be very different from the one you’re giving three years from now. You can get some meaningful feedback from looking at the numbers and statistics on how many people did well versus how many failed, and whether the people who tested well later performed as expected on the job. But you can also supplement this. Do those onboarding meetings, and ask test takers to fill out a short post-test survey about their experience.

Is A Project You Work On Regularly A Good Testing Resource?

We talk about this all the time when we are running tests on the games we make. The worst testers are those on the development team. They have too many expectations, and know too much about the project. Consider the story from James Portnow’s Extra Credits where he talks about a consulting job where he was asked to try a game his client was developing. He was completely unable to figure out how to open the player character’s inventory. Eventually, he was told he had to triple-click the player character. Everyone there was so used to it that they had completely lost touch with the fact that it was not intuitive to anyone else. This is an example of excess familiarity being a problem.

Bottom line, you are too close to the work and have too many expectations. In recruiting there is this idea of culture-add rather than culture-fit. The principle is that you should be trying to find people who will add to your team rather than be a reflection of who you already have. The same principle should apply to the work. But this takes a lot of effort. It is hard to evaluate someone’s ideas on the basis of whether they are good even when they aren’t what you would have come up with given the same assignment. To help achieve that objective it is a good idea to remove some of your partiality.

The primary reason we test with projects we already have is that it’s easy. We look for people who can come up with the same, or at least similar, solutions, and who deliver them at a high quality. But there’s an inherent flaw there. We forget that we spent so much time in pre-production refining our ideas, and then tested and iterated on them until they were good. We can’t expect someone starting fresh with a project to have the same frame of mind and the knowledge we have. And when you give someone a test derived from a project you’ve been working on for several years, no matter how impartial you try to be about the results, the evaluation will be colored by your personal experiences and biases.

I suggest designing functional tests that get at core principles and knowledge without being colored by best practices in the context of a specific situation. You want to evaluate how well someone uses a tool or knows a language. Give them a test that you don’t have a perfect answer for. If you work for a studio with multiple games, one way to accomplish this while still keeping the overhead cost low is to frame the test for your team around the work of another team (just keep anyone who transferred from that team out of the test review).

How Often Should You Give Applicants Feedback On Their Test?

Very early on in my career I took a test for Insomniac (I only mention this because the story doesn’t make much sense without this information). I didn’t get a follow-up interview after submitting my test, a huge disappointment for a then-student who was super excited to get a test from a major AAA developer. When I asked for feedback on my submission I was only told, “It wasn’t Ratchety enough.” In retrospect I have to laugh at that response, given the Urban Dictionary definition of ratchet. But, I ask you, what could I have done with that feedback in order to grow and improve?

Maybe the test results aren’t what you expected. Maybe you were looking for something very different. Maybe you don’t think the way they presented the material was structured correctly. Or maybe their test results just didn’t show the seniority level you were expecting. Fine. If they completed your test, you should do a follow-up interview with them anyway. They just spent hours if not days on a test for you! The least you could do is help them improve by giving them 30 minutes of your time and a little feedback. Who knows, their explanations might surprise you.

Who Should Be Attending The Test Review Meeting?

There is a tendency in the industry to leave the test review process up to “someone else.” Especially in development it tends to get seen as a chore, or “not real work.” I totally understand this instinct. I remember early on as a lead I got really tired of doing the post-test reviews, even with candidates I thought were qualified. I would foist that meeting off on more junior members of the team: people who I thought had the chops to do the review, but who I thought had fewer demands on their time than me. Or, if I’m being real, I thought I had more important things to do.

This was a terrible mistake. Honestly, in my opinion, it’s a mistake to even bring someone junior to one of these meetings. They have a tendency to be less secure about their position and see the testing interview as an opportunity to showcase how well they understand things. Even if it is subconscious, they have a tendency to want to show off how they understand the team’s way of working, pipeline, design thinking, or whatever they think should be the focus. The problem is they also tend to demonstrate this by verbally attacking the candidate. They often don’t mean to, but it is the rare person in this position who doesn’t aggressively latch onto and dissect everything they think a candidate did wrong.

There are two problems here. One is obvious: the candidate who receives this treatment has a terrible experience and may decline to move forward even if you liked them. The other is that these evaluations are often unfair or narrow-minded. The junior team member is so focused on how this person didn’t do things the “right way” that they might fail to realize the value in the candidate’s approach.

Who Tests The Test?

Can your people pass your test? Is the assignment clear? Do they produce the deliverables you expect? Can they do it in significantly less time than you would require a prospective applicant to finish in?

Ideally you can answer these questions by testing internally before disseminating your test to applicants. But maybe you don’t have the in-house expertise. Maybe you had to hire someone to write a test for you to build a department you don’t have yet. Or maybe you have a one-person team you want to expand. You still have a network you can leverage. If you want your test to be reasonable in scope, be a good indicator of whether someone will do the job, and elicit consistent results from well qualified candidates, it’s probably a good idea to reach out to people you trust and get some initial feedback.

TL;DR

Given that most functional tests out there are poorly done, here are some guidelines for making sure your test can be effective and your testing process will be beneficial to both you and the candidate:

Discuss and be flexible with your test deadlines, because people are usually interviewing with multiple companies, have full-time jobs, and have real lives.
Scope your test to take hours rather than days. If you think your test might be over-scoped, assume it is.
Describe deliverables clearly, being specific about what skills and knowledge you are trying to test.
Encourage questions so candidates really understand what is being asked of them. If possible, have a test onboarding session with the candidate and someone from the functional department they are applying for.
Don’t test with projects you work on every day; you have too many biases about the “right way” to do things.
Always give feedback to candidates who finish your test. This helps them learn and grow, and might help them come back in six months and be right for the job. Also, they just did a bunch of work for you; don’t treat that like it’s nothing.
Don’t have junior and new employees be part of the review process. They tend to have something to prove and their inclusion can result in a hostile candidate experience.
Test your test. Make sure that people you already have (or people you know who are like the people you want to have) understand the test and can complete it in the time you expect. Also, tests are like a product: make sure you are constantly gathering feedback (going back to the point about encouraging questions) and spending the time to improve the test.

Hope you found the article helpful. Let us know in the comments below if you did!