Psychometric Test Design


Rob Williams Assessment Ltd specialises in custom psychometric test design services for SMEs and scale-ups. We have also helped SHL and Kenexa IBM with larger psychometric design projects for the EU, BA and Citibank; designed video assessments (HireVue); and been commissioned by leading psychometric consultancies, such as SOVA and CAPP.

Psychometric Designs

Many of the UK’s and US’s leading test publishers have used Rob Williams Assessment to calculate and demonstrate the reliability of their personality tests and situational judgement tests. In addition to situational judgement test design, we also specialise in other forms of psychometric test design, such as personality test design and verbal reasoning test design.

Our specific psychometric test design specialities

Strengths Assessment ~ Disability Assessment ~ Leadership Assessment ~ Situational Strengths ~ Intelligence Strengths Test ~ Skills test design / Psychometric ~ Assessment standards ~ Competency Design ~ Realistic Job Preview Design.

Our Bespoke Psychometric Design Principles

For training consultancies, Rob Williams Assessment Ltd has built coaching and employability profiling tools. With recruitment agencies, we’ve designed culture-fit and role-fit personality-based assessments to improve the efficiency of matching clients to the best-fitting jobs.

Our Personality Fit Designs

An increasing number of recruitment agencies use Rob Williams Assessment Ltd’s bespoke culture-fit algorithms to improve their shortlisting process for key clients. More accurate shortlisting creates efficiencies for both the agency and its clients. Shortlisted candidates are also happier, knowing that the agency is only putting their details forward for the jobs and companies that are the best fit for them personally. This reflects a growing recognition of the need to match individual applicants to organisational culture. Graduates traditionally favoured corporate cultures but now often find the culture of SMEs and start-ups to be more engaging and ‘meaningful’.

Best practice situational judgment test design

Rob Williams Assessment Ltd successfully applied the British Psychological Society’s best practice in personality questionnaire design. Both bespoke personality questionnaires met the client’s aim of measuring the most commonly sought graduate personality traits. Employers are increasingly using bespoke situational judgement tests (where the candidate is presented with scenarios and asked to select the best and the worst thing to do next) as a way to learn more about candidates’ character and attitudes to work.

Our branching situational judgement test designs

As psychometric tests have become more commonplace, the bigger users have commissioned their own bespoke situational judgement tests. Rob Williams Assessment has worked on several such projects for High Street banks and for the European Union. Another recent innovation by test developers has been online adaptive tests. With these tests, if you are doing well, you will find that the questions get progressively harder. The innovative design of shorter and more efficient tests was driven by an increasing awareness of the immediacy of the Internet and of our growing use of email and social media in short, sharp bursts. This discourages test takers from spending 30-40 minutes online doing the same questionnaire. It’s better for everyone to keep test takers engaged, not bored, when being tested!

Adaptive psychometric test design

So what will adaptive tests mean for you as a prospective test taker? The biggest difference is the shortness of the test. The second major difference is that you will find an adaptive test more challenging. Without getting into their highly technical make-up, the test adapts to your ability level. More specifically, it adapts to find the most challenging question that you can answer correctly. In the past, you may have found questions on a test fluctuating in difficulty or generally becoming more and more difficult the further on you got in the test. Consider a test of twenty questions with the first the easiest and the twentieth the most difficult. An adaptive version would not ask all twenty in order; it would move up or down that scale depending on your answers.
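The twenty-question example can be sketched in code. This is a deliberately simplified illustration and not how any particular publisher's adaptive test works: it assumes the candidate always answers correctly up to a fixed difficulty, and it narrows down that level by halving the remaining range after each answer. Real adaptive tests typically rely on statistical models of item difficulty. The function name, the `max_questions` limit and the example candidate are all hypothetical.

```python
def run_staircase_test(answers_correctly, num_items=20, max_questions=8):
    """Administer a simplified adaptive test.

    Items are ranked 1 (easiest) to num_items (hardest).
    answers_correctly is a callable that takes a difficulty rank and
    returns True if the candidate gets that item right.
    Returns (asked, estimate), where estimate is the hardest item
    answered correctly (0 if none were).
    """
    low, high = 1, num_items   # difficulty range still worth asking about
    asked = []                 # (difficulty, correct) pairs, in order asked
    estimate = 0
    while low <= high and len(asked) < max_questions:
        item = (low + high) // 2          # start mid-range, then home in
        correct = answers_correctly(item)
        asked.append((item, correct))
        if correct:
            estimate = max(estimate, item)
            low = item + 1                # next question is harder
        else:
            high = item - 1               # next question is easier
    return asked, estimate


# Hypothetical candidate who can answer anything up to difficulty 13.
asked, estimate = run_staircase_test(lambda difficulty: difficulty <= 13)
print("Questions asked:", [difficulty for difficulty, _ in asked])
print("Estimated level:", estimate)
```

Under these assumptions the candidate sees only a handful of the twenty questions, which is why adaptive tests can be both shorter and more challenging.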

Knowledge-based situational judgement test design

Some or all of the scenarios presented in an SJT can test specific job knowledge. For example, a retail marketing SJT may ask questions about the 3Ps (price, position, promotion) of product marketing. Alternatively, an SJT measuring generic decision-making skills may be used alongside a knowledge-based test.

Our video-based situational judgment test designs

Simulated situational judgement tests are increasingly common as recruitment sifts. Adding 2D or 3D workplace scenario graphics brings the situational judgment test scenarios to life. This can only promote the company brand and make employers using simulated situational judgment tests more desirable employers.

UK and US psychometric test publishers have produced both video-based and animated SJT scenarios. Animated SJTs are easier – and therefore cheaper – for global companies to develop.

Our consultancy work focuses on aptitude test practice and bespoke psychometric test design. We believe in the benefits of practice and ensuring fairness in testing. We, therefore, offer some practice aptitude tests and some practice psychometric tests. The intention is to promote as ‘level a playing-field’ as possible.

We always follow BPS Standards in Psychometric Test Design

To be psychometrically sound a test must be:
  • Objective – the results obtained are not influenced by the administrator’s personal characteristics or irrelevant factors such as the colour of a test taker’s socks.
  • Standardised – the test is administered and scored according to standard procedures and people’s scores are compared to known standards.
  • Reliable – the test measures in a consistent way. The potential error is small and is quantifiable.
  • Valid – the test measures the characteristics which it set out to measure. A test used to select a job applicant should predict job performance. A test of verbal ability should measure verbal ability and not some other skill.
  • Discriminating – the test should show clear differences between individuals on the behaviour being tested. It should not, however, be discriminatory, i.e. unfairly discriminating against minority groups on the basis of irrelevant characteristics.

Our Bespoke Psychometric Tests have validity

While reliability deals with the consistency and accuracy of the measurement device, validity deals with the theoretical concept that the test is supposed to measure. In general terms, validity refers to the extent to which a test measures what it is meant to measure. For example, if a test is supposed to measure extroversion, it is expected to ask questions about social behaviour and to correlate with other tests that measure similar concepts, such as outgoingness. There are five types of validity:
  • Predictive validity.
  • Discriminant validity.
  • Content validity.
  • Face validity.
  • Construct validity.

Predictive Validity

Predictive validity is an indication of how well the test score predicts future behaviour. For example, tests used for employment selection are expected to correlate with job performance.

Content Validity

Content validity is the extent to which the test samples relevant aspects of the psychological function. For example, the psychological function of extroversion is made up of being socially bold, eccentric, sociable, and lively. A test measuring extroversion should use items that reflect all of these components. Content validity cannot be assessed numerically.

Face Validity

Psychological tests involve translating psychological functions into questions or requests. Sometimes the translation seems alien and hardly related to the actual psychological concept. Face validity refers to how appropriate the test seems to the people who use it. For example, a driving test seems a very appropriate measure of driving behaviour because its tasks are very similar to those required by actual driving.

Construct Validity

Psychological tests are translations of psychological concepts into test items. The extent to which the test items actually measure the psychological concept is known as construct validity. For example, if a test is supposed to measure verbal reasoning and it uses uncommon words, it might measure other concepts, such as language proficiency, rather than verbal reasoning. Construct validity is assessed indirectly by correlating the test scores with scores on another test that is known to measure the same psychological concept. For example, the construct validity of a new test measuring dominance might be assessed by correlating it with the Humble versus Dominant scale on the 16PF or the Dominance scale on the EPPS.
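As a minimal sketch of that correlation step, the snippet below computes a Pearson correlation between scores on a new scale and scores on an established scale for the same candidates. The scores are invented purely for illustration, and the variable names are hypothetical; a real validation study would use a much larger sample.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / sqrt(spread_x * spread_y)


# Invented scores for six candidates (illustration only).
new_dominance_test = [12, 15, 9, 20, 17, 11]
established_scale = [5, 6, 4, 9, 7, 5]

r = pearson_r(new_dominance_test, established_scale)
print(f"Correlation with established scale: r = {r:.2f}")
```

A high positive correlation would support the claim that the new test measures the same concept as the established scale; a low one would suggest it is measuring something else.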

Example Psychometric Test Projects 

Rob Williams has over twenty years of experience of bespoke psychometric test design, plus ten years prior to this spent working for several of the UK’s leading test publishers. These include IBM, OPP, SHL and HireVue.

British Airways blended assessment project:
  • BA situational judgement test design;
  • Numerical reasoning test design;
  • Verbal reasoning test design;
  • Personality questionnaire design.


  • Best practice was followed throughout the design process in SJT design.
  • An SJT was produced which successfully incorporated a range of care home-specific scenarios.
  • The most suitable set of scenarios could be hand-picked at the SME panel meeting, as well as gaining buy-in and discussing implementation.
  • Providing some scenarios for telephone interview sifts.
  • Setting a suitable cut-off and validating the tool.

EPSO Test Design
  • Development of project management test.
  • Design of IT skills-based aptitude tests.

Our Work Styles assessment designs

We will work with you to design the work styles tool best suited to your needs. Our Bespoke Personality Questionnaire design process aims to:
  • Include key role dimensions.
  • Reflect the personality, attitudinal and motivational aspects of the role-specific dimensions.
  • Use face-valid questions.
  • Be capable of completion in approximately 20 minutes.
  • Adopt a single-stimulus question format (Likert scale).
  • Adopt a normative scoring format, using a sten look-up table for each personality scale.
  • Use a Social Desirability scale to deal with the issue of faking or extreme scoring patterns.
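To illustrate what a sten look-up table is, here is a minimal sketch that builds one for a single scale. It assumes the norm group's raw scores are roughly normally distributed and uses the standard sten definition (mean 5.5, standard deviation 2, range 1 to 10). The norm mean of 20, the standard deviation of 4 and the 0 to 40 raw score range are made-up values, and the function name is hypothetical.

```python
from math import floor


def build_sten_table(norm_mean, norm_sd, min_raw, max_raw):
    """Map every possible raw score on a scale to a sten score (1 to 10)."""
    table = {}
    for raw in range(min_raw, max_raw + 1):
        z = (raw - norm_mean) / norm_sd      # distance from the norm mean in SD units
        sten = floor(2 * z + 5.5 + 0.5)      # round 2z + 5.5 to nearest, halves up
        table[raw] = min(10, max(1, sten))   # stens are capped at 1 and 10
    return table


# Made-up norm group statistics for one personality scale.
sten_table = build_sten_table(norm_mean=20, norm_sd=4, min_raw=0, max_raw=40)
for raw_score in (12, 19, 20, 24, 30):
    print(f"raw {raw_score:>2} -> sten {sten_table[raw_score]}")
```

Once the table exists, scoring a candidate is a simple look-up per scale, which is what makes this format convenient for normative reporting.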

Recent situational judgement test trends

Situational judgement tests (SJTs) have also become prevalent in graduate recruitment. These tests present scenarios to applicants and ask them to select the best and the worst thing to do next. SJTs are very popular in the United States due to their excellent record of fairness across different ethnic groups.

Recent personality test trends

Candidates may also have to take a personality test as part of the recruitment process. There is a vast array of personality tests, which pose questions about a candidate’s behaviour and personal preferences. A typical question may ask whether you prefer attending parties or staying home with a good book. These personality tests help employers to determine whether a candidate has the right profile for the role.

Graduate Aptitude test designs

  • A well-designed selection procedure focuses on predicting a graduate’s competence within a particular work context.
  • Psychometric assessments form only one part of the selection procedure.
  • Personality assessments can give an indication of how well an individual applicant will fit into the existing workplace or team.
  • Psychometric assessments can assess which applicants are most suited to the demands of the vacant job in terms of both ability and personality factors.

Other psychometric test design pillars

  • Standardization
  • Reliability – means if I take the same test next week, my results will be similar.
  • Validity – means the test measures what it says it measures. MindX knows which personality traits are measured because we have compared our results to well-established personality tests. HireVue validates its video assessments using high performer data and job analysis results.
  • Being Non-discriminatory


Since the first IQ tests were developed, the whole point has been to compare each score with those of a group of previous test-takers.


There are two key types of psychometric reliability.

Firstly, Internal Reliability / Internal Consistency: whether all the test items measure the same concept. It can be assessed in two ways. The first method is split-half reliability, which requires correlating the score based on half of the test items with the score based on the other half (e.g., scores on odd and even items). An alternative method is item-total reliability, which requires correlating each item score with the total score of the rest of the items. A statistical measure called Cronbach’s alpha summarises all the correlations in one figure. A test should have an alpha of at least α = .80.

Secondly, Test-retest Reliability relates to the consistency of psychometric scores over time; in other words, how consistently a psychometric test measures on different occasions. Taking as an example a stable characteristic such as intelligence:
  • The psychometric test design must produce similar results if a group of candidates is examined using the same test at two points in time.
  • A time gap of at least two weeks between the two measurements is key. The gap should not be too long, however, since some psychological characteristics change considerably over time.
  • Test-retest reliability is assessed by correlating the test scores measured the first time with the test scores measured the second time. Ability tests are expected to have a reliability of at least r = .75, yet personality tests might have somewhat lower reliability.
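The reliability statistics described above can be sketched in a few lines. This is an illustrative outline rather than a full reliability analysis: it assumes item responses are stored as one row per candidate and one column per item, uses the standard formula for Cronbach's alpha, and works on invented data. All function and variable names are hypothetical.

```python
from math import sqrt
from statistics import variance


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return covariation / sqrt(
        sum((a - mean_x) ** 2 for a in x) * sum((b - mean_y) ** 2 for b in y)
    )


def split_half_r(item_scores):
    """Correlate each candidate's odd-item total with their even-item total."""
    odd_totals = [sum(row[0::2]) for row in item_scores]
    even_totals = [sum(row[1::2]) for row in item_scores]
    return pearson_r(odd_totals, even_totals)


def cronbach_alpha(item_scores):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    k = len(item_scores[0])
    item_variances = [variance(column) for column in zip(*item_scores)]
    total_variance = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Invented Likert responses: six candidates (rows) by four items (columns).
responses = [
    [4, 5, 4, 5],
    [2, 1, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
    [3, 4, 3, 3],
]
print(f"Split-half r:     {split_half_r(responses):.2f}")
print(f"Cronbach's alpha: {cronbach_alpha(responses):.2f}")

# Invented total scores for the same six candidates, tested twice.
first_sitting = [18, 7, 13, 19, 6, 13]
second_sitting = [17, 8, 14, 19, 7, 12]
print(f"Test-retest r:    {pearson_r(first_sitting, second_sitting):.2f}")
```

The resulting figures would then be compared with the thresholds mentioned above (an alpha of at least .80 and a test-retest correlation of at least .75 for ability tests).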


Every psychometric test has at least one norm group. A standalone score has little meaning, so it is compared with previous candidates’ performance, for example as a percentile rank.
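A percentile rank can be computed directly from a norm group's scores. The sketch below uses one common convention, in which candidates with the same score count as half below and half above; other conventions exist, and the norm group scores shown are invented.

```python
def percentile_rank(score, norm_scores):
    """Percentage of the norm group scoring below `score`,
    counting tied scores as half below (mid-rank convention)."""
    below = sum(1 for s in norm_scores if s < score)
    tied = sum(1 for s in norm_scores if s == score)
    return 100.0 * (below + 0.5 * tied) / len(norm_scores)


# Invented raw scores for a small norm group of ten previous candidates.
norm_group = [10, 12, 12, 14, 15, 15, 15, 17, 18, 20]

for candidate_score in (12, 15, 19):
    rank = percentile_rank(candidate_score, norm_group)
    print(f"raw score {candidate_score} -> percentile rank {rank:.0f}")
```

In practice a norm group would contain hundreds or thousands of previous test-takers, so that percentile ranks are stable.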

Piloting and Trialling

Trial items are written as part of a larger-than-necessary set of pre-test items. The final, refined test is created by selecting those items that work best, based on a statistical analysis of the trial item set’s results.
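One simple form that this statistical analysis can take is classical item analysis. The sketch below is an assumption about a typical approach, not a description of a specific project: for each trial item it computes the facility (proportion answering correctly) and a discrimination index (how much better the top-scoring third did than the bottom-scoring third), then keeps items within chosen limits. The thresholds, the use of thirds and the response data are all hypothetical.

```python
def item_analysis(responses):
    """responses: one row per candidate, one column per item (1 = correct, 0 = incorrect)."""
    n = len(responses)
    ranked = sorted(responses, key=sum, reverse=True)   # highest total scores first
    group_size = max(1, n // 3)                          # top and bottom thirds
    upper, lower = ranked[:group_size], ranked[-group_size:]
    results = []
    for i in range(len(responses[0])):
        facility = sum(row[i] for row in responses) / n
        discrimination = (
            sum(row[i] for row in upper) - sum(row[i] for row in lower)
        ) / group_size
        results.append(
            {"item": i, "facility": facility, "discrimination": discrimination}
        )
    return results


def select_items(results, min_facility=0.2, max_facility=0.9, min_discrimination=0.3):
    """Keep items that are neither too easy nor too hard and that separate
    stronger from weaker candidates. Thresholds are illustrative only."""
    return [
        r["item"]
        for r in results
        if min_facility <= r["facility"] <= max_facility
        and r["discrimination"] >= min_discrimination
    ]


# Invented trial data: six candidates by four trial items.
trial_responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 0],
]
analysis = item_analysis(trial_responses)
print("Items retained:", select_items(analysis))
```

In this invented data set, the first item is answered correctly by everyone and so tells us nothing about differences between candidates, which is the kind of item a trial is designed to weed out.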

Aptitude Test Reliability

Firstly, the psychometric test must be a reliable measure. Reliability is most commonly reported using the internal consistency index, coefficient alpha, or its dichotomous formulation, KR-20. Under most conditions, these range from 0.0 to 1.0, with 1.0 indicating perfectly reliable measurement. A reliable test may still not be a valid test.
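For right/wrong aptitude items, KR-20 can be sketched as follows. This uses the standard Kuder-Richardson formula 20 with population variances and runs on a tiny invented data set; the function name is hypothetical and the example is only meant to show the arithmetic.

```python
from statistics import pvariance


def kr20(responses):
    """KR-20 = k/(k-1) * (1 - sum(p*q) / variance of total scores).

    responses: one row per candidate, one column per item (1 = correct, 0 = incorrect).
    p is the proportion answering an item correctly and q = 1 - p.
    """
    n = len(responses)
    k = len(responses[0])
    proportions = [sum(row[i] for row in responses) / n for i in range(k)]
    sum_pq = sum(p * (1 - p) for p in proportions)
    total_variance = pvariance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum_pq / total_variance)


# Invented responses: four candidates by three items.
aptitude_responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(f"KR-20 = {kr20(aptitude_responses):.2f}")
```

With only three items the coefficient will usually be modest; reliability generally increases as more well-functioning items are added.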

Aptitude Test Validity 

Secondly, is the test valid? Does it measure what it ‘says on the tin’? In our opinion, both the initial content validation and the later criterion validation analysis are vital for any bespoke psychometric test. We recommend collecting additional recruitment data over time, such as assessment centre data, so that additional validation studies can be conducted. There are many other types of psychometric test validation evidence, and one-off studies investigating a psychometric test’s criterion validity are common.

British Army Aptitude Test Design Example

Psychometric lead role with Kenexa IBM; managing twenty associates.
  • We developed over twenty psychometric tests;
  • Situational judgement tests for Officers and for Soldiers;
  • Realistic job previews for Officers and for Soldiers;
  • Ability tests (including a problem-solving test) for Officers;
  • Ability tests (including a spatial reasoning test) for Soldiers;
  • Officer personality questionnaire;
  • Soldier career guidance tools.

Aptitude Tests’ Vital Role

Many companies today are turning to testing and assessment tools to help them address their hiring challenges and make more substantive, data-driven hiring decisions. Assessments are a great way to level the playing field and evaluate many candidates for the same skill sets in an objective fashion, using real-life technical scenarios that mirror the work they will be performing on the job. Automated assessment tools in particular can scale to make better use of your time and resources. Several key recruitment benefits are listed below.

Differentiation Benefits

Providing unbiased assessments is a great way to distinguish yourself to candidates in a crowded hiring environment. Candidates will see that you really care about hiring the most qualified tech workers in a manner that is unbiased and uncovers their true value.

Biodata tool psychometric design

Biodata tools use test questions that ask about facts from a candidate’s previous working and life history. Biodata questions can also cover personal attitudes, values and beliefs. There are therefore both autobiographical and biographical perspectives. For example, a question might ask how effective previous working relationships were with managers and/or colleagues.

Biodata test design

Whilst biodata was popular in the 1970s and 1980s in the UK, it fell out of fashion due to concerns about face validity. Face validity is how job-relevant a test’s questions appear to be. This is difficult to show with biodata’s indirect approach of posing questions about past behaviours, which can seem intrusive. In our opinion, these concerns cannot explain the predictive power of some biographical items. Construct validity studies of biodata scores are, in our view, mainly based on the notion that past behaviour is the best predictor of future behaviour.

NEO personality test example

  • The NEO is one of the most well-researched personality tests used in both the US and the UK.
  • There is considerable research supporting the “Big Five” model of personality.
  • Like the 16PF5 personality questionnaire, the NEO is based on academically rigorous factor analysis.
  • There is a systematic model behind the set of NEO personality questions.

Personality Values Test Design

  • 3-4 scenarios to assess each value.
  • Totalling approx. 15-16 questions.
  • Provides accurate and meaningful feedback to each respondent.

Values-Based Simulation Exercises

  • Scenarios from the job analysis can be used to design simulation exercises.
  • Parallel versions are developed to maximise exercise integrity.
  • This minimises the risk of applicants sharing details of the tools and compromising the validity of the assessment process.

Values Assessment

We can design values assessment exercises to focus on any values framework. Values-based recruitment may involve values-based structured interviews, values-based selection centres, values-based situational judgement tests and/or values-based personality tests.

Motivation Tests

Values impact goal content whereas personality traits impact the efforts that individuals make towards their goals.

Registered Office: Rob Williams, 31 Bruton Way, Ealing, London, W13 0BY
Companies House Registration No: 6572976