The Difficulty in Readability Testing

Since readability testing became a necessity for better patient understanding of leaflets, a plethora of ways to test a leaflet has emerged. To the outsider, presenting a leaflet to a participant, asking a set of preset questions and recording the responses could appear to be enough to gauge the user-friendliness of a Package Leaflet (PL). However, a simple yes/no dichotomous approach quickly reveals an unforeseen variable, which can aptly be demonstrated by the following hypothetical situation.

During a readability test, the participant is asked a question regarding a contraindication of the medicine. After re-reading the entire leaflet, the participant is finally able to locate the correct information, but over ten minutes have passed since the interviewer asked the original question. As the participant has found the information, the answer is marked correct.

Translate this situation to real life: ten minutes of anaphylactic shock is ten minutes too long for a patient taking the medicine to realise the PL contained this contraindication.

Although dramatic, this example does highlight the need to also record the difficulty (or levels of difficulty) a participant exhibits in finding key safety messages within a PL.  It is noted that during readability testing, an interviewer should note both the participant’s ability to locate AND understand the pertinent information, but for these purposes, this article will solely focus on the difficulty analysis inherent in each aspect.

To date, the methodology for ascertaining difficulty in readability testing has yielded two common possibilities: a diagnostic-based approach or the more concrete time-based approach. The latter of the two is by far the easier to record: for each question asked of a participant during testing, the interviewer records the time it takes the participant to locate and/or understand the information requested. The readability testing house can then set out its own protocol for evaluating these times: 0-10 seconds could indicate no difficulty, 10-20 seconds minimal difficulty, etc. The divisions are clear and the protocols can be made simple; however, the analysis of this concrete evaluation is far more problematic.
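As a rough illustration, the time bands described above could be written down as a small lookup. This is only a sketch: the first two bands (0-10 and 10-20 seconds) come from the example protocol, while the 30-second cut-off, the "moderate" and "severe" labels, the function name classify_time and the decision to place a boundary value such as exactly 10 seconds in the higher band are all assumptions made for the sake of the example.

```python
from bisect import bisect_right

# Upper limits (in seconds) of each band. Only the 10 and 20 second limits
# appear in the article; the 30 second limit is an assumed continuation.
BAND_LIMITS = [10, 20, 30]

# One label per band, plus a final label for anything past the last limit.
# The last two labels are hypothetical.
BAND_LABELS = [
    "no difficulty",
    "minimal difficulty",
    "moderate difficulty",
    "severe difficulty",
]


def classify_time(seconds: float) -> str:
    """Map a recorded response time to a difficulty band.

    A time that falls exactly on a limit is placed in the higher band,
    so 10 seconds counts as "minimal difficulty" (an assumption, since
    the article's bands share their boundary values).
    """
    if seconds < 0:
        raise ValueError("response time cannot be negative")
    return BAND_LABELS[bisect_right(BAND_LIMITS, seconds)]


if __name__ == "__main__":
    for recorded in (7, 15, 42):
        print(recorded, "->", classify_time(recorded))
```

The simplicity is the appeal: the interviewer records a single number and the protocol does the rest. Note that the lookup knows nothing about who produced the time, only how long it was.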

Again, an example best illustrates the point: a 70-year-old man sits as a participant in a readability test. After being asked his first question, it takes this man 65 seconds to locate the correct information. Another interview is conducted with a 22-year-old university graduate. She answers the same question in only 4 seconds. Did they experience differing levels of difficulty?

The obvious answer is yes. However, perhaps the older gentleman had poor eyesight or simply felt no rush to answer (this is commonly observed in testing). The inverse situation presents its own problems: the 22-year-old university graduate takes 65 seconds to answer a given question due to her dyslexia.

The above instances all highlight the influence of external factors on a time-based assessment. In each case, it was not any difficulty inherent in the PL that delayed the answer, but the participants' own difficulties which affected the results. This distinction is paramount; readability testing must test the PL and not the participant.

Again, it is via such examples that one can observe the inaccuracies of this time-based approach to measuring difficulty. Although this basis for evaluation can yield ‘neat’ results with clearly defined criteria, it can just as easily yield false data.

Having overseen the testing of countless leaflets in a variety of languages and cultures (where time-based analysis can again break down: does the 32-year-old Italian male answer at the same pace as the 32-year-old Brit?), this author favours the diagnostic-based assessment of difficulty which, once interviewers are properly trained and the method properly implemented, has yielded far more accurate data and analysis of difficulty.

The training of the interviewer is the cornerstone of this diagnostic evaluation. People with backgrounds in psychology or sociology are ideally placed to read the intricacies of human behaviour, because it is here that difficulty can be ascertained. Fidgeting, eye movement, sighing and/or turning over of the PL are only a few of the signs the interviewer should note when assessing difficulty. Another invaluable aspect of this methodology is encouraging the participant to relay as much information as possible.

This information does not have to be geared solely toward the actual testing process; in easy, free-flowing conversation, the interviewer can note aspects of the participant’s personality, physical capabilities or even how they feel that day. All these aspects (and many more) will influence how a person performs during a readability test. To let them fall by the wayside, as the time-based assessment of difficulty does, is to sell the process of readability testing short and to leave questionable data for analysis.
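By way of illustration only, the interviewer's observations could be kept in a structured record for each question, so that they can be compared across participants. The sketch below is hypothetical and not a description of any actual SOP: the names QuestionObservation, CUES and cue_counts are invented for this example, and the list of cues is limited to the behaviours mentioned above.

```python
from collections import Counter
from dataclasses import dataclass, field

# Behavioural cues named in the article. A real SOP would define its own,
# longer list; this set is only an illustrative assumption.
CUES = {"fidgeting", "eye movement", "sighing", "turning over the PL"}


@dataclass
class QuestionObservation:
    """One interviewer's notes on one participant answering one question."""

    question_id: str
    located: bool      # did the participant find the information?
    understood: bool   # did the participant understand it?
    cues: list[str] = field(default_factory=list)
    participant_notes: str = ""  # free-text context from conversation

    def add_cue(self, cue: str) -> None:
        """Record an observed cue, rejecting anything outside the agreed list."""
        if cue not in CUES:
            raise ValueError(f"unknown cue: {cue!r}")
        self.cues.append(cue)


def cue_counts(observations: list[QuestionObservation]) -> dict[str, Counter]:
    """Tally the observed cues per question across all participants."""
    counts: dict[str, Counter] = {}
    for obs in observations:
        counts.setdefault(obs.question_id, Counter()).update(obs.cues)
    return counts


if __name__ == "__main__":
    first = QuestionObservation("Q1", located=True, understood=True,
                                participant_notes="forgot reading glasses")
    first.add_cue("sighing")
    second = QuestionObservation("Q1", located=True, understood=False,
                                 cues=["sighing", "fidgeting"])
    print(cue_counts([first, second]))
```

The design choice here is to tally cues per question rather than per participant: a cue that recurs on the same question across many participants plausibly points at the leaflet, whereas the free-text notes preserve the personal context that a stopwatch would discard.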

With this diagnostic approach, the challenge then becomes training interviewers and implementing these standards. Time-based evaluation only requires a watch; diagnostic evaluation necessitates far more detailed SOPs and training. And then, of course, one must present all this in a clear, logical manner in the PL final report.

Author:   Ryan Smith

Arriello sro

Vinohradska 1188/58

Prague 3, 13 000, Czech Republic

(Tel)       +420 222 523 941

(Fax)      +420 222 523 942

(Mob)   + 420 775 359 341

www.arriello.com