The other day, I bought a new shirt. As I unwrapped it and pulled out the dozens of hidden pins, plastic thingies, and cardboard inserts, I happened to check the pocket, and there it was: Inspected by No. 43. The little tag that No. 43 puts in every pocket of every shirt to reassure the buyers that this is a quality-checked garment, with all of the buttons in place, all the seams sewn, and with the correct size and color indicated on the tags. And then the thought came to me: Why have I never seen a tag like No. 43’s on any e-Learning product?
We encounter evidence every day of measures devoted to ensuring the quality of professional efforts, from little paper tags in clothing and in product boxes, to dress rehearsals in the theatre, galley proofs in publishing, and electrical and mechanical inspections in construction. Have you ever called a large company and heard “This call may be monitored to ensure quality”? Manufacturers and other businesses declare “Quality is our first priority!” and “The quality goes in before the name goes on.” There is no doubt that everyone takes quality very seriously, ever more so as legal, regulatory, fiduciary, and organizational demands for accountability continue to increase.
Of course, we have similar steps in the process for producing educational and training products. Usually this quality assurance work goes by the somewhat off-putting name of “evaluation.” We know that evaluation is important. In a 1997 survey on instructional design, William J. Rothwell found that designers ranked evaluation as their fourth most important duty, in a list of fifteen duties. The survey results are included in Rothwell’s book, Mastering the Instructional Design Process: A Systematic Approach, co-authored by H. C. Kazanas.
Something is missing, though. In Rothwell’s survey, designers also ranked evaluation seventh in terms of how often they actually performed it. Might this gap be due only to the small size of Rothwell’s survey group, or to some defect in the study? Or does it reflect the reality that people believe evaluation is important, but they just don’t do it?
The eLearning Guild has conducted several polls and surveys of its own in the last two years, and they tend to support the notion that evaluation may be more honored in the breach than in the observance. For example:
- March 2002: 23% of respondents said they didn’t measure e-Learning at all, 16% said they measured changes in job performance, 16% measured changes to the bottom line, but most measured effectiveness only by completion rate (40%) or by the learners’ test scores (33%).
- September 2002: About half of the respondents to the e-Learning Measurement Poll said they evaluate e-Learning only by asking whether learners liked the e-Learning experience or by testing whether they learned from it. Almost 41% reported measuring whether e-Learners could use on-the-job the skills and knowledge they gained, but only 13% said they measure the return on the e-Learning investment.
- February 2004: 36% of the respondents to the Usability Poll (not the same as the Usability Survey conducted in January 2004) said they thought e-Learning courses could meet learning objectives even if they don’t have a high degree of usability. At the same time, 56% thought the opposite: that e-Learning without high usability would not be able to meet learning objectives. 7% weren’t sure either way.
- The Usability and e-Learning Survey conducted in January 2004 surfaced several issues about evaluation methods, and about usability vs. “learnability.”
There could be many reasons for this state of affairs. Evaluation is time-consuming. In a world where everything moves faster all the time, “dotting the i’s and crossing the t’s” (if that is how evaluation is seen) may seem like a luxury. To decision-makers, insisting on evaluation may seem like academic fussiness. Out in the software world, it sometimes appears to be the norm to ship the product and let the users serve as the alpha testers to find and report the bugs. Pilot testing and learning labs (the equivalent of software’s beta tests) are expensive, and take people away from their jobs in under-staffed organizations.
In my opinion, another reason e-Learning evaluation doesn’t happen may just be that it’s complex and that not all of the steps are understood by everyone. e-Learning is a product of several skill sets and often involves the efforts of a team of designers, developers, project managers, and subject-matter experts. Each element contributed by these different team members requires particular evaluation measures, and there may not be an overall plan to coordinate them.
In this article, I’ll briefly review the elements of evaluation, and suggest some resources that may be useful to you in creating an evaluation plan to guide efforts during development.
How many kinds of evaluation are there?
If you asked educators and trainers, including e-Learning professionals, to name the activities and measures they include under “evaluation,” you might get a lot of different answers. Surely you would hear about evaluating learners based on their test scores. Some answers would involve learner responses on “smile sheets” or other course evaluation questionnaires. Measuring return on investment (ROI) would inevitably come up. e-Learning professionals with a software background would probably mention usability as an important element to evaluate. And there would be many other forms of evaluation included in the discussion, each of them important to someone.
Some of the evaluation elements mentioned will be more concerned with assessing the results of the learning experience. These measures are taken after development is complete and after the course or e-Learning has been completed by all or part of the intended audience.
Other evaluation activities will have more to do with finding, during development and before release, any problems in the design or the materials that may later interfere with learning. Once the problems are found, they can be fixed.
Figure 1 sorts out these two kinds of evaluation activity. The formal name given by instructional designers to the “find and fix” activities is formative evaluation, because it helps designers give form to the instruction. The other group of activities is called summative evaluation because it sums up the outcomes in the real world. Both sides are important.
FIGURE 1 Evaluation is a complex matter, with a number of different elements.
I like to think of formative evaluation as quality assurance for the manufacturing side of e-Learning development (research and development, engineering, and production). Summative evaluation is, to me, more a part of the customer relationship aspect of the business, verifying satisfaction and value, and providing support for future demand for our product.
However, there is also a lot of truth in the management maxim that, “You can’t inspect quality into a product.” The instructional design and software development processes must support the evaluation effort. Three things are critical to overall e-Learning quality:
- A system must be in place to do the front-end analysis of needs and learners, and to identify the desired outcomes of each e-Learning product;
- Learning objectives, stated in ways that can be measured, must support the outcomes; and
- Standards, supported by processes and templates, must be in place to give structure to the design and development efforts.
Summative evaluation

The summative evaluation elements are probably more generally familiar to everyone, including decision-makers. These are the smile sheets, end-of-course evaluations, and ROI measures that take up a lot of an e-Learning manager’s attention. None of them are likely to result directly in redesign, although unsatisfactory summative evaluation results can certainly spark review or abandonment of courses and materials.
There are a number of methods in use, but all summative evaluation measures the results or outcomes of instruction in four areas external to the instruction itself: acceptance, effectiveness or validity, transfer to the job, and worth or value.
Acceptance

The traditional feedback forms and questionnaires provided to learners at the end of most synchronous and asynchronous e-Learning seek the reactions of the learners to the content and format of the e-Learning. Sometimes the managers of the learners are also surveyed for their opinions. These measures are very similar to those used in traditional classroom-based courses.
It is also common for e-Learning managers and decision-makers alike to attempt to “intuit” acceptance based on e-Learning completion rates and from the willingness of participants and their managers to use available e-Learning. However, there are so many factors that affect completion and enrollment rates that an additional specific questionnaire would be helpful.
Effectiveness or validity
Normally we judge the effectiveness of e-Learning from the criterion test results, if there is a criterion test associated with a given course. Frequently there is not. In these cases, it is possible to obtain some measure of effectiveness by asking learners to rate themselves and their confidence level after completing the e-Learning.
Transfer to the job
It can be much harder to evaluate skill and knowledge transfer to the job, and in fact it appears that this is done much less often. Usually evaluation is done subjectively by asking learners and their managers whether the skills and knowledge learned are being used when there is an opportunity to do so. This is normal for “soft” skills. It may also be possible to observe the learners on the job or review their actual measured production, error rates, etc., although there are many other factors besides what they learned that may affect the results.
Worth or value
If available, it is always best to use quantitative information when showing the worth or value of e-Learning. You may find my April 14, 2003 article in Learning Solutions Magazine (“Doing the Numbers: Return on Investment for e-Learning”) useful.
You can also get a sense of the measures used by organizations from the recent eLearning Guild Survey, “Showing the Value of e-Learning” (December, 2003). That survey assessed a variety of value measures according to whether they were used with internal or external customers. Some of these overlapped with measures described above as “acceptance” or “effectiveness.” For internal customers, the most commonly used measures were course enrollments and completion rates, and expressed customer satisfaction. For external customers, there were fewer standout measures: only course enrollments and customer satisfaction were used by over 50% of respondents. LMS or other tracking data and surveys were the most commonly listed sources of critical value measures.
In addition, I recommend that you always ask managers to estimate the worth, value, or other benefits. These are qualitative responses, but they are also a great source of testimonials and a valuable way of building a dialogue with decision-makers. It is always more effective to have your customers giving evidence of the value of your products than to only quote your own data.
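When quantitative data are available, the basic ROI calculation itself is simple arithmetic: net program benefits divided by program costs, times 100. Here is a minimal sketch in Python; the dollar figures are purely hypothetical, and in practice isolating the benefits attributable to the e-Learning is the hard part, not the formula.

```python
def roi_percent(benefits: float, costs: float) -> float:
    """Classic training ROI: net benefits as a percentage of program costs."""
    return (benefits - costs) / costs * 100

# Hypothetical figures for a single e-Learning rollout
benefits = 120_000.0  # estimated monetary value of performance improvement
costs = 80_000.0      # development, delivery, and learner-time costs

print(f"ROI = {roi_percent(benefits, costs):.0f}%")  # ROI = 50%
```

A result of 50% here would mean the program returned its full cost plus half again; a negative result would mean the measured benefits did not cover the costs.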
Formative evaluation

Formative evaluation is a principal concern during design and development of e-Learning. As with summative evaluation, there are multiple aspects to formative evaluation in e-Learning. The focus is on finding deficiencies that can be fixed: problems with usability, with “learnability,” and with whether the software works.
(“Learnability” is a useful made-up word. It describes three elements: the training objectives, the training content, and the training methods.)
In classic Instructional Design models, formative evaluation appears as the last step, but in fact much of the evaluation function must be done concurrently with development. We no longer have the luxury of being able to wait until development is complete before we start the test-fix-retest cycle.
Guidelines for formative evaluation
First, don’t call it formative evaluation. Call it a tryout, a pilot test, or an executive preview.
Second, be ready to answer questions from decision-makers about the more formal parts of the evaluation process:
- Why do you want to do a formal pilot test?
- What information will we get from this tryout?
- How much will this cost?
- Will this delay product launch? If so, how much?
Third, have a plan for the evaluation that includes expert reviews, alpha tests and beta tests, and user tryouts with a small number of learners. Expert reviews should be done as development proceeds, not afterward. Factor in enough time before launch for correction of problems found in the alpha and beta tests and the user tryouts.
Expert reviews. These reviews address three important areas: delivery methods, content, and usability.
Delivery methods review: This is the function of the instructional designer(s) on the team. Ordinarily this will be completed as part of design work before the subject matter experts and the developers begin the actual work of creating the e-Learning application. However, this review of the instructional design may take different forms for different kinds of e-Learning. For example, a simulation will be evaluated along different factors (and perhaps later in the development process) than a tutorial, and synchronous e-Learning is likely to include evaluation factors not considered or not needed in reviewing an asynchronous application.
Another part of the delivery methods review involves evaluating the training objectives and test items:
- Objectives describe the intended outcomes. Do the objectives describe the standards we would like learners to equal or exceed?
- Criterion test items determine whether the learner has achieved the objective. Do the criterion test items closely match the objectives?
You will find much more on this subject in Robert F. Mager’s Measuring Instructional Results.
Another useful guide to evaluating e-Learning methods and design, based on actual research, is e-Learning and the Science of Instruction, by Ruth Clark and Richard E. Mayer. You will find a summary of this in Ruth’s article in the September 10, 2002 issue of Learning Solutions Magazine, “Six Principles of Effective e-Learning: What Works and Why.”
Content review: This is normally done by Subject Matter Experts. You will also want to have someone in an editorial role proofreading screens, graphics, etc. for typos, grammar, and sense.
Usability reviews: In his comments on The eLearning Guild “Usability and e-Learning Survey,” Joe Pulichino, Director of Research for The eLearning Guild, says that Heuristic Usability Testing was the clear leader in the “I’m not familiar with this” category (34% of the respondents). He goes on to comment, “When I discovered this method in the literature, I learned that this is a method that can save considerable time in the testing process while still yielding valuable and valid improvements in usability.” The method is described on the Web at http://www.useit.com/papers/heuristic/heuristic_evaluation.html.
Jakob Nielsen’s set of ten Usability Heuristics is probably most often cited for usability reviews (find it at http://www.useit.com/papers/heuristic/heuristic_list.html). While these are generally compatible with learning principles, be aware that you may have to do a bit of “adapting” in order to apply them to e-Learning. Other methods of testing usability are generally done in the “alpha” and “beta” testing stages. Some of these methods include cognitive walkthroughs (adapted from software engineering and described on the Web at http://www.cc.gatech.edu/computing/classes/cs3302/documents/cog.walk.html), testing with representative learners, and obtaining computer-supported feedback.
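In Nielsen’s method, each evaluator independently rates every usability problem found on a 0-to-4 severity scale, and the ratings are then averaged so the team can fix the most severe problems first. The aggregation step is simple enough to sketch; the issue descriptions and ratings below are hypothetical examples, not data from any real review.

```python
from statistics import mean

# Each evaluator rates each problem from 0 (not a problem)
# to 4 (usability catastrophe), per Nielsen's severity scale.
ratings = {
    "Navigation buttons inconsistent across lessons": [3, 4, 3],
    "No feedback after quiz submission": [2, 3, 3],
    "Glossary link opens in same window, losing course progress": [1, 2, 2],
}

# Average the ratings and list the most severe problems first.
prioritized = sorted(ratings.items(), key=lambda kv: mean(kv[1]), reverse=True)
for issue, scores in prioritized:
    print(f"{mean(scores):.1f}  {issue}")
```

Averaging across several independent evaluators is the point of the exercise: single-evaluator severity judgments are unreliable, but the combined ranking is usually stable enough to drive the fix list.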
You will probably find it helpful to create your own worksheets to guide experts and team members in their work. Many of the articles published in Learning Solutions Magazine provide checklists or other information that you can use. I have listed several of these in the Sidebar.
Alpha testing of the software and infrastructure (does everything work correctly?). Development team members often perform the alpha testing. Usability testing can begin here, using walkthroughs. Alpha testing includes technical editing, which means assigning a designer or developer to follow instructions exactly, and then trying every possible response to everything in order to “break” the application. Finally, alpha tests should check for screen resolution problems and color palette inconsistencies, test on both broadband and dialup connections, make sure plug-ins are not going to be a problem, and so on.
Beta testing of the software and infrastructure. This is usually done with small numbers of individuals from the target population, with a design team member observing how learners interact with the application. Usability testing may take place here as well, mainly by observing the learners and collecting feedback from them and from the system. Beta testing is a software development function, checking the changes made as a result of the alpha tests.
User tryouts. The user tryout is the equivalent of a pilot run of a classroom course. It should duplicate as nearly as possible the conditions and delivery that will apply to the full target population. However, you will want to conduct follow-up interviews with the users and their managers to more fully assess the experience and the results.
Evaluation is the key to quality e-Learning, and having a plan for the process is the key to evaluation. Include formative evaluation in your project management, and include summative evaluation in your implementation plan. This is the final step to ensuring that e-Learning applications are not themselves a barrier to learning.
You may find these books useful in developing your worksheets and evaluation processes.
Clark, Ruth Colvin and Mayer, Richard E. e-Learning and the Science of Instruction: Proven Guidelines for Consumers and Designers of Multimedia Learning. 2003. John Wiley & Sons.
Mager, Robert F. Measuring Instructional Results. 2000. The Center for Effective Performance.
Rothwell, William J. and Kazanas, H. C. Mastering the Instructional Design Process: A Systematic Approach. 1998. Jossey-Bass, Inc., Publishers.