TESL-EJ
Vol. 6. No. 1 INT June 2002

Grow Your Own: Online Placement Testing

Maggie Sokolik
University of California, Berkeley
<sokolik@socrates.berkeley.edu>

Jim Duber
duber dot com
<jim@duber.com>

A few years ago, the scenario at the University of California, Berkeley's Summer ESL Workshop was this: a hundred and fifty students would arrive on a Saturday night from Europe and Asia. Sunday morning, they would seek out the lecture hall where the ESL placement test was to be held. Jetlagged and disoriented, they would sit with number two pencils and fill in bubbles on an answer sheet, while a team of teachers answered their individual questions and proctored the examination.

After the exam ended, a squad of instructors would sit inside on a sunny Sunday afternoon, hand-marking each exam (we had no scanner for automated scoring), while the administrative staff sorted the papers into piles by score, collating them with personal data in order to achieve gender and linguistic balance in each classroom. The staff would then type up class lists to be printed and posted for students to consult early Monday morning.

On a good exam day, we finished the placement process before midnight, and didn't make too many mistakes.

Internet technology has changed this scenario substantially: over the past three years, we have been creating, testing, and implementing our own online placement test. Here's how and why we did it.

WHY NOT TO DO IT YOURSELF

The advantages and process of creating an online placement test are outlined below. However, it is important to consider first the potential problems or disadvantages of doing it yourself. If you are considering creating a placement test, you should ask yourself these questions:

Do you really need to write your own test? The process, as described below, requires a substantial amount of work in writing, piloting, and constant revision. Many pre-written, validated examinations are available for licensing or purchase: ACT (http://www.act.org/esl/), the University of Michigan (http://www.lsa.umich.edu/eli/testpub.htm), and Oxford University Press (http://www.oup-usa.org/esl/isbn/0194535835.html) are among those that offer ESL placement examinations in various formats and at a variety of price ranges.

Does the client (student)-side technology match yours? Your institution may be able to offer a highly interactive and technologically advanced examination, one including streaming video, recording capabilities, and so forth. But will the students be able to run it on their own computers? It is important to understand the installed base, or typical computer configuration, of the students who will take your test, including typical processor and modem speeds.

Do you have the resources to create and support it? To write and maintain a test, you will probably need the following expertise and equipment:

Of course, one person can serve different roles, but you should be sure that any person involved in the project has ample time and is compensated for doing so. (This is not a job that can be passed on to a couple of competent instructors or administrators to do in their "spare time," unless they are qualified, paid, and willing.)

Is test security of extreme importance? Unless you can offer an in-house exam in a proctored environment, students will be taking the test on their own machines, with online dictionaries, grammar references, and possibly friends and relatives available to help out. You must determine whether this is acceptable. You cannot eliminate the chance that students will cheat, although good test design can reduce that possibility.

OUR ADVANTAGES[1]

The decision to create our own placement test arose out of several problems we had experienced with the "out of the box" placement tests.

Validity for Our Students

Although the commercial examination we were using was no doubt reliable and valid in general, our own statistical analysis showed that it was not a valid examination for our students, who tend to have advanced English skills. In the last year we used the paper-and-pencil test, the students' mean score was 86%. Further analysis of the grammar section told us that the grammar questions were yielding no valid data at all: regardless of what "level" a student was placed into, s/he performed consistently well in the grammar section. The test was not giving us enough information to place our advanced students, who make up nearly half of our program.

Space-Time Continuum

Our students aren't on campus at the best time to take the placement test. They are in Japan, Taiwan, France, Italy, and dozens of other places around the world. Our short summer schedule does not allow us to take a day away from teaching, and the dormitory schedule does not allow students to arrive a day early for testing.

Cost Benefits

Although there is an up-front investment in creating a placement test, we calculate that in the long run we will save both time and money. Our ongoing costs will consist primarily of validating and revising test items and maintaining the databases. Teacher time spent proctoring and marking has been almost entirely eliminated (see the one exception below). The administrative staff still has to place students and collate data to create balanced classrooms, but the good news is we are now getting home well before dinner time.

Revisability

Despite careful writing and piloting, after only six students had taken the placement test, we discovered two problems with our newly released online exam:

  1. The essay question was not being interpreted correctly. We were able to change the question with a minimal impact on the overall testing situation.
  2. We discovered we had omitted one important demographic question in the introductory part of the exam (asking for the student's gender). Again, this could be added on the fly.

With paper-and-pencil tests, this kind of revision is difficult, if not impossible, once students put pencil to paper. Online testing offers the advantage of quick revision at any (reasonable) point in time.

HOW TO CONSTRUCT A GOOD TEST

There are ten basic steps to constructing a good test, whether for delivery online or on paper:

  1. Establish content validity for your objectives: In other words, decide what kinds of questions you will ask, and how they will measure what you want to measure
  2. Write test items
  3. Establish content validity of your test items: This is the single most important step in test development--test items should evaluate the skills that they are intended to assess. A common problem with ESL/EFL examination questions that lack content validity is that they test a skill other than the one intended. For example, the following listening question would lack content validity:

     The student hears: "John and Mary are going shopping today. They'll buy six oranges, two melons, and three candy bars. How many items will they purchase?"

     The task required to answer this question relies on memory and mathematical skills, and only secondarily on English skills. In other words, the student's listening skills could be excellent, but if s/he cannot recall numbers or do mental addition, s/he is likely to answer the question incorrectly. So this is an invalid question.

  4. Do an initial trial of your test: This trial run is to detect bad questions, problems with instructions, and so on. At this stage, you can have any available group trial the test, although the closer the group's profile is to that of your intended test-takers, the more valuable the information you will gather.
  5. Perform an item analysis: It is beyond the scope of this article to explain the statistical tests that can be done in an item analysis. (Any good educational statistics book or website will give this information.) An item analysis will tell you which test items are performing well, and which are not. (A brief sketch of two common item statistics appears after this list.)
  6. Revise your test items and create parallel forms: After you have analyzed your pilot examination, you will need to create a new set of test items, ensuring that they are parallel in form to the original items. This will protect the integrity of the test.
  7. Determine how scores will be used: This part of the process depends greatly on how the outcomes will be used. Primarily, you will need to decide whether the whole group will be divided into roughly equal class sizes by score, or whether firm cut-off scores will be used to determine entrance into a certain level.
  8. Conduct a pilot test: Sample a group of at least 30 people who can take the test twice. This group should be made up of persons similar to those who will take the real test, and should represent the range of students you will ultimately test. To establish the reliability of your test, give the test a second time under the same conditions, and compare the scores on the two administrations by calculating a reliability coefficient, a measure of the consistency of scores on the test (a sketch of one such calculation also follows this list).
  9. Report scores
  10. Conduct ongoing maintenance: Tests must be maintained in order to stay reliable and valid. Item statistics should be checked periodically to ensure that there are no out-of-range values. Because of the lack of security in online testing, parallel items should be used for every new testing period.
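
As mentioned in step 5, two item statistics are especially common: item difficulty (the proportion of students who answer an item correctly) and item discrimination (how well the item separates high scorers from low scorers). The sketch below, written in PHP like the rest of the scripting described later in this article, is purely illustrative and is not part of our examination system; the layout of the $responses array is an assumption made for the example.

    <?php
    // Illustrative item-analysis sketch (not our production code). $responses is
    // assumed to be a zero-indexed array of response vectors, one per student,
    // with 1 for a correct answer and 0 for an incorrect one.
    function item_analysis(array $responses): array
    {
        $numStudents = count($responses);
        $numItems    = count($responses[0]);

        // Total score per student, used to form the upper and lower groups.
        $totals = array_map('array_sum', $responses);
        arsort($totals);                                         // highest scorers first
        $groupSize = max(1, (int) floor($numStudents * 0.27));   // conventional 27% split
        $upper = array_slice(array_keys($totals), 0, $groupSize);
        $lower = array_slice(array_keys($totals), -$groupSize);

        $stats = [];
        for ($item = 0; $item < $numItems; $item++) {
            $correct = 0;
            foreach ($responses as $row) {
                $correct += $row[$item];
            }
            // Difficulty: proportion of all students answering the item correctly.
            $difficulty = $correct / $numStudents;

            // Discrimination: proportion correct in the upper group minus the
            // proportion correct in the lower group.
            $upperCorrect = $lowerCorrect = 0;
            foreach ($upper as $s) { $upperCorrect += $responses[$s][$item]; }
            foreach ($lower as $s) { $lowerCorrect += $responses[$s][$item]; }
            $discrimination = ($upperCorrect - $lowerCorrect) / $groupSize;

            $stats[$item] = [
                'difficulty'     => $difficulty,
                'discrimination' => $discrimination,
            ];
        }
        return $stats;
    }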
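
As for the reliability coefficient mentioned in step 8, one simple choice is the Pearson correlation between the two sets of total scores. Again, this is an illustrative sketch rather than our code, and the variable layout is assumed for the example.

    <?php
    // Illustrative test-retest reliability sketch. $first and $second are assumed
    // to be arrays of total scores from the two administrations, indexed by the
    // same student identifiers.
    function reliability_coefficient(array $first, array $second): float
    {
        $n = count($first);
        $meanX = array_sum($first) / $n;
        $meanY = array_sum($second) / $n;

        $cov = $varX = $varY = 0.0;
        foreach ($first as $id => $x) {
            $dx = $x - $meanX;
            $dy = $second[$id] - $meanY;
            $cov  += $dx * $dy;
            $varX += $dx * $dx;
            $varY += $dy * $dy;
        }
        // Pearson correlation: values near 1.0 indicate consistent scores
        // across the two administrations.
        return $cov / sqrt($varX * $varY);
    }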

WHAT WE DID, TECHNOLOGICALLY SPEAKING

Our item-writing and testing procedure followed roughly that listed above. We piloted the exam over a period of two years with a total of a little more than 300 "live" students, and created a new test for 2002 using parallel forms. Validity testing has been done on all of the items, and non-performing questions have been abandoned.

During the pilot period, we constructed the examination using WebCT, a course management system (CMS). Because this CMS is not built specifically for testing, we found it "too big" for our purposes. However, it does provide a relatively simple interface for test development, scoring, and statistics-keeping, which made it a good platform for the pilot phase.

After deciding to move from WebCT, we chose to use Macromedia Flash 4 to create and deliver our exam. This decision allowed us to keep the overall file size small, to deliver graphics and audio, to interface with databases, and to use a commonly-installed browser plug-in.

The exam is linked to two web-based MySQL databases using PHP scripts. These technologies (MySQL and PHP) are freely available as open-source software and are well suited to connecting web-based applications to databases.
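
For readers unfamiliar with this kind of arrangement, the sketch below shows in broad strokes how a PHP script can sit between the Flash movie and the databases. The database names, credentials, and response variables are placeholders rather than our actual configuration, and the PDO interface shown is a more recent alternative to the MySQL functions that were available when the exam was built.

    <?php
    // Illustration only: a PHP "glue" script between the Flash front end and MySQL.
    // One connection per database: registration data and test results.
    $registration = new PDO('mysql:host=localhost;dbname=esl_registration', 'web_user', 'secret');
    $results      = new PDO('mysql:host=localhost;dbname=esl_results', 'web_user', 'secret');
    $registration->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $results->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

    // Flash 4's loadVariables() reads URL-encoded name/value pairs, so the script
    // answers in that format rather than returning an HTML page.
    echo 'status=ok&message=' . urlencode('Connected to both databases');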

The Student Experience

When the students log into the exam, they see the following screen (Figure 1):


Figure 1. Login Screen

The student types in the identification number supplied by the university and presses "Continue" to start a search of the first online database. If a student enters an invalid number, or has previously started the examination, s/he cannot proceed with the login. A successful login attempt results in the following screen (Figure 2), which confirms the student's identity:

Figure 2. Identity Confirmation Screen
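
The login check itself is a small piece of branching logic. The sketch below shows one way such a check could be written; the table name, the exam_started flag, and the status values are assumptions made for illustration, not a description of our actual schema.

    <?php
    // Illustrative login check. Table, column, and status names are placeholders.
    $db = new PDO('mysql:host=localhost;dbname=esl_registration', 'web_user', 'secret');
    $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

    $id = $_GET['id'] ?? '';
    $stmt = $db->prepare(
        'SELECT first_name, last_name, exam_started FROM students WHERE student_id = ?'
    );
    $stmt->execute([$id]);
    $student = $stmt->fetch(PDO::FETCH_ASSOC);

    if ($student === false) {
        echo 'status=invalid';              // number not in the registration database
    } elseif ($student['exam_started']) {
        echo 'status=in_progress';          // this number has already begun the exam
    } else {
        // Flag the record so the same number cannot be used to log in twice, then
        // return the name shown on the confirmation screen (Figure 2).
        $db->prepare('UPDATE students SET exam_started = 1 WHERE student_id = ?')
           ->execute([$id]);
        echo 'status=ok&name=' . urlencode($student['first_name'] . ' ' . $student['last_name']);
    }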

After the student confirms his or her identity, a series of instructions appears, followed by the test itself. The test includes a listening section, which allows the student to listen only twice to each sound file (Figure 3):

Figure 3. Listening Comprehension Example

The final question of the examination is a writing question, asking for a short essay. All results and tracking details are submitted to a second database. Essays are evaluated by a committee of instructors, who are paid to participate (this is the one area where instructor involvement is still required).
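
As a rough illustration of what submitting results to the second database involves, the sketch below writes a finished exam to a results table. The column names and the shape of the submitted data are assumptions made for the example, not our actual schema.

    <?php
    // Illustrative results submission. Column names and request fields are placeholders.
    $results = new PDO('mysql:host=localhost;dbname=esl_results', 'web_user', 'secret');
    $results->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

    $stmt = $results->prepare(
        'INSERT INTO exam_results
             (student_id, listening_score, grammar_score, essay_text, completed_at)
         VALUES (?, ?, ?, ?, NOW())'
    );
    $stmt->execute([
        $_POST['id'],
        (int) $_POST['listening_score'],
        (int) $_POST['grammar_score'],
        $_POST['essay'],   // stored as submitted; instructors read and score it online
    ]);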

Results Reporting

E-mail notification is sent directly to designated test administrators when a student has completed an examination. A sample results page can be viewed here. Essay answers are posted for teachers to evaluate online. When accessing the results database directly, the results can be sorted on any of the variables, the most important being the aggregate total score. The results can be downloaded and easily imported into a spreadsheet, and further manipulation of the data, including statistical analysis, can be done from there.
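
Two small scripts can handle most of this reporting: one sends the completion notice, and one streams the results table as a comma-separated file that a spreadsheet can open. The sketch below is illustrative only; the addresses, table, and column names are placeholders, and in practice the two halves would live in separate scripts.

    <?php
    // 1. Notification: e-mail the designated administrators when an exam is finished.
    //    (Assumes the web server is configured to send mail.)
    mail(
        'placement-admin@example.edu',
        'Placement exam completed',
        'Student ' . $_POST['id'] . ' has completed the online placement exam.'
    );

    // 2. Export: stream the results table as CSV for import into a spreadsheet.
    $results = new PDO('mysql:host=localhost;dbname=esl_results', 'web_user', 'secret');
    $results->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    header('Content-Type: text/csv');
    header('Content-Disposition: attachment; filename="placement_results.csv"');

    $out = fopen('php://output', 'w');
    fputcsv($out, ['student_id', 'listening_score', 'grammar_score', 'total_score']);
    $query = 'SELECT student_id, listening_score, grammar_score, total_score
                FROM exam_results ORDER BY total_score DESC';
    foreach ($results->query($query) as $row) {
        fputcsv($out, [
            $row['student_id'],
            $row['listening_score'],
            $row['grammar_score'],
            $row['total_score'],
        ]);
    }
    fclose($out);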

Conclusions

As should be clear from the details above, creating one's own online placement test is not as simple as writing a few multiple-choice questions. Of course, depending on a program's need for tracking, and on the importance of the placement outcome, different approaches to test creation can be adopted. However, good testing practice allows few corners to be cut, and in fairness to the students, and to the instructors in whose classes those students will be placed, every effort should be made to create good, reliable, and valid tests.


[1]: For reasons of security, we cannot allow interested parties to view the examination.

Acknowledgment: Our ESL placement test development was underwritten in part by a Classroom Technology Grant from the Educational Technology Services at the University of California, Berkeley.

© Copyright rests with authors. Please cite TESL-EJ appropriately.
