The number of computers in schools for instructional use has increased dramatically in recent years to nearly nine million nationwide (Anderson and Ronnkvist, 1999). The ratio of students to computers is around 5 to 1. Internet connectivity continues to be a powerful force in education, with technology being integrated more fully into the educational process (Market Data Retrieval, 1999). Though the statistics differ somewhat by grade, the vast majority (80-88 percent) of students report that they use computers at home or school to write short stories or papers and to learn things (U.S. Department of Education, 1998). Educators are using technology for instructional purposes, including assessment, and it is clear that the testing process can be enhanced through the use of technology (Sampson, 2000).
Educators use assessment for a variety of reasons that range from individual student assessment to program evaluation to system-wide educational accountability (Wiggins, 1993). Good assessment provides objective information that can assist in diagnosing problems and identifying curricular areas that need improvement (Anastasi, 1988). Assessments can help an individual find his or her way in the decision-making journey or help a school system travel the road to educational excellence. To navigate the journey accurately and safely, certain guidelines or rules of the road should be followed.
This paper highlights the rules or best practices that educators should follow in evaluating, selecting, and using technology-delivered assessments. The guidelines adhere to existing professional standards such as the Standards for Educational and Psychological Testing (1999), Responsibilities of Users of Standardized Tests (1985), and the Code of Fair Testing Practices in Education (1988). The guidelines are presented in checklist form so that educators can use them to perform an initial evaluation of whether a technology-delivered assessment is of sufficient quality to be accepted for a particular educational use. Adherence to these guidelines is of vital importance in reviewing, selecting, and using technology-delivered assessment instruments.
* The test content has been described by the developer, reviewed by the user, and matches the purpose of the testing. Assessment items are aligned with the areas intended for assessment.
* Clear and supportable statements are provided by the test developer about what the test is intended to measure.
* Evidence is provided by the test developer and has been reviewed by the test user to determine if the assessment is appropriate for the test taker with regard to age, membership in a subgroup, educational level, disability, and language competence.
* There is ample evidence of comparability in the scores between the paper-and-pencil and technology-delivered versions of the same test; where necessary, appropriate adjustments are made to scores to better assure comparability.
* Evidence is provided to indicate the conditions under which the test results have been found reliable. The strength of that reliability has been reported by the test developers and examined by the individual who is considering using the test.
* Validity evidence is provided by the test developer to substantiate suggested uses and interpretations. Test users apply and interpret the test results for purposes consistent with the validity evidence.
* Limitations of the test and test results are clearly specified by the test developer and examined by the test users prior to selection of the testing instrument. No assessment is without limitations.
* Published reviews of the assessment written by qualified persons have been consulted prior to test selection and use.
* Test items and technical information have been examined to determine their currency. The age of the test and of its technical information is carefully considered before the use of antiquated data or test versions is justified.
* Assessment results are not over- or under-interpreted. Care has been taken not to develop interpretations or explanations that go beyond that which can be supported by the reliability and validity evidence.
* Test reports and supporting manuals specify appropriate and inappropriate uses of the assessment results.
* Reports include an indication of the degree of accuracy of the scores.
* Score reports specify which interpretations are supported by research and which interpretations are based on professional opinion. Sufficient information has been offered to allow the test taker to weigh the credibility of the opinion.
* Interpretations of test scores include the limitations of the test and test results, with reference to common misinterpretations of the assessment results.
* If scores are used in high-stakes situations such as graduation, promotion, college entrance, placement, and credentialing, an explanation on how the passing or cutoff scores were set is available to the test user.
* The score and interpretation report indicates where further information can be obtained about the test and score interpretation, and how a test taker can verify or challenge the accuracy of the score.
* When interpreting a test score, relevant aspects of other information about the individual, such as previous test scores, educational level, and other performance indicators, are used to augment the interpretation. This information is used to help the test taker gain further insight into the meaning of his or her test results.
* Testing equipment is in good working order and the software and/or Internet connectivity has been verified to be operating properly.
* A site administrator is available to troubleshoot problems that may occur due to equipment, software, or other technology failures.
* Policies and procedures have been established, explained to the test taker, and applied in cases of technology failure. For example, if there is a computer crash or a power disruption, are the responses to the test items saved, or does the test taker need to begin the task again?
* Test takers are comfortable with the test format and use of keyboard or other equipment. If there is a question about the test takers' familiarity with the technology, practice exercises or tutorials have been provided to enable them to become facile with the equipment and the format, allowing the test taker to focus on the assessment rather than the mode of delivery.
* Test items and answers are protected from compromise. Security of the equipment and test items is critical to fairness for test takers in current and future test administrations.
* The identity of the test taker is verified, particularly in high-stakes testing.
* Tests are administered according to the procedures specified by the test developer, particularly in cases where standardization is important.
* Both test users and test takers are informed about whether individual score information is stored and, if so, where and for how long. Periodic purges of individual test results stored locally or centrally may be advantageous in maintaining privacy. It may be more desirable for an individual to save test results on a personal disk rather than on a server or local computer.
* Test developers provide information indicating whether they subscribe to precepts of the various testing standards produced by professional organizations. Professional test developers who pledge adherence to various testing standards do attempt to produce high-quality assessments with sufficient technical and research support.
* Human judgment is provided where assessment results suggest specific actions on the part of the test taker or call for interventions to modify a situation. Such judgment will never be replaced by technology alone.
* American Counseling Association (http://www.counseling.org)
* American Educational Research Association (http://www.aera.net)
* American Psychological Association, Science Directorate (http://www.apa.org/science)
* American Speech Language Hearing Association (http://www.asha.org)
* Association for Assessment in Counseling (http://www.aac.ncat.edu)
* Association of Test Publishers (ATP) (http://www.testpublishers.org)
* Consortium for Equity in Standards and Testing (CTEST) (http://www.csteep.bc.edu/ctest)
* Council for Exceptional Children, Directory of Current Projects in Assessment (http://www.cec.sped.org/osep/6assessm.htm)
* Education Commission of the States (http://www.ecs.org)
* ERIC Clearinghouse on Assessment and Evaluation (http://www.ericae.net)
* ERIC Clearinghouse on Counseling and Student Services (http://www.uncg.edu/edu/ericass)
* Joint Committee on Testing Practices (http://www.apa.org/science/jctpweb.html)
* National Assessment of Educational Progress (http://www.nces.ed.gov/naep)
* National Assessment of Educational Progress (http://www.ed.gov/pubs/ncesprograms/assessment/surveys/naep.html)
* National Association of School Psychologists (http://www.naspweb.org)
* National Association of Test Directors (http://www.natd.org)
* National Center on Educational Outcomes (http://www.coled.wmn.edu/nceo)
* National Center for Education Statistics (http://www.nces.ed.gov/index.html)
* National Center for Research on Evaluation, Standards, and Student Testing (CRESST) (http://cresst96.cse.ucla.edu/index.htm)
* National Council on Measurement in Education (http://www.ncme.org)
American Educational Research Association, American Psychological Association, and National Council on Measurement in Education. (1999). Standards for Educational and Psychological Testing. Washington, DC: American Educational Research Association.
Anastasi, Anne (1988). Psychological Testing. New York, NY: Macmillan Publishing Company.
Anderson, R.E. & Ronnkvist, A. (1999). The Presence of Computers in American Schools. Center for Research on Information Technology and Organizations, University of California, Irvine and The University of Minnesota, June 1999.
Joint Committee on Testing Practices. (1988). Code of Fair Testing Practices in Education. Washington, DC: National Council on Measurement in Education.
Market Data Retrieval. (1999). Technology in Education, 1999. Shelton, CT: Author.
Sampson, J.P., Jr. (2000). Using the Internet to Enhance Testing in Counseling. Journal of Counseling and Development, Summer 2000.
U.S. Department of Education, National Center for Education Statistics, The Condition of Education 1998, NCES 1999-011. Washington, DC: U.S. Government Printing Office.
Wiggins, Grant P. (1993). Assessing Student Performance: Exploring the Purpose and Limits of Testing. San Francisco, CA: Jossey-Bass Publishers.
ERIC Digests are in the public domain and may be freely reproduced and disseminated. This publication was funded by the U.S. Department of Education, Office of Educational Research and Improvement, Contract No. ED-99-CO-0014. Opinions expressed in this report do not necessarily reflect the positions of the U.S. Department of Education, OERI, or ERIC/CASS.