RESEARCH IN EDUCATIONAL EVALUATION AND EXAMINATION A TREND REPORT : PRITAM SINGH, VED PRAKASH
Ever since examinations were formalized as an institution, they have played a key role in influencing the educational process as well as in classifying students for various purposes. Because of this they have attracted the attention of researchers. Research in examinations during the last four decades calls for special attention as it was in these years that it ceased to be just a haphazard pursuit and became a consciously directed effort to study the subject in a concrete frame of reference. Surveys and reviews of studies in this field began to acquire significance. Long and Mehta (1966) published the first mental measurement yearbook about researches done in examination and evaluation. This was followed by the Directory of Behavioural Science Research in India (Pareek and Kumar, 1966). Dave (1968) reviewed researches in the area of examination and evaluation. Subsequently, research in examination and evaluation was reviewed by Buch (1972), Buch and Passi (1974), Passi and Padma (1974) and Passi and Hooda (1986). These surveys followed a system of classifying examination research, identified the problems and suggested the priority areas of research in this field. The previous surveys were based on researches done in India up to 1983. The present report takes into view the researches done till March 1988. A total of 225 studies have been done to date. Of these, 88 are reported to have been submitted as theses for the Ph.D. in Education, 11 in Psychology and two in Physical Education. Besides these, 124 are project reports. All these investigations have been classified in six areas and their periodical distribution is shown in Table 18.1.
Table 18.1

Area                                    Up to   1950-   1955-   1960-   1965-   1970-   1975-   1980-   1985-   Total
                                        1949    54      59      64      69      74      79      84      87
Achievement                             -       -       2       10      16      17      17      10      1       73
Diagnostic Test                         -       -       -       -       5       5       2       2       -       14
Examinations                            -       -       2       12      17      16      17      13      5       82
Factors Affecting Achievement           1       2       5       2       4       3       2       2       -       21
Prediction-Admission-Promotion Studies  -       2       2       6       4       3       3       1       1       22
Failures                                -       -       2       4       2       -       3       1       1       13
Total                                   1       4       13      34      48      44      44      29      8       225

Research in Educational Evaluation and Examinations seems to have been taken seriously by researchers during 1960-79 when a number of Ph.D. studies and research projects in this area were undertaken. Though the problems undertaken were related to various allied fields also, the main emphasis was on Achievement and Examination. The maximum number of studies, i.e. 48 are reported to have been completed during 1965-69.
A slump period in the history of research in Educational Evaluation and Examinations again set in during the eighties. Figures reveal that only 29 studies were undertaken during 1980-84 as compared to 44 during 1975-79. The situation got still worse in 1985-87 when only eight studies were conducted and the area of diagnostic testing remained completely untouched. Two out of six areas were mostly favoured by the researchers. The maximum number of studies (82) is reported to have been done on the Examinations side, followed by Achievement (73). The work done in areas like Failures (13) and Diagnostic Testing (14) is very meagre. Studies on Prediction-Admission-Promotion (22) were also not adequate in number.
Investigators, while standardizing tests of various kinds, have used samples of different sizes and nature. A large number of studies were carried out on samples of fewer than 3000 subjects. It was only in a very few studies that the sample used was as large as 7000 students. The samples for all these studies were drawn from various parts of the country. For studies like construction and standardization of an entrance test, the sample was drawn from a wide range of population. The techniques employed for drawing the samples differed very widely and included stratified sampling, random stratified, clustered multistage randomisation and simple randomisation. The choice was made on the basis of the nature of the related population and the purpose of the study. Besides, in some studies of special kinds, the sample was drawn from amongst teachers, guardians and students.
By and large, the researchers either used examination marks or developed achievement tests for completing their studies. The usual standard steps were employed for constructing achievement tests and question papers. Some researchers rigorously followed procedures such as content analysis, identification of concepts and objectives, development of design, preparation of blueprint, constructing test items, editing, trying out and item analysis. Besides estimating the coefficient of reliability and validity using different methods, norms for certain tests were also established.
It appears from the review that only two item parameters, namely, item difficulty and item discrimination, were widely used. Further, it seems that in quite a few of the investigations conditions required for carrying out item analysis were not seriously taken into consideration. It may be appropriate to mention here that if test performance is dependent upon various demographic and other variables, separate item analysis ought to be carried out for these subgroups. By and large, this was not taken into account.
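The two item parameters mentioned above, and the recommendation to analyse subgroups separately, can be illustrated with a minimal sketch. It assumes dichotomously scored items (1 = correct, 0 = incorrect) and uses the common upper-lower group index for discrimination; the 27 per cent group fraction, the function names and the sample data are illustrative assumptions, not procedures taken from the studies reviewed.

```python
from statistics import mean


def item_difficulty(responses, item):
    """Proportion of examinees who answered the item correctly."""
    return mean(row[item] for row in responses)


def item_discrimination(responses, item, fraction=0.27):
    """Upper-lower index: difficulty in the top group minus the bottom group.

    Groups are formed by ranking examinees on total score; `fraction`
    (assumed 27 per cent here) sets the size of each extreme group.
    """
    ranked = sorted(responses, key=sum, reverse=True)
    k = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:k], ranked[-k:]
    return item_difficulty(upper, item) - item_difficulty(lower, item)


def item_analysis_by_subgroup(responses, groups):
    """Repeat the item analysis separately within each subgroup label."""
    result = {}
    for label in sorted(set(groups)):
        subset = [r for r, g in zip(responses, groups) if g == label]
        n_items = len(subset[0])
        result[label] = [
            (item_difficulty(subset, i), item_discrimination(subset, i))
            for i in range(n_items)
        ]
    return result


# Hypothetical scored responses: four examinees, three items.
sample = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
sample_groups = ["urban", "urban", "rural", "rural"]  # hypothetical labels
print(item_analysis_by_subgroup(sample, sample_groups))
```

With real data each subgroup would need enough examinees for the extreme groups to be meaningful; the four-row sample only shows the mechanics.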
Various methods have been employed for computing reliability such as analysis of variance, split-half, parallel form, KR-20, KR-21, and test-retest. Though many researchers used the test-retest method, the time gap between two administrations had been fixed casually. None of the studies introduced multiple time gaps. Multiple time gaps with test-retest or parallel forms should be studied to establish the stability coefficients. It is also observed that, in quite a few of the studies related to essay-type tests, content reliability had not been computed. The marks reliability for essay-type tests can be obtained as the product of content reliability and examiner reliability; in other words, it is the content reliability attenuated by examiner unreliability. Besides, the standard error of measurement, which adds to the interpretation of the reliability of a test, seems not to have been reported in many studies. It is suggested that large-scale multi-facet studies of reliability, aiming to find out the error components due to various identifiable factors such as items, examiners, instructions, occasions, etc., be carried out.
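Some of the quantities named above can be sketched as follows. The sketch assumes dichotomously scored items and uses the population variance of total scores in the textbook KR-20 and KR-21 formulas; the function names and all figures in the usage lines are hypothetical illustrations, not results from the studies reviewed.

```python
from math import sqrt
from statistics import mean, pvariance


def kr20(responses):
    """Kuder-Richardson formula 20 for items scored 0 or 1."""
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    var_total = pvariance(totals)
    sum_pq = 0.0
    for i in range(k):
        p = mean(row[i] for row in responses)  # item difficulty
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)


def kr21(responses):
    """Kuder-Richardson formula 21, which assumes items of equal difficulty."""
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    m = mean(totals)
    return (k / (k - 1)) * (1 - m * (k - m) / (k * pvariance(totals)))


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability)."""
    return sd * sqrt(1 - reliability)


def marks_reliability(content_reliability, examiner_reliability):
    """Essay-test marks reliability as the product described in the text."""
    return content_reliability * examiner_reliability


# Hypothetical figures for illustration only.
sample = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(kr20(sample), kr21(sample))
print(standard_error_of_measurement(sd=10, reliability=0.84))
print(marks_reliability(content_reliability=0.8, examiner_reliability=0.9))
```

Reporting the standard error of measurement alongside the coefficient lets a reader judge how far an individual obtained score may lie from the true score.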
Of 225 studies included in the report, as many as 180 are of a descriptive and correlation type, 23 are factor analytical, 15 regression analytical, one regression and factor analytical and six experimental. Experimental studies have used simple pre-test and post-test design with one treatment and one control group. The factor analytical studies have aimed at either evaluating the factorial validity of different instruments or classifying different school subjects into new families of subjects or examining the nature of factors involved in the test batteries. The regression studies have aimed at establishing multiple regression equations and multiple correlation between predictors like SSC examination marks and criteria like college grades.
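As a rough illustration of what such regression studies compute, the sketch below fits a multiple regression equation by ordinary least squares (solving the normal equations) and then obtains the multiple correlation R as the correlation between predicted and observed criterion scores. The two predictors, the criterion values and all function names are hypothetical; the sketch also assumes the predictors are not perfectly collinear.

```python
from math import sqrt


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, copied so the inputs are left untouched.
    m = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for j in range(col, n + 1):
                m[row][j] -= factor * m[col][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_multiple_regression(predictors, criterion):
    """Return least-squares weights [intercept, b1, b2, ...]."""
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, criterion)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(weights, predictor_row):
    """Apply the regression equation to one examinee's predictor scores."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], predictor_row))


def multiple_correlation(weights, predictors, criterion):
    """Multiple R: correlation between predicted and observed criterion."""
    predicted = [predict(weights, p) for p in predictors]
    mp = sum(predicted) / len(predicted)
    mc = sum(criterion) / len(criterion)
    cov = sum((a - mp) * (b - mc) for a, b in zip(predicted, criterion))
    var_p = sum((a - mp) ** 2 for a in predicted)
    var_c = sum((b - mc) ** 2 for b in criterion)
    return cov / sqrt(var_p * var_c)


# Hypothetical data: two examination marks per student and a college grade.
marks = [(55, 60), (62, 58), (70, 75), (48, 52), (81, 79), (66, 70)]
grades = [58, 61, 74, 50, 82, 67]
weights = fit_multiple_regression(marks, grades)
print(weights, multiple_correlation(weights, marks, grades))
```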
A review of studies on achievement tests indicates that investigators have constructed and standardized achievement tests in various areas such as general scholastic achievement tests, achievement tests in language, social sciences, mathematics and sciences. Besides these, efforts have also been made to develop some tests of miscellaneous types. A brief review of all these investigations following the said order is given below.
Tests of general scholastic achievement are tests of general educational development. They are of special interest because they measure complex learning outcomes that cut across subject-matter lines and are common to the major content areas of school. They emphasize understanding, interpretative skills and the ability to apply knowledge and skills to new situations. Since the learning outcomes measured by these tests are closely related to the ultimate objectives of education, such tests are especially likely to have a desirable influence on curriculum and teaching methods. These tests can be made use of to measure the general rate of student learning and to group students for various educational tasks. Keeping in view the usefulness of such tests for teachers as also for guidance workers, concerted efforts are needed for their development. Some efforts have been made by Parikh (1946 b), Liddle (1965), Jha (1974), Sharma (1975), Sharma (1976), Patel (1977), De (1979), and Shah (1982). Lele and Parikh (1965) constructed a Scholastic Aptitude Test for admission to preparatory science courses which comprised three sub-tests in English, Numerical Ability and Abstract Reasoning. Liddle (1965) standardized an Academic Aptitude Test for high school students of Uttar Pradesh. It consisted of tests of vocabulary, numerical computation, sentence completion and mathematical reasoning. The coefficients of reliability for each subtest as well as for the total test ranged from 0.83 to 0.89. The concurrent validity coefficients against scholastic achievement in terms of total scores ranged from 0.46 to 0.76. It appears from the review that the general scholastic achievement tests have been developed covering the subjects, English, Hindi, general science, mathematics, history and geography. These tests were constructed for the students of grades V to XI, in the states of UP, Gujarat, Punjab, Rajasthan and West Bengal. However, they have not been developed for the subjects of social studies, physics, chemistry and regional languages. Such tests for affective and psychomotor domains have not been developed at all. Some of the tests developed were content-oriented rather than oriented to basic skills and general educational development, which restricts their usefulness in these areas. Efforts should be made to overcome this limitation.
It is revealed from the review of research studies that language-skill-based achievement tests have been developed in English, Hindi, Gujarati, Oriya, Marathi and Kannada. In the English language, as many as 11 tests for Grades VI to XI have been constructed. A good deal of work has been done in this area by Aram, Rangaswamy & Feroze (1957), Buch, Patel and Kotwal (1960), Misra (1970), Deshpande (1972), Sinha (1967), Chatterji et al. (1970), Patel (1971), and Skariah (1981). As regards the Hindi language, a few achievement tests have been constructed by Shukla and Tutoo (1959), the CIE (1962), the Gujarat Research Society (1963), Jha et al. (1964), Sharma (1967), Deshpande (1972), Gaur (1973), Giri (1976), Verma (1977) and Joshi (1980). These studies provided tests for grades V to SSC level.
Amongst regional languages, Gujarati got the maximum attention from researchers in the area of achievement test construction. Efforts in this direction have been made by Buch, Patel and Kotwal (1960), Bhagatwala (1960), Maniar (1961), the Gujarat Research Society (1963), Bhatt (1971), Krishnamurti (1971), Maniar (1973), Pandya (1973), Parekh (1973), Desai (1974), Gohil (1974), Modi (1975), Bisnagari (1976), Patel (1978) and Upadhyaya (1979). Achievement tests in Gujarati are available for all grades from V to pre-university and for children in the age group three to five years. Besides, there are three other regional languages, namely, Kannada, Oriya, and Marathi, for which tests have been constructed. Dash (1967) standardized an achievement test in Oriya for Grade VII students, whereas Deshpande (1972) developed an objective assessment tool in Marathi for students appearing for the secondary education examination in Maharashtra. Shivananda (1981) standardized Reading Tests in Kannada for pupils of standards V to VII, separately.
It appears from the review that except in four regional languages, namely, Gujarati, Marathi, Oriya and Kannada, no systematic and sustained efforts have been made to standardize tests in regional languages. Therefore, it is not only proper but highly desirable to gear up work on the construction of achievement tests in languages at the national level as well as at the regional levels. It is worth noting that the studies in the areas of reading speed, reading comprehension and listening comprehension are restricted to the Gujarati language. Efforts should be made to coordinate research studies carried out at the M.Ed. level also to have a better perspective. Attempts should also be made to develop achievement tests in each regional language for all grade levels.
The present inadequacy of tests in the social sciences urgently demands the designing of tests in this important area. Some efforts for the development of achievement tests in the social sciences were made by Aram et al. (1957), Shukla and Tutoo (1959), Buch et al. (1960), the Gujarat Research Society (1963), Saraf (1964), the SIE, Kerala (1965), Dash (1967), Muzaffar (1967), Srivastava (1967), Misra (1968), Vanajakshi (1970), Misra (1970), Deshpande (1972), and Sharma (1981). They developed tests in social studies, history, geography and civics only. The social studies tests were constructed for students of grades IV to VIII, in history for grades V to XI, in geography for grades V to VIII, X and XI and in civics for grades IX to XI.
It is observed that, by and large, achievement tests in social studies, history and geography are available for grades V to XI devised on the basis of studies of samples from various states. However, the general observation is that achievement tests in the social sciences have not been developed in all the subjects, even within a state, and for subjects like economics, sociology, etc., they have not been standardized at all. An interesting investigation was carried out by Tiwari (1982) in which he tried to make a comparative study of trends of achievement measurement in civics in higher secondary examinations of various Boards of Secondary Education. His findings reveal a lot of inconsistency with regard to difficulty level, objectives tested, etc.
Compared to the position in other disciplines, more tests have been constructed in mathematics, including arithmetic, algebra, geometry and trigonometry. Some investigators like Aram et al. (1957), Maniar (1961), the SIE, Kerala (1965), Dash (1967), Kulkarni et al. (1970), Misra (1970), Vanajakshi (1970) and Bhatt (1971) constructed achievement tests in mathematics either for a doctorate degree or for institutional projects. As regards test construction in arithmetic, Chickermane (1943), Dave (1958), Buch et al. (1960), Pendharkar (1965) and Basu (1969) made some efforts and provided tests for grades III to X for the states of Maharashtra, Gujarat, Mysore and West Bengal. Besides these, Gokhale (1954) developed an achievement test for geometry, Jha (1974) for arithmetical and geometrical concepts, Gupta (1974) for algebra, trigonometry and geometry, and Tewari (1975) for arithmetic, algebra and geometry. Sharma (1978) constructed a battery of sequential achievement tests for classes V to X. Besides, Ketkar (1982) standardized unit achievement tests for standard VIII for pupils studying in Maharashtra. With reference to the categories of Guilford's Structure of Intellect Model, Chauhan (1982) constructed and standardized an achievement test in algebra for class IX.
It is observed that, by and large, validation of achievement tests in mathematics has been done using achievement scores as the external criterion. It is, therefore, desirable to establish other types of validity coefficients. Another important thing which needs mention here is that, with change of curriculum in mathematics, the available achievement tests would become obsolete. Therefore, it is not only proper but highly desirable to develop corresponding measuring tools.
A fairly large number of studies have been undertaken in the sciences. Many of these are studies of test construction in general science, physics, chemistry, botany, zoology and home science. Studies carried out in the area of general science were by Aram et al. (1957), Buch et al. (1960), Saxena (1960), Gupta (1962), the Gujarat Research Society (1963), the SIE, Kerala (1965), Dash (1967), Sheth (1967), Rup Prakash (1968), Vanajakshi (1970), the SCERT, Hyderabad (1971), Bhatta (1971) and Hira Devi (1973). By and large, the tests are available for grades V to VIII of the states of Tamil Nadu, Punjab, Haryana, Maharashtra, Andhra Pradesh, Gujarat, Kerala and Orissa. For subjects like physics and chemistry, a few investigators like Bountra (1970), Gupta (1974), Tewari (1975), Sali (1977), Chhaya (1978) and Khandelwale (1981) standardized tests for high school and college students. They have drawn their samples from Uttar Pradesh, Haryana and Maharashtra. Kapoor (1968) and Garg (1969) standardized achievement tests in home science for secondary and higher secondary students of Uttar Pradesh respectively.
It appears from the review that achievement tests in other science subjects have not been constructed so far. Thus, it is desirable to develop more and more tools to keep pace with changing syllabi in science.