
Document: Training gifted high school English students. Special topic: Assessment and Testing in EFL Class at High School Level


Thai Binh Gifted High School
Assessment and Testing in EFL Class at High School Level
by Nguyen Thi Hong Hung

A. INTRODUCTION

It is undeniable that English is an international language. In Viet Nam, English has become more and more important. It is now commonly used in technology, science, education, culture and economic activities. This has led to an increasing demand for English learning and teaching throughout the country. There are many language centers in big cities. More importantly, English has become a compulsory subject in schools and universities. More attention has been paid to English teaching and learning in gifted schools, especially in gifted English classes.

Nevertheless, teaching and learning English successfully is not simple. Besides teaching writing, reading, listening and speaking, teachers need to be aware of the importance of assessing and testing. Testing and assessing help teachers and students adjust their teaching and learning methods so that they can achieve the best results. Within the limited scope of this paper, we would like to deal with issues that relate to “assessment and testing in EFL at high school level”. In writing this paper, we aim to present some theoretical knowledge related to assessment and testing, together with different testing and assessment activities.

B. BODY

Part 1. Fundamentals in theories of assessment and testing

1.1 Assessment, testing, measurement, evaluation

It is important to distinguish between these four terms. Many people in applied linguistics seem to make no difference, or at least no special distinction, between them. However, these terms, although related, may have different goals. Below is the distinction between them discussed by Nitko (2011).

Assessment is a broad term defined as a process for obtaining information that is used for making decisions about students, curricula and programs, and educational policy.

A test is a narrower concept than assessment.
It is defined as an instrument or systematic procedure for observing and describing one or more characteristics of a student using either a numerical scale or a classification scheme.

Measurement is defined as a procedure for assigning numbers (usually called scores) to a specific attribute or characteristic of a person in such a way that the numbers describe the degree to which the person possesses the attribute.

Evaluation is defined as the process of making a value judgment about the worth of a student’s products or performance.

There may be different definitions of these terms, but they all share a common feature: the terms are interrelated within an assessment process.

ACTIVITY 1
Decide the truth or falsity of each of the following statements. Defend your answer.
a. To make evaluations, one must use measurements.
b. To measure an important educational attribute of a student, one must use a test.
c. To evaluate a student, one must measure that student.
d. To test a student, one must measure that student.
e. Any piece of information a teacher obtains about a student is an assessment.
f. To evaluate a student, one must assess that student.

1.2. Purposes of assessment

Before designing any assessment, teachers need to ask themselves “What is the purpose of the assessment?” One way to answer the question is to examine two considerations thoroughly: what kind of information you want to obtain from the assessment, and how the information obtained will be used. Nitko (2000) discussed some of the decisions teachers may wish to consider.

a. Instructional management decisions
Your classroom is a decision-rich environment.
You must make many decisions, including planning instructional activities, placing students into learning sequences, monitoring students’ progress, diagnosing students’ learning difficulties, providing students and parents with feedback about achievements, evaluating teaching effectiveness, and assigning grades to students. Hence, assessment can be used to make decisions on:
- Instructional diagnosis and remediation
- Feedback to students
- Feedback to the teacher
- Modeling learning targets
- Motivating students
- Assigning grades to students

b. Selection decisions
Assessment can be used to provide part of the information on which selection decisions are based. For example, college admissions are often selection decisions: some candidates are admitted and others are not.

c. Placement decisions
Placement decisions are made to assign students to different levels of the same general type of instruction, education or work. For example, teachers may assess students to decide whether they are placed in a fast-track or a regular group.

d. Classification decisions
Sometimes we must make a decision that results in a person being assigned to one of several different but unordered categories or programs. For example, students may be assessed in order to be classified into groups with different learning styles or interests.

e. Counseling and guidance decisions
Assessment results are frequently used to assist students in exploring and choosing careers and in directing them to prepare for the careers they select. In this case, a series of assessments is combined, instead of using a single assessment result.

f. Credentialing and certification decisions
Credentialing and certification decisions are concerned with assuring that a student has attained certain standards of learning.
ACTIVITY 2
Circle those that can be the purpose of assessment in the following box.

To grade
To punish students
To report the progress
To get promotion
To adjust your teaching
To compare two teaching methods
To pilot a new scheme
To motivate students
To identify areas for improvement
To raise money
To show off
To create learning opportunities
To end a program
To keep students in school

Now, reflect on your own assessing experience: why do you assess your students? What kind of decisions have you made, and in which assessing situation?
------------------------------------------------------------
------------------------------------------------------------
------------------------------------------------------------

1.3. Types of assessment

List of terms used to describe and classify assessments

Criteria of classification: Types of assessments
- By kind of item: choice items (true-false, multiple choice, matching); completion items; short-answer items; essay items
- By how student performance is scored: objective assessment; subjective assessment
- By degree of standardization: standardized assessments; nonstandardized assessments
- By administrative conditions: individual assessments; group assessments
- By the basis for interpreting scores: norm-referencing; criterion-referencing
- By the use of assessment result: formative assessment; summative assessment; diagnostic assessment; placement/classification/selection assessment
- By the time of assessment: final assessment; continuous/on-going assessment
- By the domain to be assessed: product assessment; process assessment
- By the language emphasis of the scoring: verbal assessment; performance assessment

a. Objective versus subjective assessment
True-false and multiple-choice tests are said to be objective because once the scoring key is set, nearly everyone who scores a student’s responses arrives at the same result. Essay items, portfolios, and performance assessments, on the other hand, have a history of being scored differently by different persons, and differently by the same person on different occasions. Because of this, they are said to be subjective methods of assessment.

b. Standardized versus nonstandardized assessment
Standardization can improve the objectivity of assessments as well as the validity of interpreting the results. Standardization is the degree to which the observational procedures, administrative procedures, equipment and materials, and scoring rules have been fixed so that the same procedure occurs at different times and places, insofar as is possible.

c. Norm versus criterion referencing
Norm-referencing interpretations describe assessed performance in terms of a person’s position in a reference group that has been administered the assessment.
For example, you may report a student’s performance on a test as being ‘better than 80% of the class’. This report expresses the student’s standing in a reference group, but it does not state what the student knows or is able to perform. The reference group is called the norm group. Criterion-referencing interpretations describe assessed performance in terms of the kinds of tasks a person with a given score can do. It is important to note that both kinds of interpretation are needed to understand how well a student is learning.

d. Formative versus summative assessment
Formative assessment is designed to assist the learning process by providing feedback to the learner, which can be used to highlight areas for further study and hence improve future performance. Self-assessment and diagnostic assessment are types of formative assessment with specific purposes.

Summative assessment is for progression and/or external purposes; it is given at the end of a course and designed to judge the student’s overall performance. Summative assessment is most useful for those external to the educative process who wish to make decisions based on the information gathered. It generally provides a concise summary of a student’s abilities which can easily be understood as a pass/fail or a grade. It is, however, not useful for communicating more complex data about a student’s individual abilities, which could be used to inform further study and improve student performance.

e. Final versus continuous/on-going assessment
Final assessment is taken at the end of a course, while continuous or on-going assessment is spread throughout the course. The primary advantage of final assessment is that it is simple to organize and condenses the assessment process into a short space of time. This means, however, that the timing of the examination becomes of great importance. Illness at an unfortunate time can unduly influence the result. Moreover, final assessment cannot be used for formative purposes.
The main advantage of continuous or on-going assessment is that both teachers and students obtain feedback from the process, which can then be used to improve teaching and learning. Disadvantages include the increased workload inherent in this mode of assessment, and the difficulties associated with students from different backgrounds tackling the same material and being assessed in exactly the same way.

f. Product versus process assessment
With the rapidly changing nature of modern society, increased emphasis is being placed on skills and abilities rather than knowledge. It is therefore important to consider whether you wish to assess the product of student learning or the process undertaken. Product-driven assessments are usually easier to create, as the assessment criteria seem to be more tangible. They can also be more easily summarized. Process-based assessment, however, can give more useful information about skills, and can therefore highlight for students the importance of learning generalized techniques rather than specific knowledge.

g. Verbal versus performance assessment
Verbal assessment calls for observing the verbal responses of students: for example, how well they can define words, explain their answers, or describe similarities or differences between concepts. Most school assessments are verbal because schools emphasize verbal attributes. Other assessments are crafted to elicit and observe nonverbal responses: assembling objects, completing experiments, performing psychomotor activities, and so on. These are called performance assessments. Although performance assessments emphasize nonverbal responses, verbal or language ability is also necessary. When assessing school learning targets, performance assessment focuses on a student’s ability to apply and use knowledge from several areas to make something, produce a report, or give a demonstration.
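
The two interpretations contrasted in 1.3(c) can be made concrete with a short Python sketch. Everything in it is assumed for illustration only: the class scores, the 10-point scale, the can-do descriptors with their minimum scores, and the function names percentile_rank and tasks_mastered are hypothetical and are not taken from Nitko.

```python
def percentile_rank(score, group_scores):
    """Norm-referenced view: percentage of the norm group scoring below `score`."""
    if not group_scores:
        raise ValueError("the norm group must contain at least one score")
    below = sum(1 for s in group_scores if s < score)
    return 100.0 * below / len(group_scores)


def tasks_mastered(score, descriptors):
    """Criterion-referenced view: tasks a student with this score can do."""
    return [task for minimum, task in descriptors if score >= minimum]


# Hypothetical norm group: ten students' scores on a 10-point reading test.
class_scores = [4.0, 5.5, 6.0, 6.5, 7.0, 7.0, 7.5, 8.0, 8.5, 9.0]

# Hypothetical can-do descriptors with the minimum score assumed for each.
descriptors = [
    (5.0, "can find specific details in a short text"),
    (7.0, "can identify the main idea of a paragraph"),
    (9.0, "can infer the writer's attitude"),
]

student_score = 8.5
rank = percentile_rank(student_score, class_scores)
print(f"Better than {rank:.0f}% of the class")
for task in tasks_mastered(student_score, descriptors):
    print("-", task)
```

With these assumed scores, the first statement reports the student's standing in the norm group, while the list that follows states what the student can do. The same score therefore supports both kinds of interpretation, which is why the text treats them as complementary.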

ACTIVITY 3
For questions 1-5, match the situations in which a teacher sets a test with the reasons for assessment listed A-F. There is one extra option which you do not need to use.

1. The teacher has a new class. On the first day of the course, she sets a test which covers some language points she expects the students to be familiar with and others that she thinks the students may not know. The students do not prepare for the test.
2. The teacher notices that his intermediate students are making careless mistakes with basic question formation, which they should know. He announces that there will be a test on this the following week. The students have time to prepare for the test.
3. The students are going to take a public examination soon. The teacher gives them an example paper to do under test conditions.
4. The teacher monitors students whenever they carry out speaking tasks and keeps notes about each student.
5. The class has recently finished a unit of the course book which focused on the use of the present perfect simple with ‘for’ and ‘since’. The teacher gives the class a surprise test on this.

Reasons for assessment
A. To familiarize students with the test format
B. To allow the teacher to plan an appropriate scheme of work
C. To show students how well they have learned specific language
D. To allow students to assess each other
E. To motivate the students to revise a particular language area
F. To assess students’ progress on a continuous basis

ACTIVITY 4
For questions 1-6, match the assessment aims with the assessment types listed A-G. There is one extra option which you do not need to use.

Assessment aims
1. to put students into a class at the correct level
2. to identify how much the class already knows about particular language items
3. to give students a test on language taught in the latest unit of their course
4. to keep a record of students’ performance, based on work completed throughout the course
5. to help students evaluate their own progress
6. to see how well students perform at the end of a course

Assessment types
A. continuous assessment
B. placement tests
C. diagnostic tests
D. peer assessment
E. self-assessment
F. achievement tests
G. progress test

ACTIVITY 5
Which of the following are true for summative assessment (SA) or formative assessment (FA)? Write SA or FA in the appropriate box.
- Is the practice of building a cumulative record of student achievement
- Assists you to make judgments about student achievement at certain relevant points in the learning process or unit of study (e.g. end of course, project, semester, unit, year)
- Assists teachers in modifying or extending their programs or adapting their learning and teaching methods
- Can be used formally to measure the level of achievement of learning outcomes (e.g. tests, labs, assignments, projects, presentations, etc.)
- Is used to monitor students’ ongoing progress and to provide immediate and meaningful feedback
- Is very applicable and helpful during early group work processes
- Can also be used to judge program, teaching and/or unit of study effectiveness (that is, as a form of evaluation)
- Usually takes place during day-to-day learning experiences and involves ongoing informal observations throughout the term, course, semester, or unit of study

1.4. Assessment OF Learning versus Assessment FOR Learning

During the last few decades, there has been increasing tension between assessment of learning and assessment for learning, whose fundamental differences lie in their assessment purposes. These are discussed in detail in Earl and Katz (2006).

Assessment OF learning
Assessment of learning refers to strategies designed to confirm what students know, demonstrate whether or not they have met curriculum outcomes or the goals of their individualized programs, or to certify proficiency and make decisions about students’ future programs or placements.
It is designed to provide evidence of achievement to parents, other educators, the students themselves, and sometimes to outside groups (e.g., employers, other educational institutions). Assessment of learning is the assessment that becomes public and results in statements or symbols about how well students are learning. It often contributes to pivotal decisions that will affect students’ futures. It is important, then, that the underlying logic and measurement of assessment of learning be credible and defensible.

Effective assessment of learning requires that teachers provide
i. a rationale for undertaking a particular assessment of learning at a particular point in time
ii. clear descriptions of the intended learning
iii. processes that make it possible for students to demonstrate their competence and skill
iv. a range of alternative mechanisms for assessing the same outcomes
v. public and defensible reference points for making judgments; transparent approaches to interpretation
vi. descriptions of the assessment process
vii. strategies for recourse in the event of disagreement about the decisions

Assessment FOR learning
Assessment for learning occurs throughout the learning process. It is designed to make each student’s understanding visible, so that teachers can decide what they can do to help students progress. In assessment for learning, teachers use assessment as an investigative tool to find out as much as they can about what their students know and can do, and what confusions, preconceptions, or gaps they may have.

The Assessment Reform Group (2002) defines assessment for learning as the process of seeking and interpreting evidence for use by learners and their teachers to find out where the learners are in their learning process, where they need to go and how to get there. They suggest ten principles of Assessment for Learning, as follows.

i. Assessment for learning should be part of effective planning of teaching and learning
A teacher’s planning should provide opportunities for both learner and teacher to obtain and use information about progress towards learning goals. It also has to be flexible to respond to initial and emerging ideas and skills. Planning should include strategies to ensure that learners understand the goals they are pursuing and the criteria that will be applied in assessing their work. How learners will receive feedback, how they will take part in assessing their learning and how they will be helped to make further progress should also be planned.

ii. Assessment for learning should focus on how students learn
The process of learning has to be in the minds of both learner and teacher when assessment is planned and when the evidence is interpreted. Learners should become as aware of the ‘how’ of their learning as they are of the ‘what’.

iii. Assessment for learning should be recognized as central to classroom practice
Much of what teachers and learners do in classrooms can be described as assessment. That is, tasks and questions prompt learners to demonstrate their knowledge, understanding and skills. What learners say and do is then observed and interpreted, and judgments are made about how learning can be improved. These assessment processes are an essential part of everyday classroom practice and involve both teachers and learners in reflection, dialogue and decision making.

iv. Assessment for learning should be regarded as a key professional skill for teachers
Teachers require the professional knowledge and skills to: plan for assessment; observe learning; analyze and interpret evidence of learning; give feedback to learners; and support learners in self-assessment. Teachers should be supported in developing these skills through initial and continuing professional development.

v. Assessment for learning should be sensitive and constructive because any assessment has an emotional impact
Teachers should be aware of the impact that comments, marks and grades can have on learners' confidence and enthusiasm and should be as constructive as possible in the feedback that they give. Comments that focus on the work rather than the person are more constructive for both learning and motivation.

vi. Assessment should take account of the importance of learner motivation
Assessment that encourages learning fosters motivation by emphasizing progress and achievement rather than failure. Comparison with others who have been more successful is unlikely to motivate learners. It can also lead to their withdrawing from the learning process in areas where they have been made to feel they are no good. Motivation can be preserved and enhanced by assessment methods which protect the learner's autonomy, provide some choice and constructive feedback, and create opportunity for self-direction.

vii. Assessment for learning should promote commitment to learning goals and a shared understanding of the criteria by which they are assessed
For effective learning to take place, learners need to understand what it is they are trying to achieve, and want to achieve it. Understanding and commitment follow when learners have some part in deciding goals and identifying criteria for assessing progress. Communicating assessment criteria involves discussing them with learners using terms that they can understand, providing examples of how the criteria can be met in practice and engaging learners in peer- and self-assessment.

viii. Learners should receive constructive guidance about how to improve
Learners need information and guidance in order to plan the next steps in their learning. Teachers should: pinpoint the learner's strengths and advise on how to develop them; be clear and constructive about any weaknesses and how they might be addressed; provide opportunities for learners to improve upon their work.

ix. Assessment for learning develops learners' capacity for self-assessment so that they can become reflective and self-managing
Independent learners have the ability to seek out and gain new skills, new knowledge and new understandings. They are able to engage in self-reflection and to identify the next steps in their learning. Teachers should equip learners with the desire and the capacity to take charge of their learning through developing the skills of self-assessment.

x. Assessment for learning should recognize the full range of achievements of all learners
Assessment for learning should be used to enhance all learners’ opportunities to learn in all areas of educational activity. It should enable all learners to achieve their best and to have their efforts recognized.

ACTIVITY 6
Determine which of the types of assessment discussed in 1.3 are FOR or OF learning. Put a tick in the appropriate column.

Types of assessment        OF learning        FOR learning

1.5. Test usefulness: Qualities of language tests

Bachman and Palmer (1996) discuss the following qualities of language tests.

a. Reliability
Reliability is defined as consistency of measurement. It means that a reliable test score will be consistent across different characteristics of the testing situation. In other words, if a group of learners were to take the same assessment instrument on two occasions, their results on these two occasions should be roughly the same, provided that the conditions are constant. For a language test, reliability can be achieved through its size, specifically through a large number of test items in the test or a large number of learners taking the test.

b. Validity
Just as important as reliability is the question of validity. Does the assessed task actually fulfill its purpose? Does it give you the information you want about the students? Does the assessment enable you to make well-founded decisions?
As exemplified by Rust (2002), just because an exam question includes the instruction ‘analyze and evaluate’ does not actually mean that the skills of analysis and evaluation are going to be assessed. They may be, if the student is presented with a case study scenario and data they have never seen before. But if they can answer perfectly adequately by regurgitating the notes they took from the lecture you gave on the subject, then little more may be being assessed than the ability to memorize.

Validity has been extensively discussed in language testing. The validity of a test is the extent to which it measures what it is supposed to measure and nothing else. In other words, if a test measures the knowledge and skills of the learners that it sets out to measure, it is valid. The reliability of a test ensures its consistency, while validity ensures its meaningfulness. There are five types of validity in a language test: face, predictive, concurrent, content and construct validity.

- Face validity: if a test looks right to the lay eye (not too long, too short, too complicated or too easy), it has face validity.
- Predictive validity: the predictive (or statistical or empirical) validity of a test is obtained by comparing its results with the results of some criterion measure, such as
  - an existing test believed to be valid and given at the same time, or
  - the teacher's ratings or other forms of independent assessment given at that time, or
  - the subsequent performance of the test takers on a certain task measured by some valid test, etc.
The test is then the predictor if it shows this validity in relation to the future criterion which it predicts, for example, scores on the entry test to the 10th form of a senior secondary school related to later academic success.
- Concurrent validity: a statistical procedure establishes concurrent validity. The scores are related to an acceptable criterion which is concurrent and quantifiable. This validity is a useful check on a test's validity, and all types of tests can make use of it. Both concurrent and predictive validity are established by statistical correlation and need quantifiable criteria.
- Content validity: if face validity is an appeal to the lay observer (non-expert), content validity is an appeal to the subject expert (tester, teacher). The expert uses his or her own knowledge of the target language and of language testing to judge to what extent the test provides a satisfactory sample of the syllabus, and what content of the test is acceptable or unacceptable. Proficiency tests and especially achievement tests depend heavily on content validity.
- Construct validity: a test, part of a test, or a testing technique is said to have construct validity if it can be demonstrated that it measures just the ability that it is supposed to measure. The word ‘construct’ refers to any underlying ability, or trait, which is hypothesized in a theory of language ability.

c. Authenticity
Bachman and Palmer (1996) defined authenticity as the degree to which a given language test task's characteristics correspond to the features of a target language use task. Authenticity relates a test's task to the domain to which we want our score interpretations to be generalized. It potentially affects test takers' perceptions of the test and their performance.

d. Interactiveness
Interactiveness, according to Bachman and Palmer (1996), is the extent and type of involvement of the test taker's individual characteristics in accomplishing a test task. Does the test motivate students?
Is the language used in the test's questions and instructions appropriate for the students' level? Do the test items represent the language used in the [...] as well as the target language? All these questions represent the crucial elements that affect a test's interactiveness. Many recent views consider this notion the core of language teaching and learning.

e. Impact
According to Bachman and Palmer (1996), impact can be defined broadly in terms of the various ways a test's use affects society, an educational system, and the individuals within them. In general terms, a test operates at the macro level of a society or educational system and at the micro level of individuals, i.e., test takers. An aspect of impact that has been of particular interest to both language testing researchers and practitioners is what is referred to as ‘washback’, which is defined as ‘the effect of testing on teaching and learning’ (Hughes, 1992:1).

f. Practicality
Bachman and Palmer (1996) define the quality of practicality as the relationship between the resources that will be required in the design, development, and use of the test and the resources that will be available for these activities. They illustrated that this quality is unlike the others because it focuses on how the test is conducted. Moreover, Bachman and Palmer (1996) classified these resources into three types: human resources, material resources, and time. Based on this definition, practicality can be measured by the availability of the resources required to develop and conduct the test. Therefore, our judgment of a language test is whether it is practical or impractical.

ACTIVITY 7
Some testing qualities are described in the following sentences (1-10). Choose the letters (A-F) from the box and write them in the space provided. Some letters can be used MORE THAN ONCE.

A. Reliability
B. Face validity
C. Construct validity
D. Content validity
E. Concurrent validity
F. Predictive validity

Testing situations          Qualities
1. Some MCQ questions in an IQ test had only three options. ......
2. A student got 45 in a math test and failed. He objected to the score, and the test was re-marked by two other raters. One gave him 65 and the other 75. ......
3. The program has communicative performance objectives but tests students using multiple-choice grammar tests. ......
4. The scores from a student’s senior year of high school can provide information about this student’s first-year college grade point average. ......
5. A researcher developing an IQ test might ask his friends and relatives to read the questions and make their judgments. ......
6. An employment test is administered to a group of workers and then the test scores are correlated with the ratings of the workers’ supervisors taken on the same day. ......
7. A pie chart taken from a web-based source was used in the IELTS writing task, but the colors of the chart were not clear and students were confused about what each color represented. ......
8. A teacher wants to test her students’ writing skill, so she asks her students to listen to a lecture and write a summary of it. ......
9. After the introduction of the new textbooks, teachers have taught communicatively from Grade 6 to Grade 9, emphasizing speaking and listening skills and fluency as well as accuracy. But on the Grade 9 exam there is still no speaking or listening component. ......
10. For the past 8 years the Grade 9 exam has used passages, comprehension questions and grammar exercises taken directly from the book, so students have prepared for the exam by memorizing the book. This year, the Foreign Language Specialist writes the exam using parallel texts and exercises, not taken directly from the book, without warning anyone. ......

Part 2: Classroom-based assessment and testing

2.1. Test Specifications

A test’s specifications provide the official statement about what the test tests and how it tests it. The specifications are the blueprint to be followed by the test and item writers, and they are essential in the establishment of the test’s construct validity.
Alderson, Clapham and Wall (1995) suggest a comprehensive framework for constructing test specifications, which covers the following questions:
1. What is the purpose of the test? Tests tend to fall into one of the following broad categories: placement, progress, achievement, proficiency, and diagnostic.
2. What sort of learner will be taking the test – age, sex, level of proficiency/stage of learning, first language, cultural background, country of origin, level and nature of education, reason for taking the test, likely personal and, if applicable, professional interests, likely levels of background (world) knowledge?
3. How many sections/papers should the test have, how long should they be and how will they be differentiated – one three-hour exam, five separate two-hour papers, three 45-minute sections, reading tested separately from grammar, listening and writing integrated into one paper, and so on?
4. What target language situation is envisaged for the test, and is this to be simulated in some way in the test content and method?
5. What text types should be chosen – written and/or spoken? What should be the sources of these, the supposed audience, the topics, the degree of authenticity? How difficult or long should they be? What functions should be embodied in the texts – persuasion, definition, summarizing, etc.?
6. What language skills should be tested? Are enabling/micro skills specified, and should items be designed to test these individually or in some integrated fashion? Are distinctions made between items testing main idea, specific detail, inference?
7. What language elements should be tested? Is there a list of grammatical structures/features to be included? Is the lexis specified in some way – frequency lists, etc.? Are notions and functions, speech acts or pragmatic features specified?
8. What sorts of tasks are required – discrete point, integrative, simulated 'authentic', objectively assessable?
9.
What is the relative weight for each item – equal weighting, extra weighting for more difficult items?
10. What test methods are to be used – multiple choice, gap filling, matching, transformation, short-answer question, picture description, role play with cue cards, essay, structured writing?
11. What rubrics will be used as instructions for candidates? Will examples be required to help candidates know what is expected? Should the criteria by which candidates will be assessed be included in the rubric?
12. Which criteria will be used for assessment by markers? How important is accuracy, appropriacy, spelling, length of utterance/script, etc.?

ACTIVITY 8
Study the objectives of a reading course targeted at a group of students at B2 level (CEFR). Then examine the test specification and the test, and make some comments.

Course objectives
On completion of the course, the students are expected to partially meet the B2 level of the Common European Framework of Reference for Languages. In addition, students should have gained adequate knowledge related to certain business issues and situations at intermediate level. Specifically, students will be able to:
- Acquire a sufficient range of language and knowledge related to business topics such as employment, trade, quality control and customer service, business ethics, leadership and innovation.
- Demonstrate fairly good use of reading skills such as skimming and scanning, understanding examples and key details, and note-taking while reading texts that consist mainly of high-frequency everyday or business-related language.
- Tackle with confidence several reading question types, especially those in the BEC Vantage Reading Test.

2.2. CITAS Software to evaluate a multiple-choice test
a.
Test and Test Performance
- mean test score (easy/difficult)
- variance (how much diversity there is in the scores)
- histogram (distribution of test scores)
- standard error of measurement (SEM): the smaller the better; the observed score ± 2 SEM gives a range that we are about 95% confident contains the true score
- reliability (KR-20 or α): 0 ≤ α ≤ 1

b. Item Analysis
Item Difficulty
- p-value: proportion of examinees who answer the item correctly
- p < 0.50: difficult item, given 0.25 guessing for four-option MCQ
- p > 0.95: quite easy item
- average p ~ 0.60: difficult test
- p range ~ 0.70-0.80
Item Discrimination
- differentiating examinees of high and low levels
- item-total correlation or point-biserial correlation (R pbis)
- whether a student answering the item correctly gets a high total score
- R pbis: the higher the better
- R pbis negative ⇒ problems with the key, too attractive distractor(s), or a too hard/easy item
Distractor Analysis
- attractivity of a distractor (a-value): proportion of examinees choosing that response; watch out when a > p

Sample of Output [figure not reproduced]

2.3 Alternative assessment
Alternative assessment is an ongoing process involving the student and teacher in making judgments about the student's progress in language using non-conventional strategies. Alternative assessments include performance assessment and continuous assessment, which have been discussed in the previous section. This part introduces two specific forms of alternative assessment: self assessment and portfolio assessment.

a. Self assessment
Self assessment is described as the process in which learners simultaneously create and undergo the evaluation procedure, judging achievement in relation to themselves against their own personal criteria, in accordance with their own objectives and learning expectations (Henner-Stanchina and Holec, 1985).
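As a supplement to the test and item statistics listed in section 2.2, the short Python sketch below shows how item difficulty (p), the point-biserial item-total correlation, KR-20 reliability and the SEM can be computed by hand from a matrix of 0/1 item scores. It is only an illustration under stated assumptions, not the procedure CITAS itself uses: the function names are hypothetical, population variance is assumed for the total scores, and the item-total correlation is uncorrected (the item is included in the total).

```python
import math


def item_difficulty(responses):
    """Return the p-value of each item: the proportion of correct (1) answers."""
    n_examinees = len(responses)
    n_items = len(responses[0])
    return [sum(row[i] for row in responses) / n_examinees for i in range(n_items)]


def population_variance(values):
    """Variance with divisor N (an assumption; some tools divide by N - 1)."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def point_biserial(responses, item):
    """Pearson correlation between one 0/1 item and the total test score."""
    x = [row[item] for row in responses]
    y = [sum(row) for row in responses]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return float("nan")  # undefined when the item or the totals do not vary
    return cov / (sx * sy)


def kr20(responses):
    """KR-20 = k/(k-1) * (1 - sum(p*q) / variance of total scores)."""
    k = len(responses[0])
    var_total = population_variance([sum(row) for row in responses])
    sum_pq = sum(p * (1 - p) for p in item_difficulty(responses))
    return (k / (k - 1)) * (1 - sum_pq / var_total)


def sem(responses):
    """Standard error of measurement: SD of totals * sqrt(1 - reliability)."""
    sd_total = math.sqrt(population_variance([sum(row) for row in responses]))
    return sd_total * math.sqrt(1 - kr20(responses))


if __name__ == "__main__":
    # Hypothetical data: 5 examinees (rows) by 4 items (columns), 1 = correct.
    data = [
        [1, 1, 1, 1],
        [1, 1, 1, 0],
        [1, 1, 0, 0],
        [1, 0, 0, 0],
        [0, 0, 0, 0],
    ]
    print("p-values:", item_difficulty(data))
    print("R pbis, item 1:", point_biserial(data, 0))
    print("KR-20:", kr20(data))
    print("SEM:", sem(data))
```

A negative value from `point_biserial` would flag the kind of problem noted in the list (a wrong key or an overly attractive distractor), and the observed score ± 2 × `sem(...)` gives the approximate 95% range for the true score.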
Some studies have shown that the application of self assessment is limited in formal education, where the learning outcomes are fixed; end-of-course assessments are mostly used; classes are large; resources are limited; and most students are passive learners who are not autonomous in the self assessment process. However, self assessment enables students to become more active and to realize their strengths and weaknesses. Systematic self assessment may be done in the form of reflection. A simple example of self assessment by a 12-year-old student, done on completion of a project, illustrates a young learner's ability to reflect.

1 What skills have I practised?
Writing. When I take information of any book I learn new words and I write every time. Reading. I must read the information if want to know things for the project.
2 What language have I learnt?
Some vocabulary (words) and some words I didn't write well, now I write them better. About grammar not much.
3 What other information have I learnt?
I have learnt a lot of thing about Incas, their life, their food, etc.
4 Disadvantages
Sometimes is boring, but not all the time and I don't find more disadvantages.
5 Advantages
You learnt about the project (in my case about Incas). And to organize the work.

Below is another sample of self assessment, in the form of a continuous assessment card.

Name: Peter Anderson

Test No 1
Type of test and date: Interview, 21 January
Self assessment: 'I thought I could answer about half of the 10 questions satisfactorily. Weak on pronunciation'
Test result: 7/10
Comments (by teacher or learner): 'Slight underestimation. Pronunciation not too bad' (Teacher); 'Better than I thought' (Student)

Test No 2
Type of test and date: Role-playing tasks, 19 February
Self assessment: 'Went very well. But there were a few words and phrases I didn't remember (Important?)'
Test result: Good
Comments (by teacher or learner): 'You sounded a bit blunt, perhaps' (Teacher); 'Must practice polite phrases' (Student)

Test No 3
…

Continuous assessment card (Fulcher, 2010:72)

b.
Portfolio assessment
A portfolio is a limited collection of a student's work that is used either to present the student's best work(s) or to demonstrate the student's educational growth over a given time span. Portfolios, however, are not just a collection of finished work(s); they can include biographies of work, range of work and reflections (Nitko, 2001; Gary, 1996; Wolf, 1989). Teachers and researchers admit that portfolios are time-consuming and increase workload, but agree that they can lead to important changes in the classroom.

ACTIVITY 9
Discuss with your friend, and brainstorm how you can use alternative assessments to assess your students' language competence.

2.5. Framework for assessment
Classroom learning ranges from rote memorization of vocabulary, facts and concepts to reasoning, critical thinking and problem solving. To help teachers identify and assess different kinds of academic learning, several frameworks for assessment have been developed. One of the most frequently used frameworks is Bloom's taxonomy of the cognitive domain. One of the strengths of Bloom's taxonomy is that it is useful in developing instructional objectives and assessment targets. However, it has become out of date, and many experts have raised concerns about its hierarchical levels of cognitive development. The suggested questions on the next pages for Bloom's revised taxonomy can be used to design test questions and instructional objectives.

C. CONCLUSION
Testing and assessment play an important part in improving student performance. In order to make testing and assessment more effective, teachers need to think carefully about developing strategies and techniques. It would also be advisable for language teachers to join an exchange club where every language issue can be discussed, shared, experienced and applied.