A Study of the Variation in Students’ Speaking Test Scores across Different Topics


VIETNAM NATIONAL UNIVERSITY, HANOI
UNIVERSITY OF LANGUAGES AND INTERNATIONAL STUDIES
FACULTY OF ENGLISH LANGUAGE TEACHER EDUCATION

GRADUATION PAPER

AN INVESTIGATION INTO STUDENTS’ VARIATION OF SPEAKING SCORES WITH DIFFERENT RATERS AND TOPICS
A CASE STUDY: SECOND-YEAR ENGLISH-MAJORED STUDENTS AT A LANGUAGE UNIVERSITY

Supervisor: Dương Thu Mai
Student: Ngô Phương Nga
Course: QH2014.F1.E6

HÀ NỘI – 2018

ACCEPTANCE PAGE

I hereby state that I, Ngo Phuong Nga, QH2014.F1.E6SP, being a candidate for the degree of Bachelor of Arts in English Language Teacher Education, accept the requirements of the College relating to the retention and use of Bachelor’s Graduation Papers deposited in the library.

In terms of these conditions, I agree that the original of my paper deposited in the library should be accessible for the purposes of study and research, in accordance with the normal conditions established by the librarian for the care, loan or reproduction of the paper.

Signature ………………………………
May 2018

ACKNOWLEDGEMENTS

Firstly, I would like to express my sincere gratitude to my supervisor, Ms. Duong Thu Mai, Ph.D., for her patience, profound knowledge, and whole-hearted assistance throughout my research and writing. I could not imagine having a better supervisor for my thesis.

I would also like to take this opportunity to thank the two teachers at FELTE, ULIS, VNU for their enthusiastic participation in my study and for the encouragement they offered throughout the realization of this paper.

Besides my supervisor and teachers, I would like to thank the sophomores at FELTE, ULIS, VNU for their zeal in participating in the research.

Last but not least, my deepest thanks go to my friend and my family, especially my grandmother, for their tremendous support and encouragement. Without them, this research could not have been completed.

ABSTRACT

This study investigated the influence of topics and raters on the speaking scores of 20 sophomores at FELTE, ULIS, VNU, as well as whether there are differences between the two raters in the scoring process. The speaking topics were derived from an English test preparation course that these students are currently taking, and their format was the same as Part 3 of the IELTS speaking test. A survey and interviews were used to collect the data, which were then analysed by paired-samples t-test and content-based analysis respectively. The analysis revealed no significant differences between the two raters’ scores for the candidates. However, the change in topics might influence the candidates’ scores. In addition, interviews with the raters indicated that there are large differences between the two raters in their rating process, yet, surprisingly, these differences did not seem to exert any significant impact on the scores.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS
ABSTRACT
TABLE OF CONTENTS
LIST OF ABBREVIATIONS
LIST OF FIGURES
LIST OF TABLES
PART A: INTRODUCTION
  1. Rationale
  2. Statement of research problem & questions
  3. Scope of the research
  4. Significance
  5. Organization of the study
PART B: DEVELOPMENT
CHAPTER I. LITERATURE REVIEW
  1. Performance-based assessment
    1.1. Definitions of performance-based assessment
    1.2. Characteristics of performance-based assessment
  2. Assessing the speaking skills
    2.1. Spoken versus written language
    2.2. Model of oral assessment
    2.3. Oral assessment process
  3. Topics in oral assessment
    3.1. Definition of topics
    3.2. The importance of topics in oral assessment
    3.3. Related studies on topic influence in oral assessment
  4. Raters in oral assessment
    4.1. The raters factor in second language performance-based assessment
    4.2. Definition of rater reliability in language assessment
    4.3. Factors that affect rating operation
    4.4. Rater effects
    4.5. Related studies on inter-rater reliability and rater effects in assessment
  5. Chapter summary
CHAPTER II. METHODOLOGY
  1. Research questions
  2. Research participants and the selection of participants
  3. Data collection instruments
  4. Data collection procedure
  5. Data analysis procedure
  6. Chapter summary
CHAPTER III. FINDINGS AND DISCUSSION
  1. Research question 1: The students’ speaking competence
  2. Research question 2: The variation of the students’ scores under the influence of different topics
  3. Research question 3: The variation of the students’ scores under the influence of different raters
  5. Research question 4: Differences between raters in the rating operation
  6. Chapter summary
PART C: CONCLUSION
  1. Summary of the findings and discussion
  2. Implications
  3. Limitation of the study
  4. Suggestions for further research
REFERENCES
APPENDICES
  APPENDIX 1: Questions for the speaking test
  APPENDIX 2: Speaking scores of the test-takers
  APPENDIX 3: Interview form

LIST OF ABBREVIATIONS

CEFR    The Common European Framework of Reference for Languages
IELTS   International English Language Testing System
FELTE   Faculty of English Language Teacher Education
ULIS    University of Languages and International Studies
VNU     Vietnam National University, Hanoi

LIST OF FIGURES

Figure 1. A conceptual framework for performance testing (Milanovic & Saville, 1996)
Figure 2. An expanded model of speaking test performance (Fulcher, 2003)
Figure 3. A framework for describing the construct definition for a test of second language speaking (Fulcher, 2003)
Figure 4. Traditional fixed response assessment and assessment involving judgement (McNamara, 1996)
Figure 5. IELTS and the CEFR ("Common European Framework", n.d.)
Figure 6. Descriptive statistics for the scores of topic 1 by rater 1
Figure 7. Descriptive statistics for the scores of topic 2 by rater 1
Figure 8. Descriptive statistics for the scores of topic 1 by rater 2

LIST OF TABLES

Table 1. Interpretation of Correlation Coefficient (Zou, Tuncali, & Silverman, 2003)
Table 2. Descriptive statistics for students’ speaking competence
Table 3. Paired Samples Statistics for two topics by Rater 1
Table 4. Paired Samples Test Result for two topics by Rater 1
Table 5. Paired Samples Statistics for two topics by Rater 2
Table 6. Paired Samples Test Result for two topics by Rater 2
Table 7. Pearson correlation for topic 1
Table 8. Pearson correlation for topic 2
Table 9. Paired Samples Statistics for topic 1 by two raters
Table 10. Paired Samples Test Result for topic 1 by two raters
Table 11. Paired Samples Statistics for topic 2 by two raters
Table 12. Paired Samples Test Result for topic 2 by two raters

PART A: INTRODUCTION

This initial part of the study covers the background of the study, as well as its scope and significance.
The four research questions for this study are also stated. The part ends with the organization of the study, to give readers a clearer orientation and understanding of its structure.

1. Rationale

Performance-based assessment has gained popularity in the English teaching and learning community over the last few decades because it represents a set of strategies for the acquisition of knowledge and skills through meaningful task performance (Hibbard, 1996). The importance of such assessment was also demonstrated by Baker, O’Neil, & Linn (1993), who claimed that this type of test format had been considered the “centerpiece of effort” or “the focal point” of some large-scale reforms. McNamara (1996) made the point more explicitly by noting the dominance of performance-based assessment over traditional forms of assessment.

Oral interviews are a prime method of performance-based assessment. Because of their usefulness in reinforcing the importance of communicative language teaching and assessment, they are now used in a number of classrooms all over the world. Bachman & Palmer (1996) emphasized the high degree of content and face validity of such assessment.

A great number of factors can potentially affect the results of oral interviews. First, although different topics in the same speaking test are assumed to be of equivalent difficulty, many studies have pointed out topic-related problems, including the topics themselves, the examinees’ interest in them, their opinions about them, and their prior knowledge of them, all of which could give certain groups of test-takers an unfair advantage or disadvantage in scoring (Jennings et al., 1999). Bachman (1990) stated that topic was one of the elements in the testing environment facet which could affect performance on a language test. In another study, which involved 203 students and ten English teachers, Nguyen & Tran (2015) also identified topical knowledge as an important factor influencing the performance of test-takers.

Papajohn (2002) stated that not only the performance of the examinees on the test but also the interpretation of the raters could affect the examinees’ final scores. Rater effects are another variable in the rating operation (Myford & Wolfe, 2015). As oral interviews are graded by examiners, a certain degree of subjectivity is involved. Cronbach (1990) mentioned the “complex and error-prone cognitive process” that every rater undergoes during the scoring procedure. There is no guarantee that different raters will grade the same examinees similarly. This problem raises the question of inter-rater reliability: whether the raters assign the same score to the same candidate.

In Vietnam, it is widely believed that many Vietnamese students who perform well in written English tests still underperform in speaking tests. A significant number of them find presenting orally really challenging. This scenario occurs even in leading foreign language institutions such as the University of Languages and International Studies (ULIS), where students’ average speaking band score is quite disappointing in comparison with their reading, listening and writing scores. Although many studies of this problem have been conducted worldwide, few have addressed students in Vietnam, and none have concerned ULIS students.

All the aforementioned reasons prompted me to conduct this research with the title: “An investigation into students’ variation of speaking scores with different raters and topics. A case study: Second-year English-majored students at a language university”.
This study reviews the abovementioned literature to define the research questions and focuses on two main factors, topics and raters, to discover whether the scores of some sophomores at a language university actually vary under their influence.

2. Statement of research problem & questions

This research aims to observe and analyze the variation in the scores of ULIS sophomores, which has not been studied intensively, in a speaking task with different topics and different raters. In doing so, the paper aspires to raise awareness of the importance of topics and raters in the speaking test, thereby helping teachers to improve their understanding and encouraging more training. In brief, the study seeks to address the following questions:

(1) What is the students’ speaking competence in this test?
(2) To what extent do the students’ scores vary with different topics?
(3) To what extent do the students’ scores vary with different raters?
(4) Are there differences between raters in the rating operation?

3. Scope of the research

First, this research focuses on the variation in students’ scores in relation to different topics and raters. The underlying reason is that, although a number of elements affect students’ scores, topics and raters are, according to Eckes (2011), the two most prominent elements influencing the reliability of the test. Another reason lies in the limited scale of a Bachelor’s thesis and time constraints.

In addition, it is noteworthy that the sample of the study was restricted to second-year students majoring in English language teacher education at ULIS, VNU, who are expected to reach the B2 level of English proficiency in the CEFR after their second year. Nevertheless, the results are intended to be as representative as possible of all second-year students at ULIS, VNU.

4. Significance

This research contributes to the body of research on factors affecting scoring in the Vietnamese context. Teachers’ awareness of the importance of topics and raters in grading students’ speaking performances could encourage a more careful choice of topics for speaking tests; otherwise, students’ final results may suffer. At the same time, it is anticipated that the findings of this study will encourage more teacher training sessions with a view to minimizing any effects that teachers and topics exert on speaking assessment.

On top of that, the research is expected not only to serve as a useful reference for teachers and students at ULIS but also to lay the foundation for further research on the same topic.

5. Organization of the study

The study is composed of three main parts:

PART A. INTRODUCTION
This part presents basic information such as the statement of the problem, rationale, scope, aims and objectives, as well as the organization of the study.

PART B. DEVELOPMENT

Chapter I. Literature review
This chapter conceptualizes the framework of the study by discussing the literature relating to performance-based assessment, oral assessment, and the topic and rater factors in oral assessment.

Chapter II. Methodology
This chapter features the context and the methodology of the study, which includes sampling, the data collection instruments, the data collection procedure and the data analysis.

Chapter III. Findings and discussion
This chapter focuses on the analysis of the data and discusses the results of the study.

PART C. CONCLUSION
This part summarizes the findings, notes some limitations and gives recommendations for further study.

PART B: DEVELOPMENT

CHAPTER I. LITERATURE REVIEW

This chapter is an attempt to establish the theoretical background on performance-based assessment in general and oral assessment in particular.
Then, the key concepts of topics and raters in oral assessment, as well as factors that affect the rating procedure, will be reviewed before some related studies worldwide and nationwide are mentioned.

1. Performance-based assessment

1.1. Definitions of performance-based assessment

Many researchers have defined performance-based assessment. Chalhoub-Deville (1995) claimed that performance-based assessment required students to produce complex responses that integrate many skills and kinds of knowledge. At the same time, it is also of great importance for them to apply such skills to real-life situations. Fitzpatrick and Morrison (1971) held a view consistent with Chalhoub-Deville’s, in that some criteria in performance-based assessment were made more simulated in comparison with the traditional paper-and-pencil test. Exploring the distinction between the performance-based test and other types of assessment, they also claimed that, apart from being close to reality, the performance-based test showed almost no absolute differences from its counterparts.

For Haertel (1992), there are two definitions of performance-based assessment: the narrow or strong one and the broad or weak one. Referring to the narrow definition, which focuses mainly on the examinees’ task completion, Haertel noted that a performance test was “any test in which the stimuli presented or the response elicited emulate some aspects of the nontest settings”. In the broad definition, on the other hand, it is the real language ability of the examinees, not task completion, that is of greater importance. Haertel’s definition of performance-based assessment covers the ideas of the aforementioned researchers and is therefore adopted in this research.

1.2. Characteristics of performance-based assessment

A distinctive characteristic of performance-based assessment is that, instead of recalling abstract knowledge as in the traditional paper-and-pencil test, learners are actually required to perform some relevant tasks (McNamara, 1996). Apart from that, while traditional forms of assessment emphasize the relationship between the test-takers and the test instruments, performance-based assessment adds one more element: raters, who grade the students’ performances based on rating scales. McNamara (1996) also stated that this element makes the interaction among all elements much more complicated than before.

2. Assessing the speaking skills

2.1. Spoken versus written language

Speaking can be considered one of the most important skills in the language acquisition process. Although the notion of speaking appears familiar to most people, different researchers define it in their own ways. Bygate (1987, as cited in Mazouzi, 2013) contended that oral language was used to deliver the speakers’ intended messages to the listeners. Such messages could be “ideas, intention, thoughts and feelings”. Later on, Fulcher (2003) defined speaking as “the verbal use of language to communicate with others”, whereas Hedge (2000, as cited in Mazouzi, 2013) suggested that speaking was a skill by which people are judged when first impressions are formed. In other words, speaking skills are of vital importance not only in the mother tongue but also in second and foreign languages.

Byrne (1987) also concurred with Hedge that speaking was a “two-way process” which not only included speakers and listeners but also involved the use of both a productive skill (speaking) and a receptive skill (listening). In other words, speaking appears to be an interactive process involving both speakers and listeners. This notion is also shared by Thornbury (2005).
He stated that because most speaking took the form of face-to-face dialogue, interaction was certain to ensue in both monologic speaking and conversation. However, according to him, there were two more aspects, apart from interaction, that needed to be considered in managing talk: turn-taking and paralinguistics. In terms of turn-taking, speakers take turns to hold the “floor”, which can be defined as the right to speak. This reflects the rule that no two speakers should speak at once, at least not for any sustained period of time. In addition, Thornbury (2005) mentioned paralinguistics, the interactional use of eye gaze and gestures. According to him, in communicating with others, people often use eye contact, facial expressions, body language, pauses, tempo and pitch variation to express their emotions and ideas. In other words, speaking is an interactive and multi-sensory activity performed by both speakers and listeners.

There are a number of studies on the distinction between speaking and writing. Nevertheless, Akinnaso (1982) believed that there was no agreement on the exact differences. Hatch (1992) distinguished spoken from written discourse production along three main aspects: planning, contextualization, and formality. Hatch concluded that speech was less planned, more highly contextualized and more informal than writing. That is, simple words, phrases or fillers such as “you know” and “kind of” are usually used in conversation. Disagreeing with Hatch, Mazouzi (2013) claimed that spoken language differed from written language in terms of durability. Specifically, when people communicate orally, their words last only a few seconds, whereas anything written down can last much longer. It is possible to conclude that the significant differences between speaking and writing might lie in planning, contextualization, formality and durability.
Under no circumstances should spoken language be underestimated as it deserves to be regarded as a pivotal element of language
