|Year : 2012 | Volume
| Issue : 2 | Page : 53-60
Assessment methods in undergraduate medical schools
Department of Paediatrics, Faculty of Medicine, Bayero University, Kano, Nigeria
Date of Web Publication: 12-Mar-2013
Department of Paediatrics, Aminu Kano Teaching Hospital, Kano
Source of Support: None, Conflict of Interest: None
Assessment in medical education is vital because of its far-reaching implications, not only for students and their teachers but also for communities and the world at large. This article reviews the assessment methods used in undergraduate medical schools and highlights their limitations while proffering solutions recommended by experts in medical education. Assessment methods used in undergraduate medical education can be broadly subdivided into two: assessment of knowledge and its application (multiple choice questions, essay and viva voce) and assessment of clinical competence (long case, short case and objective structured clinical examination [OSCE]). There are five major criteria for determining the usefulness of a particular method of assessment: reliability, validity, educational impact, cost effectiveness and acceptability. The major drawback of the long and short case examinations is poor reliability or reproducibility due to case specificity and to variability among examiners and clinical case scenarios. In 1975, Harden et al. introduced the OSCE to avoid the disadvantages of the long case, but acceptability may be an issue because of inadequate exposure to its principles and resistance to change by some teachers. Another objective version of the long case is the objective structured long examination record. The objective structured practical examination is also preferred to the traditional practical examination in laboratory-based courses. All assessment methods have their strengths and limitations. It is important that teachers in medical schools are aware of the limitations of the traditional assessment tools and embrace newer and more reliable methods of assessment.
Keywords: Assessment, medical schools, undergraduate
How to cite this article:
Asani M. Assessment methods in undergraduate medical schools. Niger J Basic Clin Sci 2012;9:53-60
Introduction
For several hundred years, assessment in medicine involved the direct observation of trainees by their senior colleagues, a practice patterned after the traditional apprenticeship model. Under the Han dynasty in China, as far back as about 206 BC, candidates were selected for government service using some forms of assessment. Formal assessment was first introduced in medical education by the French in 1788, when entrance into internship was decreed to be by competition in the form of written and oral examinations. It was only in the 1850s that exit examinations were introduced for medical students in Britain, at Oxford and Cambridge Universities.
Definitions of Assessment
According to the Merriam-Webster dictionary, the word assessment is from the root word assess, which means to determine the importance, size, or value of something. The word assess originated from the Latin word "assessus", which means to sit beside, to assist in the office of a judge. This derivation, once again, signifies the concept of apprenticeship emphasized in the preceding section on historical perspectives.
Palomba and Banta defined assessment as the systematic collection, review, and use of information about educational programs undertaken for the purpose of improving learning and development. Norcini described assessment as a process that involves testing, measuring, collecting and combining information and providing feedback. Assessment can also be defined as a sample taken from a larger domain of content and process skills that allows one to infer student understanding of a part of the larger domain being explored.
Aims of Assessment
Assessment in medical education is vital because of its far-reaching implications, not only for students and their trainers but also for communities and the world at large. Every medical school admits students into its program with the sole aim of producing medical doctors who are competent to offer quality medical care to their communities and clients worldwide. Additionally, there is an increasing demand for quality medical care by society. Medical curricula are designed with specific content to meet these challenges.
Assessment in medical education is important to verify whether or not the objectives of training are being met. Medical education emphasizes not only the art and science of teaching but also learning and the acquisition of professional skills. Assessment is as crucial to the student as it is to the teacher, and it is a potent stimulus for learning. As a result, medical schools need to design curricula with defined objectives in the areas of knowledge, skills and attitude. Students may view emphasis on the curriculum content as excessive workload and direct their reading at those aspects of the course that will be terminally assessed. Therefore, to promote learning, assessment should be educational and formative; students should receive early feedback so that their knowledge and skills can be positively enhanced.
There are several objectives of assessment. The major ones include:
- Monitoring of the program: Assessment gives the teacher feedback on the extent of learning.
- Feedback to the students: It gives the students feedback regarding their knowledge of, or deficiencies in, the expected learning outcomes. It gives the students targets for improvement. For students to learn from assessment, results of tests should be released early and during the period of their postings.
- Safeguarding the public: Assessment ensures that only competent individuals are allowed to take care of the sick. It ensures that the statutory requirements have been met in keeping with national and/or minimum global standards. This is a very important use of assessment because, by upholding high professional standards, the general public is protected.
- Certification: Examinations are conducted with the aim of certifying candidates fit for graduation and the award of relevant degrees. Although training ultimately leads to certification, this should be de-emphasized and the students should be focused on attaining the learning objectives of the program of study.
- Means of screening: Assessment could be a reliable tool for screening students suitable for advanced training. An effective means of boosting higher education is the retention of outstanding students in academic institutions.
Timing of Assessment (When to Assess)
The timing of assessment can significantly and positively influence the effectiveness and efficiency of learning. There are two major forms of assessment that are of great importance to teachers in medical schools: formative and summative assessment. Formative assessment is otherwise called assessment for learning. It should be part of any instructional process. It is carried out during the course; hence it is also referred to as classroom assessment. Its primary aim is to provide feedback to the student and teacher. The result of the assessment helps the teacher to know what has been taught, while the student will also know what has been learnt. The teacher is expected to use the result of the assessment to modify his teaching; this advantage is what earns formative assessment its name.
Formative assessment motivates the learner to concentrate on his area of weakness; this will ultimately improve his learning. It assists the teacher in identifying the weak students; this is very important as this category of students will need additional instruction and special attention. This form of assessment aids students to learn self-evaluation, self-assessment and goal setting.
There are several methods that can be used to carry out formative assessment. These include observation (teachers can assess whether instructions given to the students are well understood), classroom discussion, question and answer (this should be part of every lecture, bedside teaching, etc.), home assignments and short tests.
Summative assessment is the final assessment at the end of a posting, term or course. It is otherwise called assessment of learning. It is important that at the end of a posting, term or course, an assessment is carried out to determine the level of the learner's ability, achievement and affective development. The purposes of summative assessment include the award of scores, grades and certification. It must be stated that continuous assessment as done in many medical schools is a mini summative assessment and not formative assessment.
What to assess
What is to be assessed must be clearly stated in every curriculum. The learning objectives should reflect what will ultimately be assessed for effective teaching and learning to occur.
In 1956, a committee of educators chaired by Benjamin Bloom proposed what is now known as Bloom's Taxonomy, a classification of learning objectives within education. It divides educational objectives into three main domains: cognitive, psychomotor and affective, sometimes loosely described as knowing/head, doing/hands and feeling/heart respectively.
Bloom identified six levels within the cognitive domain from the lowest level of knowledge (simple recall of facts) through comprehension (ability to understand the material), application (the ability to use learned material in new situations), analysis (ability to break down component parts so that composition or organizational parts can be understood), synthesis (the ability to put parts together to form a whole) and evaluation (the ability to judge the value of a material for a given purpose). These constitute what is generally referred to as the Bloom taxonomy.
The psychomotor domain is associated with the physical, motor and manipulative skills that a competent physician is required to perform, while the affective domain is concerned with the attitude of the students towards the patients, the profession, their peers and teachers.
Assessment in many medical schools tends to be biased towards the cognitive domain, while less emphasis is placed on assessing the psychomotor and affective domains; a holistic approach should be encouraged.
Other areas of learning that are of growing importance and need to be assessed include communication, team work, professionalism, clinical reasoning and judgment in uncertain situations (ethical issues).
The criteria for determining usefulness of assessment methods
The various tools of assessment have evolved over the last few decades. Van der Vleuten described five criteria for determining the usefulness of a particular method of assessment: reliability, validity, educational impact, acceptability and cost effectiveness.
Reliability means reproducibility or consistency of assessment scores. It is a measure of the relative magnitude of variability in scores due to error, with the aim of achieving a desired level of measurement precision. Reliability is influenced by many factors: (a) Examiners' judgment: ratings of performance differ among examiners in the absence of a checklist. This shortfall is addressed by the use of multiple examiners across different cases. (b) Clinical case scenarios: exposing different students to different case scenarios in the same examination will certainly produce inconsistent outcomes. (c) The mood of patients and the nervousness of candidates: these also affect students' performance, especially in the assessment of clinical competence.
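The idea of reliability as the share of score variability that is not due to error can be illustrated with a small calculation. The sketch below uses Cronbach's alpha, one commonly used internal-consistency index; the article itself does not name a specific coefficient, so the choice of index, the function name `cronbach_alpha` and the sample scores are all assumptions made for illustration.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Estimate internal consistency of a multi-item test.

    scores: one list per student, each holding that student's score
    on every item (for example, every station of an examination).
    """
    n_items = len(scores[0])
    if n_items < 2:
        raise ValueError("at least two items are needed")
    if any(len(row) != n_items for row in scores):
        raise ValueError("every student needs a score for every item")

    # Variance of each item's scores across students.
    item_variances = [
        variance([row[i] for row in scores]) for i in range(n_items)
    ]
    # Variance of the students' total scores.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")

    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical scores: three students, three items, perfectly consistent ranking.
example_scores = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
print(round(cronbach_alpha(example_scores), 3))
```

Values closer to 1 suggest that the items rank students consistently; the threshold regarded as adequate depends on the stakes of the examination and is not specified here.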
Validity, as the name implies, means bringing about the ends or results intended. It is the degree to which the conclusions made about medical competence based on assessment scores are correct. It must be stressed that no valid assessment method that measures all aspects of clinical competence has been designed. There are two main types of validity evidence: content validity and construct validity. Content validity, otherwise referred to as direct validity, is the degree to which a test covers the area of competence in question. Construct validity, also known as indirect validity, is the implication that those who perform better on the assessment method are better in real practice; it is the ability to differentiate between groups with known differences in ability, e.g., beginners and experts. In addition, predictive validity refers to the use of an assessment method to predict a future outcome, e.g., professional success after graduation.
The issues involved in the analysis of competence are captured in Miller's pyramid [Figure 1]. The base represents the knowledge components of competence: "knows" (basic facts) followed by "knows how" (applied knowledge). These two layers of the pyramid are better assessed with written tests like multiple choice questions (MCQ) and essays, and with oral examination. The next layer of the pyramid represents "shows how", which symbolizes the clinical skills required for competency in clinical medicine; this layer is suitably assessed by the objective structured clinical examination (OSCE) and the traditional long and short cases. The apex of the pyramid represents "does". This is the ultimate goal for a valid assessment of clinical competence, which is to test what is actually done in the place of work. It is of more relevance in post-graduate training, where portfolios are used for assessment.
Educational impact is the effect of the assessment method on both teaching and learning. The general tendency is for students to focus on areas of the curriculum content that are commonly assessed rather than on the learning objectives. A good assessment method should focus on the learning objectives so that students will be encouraged to concentrate on these areas instead of memorizing a few of the teachers' favorite questions.
The acceptability of an assessment method depends on several factors, such as the perception of fairness by teachers and learners and their familiarity with the method.
Cost effectiveness concerns the cost of the method in terms of funds, man hours, use of real or simulated patients, effort in preparation and execution, etc.
Apart from these major criteria, a good assessment tool has other features. First, it must be relevant to the curriculum content and learning opportunities and must be related to the needs of the community. Secondly, it must focus on essential and useful skills like clinical reasoning, diagnostic skills and management of common problems in the environment. Thirdly, it must be conducted in a conducive environment. In addition, it should be able to discriminate between good and poor students (this is essential in order to encourage patronage of emergency rooms, wards, clinics, surgical sections etc.) and, importantly, it should be able to give students feedback.
Whatever the method of assessment that is carried out, the students should have a clear understanding of its intention. The primary goal should be to ensure that the learning objectives are being achieved and not mere paper certification. Students need to be involved both as assessors of their own learning and as resources to other students. This will require descriptive feedback from the teachers as they learn.
Assessment methods used in undergraduate medical education can be broadly subdivided into two: assessment of knowledge and its application, and assessment of clinical competence. A third type, the assessment of performance, is commonly used at the post-graduate level and is currently used in a few undergraduate medical schools.
Tests of knowledge and its application
The commonly used methods or instruments of assessment include:
Multiple choice questions
The MCQ remains the commonest type of objective test in use at the various levels of educational institutions in Nigeria, from the primary to the tertiary schools, up to the post-graduate level. The MCQ is commonly used in both formative and summative assessments. It is the only instrument employed in the post-unified tertiary matriculation examination (Post-UTME) for entrance into tertiary institutions in Nigeria.
An MCQ basically comprises a stem or base, which serves as an introduction to the options that follow. The options are suggested answers to the question. The correct answer is called the key while the incorrect options are called distracters. It is recommended that patterns among the keys within a multiple choice question should be avoided, to discourage cueing. Wilson and Coyle recommended that the options be stated in alphabetical order.
Basically, the MCQ assesses factual knowledge, recall, understanding and interpretation. All these belong mainly to the cognitive domain of learning, corresponding to the Miller's pyramid levels of "knows" and "knows how." A well-constructed MCQ assesses not only mere recall of facts but also application of knowledge and problem-solving skills.
There are several formats of MCQ. Irrespective of the format used in assessing candidates, the instructions must be clearly spelt out to avoid any form of confusion. The formats include:
- One best response: In this format, the examinee is expected to select the single best response from three or more options. The option "all of the above" should be used carefully if not completely avoided because students who are able to identify two of the options as correct without knowing the answers to the other alternatives, can deduce that the option "all of the above" is correct. The use of "none of the above" is more widely accepted as an effective option because it makes the question more difficult and less discriminating and the answer cannot be indirectly deduced unlike in "all of the above."
- Matching type: This format consists of two lists of statements or words, which have to be matched with one another with specific instructions. The two lists should contain different number of items to avoid cueing.
- Multiple true-false otherwise called Multiple Select Questions: This format consists of a stem followed by four or five true or false statements. The stem may take the form of a statement, question, case history or clinical data.
- Multiple true/false completion type: This format is different from the preceding one because the candidate is expected to separately respond to each of four or five choices so that any combination of answers is permitted.
- Reason-Assertion format or relationship analysis type: This format consists of two statements, an assertion linked to a reason by the connecting word "because". The candidate has to decide whether either or both statements are correct and, if both are correct, whether the reason rightly explains the assertion. This format is not commonly used by examiners because of the amount of language comprehension involved.
The merits of MCQ include the following:
- Can cover a large content of the syllabus.
- High reliability in scoring.
- Ease of marking; can be marked by computers.
- Usually requires less time to administer.
- Ease of scoring.
- Overcomes the effect of poor handwriting by candidates.
- Can test a large sample of knowledge in a short period of time.
Notwithstanding the merits of MCQ, it has several notable limitations, some of which are:
- It does not assess other domains of learning, chiefly the psychomotor and affective domains.
- It is difficult to write, especially in certain content areas.
- It does not assess communication, a very important part of medical education.
- It does not assess writing skills.
- Students can guess the answers rightly. Fray in 1988 emphasised the use of formula scoring in MCQ to reduce guessing. Formula scoring is a procedure designed to reduce multiple choice test score irregularities due to guessing.
- Students can perform excellently well if questions are repeated.
- Errors in collation may occur if answers (to hundreds of questions) are manually marked.
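Formula scoring, mentioned above as a way of reducing the effect of guessing, can be sketched as a short calculation. The version below is the conventional correction-for-guessing rule, in which a fraction of the wrong answers is subtracted from the number of right answers and omitted items are neither rewarded nor penalised. The article does not give the exact formula it has in mind, so this rule, the function name `formula_score` and the sample figures are assumptions for illustration.

```python
def formula_score(num_right, num_wrong, options_per_item):
    """Correction-for-guessing score: right - wrong / (options - 1).

    Omitted items are simply not counted, so they carry no penalty.
    """
    if options_per_item < 2:
        raise ValueError("each item needs at least two options")
    if num_right < 0 or num_wrong < 0:
        raise ValueError("counts cannot be negative")
    return num_right - num_wrong / (options_per_item - 1)


# Hypothetical candidate: 60 right, 20 wrong, 20 omitted, five options per item.
print(formula_score(60, 20, 5))

# A candidate who guesses blindly on 100 five-option items is expected to get
# about 20 right and 80 wrong, which the correction brings down to zero.
print(formula_score(20, 80, 5))
```

Under this rule, blind guessing gains nothing on average, while partial knowledge that eliminates some distracters still pays off.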
Essay questions
The essay is another commonly used instrument of assessment in medical schools. In the traditional format, the questions are open-ended and candidates are expected to respond in a few paragraphs for short notes and in several pages for long essays. Essay questions can assess all six levels of the cognitive domain, from mere recall to evaluation.
The merits of essay questions include:
- Assesses clinical problem-solving ability
- Measures the students' power of expression
- Assesses understanding
- Assesses the ability to organize thoughts and communicate them in writing
- Encourages in-depth study by students.
The use of essay questions in medical assessment is limited for the following reasons:
- It assesses only Miller's levels of "knows" and "knows how"
- It requires rigid compliance with the marking scheme to avoid inter-examiner score variability, a task that is almost impossible in large classes
- Candidates with poor handwriting are generally less favored
- It demands a lot of time if wider areas of content are to be assessed.
To improve the reliability of the essay, the questions can be structured. Further efforts to improve the reliability of essay questions have led to the development of modified essay questions (MEQ); an MEQ consists of a case followed by a series of questions that relate to the case and must be answered in the sequence asked. This format assesses the candidate's reasoning skills and understanding of concepts instead of mere recall of factual knowledge, but Zafar-Khan and Aljarallah (2011) concluded in their study that a well-constructed MCQ is superior to the MEQ in testing the higher cognitive skills of undergraduate medical students in a problem-based learning setup.
Viva voce
Viva voce is a Latin phrase which literally means "with a living voice" and is translated as "by word of mouth" or simply "orally". In this assessment method, the candidate is questioned by one examiner or a group of examiners in an interview or discussion-like format. It can be defined as an examination consisting of a dialogue with the examiner(s). Questions are asked to ascertain knowledge in certain areas or ability in clinical problem solving. It is a standard format for assessing communication skills, ethical issues, attitudes and professionalism. To increase the reliability of oral examination, there is a need to increase the number of questions, examiners and testing time, but there is no convincing evidence that oral examination measures important aspects of medical competence not assessed by other methods.
The merits of oral examination include:
- Ensures direct contact between the candidates and examiners
- Candidates can be simultaneously assessed by examiners
- The examination is flexible, both strong and weak areas of the candidate can be explored
- Repeated oral examination enhances the communication skills of the candidates
- It provides opportunity in knowing the learning environment/personal challenges of the candidates.
The demerits of oral examination include:
- There is a tendency to over-rate candidates who are good orators
- The personal contact may be a source of gender or ethnic bias
- It lacks standardization and objectivity: Different sets of questions for same group of students without a structured marking scheme is the hall mark of oral examinations in some medical schools!
- Some candidates find it intimidating.
Tests of clinical competence
The other types of assessment tools test the Miller's pyramid level of "shows how." They are regarded as tests of clinical competence. It is recommended that the content of assessment be specifically planned against the learning objectives, a process known as blueprinting. Blueprinting matches what students are expected to learn with what is eventually assessed.
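Blueprinting can be pictured as a simple cross-check between the list of learning objectives and what each part of the examination assesses. The sketch below is only an illustration of that idea; the function name `uncovered_objectives`, the objectives and the station names are all hypothetical and are not taken from any curriculum discussed in this article.

```python
def uncovered_objectives(objectives, blueprint):
    """Return the learning objectives that no examination item assesses.

    objectives: list of objective names, in curriculum order.
    blueprint: mapping of each examination item (for example, a station)
               to the objectives it is planned to assess.
    """
    covered = set()
    for assessed in blueprint.values():
        covered.update(assessed)
    # Keep curriculum order so gaps are easy to read off.
    return [objective for objective in objectives if objective not in covered]


# Hypothetical objectives and stations, for illustration only.
objectives = ["history taking", "physical examination",
              "communication", "data interpretation"]
blueprint = {
    "station 1": ["history taking", "communication"],
    "station 2": ["physical examination"],
}
print(uncovered_objectives(objectives, blueprint))
```

A check of this kind makes gaps visible before the examination is set, here showing that data interpretation has no station planned against it.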
Long case
The long case is an essential part of the traditional clinical examinations. In a typical setting, the student is given a real patient to clerk for a stated period of time, usually between 30 and 45 min. During this period, he is expected to take a history, perform a clinical examination, carry out appropriate side-laboratory investigations such as urinalysis and arrive at a clinical diagnosis. He is thereafter examined by a set of examiners, who ask about the case and related topics. The merits of the long case include:
- The student interacts with a real-life situation
- Students are presented with a complete and realistic true-life challenge
- It assesses the three major domains and communication skills.
The major drawback of the long case examination is its poor reliability or reproducibility as a means of assessment. This is mainly due to case specificity, variability in clinical scenarios among the examinees and differences among examiners, e.g., in the perception of proper dressing, proper communication skills, etc., which may be influenced by the examiner's cultural and religious background. The poor reliability of the long case is widely acknowledged; it has been described as the use of a real and untrained patient in an unstructured manner, because the long case is all about one or two examiners asking their pet questions about a clinical scenario. Moreover, a student with good oratory ability may score higher than another with a speech disability, e.g., stammering, if case presentation is emphasized instead of demonstration of bedside skills!
A more objective version of the long case is the objective structured long examination record (OSLER). In this format, the examiners use a structured record or mark sheet to assess the candidate. The OSLER is a 10-item analytic record of the traditional long case, designed to improve objectivity, validity and reliability. All the candidates are assessed over 20-30 min by the examiners on the same items. The 10 items consist of four on history, including communication skills; three on physical examination, including examination technique and establishment of correct physical findings; and three on appropriate investigation, management and clinical acumen. In addition, case difficulty is identified by the examiners. The OSLER takes about the same time as the traditional long case, but it demands sufficient cases and a greater number of examiners, raising the issue of practicality in resource-limited countries.
Short case
The short case is also used to assess clinical competence. In this format, the students are asked to perform a physical examination of a real patient with little knowledge of the patient's history and are then assessed on their examination technique and their ability to elicit physical signs and interpret them correctly. Jauhar emphasized the importance of short cases as part of students' assessment. He stated that the lack of emphasis on short cases is partly responsible for the decline in physical examination skills and the over-dependence on expensive investigations for diagnosis.
Hijazi et al. concluded that performance in the short cases is a better discriminator of competence than performance in the long case. This may be because in short cases the clinical encounters are wholly observed by the examiners, unlike in long cases. A shortcoming of short cases is the omission of history taking by the candidate, although history taking is an essential part of any diagnostic process.
Objective structured clinical examination
In 1975, Harden et al. introduced the OSCE to avoid the disadvantages of the traditional long case. The OSCE is one form of multiple-station examination, the others being the practical examination and the steeplechase. The OSCE has become one of the most widely used tools for assessing clinical competence. In an OSCE, all the students sequentially rotate around a series of structured clinical cases called stations. Using the principle of blueprinting, stations are structured to cover a wide range of competencies. At each station, the students are assigned a specific task to perform in a specified time. Each station is designed to test a particular skill, such as history taking. The OSCE consists of active stations, e.g., examination of the oculomotor nerves, and inactive stations, e.g., data or image interpretation. A bell is used to signal the end of the period for each station and the students move to the next station. Each station has a checklist or a structured marking scheme used by a staff member who observes the students, especially in the active stations. The number of stations is variable; typically between 10 and 20 stations are used.
The merits of the OSCE include its reliability, which results from its multiple stations, multiple assessors, sufficient test time and checklists. The OSCE also has high validity because of blueprinting. It tests a wide range of skills. Feedback is possible, making it a very useful tool for formative assessment. Despite these numerous merits, the OSCE has several demerits, because no single tool meets all five criteria described by Van der Vleuten: reliability, validity, educational impact, cost efficiency and acceptability. While the OSCE scores high in reliability and validity, its rating in the remaining three leaves much to be desired. Students tend to focus on the checklist rather than on the attainment of the required skills. A well-prepared OSCE is expensive in terms of staff man hours, disruption of clinical services, the emotional burden on real patients, etc. Acceptability is an issue among many faculty staff because of poor exposure to its principles and resistance to change.
It should be noted that the terminology for the OSCE format may vary. The format is commonly called OSCE in the undergraduate setting, whereas in the post-graduate setting it is called practical assessment of clinical examination skills (PACES) in the Royal College of Physicians' membership clinical examination and clinical skills assessment (CSA) in the Royal College of General Practitioners' membership examination.
Objective structured practical examination
This is not a test of clinical competence, but it is better explained in this section since its principles are modeled on those of the OSCE. The objective structured practical examination (OSPE) was developed to overcome the shortcomings of the traditional practical examination done in the basic medical and laboratory-based courses. In the traditional practical examination, students may be given different experiments to perform unobserved, thereby introducing an element of luck: one student may get a familiar experiment while another gets a tough one. Sometimes, the practical examination may take the form of a steeplechase, where all the students are exposed to the same questions; however, this falls short of assessing the practical skills of individual students in the absence of examiners with checklists at specific stations. In the OSPE, there are 20-30 stations broadly divided into procedure and question stations. At the procedure stations, the students are asked to perform a practical task, such as preparing a blood film or determining vital capacity, while an examiner observes passively, ticking a prepared checklist containing the steps in carrying out the procedure, unlike in the traditional practical examination where the examiners are mere invigilators. Each procedure station alternates with a question station that is related to it. The merits of the OSPE include, among others, its ability to test a wide range of skills and not mere factual knowledge (MCQ and short essay are primarily designed to test factual knowledge), and its high reliability and objectivity through the use of carefully prepared checklists.
Conclusions
Since the role of assessment in medical education cannot be overemphasized, it is important that teachers in medical schools acquaint themselves with the usefulness and limitations of the different assessment methods. An appreciation of the limitations of the traditional assessment methods should stimulate medical teachers not only to address these shortcomings but also to embrace relatively newer methods, which are widely accepted as more reliable and valid.
References
|1.||Gipps CV. Sociocultural aspects of assessment. Rev Res Educ. 1999;24:355-92. |
|2.||Jolly B, Rees L. Medical education in the millennium. Oxford: Oxford Medical Press; 1998. |
|3.||Palomba CA, Banta TW. Assessment essentials: Planning, implementing, and improving assessment in higher education. San Francisco: Jossey-Bass Publishers; 1999. |
|4.||Norcini J, Anderson B, Bollela V, Burch V, Costa MJ, Duvivier R, et al. Criteria for good assessment: Consensus statement and recommendations from the Ottawa 2010 Conference. Med Teach 2011;33:206-14. |
|5.||Epstein RM. Assessment in medical education. N Engl J Med 2007;356:387-96. |
|6.||Maxim BR, Dielman TE. Dimensionality, internal consistency and interrater reliability of clinical performance ratings. Med Educ 1987;21:130-7. |
|7.||Sloan DA, Donnelly MB, Johnson SB, Schwartz RW, Strodel WE. Use of an objective structured clinical examination (OSCE) to measure improvement in clinical competence during the surgical internship. Surgery 1993;114:343-50. |
|8.||Marton F, Saljo R. On qualitative differences in learning: II-Outcome as a function of the learner's conception of the task. Br J Educ Psychol 1976;46:115-27. |
|9.||Ponnamperuma GG, Karunathilake IM, McAleer S, Davis MH. The long case and its modifications: A literature review. Med Educ 2009;43:936-41. |
|10.||Swing SR. Assessing the ACGME general competencies: General considerations and assessment methods. Acad Emerg Med 2002;9:1278-88. |
|11.||Wang X. Teachers' views on conducting formative assessment in Chinese context. Eng Lett 2008;16:2-5. |
|12.||Bloom BS. Taxonomy of Educational objectives, Handbook 1: The cognitive domain. New York: David McKay Co Inc.; 1956. |
|13.||Harrow A. A taxonomy of Psychomotor domain: A guide for developing behavioral objectives. New York: David McKay Co Inc.; 1972. |
|14.||Krathwohl DR, Bloom BS, Masia BB. Taxonomy of educational objectives, the classification of educational goals. Handbook II: Affective domain. New York: David McKay Co Inc.; 1973. |
|15.||Van der Vleuten C. The assessment of professional competence: Developments, research and practical implications. Adv Health Sci Educ 1996;1:41-67. |
|16.||Shavelson RJ, Webb NM. Generalizability theory: 1973-1980. Brit J Math Stat Psy 1981;34:133-66. |
|17.||Swanson DB. A measurement framework for performance based tests. In: Hart IR, Harden RM, editors. Further developments in assessing clinical competence. Montreal: Can Heal; 1987. p. 13-45. |
|18.||Messick S. Validity. In: Linn RL, editor. Educational measurement. 3rd ed. New York: Macmillan; 1989. p. 13-103. |
|19.||Wass V, Van der Vleuten C, Shatzer J, Jones R. Assessment of clinical competence. Lancet 2001;357:945-9. |
|20.||Ahmed AM. Examination of the clinical examination: Notes on assessment of clinical competence in our medical schools. Sudan J Public Health 2011;6:29-35. |
|21.||Miller GE. The assessment of clinical skills/competence/performance. Acad Med 1990;65:S63-7. |
|22.||Nayar U. Assessment in medical education. In: Sood R, editor. Assessment in medical education: Trends and tools. New Delhi, India: K.L. Wig Centre for Medical Education and Technology; 1995. p. 1-16. |
|23.||Al-Wardy NM. Assessment methods in undergraduate medical education. Sultan Qaboos Univ Med J 2010;10:203-9. |
|24.||Wilson TL, Coyle L. Improving multiple choice questioning: Preparing students for standardized tests. Clearing House 1991;64:422-4. |
|25.||Case SM, Swanson DB. Constructing written tests for the basic and clinical sciences. 3rd ed. (revised). National Board of Medical Examiners; 2002. Available from: http://www.nbme.org/publications/item-writing-manual-download.html [Last retrieved in 2012]. |
|26.||McCoubrie P. Improving the fairness of multiple-choice questions: A literature review. Med Teach 2004;26:709-12. |
|27.||Frary RB. An NCME instructional module on formula scoring of multiple choice tests (correction for guessing). Module 4. Educ Meas Issues Pract 1988;7:33-8. |
|28.||Knox JD. How to use modified essay questions. Med Teach 1980;2:20-4. |
|29.||Khan MU, Aljarallah BM. Evaluation of modified essay questions (MEQ) and multiple choice questions (MCQ) as a tool for assessing the cognitive skills of undergraduate medical students. Int J Health Sci (Qassim) 2011;5:39-43. |
|30.||Hamdy H. Blueprinting for the assessment of health care professionals. Clin Teach 2006;3:175-9. |
|31.||Norcini J. The death of the long case. BMJ 2002;324:408-9. |
|32.||Zeller A, Battegay M, Gyr N, Battegay E. Evaluation of unstructured medical school examinations: Prospective observational study. Swiss Med Wkly 2003;133:184-7. |
|33.||Gleeson F. Assessment of clinical competence using the objective structured long examination record (OSLER). Med Teach 1997;19:7-14. |
|34.||Parakh K. Assessment in medical education (correspondence). N Engl J Med 2007;356:2108-10. |
|35.||Jauhar S. The demise of the physical examination. N Engl J Med 2006;354:548-51. |
|36.||Hijazi Z, Premadasa IG, Moussa MA. Performance of students in the final examination in paediatrics: Importance of the short cases. Arch Dis Child 2002;86:57-8. |
|37.||Harden RM, Stevenson M, Downie WW, Wilson GM. Assessment of clinical competence using objective structured examination. Br Med J 1975;1:447-51. |
|38.||Newble D. Techniques for measuring clinical competence: Objective structured clinical examinations. Med Educ 2004;38:199-203. |
|39.||Nayar U, Malik SL, Bijlani RL. Objective structured practical examination: A new concept in assessment of laboratory exercises in preclinical sciences. Med Educ 1986;20:204-9. |
|40.||Malik S, Hamad A, Khan H, Bilal M. Conventional/traditional practical examination (CPE/TDPE) versus objective structured practical evaluation (OSPE)/semi-objective structured practical evaluation (SOSPE). Pak J Physiol 2009;5:58-64. |