Development and Evaluation of an Online Exam for Exercise Physiology During the COVID-19 Pandemic
ABSTRACT
Background
The COVID-19 pandemic necessitated substantial changes to university learning and teaching, notably conversion to online formats. Physical interaction is inherent in an exercise physiology curriculum, but it is unclear whether students' clinical skills can be adequately assessed online. This study describes the development of an online Objective Structured Clinical Examination (OSCE) and aimed to determine its appropriateness for assessing final year undergraduate exercise physiology students' clinical skills.
Methods
We converted our face-to-face (f2f) OSCE to an online format. This required station adaptation (e.g., editing scenarios to suit a telehealth format), technological considerations (for students, clients, and examiners), personnel and procedural aspects, and pilot testing. Fifteen students completed the online OSCE, which was performed in May 2020. All OSCE stations were recorded, then later observed by 4 experienced OSCE examiners who appraised whether online OSCE features were better, worse, or similar to f2f for fairly and accurately assessing student performance across multiple domains (i.e., communication, information technology, procedural and technical components, professionalism, quality of assessment, and risks).
Results
Of 3,540 responses, 2,846 (80.4%) indicated no difference in quality between the f2f and online OSCEs (P < 0.001). Of the remaining 694 responses, 654 (94.4%) indicated that the online OSCE was worse than f2f (P < 0.001), most notably in the risk domain.
Conclusion
The online OSCE was a sufficiently accurate and authentic clinical skills assessment for exercise physiology students. With ongoing challenges with clinical assessment posed by COVID-19 and telehealth likely to continue, the online format appears a suitable alternative and could be used to assess students online.
INTRODUCTION
The Objective Structured Clinical Examination (OSCE), introduced 5 decades ago (1), is a well-established assessment tool worldwide (2). Typical OSCE methods have students rotate through stations assessing different skills. This assessment modality promotes authenticity of assessment (3) using simulated or real patient encounters, which sufficiently assess learner performance in clinical encounters (4), with high reliability and validity (5,6). OSCEs have undergone multiple iterations in medical teaching (1,4,7) and, given the translatability from medicine to allied health (8–10), are increasingly used in allied health education (11).
Many disciplines use OSCEs including physiotherapy (12), occupational therapy (13), nursing (14), dietetics (15), and exercise physiology (16). In the University of New South Wales (UNSW) undergraduate exercise physiology program, OSCEs have been used for almost a decade. Briefly, consultation was held between clinical academics and accredited exercise physiologists to identify key exercise physiology competencies, which were then applied to clinical scenarios and simulated client encounters to develop our OSCE.
Refinement occurred after the piloting period to ensure content validity (based on feedback from clinical education experts, examiners, and academic staff) and reliability (using generalizability coefficients to assess the examination's ability to consistently rank students) (16). Factors impacting examiner judgment and variance have also been explored, and while multiple themes were found to influence ratings, these had little bearing on the overall judgment of students' performance (17), consistent with analyses of examiner grading in medicine OSCEs (18,19). This means examiners were able to consistently judge good and poor performances, indicating OSCEs are also a valid assessment in exercise physiology. Since its inception, the OSCE has been used to assess the competency of UNSW exercise physiology students and is the final hurdle before students are eligible to graduate.
In 2020, COVID-19 necessitated changes in university teaching and assessment across all disciplines. Consequently, innovations in clinical teaching were required to ensure efficient training of graduates (20,21). In parallel, telehealth services, which rapidly grew due to COVID-19 restrictions and are predicted to continue beyond the pandemic (22), support an aligned online modality for health education and assessment. Through this growth, it has been realized that specific training is required for effective telehealth service delivery and competency (23,24). While some learning activities and assessments are more readily transferrable to online format, others are not, and this is particularly true of competency-based skills assessments, traditionally delivered face-to-face (f2f).
For OSCEs, distance delivery may provide a viable assessment solution. Computer-based OSCEs have been explored over a surprisingly long timeframe: Novack et al. (25) found that students were satisfied completing a WebOSCE and rated it similarly to the onsite exam; however, those who performed the WebOSCE scored lower on most domains. Holyfield et al. (26) found improved exam efficiency in dentistry when students were rotated through computer-based stations. More recently, the teleOSCE and VirtualOSCE were deemed feasible and acceptable to students (27) and examiners (28), respectively. Finally, a f2f OSCE was adapted to teleconference delivery (29), with no differences in scores or failure rates between the two. Online OSCEs also allow efficient recording of stations, a process previously noted as time and resource demanding in f2f settings (30,31).
Based on this evidence, online OSCEs may be a feasible and valid alternative to f2f format, potentially with some advantages. This was demonstrated recently for osteopathy (32), but to our knowledge, no previous studies have sought to implement an online OSCE for exercise physiology.
Hereafter, we describe the conversion of our f2f OSCE to an online format and examine the advantages and challenges of this novel online competency-based format for assessing student performance fairly and accurately.
METHODS
The full methods are too extensive to include within this article. An overview is provided below; for detailed procedures, station information, and pilot structure, please see the Supplemental Methods.
Stations
Our f2f OSCE consists of 7 stations. Stations require students to perform either (a) a brief interview to gather relevant information and provide exercise and lifestyle education, (b) an exercise assessment, (c) an exercise prescription and instruction, or (d) all of the above, for different pathologies. Students' clinical competencies are assessed, and they are evaluated on communication, professionalism, and technical and procedural skills.
To ensure face validity of the online OSCE, we determined the abovementioned f2f features should be retained. Conversion to online required consideration of many factors, including relevant personnel roles, technological aspects, equipment, station adaptation, marking rubric modification, and client safety. To this end, a panel of experienced examiners reviewed each f2f OSCE station and determined which aspect(s) needed modification, and how, to suit an online format. Education stations remained unchanged, as did the professionalism aspect of each station. However, there were changes to the communication, technical, and procedural aspects of the others: because students could not physically perform many steps in the online format, the panel agreed to modify stations so that the student instead explained each step, with the client then self-administering much of the station under the student's direction.
Technology
We used the Microsoft Teams™ (Teams) platform for our online OSCE, using the video meetings, chat functions, screen sharing, and recording features.
Virtual assessment rooms were set up by an online organizer (see Personnel), who created separate Teams meetings for the waiting room and all stations. The station examiner, client, IT support, and onsite organizer were included in each invitation, which also contained the student schedule and station information. Additionally, a separate group chat was created for communication between all relevant personnel. This chat was used to convey messages about station timing and to report and solve IT issues or equipment problems.
The other key technological consideration was equipment for filming each station and optimizing the setup for students to observe clients. Again, the examiners determined this through discussion and several iterations of pilot testing (see Pilot testing). In the f2f OSCE, students can observe the client from multiple viewpoints by moving around the client while performing an assessment or exercise. This was not feasible online, so we instead needed to determine the best position for the Web camera(s) where multiple views were required. Our f2f strength assessment (Figure 1A) and exercise prescription (Figure 1B) stations, for example, require students to set up and adjust equipment for the client, select the appropriate weight, monitor technique from various positions, and adjust technique via verbal and, where appropriate, tactile feedback throughout. For the online version, the Webcam(s) therefore had to be positioned so the student could view the client from different angles where necessary, with the student instructing the client to adjust the setup as needed.



Citation: Journal of Clinical Exercise Physiology 11, 4; 10.31189/2165-6193-11.4.122
Personnel
Our f2f OSCE requires 7 examiners and 7 clients as well as an onsite organizer (see Table 1 for roles and functions). This was true of the online OSCE as well, but additional personnel were also required, including UNSW IT, who assisted with troubleshooting technical aspects both before and during the exam in addition to providing advice and procuring equipment where necessary. Most notably, the online OSCE required the establishment of an entirely new role: the online organizer.

Participants
Participants in this study were final year undergraduate exercise physiology students who were sitting the online OSCE as part of their normal summative assessment. All students who participated in the online OSCE (n = 15) were included in the present study.
Procedures
The flow of the main procedures for the online OSCE is shown in Figure 2. Procedurally, the f2f and online OSCEs were similar with respect to the number of stations, core competencies assessed, and marking criteria. There were, however, subtle differences. For example, students now instructed clients on equipment setup rather than performing the setup themselves. In the f2f OSCE, students are provided pen and paper, but in the online OSCE they instead used a whiteboard so that exam information could be erased at the end of each station. Students showed their blank whiteboard to the examiner before starting each station and to the online organizer at the completion of the exam. A room sweep was also conducted to ensure the student's room was clear of additional materials, and students read a statement regarding exam integrity. Each station was physically set up to accommodate a variety of scenarios using the same equipment, to minimize changeover time between rounds.
Pilot testing
Iterative pilot testing was used throughout the development of the online OSCE to refine each station. After stations were initially adapted, a pilot test with a recent graduate was undertaken to test the content, timing, equipment, examination space, and software for 2 stations. Based on student, examiner, and client feedback, adjustments were made, and a second pilot was conducted, this time with 4 recent graduates and 4 examiners. Again, feedback necessitated further adjustments, particularly for 1 station, which was reported as too long. Consequently, we ran a standalone pilot with the students to be assessed in the upcoming OSCE. Further feedback was taken and addressed, and procedures for the final exam process were created. All online OSCE personnel were advised to use the Teams application; where this was not possible, the Web version via Google Chrome™ was used to decrease compatibility errors. Only when the online OSCE setup and procedures were finalized was training created and conducted for examiners and clients. A full mock exam for students was then performed, both to provide a full practice for students and to familiarize staff with the new format. Final adjustments were then made, and the summative online OSCE was conducted.
Data analysis
All OSCE elements were recorded, then later observed by 4 experienced OSCE examiners who appraised whether online OSCE features were better, worse, or similar to the f2f OSCE in their ability to fairly and accurately assess student performance. The domains appraised were (a) communication, (b) IT issues, (c) procedural components of exercise physiology competencies, (d) technical components of exercise physiology competencies, (e) professionalism, (f) quality of assessment, and (g) risks. Each domain was further assessed by specific subdomains.
For each domain, examiners' responses were coded as follows: 3 = better than f2f, 2 = same as f2f, and 1 = worse than f2f. Nonparametric one-sample binomial tests were used to identify whether the quality of each domain, and of each subdomain within it, was the same across the online and f2f OSCEs. Where differences were identified, one-sample binomial tests were again used to determine whether the online OSCE was better or worse than the f2f OSCE. Statistical significance was set at α = 0.05.
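The two-step analysis above can be sketched as an exact two-sided binomial test. The following is a minimal Python illustration (standard library only) using the aggregate counts reported in the Results; the null proportion of 0.5 and the helper function names are assumptions for illustration, not the authors' actual analysis code.

```python
import math

def binom_pmf(k, n, p=0.5):
    """Binomial probability mass, computed in log space to stay stable for large n."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_pmf)

def binom_test_two_sided(k, n, p=0.5):
    """Exact two-sided binomial test: sum the probabilities of all outcomes
    no more likely than the observed one."""
    p_obs = binom_pmf(k, n, p)
    total = sum(binom_pmf(i, n, p) for i in range(n + 1)
                if binom_pmf(i, n, p) <= p_obs * (1 + 1e-12))
    return min(1.0, total)

# Aggregate counts from the Results (null proportion of 0.5 is an assumption,
# as the article does not state the null value explicitly):
n_total, n_same = 3540, 2846   # responses rating online the same as f2f
n_diff = n_total - n_same      # 694 responses rating it different
n_worse = 654                  # of those, responses rating online worse

print(binom_test_two_sided(n_same, n_total))   # step 1: same vs. different
print(binom_test_two_sided(n_worse, n_diff))   # step 2: worse vs. better
```

Both tests return P values far below 0.001, matching the pattern of significance reported in the Results.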
RESULTS
There were 3,540 observations recorded; 80% indicated no difference in quality between f2f and online OSCE (P < 0.001). Of the remaining 694 responses, 94% indicated that the online OSCE was worse than f2f (P < 0.001; Table 2).

Specific features of each observation type were also assessed to identify those that differed between the f2f and online formats. Within the communication domain, most features were marked as similar across OSCE modes. The most notable exception was nonverbal communication, the feature most often identified as different, and worse, in the online compared with the f2f OSCE. Verbal communication and the use of appropriate language and tone were also identified as mostly the same across modes, yet when differences were noted, the online OSCE appeared better than the f2f OSCE.
Although most procedural features were identified as similar between each mode (n = 828; 87%), some differences were identified, and they were all deemed worse online compared with the f2f OSCE. Client management and monitoring, risk mitigation, and technique monitoring were the features most often deemed worse. In contrast, providing appropriate exercise prescription and education, progressing or regressing exercises, and conducting the consult in a logical sequence were comparable between online and f2f OSCEs.
Within the domain of professionalism, all features were identified as mostly the same (~75% of the time) between each mode of OSCE. The features most often deemed different were those related to client needs, steps taken by the student to prevent harm, and integrity issues (i.e., cheating). Again, where differences were identified, they were all deemed worse online versus f2f.
Examiners were also asked to appraise whether the online OSCE was able to accurately and authentically assess the relevant competencies compared with the f2f version. Under this quality domain, examiners believed the online OSCE was valid and authentic for approximately 75% of observations (Table 3). When different, the online OSCE was rated as worse than the f2f version.

Of all the domains assessed, the risk domain differed most notably between the online and f2f OSCEs (Table 4). This was true for all subdomains, especially those related to the client being at risk of injury or the student's ability to fully demonstrate the exercise or assessment. In addition, the ability to fully observe all visual behaviors and subtle communication also differed much of the time. All differences favored the f2f OSCE.

Lastly, within the technical domain, most responses identified that the f2f and online OSCE were the same. The most apparent difference was for correctly directing the client to set up or use equipment. Unlike most other abovementioned domains where differences clearly favored the f2f OSCE, a similar number of examiners believed the online version was better than f2f (43%) compared with worse than f2f (57%; P = 0.58).
DISCUSSION
The main objective of this study was to determine if an online OSCE was appropriate to assess undergraduate exercise physiology students in the final stages of their program. Given no previous studies of this nature have been conducted in exercise physiology, we also sought to determine the advantages and challenges.
Implementation of the online OSCE required much methodological consideration. Our f2f OSCE already demands some technological skill: for example, academics create station items and assessment criteria in an online item bank, administrators create the relevant online assessment forms, and examiners use online assessment applications. The technological burden for the online OSCE, however, was greater. Firstly, we needed to determine which platform to use. Numerous technologies exist for online interactions, but after discussions with UNSW IT, it was decided that Teams would be most appropriate for our needs. At that point in time, Zoom was susceptible to cyberattack, and Blackboard Collaborate required a university identification and password that our clients would not have. Teams had been introduced to UNSW in 2019 but, due to COVID-19, rapidly grew in popularity and functionality. It allows video or voice meetings, chats, and screen sharing and can be easily accessed by external clients.
Importantly, examiners judged that the online OSCE accurately and authentically assessed students' competencies most (75%) of the time. This implies the online OSCE was a viable substitute when f2f exams were impossible due to COVID-19. This is an important finding: health care programs worldwide must maintain the pipeline of student training, especially as the workforce comes under considerable pressure from the direct and indirect impacts of COVID-19 (33).
While some differences favoring f2f were apparent, there were also numerous advantages of being online. For example, we were easily able to record each station. Recording f2f OSCEs has been discussed on several occasions (30,31) but requires time and resource-consuming processes or custom-built premises to occur seamlessly. However, recording OSCE stations makes it possible for independent review of students' performance later. Given OSCEs are usually used as summative assessments, often at end-of-phase or program waypoints, they are inherently high stakes in nature. Thus, the opportunity to further review student, station, and/or examiner performance is highly valuable for quality control and assurance and for future personnel training.
In relation to each person's location, students anecdotally reported feeling less anxious with the online format than when performing the OSCE in the same room as the examiner. It is well established that OSCEs are highly stress-inducing (34,35), so these anecdotes are important, as they speak to student wellbeing at a point in the history of education when it has been severely tested.
Also of great value, the online OSCE familiarized students with telehealth assessment. COVID-19 drove a profound shift in exercise physiology service delivery from predominantly f2f to a mixture of f2f and online. Telehealth competency is now relevant for both working exercise physiologists and students, with Exercise & Sports Science Australia (ESSA) developing Telepractice Professional Standards not only to inform telehealth delivery but also to provide structure for teaching and curriculum design (36). Students demonstrated their developing telehealth abilities well, and a future challenge is to embed telehealth delivery in health care education so that students encounter it during their regular clinical training.
Despite the noted advantages of the online OSCE, there were challenges too, and where differences were recorded, the online OSCE often rated as worse (Tables 2–4). These differences were most evident in the IT and risk domains (Table 4). IT in the online OSCE was a new but not unexpected challenge. For the f2f OSCE, the Internet is only required before the examination to sync the iPads™, then again afterwards to upload results; the iPads then function as electronic assessment forms without requiring Internet access. The online OSCE, however, required the examiner, student, and client to all have stable Internet access throughout the examination and to be familiar with the relevant software. We mitigated this with detailed written instructions and multiple practice sessions for students and examiners, but it remains a challenge not faced with a f2f OSCE.
In terms of risk, the largest difference between online and f2f was in our ability to assess how students manage client safety (Table 2). It is important to note, however, that behind the scenes, the simulated client was supervised in person by a qualified staff member in addition to being screened for station appropriateness beforehand. Therefore, the client's injury risk was low and likely comparable with the f2f OSCE. Regardless, many f2f OSCE elements required modification to reduce risk in the online format. For example, elements such as client monitoring (including responding to adverse events) and the correct setup and use of equipment were modified for some scenarios by removing, where necessary, scenarios with significant client impairments or complex equipment requirements. This could also be an advantage, however, as students needed to give greater thought to making assessments and exercises safe within a telehealth environment. Ideally, this will inform their future clinical practice, now that telehealth standards have been implemented and are expected within standard care (36).
The other main limitation identified was the restricted Webcam view. In a f2f OSCE, both client and student behaviors are potentially visible to an observer; this was not always the case online. We attempted to mitigate this through pilot testing to determine appropriate camera positions for each station, but even then, some movements and behaviors were not fully observable. This could explain why examiners rated nonverbal communication, a key soft skill, as worse online, with eye contact confounded by camera and screen position, body language dependent on how visible the student was, and tactile feedback impossible. It is worth noting, though, that while live observers might think they can see and hear everything, this is rarely true: many subtleties may be missed, but we simply accept (or forget) these shortcomings. As already discussed, the ability to review recordings may outweigh the limitations of the camera views.
Upon reflection, we noted that assessment timing was more efficiently controlled online, as students and onsite organizers were not required to physically move as they normally do between stations. Logistically, time burden was also reduced for most participants, with many able to participate offsite. Given OSCEs are a significant time and resource burden on health education programs, alleviating some of that burden is an important finding.
CONCLUSION
In comparison to f2f, the online OSCE was deemed a sufficiently accurate and authentic assessment of the clinical skills of final year exercise physiology students. While several challenges were identified, there were numerous advantages compared with f2f, not least flexibility in participation, student wellbeing (anecdotally), easy recording for review and quality assurance, and the introduction of telehealth assessment to students. With COVID-19 necessitating growth in online telehealth exercise physiology services, an online OSCE emerges as a fit-for-purpose assessment, at least for telehealth services, not only while the pandemic restricts f2f clinical exams but also post COVID-19, as the rise in telehealth services looks set to continue. Importantly, though, while the online OSCE provided a good alternative during the COVID-19 pandemic, f2f assessment of clinical skills remains the gold standard for clinical competency assurance. Subsequent iterations of the online OSCE will benefit from addressing the limitations identified here to further improve such assessments.

Figure 1. (A) Strength assessment and (B) exercise prescription station setup.

Figure 2. Flowchart of online Objective Structured Clinical Examination (OSCE) procedures.