Teaching High Performance Computing to a Multicultural Audience

Abstract

It is well recognised that the teaching of adults can be somewhat distinguished from pedagogical approaches, partially due to their social situation (e.g., legal rights, responsibilities, expectations) and partially due to developments in their psychology (e.g., neurological differences, use of formal cognitive and moral reasoning). This distinction is at least somewhat matched by the transitional stages from primary to secondary and then into tertiary education. Perhaps less well established, however, is the notion of a "post-tertiary" educational environment where research and integration into a community of scholars are of importance. This project reviews the possibility of such integration through optimising course content for that purpose.

The project initially engages in a descriptive account of the course, examining and reviewing the instructional method employed, including presentation, lesson plans, and learning styles. This examination will position the method and content according to appropriate theoretical frameworks to illustrate the relative appropriateness of the teaching method to the quality of learning processes and outcomes, especially in reference to the characteristics of the students who are attending the classes and the various situational constraints of the teaching environment.

In addition to the relevant available readings in the course material, empirical studies are also reviewed that are relevant to both the instructional content and the aspect of the instruction where improvement is sought. There are some difficulties in this of course, as it is quite rare that the two issues are considered simultaneously, least of all in the specialist sub-discipline being examined. Some material (e.g., Rodger (1995), Naps et al. (2002)) provides opportunities to examine delivery technologies, whereas other material takes a learner-centred orientation (e.g., Ben-Ari (2001), Cooper and Cunningham (2010)). Fortunately, studies are also available which examine the relationship between multiculturalism and teaching computer science and operations (e.g., Duval et al. (2001), Little et al. (2001)).

Following this evaluation of the course and the literature review, a specific, singular proposal is made in detail for improving the learning outcomes from the course. Further and future issues for investigation will also be noted in brief.

High Performance Computing at the Victorian Partnership for Advanced Computing

High performance computing (HPC) is perhaps a field which is not particularly well understood outside of those who use it regularly. The system engineers who build and maintain such systems are likely to point to metrics such as the six-monthly updates of the Top 500, which documents the most powerful computer systems in the world. Whilst this provides measured and potential floating point operations per second, this abstract metric does not illustrate the practical importance of such systems. Scientists and engineers of course publish various papers on their developments and discoveries, but it is rare that high performance computing is mentioned in anything but a passing manner.

Nevertheless it is this combination of scientific modelling that matches empirical evidence with HPC systems (meaning hardware, operating system, and applications) that drives much of contemporary research publication and technological development. Explained simply, an HPC system takes a computational task, differentiates between those parts of the task that must be conducted in serial and those which can be conducted in parallel, and allocates processor duties accordingly. To illustrate with a very simple example, the generation of a pool of random numbers (e.g., for Monte Carlo experiments) is something that is extremely inefficient when conducted in serial, but very efficient in parallel; it is akin to rolling a die a thousand times in sequence to generate a pool of results compared to rolling a thousand dice once.
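The dice analogy can be sketched in a few lines of Python. This is only an illustrative sketch and not part of the VPAC course material: the function names roll_chunk and roll_pool, the seeding scheme, and the use of a thread pool are all assumptions made for the example. The point it shows is the decomposition itself: each chunk of rolls depends on nothing but its own seed, so the chunks can be handed to separate workers and the results simply gathered at the end.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def roll_chunk(seed, count):
    """Roll `count` six-sided dice with a private generator seeded by `seed`."""
    rng = random.Random(seed)
    return [rng.randint(1, 6) for _ in range(count)]


def roll_pool(total, workers, base_seed=0):
    """Split `total` rolls into `workers` independent chunks and gather them.

    Hypothetical helper for illustration only. Threads stand in for the
    separate processors of an HPC system; in CPython they will not give a
    real speed-up for this workload, but the division of work is the same.
    """
    # Divide the rolls as evenly as possible; the first `extra` chunks get one more.
    share, extra = divmod(total, workers)
    counts = [share + (1 if i < extra else 0) for i in range(workers)]
    # One distinct seed per chunk (a simple assumption, not a rigorous
    # scheme for statistically independent parallel random streams).
    seeds = [base_seed + i for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        chunks = list(pool.map(roll_chunk, seeds, counts))
    # The only serial step: concatenate the per-worker results.
    return [roll for chunk in chunks for roll in chunk]
```

Calling roll_pool(1000, 8) returns a pool of one thousand rolls built from eight independent chunks. On an actual cluster the same division of work would more likely be expressed through a job array or a message passing library, with each chunk running on its own processor.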

The Victorian Partnership for Advanced Computing (VPAC) is a registered research agency, owned by its member universities, that has provided HPC facilities and built and maintained such systems for member institutions and government bodies for the past thirteen years. In that time it has had two systems that have been reckoned among the world's Top 500 and provided the installation and initial build for a third, owned by a member institution. For a number of years participation in such systems, even by researchers, was very limited. Whilst VPAC ran a summer student program for recent graduates and honours students, the conduct of courses specifically on using the facilities offered was quite limited: a short course of a few hours for a dozen researchers, conducted once a quarter.

In recent years there has been a dedicated revival of the training courses offered by the company. Based on anecdotal evidence, demand was increasing (as more and more HPC systems came online, and the research benefits became increasingly well-known) and the need for core command-line Linux operating system skills (over 99.5% of the world's top HPC systems use Linux or a Linux-like operating system) was increasingly evident; surveys at the beginning of each class indicated that approximately a third of participants had little or no prior experience in such operations. It is of little wonder in such circumstances that Greg Wilson (2008) has described high performance computing as "harmful", referring to a term in computer culture where a system is less than optimal. Wilson notes that HPC, despite its ability to reduce research publication time and its capacity to process data, still suffers from relatively low uptake, as even scientific researchers often do not have the core skills to use the systems.

As a result, in 2008 the agency engaged in a major revision and expansion of its course offerings. Today, VPAC runs three day-long courses every two months, entitled Introductory, Intermediate, and Advanced High Performance Computing Using Linux. In terms of content, the Introductory course introduces researchers to the Linux environment, file transfers, editing and manipulation, environment modules, and the Portable Batch System (PBS) for scripting jobs, along with job submission using mathematical, biological, and engineering examples. In the Intermediate course, various archiving and scripting strategies are illustrated, along with the manipulation of file permissions, output and program redirections, and job control, including job dependencies, arrays, and interactive jobs using PBS, again with multidisciplinary examples. The third, Advanced, course takes scripting to the functional programming level, and illustrates the effects of different computer architectures, the principles of parallel operations, and the use of C and Fortran with the Message Passing Interface (MPI) used in parallel programming, along with functions and data types, communicators, messages, deadlocks, collective communications, profiling, latency hiding, and derived data types.

Course material is presented in three workbooks, each of around 70 pages, supplemented with online material. With a small class size, the instructor gives each participant the opportunity to explain to those present any existing experience with Linux, the command line, computational tasks, and their research projects. An introduction to the setting is provided (or revised) along with an outline of what will be taught on the day. Material in the workbook is carried out in sequence, with the teacher carrying out the tasks with the students via a projector and stopping at each step along the way to ensure that nobody is left behind in acquiring the direct experience of interacting with the system. It should be mentioned that the emphasis on learning by doing was part of the original MIT 'Hacker Ethic', "Yield to the hands-on imperative" (Levy, 1984).

In addition, there is active encouragement for collaboration between the researchers to assist in each other's activities. A meal break in the middle of the day also provides the opportunity for further casual collaboration and elaboration of what is being taught. At the conclusion of each day's course the instructor - rather than the learners - is assessed by an anonymous structured survey. Emphasis is given that the teaching is very much of the "boot camp" variety, with almost twenty hours of instruction and work in a brief period, that revision of the provided material should be conducted regularly in the course of computational activities, and that further assistance is available from the agency.

It should be noted that the courses were not always taught in this fashion. In the past, people with significant technical expertise but with minimal training experience were responsible for running the courses and, as a result, it was more by luck than by design that collaboration among the participants occurred, or that attention was given to how each participant was faring with the course material. This is, of course, no fault of the trainers as they had not received prior education themselves in the relevant discipline. But the result was that much of the previous instruction was the digital equivalent of "chalk and talk".

Further, and even more recently, there has been an emphasis on using practical scientific, engineering, and mathematical problems in the course content. Previously, material was more illustrative of various quirks of the computing system itself, which certainly did provide evidence of common pitfalls, or provided jovial examples from popular culture, which certainly did create a sense of comfort with the material. Course material is now orientated towards inspirational examples, such as how to discover genetic sequences which reveal a propensity towards melanoma, the structural integrity of buildings in high winds, biomass in high CO2 environments, drug docking with aspirin and phospholipase, and patterns in earthquake data.

The success of the revision is primarily illustrated by the results of the feedback, which now indicate very high levels of satisfaction with the course material, presentation, and facilities. There is also some suggestion that the popularity of the course is evident in how quickly enrolments are filled, typically a few days after being offered, although it is acknowledged that this can also be an indication of direct demand and of the increasing prevalence of and need for HPC in scientific research.

Survey and Literature Review

From a survey of the cultural origins of attendees at the course over the past year, over three-quarters (75.55%) are from a non-English speaking background (NESB). This raises interesting questions about the appropriateness of the course content and instruction to the audience, not only to ensure that requisite levels of politeness and professionalism are applied but also to encourage the best possible learning outcomes for such a mixed audience. With regard to the former, popular psychology and basic sociology have provided many examples of variation in symbolic gestures or concepts of status and personal space (e.g., Anderson and Taylor, 2007, pp. 107-110). Given that these symbolic gestures are largely arbitrary and may even be contradictory across different cultures (e.g., the variation in the 'thumbs up' gesture from positive, to innuendo, to insult), it is difficult to expect any teacher to have expertise in the nuances of each and every culture that they encounter, or may encounter, in the learning environment. Rather, the safest orientation is towards neutrality.

Whilst this may provide some positive learning outcomes, it is questionable whether this is sufficient. Unfortunately, the effectiveness of teaching strategies in high performance computing to a multicultural audience is not an area which has generated a significant degree of empirical research, if indeed any. As a result, attempts have been made to broaden the field of inquiry in the hope of finding material that will be of use for the inquiry in question.

Naps et al. (2002) have suggested the use of visualisation technology to illustrate concepts in computer science and, in doing so, suggest a taxonomy of learner engagement related to Bloom's taxonomy of understanding. Given that the research group which conducted the study is itself quite multicultural (including researchers from the United States, Hong Kong, Germany, Finland, and Spain), it seems probable that the study could provide some insight. The study sought to avoid the twin perils of overhead for the instructor and lack of utility for the learner by emphasising the need for engagement in the learning activity. To assist this it is recommended that the material be properly resourced, be orientated towards the learner's level, provide multiple views, and so forth. Surveys of current practices were carried out, including the current use of various visualisation technologies and the views of teaching staff on their utility.

Part of the surveys conducted indicated various benefits and issues of visualisation in teaching. The primary benefits included the improvement in the teaching experience, the heightened level of student participation, enjoyment, and learning, and the ability to discuss conceptual frameworks using available technology. Issues included the time needed to learn the new tools or develop visualisations, concerns about the availability and robustness of the technology, and the time taken to adapt visualisations to a teaching approach. In developing a taxonomy of engagement, the authors provide a scale similar to Bloom's, from non-engagement to learners presenting their own visualisations. Unfortunately the data presented was not differentiated according to cultural source and, given that a very high level of extensive visualisation is carried out in the existing courses, the Naps study is of limited utility, except to confirm the benefits of current practices.

On a more general level, Rodger (1995) appeals for interactive lectures as distinct from verbatim lecturing. It is important to note that learning presentations can be visual but not interactive, and plausibly interactive but not visual. Rodger describes the benefits for mid-level computer science courses in data structures, algorithms, and automata theory at Rensselaer Polytechnic Institute which moved from being traditional lectures to interactive lectures. Even in classes where computer terminals were not available, interactivity took place using time-honoured tools such as pencil and paper and neighbours. Of particular note was that, through group and interactive assessment of tasks, interest among the students was high, with the realisation that there were different solutions to the same problem. Where computers were available, algorithms were illustrated through animations, and immediate feedback and evaluation could be carried out according to student questions. Rodger's evaluation of the interactive lecturing system is highly positive, arguing that students had moved from an environment where they were often asleep (!) to one where they were thoroughly engaged in the educational process. Again, however, it must be noted that this level of interactivity is now carried out in the VPAC courses, and that evaluation of its effectiveness among students of a NESB was not conducted. Once more, however, it confirms the effectiveness of existing practices.

Taking a different approach to that of instructional style, Ben-Ari (2001) accounts for the constructivist student-centred approach to learning as it applies to computer science. After conducting a survey of the application of constructivist theory to science education as a whole, attempts are made to apply the theory to computer science education. Ben-Ari argues that the constructivist model holds that ontology is irrelevant, epistemology is relativistic and fallible, knowledge is acquired recursively, and that learning must be active. As part of the illustrative process, several studies are reviewed that show the difficulties learners have with relativistic concepts (such as variable assignment) among beginners and the conflation of objects with variables, classes, etc. among more advanced object-orientated programmers. The issue, the author claims, is that the learners come out with an ontological attachment of the program to the computer, rather than a realisation that programs are simply logical. As a result, computer science is considered hard, autodidactic programming is unsuccessful, and graphical user interfaces are unfriendly and non-intuitive.

By teaching computer science in a constructivist manner, Ben-Ari illustrates that the GUI is just a representational system for an underlying model, which must be explicitly taught, that abstract object-orientation can only follow after concrete models (such as structured and procedural programming) have been evaluated by the learner, and that recursive learning by bricolage and minimalism can be achieved through debugging processes and immediate and meaningful activities. Once again, however, a thorough review of the teaching material and instruction method indicates to a very large degree that, following the review of 2008, the teaching of HPC and Linux at VPAC now fulfils the constructivist approach illustrated by Ben-Ari. Further, there was no evaluation by the author of the possibility of cultural variation.

Finally, in an empirical paper that does concern itself with multicultural issues in computer science, Duval et al. (2001) review the Ariadne Knowledge Pool System, a digital library of educational resources. Of particular note is that the system is multi-lingual, although the distribution of content is more in line with membership of the original consortium than with Europe as a whole. This is, however, but one component of the meta-data that is associated with the files in the repository, with others including somewhat more subjective evaluations such as didactic context and difficulty level. In this sense the utility of copying more of the content currently available in the course manuals to an online repository is agreeable, and this is currently in progress. Certainly this would satisfy the criteria of re-usability and could also, via social networks of researchers according to culture, generate more, and more accessible, interest in the course material.

With significant suggestions for learner engagement already applied in terms of teaching methodology and content availability, a review of curriculum is also a worthwhile angle. This has been undertaken, specifically with recognition of cultural issues, by Little et al. (2001). The authors seek to incorporate these issues into computer science and information systems education in the context of globalisation, which is certainly an appropriate issue for the current topic. Curriculum areas are defined in terms of technology overview, programming, systems analysis and design, software engineering, human-computer interaction, telecommunications and networking, databases, and web development. These were cross-referenced with cultural issues of multiculturalism, organisational cultures, professional cultures, socio-economic issues, and gender issues. Whilst these categories themselves are certainly open to question, the authors nevertheless identified curriculum areas that could be subject to modification according to the cultural issues identified; the exceptions were programming, software engineering, and networking. On a technical level, the proposed integration of cultural issues into the curriculum included internationalisation of formats and cultural variance in interface preferences, along with several sociological and historical exercises whose purpose was consciousness raising. With regard to the former examples, international formats are already in use, along with a very high level of options for learning style preferences. With regard to the latter, the course content has been very carefully designed to avoid any stereotypical representations in favour of concentrating on the scientific and programming requirements.

Proposal

It is not uncommon for research, pure or applied, to confirm established results. Certainly this is the case here, as this modest review of empirical papers on teaching computing to a multicultural audience has confirmed existing practices at VPAC. In terms of delivery method, instead of direct lecturing, highly interactive sessions are carried out across several media, both synchronous and asynchronous, to incorporate the widest possible variety of learning style preferences. The course curriculum has been designed to provide knowledge in sequential stages of complexity with practical but diverse problems across several scientific disciplines, with content that is both motivational and avoids any stereotypical portrayal of individuals. Learners are provided with many opportunities within the curriculum and in the delivery method to engage in collaborative activity and to develop networks for further research. With the major components of engaged learning experiences strongly applied, perhaps it should be expected that learner evaluation has been so positive. Certainly it can be suggested, from the responses, that engagement in method and motivational neutrality in curriculum content can perhaps be applied universally, at least among post-graduates, relatively independent of cultural background.

Nevertheless, a nagging question does remain as to why such a significant and disproportionate number of students from a NESB are enrolling in and attending this course. Research reported by Andrew Harvey and Kemran Mestan of the Access and Achievement Research Unit of La Trobe University (Harvey and Mestan, 2012) may provide some insight into this question; it notes that the existence of a significant language barrier tends to mean that, at the tertiary level, NESB students "are underachievers at university and underemployed after it". The probable reason for the enrolment demographic, and even the responses, becomes obvious only when considered from the adult learner perspective; enrolment occurs because the learners believe that they need training in the subject.

Obviously this is not to suggest that postgraduate students in scientific disciplines are in need of English training. Their English skills are certainly good enough for them to advance to a level of education that the majority of adults from an English-speaking background have not achieved. What the insight does suggest, however, is that subsequent regular contact, assistance, and referrals would be of significant help to such students in a manner that combines both mentoring and outreach. On an undergraduate level, with reference to a multicultural environment, research by Peckham et al. (2007) suggests that retention through mentoring can be carried out with a high degree of success.

Such an orientation alters the purpose of the interest in the multicultural environment as well. As Hazzan (2006) illustrates through three significant case studies (Agile, Carnegie Mellon University, Siemens), successful multicultural programs do not treat such issues as a target but rather as a means. As the organisation itself will benefit by enhancing and embracing diversity, it is in the organisation's interest - in addition to ethical justifications - to carry out the project. Hazzan argues from this perspective that adjusting technological courses to better fit unrepresented groups ought to be abandoned on the grounds that "the fit" already exists. Whilst this is an overly optimistic assessment of the possibility of sexism, racism, etc. in content and methods, there is certainly nothing wrong with being motivated by organisational interests for cultural change as well as making content and instructional changes based on universalistic moral principles.

As an organisation funded by its member universities, it is in VPAC's interest to have increased enrolments in its HPC training courses and a high level of user satisfaction resulting from these courses. From the universities' perspective, well-trained scientific researchers will be more able to produce quality research papers faster, which will attract greater funding. Thus the purpose of VPAC's training is not just the provision of a course to those who need it (with the noted NESB emphasis), but indirectly to encourage research publications. As a result, the specific example of instructional change recommended to improve student learning does not lie within the formal provision of course content itself, but rather outside of it.

VPAC will therefore use its existing training staff, who have already established a rapport with the researchers, to contact a few researchers a week to evaluate their progress in using HPC systems, their collaboration with fellow researchers who utilise applications in their area of interest, and their use of the organisation's resources in the publication of research papers. This will, of course, be conducted with the agreement of the researchers in question. At the end of every six months the project will be reviewed by the organisation, using metrics of enrolment in the courses, papers published which have utilised the organisation's HPC systems, and feedback from the researchers themselves.

Further Elaborations

The great weakness in HPC is that, despite its promise of providing mass computational resources for the scientific community, it has struggled to achieve widespread uptake because of a lack of training and experience in the research community. The presence of well-developed, media-varied, and interactive course content and instruction at VPAC has led to an increase in uptake among that immediate community. This, however, remains a small percentage of the potential usage, and therefore of the potential improvement of scientific research in the region. The introduction of a mentoring and outreach project as an instructional extension to the training courses, for revision and application, also becomes a form of evaluation of the course.

The greater adoption of HPC systems by the scientific community raises the prospect of establishing coursework in the subject. This is currently limited to electives in computer science in a handful of universities, with one well-known postgraduate level degree at the University of Edinburgh, although there are continuing research journals and conferences on the subject, with strong industry support. Whilst the lack of coursework may seem to be a surprising result for what is the provision of the world's most powerful computer systems, it is an understandable result when viewed as a highly specialised sub-discipline. Only when there is mass adoption of HPC systems by scientific researchers will there be the need for additional HPC systems and system engineers, and only then will there be the opportunity to conduct extensive coursework in the subject. This possibility is therefore a future elaboration dependent on the suggested project.

Bibliography

Anderson M., Taylor L., Sociology: The Essentials (4th edition), Thompson Higher Education

Ben-Ari, M. (2001). Constructivism in computer science education. Journal of Computers in Mathematics and Science Teaching, 20(1), 45-73.

Cooper, S., and Cunningham, S. (2010). Teaching computer science in context. ACM Inroads, 1(1), 5-8.

Duval, E., Forte, E., Cardinaels, K., Verhoeven, et al. (2001). The Ariadne knowledge pool system. Communications of the ACM, 44(5), 72-78.

Harvey, A., Mestan K., Language too big a barrier for non-English speakers, The Australian, October 17, 2012

Hazzan, O., (2006), Diversity in Computing: A Means or a Target?, System Design Frontier Journal, 38.

Levy, S., "Hackers: Heroes of the Computer Revolution", Doubleday Dell, 1984

Little, J. C., Granger, M., Adams, E. S., et al. (2001, June). Integrating cultural issues into the computer and information technology curriculum. In Working group reports from ITiCSE on Innovation and technology in computer science education (pp. 136-154). ACM.

Naps, T. L., Rößling, G., Almstrum, V., Dann, W., Fleischer, et al. (2002, June). Exploring the role of visualization and engagement in computer science education. In ACM SIGCSE Bulletin (Vol. 35, No. 2, pp. 131-152). ACM.

Peckham, J., Stephenson, P., Hervé, J. Y., Hutt, R., & Encarnação, M. (2007, March). Increasing student retention in computer science through research programs for undergraduates. In ACM SIGCSE Bulletin (Vol. 39, No. 1, pp. 124-128). ACM.

Rodger, S. H. (1995). An interactive lecture approach to teaching computer science. ACM SIGCSE Bulletin, 27(1), 278-282.

Wilson, G. V. (2008). "High-Performance Computing Considered Harmful". 22nd International Symposium on High Performance Computing Systems and Applications, 2008