Previous presentations to eResearch Australiasia described the implementation of Spartan, the University of Melbourne’s general- purpose HPC system. Initially, this system was small but innovative, arguably even experimental. Features included making extensive use of cloud infrastructure for compute nodes, OpenStack for deployment, Ceph for the file system, ROCE for network, Slurm as the workload manager, EasyBuild and LMod, etc.
Secure Shell is a very well established cryptographic network protocol for accessing operating network services and is the typical way to access high-performance computing (HPC) systems in preference to various unsecured remote shell protocols, such as rlogin, telnet, and ftp. As with any security protocol it has undergone several changes to improve the strength of the program, most notably the improvement to SSH-2 which incorporated Diffie-Hellman key exchange. The security advantages of SSH are sufficient that there are strong arguments that computing users should use SSH "everywhere".
Recently, a friend expressed a degree of shock that I could pull old, even very old, items of conversation from emails, Facebook messenger, etc., with apparent ease. "But I wrote that 17 years ago". They were even dismayed when I revealed that this all just stored as plain-text files, suggesting that perhaps I was like a spy, engaging in some sort of data collection on them by way of mutual conversations.
For my own part, I was equally shocked by their reaction. Another night of fitful sleep, where feelings of self-doubt percolate. Is this yet another example that I'm have some sort of alien psyche? But of course, this is not the case, as keeping old emails and the like as local text files is completely normal in computer science. All my work and professional colleagues do this.
What is the cause of this disparity between the computer scientist and the ubiquitous computer user? Once I realised that the disparity of expected behaviour was not personal, but professional, there was clarity. Essentially, the convenience of cloud technologies and their promotion of applications through Software as a Service (SaaS) has led to some very poor computational habits among general users that have significant real-world inefficiencies.
It may initially seem counter-intuitive, but sometimes one needs to process an image file without actually viewing the image file. This is particularly the case if one has a very large number of image files and a uniform change is required. The slow process is to open the images files individually in whatever application one is using and make the changes required, save and open the next file and make the changes required, and so forth. This is time-consuming, boring, and prone to error.
Yesterday I noticed that a number of Drupal websites I look after were down, and it was a pretty interesting error message:
MYSQL ERROR 2049 (HY000): Connection using old (pre-4.1.1) authentication protocol ref used (client option 'secure_auth' enabled)
There are two types of people in the world; those who have lost data and those who are about to. Given that entropy will bite eventually, the objective should be to minimise data loss. Some key rules for this backup, backup often, and backup with redundancy. Whilst an article on that subject will be produced, at this stage discussion is directed to the very specific task of recovering data from old machines which may not be accessible anymore using Linux.
Some thirteen years ago I worked with Xen virtual machines as part of my day job, and gave a presentation at Linux Users of Victoria on the subject (with additional lecture notes). A few years after that I gave another presentation on the Unified Extensible Firmware Interface (UEFI), itself which (indirectly) led to a post on Linux and MS-Windows 8 dual-booting. All of this now leads to a some notes on using MS-Windows as a host for Ubuntu Linux guest machines.
Almost two years ago I made a short blog post about how the Net Promoter Score (NPS), commonly used in business settings, is The Most Useless Metric of All. My reasons at the time is that it doesn't capture the reasons for a low score, it doesn't differentiate between subjective values in its scores, and it is mathematically incoherent (a three-value grade from an 11-point range of 0-10).
The translation of arithmetic to physical hardware with using the IEEE standard employed numerical representation is fraught with difficulty. As is well known by any who have used even a pocket calculator, computer processors are imprecise with dangerous rounding errors, which vary on different systems. Further, the standard representation method, IEEE 754 "Standard for Floating-Point Arithmetic" (1985, revised 2008), is extremely inefficient from an engineering perspective with increasing physical cost when additional precision is sought.
High Performance Computing (HPC) is the most effective method to process increasingly large and complex datasets, making them increasingly critical for research organisations. Researchers wanting to use HPC resources often start with low levels of skills in using those systems. Despite this situation, educational programmes coming out of well-informed user needs analysis and/or a widely acknowledged set of required skills, capabilities and knowledge are rare.