Barcelona Centro Nacional de Supercomputacion Visit 2016

The Centro Nacional de Supercomputacion (BSC-CNS) is the peak national HPC facility in Spain and is home to MareNostrum (1.1 Pflops, 48,896 Intel Sandy Bridge cores in 3,056 nodes, including 84 Xeon Phi 5110P cards in 42 nodes, and 2 PB of storage; the 29th system in the Top500 in June 2013). They also have MinoTauro, a heterogeneous GPGPU cluster. MareNostrum is not the most powerful system in the world, but it is the most beautiful: it is housed in the Torre Girona chapel, a 19th-century (deconsecrated) church.

The BSC-CNS has extensive PhD and Masters programmes (with the Polytechnic University of Catalonia), internships, and a diverse training programme with PRACE, covering programming, performance analysis, data analytics, and HPC systems administration. The Centre also has a very active outreach programme, encouraging regular visits to their data centre, as well as an extensive training and lecture series.

The visit to the Centre was carried out with Research Platforms and NeCTAR and was advertised as part of the Severo Ochoa Research Seminar Lectures. After the lectures, we had an extensive discussion on the state and distribution of IaaS cloud deployments and on the internship programme that the BSC offers with other similar institutions, followed by a tour of the Torre Girona data centre. We were also treated to a very memorable lunch at a local Catalan restaurant, Restaurante Pati Blau. It was a fine way to conclude the 2016 tour of European HPC facilities.

Special thanks are given to the members of the various European facilities who took the time to accommodate my visit and provide tours of their facilities. This includes (my deepest apologies for any names I've overlooked!) Vassil Alexandrov, Maria-Ribera Sancho, Fabrizio Gagliardi, and Javier A. Espinosa Oviedo at the Barcelona Supercomputing Centre.

Originally posted at: http://blogs.unimelb.edu.au/researchplatforms/2017/02/18/barcelona-centr...

Comments

Martin Paulo's presentation to the Barcelona Severo Ochoa Research Seminar Lectures.

The NeCTAR research cloud is the largest academic and research cloud deployment in Australia, running on OpenStack. This presentation gives an overview of its architecture, use, and training programme.

I’m going to take you all on a history tour.

I was working as a software developer in the financial field in 2009, when I read that NASA had created their own Infrastructure as a Service cloud platform, called Nebula. It was intended to allow their scientists to share and work with large, complex data sets – something they felt AWS didn’t support very well.

I was still working in the financial field in 2010 when I read that NASA's and Rackspace's engineers had gone on a dinner date – and on that dinner date they had realized that Rackspace's object storage platform could be married to Nebula to produce an open source IaaS cloud platform. And so from this dinner date the OpenStack project was born.

It initially just had two parts:
A compute platform, now called Nova
Object storage, now called Swift.

After that initial launch, more and more modules were added by an ever-increasing community. For example:

Image management (Glance)
Block storage (Cinder)
A dashboard (Horizon)
Identity (Keystone)
Orchestration (Heat)

The list goes on.

So OpenStack is a modular cloud platform.

You can install subsets of these modules to support your cloud needs.
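
A minimal sketch of that modularity, using the openstacksdk Python library; the cloud name "nectar" below is a hypothetical clouds.yaml entry. Each module appears as its own service proxy on a single connection:

    import openstack

    # Connect using a named cloud from clouds.yaml; "nectar" is a
    # hypothetical entry name.
    conn = openstack.connect(cloud="nectar")

    # Each OpenStack module is exposed as a separate service proxy,
    # mirroring the modular architecture described above.
    for server in conn.compute.servers():        # Nova
        print("server:", server.name)
    for image in conn.image.images():            # Glance
        print("image:", image.name)
    for volume in conn.block_storage.volumes():  # Cinder
        print("volume:", volume.name)

If a deployment omits a module (say, Cinder), the corresponding proxy calls simply fail, while the rest keep working.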

Whilst I was reading about OpenStack, a series of meetings was taking place across Australia. The research community was being polled on what support they needed for their computing. And what they wanted were flexible, low cost resources they could access on demand.

In response, the Australian government funded two projects: NeCTAR and RDSI.

NeCTAR = National eResearch Collaboration Tools And Resources
RDSI = Research Data Storage Infrastructure

NeCTAR was tasked with providing computing resources to researchers.
RDSI was tasked with providing data resources to researchers.

There is irony here: OpenStack had just brought compute and storage together in one project, and Australia was splitting them into two. We were going in the opposite direction to OpenStack!

These two projects also chose to go in very different directions.

RDSI toured the country, talking to people, gathering requirements and information in order to produce specifications.

NeCTAR almost immediately launched a small deployment of an OpenStack-based cloud at the University of Melbourne, and started to solicit feedback from researchers.

It was at this point that my path crossed NeCTAR's. I was sent by my then employer to help NeCTAR iron out some of their software wrinkles, for OpenStack was very young and bugs were being encountered.

I was a Java developer, and OpenStack is written in Python, so my life became quite interesting!

Australia has a lot of states, a lot of politics, and a lot of distance. To bridge that distance, a high speed network has been built between the research institutions of the country. The network is managed by an organization named AARNet.

What NeCTAR planned to do was to bed in that initial OpenStack deployment with our bug fixes, then to roll out further nodes around the country, using the AARNet backbone. These nodes were to be hosted by academic institutions, working co-operatively to provide a bigger cloud, following a co-investment model.

The goal was to build a countrywide federated OpenStack deployment. You would log in and then be able to launch your VMs in any of these nodes around the country.
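
A hedged sketch of what that federation looks like from the user's side, again with openstacksdk; the image, flavor, and availability zone names below are hypothetical placeholders:

    import openstack

    conn = openstack.connect(cloud="nectar")

    # The same federated credentials can place a VM in any
    # participating node, selected via an availability zone.
    server = conn.create_server(
        "demo-vm",
        image="ubuntu-16.04",
        flavor="m1.small",
        availability_zone="monash-01",  # any federated node
        wait=True,
    )
    print(server.status)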

To support that login, a Shibboleth-based front end had to be developed, one that would interact with the AAF (Australian Access Federation, Australia's academic identity broker). Once installed, this meant that any researcher in Australia could log in to the NeCTAR cloud using their institutional credentials.

The initial deployment went live in Melbourne on January 31, 2012.

To drive the adoption of their cloud, NeCTAR then also funded a grant program that allowed research teams to apply for funds to develop either cloud tools or virtual laboratories.

It was at this point that my role with NeCTAR ended. My employer sent me off to work on one of the virtual laboratories.

So whilst I was creating software to run on the NeCTAR cloud, the NeCTAR engineers were adding new nodes.

It was whilst working on this virtual laboratory that a flaw in NeCTAR’s grand scheme became apparent to us.

Remember there were two organizations: RDSI and NeCTAR. RDSI were tasked with providing storage. NeCTAR with compute.

RDSI were out information gathering. NeCTAR were being agile, putting working software into researchers' hands.

And the researchers on my project needed storage. As did the researchers on the other projects. NeCTAR had provided object storage – but that's not what we needed. Object storage was just too slow and non-standard to work with the existing software we were building on. We needed block storage.

After some arm wrestling, NeCTAR agreed to spend some of their precious funds on sourcing block storage. But that block storage was locked into particular nodes.

So you could only run your "cloud" VMs in the data centre where you had your allocated block storage.
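
A hedged sketch of that constraint, using openstacksdk's cloud layer; the volume size, names, and availability zone are hypothetical:

    import openstack

    conn = openstack.connect(cloud="nectar")

    # Block storage is tied to one node (availability zone), so the
    # instance must run in the same data centre as its volume.
    volume = conn.create_volume(size=50, name="research-data",
                                availability_zone="melbourne-qh2")

    server = conn.get_server("analysis-vm")  # must be in the same zone
    conn.attach_volume(server, volume)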

All the while NeCTAR had kept on rolling out new nodes across the country. They had also kept adding new OpenStack modules to the mix, something they do with every OpenStack upgrade. For example, the latest module being deployed is Neutron: networking as a service.
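
As a rough sketch of what networking as a service adds, assuming openstacksdk once more (names and CIDR are hypothetical): each project can now define its own networks rather than sharing a flat one.

    import openstack

    conn = openstack.connect(cloud="nectar")

    # With Neutron deployed, projects manage their own topology.
    network = conn.network.create_network(name="project-net")
    subnet = conn.network.create_subnet(
        name="project-subnet",
        network_id=network.id,
        ip_version=4,
        cidr="192.168.10.0/24",
    )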

The NeCTAR OpenStack software is automatically deployed and managed by means of Puppet scripts.

There is a happy ending for those of us using NeCTAR who required storage. Eventually RDSI came to the party: they finished their paperwork and bought storage, co-locating some of it in the NeCTAR data centres.

So today you can apply for both compute and storage, and use them for your research, if you are an Australian researcher.

What's more, every Australian researcher gets a free six-month trial project on the NeCTAR cloud: to activate it, all they have to do is log in and accept the terms and conditions. Once they have finished their trial, they can apply for a permanent project on the NeCTAR cloud.

What NeCTAR and RDSI give researchers is uniform access to computing resources and the certainty that their data is being kept in Australia under Australian control. That's fairly important for some of our research fields.

The researchers use these resources very broadly. From individual researchers running single-instance WordPress sites to large teams supporting cloud-bursting genomics applications, it's all covered.

There are some pain points. The biggest, in my experience, are:
Windows, due to licensing issues.
Storage, due to history.
OpenStack’s application authentication method.

Windows and Storage cause you to be geographically bound, and hence negate some of the advantages of the cloud.

In fact, one or two of the nodes are trying not to encourage new users, because their current users have taken all of the available storage.

OpenStack’s application authentication method is too broad. If your application on the cloud is required to manage OpenStack on your behalf, you have to give it rights to your entire project. Not good if your instance gets hacked!
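
To make that concrete, here is a hedged sketch with the keystoneauth1 Python library (every value is a hypothetical placeholder): the application has to hold the user's password and a token scoped to the whole project, because no narrower grant exists.

    from keystoneauth1.identity import v3
    from keystoneauth1.session import Session

    # All values are hypothetical placeholders.
    auth = v3.Password(
        auth_url="https://keystone.example.org:5000/v3",
        username="researcher",
        password="secret",            # the user's actual password
        project_name="my-project",    # scope: the entire project
        user_domain_name="Default",
        project_domain_name="Default",
    )
    session = Session(auth=auth)
    token = session.get_token()
    # Anything built on this session can do everything the user can
    # do in the project: delete instances, volumes, images, and so on.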

NeCTAR provide a support team that answers end users' questions, a knowledge base, and access to two different types of training. The first type is on-demand online training, accessible via the support web site.

The second is where my history intersects with NeCTAR’s again.

At the start of the year I joined the University of Melbourne, and was immediately tasked with developing a hands-on, Software Carpentry-style course for NeCTAR, along with an accompanying "train the trainers" course.

Subsequent to that we’ve been delivering both of these courses around the country. And repeatedly to researchers at the University of Melbourne.

That’s because NeCTAR’s uniform interface, online support and knowledge base and training provide an advantage to the University: they can, in a sense, freeload off of this cooperatively developed platform.

The University has built their own private cell, for the use of their own researchers only. In this cell their researchers can access non-standard large virtual machines and large dedicated storage pools.

This private cell model is being followed by other host institutions around Australia.

But the University of Melbourne has gone one step further. They have used OpenStack in this private cell as the basis for an innovative new HPC system, which my colleague, Lev Lafayette, will now talk about.

Links:

OpenStack: https://www.openstack.org/

NeCTAR
Home page: https://nectar.org.au
Tools: https://nectar.org.au/about/tools/
Labs: https://nectar.org.au/labs-and-tools/
The node map: https://nectar.org.au/about/impact-and-usage/node-map/
Support: https://support.ehelp.edu.au/support/home (has training and knowledge bases)
Growing usage: https://nectar.org.au/about/impact-and-usage/virtual-laboratory-usage/

Training:
Software Carpentry: http://software-carpentry.org/
Hands on Training: https://github.com/resbaz/nectar-cloud-lessons
Train the Trainer: https://github.com/NeCTAR-RC/ResOsTrainTheTrainer
On line training: http://training.nectar.org.au/

Others
AARNet: https://www.aarnet.edu.au/
AAF: https://aaf.edu.au/ (Australian Access Federation, Australia’s academic identity broker)