One of the main reasons for my trip to SC11 was to look at the education program and talk with other national collaborations about how they approach education and training.
I took part in the Education Program, and also attended Workforce Development Sessions and BOF sessions. I was also able to have discussions with various people about how they are approaching education, training and outreach.
The overarching sense that I got from my time at SC11 is that education about parallel computation is not just an HPC and Grid computing issue; it is an issue at the heart of computing education. The growth in computing power no longer comes from faster clock speeds but from having more cores tackling a problem. This means that all developers need to "think parallel"; it is no longer a special or extra thing to learn.
Four themes stand out from everything that I heard, saw and experienced: Computational Thinking, Parallel Programming, Computational Science and Hardware. I'll explore each of these in turn and include links to resources that were highlighted.
Computational Thinking is thinking about problems in a computational way, but not necessarily with a computer. Bob Panoff of the Shodor Foundation led many of these sessions and really brought things to life, showing with simple things like knot tying how we often think about a problem in one way (a sequential way), but if we look at it in another way we can find a parallel solution.
One example that allows you to explore solving a problem in a number of different ways, and look at the potential bottlenecks is:
Imagine you are taking a van full of kids to a camp and need to do the shopping en route. You have a wallet full of cash and an itemised shopping list. How can you get the shopping done? Sequential options include locking all the kids in the van and doing the shopping yourself. Massively parallel options include sending each kid off to get items. Things you may need to consider: no one knows the layout of the store; how long does each transaction at the checkout take; what happens if short kids are sent for items on high shelves; and how many checkouts are available?
With this problem you can explore a large number of parallel solutions, model it in various ways and have a good discussion about all of the things that you need to consider. You don't need a computer to do this.
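You don't need a computer, but if you do want one, here is a minimal sketch of one way to model the shopping trip. Everything in it is my own assumption for illustration: the function name shopping_time, the timings (2 minutes to find an item, 3 minutes per checkout transaction), and the simplifications that items are shared out evenly and every shopper pays separately.

```python
import math


def shopping_time(n_items, n_shoppers, n_checkouts,
                  find_minutes=2.0, checkout_minutes=3.0):
    """Estimate the minutes needed to finish the shop under a toy model."""
    # Items are shared out as evenly as possible; the shopper with the
    # most items decides when the "finding" phase ends.
    items_per_shopper = math.ceil(n_items / n_shoppers)
    finding = items_per_shopper * find_minutes

    # Assumption: every shopper who collected something pays separately,
    # so transactions queue up at the available checkouts.
    transactions = min(n_shoppers, n_items)
    checkout_rounds = math.ceil(transactions / n_checkouts)
    paying = checkout_rounds * checkout_minutes

    return finding + paying


if __name__ == "__main__":
    # 24 items, 2 checkouts, and a growing number of shoppers.
    for shoppers in (1, 4, 12, 24):
        print(shoppers, "shoppers:", shopping_time(24, shoppers, 2), "minutes")
```

Under these made-up numbers, four shoppers beat one by a wide margin, but twelve shoppers are slower than four because the two checkouts become the bottleneck. Changing the parameters is a quick way to start the "what else do we need to consider?" discussion.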
The stars of computational thinking were given as:
- Problem Solving
Resources for Computational Thinking can be found on the Shodor website. A related resource, focused less on parallelism and more on teaching Computer Science concepts without a computer, is CS Unplugged http://csunplugged.org/
Parallel Programming (developing software systems using parallel approaches)
If we can think about problems in a parallel way through Computational Thinking, we then want to implement them, and this requires parallel programming. The parallel programming arena is still dominated by Fortran and C code, which may prove difficult for those who have learnt to program using OO languages.
The parallel programming space has been fairly static for a number of years, with MPI and OpenMP programming being the norm. Now, however, we also have the ability to use GPGPU languages like CUDA and OpenCL. Launched at SC11 was OpenACC, which will allow programming of heterogeneous systems.
With parallel programming the optimal solution is not a fixed target but one that changes over time. As processor speed, memory latency, communication latency, disk access all change, the balance of how to structure computation changes. At the moment the mantra would be "calculate, don't store", but this hasn't always been the case.
The traditional development loop is develop - test - deploy. For parallel systems that may change to develop - test - profile - refactor & optimise - deploy, particularly as moving from a serial implementation to a parallel one may initially result in code that runs slower.
It was highlighted that we should be teaching "concepts that endure, skills that are useful", particularly given the fast changing nature of Supercomputing. It became clear that having a 'course' on parallel programming was not a long term solution. What is needed is the weaving of parallel computation throughout the undergraduate curriculum so that it isn't something 'special' but something that is normal.
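To make the "test" and "profile" steps concrete, here is a small sketch of my own (not something shown at SC11). It compares a serial sum of squares with a first attempt at a threaded version. The function names are hypothetical. In standard CPython, threads doing CPU-bound work are limited by the global interpreter lock, so this first parallel attempt is a fair illustration of code that may well not run any faster until it is refactored.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def sum_squares(chunk):
    """The 'work': sum of squares over a list of numbers."""
    return sum(x * x for x in chunk)


def threaded_sum_squares(data, workers=4):
    """Split the data into chunks and hand each chunk to a worker thread."""
    size = max(1, -(-len(data) // workers))  # ceiling division, at least 1
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum_squares, chunks))


def timed(fn, *args):
    """Return (result, seconds taken) for one call: the 'profile' step."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    data = list(range(200_000))
    serial_result, serial_secs = timed(sum_squares, data)
    parallel_result, parallel_secs = timed(threaded_sum_squares, data)
    # Test first: both versions must agree before speed is worth comparing.
    assert serial_result == parallel_result
    print(f"serial:   {serial_secs:.4f}s")
    print(f"threaded: {parallel_secs:.4f}s")
```

The timings will vary from machine to machine, which is the point: measure before deciding what to refactor.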
Computational Science & Engineering
Computational Science and Engineering is the use of computation to do (and teach) science and engineering. The Education Program at SC11 had a number of parallel streams, each looking at a different discipline. The goal here was to show how computational techniques, tools and applications can be used in educating students about the discipline. One of the major lessons was that increased computational power allows more things to be done interactively. Solving an equation, graphing the result and then manipulating parameters in the equation can all be done in real time, so students can "see" how the equation works rather than learning from symbols and static graphs.
Simulations that can be run in the classroom in real time can show how things being studied behave, and what happens if conditions change. This allows for a more interactive learning experience and for students to ask "What If?" questions.
XSEDE is developing a set of core competencies for Computational Science that can be taught as part of a science curriculum. These can be used to help bring computational science into the classroom curriculum and demonstrate that computational science is a normal way of doing science rather than an extraordinary one.
One significant issue raised with the integration of computation into teaching is the upskilling of faculty members so that they are confident using computation themselves. One approach has been to offer workshops for educators; an example is the Summer Workshop program run by the US National Computational Science Institute.
The development of high performance computers requires knowing more about the hardware of the machine than the average person buying a standard desktop or laptop would need to know. It is analogous to owning a racing car that you want optimum performance from, rather than a 'standard' car. Particularly for cluster computing, it is important to understand the components of the system, the interconnects and how they interact.
The number of people who have expertise in this area is quite small internationally and growing that pool of knowledge is important. Four initiatives that I saw/heard about at SC11 are:
- LittleFe
- Computer System, Cluster and Networking Summer Institute
- Student Cluster Competition
- PRObE Lab
In a little more detail...
LittleFe is a project with a twofold goal. The main goal is to provide a relatively cheap, portable cluster that can be transported in hold luggage by air and used for educational purposes. The second is to give people the opportunity to get experience assembling a cluster themselves. The cluster is built of mini-ITX Atom boards with a CUDA capable GPU. It has 6 nodes and runs the Bootable Cluster CD Project software.
SC11 had a number of "LittleFe" buildout sessions where teams of two assembled a LittleFe cluster from components and then ran a number of "burn-in" tests.
This was a great program that is being supported by Intel and the Educational Alliance for a Parallel Future (EAPF).
Computer System, Cluster and Networking Summer Institute
A summer institute for third year undergraduates, delivered at Los Alamos National Lab, exposes students to the practical skills of setting up a real cluster and working with local staff to troubleshoot those clusters and get real codes running on them. The students benefit from working with the latest cluster computing hardware, and sometimes have to contact vendors for support and to report bugs. This is hands-on learning with cutting edge technology.
Student Cluster Competition
A part of SC for the past few years has been the Student Cluster Competition. Students from around the world form teams, gain sponsorship and train to compete at SC in a competition where they have to assemble, tune and benchmark their cluster before running a variety of science codes. The competition gets the students thinking in depth about the design and power envelope of their cluster (they have a hard power limit) and also into tuning hardware and software for best performance.
The PRObE lab is part of the New Mexico Consortium and provides a "Parallel Reconfigurable Observable Environment" based on EMULab. It allows clusters to be set up and observed remotely to perform experiments. This can include hardware and software experiments, and permits the type of activity that couldn't be supported on a production level cluster due to the disruption to other users. One issue highlighted a number of times during the conference was that software increasingly needs to be "designed for node failure", as once you are running across thousands of cores the likelihood of one failing during a long-running job is high.
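A quick back-of-the-envelope calculation shows why the failure issue grows with scale. This is my own sketch, and it assumes each node fails independently with the same probability p during the job (real failures are often correlated, so treat it as a rough guide). The chance that at least one of n nodes fails is then 1 - (1 - p)^n. The function name and the example failure rate are hypothetical.

```python
def prob_any_failure(n_nodes, p_node):
    """Probability that at least one of n_nodes fails during a job.

    Assumes every node fails independently with probability p_node.
    """
    return 1.0 - (1.0 - p_node) ** n_nodes


if __name__ == "__main__":
    p = 0.001  # hypothetical: a 0.1% chance that any given node fails mid-job
    for n in (1, 100, 1000, 10000):
        print(f"{n:>6} nodes: {prob_any_failure(n, p):.3f}")
```

With that made-up 0.1% per-node figure, a single node almost never fails, but a 1000 node job sees at least one failure roughly 63% of the time, which is why checkpointing and restart strategies matter at scale.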
It is important to be developing skills in "under the hood" areas of HPC as well as in software development.
As I highlighted at the start of this post, the issues being faced in Supercomputing/HPC/eResearch are not just about getting more "parallel programmers"; it is a wide ranging problem. One of the ways to spread the word is through outreach activities, and in this section I want to highlight a few that I heard about.
Before I get into specific schemes, a message that did come through was "start early". Once students are at university, or even high school, it can be too late; they may already be switched off STEM (Science, Technology, Engineering & Maths) subjects.
The Scratch programming language allows parallel programming, and Scratch is being used by many to teach early programming to students. If parallel programming models can be introduced here then students can see it as a normal part of programming.
New Mexico Supercomputing Challenge
The New Mexico Supercomputing Challenge is a statewide challenge for junior school students, who attend a "kick off" conference to learn some skills and then work on a project of their own in small teams. The students use tools such as NetLogo & Python and undertake projects relating to all aspects of science, often including a strong element of simulation. The challenge also runs a Teacher Institute to help upskill teachers and encourage them to support students taking part in the challenge.
Project GUTS (short for Growing Up Thinking Scientifically) is a scheme for (middle school) summer and after school clubs to engage students in thinking scientifically and looking at their local community and the topics that impact that community.
The XSEDE project, following on from a similar model under TeraGrid, has "Campus Champions" at institutions around the country (USA), both at resource provider sites and non-resource provider sites. These campus champions act as a first point of contact for users and come from a variety of backgrounds. They have an allocation of resources that they can provide to new users to get them up and running quickly whilst they go through the more formal signup processes to get their own account and allocation.
The UK also has a champions scheme, but with community based champions rather than institutional champions. The community champions can engage with those in their community of research to evangelise about HPC/Supercomputing/eResearch.
Not explicitly presented at the conference, but mentioned in conversation and neatly linking into the use of Python for HPC, is the Software Carpentry course. This course is designed to teach scientists and engineers programming and software development. It makes use of Python and includes parallel programming concepts in the course material.
What for NZ?
From the long list of things in this blog post the standout question is "What should we be doing in NZ?"
The answer to the question depends on who you are:
If you are a Computer Science Educator
- Include parallel computation in your courses and encourage other staff to include it in their material as well. If your programme is getting an overhaul, build parallel in from the ground up.
If you are a Science/Engineering Educator
- Think about what computational tools you can use in your teaching. Can they explain and demonstrate concepts better than static images, charts and textbooks? Think about the computational tools that your students can use in labs or for research projects.
If you are providing infrastructure
- Can you set aside cycles for educational use? 1-5% was a common amount mentioned at SC11. How will you communicate with users? Are campus champions or community champions appropriate? Don't forget those campuses/communities that are not providing resources but that will provide users.
- Can you run older hardware as a PRObE style lab to allow development and testing of code under various conditions?
- Can you support outreach to the next generation of developers and users?
If you are a school teacher
- Can you engage with your local tertiary institute to run some computationally focused sessions for your students?
- Can you include computational thinking in what you do?
If you are in industry
- Can you volunteer at a school to enthuse some kids about computing?
- Can your firm support or sponsor outreach and engagement?
- Do you need to upgrade the skill set of your staff to cope with multi-core processors and to take advantage of accelerators such as GPGPUs?
Blog post by Michael McCool (Intel) https://software.intel.com/en-us/blogs/2010/09/03/parallelism-education-and-the-role-of-abstraction/
Blog Post by Paul Steinberg (Intel) https://software.intel.com/en-us/blogs/2010/08/30/a-sea-change-in-computer-science-education/