The dramatic shift toward data-driven discovery in science – eScience – foreshadowed the recent emphasis on “data science” that is transforming industry. At the University of Washington eScience Institute, we see the two trends as mutually reinforcing: industry is investing deeply in new technology that can be repurposed for science, and science is training the next generation of data-savvy researchers who help advance a culture of statistical rigor.
eResearch NZ 2013
High performance computing (HPC) has the potential to revolutionise New Zealand research. Many of the lessons for applying HPC to research span research disciplines. As New Zealand’s adoption of HPC matures at differing rates across communities and institutions, an opportunity exists to hold an inter-disciplinary event alongside eResearch NZ 2013 to spread knowledge between practitioners and to strengthen the bonds within the HPC community.
eResearch NZ 2013 session type:
The impact of HPC on molecular simulations over the past ten years: Confessions of a reborn researcher
In a past (recent) life I was the director of an HPC centre that saw growth in its HPC capacity of over 50,000 times in ten years. This increase in computer power has had a significant impact on the field of molecular mechanics and the kinds of simulations that can now be routinely undertaken. In this presentation, I will:
In this poster we present preliminary work on integrating a workbench for running active workflows into a digital library software framework to support scholars in a broader range of eResearch activities than either type of software supports in isolation. More specifically, the open source Meandre Workbench from the University of Illinois was embedded into the Greenstone digital library software framework from the University of Waikato (also open source).
New Zealand eScience Infrastructure has the privilege of hosting three high performance computers and one visualisation cluster. This poster will show the capability and capacity of these machines, how they are connected via REANNZ's research network and the services that are designed to empower research.
A fundamental component of research is connectivity – both with other researchers and with those who will ultimately gain value from the research. Traditionally this notion has been discussed in the context of researchers disseminating their results to end-users through publishing and teaching. But the advent of technology and social media means that there are now more and better ways of connecting with people than ever before, and a core capability that researchers need to develop is how to best use the technology to improve the quality of their interactions.
Over the last 18 months, Pan, the New Zealand eScience Infrastructure Auckland cluster, has grown from a single iDataplex rack with 2 support racks to four iDataplex racks housing 1056 Westmere and 3296 Sandybridge i386 cores, with an additional 72 cores in 3 large memory nodes. There are 16,384 Tesla M2090, 26,880 K20X and 240 Intel Phi 5110P cores. Growing pains and operational experience have shaped the current cluster topology.
eResearch infrastructure should be secure from intrusion and should be carefully monitored to ensure its computing resources are operational as expected. At the Centre for eResearch, we have implemented automated security measures and integrity checks to ensure this for the Pan cluster, as described in this poster presentation. Integrity checking software (“Pan Cluster Health Report”) developed in-house, Icinga integrity software, Tripwire security software, and other elements will be included.
PAN is an x86 cluster, featuring over 4000 cores with over 50 TB of total memory, a fast interconnect and extensive storage. The facility is used by over 300 researchers. The poster demonstrates the configuration of infrastructural support for the cluster, which includes a variety of databases and a mix of shell scripts, configuration files and executables that allow the systems and support team to efficiently control and re-focus the cluster as well as quickly obtain usage reports of arbitrary complexity.
Formal data management plans (DMPs) are becoming required for projects funded through UK and EU sources. Data sets are increasingly expected to accompany published articles or be made available in order to encourage re-analysis and scholarly transparency. DMPs outline expectations of how data sets will be presented as final research outputs. Considerations for a DMP include format, privacy, licensing and re-use restrictions, archiving, persistent identification and compliance with funder and institutional policies.