URISC@SC17 and the #LongestLastMile
A multinational delegation recently attended the Understanding Risk in Shared CyberEcosystems workshop, or URISC@SC17, in Denver, Colorado. URISC participants and presenters from 11 countries, including eight African nations, 12 U.S. states, Canada, India and Nepal, also attended SC17, the annual international conference for high performance computing (HPC), networking, storage and analysis that drew nearly 13,000 attendees. Von Welch (Indiana University), who directs the Center for Trustworthy Scientific Cyberinfrastructure, provided expert oversight for the URISC program. Welch invited nine specialists who presented open-source tools and cybersecurity best practices.
URISC Presenter Nick Roy, Director of Technology and Strategy for Internet2’s InCommon Federation, explained eduGAIN and its benefits to the global research community. “From a local management standpoint, eduGAIN saves managers time and effort because home credentials provide authentication and access to resources, instrumentation and data that are physically located at institutions in 48 member countries that comprise an interfederated trust fabric,” said Roy. “It’s more secure, and takes less time to manage since researchers must only remember one user name and password,” he added.
eduGAIN member map. Key: dark blue indicates eduGAIN membership, green indicates voting-only members, and aqua indicates "candidate" sites.
While eduGAIN’s convenience and added security would be welcome in the many resource-constrained regions represented by URISC delegates, it was difficult for some to imagine that they could ever engage; there are many physical and financial barriers to entry.
For more than 50 years, HPC has supported tremendous advances in all areas of science. But densely-populated communities can more easily support subscription-based commodity networks and energy infrastructure that make it more affordable for urban universities to engage globally. Research centers based in sparsely-populated regions are extremely disadvantaged. There are fewer partners with which to cost-share connectivity, and copper thieves make it challenging to sustain infrastructure in the poorest regions. Their universities have a more difficult time recruiting and retaining skilled personnel, who must travel farther for training. In some cases, local consumer prices are 70-80 percent lower, so hardware and software purchases are comparatively inflated; everything is shipped from developed countries, which increases the cost.
But these regions reflect globally-significant human capacity, environmental factors, biodiversity, geology and minerals. Each site has a unique perspective of our universe, and less-populated areas offer the most detailed and unfettered vantage points. We can’t expect rural universities to pay for the pipe used by the rest of the world, however. The effort will require global cooperation, with broad public and private financial support. When researchers everywhere can access data generated by and stored at these sites, progress will be accelerated toward solutions to problems that impact global climate, environment, food and water security, public health, quality of life, and world peace.
Justifying #LongestLastMile engagement one case at a time…
The pan-European data network for research and education, GÉANT, with U.S. stakeholders, forged the pathway that originally made eduGAIN possible. It was conceived by the global High Energy Physics (HEP) community whose users required access to HEP instrumentation and data located in the U.S. (Laser Interferometer Gravitational-Wave Observatory, LIGO) and Europe (Large Hadron Collider at the European Organization for Nuclear Research, LHC-CERN).
The Office of CyberInfrastructure and Computational Biology at the National Institute of Allergy and Infectious Diseases (NIAID, part of the U.S. National Institutes of Health) is another such driver, and NIAID Chief Information Officer Michael Tartakovsky is eager to accommodate more global researchers who are fighting infectious diseases.
NIAID supports centers in Mali and Uganda that provide support and services for collaborations working on treatments and vaccines for malaria, Ebola, and tuberculosis (TB) via eduGAIN and GÉANT’s research and education federation (REFEDS R&S). Beyond Africa, NIH looks forward to providing access to research staff at Fudan University when the China Federation joins eduGAIN. They are also working with the Indian Federation and its National Institute for Research in TB. “By joining the global trust federation network, we can all work together to solve the most daunting global infectious disease challenges,” said Tartakovsky.
The computational biology community is working to solve the world’s direst grand challenges. South African Computational Biologist Nicola Mulder’s group from the University of Cape Town’s (UCT) Institute of Infectious Disease and Molecular Medicine is analyzing sequence data from African human genomes that are of critical importance to public health and food security research. Until the South African Centre for High Performance Computing (CHPC) introduced the Lengau supercomputer in 2016, UCT ran computations in the U.S. on Blue Waters at the National Center for Supercomputing Applications (NCSA). “We had access to NCSA computing facilities and then returned the processed data to South Africa; the processing and transfer took months to complete,” said Mulder.
Global energy demands will rely on a larger supply from alternative sources in the future, and Africa is expected to play a major role in energy production and innovation. “The need to power portable electronic devices and manage peaks and valleys associated with solar and wind energy will require more advanced battery storage solutions that will likely require minerals and rare earths that are abundant in sub-Saharan Africa,” said Principal Researcher Rapela Regina Maphanga (South African Council for Scientific and Industrial Research (CSIR), Modelling and Digital Science Division).
The global astrophysics and astronomy communities are watching sub-Saharan Africa with great anticipation. The Square Kilometre Array (SKA) is being built in the Great Karoo region of South Africa and will be the world’s biggest radio telescope. With an expected 50-year lifespan, SKA is investing in regional infrastructure and human capital development, but SKA can’t do it alone; African infrastructure to serve the global research needs of the future will require a much larger investment.
In their SC17 keynote, Professor Phil Diamond (SKA Organization Director General) and Dr. Rosie Bolton (SKA Regional Centre Project Scientist) described the SKA project and its computational challenges. For the first phase of the project, which represents a fraction of what it will be in the future, the total processing power required in the SKA observatory’s Science Data Processors is about 250 PF (peak). Each SKA site is expected to generate up to 1 PB of data each day during full operations (from about 2026). SKA data will be globally-distributed to SKA “Regional Centres” which will provide researchers with access to data for analysis and processing. The design of this federated network is an interesting challenge since it will likely also support users from other observatories and even from other science disciplines as part of the HPC and networking infrastructures supported in each country or region.
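To put the 1 PB per day figure in networking terms, the following minimal sketch converts a daily data volume into the sustained link rate needed to move it off site, and estimates how long one day's output would take to cross a given link. It assumes a decimal petabyte (10^15 bytes); the 10 Gbps link speed and the efficiency factor are hypothetical values chosen for illustration, not SKA design figures.

```python
SECONDS_PER_DAY = 24 * 60 * 60
PETABYTE = 10**15  # assumption: decimal petabyte (10^15 bytes)


def sustained_gbps(bytes_per_day: float) -> float:
    """Average rate (gigabits per second) needed to move a daily volume in 24 hours."""
    return bytes_per_day * 8 / SECONDS_PER_DAY / 1e9


def days_to_move(size_bytes: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Days needed to move size_bytes over a link.

    efficiency is a hypothetical fraction (0-1] of the nominal link rate
    that a transfer actually achieves.
    """
    if link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_gbps must be positive and efficiency in (0, 1]")
    seconds = size_bytes * 8 / (link_gbps * 1e9 * efficiency)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    # Rate needed to keep up with 1 PB generated per day.
    print(f"{sustained_gbps(PETABYTE):.1f} Gbps sustained")
    # Time to move one day's output over a hypothetical 10 Gbps link.
    print(f"{days_to_move(PETABYTE, 10):.1f} days on a 10 Gbps link")
```

Under these assumptions, exporting a full day's output as fast as it is produced would need roughly 93 Gbps sustained, and the same volume would take more than nine days on an ideal 10 Gbps path, which is one way to see why the design of the Regional Centre network matters.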
With SKA’s presence in South Africa, a larger astro research presence will begin to take root in the region that will demand access to the global treasure-trove of data currently generated by six telescopes supported by the U.S. National Science Foundation, and complementary instrumentation, such as the Murchison Widefield Array (MWA), a precursor to SKA, in Western Australia at the Murchison Radio-astronomy Observatory (MRO).
Scott Koranda (University of Wisconsin at Milwaukee), Identity and Access Management (IAM) Architect for LIGO, which first piloted eduGAIN in 2014, said that MWA is establishing a new IAM infrastructure that is built on federated identity. Their services are published in the Australian Access Federation (AAF) and will soon be “pushed” into eduGAIN. “The eduGAIN component is important because MWA, like SKA, is a global project with scientists who live in and work from many countries,” said Koranda.
The important role NRENs play and their status in sub-Saharan Africa
African regional-serving universities benefit from fast and affordable bandwidth delivered via National Research and Education Networks, or NRENs, that engage with larger networks, such as the UbuntuNet Alliance in eastern and southern Africa, and WACREN in Western and Central Africa, to deliver more advanced service options. The major backbone then allies with Internet2 in the U.S. and GÉANT in Europe. Through this complex fabric of trust, it’s possible for NRENs to deliver eduGAIN service.
But, as was explained, developing an NREN from scratch is challenging for stakeholders in sparsely-populated, resource-constrained regions. In his December presentation to the Southern African Development Community (SADC) Cyberinfrastructure Forum that was co-located with the South African Centre for High Performance Computing’s (CHPC) National Meeting in Pretoria, SANReN’s Director Leon Staphorst cited a 2016 World Bank Report by Michael Foley titled, “The Role and Status of NRENs in Africa.” The document serves as an important guide for those who wish to develop, use or fund an NREN.
Staphorst shared a table of progress being made toward African NREN development. Among nations represented at URISC that participate in the African HPC Ecosystems and SKA Readiness Projects (see map and slide excerpt below), South Africa is the only country whose researchers use eduGAIN (through relationships with GÉANT, SANReN and SAFIRE). In South Africa’s case, the HEP community’s need to reach LIGO/LHC in Geneva, Switzerland was a driver, with biomed demand a close second; specifically, access to a global TB research protocol required by scientists at the University of Cape Town and Stellenbosch University.
Next in queue among HPC Ecosystems sites that are prospective eduGAIN members given the operational status of their NRENs and subsequent engagement with UbuntuNet, are Ethiopia, Kenya and Zambia. It’s likely that Madagascar and Namibia will be next, followed by Botswana, Mauritius, and Mozambique.
HPC ecosystems project footprint.
HPC ecosystems project sites/ NREN status
It can still require a considerable amount of time to move big data around the world, however. “African network traffic is currently routed via Europe before it travels to the U.S., and elsewhere,” said Julio Ibarra (Florida International University AVP for Technology Augmented Research). “Depending on the amount of data transferred among eduGAIN’s 48 member nations, the distance and number of Internet exchange sites along the way could cause significant delays,” he added.
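One way to see why distance matters is the standard upper bound on a single TCP stream: throughput cannot exceed the window size divided by the round-trip time (RTT), so a longer path lowers the ceiling even when the links themselves are fast. The sketch below applies that bound. The window size and RTT values are hypothetical illustrations, not measurements of any African, European or U.S. route.

```python
def max_tcp_throughput_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound on single-stream TCP throughput (megabits per second): window / RTT."""
    if window_bytes <= 0 or rtt_ms <= 0:
        raise ValueError("window_bytes and rtt_ms must be positive")
    return window_bytes * 8 / (rtt_ms / 1000.0) / 1e6


def transfer_hours(size_bytes: float, throughput_mbps: float) -> float:
    """Hours needed to move size_bytes at a steady throughput."""
    return size_bytes * 8 / (throughput_mbps * 1e6) / 3600.0


if __name__ == "__main__":
    window = 1_000_000  # hypothetical 1 MB TCP window
    dataset = 10**12    # hypothetical 1 TB dataset
    for rtt_ms in (100, 200, 400):  # hypothetical round-trip times
        rate = max_tcp_throughput_mbps(window, rtt_ms)
        hours = transfer_hours(dataset, rate)
        print(f"RTT {rtt_ms} ms: at most {rate:.0f} Mbps, about {hours:.0f} h per TB")
```

With these illustrative numbers, doubling the RTT halves the achievable rate of a single stream and doubles the transfer time. Larger windows and parallel streams raise the ceiling, which is part of what dedicated transfer tools do.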
Ibarra’s HPC On Common Ground @SC16 workshop presentation described a collaborative effort to facilitate “big data” transfers through the development of international software defined exchange points (SDX). The “AtlanticWave-SDX” is an NSF-funded project at Florida International University and the Georgia Institute of Technology, with support from Brazil’s NREN, Rede Nacional de Ensino e Pesquisa (RNP), and the Academic Network of Sao Paulo (ANSP). An SDX enables a domain scientist connected to an SDN network to use the network more intelligently; e.g., scheduling use when resources are available, or requesting a more favorable path.
In the future, Ibarra’s group hopes to explore the feasibility of establishing an SDX in West Africa, in collaboration with African NRENs, based on future availability of submarine cable spectrum for use by research and education communities between Western Africa and Brazil (scheduled 2018 and beyond, per Foley’s report).
Success, speed and reliability require some magic in the middle…
Irrespective of regional networks and the IAM infrastructure deployed at each site, moving and sharing massive amounts of data around the world requires a certain amount of geopolitical cooperation, compatible middleware and universally-adopted toolkits. One such resource is Globus, which can securely and, more importantly, reliably transfer data in many scenarios where network availability and quality are highly variable. “Globus has already been successfully used on H3ABioNet to move and share data among far-flung research groups in Africa, and is currently being evaluated for broader adoption by a number of institutions in South Africa,” said Globus Co-Founder Ian Foster (University of Chicago).
With supercomputers capable of processing trillions of calculations per second, it’s unreasonable that critically-important research processes still require days or even months to complete. In light of current and anticipated global grand challenges, an accelerated process of discovery is fundamentally important to future generations’ prosperity, health and social stability. Developing the e-Infrastructure and human capital that serve the #LongestLastMile will require a globally-collaborative endeavor and investment.
URISC@SC17 was STEM-Trek Nonprofit’s third SC co-located workshop. Last year’s “HPC On Common Ground @SC16” program in Salt Lake City featured a food security theme. The SC17 program was led by Elizabeth Leake (STEM-Trek) and Von Welch (Indiana University), and was financially-supported by U.S. National Science Foundation grants managed by Indiana University and Oklahoma State University, with STEM-Trek donations from Google, Corelight, SC17 General Chair Bernd Mohr (Jülich Supercomputing Centre) and SC17 Inclusivity Chair Toni Collis (U-Edinburgh).
This article, by Elizabeth Leake, was originally published on HPCwire.
STEM-Trek wishes to thank URISC collaborator Von Welch (Indiana University/CTSC), the planning committee from IU and CHPC South Africa, reviewers, financial and in-kind sponsors, and presenters—especially Nick Roy (InCommon) who inspired this article. We appreciate delegates who took time to apply for and attend our workshop, and all who covered the bases in their absence at home.