Final workshop report

Summary

Distributed applications have become a core component of the Internet's infrastructure. However, many undergraduate curricula do not offer courses that focus on the design and implementation of distributed systems. As a result, undergraduates are not as prepared as they should be for graduate study or careers in industry. Historically, the problem has often been caused by a lack of resources, since many schools do not have the computing infrastructure needed to experiment with distributed systems. With the growing availability and accessibility of academic and industrial distributed computing platforms, however, a lack of local resources is no longer a barrier. Even colleges with limited on-campus computing facilities can now experiment with large-scale systems using state-of-the-art networking technologies. This workshop focuses on developing and disseminating new tools and curricula for undergraduate courses in distributed systems and computer networks that leverage the resources available in publicly accessible testbeds. The 20-30 attendees will come from a variety of backgrounds, including top-tier research universities, liberal arts colleges, and industry. The final results of the workshop will be a report submitted to the CNS Division of the NSF and a website that aggregates relevant educational material.


Vision and Motivation

As the number of Internet users continues to rise, Internet-based companies such as Google and Amazon are leveraging the aggregate computing power and robustness of distributed systems to satisfy the demands of their users. Although distributed systems offer many advantages over non-distributed systems and applications, such as increased aggregate computing power and better resilience to failure, they also introduce many new challenges for developers. Configuring and maintaining a distributed set of computers to host an application is a tedious task. Detecting and recovering from bugs in applications running on hundreds of machines, potentially spread around the world, is much more challenging than debugging code locally. These challenges can be overwhelming to developers without prior programming experience in distributed environments.

Since distributed computing is quickly becoming the de facto way to accomplish large-scale tasks on the Internet, one would expect the skills required to design, implement, and evaluate distributed systems to be taught at the college and university level. Unfortunately, this is frequently not the case, since many colleges and universities do not offer undergraduate courses in distributed systems. At small colleges in particular, the problem is often due to a lack of computing resources. Without access to a dedicated computing cluster that permits students to run experimental code on distributed resources, it is difficult to expose students to the challenges of implementing and evaluating distributed systems. At large universities, computing clusters are present, but they are typically reserved for research purposes and are not readily available for use in the classroom. As a result, undergraduates are not receiving the training necessary to become good programmers in distributed environments. Further, they are not being exposed to current trends and topics in systems research, and thus many are not considering graduate school as a viable option after graduation.

Given the increasing popularity, accessibility, and maturity of publicly available experimental testbeds, such as GENI, XSEDE, Open Science Grid, and Open Cirrus, as well as low-cost, industry-backed options such as Amazon EC2, Microsoft Azure, and Google AppEngine, educators are no longer restricted to using only on-campus resources for assignments. Using the resources available in these platforms, even students at small colleges with minimal local computing infrastructure can gain hands-on experience with large-scale distributed systems. The technological richness that was previously available only to a limited number of undergraduates at top-tier research schools is now widely available. The key goal of this workshop is to redefine systems education at the undergraduate level by taking advantage of these distributed computing testbeds.

The first step in redefining undergraduate systems education is to inform educators at a broad cross-section of colleges and universities about the resources available to them and their students. Thus, another goal of this workshop is to develop a set of materials for educators, starting with how to access and use the testbeds and platforms named above. In addition, the workshop will discuss sample assignments of varying difficulty and other course-related material that leverages these resources. The workshop will bring together users and developers of publicly available, shared testbeds who share a common goal: lowering the barrier to entry for large-scale, cutting-edge systems research and introducing students to distributed systems at early stages of their careers.