The dramatic shift toward data-driven discovery in science – eScience – foreshadowed the recent emphasis on “data science” that is transforming industry. At the University of Washington eScience Institute, we see the two trends as mutually reinforcing: industry is investing deeply in new technology that can be repurposed for science, and science is training the next generation of data-savvy researchers who help advance a culture of statistical rigor.
In this talk, I'll describe some of the findings from our activities aimed at advancing and integrating both fronts. In education, we recently completed a massive open online course, “Introduction to Data Science,” serving over 100,000 registered students; we are bootstrapping a new PhD track in interdisciplinary big data work; and we led the creation of two certificate programs for returning professionals. In research and infrastructure, we have taken an “everything-as-a-service” approach, deploying database, visualization, and massive-scale analytics services targeting science applications in both the head and the “long tail.” Organizationally, we are launching a new “incubator” program to provide seed funding and dedicated staff for short-term, high-impact data science projects.
I'll end with some lessons learned in fostering collaboration between technologists and domain researchers, and some ideas for the future.