2021-2022 Research Data Services Annual Report
The Research Data Services team within Northwestern IT Research Computing Services supports Northwestern researchers in working with their data. There are two main areas of support:
- Data science and visualization: data science, data visualization, and computer programming training, consultations, and collaborative support.
- Data management: support for the research data lifecycle, especially data storage and workflow solutions.
This report summarizes our activities during the 2021-2022 academic year.
Training Services
We provide training on data science, programming, and visualization skills for researchers across Northwestern. We continued virtual workshops in 2021-2022 and began recording many of them to provide increased flexibility for attendees.
In addition to workshops taught by Research Computing Services staff and student consultants, 651 researchers from more than 70 departments across every Northwestern school were able to learn data science skills with the online Dataquest platform. We also published 8 new learning guides/tutorials, as well as new self-study workshop materials.
Consultations
We support researchers through one-on-one research consultations and through small groups that meet quarterly as part of our Bring Your Own Data (BYOD) Working Groups program. With additional staff joining in 2022, we expanded our capacity to support researchers’ data management needs through consultations. Researchers from all Northwestern schools made use of our consultation services.
Research Collaboration Highlights
This year we continued several research collaborations in addition to starting new ones. Some project highlights are below. Student consultants contributed to projects marked with an asterisk (*).
LIGO Mass Plot
Updated an interactive visualization of gravitational wave data for Vicky Kalogera and the LIGO Collaboration. Created visualizations used in press releases and textbooks.
Linkage Analysis Tool
Created an interactive data exploration tool for Claudia Haase and the Life-Span Development Lab to visualize correlations of emotional and physiological measurements across pairs of people interacting during experiments.
Pediatric Organ Dysfunction
Designed interactive data visualizations for L. Nelson Sanchez-Pinto to support ongoing research and upcoming publications on pediatric organ dysfunction.
Healthcare Utilization by MS Patients*
Exploring patterns of healthcare system use by patients with multiple sclerosis, in collaboration with Dominique Kinnett-Hopkins, using electronic health record data from CAPriCORN and the Northwestern EDW.
The Global Rules of Art
Completed a collaboration with Larissa Buchholz on 50+ data visualizations for her forthcoming book on the international art market.
Text Analysis for Medical Education
Continued supporting the Feinberg School of Medicine Augusta Weber Office of Medical Education on a Stemmler Fund-supported project to incorporate natural language processing methods in the student assessment process.
Identifying Family Trees for Medical Research*
Completed the development of code to identify familial relationships in electronic medical records with the Northwestern Medicine Enterprise Data Warehouse and Institute for Public Health and Medicine (IPHAM).
Paper Conservation Analysis*
Analyzed factors affecting the preservation of paper in collaboration with the Library’s Conservation Resident.
Illinois Prison Records Data Collection*
Helped the Health Disparities & Public Policy Program collect data from prison records.
Firefly
Continued development of Firefly, an interactive viewer for particle-based data, with Alex Gurvich and Claude-André Faucher-Giguère.
Materials Design Tool
Completed development of interactive data markup and visualization tools that allow researchers from the Center for Hierarchical Materials Design (CHiMaD) to teach workshops and courses.
Service Improvements
- The data management team worked with colleagues in Northwestern IT Cyberinfrastructure to refresh the Research Data Storage Service hardware, expand the system capacity, and update accounting processes.
- The data management team deployed a new Globus endpoint for OneDrive/SharePoint to support research data transfers and workflows.
External Impact
Research Data Services staff contributed to the national research computing and data community in multiple ways. These contributions help staff stay up to date on new technologies and service trends, form collaborations with colleagues at other institutions, and establish Northwestern as a leader in the field. Contributions include:
- US Research Software Engineer Association (US-RSE) Steering Committee
- Campus Research Computing Consortium (CaRCC) Researcher-Facing Track Co-Coordinator
- NSF Cybersecurity Center of Excellence Trusted CI Fellow
- CaRCC Professionalization Working Group Co-Chair
- RDAP Summit 2022 Presentations (Bring Your Own Data (BYOD) Working Groups: A New Service to Multiply Staff Impact and Create Community; Data Professionals and Data Responsibilities in the Research Data and Computing Workforce)
- PEARC22 Paper (Characterizing the US Research Computing and Data (RCD) Workforce) and Panels (Campus Research Computing Consortium (CaRCC) Town Hall; Building Enduring Cyberinfrastructures – The Role of Professional Research Software Engineers)
- Practices and Experiences in Advanced Research Computing (PEARC22) Reviewer
- US-RSE Community Building Workshop Organizer
- Journal of Open Source Software (JOSS) Reviewer
- WiDS Chicago Datathon Team Mentors
Team Members
Staff
- Christina Maimone, Manager, Research Data Services
- Aaron Geller, Senior Data Visualization Specialist
- Tobin Magle, Lead Data Management Specialist
- Brian Roland, Data Management Specialist
- Colby Witherup Wood, Senior Data Scientist
Data Science Student Consultants
- Haley Carter, Plant Biology and Conservation
- Rahul Devathu, Neuroscience and Data Science
- Amanda d’Urso, Political Science
- Xi Cindy Kang, Computer Science
- Julianne Murphy, Health Sciences Integrated Program
- Julie Anh Nguyen, Applied Math
- Jose Sotelo, Cognitive Psychology
- Carrie Stallings, Sociology
- Dan Turner, Linguistics
Data Management Summer Interns
- Nasir Simms
- Daniel Turner