Northwestern Guidance on the Use of Generative AI
Generative AI offers the potential for new capabilities in research, education, and productivity. Unsurprisingly, the use of tools and services such as OpenAI’s ChatGPT, Microsoft’s Bing Chat, and Google’s Bard is growing in higher education and across Northwestern University. Understanding what to look for when adopting these tools is key to achieving the intended use while protecting University data.
Background
Generative AI is a general term for artificial intelligence that creates new content based on patterns in the data sets used to train it. These generative models learn patterns, structures, and features from the input data and can create content with similar characteristics. Well-known generative AI tools include ChatGPT, Bard, Bing Chat, Starry AI, and MidJourney. Generative AI is also now embedded in many commercial products that integrate with other tools, such as OtterAI, Microsoft Copilot, Zoom IQ, and Fireflies.
Privacy
In most cases, the data you share as part of your queries in generative AI tools may be accessible to others using the same tools, because these services learn by collecting, analyzing, and storing user-provided information. Therefore, University faculty, staff, students, and affiliates should not enter institutional data into any generative AI tool unless the tool has been validated by the University for appropriate use and the data provider has given explicit permission.
Data Approved for Use with Generative AI
To determine whether your data requires special attention, consult Northwestern’s Data Classification Policy. If your data is Level 1 (non-confidential and public data), uploading it to generative AI tools is permissible. To process data above Level 1, any generative AI tool must have been approved through Northwestern IT’s procurement and security review processes.
The following table outlines Northwestern’s current services posture based on data classification:
| Interaction Type | Public Data (Level 1) | Sensitive/Regulated Data (Levels 2, 3, and 4) |
| --- | --- | --- |
| Conversational/Interactive Mode | Publicly available tools (e.g., ChatGPT, Bing Chat, Bard/Gemini, MidJourney) | Microsoft Copilot for Bing, when signed in with a Northwestern Microsoft account, for Level 2 and Level 3 data* |
| Application Programming Interfaces | Northwestern’s Azure OpenAI Service with appropriate security and access controls | Northwestern’s Azure OpenAI Service with security and access controls that meet or exceed regulatory or data protection requirements |
*Copilot for Bing is approved for Level 2 and, in general, Level 3 data. Some Level 3 data types require additional non-technical controls that may not be available in Copilot for Bing. Please check your contractual or legal obligations to determine whether Copilot for Bing can be used with your Level 3 data. If you have additional questions about whether your data type is appropriate for Copilot for Bing, please contact the Information Security Office (security@northwestern.edu).
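As an illustration of the API row above, the following is a minimal sketch of querying an Azure OpenAI deployment from Python without hard-coding credentials, one of the basic access controls such a deployment should use. The endpoint and deployment name are hypothetical placeholders, not actual Northwestern resources; consult Northwestern IT for the configuration provisioned for your project.

```python
# Minimal sketch: querying an Azure OpenAI deployment without hard-coding
# credentials. The endpoint and deployment name below are hypothetical
# placeholders; substitute the values provisioned for your project.
import os

from openai import AzureOpenAI  # pip install openai

client = AzureOpenAI(
    # Read the key from the environment so it never appears in source control.
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
    azure_endpoint="https://example-nu-project.openai.azure.com",  # hypothetical
)

response = client.chat.completions.create(
    model="example-gpt-deployment",  # hypothetical deployment name
    messages=[
        {"role": "user", "content": "Summarize the following public abstract."},
    ],
)
print(response.choices[0].message.content)
```

Reading the key from an environment variable (or, better, authenticating with Azure Active Directory credentials) keeps secrets out of source control; a University deployment may impose additional network and logging controls beyond what this sketch shows.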
This page will be updated as other products are evaluated for security and privacy with Northwestern data.
Academic Use at Northwestern
Use of generative AI for teaching and learning purposes is governed by the Provost’s Committee on Generative AI, in tandem with Northwestern IT. Guidance on generative AI tools and their impact on teaching and learning can be found on the Office of the Provost website.
Research Use at Northwestern
Questions about AI usage pertaining to research applications can be directed to the Northwestern IT Research Computing and Data Services team.
All Other Uses of AI at Northwestern
Questions about AI usage pertaining to information security, privacy considerations, use of AI tools with Level 2 or above data, or evaluation of new third-party AI products can be directed to the Northwestern IT Information Security Office.
Further Considerations for Adoption
Generative AI models are designed to produce the most statistically common output given their training data, which tends to suppress less common or marginalized information. Further, these tools can “hallucinate,” fabricating information that is not true, because they have no ability to determine what is true or false.
- Revalidation of output is required: Because generative AI models continue to change over time, their responses to the same input may change as well, and models may “hallucinate” or otherwise include false information in their responses. Where these tools are deployed in settings that require accurate output, they should be periodically revalidated to ensure appropriate responses (see the sketch after this list).
- Original work and intellectual property: Obtain any necessary approvals for the use of original works and intellectual property before entering them into generative AI tools. Examples include original works created by students, research results, and copyrighted material obtained from third parties.
- Informed consent: People have the right to know when they are interacting with generative AI tools. That knowledge supports informed decisions about whether to engage with, share, or rely on the generated content.
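To make the revalidation point above concrete, the following sketch re-runs a fixed set of prompts against a model and flags answers that no longer contain expected facts. It is illustrative only: the get_completion() helper is a hypothetical stand-in for a call to whichever University-approved tool or API you use, and the checks themselves are toy examples.

```python
# Illustrative revalidation sketch: re-run fixed prompts and flag drift.
# get_completion() is a hypothetical stand-in for a call to an approved
# generative AI service; replace it with your actual client code.

# Each entry pairs a prompt with substrings a correct answer must contain.
CHECKS = [
    ("In what year was Northwestern University founded?", ["1851"]),
    ("How many days are in a leap year?", ["366"]),
]


def get_completion(prompt: str) -> str:
    """Hypothetical stand-in for a call to an approved generative AI tool."""
    raise NotImplementedError("Wire this to your approved service.")


def revalidate() -> list[str]:
    """Return the prompts whose current answers fail their checks."""
    failures = []
    for prompt, required_substrings in CHECKS:
        answer = get_completion(prompt)
        if not all(s in answer for s in required_substrings):
            failures.append(prompt)
    return failures


if __name__ == "__main__":
    failed = revalidate()
    print("Revalidation failures:", failed or "none")
```

Running such a check on a schedule, or after a model or prompt change, gives an early signal that responses have drifted, though human review is still needed for anything beyond simple factual checks.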