
Generative AI on Your Terms: Data and Privacy 101

The instructional practices shared in this article are ideas for exploration, not requirements for any instructor. They were developed by Northwestern IT Teaching and Learning Technologies in partnership with the Provost’s Generative AI Advisory Committee. Please note: 

  • Accessing Copilot with your Northwestern credentials is the recommended way to use a generative AI tool. Questions about whether a risk assessment has been performed or an institutional contract exists for a specific AI tool can be directed to the Northwestern IT Information Security Office (security@northwestern.edu). Procurement of new AI tools should follow university processes and policies regarding licensing and third-party risk assessments.
  • Output from large language models (LLMs) can include false or incorrect information. Verifying accuracy via other sources is a critical practice for instructors, students, and staff to engage in when using LLMs.

In July 2024, I came across a generative AI tool that seemed promising for breaking down academic articles into manageable chunks of familiar language. However, having heard that some generative AI tools train on user input, I wanted to find out whether copyrighted materials would be protected. The FAQ on the tool's site said it would not share my files with anyone, but I could not find any information about whether it would train its AI model on my uploads. So I dug into the Terms and Conditions and discovered that uploading copyrighted material was actually prohibited. Not only did I lack a clear answer to my question about model training, but I also began wondering how other tools might use what I put into them: not just articles I might upload, but any of my own intellectual property. And what does this mean for courses I might teach that involve generative AI? How can I make sure data and information are being shared on my terms?

Below, I have summarized my findings regarding this question of data and information sharing consent. Please note that this information is accurate as of August 2024, but generative AI is a fast-moving field, and things change rapidly.  


A Baseline for Information Sharing 

When using generative AI in our courses, there are a few key practices Northwestern IT recommends: 

Protect Private Information

Start with Northwestern’s guidance that neither students nor instructors should share private or otherwise sensitive information (about themselves or others) with generative AI. This guidance aligns with University policies and helps ensure appropriate data protections, an important aspect of responsible generative AI use.

Use Copilot: The Safer Option for the Northwestern Community

Through the University’s Microsoft license, Northwestern students, faculty, and staff have access to Microsoft’s implementation of the GPT-4 large language model through Microsoft Copilot (available only through a smartphone app or internet browser). This access matters because when you are signed in with your Northwestern account, any data you put into the chat is covered by Northwestern’s data protection contract with Microsoft, meaning Microsoft does not use it for product improvement or to train their AI models. Copilot is the closest interface to ChatGPT, but it provides these data protections only when you are signed in with a Northwestern Microsoft account. (Learn more about logging in and using Copilot.)

Communicating Your Expectations to Students

Include a statement about generative AI in your syllabi. The AI at Northwestern site offers a guide to exploring course policy options related to generative AI.

Moving Beyond the Baseline 

Copilot is the generative AI tool available most broadly to Northwestern students, faculty, and staff. In addition to accessing existing Northwestern IT resources for generative AI, it is a good idea to know what to look for when evaluating a generative AI tool that is new to you, or when generative AI features are added to tools you already use.


Questions to Ask about Data When Looking at a Generative AI Tool

  1. Will the tool train using my data? 
  2. What data will be shared with the AI model? 
  3. Will the tool store my data? 
  4. How will data be logged by the tool? (Logged data might be used in a dataset to analyze activity or trends.) 
  5. Will personally identifiable information (PII) be sent to the model?
  6. How will I know the output of the tool is accurate? Who owns the output?
  7. Are users offered settings to customize any of these behaviors?

Get Students in on the Conversation 

Bring your students into the conversation about sharing intellectual property with generative AI. You might even co-create your generative AI classroom policies with your students early in the quarter. Including students in the decision-making process around information-sharing and generative AI empowers them to become more intentional consumers of generative AI and can be done by examining information publicly available from a variety of tools.  

Note that the following exercises can be done without creating new accounts; instead, students can access publicly available documents such as a Terms of Service, Privacy Policy, Terms and Conditions, or User Guidelines. Some examples are the Grammarly Privacy Policy, Adobe Generative AI User Guidelines, Gemini Apps Privacy Hub, and the Elicit Privacy Policy.


Generative AI Review Committees 

Divide students into small groups and assign each group a privacy or terms of service document to investigate. First, have the group consider and discuss their opinions around information-sharing with generative AI. Then, have the group search the document for information about the ways their assigned generative AI uses personal data and submissions. The group can formulate a recommendation for the generative AI tool and present their findings to the class. 

Class Corpus (with Commentary)

  1. Either select an AI tool or request that students find a privacy policy or terms and conditions document from a tool on their own.
  2. Create a Canvas discussion board prompt like: 
Respond to the questions below for the AI tool selected by your class, and then reply to at least one other thread with your opinion on whether you would feel comfortable using that tool (and why) based on your peers’ discoveries.
  • What do the terms and conditions of "TOOL NAME" say about how my user/profile information will be used? 
  • What do the terms and conditions of "TOOL NAME" say about how my contributions (any information I type, upload, etc.) will be used?  
  • When I upload something, do I retain ownership of it? Does uploading it mean that the platform/company can now use, copy, or distribute it without my permission? (These ideas are usually discussed in terms of "rights reserved" or "rights to contributions.")
  • Does the platform/company explicitly say I should not upload or submit anything, such as copyrighted texts or personally identifiable information?

In Closing

Whether your class uses Copilot for coursework, or you explore tool policies without using the tools, you and your students will benefit from getting to know the data protection and privacy policies that are common in generative AI tools and beyond. The more we support our students in understanding the basics of engaging with generative AI during their time at Northwestern, the more thoughtful and nuanced their lifelong decisions around generative AI will be. 

Video Resource

This explainer video lays out the big ideas behind how large language models generate material and their limitations.

From the Northwestern Center for Advancing Safety of Machine Intelligence.

