Data Handling Guidelines for AI

General Guidelines

All members of the university community must access and use university data in ways that safeguard the data and protect the institution.

Units and members of the university community must ensure:

  1. Compliance with regulatory requirements, as well as third-party and other contractual data obligations.
  2. Data is used for the purposes for which it is collected and any restrictions for its use are observed.
  3. Data is collected, stored, and disposed of in ways appropriate to the risk and impact of unintended disclosure.

The following table provides guidance on the risks of using various tools with different types of university and external data.

'Personal' licenses, as defined below, are intended for private, individual use and in general should not be used by faculty and staff with university data. In contrast, it may be possible to acquire individual licenses under institutional or enterprise licensing agreements, which often include additional protections negotiated for the safe handling of university data.

Inclusion of a product in the list below does not necessarily mean an institutional agreement is in place that would allow for its use.

Definitions:

  • Personal license: For private individuals who purchase a license for their own use
  • Individual license: For a license purchased by a single person within an organization through an enterprise agreement
  • Institutional license: For organizations or business entities

Legend

  • Currently available
  • Not currently available
  • Recommended - Generally safe to use with minimal risk
  • Warning - Can be used with caution and specific guidance
  • Not Recommended - Significant risks that make the tool unsuitable for the data classification

Data Policy Reference - Data Management Policy
Data Classification Reference - Data Classifications
Artificial Intelligence Tools - KB Article
Data Sanitization - KB Article

License Type | Product | Data Classifications and Recommendations
Data classification columns: Public Data | Internal Data | Limited Data | Restricted Data
Numbers after a product name refer to the numbered notes below.

USask
  • Institutional
    • USask Data Centre Hosted AI
    • Locally Hosted AI - Research Segment 6 6
    • Locally Hosted AI - Managed Computer 6 6
    • Locally Hosted AI - Unmanaged Computer 6 6 6

Microsoft
  • Institutional
    • Microsoft 365 Copilot Chat - web
    • Microsoft 365 Copilot
    • Azure AI Services 4 4 4 5
    • Copilot Studio 4 4 4 5
  • Personal
    • Bing Search 1 2 3 1 2 3 1 2 3 1 2 3
    • Copilot for Home 1 2 3 1 2 3 1 2 3 1 2 3
    • Copilot Pro 1 2 3 1 2 3 1 2 3 1 2 3

AWS
  • Institutional
    • Amazon Bedrock 4 4 5 4 5
    • SageMaker 4 4 5 4 5
    • Amazon Q 4 4 5 4 5
  • Personal
    • AI Services 1 2 3 1 2 3 1 2 3 1 2 3

OpenAI
  • Institutional
    • ChatGPT Edu 4 4 4
    • OpenAI API 5 5 6 5 6
  • Personal
    • ChatGPT Free 1 2 3 1 2 3 1 2 3 1 2 3
    • ChatGPT Plus 1 2 3 1 2 3 1 2 3 1 2 3

Google
  • Institutional
    • Gemini Workspace
    • Google AI Studio 4 4 6 4 6
    • Vertex AI 5 6 5 6 5 6
    • NotebookLM Plus
  • Personal
    • Gemini Free 1 2 3 1 2 3 1 2 3 1 2 3
    • Gemini Advanced 1 2 3 1 2 3 1 2 3 1 2 3
    • NotebookLM 1 2 3 1 2 3 1 2 3 1 2 3
    • NotebookLM Plus 1 2 3 1 2 3 1 2 3 1 2 3

Anthropic
  • Institutional
    • Claude for Enterprise 4 4 4 6
    • Claude API 5 5 6 5 6
    • Claude Research Teams 4 4 6 4 6
  • Personal
    • Claude (Free) 1 2 3 1 2 3 1 2 3 1 2 3
    • Claude Pro 1 2 3 1 2 3 1 2 3 1 2 3

Perplexity
  • Institutional
    • Perplexity Business 4 4 6 4 6
    • Perplexity Deep Research 4 6 4 6 4 6
    • Sonar APIs by Perplexity 5 5 6 5 6
  • Personal
    • Perplexity (Free) 1 2 3 1 2 3 1 2 3 1 2 3
    • Perplexity Pro 1 2 3 1 2 3 1 2 3 1 2 3

DeepSeek
  • Personal
    • DeepSeek - cloud 1 2 3 1 2 3 1 2 3 1 2 3

Zoom
  • Institutional
    • Zoom AI Companion

Notes: The AI marketplace is changing rapidly; as a result, the classification levels are illustrative and subject to change. Always consult current service terms or reach out to IT Support for additional consultation and support.

  1. Do not use Personal licenses to process, store, or share university or third-party data, as they lack the contracts or service agreements that safeguard ownership and control of university data.
  2. Services used under Personal licenses may incorporate user-submitted content to improve their models. Use of such services with USask data can result in the loss of IP/patent rights or the leaking of content to unauthorized users.
  3. Be careful when publishing outputs from services used under Personal licenses. These services do not include legal indemnification, and publication of their outputs may lead to IP claims against the generated content.
  4. When training new models with sensitive USask data, take steps to restrict access permissions to the model. AI models may leak their underlying training data, so treat access to the model as equivalent to access to the raw underlying data.
  5. Use of advanced AI tools (APIs, agents, etc.) should be reviewed with ICT early in the planning stages. AI tools can leak underlying training data and give users false or misleading answers, so great care should be taken before integrating AI tools into public-facing services.
  6. When using AI services with restricted and limited data, it is recommended to use USask, Microsoft, and AWS platforms, with guidance from the Security and Operations teams.
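
For units that want to build notes 1 and 6 into an intake form or checklist, the rules can be sketched in Python as below. This is an illustrative sketch only, not an official USask tool: the names `License`, `DataClass`, `PREFERRED_PLATFORMS`, and `applicable_notes` are hypothetical, and the function only flags notes to review. It does not approve or reject any use, and it does not replace consultation with IT Support.

```python
from enum import Enum


class License(Enum):
    PERSONAL = "personal"
    INDIVIDUAL = "individual"        # purchased under an enterprise agreement
    INSTITUTIONAL = "institutional"


class DataClass(Enum):
    PUBLIC = "public"
    INTERNAL = "internal"
    LIMITED = "limited"
    RESTRICTED = "restricted"


# Platforms named in note 6 for limited and restricted data.
PREFERRED_PLATFORMS = {"USask", "Microsoft", "AWS"}


def applicable_notes(license_type, data_class, vendor,
                     university_or_third_party_data=True):
    """Return the guidance notes a proposed use should be checked against.

    Hypothetical helper: it encodes only notes 1 and 6 from the list above.
    """
    notes = []
    # Note 1: Personal licenses lack the agreements that protect university data.
    if license_type is License.PERSONAL and university_or_third_party_data:
        notes.append("Note 1: do not use a Personal license with "
                     "university or third-party data.")
    # Note 6: limited and restricted data call for specific platforms and guidance.
    if data_class in (DataClass.LIMITED, DataClass.RESTRICTED):
        if vendor not in PREFERRED_PLATFORMS:
            notes.append("Note 6: USask, Microsoft, and AWS platforms are "
                         "recommended for this data.")
        notes.append("Note 6: seek guidance from the Security and "
                     "Operations teams.")
    return notes
```

For example, `applicable_notes(License.INSTITUTIONAL, DataClass.RESTRICTED, "AWS")` returns only the reminder to seek guidance, while any Personal license combined with university data always returns note 1.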

Data Classification Guidance

USask Data

Types of USask Data

Institutional data: Data that is created, collected and stored by all units and members of the university community, in support of academic and administrative activities. Administrative data about teaching, learning, research and scholarly activity, such as grades, attendance, research grants held and publications generated, is considered institutional data.

Research data: Data that is created by or derived from research, scholarly, and artistic activities.

Personal data: Data that contains personal information about an identifiable individual, as defined in the provincial Local Authority Freedom of Information and Protection of Privacy Act (LAFOIP). This data, if compromised or used inappropriately, would have implications for the privacy of an individual.

Public Data

  • Data that is (or can be) generally available to all employees, the general public, and the media. Unintended disclosure of such information has no effect on an individual, a group or institutional operations, assets or reputation.
  • Examples include:
    • Course catalogs
    • University event schedules
    • General contact information for departments
    • Publicly available university policies and guidelines

Internal Data

  • Data that is available to those members of the university community or research project team with a clear need for access as part of their employment, academic, or research duties and responsibilities. Unintended disclosure of such information has minimal or no effect on an individual, a group or institutional operations, assets or reputation.
  • Examples include:
    • Aggregated or de-identified personal information
    • Intellectual property - patent applications and drafts of research papers
    • Raw and processed data from research with no human/animal ethics considerations
    • Internal meeting minutes and operational plans

Limited Data

  • Data of a sensitive or confidential nature which is intended for limited internal use. Unintended disclosure of such information has a moderate effect on an individual, a group or institutional operations, assets or reputation. Access to data is generally limited to individuals in specific job functions.
  • Examples include:
    • Student numbers or grades
    • Detailed floor plans showing gas, water, hazardous materials
    • Data on identifiable human biological material (e.g. tissue samples)
    • Most identifiable personal information

Restricted Data

  • Data of a highly sensitive or confidential nature which is intended for restricted internal use. Unintended disclosure of such information is serious and has a severe or adverse effect on an individual, a group or institutional operations, assets or reputation. Access to data is restricted to specific legitimate use cases.
  • Examples include:
    • Confidential research data
    • Data governed by third-party agreements/contracts with stipulations for restricted access
    • Identifiable health information (must comply with relevant health information privacy laws)
    • Data subject to export control regulations
    • Social insurance numbers / Credit card numbers / Equity and Diversity Information such as Gender Identity
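
The four USask classifications above differ mainly in the stated effect of unintended disclosure. As a rough aid, that relationship can be sketched as a lookup in Python. The names `IMPACT_TO_CLASSIFICATION` and `classify_by_impact` are hypothetical, and the mapping is a simplification: Internal data is defined as having "minimal or no effect", so data with no disclosure impact may still be Internal if it is not intended for the public. Intended audience, contracts, and legislation also determine the classification.

```python
# Simplified mapping from the stated effect of unintended disclosure to the
# classification names used in this guideline. Illustrative assumption only.
IMPACT_TO_CLASSIFICATION = {
    "none": "Public",        # no effect on individuals or the institution
    "minimal": "Internal",   # minimal or no effect
    "moderate": "Limited",   # moderate effect
    "severe": "Restricted",  # severe or adverse effect
}


def classify_by_impact(impact):
    """Return a starting-point classification for a disclosure impact level."""
    key = impact.strip().lower()
    if key not in IMPACT_TO_CLASSIFICATION:
        raise ValueError(f"Unknown impact level: {impact!r}")
    return IMPACT_TO_CLASSIFICATION[key]
```

A result from this lookup is a starting point for discussion only; the Data Classification Reference remains the authority for any real data set.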

Third-Party Data

Data that is created or owned by a third party and is used in support of academic, research and administrative activities. This data, if compromised or used inappropriately, would have implications for the third party. This includes data such as licensed content, data sets, or copyrighted material.

Public (free)

  • Examples may include:
    • Web sites
    • Open-access research articles
    • Publicly available datasets
  • Note: "Free" or "public" does not mean the content is free of copyright protection, and it is not settled whether using copyrighted material in an AI tool is considered "fair use". Verify compliance with organizational data sharing and copyright policies before use.

Limited / Restricted

  • Data governed by third-party agreements/contracts/licenses. Examples may include:
    • Confidential research data
    • Identifiable health information
    • Licensed data sets subject to intellectual property rights
    • Data provided under a licensing agreement from the publisher (for example, published research papers available through the University Library)
    • Data shared under non-disclosure agreements (NDAs), which define the scope of usage and sharing