How do I know what data or information was used to train AI tools?

Question

Q. How do I know what data or information was used to train AI tools?

Answered By: Amanda Suiters

Last Updated: Mar 26, 2025 Views: 18

The short answer: you usually don't.

Most AI tools, like ChatGPT, were trained on large, mixed datasets drawn from books, websites, and online forums. The training data isn't always made public, and it often doesn't include up-to-date or peer-reviewed academic sources. Most academic-quality research is behind paywalls and cannot be included in the datasets because of copyright law.

This means AI tools may give outdated, biased, or incorrect information, especially in scientific, legal, or technical fields.

Links & Files

Artificial Intelligence: A Guide for Students

