AI Foundations · Module 02 · April 2026
Responsible AI.

The human in the loop. When AI makes decisions about loans, hires, and diagnoses, someone still has to be accountable, and that someone is you.

Why this module 02 / 16
The stakes, in one sentence
AI is already deciding who gets hired, who gets a loan, and who gets care.

Technical excellence is necessary but not sufficient. Without a foundation of fairness, accountability, transparency, and privacy, AI systems amplify past discrimination and cause real harm. This module is about the guardrails — what they are, why they exist, and your role in using them.

Part One 03 / 16
01
Part One · Ethical challenges

Two problems that don't fix themselves.

Bias and hallucinations are not bugs in the normal sense. They're properties of how the system learns.

AI bias 04 / 16
The inherited problem

AI reflects the data it was trained on — including the unfairness in it.

Where bias comes from
  • Historical — past discrimination baked into the data
  • Representation — some groups underrepresented in training
  • Measurement — collection methods favor certain groups
  • Aggregation — one-size-fits-all models ignore minority patterns
What it looks like in the world
  • Healthcare — algorithms allocating less care to Black patients
  • Criminal justice — risk scores with racial bias
  • Hiring — resume screening favoring male candidates
  • Credit — lower limits for women (Apple Card, 2019)

Bias is not intentional. That's what makes it dangerous — nobody thinks they have to check.

Hallucinations 05 / 16
When the model gets creative with facts

AI predicts likely words — not true ones.

Why they happen
  • LLMs optimize for plausibility, not truth
  • Training data includes misinformation
  • Systems are pressured to always produce an answer
  • Models can't say "I don't know" by default
Real consequences
  • Lawyers citing non-existent cases in court
  • Students submitting essays with fabricated references
  • Medical misinformation presented confidently
  • Business decisions made on invented data

Prevention: always verify, cross-reference multiple sources, treat AI as a starting point, and prefer tools with citation capability (NotebookLM, Gemini) for anything you'll act on.

Part Two 06 / 16
02
Part Two · Building responsibility

Six principles. One moving target.

Global regulators, standards bodies, and industry coalitions mostly agree on the principles. The hard part is operationalizing them.

The six core principles 07 / 16
Transparency → accountability → human oversight

The ethical foundation of every AI system worth trusting.

01

Transparency

Understand how an AI system reaches its decisions — to the extent possible.

02

Accountability

Clear human ownership for every AI-driven action and its consequences.

03

Fairness

Equal treatment across groups. Audit for bias. Measure outcomes, not intent.

04

Privacy

Minimize data. Protect personal information. Consent, not assumption.

05

Safety

Prevent foreseeable harm — technical, social, psychological.

06

Human in the loop

Keep meaningful human oversight on any decision that affects people's lives.

IEEE · Partnership on AI · UNESCO · EU AI Act (prohibited uses in force Feb 2025 · high-risk systems Aug 2026) — different texts, same backbone.

Guardrails 08 / 16
Keeping AI on track

Layered safety mechanisms — not a single switch.

LAYER 01

Input filters

Block harmful or out-of-scope requests before the model sees them.

LAYER 02

Model-level

Training and fine-tuning shape the model's baseline values and refusals.

LAYER 03

Output filters

Screen generated content before it reaches the user — toxicity, PII, violence.

LAYER 04

Content moderation

Remove inappropriate material at the application layer.

LAYER 05

Safety classifiers

Specialized detectors for self-harm, extremism, CSAM, credential leaks.

LAYER 06

Human review

Escalation paths for edge cases. The final fallback that can't be bypassed.

The challenge: balance safety with usefulness. Over-restrictive guardrails block legitimate work; under-restrictive ones create real harm.
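To make the layering concrete, here is a minimal Python sketch of how an input filter (Layer 01) and an output filter (Layer 03) can wrap a model call, with human review as the fallback. The blocked phrases, the PII pattern, and every function name are illustrative assumptions, not any vendor's actual guardrail API.

```python
import re

# Illustrative only: the patterns and function names below are assumptions for this sketch.
BLOCKED_INPUT = re.compile(r"\b(make a weapon|steal credentials)\b", re.IGNORECASE)
PII_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # e.g. a US-style SSN

def input_filter(prompt: str) -> bool:
    """Layer 01: block harmful or out-of-scope requests before the model sees them."""
    return not BLOCKED_INPUT.search(prompt)

def output_filter(text: str) -> str | None:
    """Layer 03: screen generated content; None means 'do not show this to the user'."""
    return None if PII_PATTERN.search(text) else text

def answer(prompt: str, model_call) -> str:
    """Run a request through the layered pipeline around the model call (Layer 02)."""
    if not input_filter(prompt):
        return "Request declined by input policy."
    screened = output_filter(model_call(prompt))
    if screened is None:
        return "Response withheld and escalated for human review."  # Layer 06 fallback
    return screened

# Usage with a stand-in model; a real system would call an LLM here.
print(answer("What is the capital of France?", model_call=lambda p: "Paris."))
```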

Part Three 09 / 16
03
Part Three · Learning from failure

When safeguards weren't there.

Two high-profile failures, each with a lesson that shaped the industry's current defaults.

Case · Amazon recruiting AI 10 / 16
Amazon · 2014 → 2018 · Hiring · HR automation
10 years
of historical resumes used to train a recruiting model — and the model learned the bias in them.
What happened: the system downgraded resumes containing the word "women's" and penalized graduates of all-women colleges. It learned that, historically, male candidates were preferred, and treated that history as a rule.
The response: Amazon scrapped the system entirely; it could not guarantee the bias could be removed.
The lesson: training on biased history automates the bias. Always audit outcomes, not just accuracy.
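The "audit outcomes" advice can be made concrete in a few lines of Python. This is a minimal sketch, not Amazon's method: it compares selection rates across groups instead of overall accuracy, using invented records and group labels.

```python
from collections import defaultdict

def selection_rates(records):
    """records: iterable of (group, selected) pairs; returns the selection rate per group."""
    totals, chosen = defaultdict(int), defaultdict(int)
    for group, selected in records:
        totals[group] += 1
        chosen[group] += int(selected)
    return {group: chosen[group] / totals[group] for group in totals}

# Invented audit data for illustration only.
rates = selection_rates([
    ("group_a", True), ("group_a", True), ("group_a", False),
    ("group_b", True), ("group_b", False), ("group_b", False),
])
gap = max(rates.values()) - min(rates.values())
print(rates, f"selection-rate gap: {gap:.2f}")
# A large gap is a signal to investigate, not proof of intent -- which is exactly
# why outcome audits matter even when nobody meant to discriminate.
```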
Case · Microsoft Tay 11 / 16
Microsoft · 2016 · Conversational AI · Twitter
24 hours
from friendly launch to forced shutdown, after coordinated trolls fed the bot racist and extremist content.
What happened: Tay was designed to learn from Twitter conversations. Trolls exploited the "repeat after me" behavior and the absence of content filters; within hours the bot was posting deeply offensive content, and Microsoft took it offline.
The lesson: any system that learns from public input and acts in public needs adversarial testing, content filters, and rate limits. Assume users will try to break it, because some will.
Part Four 12 / 16
04
Part Four · Transparency & privacy

The black box and the data inside it.

Two practical problems that show up on the job: unexplainable decisions, and the temptation to feed AI things it shouldn't see.

The black box problem 13 / 16
Why "because the model said so" doesn't cut it

Modern models can't always explain themselves — but the affected person still deserves an explanation.

Why decisions become opaque
  • Millions of calculations per inference
  • No single rule explains any answer
  • Even developers can't trace the logic
  • Patterns are learned in spaces humans can't see
Where it hurts
  • Credit — "denied" with no explanation
  • Medical — AI flags a scan, doctor doesn't know what it saw
  • Hiring — "not selected", criteria unclear
  • Moderation — posts removed without reason

What regulators now require: the right to an explanation, the right to appeal, and evidence the system was audited for fairness. This is why explainability is a skill — not a nice-to-have.
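As one hedged illustration of what "explainable" can mean in practice, the Python sketch below scores a hypothetical credit applicant and returns the factors that pushed the decision, ranked by influence. The features, weights, and threshold are invented; real models need dedicated explainability tooling, but the shape of the output (a decision plus ranked reasons) is what appeals and audits require.

```python
# Invented toy model: a handful of weighted features, purely for illustration.
WEIGHTS = {"income": 0.4, "debt_ratio": -0.9, "years_employed": 0.3}
THRESHOLD = 0.5

def score_with_reasons(applicant: dict) -> tuple[bool, list[str]]:
    """Return the decision plus each feature's contribution, strongest first."""
    contributions = {name: WEIGHTS[name] * applicant[name] for name in WEIGHTS}
    approved = sum(contributions.values()) >= THRESHOLD
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    reasons = [f"{name}: {value:+.2f}" for name, value in ranked]
    return approved, reasons

approved, reasons = score_with_reasons(
    {"income": 1.2, "debt_ratio": 0.8, "years_employed": 0.5}
)
print("approved" if approved else "denied", reasons)
# Instead of "denied with no explanation", the applicant can be told which
# factors mattered most -- and a reviewer has something concrete to appeal.
```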

What you can — and can't — put into AI 14 / 16
Data privacy in practice

When you paste it into ChatGPT, you're sharing it with more than ChatGPT.

Data type                         | Public AI (ChatGPT, Gemini) | Enterprise / approved AI tool
Public information                | Allowed                     | Allowed
Internal non-sensitive            | Strip identifying details   | Allowed
Client data                       | Never                       | With approval
Personal employee data            | Never                       | With approval
Passwords, API keys, credentials  | Never                       | Never

Pseudonymize by default. "Private mode" is a UI affordance — not a data-protection guarantee.
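One way to practice "pseudonymize by default" before pasting anything into a public tool, sketched in Python. The alias scheme and regular expressions are illustrative assumptions; use whatever anonymization tooling your organization has approved, and keep the mapping on your side.

```python
import re

def pseudonymize(text: str, names: list[str]) -> tuple[str, dict[str, str]]:
    """Replace known names with aliases and redact email addresses before sharing."""
    mapping = {}
    for i, name in enumerate(names, start=1):
        alias = f"PERSON_{i}"
        mapping[alias] = name            # keep this mapping local, never in the prompt
        text = re.sub(re.escape(name), alias, text)
    text = re.sub(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b", "EMAIL_REDACTED", text)
    return text, mapping

prompt, mapping = pseudonymize(
    "Summarize the complaint from Jane Roe (jane.roe@example.com) about invoice 4411.",
    names=["Jane Roe"],
)
print(prompt)  # safe to send; swap the real names back in locally using `mapping`
```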

The short list 15 / 16
Keep these six rules within reach
Verify every fact, every citation, every number before you publish.
Minimize the data. Only share what the task actually needs.
Anonymize names, IDs, and specifics when you ask for help.
Consent for photos, recordings, and personal stories. Always.
Never share passwords, keys, client data, or proprietary code with public tools.
Be transparent when AI meaningfully shaped your work.
End of Module 02 16 / 16
// TAKE HOME
Verify.
Respect.
Own the decision.

AI can draft, suggest, and accelerate — but accountability doesn't transfer. The human in the loop is not a compliance checkbox. It's the job.

Next up · Module 03 — the tools of the trade. Module 04 — deepfakes, your rights, and AI at work.