Data Poisoning Detection: Why Your Fine-Tuning Pipeline Needs Provenance Controls
Poisoned training data — whether from compromised supply chains or insider attacks — can introduce backdoors that survive evaluation. Detection requires provenance tracking, statistical anomaly detection, and behavioral evaluation against trigger patterns.
11 min · Reviewed 2026
The premise
Data poisoning is the supply-chain risk for fine-tuned models; detection is multi-layered and starts with provenance.
What AI does well here
Track data provenance from source to training pipeline (cryptographic hashes, source attestation)
Run statistical anomaly detection on training data (label distribution, feature distribution, outliers)
Evaluate model behavior against suspected trigger patterns post-training
Maintain a separate, trusted evaluation set never exposed to the training pipeline
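The first two items above — per-sample cryptographic hashing for provenance and a label-distribution anomaly check — can be sketched with nothing but the Python standard library. This is a minimal illustration, not a production pipeline: the function names (`hash_sample`, `build_manifest`, `verify_manifest`, `label_distribution_drift`), the canonical-JSON encoding choice, and the fixed drift tolerance are all assumptions made for the sketch.

```python
import hashlib
import json
from collections import Counter

def hash_sample(sample: dict) -> str:
    """Deterministic SHA-256 over a canonical JSON encoding of one sample."""
    canonical = json.dumps(sample, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def build_manifest(samples):
    """Record one hash per sample at ingestion time (the trusted snapshot)."""
    return [hash_sample(s) for s in samples]

def verify_manifest(samples, manifest):
    """Return indices of samples whose hash no longer matches the manifest,
    i.e. samples modified after ingestion."""
    return [i for i, (s, h) in enumerate(zip(samples, manifest))
            if hash_sample(s) != h]

def label_distribution_drift(samples, baseline, tolerance=0.05):
    """Flag labels whose observed frequency drifts more than `tolerance`
    from a trusted baseline distribution -- a crude anomaly check."""
    counts = Counter(s["label"] for s in samples)
    total = sum(counts.values())
    flagged = {}
    for label, expected in baseline.items():
        observed = counts.get(label, 0) / total
        if abs(observed - expected) > tolerance:
            flagged[label] = observed
    return flagged
```

Hashing a canonical serialization (rather than the raw in-memory object) matters because dict ordering would otherwise make hashes non-deterministic; any tampering between ingestion and training then shows up as a manifest mismatch.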
What AI cannot do
Detect poisoning that perfectly mimics legitimate data distribution
Substitute for supply-chain controls on data sources
Replace human review of suspicious data clusters
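Behavioral evaluation against suspected trigger patterns, mentioned above, reduces to a simple comparison: does appending a candidate trigger to otherwise-clean inputs disproportionately flip the model's predictions? The sketch below assumes a text classifier exposed as a `predict(text) -> label` callable; the function name `trigger_flip_rate` and the string-concatenation trigger injection are illustrative assumptions.

```python
def trigger_flip_rate(predict, inputs, trigger):
    """Fraction of clean inputs whose predicted label changes when a
    suspected trigger string is appended. A rate far above the model's
    normal sensitivity to appended text suggests a backdoor."""
    if not inputs:
        raise ValueError("need at least one input")
    flips = sum(predict(x) != predict(x + " " + trigger) for x in inputs)
    return flips / len(inputs)
```

In practice the candidate triggers come from human review of suspicious data clusters — which is exactly why that review step cannot be automated away.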
End-of-lesson check
15 questions · take it digitally for instant feedback at tendril.neural-forge.io/learn/quiz/end-ethics-safety-data-poisoning-detection-adults
Which cryptographic technique is most appropriate for verifying training data provenance throughout a fine-tuning pipeline?
End-to-end encryption of data stores
Digital signatures on model weights
OAuth tokens for API authentication
Cryptographic hashes of individual data samples
Statistical anomaly detection on training data typically monitors which of the following?
Label distribution, feature distribution, and outlier presence
GPU memory utilization during training
User engagement metrics with deployed models
Model inference latency patterns
Why must a trusted evaluation set remain completely separate from the training pipeline?
To comply with GDPR data minimization requirements
To prevent data leakage that would inflate performance metrics
To ensure the model sees only fresh data during testing
To reduce computational costs during evaluation
What limitation prevents AI systems from detecting all forms of data poisoning?
AI systems lack sufficient computational power for large-scale analysis
AI cannot process unstructured data like text and images
AI requires labeled data to identify poisoning patterns
AI cannot detect poisoning that perfectly mimics legitimate data distribution
Which audit area examines vendor attestations and internal access controls for training data?
Statistical anomaly detection
Trigger-pattern evaluation
Evaluation set integrity
Supply-chain trust
After training, behavioral evaluation against suspected trigger patterns serves which purpose?
To compress model size for deployment
To measure model accuracy on standard benchmarks
To optimize hyperparameter settings
To detect whether backdoors were implanted during training
Which activity cannot be fully replaced by automated AI detection in a poisoning defense strategy?
Logging model training metrics for audit trails
Human review of suspicious data clusters
Running statistical anomaly detection on data distributions
Generating cryptographic hashes for provenance tracking
Why are sophisticated backdoors particularly difficult to detect through standard testing?
They require extremely large datasets to activate
They only affect models with fewer than 10 billion parameters
They trigger on rare, naturalistic combinations developers wouldn't test
They produce audible warnings when activated
What does source attestation provide in a data provenance control system?
Provides automatic translation of data between formats
Verifies that the data source certifies its origin and integrity
Guarantees the data was generated by humans
Ensures data is stored in geographic locations meeting compliance requirements
Which control area would specifically address the question: 'Do we test the trained model against known trigger patterns?'
Incident response planning
Data provenance tracking
Supply-chain trust
Trigger-pattern evaluation
What is the primary purpose of maintaining cryptographic hashes for training data?
To detect any unauthorized modification of data after ingestion
To compress the storage footprint of training datasets
To generate synthetic training examples
To speed up data loading during training
In the context of data poisoning, what is a backdoor attack?
A vulnerability in network infrastructure that exposes training data
A physical security breach at data centers
A method for unauthorized access to model weights
An attack where poisoned data causes model behavior changes on specific trigger inputs
What should an incident response plan for data poisoning include?
Procedures for model deployment to production
Guidelines for marketing the poisoned model
Methods for increasing training dataset size
Steps to contain and remediate poisoning once detected
Why can't provenance tracking alone prevent data poisoning?
Provenance tracking requires more storage than is available
Provenance tracking is incompatible with cloud computing
Provenance only verifies data origin, not whether the source itself was compromised
Provenance cannot be applied to text data
Which scenario represents the greatest challenge for automated poisoning detection?
Training data containing obvious duplicate entries
Poisoned data with statistical properties nearly identical to clean data