AI Tracing Platforms: Langfuse, LangSmith, Helicone, Phoenix
Compare tracing and observability platforms specifically for LLM and agent applications.
AI Evaluation Platforms: When to Buy vs Build
Eval platforms (Braintrust, LangSmith, Weights & Biases) accelerate teams. The buy-vs-build call depends on team size, use cases, and customization needs.
Comparing AI Evaluation Platforms
Eval platforms (Braintrust, LangSmith, Weights & Biases) each approach evaluation differently, and the choice matters.
LLM Observability Tools: What to Trace, What to Sample, What to Alert
LLM observability tools (LangSmith, Langfuse, Helicone, Datadog LLM Observability, custom builds) all trace conversations. What differentiates them is evaluation, dashboards, and alerting; choosing the wrong tool wastes months.
AI Observability Stack 2026: Traces, Metrics, and Cost in One Pane
Building a unified view across LangSmith, Datadog LLM Observability, OpenTelemetry, and custom dashboards.
AI Agent Evaluation Platforms in 2026
Compare LangSmith, Braintrust, Humanloop, and friends for evaluating multi-step agent traces.
Building with LangGraph
LangGraph became the production favorite in 2026 for good reasons — explicit state, checkpointing, first-class MCP. Build a real agent end-to-end and learn why.
Evaluating Agent Performance: SWE-bench, WebArena, GAIA
Numbers on leaderboards are seductive and often wrong. Learn the big benchmarks, how their leaderboards rank models, their recently exposed cheats, and how to run your own evals.
Production Agent Patterns: Queues, Retries, Idempotency
A prototype agent and a production agent have the same LLM. What's different is everything around it — durable state, retries, idempotency, observability. The real engineering.
Capstone: Build and Ship a Real Agent
Everything comes together. Design, code, test, secure, and ship a production-quality agent with open-source code you can fork today.
Reading an Agent Trace
A trace is the full record of what an agent did and why.
ML Engineer in 2026: You Build the Tools Everyone Else Uses
Fine-tune, evaluate, serve, monitor. The ML engineer is the person who ships the models that now power medicine, law, and design. It is the highest-leverage engineering role.
Evaluating Prompt Performance: From Vibes to Metrics
You can't improve what you don't measure. Build an eval set, pick metrics, and turn prompt engineering from gut-feel into a rigorous discipline.
AI Monitoring Stack: From Metrics to Quality
AI monitoring requires more than uptime metrics. Quality monitoring, drift detection, and outcome tracking are the real differentiators.
Evaluation
Testing a model's quality — beyond benchmarks, using real tasks and user feedback.
LLMOps
MLOps specifically for LLM-based applications — prompt versioning, eval, and inference ops.
LangChain
Open-source framework for chaining LLM calls, tools, and memory into apps.