LangFuzz: Redteaming for Language Models
One of the hardest parts of testing your LLM application is coming up with a good dataset of edge cases to benchmark it on.
`langfuzz` is a new experimental library that helps with that.
❓How?
It uses a fuzz testing technique similar to metamorphic testing. It generates pairs of similar questions and runs both through the LLM application. If the answers are drastically different, then at least one of them is probably wrong!
From there, you can add both, one, or neither of the questions to a LangSmith dataset so you can iterate on your app and continue benchmarking on these datapoints.
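The loop described above can be sketched in a few lines of Python. This is only an illustration of the idea, not `langfuzz`'s actual API: the function names (`answer_similarity`, `find_divergent_pairs`, `triage`), the `threshold` value, and the toy application are all hypothetical. The lexical similarity score is a stand-in for whatever comparison a real setup would use, and a plain list stands in for a LangSmith dataset.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, List, Tuple


def answer_similarity(a: str, b: str) -> float:
    """Crude lexical similarity in [0, 1].

    Assumption: a real setup would likely use a stronger judge
    (for example an LLM or embedding-based comparison).
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def find_divergent_pairs(
    app: Callable[[str], str],
    question_pairs: List[Tuple[str, str]],
    threshold: float = 0.5,  # hypothetical cutoff, tune for your app
) -> List[Dict]:
    """Run both questions of each pair through the app and flag pairs
    whose answers fall below the similarity threshold."""
    flagged = []
    for q1, q2 in question_pairs:
        a1, a2 = app(q1), app(q2)
        score = answer_similarity(a1, a2)
        if score < threshold:
            flagged.append(
                {"questions": (q1, q2), "answers": (a1, a2), "similarity": score}
            )
    return flagged


def triage(pair: Dict, keep: str, dataset: List[str]) -> None:
    """Add both, one, or neither question to the dataset (a plain list here)."""
    q1, q2 = pair["questions"]
    chosen = {"both": [q1, q2], "first": [q1], "second": [q2], "neither": []}[keep]
    dataset.extend(chosen)


def toy_app(question: str) -> str:
    """Stand-in for an LLM application; it only handles one phrasing."""
    if "capital" in question.lower():
        return "The capital of France is Paris."
    return "I am not sure."


pairs = [
    ("What is the capital of France?", "Which city is France's capital?"),
    ("What is the capital of France?", "Where does the French government sit?"),
]

flagged = find_divergent_pairs(toy_app, pairs)

dataset: List[str] = []
for pair in flagged:
    # In practice a human reviews each flagged pair and decides what to keep.
    triage(pair, "second", dataset)

print(dataset)
```

In this toy run only the second pair is flagged, because the application answers one phrasing and gives up on the other. The reviewer then keeps the failing question as a new benchmark datapoint.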
GitHub: https:…