API Design Philosophy - BERTopic for Topic Modeling
Key Takeaways
The video discusses BERTopic, a popular Python topic modeling library, and its API design philosophy, highlighting the importance of modularity and out-of-the-box functionality for users who may not be experienced coders.
Full Transcript
And I'm curious to also pick your brain on your API design philosophy. You identified the three main pillars that you think of in BERTopic, but I'm sure that while working on other libraries you've also picked up a few lessons about what makes a good API. For example, there's a very clear scikit-learn sense to some of that. Give us a picture of your thinking on API design philosophy: what makes for a good API design for tasks like this?

I think if I go back to something I've shown previously, it is this and this. We have modularity, so you can change whatever you want, but out of the box it works well enough for a lot of use cases. What most people want when they're trying new technologies is for it to just work. You have three lines of code, it works, it runs, and then you can look at it and say, "Okay, this is a horrible package, I'll go with something else." That's fine. But you can also say, "Okay, it's almost there, and I want to change these and these things." That has mostly been the focus of the design I had in mind: making sure that for most usage it works out of the box, because topic modeling is also used by a lot of people who don't code 24/7. For those people it needs to work, and the ones who want to dive a little deeper can expand upon it.

Similarly, in designing all of this it has been a struggle at times to find the balance between "out of the box it works well enough" and "you can do everything you want with it." You can see that in some of the things I designed a few years ago. At some point I opened up the possibility to use your own UMAP model, and the parameter is still umap_model, but you can throw in k-means or anything else in there. So technically we should change that to cluster_model or reduction_model or something like that. With a few years of developing this, you see some of these things being snuck into the package. Some things are better than others, and some things still work but should be improved.

Amazing. We have a lot of questions, and we're going to get to them. I don't think we're going to be able to answer all of them. I would suggest that people visit the Cohere Discord: there's a Talking Language AI forum there, with a specific thread for BERTopic and this discussion. This goes both for people seeing it live and for people seeing it on YouTube later. We'll keep it alive: you can keep asking your questions, the community will answer, and we'll do our best to get the questions answered there. Before we take a couple more questions, I'm curious to hear your very high-level overview of, let's do maybe PolyFuzz first, or let's say, let's do KeyBERT first and then PolyFuzz.
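As a rough illustration of the balance described in the transcript (sensible defaults plus swappable parts), here is a minimal Python sketch. Every name in it (TinyTopicModel, KeepFirstDims, SignClusterer, SumClusterer, and the reduction_model and cluster_model parameters) is hypothetical and is not BERTopic's actual API or implementation. The only assumption is that a replacement component exposes the same method as the default it replaces.

```python
class KeepFirstDims:
    """Toy default reducer: keeps the first `n_components` coordinates."""

    def __init__(self, n_components=2):
        self.n_components = n_components

    def fit_transform(self, vectors):
        return [list(v[: self.n_components]) for v in vectors]


class SignClusterer:
    """Toy default clusterer: label 0 if the first coordinate is negative, else 1."""

    def fit_predict(self, vectors):
        return [0 if v[0] < 0 else 1 for v in vectors]


class TinyTopicModel:
    """Hypothetical two-step pipeline (reduce, then cluster). Not BERTopic."""

    def __init__(self, reduction_model=None, cluster_model=None):
        # Defaults make the zero-argument constructor usable out of the box.
        self.reduction_model = (
            KeepFirstDims() if reduction_model is None else reduction_model
        )
        self.cluster_model = (
            SignClusterer() if cluster_model is None else cluster_model
        )

    def fit_transform(self, vectors):
        reduced = self.reduction_model.fit_transform(vectors)
        return self.cluster_model.fit_predict(reduced)


class SumClusterer:
    """User-supplied replacement; only a `fit_predict` method is required."""

    def fit_predict(self, vectors):
        return [0 if sum(v) < 0 else 1 for v in vectors]


# Made-up input vectors standing in for document embeddings.
vectors = [[-1.0, 0.5, 9.0], [2.0, -3.0, 9.0], [-0.5, 2.0, 9.0]]

# Out of the box: no configuration needed.
default_labels = TinyTopicModel().fit_transform(vectors)

# Customized: swap one step and keep the other default.
custom_labels = TinyTopicModel(cluster_model=SumClusterer()).fit_transform(vectors)

print(default_labels, custom_labels)
```

In this sketch the parameters are named after the role a component plays (reduction, clustering) rather than after one specific algorithm, which is the kind of renaming the speaker suggests for umap_model.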
Original Description
BERTopic for Topic Modeling - Talking Language AI full episode: https://www.youtube.com/watch?v=uZxQz87lb84
Exploring large text archives with software is commonly called "topic modeling". Go in-depth into BERTopic (the popular Python topic modeling library) with its creator, Maarten Grootendorst. We explore three important pillars of the package: modularity, variations, and visualizations. Each of the pillars demonstrates how BERTopic gives control back to the developer, allowing for a one-stop shop for topic modeling. This video also demonstrates BERTopic's basic capabilities and some advanced tricks that new and advanced users of BERTopic may enjoy.
Maarten is an open source developer and maintainer (BERTopic, PolyFuzz, KeyBERT), data scientist, and psychologist.
===
Join the Cohere Discord: https://discord.gg/co-mmunity
Discussion thread for this episode (feel free to ask questions): https://discord.com/channels/95442198...
Maarten on Twitter: https://twitter.com/MaartenGr
BERTopic: https://maartengr.github.io/BERTopic/
BERTopic on Github: https://github.com/MaartenGr/BERTopic
BERTopic paper: https://arxiv.org/abs/2203.05794