How Shuttlecloud Saves Time and Money by Monitoring with Prometheus

The New Stack · Beginner ·☁️ DevOps & Cloud ·9y ago

Key Takeaways

Shuttlecloud uses Prometheus for monitoring and metrics, relying on its flexibility and labels to save time and money. The team integrates it with PagerDuty and Grafana, and plans to use it to monitor Kubernetes once its planned migration is under way.

Full Transcript

[Music] Thank you, Cisco, for sponsoring our day of podcasting at KubeCon. We had some great conversations. Thanks again to Cisco. You can learn more about Cisco and their microservices platform at mantl.io, that's M-A-N-T-L dot I-O.

Alex Williams: Hey, it's Alex Williams of The New Stack, here at KubeCon on Wednesday afternoon in Seattle, and I'm here with Ignacio Pérez Carretero.

Ignacio Pérez Carretero: Yeah, that's correct.

Alex Williams: Great. And you're with Shuttlecloud?

Ignacio Pérez Carretero: Yes.

Alex Williams: You presented here at KubeCon, and we're looking to talk to people who are actually using the Kubernetes technology. So tell us what you presented, and more about why you took the approach you did.

Ignacio Pérez Carretero: Sure. I talked about our experience with Prometheus, because we are a small startup, small in the number of people: we are seven people on the engineering team, and maybe 15 or 16 people in the whole company.

Alex Williams: So which company?

Ignacio Pérez Carretero: Oh, right. We're Shuttlecloud. What do we do? We offer our customers an API to easily migrate or import email and contacts. We help them grow by offering a service their customers can use to import their email.

Alex Williams: And do you create a database out of that, then, or is it more a service that just stores the information?

Ignacio Pérez Carretero: No, we don't store anything. I'll give you an example. One of our clients is Google, with Gmail. If you happen to have a Gmail account, which almost everyone does, you can go to Accounts and import your emails from a Yahoo or you-name-it account. We actually support 247 providers. So we offer that service to Google, which offers that service to its users.

Alex Williams: So you're the API provider. For instance, if I'm adding a third-party service and I need to integrate with Gmail, then I'm using your API?

Ignacio Pérez Carretero: That's correct.

Alex Williams: Okay, great. So that turns out to be a lot of data.

Ignacio Pérez Carretero: Yeah.

Alex Williams: So tell us about what you presented. Tell us how you guys got started with Prometheus.

Ignacio Pérez Carretero: Sure. I started my presentation by telling a little bit about our history. At the beginning we didn't need to have a whole monitoring system in place in our company, but as you can imagine, as we started to grow we needed some system. So we analyzed the different products on the market, we asked different people, the DevOps community in Spain for instance, and we decided to give Prometheus a try. We wanted to try it for a short period of time and, if it didn't work, try the next one. But it actually went pretty well.

Alex Williams: Was it the first one you tried?

Ignacio Pérez Carretero: Yeah. I said in the presentation that I'd love to tell the story as "we tried this one, then that one, and in the end this was the best," but we were so happy from the very beginning, and the implementation was so much easier than we expected, that we kept on working with it.

Alex Williams: What were your requirements for this project? What were you trying to learn and achieve?

Ignacio Pérez Carretero: I don't think our case is a special one. We wanted operational metrics: how our instances are performing, whether some service is up or down. We also wanted to monitor business metrics, like how many migrations we are processing and their status, whether they are going right or wrong. Some of the things that made us choose Prometheus: first, it's a time series database and it's very flexible. It has labels, and you don't have to decide from the very beginning which metrics you want to have; you can add them afterwards or change them. That's one of the things we liked about it. Another is that there's no need for any external service. For Sensu, for instance, you need a messaging system like RabbitMQ. Prometheus is independent, so you don't need anything else. And it's made in Go, so it's very easy to install, implement, and give it a try, which is what we wanted to do. So of all the solutions we wanted to try, the first one, because it was the easiest to start with, was Prometheus.

Alex Williams: Okay. And what were you doing before this?

Ignacio Pérez Carretero: We had several different systems, most of them in-house solutions, and a lot of manual actions we had to do to monitor, so it was not fully automated. For the operational metrics, and I gave this example in the talk, I'm not saying it's correct, because obviously it's not, we had some cron-based tasks that ran the checks on every single instance and sent an email in case something was going wrong. As you can imagine, if the instance is down there's no one to send the email, so there are many things wrong with that solution. For the external services, like APIs, what I call black-box monitoring, how we are seen from the outside, we relied on Pingdom. They also have an alert system; they ping all of your services and check that everything is all right with them.

Alex Williams: What was the drawback of Pingdom?

Ignacio Pérez Carretero: I haven't seen a drawback there. The main drawback was with the operational metrics and the business metrics, because an in-house solution that sends emails is not optimal. As we started to grow and scale, that didn't scale.

Alex Williams: How does Prometheus solve those problems, and how does it solve the problem of scale?

Ignacio Pérez Carretero: First of all, as soon as you install Prometheus you already have the node exporter, which offers a ton of metrics that can be useful. That's by default, so you don't have to spend any time creating those metrics; you already have them. Once you have the metrics, you can set alerts on them. We have an integration with PagerDuty, and we get paged directly for the thresholds we've set. So it's very straightforward.

Alex Williams: So you set thresholds in Prometheus, and when those thresholds are met, a PagerDuty response is generated?

Ignacio Pérez Carretero: That's correct. That could happen, or an email if it's not that important. You can also set the action you want the Alertmanager to trigger. I'll give you an example, and then tell you how we iterated on it, because we don't have it like that anymore. At the very beginning, for hard drive usage, we had: if the hard drive is over 80% capacity, send us an email; if it's at 95% capacity, page someone. But there are a lot of very interesting things in Prometheus, and one example is the predict_linear function. You can create alerts like: if, at the current rate, this hard drive will be full in two days, send me an email. So I have two days in advance to plan the actions, and you don't have that urgency of "it's at 95%." As you can imagine, it's very difficult to set absolute thresholds. It depends on the rate: some drives can have something like 500 megabytes left and stay that way forever, and for others it depends on the rate at which they are filling up.

Alex Williams: So you're using Prometheus for these purposes. What are you using it in conjunction with? Are you using it with Kubernetes?

Ignacio Pérez Carretero: Not yet, but we're planning to move to Kubernetes.

Alex Williams: Okay. So how are you using it alongside your overall technology stack? How is it being integrated across your platform?

Ignacio Pérez Carretero: As I said, for the operational metrics we have the node exporter. For the black-box monitoring, how our APIs are performing, there's the blackbox exporter, which is a component of Prometheus, so it works seamlessly. But we had to implement our own in-house exporter for the business metrics, because obviously that's something very custom. We keep all the statuses for all migrations in a non-relational database, and we digest that information to get some stats. Apart from that, we have an exporter that allows Prometheus to pull the data from that CouchDB. So, through that exporter, we have a direct connection from Prometheus to the real results of the migrations. That is the in-house solution we had to implement, and it was very easy, compared with other systems, to connect those things: our engineering solution and Prometheus.

Alex Williams: So you get all this information in a real-time view. How do you visualize it? How are you actually seeing this information?

Ignacio Pérez Carretero: That's also something I mentioned: it's really straightforward to put Grafana on top of Prometheus. It literally takes minutes to connect them. And the language used to create the charts in Grafana is PromQL, which is precisely the same one used in Prometheus for your queries. So you try your query in Prometheus, Ctrl-C, Ctrl-V into Grafana, and you have the chart. It was very nice, very easy. Long story short, we have Grafana on top of Prometheus, and we have charts of how we're performing at any given time.

Alex Williams: Right. How do you find PromQL?

Ignacio Pérez Carretero: I actually don't have experience with any other monitoring systems, so I may not be the right person to ask. This was my first experience with a monitoring solution, and I found it really logical. Not obvious, in the sense that there are some things and some syntax you have to learn, but I think it's quite understandable.

Alex Williams: What upstream requests are you making now for Prometheus? Are you contributing to the project at all?

Ignacio Pérez Carretero: Not yet, because, as I said, we're only seven people. But in our backlog we have some things we want to contribute.

Alex Williams: Like what?

Ignacio Pérez Carretero: Right now there's one CouchDB exporter, but it only monitors metrics about how the database is performing, not its content. We rely a lot on high availability, so we rely a lot on database replication, and we have some metrics we are monitoring right now about CouchDB replicas. I'm sorry, this answer is taking too long, but there's a feature in Prometheus that can parse data from a text file, and we are doing it with that. It shouldn't be like that, so we're planning to contribute to that CouchDB exporter to add that functionality, so that there are some metrics about the content of the database.

Alex Williams: What are some other aspects of the presentation that maybe I didn't touch on?

Ignacio Pérez Carretero: That you don't need a lot of money or a big budget to implement Prometheus. That's why the talk is called "why it's good for your small startup," because we've been there. When you're a small startup you don't have the people and you don't have the resources to implement an expensive solution. It's very straightforward, very easy to implement. Also, right now we are monitoring 200 instances. That might not sound like many, but for a small team 200 is already a lot. We do it with only one medium instance in GCE plus a micro instance for meta-monitoring, as we're still not in HA. So with only two instances and a 30-gigabyte hard drive we can monitor the whole 200-instance infrastructure. If you do the numbers, that's not a lot of money on a monthly basis compared to other solutions.

Alex Williams: So in terms of money, in dollars, it's affordable?

Ignacio Pérez Carretero: Yes.

Alex Williams: So you're running on AWS?

Ignacio Pérez Carretero: We're currently running on GCE. We used to run on AWS, but we migrated to GCE.

Alex Williams: Okay, so you're running on GCE. And were there any other aspects, besides affordability, that you touched on?

Ignacio Pérez Carretero: How simple it is, at least with what we've done with it. How easy it is to set up.

Alex Williams: How do you quantify that? What is it about that process that makes it easy?

Ignacio Pérez Carretero: Well, I would say that just by having the node exporter, which comes directly with Prometheus, and Prometheus itself, you already have a ton of metrics that are very interesting, and you only have to set up the alerts. In that sense, you don't have to spend a lot of developer time getting metrics or installing Prometheus. So to begin with, just to give it a try, it's very easy and very affordable. And if you are constrained by resources, having something where you can see results easily and very quickly is, I think, key.

Alex Williams: Right, and the usability speaks to the cost-effectiveness.

Ignacio Pérez Carretero: Yeah.

Alex Williams: Why do you guys use GCE?

Ignacio Pérez Carretero: I'd say that on GCE we have a lot of traffic to Google services and Google APIs. We are sort of a corner case, because we use the Google APIs a lot, and the bandwidth to those Google APIs is free. So for us it's a cheaper solution, because we don't pay for the traffic to the Google APIs. That's why GCE is very good for us.

Alex Williams: You mentioned a few, but are there any other major things you'd like to see come out of Prometheus that would be helpful as you continue to deliver this service, which is obviously going to grow as more companies integrate APIs into their own services?

Ignacio Pérez Carretero: We are very excited, as we are moving to Kubernetes, to try monitoring Kubernetes with Prometheus. We don't really know what we're going to see, but from the talks already we're very excited to do that, and we're really looking forward to growing our Prometheus expertise. Maybe we will have requests in the future, but currently, with what we have, we're not missing anything major.

Alex Williams: What are you thinking about doing with Kubernetes?

Ignacio Pérez Carretero: I guess a lot of people might relate to this: we are underusing some of our instances. So, in order to optimize our resources, we could allocate different pods on the same node, rather than having, as we do right now, one instance per service.

Alex Williams: So right now you have one instance per service?

Ignacio Pérez Carretero: That's correct. For the services that are directly migrating data we don't have a problem, because we can fine-tune how much of the instance's resources we are using. But for other services we definitely can optimize.

Alex Williams: So you could run multiple containers on a host, for example.

Ignacio Pérez Carretero: Exactly.

Alex Williams: Which you're not doing now. You're running VMs, essentially.

Ignacio Pérez Carretero: Yeah.

Alex Williams: Okay, so you're just starting that process.

Ignacio Pérez Carretero: Yeah, we're still at the very beginning, but excited. I'm looking forward to it.

Alex Williams: What are your hopes? As a small startup, why would you invest your time and energy in this?

Ignacio Pérez Carretero: Because we believe it's a clean solution, it's a good idea, and we could benefit from it.

Alex Williams: What makes it clean? What are the benefits?

Ignacio Pérez Carretero: First, the economic reason: we're going to save money by having fewer instances in place. Second, because we've already seen some use cases where people have moved their infrastructure to Kubernetes and it's going fine. And third, because we are in the process of making our system more of a twelve-factor app. I cannot say we are already there, but we are moving toward that goal. So that's why we believe it's something we can benefit from economically, and, given the expertise and the use cases we've seen, we're going to go in the same direction.

Alex Williams: Well, thank you very much. I appreciate you taking the time to talk with us about what you presented on Prometheus here at KubeCon. Best of luck, and keep us posted on your developments with Kubernetes.

Ignacio Pérez Carretero: Thank you for having me here, and for your interest in listening to our story.

Alex Williams: Thank you very much.

[Music] Thanks to Cisco for sponsoring our day of podcasting at KubeCon. We had some great conversations. Thanks again to Cisco. You can learn more about Cisco and their microservices platform at mantl.io, that's M-A-N-T-L dot I-O. [Music]
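
The disk-usage alerting described in the interview moves from fixed thresholds (email at 80%, page at 95%) to an alert built on Prometheus's `predict_linear` function, which fits a simple linear regression to recent samples and extrapolates it. The Python sketch below only illustrates that idea. It is not Prometheus's implementation, the sample data is made up, and the helper names (`fit_line`, `predict_value`, `will_fill_within`) are hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, free space in arbitrary units)


def fit_line(samples: List[Sample]) -> Tuple[float, float]:
    """Return (slope, intercept) of the least-squares line through the samples."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must have distinct timestamps")
    cov = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    slope = cov / var_t
    return slope, mean_v - slope * mean_t


def predict_value(samples: List[Sample], seconds_ahead: float) -> float:
    """Extrapolate the fitted line `seconds_ahead` past the last sample.

    Simplification for this sketch: Prometheus extrapolates from the rule
    evaluation time, not from the timestamp of the last sample.
    """
    slope, intercept = fit_line(samples)
    return slope * (samples[-1][0] + seconds_ahead) + intercept


def will_fill_within(samples: List[Sample], seconds_ahead: float) -> bool:
    """True if the trend predicts no free space left within the horizon."""
    return predict_value(samples, seconds_ahead) <= 0


TWO_DAYS = 2 * 24 * 3600

# Made-up series sampled every 10 minutes for one hour.
shrinking = [(float(t), 1000.0 - 0.01 * t) for t in range(0, 3601, 600)]
steady = [(float(t), 500.0) for t in range(0, 3601, 600)]

print("shrinking disk fills within two days:", will_fill_within(shrinking, TWO_DAYS))
print("steady disk fills within two days:", will_fill_within(steady, TWO_DAYS))
```

In PromQL the same condition would look roughly like `predict_linear(<free-space gauge>[1h], 2 * 24 * 3600) < 0`, where the gauge name depends on the node exporter version in use. The steady series shows why a trend-based rule avoids the false urgency of an absolute threshold: a disk with little free space that is not changing never triggers it.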

Original Description

With Prometheus the talk of KubeCon, CloudNativeCon, and last week's GrafanaCon, there can be no denying the fact that monitoring has become a focus for today’s enterprises. For those operating on a smaller scale or at a startup, choosing the right monitoring solution can not only save money but also bring considerable benefits to overall workflow efficiency, freeing available resources for use elsewhere. For API migration and integration platform Shuttlecloud, there was never a ‘switch’ to Prometheus. With roughly 15-17 employees, the company originally planned on cycling through monitoring solutions until finding one that clicked. Once they got Prometheus up and running, they never got around to trying another monitoring solution. On today's episode of The New Stack Makers, TNS Founder Alex Williams spoke with Shuttlecloud Software Developer Ignacio Pérez Carretero during KubeCon 2016 to hear more about Shuttlecloud’s experience working with Prometheus in production and their decision to migrate their stack over to Kubernetes. Listen on SoundCloud: https://soundcloud.com/thenewstackmakers/how-shuttlecloud-saves-time-and-money

Shuttlecloud's use of Prometheus for monitoring and metrics has saved them time and money, and can serve as a model for other startups and enterprises looking to optimize their operations. By using Prometheus's flexibility and labels, and integrating it with other tools like PagerDuty and Grafana, organizations can build scalable and cost-effective monitoring systems.

Key Takeaways

Set thresholds for alerts
Create alerts for specific metrics
Integrate with PagerDuty for direct paging
Use the predict_linear function to predict future usage
Implement an in-house exporter for business metrics
Let Prometheus pull data from CouchDB through that exporter
Use Grafana for visualization
Query data with PromQL
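
The in-house exporter for business metrics can be pictured as a small HTTP endpoint that Prometheus scrapes. The sketch below is an assumption-heavy illustration using only the Python standard library: the metric name `migrations_by_status`, the `status` label, the port, and `fetch_status_counts` (a stand-in for the real CouchDB query) are all hypothetical and not taken from Shuttlecloud's code. Only the text exposition format it emits is Prometheus's own.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Dict


def fetch_status_counts() -> Dict[str, int]:
    """Hypothetical stand-in for a query against the migrations database."""
    return {"finished": 120, "running": 7, "failed": 3}


def escape_label_value(value: str) -> str:
    """Escape backslash, double quote and newline as the text format requires."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metrics(counts: Dict[str, int]) -> str:
    """Render the counts as one gauge with a `status` label per sample."""
    lines = [
        "# HELP migrations_by_status Number of migrations in each status.",
        "# TYPE migrations_by_status gauge",
    ]
    for status in sorted(counts):
        lines.append(
            'migrations_by_status{status="%s"} %d'
            % (escape_label_value(status), counts[status])
        )
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Serves the current counts on /metrics each time Prometheus scrapes."""

    def do_GET(self) -> None:
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(fetch_status_counts()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 9200) -> None:
    """Blocks forever; call this from a script entry point. Port is arbitrary."""
    HTTPServer(("", port), MetricsHandler).serve_forever()


print(render_metrics(fetch_status_counts()), end="")
```

Because the status is a label rather than part of the metric name, a new status value appears as a new series without any change to the Prometheus configuration, which matches the flexibility described in the interview. In practice an official client library would normally handle the formatting and serving.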

💡 Prometheus's flexibility and labels make it an ideal choice for monitoring and metrics, allowing organizations to build scalable and cost-effective systems that can be easily integrated with other tools and platforms.
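
The "feature in Prometheus that can parse data from a text file" mentioned in the interview is most likely the node exporter's textfile collector, which reads `*.prom` files from the directory passed via `--collector.textfile.directory`. Assuming that, a job that publishes replication metrics could write its file as sketched below. The metric name, label, and file name are hypothetical; the write-then-rename step is there so the collector never reads a half-written file.

```python
import os
import tempfile
from typing import List


def write_prom_file(directory: str, filename: str, lines: List[str]) -> str:
    """Atomically write metric lines to `directory/filename` and return the path."""
    if not filename.endswith(".prom"):
        raise ValueError("the textfile collector only reads files ending in .prom")
    target = os.path.join(directory, filename)
    # The temporary name does not end in .prom, so the collector ignores it.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write("\n".join(lines) + "\n")
        # mkstemp creates the file readable by its owner only; the exporter
        # may run as another user, so widen the permissions before publishing.
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, target)  # atomic rename within one filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return target


# Example run against a throwaway directory with a made-up metric.
with tempfile.TemporaryDirectory() as demo_dir:
    path = write_prom_file(
        demo_dir,
        "couchdb_replication.prom",
        ['couchdb_replication_pending_docs{target="replica-1"} 4'],
    )
    with open(path, encoding="utf-8") as handle:
        print(handle.read(), end="")
```

A cron job or small daemon would call `write_prom_file` on each run, pointing at the collector directory. As the interview notes, this is a workaround; a dedicated exporter is the cleaner long-term home for such metrics.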
