Private LLMs & Infra From Scratch – Episode 1: The Why, The Setup, The Stack #n8n #coolify #ollama

DailyAi.Studio · Intermediate · 🛠️ AI Tools & Apps · 5mo ago

Skills: LLM Foundations 85% · LLM Engineering 80% · Prompt Craft 70% · Agent Foundations 60%

Key Takeaways

This video series covers building private LLMs and infrastructure from scratch, focusing on privacy, cost savings, change control over AI upgrades, and in-house infrastructure using tools like n8n, Ollama, Tailscale, and Coolify.

Full Transcript

All right, here we go. This is going to be a year of learning, 2026 starting early. We're going to dig into all of this infrastructure: private LLMs, a private server in your office or your home, running all of these services, AI, LLMs, everything. I'm going to start off with a little introduction on why this matters and why customers want it. Real customers ask me for this all the time, so learning this now can help you with customers in the future, and with your own needs. So enjoy. I'm glad you're here. Join the channel. This is paid, so you're not seeing this unless you paid or unless I'm advertising. It will really help make this whole year of 2026 full of this information. By the end of 2026, you're going to understand Linux and how to do all the self-hosting: LLMs, no-code solutions, n8n, Activepieces, all this stuff running on your own private servers in your company. All right, enjoy. Hang in there.
Okay, so let's dig into the why. I was actually surprised by this. A lot of customers I've spoken to, in a lot of meetings where they talk about AI and their company and automations, want to bring it in-house. They want the privacy, they want the cost savings, and they want something they're not always aware of, which is change control. I'll talk about that in a moment. So before we dig into the fun part of the doing, let's talk about the why.
If we look at the slides here, we start with privacy. You could use OpenAI or Claude, and they will save your data. It's even in some of the contracts or agreements that they can use your data for training, and a lot of customers just don't want that. Sure, you could use Ollama's cloud, or you could use Bedrock or other ways of hosting your own, and do really well. But overall, it's not uncommon for companies to bring infrastructure in-house.
And so whether you're a home office, small office, even a medium to large office, you can have a system in your department, your warehouse, your office that can do this. That's what we're going to go over. We're going to be using a Framework Desktop, but you could use an Nvidia-based system or another type of system, because we're not really dealing with a lot of parallel tasks. When it comes to serving websites and the tools we're going to build, that's pretty easy. When it comes to doing the AI, that will have some bottlenecks, and we'll take them on as they come. Maybe we'll figure out how to use the system so that certain things run remotely, because we're not so worried about privacy there, while deep reporting on company data stays local only. So we'll find a balance. Many times, depending on the size of the company, it's not so much about concurrent usage. It's about the long-term usage, the data usage, the overnight reporting and things like that. Again, that makes privacy an issue.
And then cost. You could use APIs, and those add up over time. I was going at $600 a month for one company, and that was just a lot of different small use cases they had going on. Web scraping is another one these companies want to do, where they gather data and keep collecting data about their particular resources and goals. That's very expensive and time-consuming, especially because some of it is just getting it right, and that costs money. Then of course they might have a lot of files and RAG, and they want to keep parsing all that data and build up a kind of database that gives the AI the context it needs. All of that adds up. And then you get into the database side, forget AI, and the other things you can now save on because it's all here.
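The cost argument above can be turned into quick break-even arithmetic. The sketch below is illustrative only: the $600 a month API bill comes from the transcript, while the hardware, setup, and electricity figures are hypothetical placeholders you would replace with your own quotes.

```python
import math


def breakeven_months(hardware_cost, setup_cost, monthly_power, monthly_api_bill):
    """Return the number of months until a self-hosted box has paid for itself.

    Returns None when running the box costs as much as (or more than)
    the API bill it replaces, because it then never breaks even.
    """
    monthly_saving = monthly_api_bill - monthly_power
    if monthly_saving <= 0:
        return None
    upfront = hardware_cost + setup_cost
    return math.ceil(upfront / monthly_saving)


def saving_over_months(months, hardware_cost, setup_cost, monthly_power, monthly_api_bill):
    """Net saving (negative means a loss) after the given number of months."""
    upfront = hardware_cost + setup_cost
    return months * (monthly_api_bill - monthly_power) - upfront


# Hypothetical inputs: only the 600/month API bill is from the transcript.
EXAMPLE = dict(hardware_cost=2500, setup_cost=1500, monthly_power=30, monthly_api_bill=600)
```

With these placeholder numbers, `breakeven_months(**EXAMPLE)` works out to 8 months and `saving_over_months(60, **EXAMPLE)` to 30,200 over five years. The real answer depends entirely on the figures you plug in.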
Of course, we'll focus on backing up and everything. But by the time they're done, even with me setting this up, putting the box in there, and 5 years of long-term support with the particular Linux we're using, they can be doing well in 5 years and save a lot of money. So cost is a pretty clear win.
And then lastly, change control, which is actually a big deal. Many times we get to a place where the prompt is working, the AI is working, and then some company forces us to upgrade because they want you to have the latest AI. We just don't need to, because the model we were using is doing that job just fine. It can keep doing that job just fine for 10 years. So we get out of the hamster wheel, the cycle of upgrading for upgrading's sake, and we get into a position where we can lock down the prompts and lock down the system so they just work. We can even measure deviations and have monitoring in place. By doing this we can lock down those particular tasks to those models and just keep it going. It's really a big deal.
All right, so that's the gist of why this is important. Let's give an overview of this guy. We have this office network, which is where the user is, whether it's an office in your home, a company, a warehouse, anything. The round circle represents the firewall: the fact that your network doesn't have ports open, so people can't just come in from anywhere. That's where the system is going to live. Anyone in the office can get to the system. It's just that easy. They'll go to it using normal URL names like n8n.youroffice.com, and people from the outside can't get in. We can run anything. We're going to run n8n and Ollama. When people want to connect remotely, everybody knows what VPNs are, but we're going to use Tailscale. It's a little simpler, I think.
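The change-control point above (lock a task to one model and prompt, then measure deviations) could be monitored with something as small as the sketch below. It is an assumption about how such a check might look, not the method used in the series: the function names and the 0.9 threshold are hypothetical, and plain text similarity is only a crude stand-in for a real evaluation.

```python
import difflib
import hashlib


def fingerprint(model_name, prompt):
    """Stable ID for a locked-down (model, prompt) pair, so silent changes are detectable."""
    return hashlib.sha256(f"{model_name}\n{prompt}".encode("utf-8")).hexdigest()


def similarity(baseline_output, current_output):
    """Character-level similarity between two outputs, from 0.0 (unrelated) to 1.0 (identical)."""
    return difflib.SequenceMatcher(None, baseline_output, current_output).ratio()


def has_drifted(baseline_output, current_output, threshold=0.9):
    """Flag a run whose output differs too much from the recorded baseline.

    The threshold is a hypothetical default; tune it per task.
    """
    return similarity(baseline_output, current_output) < threshold
```

A nightly job could store the fingerprint and a known-good output for each task, rerun the prompt against the pinned local model, and alert when `has_drifted` returns True.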
And that will be running on the server, and then it will run into the cloud and back to them. Now, are they private? I think they are, but you could go further here and do your own kind of tunneling. There are a lot of options out there. I'm going to stick with them. I just like them. They're not a sponsor. I wish they were, but they're not. Maybe someday. And then the remote person can just connect that way.
So we're going to have this experience where this one server runs all of these different services, which we're going to keep adding to: PDF services, Ollama, n8n, Supabase, and all the database storage we need as well. Events, websockets, no-code tools like NocoBase, ToolJet, Appsmith, Budibase, all these things. So we can build stuff internally, make it work, and host it for free. Well, not really free: we're paying for electricity, we're paying to set it up, we're paying for the hardware. But five years later, it's definitely a good investment. I think 37signals has shifted from cloud to internal infrastructure, or their own hardware hosted somewhere, and you can see the payoff within a number of years. Some of the companies I'm interviewing with to consult for have been doing the same thing for 20 years, and they want to make a move now. It could be another 20 years before they move again. We don't know. But they can invest now in something that can last.
All right, so let's get started. The first thing we're going to do, as you saw in that overview, is set up Linux. And it's okay. We're going to do it. It's going to be okay. By the time we're done with this whole course, you're really not going to be that uncomfortable with this. We could do this on a Mac. We could do this on Windows. And we might do a second and third version of this once I finish this first one. But we're going to dig into the Linux side of this.
And yeah, let's get our hands on this, because that's it for the intro. We're going to start digging in.
>> Okay, Linux. Why are we using Linux? Why is there a bunch of stuff in the terminal? What is a terminal? There are a lot of websites on how to get going with Linux, and a lot of good YouTube videos that I'll share in the description. Basically, we're going to use Ubuntu Linux, so that we have a pretty standard version of Linux that a lot of software is compatible with. There are so many options. I'm going to go this route. And like I said, we might even do Mac and Windows later on.
Okay. So we're going to go to Ubuntu, download the ISO, and install it. What the heck does all that mean? I'm going to share some videos that will show you how to do this. It's hard, but it's easy once you do it a few times. You'll see it's not a big deal, especially when it's a fresh machine. Once you have that installed and you're logged in, you're at a normal desktop, or what we would consider normal. Let me minimize everything. It's even a nicer background than that. Not that it matters, right? You'll probably get something more like that. I don't know. At this point, though, you're ready to get going.
Now, again: Windows, Mac, Linux. I wanted something stable, something that will work with Docker, something that we can set things up on. This has five years of long-term support, so we can keep it in that particular office or situation for a while, running updates in the background. Later on in the series I will talk about how to do automated background updates, backups, and whatnot. We can even get to a point where it's a RAID system, so that it has multiple drives. This particular machine is the Framework Desktop. It's not what you need, but it is a nice option. There are so many good desktops right now for running local LLMs.
I will also link to those at the bottom of the video. There are some great videos on this stuff, including building your own desktop, sorry, your own server, with Nvidia rack servers. It just depends on how far you go with this. I'm going to run it on this because I want to see how much I can get out of it for office use as I throw real use cases at it that my customers have given me, and then I'll watch it scale. In the end, based on the other videos and numbers we're going to see, we'll have a better sense of what can really scale here. And remember, this isn't all about AI. It's going to be about other things too: RAG, which is AI, but also storing a ton of files, having them automatically OCR'd or processed, and put into a database or into a chat system or UI for people to use and get to data. Managing inventory in that database and having nightly reports running. Scanning websites to gather data. A lot of stuff. So it's more than just AI.
All right. Now that we have Ubuntu installed and we're sitting here looking at our desktop, we're going to set up Coolify. The goal here is to start just using applications. Everybody talks about Docker, and in the end this is basically a nice way to present and manage Docker, but it's also managing the web URLs and everything. So let's just show it. If I go to the docs, go to Get Started, and go to Installation, I just grab this link, open a terminal, and run that link, and that's it. It's going to run. It's going to tell you a few jokes. It's going to be done, and then you're ready to go.
Now, ready to go means what? When it's ready, it will tell you at the end, and you'll go to the URL that it tells you to go to. For me, it was this one here. I think if I just open that up... well, I already have this extension. Okay, for me, I went to this URL here, 10.0.0.1:8000, and it told me that. That's just what my particular machine is running at for IP addresses.
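If the dashboard URL printed by the installer does not load, a quick reachability check from another machine on the LAN can separate "Coolify is not up yet" from "the browser is being difficult". This is a minimal sketch, not part of the Coolify tooling. The host `10.0.0.1` is taken from this transcript and will differ on your network, and the helper name is hypothetical. Port 8000 is Coolify's default dashboard port.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example call (address is an assumption, replace it with your server's IP):
# is_port_open("10.0.0.1", 8000)
```

A True result only proves that something is listening on that port. It does not confirm that the service behind it is Coolify.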
So here I am on my machine just running Coolify. Coolify is an open-source project that lets you host your own open-source applications, ones you're building or ones that have already been built. We can deploy our Next.js apps here. We can deploy our websites. We can run our no-code solutions. We can run our n8n here, as we're going to see. So here we are. We installed this Coolify system and we have a dashboard. We have our first project and we're ready to roll.
Now, I don't want that first project. I'm going to make one with you right now. We're going to add one. It's going to be our n8n, because that's the first one I really want to get into for people. Oops. I don't want to call the project that. Let's go back. We'll go to settings. We're going to say it's all for the office. In the end, I might just have one project. You organize it the way you want. There's a production environment. There are environments for whatever you want. We're going to just focus on production.
And then I'm going to install. This is what I was trying to get at. Look at all these things I can install. There's a lot of open-source stuff here, and a lot of it is just really amazing. Plane for Jira tickets, sorry, Plane for managing projects. NocoDB, which is like an Airtable but open source, and we can do a lot of integrations with n8n. You have sheets if you want to do some more sheet-heavy stuff, though now we have that built into n8n. We have Supabase, we have... sorry, all right, we have Neon. So we have all these databases we can use. There's another one called PocketBase, which is really interesting. And we can do Stirling PDF, so there's a PDF system here that's really amazing, which I accidentally just clicked. So let's give it a run. If I run this, it's going to have a different domain. So it's going to be that.
Now, this is a domain I put in there, and we're going to cover that in a moment. If I run this right now, it's going to have a PDF thing. I really wanted to do n8n, but let's just run this. This is going to be a long series. I'm not going to perfectly edit it. Sorry. So now if we just run that guy, we will be able to interact with it.
Now, Coolify is looking at 10.0.0.1. But that will get a little tricky, and this is the thing: if we just did Docker by itself, we'd have all these Docker containers running on different ports, and it would be hard to organize. Coolify is going to run the Docker containers for us behind a proxy system, so that we can go to the URL, it goes to the proxy, and Coolify says, "Oh, they mean this one on this port." So it does a lot of the magic for us that would otherwise be really hard to manage. What I'm going to do, though, is just point my DNS to this IP, and we're going to look at that. What is this IP? I'm in my office. No one else can go here but me. But when I change my domain name provider to point...
>> All right, instead of me going blah blah about DNS, let's look for a moment, because it's so easy. And sorry if I keep saying this stuff is somewhat easy. I also know it's not, because people haven't been doing this as long as I have. I just want to make it clear that you'll learn. Stick in there and ask questions, and if I miss the obvious, I'll come back and cover it. The point here is: if you go to the Ollama API subdomain on getstrongbox, how does it work, and how can I serve this from inside my system, when that system is on an IP address? I'm going to show you how I do this in a moment, and you can't get to this. That's the thing. If I ping that, it's 100-something. So that's like my internal... no, that's actually Tailscale.
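The proxy behaviour described above ("they mean this one on this port") is host-based routing: the proxy reads the hostname from the request and forwards it to the matching container. The toy lookup below only illustrates the idea and is not how Coolify's proxy is implemented. The hostnames and the routing table are hypothetical. Port 5678 is n8n's default, and 8080 is the port shown for the PDF service later in the video.

```python
# Hypothetical routing table: public hostname -> (container address, container port).
ROUTES = {
    "n8n.office.example.com": ("127.0.0.1", 5678),
    "pdf.office.example.com": ("127.0.0.1", 8080),
}


def resolve_backend(host_header, routes):
    """Pick the backend for a request based on its Host header.

    Strips an optional ":port" suffix and ignores case, then returns
    the (address, port) tuple, or None when no service matches.
    """
    hostname = host_header.split(":", 1)[0].strip().lower()
    return routes.get(hostname)
```

Because every service hangs off its own subdomain, users never need to know which port a container actually listens on.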
So how did this work out? Let's take a step back and look at Tailscale and the DNS. First I'm going to look at the DNS, because then I want to set up some stuff. With the DNS, I'm using Cloudflare, and I'm not recommending Cloudflare in particular, because I don't want to overkill this. I have Hover. I could have used Hover. I have Namecheap for some things. I could use Namecheap. The bottom line is I go in there and configure my DNS. Let me show you how I do that. See how I can say, here's a type A record, star, and then the IP we're going to get from Tailscale. I'll talk about that right after this edit. So now the system knows that anything matching the wildcard, star dot the rest of this, goes there. The rest of this is getstrongbox.com, just a nice domain I have.
So now when you go there, it will go to that IP and render, even though the IP is not accessible to you unless you're on my Tailscale. But internally it still works, because in the end those are going to render. I could have put the IP of the server here and it would still work, but then I couldn't get to it from outside, over the internet. There are a few ways to do that, but right now I'm okay with this. The disadvantage is that the internal stuff is technically a bit slower, because it might have to do a little bit more network... what's the word for it? I don't even know the right word for it. Instead of going right inside the network, it has to go out and then back in. But even that is questionable. We will look at that later as I access this thing remotely. Actually, I should be able to access it right now, so we're going to look at that as we look at Tailscale. But that's why I can easily get to the system inside here. That's why it works in the browser. No hosting, nothing. So if I go to this on my Mac, let me do that, and I'll paste that over here as well. It should work once I'm on Tailscale.
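Two ideas from the DNS walkthrough above can be checked in a few lines: whether a hostname falls under a wildcard record, and whether an address belongs to Tailscale rather than the office LAN or the public internet. This is a rough sketch. The domain `office.example.com` and the helper names are hypothetical, and the shell-style pattern match only approximates real DNS wildcard rules. Tailscale assigns addresses from the 100.64.0.0/10 range, which is why the ping in the video returned an address starting with 100.

```python
import ipaddress
from fnmatch import fnmatch

# Tailscale hands out addresses from the carrier-grade NAT range.
TAILSCALE_RANGE = ipaddress.ip_network("100.64.0.0/10")


def matches_wildcard(hostname, pattern="*.office.example.com"):
    """Approximate check that a hostname is covered by a wildcard A record."""
    return fnmatch(hostname.lower(), pattern.lower())


def classify(ip):
    """Label an IPv4/IPv6 address as 'tailscale', 'lan', or 'public'."""
    addr = ipaddress.ip_address(ip)
    if addr in TAILSCALE_RANGE:
        return "tailscale"
    if addr.is_private:
        return "lan"
    return "public"
```

For example, `classify("100.101.102.103")` gives "tailscale" and `classify("10.0.0.1")` gives "lan", which mirrors the moment in the video where the pinged address turned out to be the Tailscale one rather than the internal one.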
So there it is, not working. It's just never going to render. But the moment I go onto Tailscale, and I will show this after... actually, let's go set that up now. As you can see, I can't get there. So we're going to set up Tailscale, which is super easy. Then we'll set up my Mac for Tailscale, and you'll see how I can act as a remote user getting to that machine, and then we'll get back to the overall thing we're building. This is going to be choppy, but the topic is what I care about.
Okay, so look: we did the DNS in that last clip and pointed it to this IP. But what is that IP? How does this work? All I did, and they make this really easy, is go to Tailscale and ask to set up a Linux box. It generates the script and that's it. Next, my Linux box is on Tailscale. What does that mean? It basically means that we have a tunnel from their system to our systems, and that's the next part of this.
So now we have Tailscale running on this system, and I'm going to put it on my Mac. Let's go to my Mac for a moment, which makes this even trickier, and download Tailscale. Here it is. I just went to their download site, clicked on something, and went from there. This is a new Mac, so I didn't have it running. That's good. Let's go run it. So now we're going to use their servers and services as a tunnel. You might say, doesn't that negate all the privacy? I think at some point, when you do need remote connections, you need to figure this out with VPNs or something. I'm choosing Tailscale because it's rather easy to use. You could use VPNs and other devices. Let me get the thing running. But my point is it's encrypted tunnel data, so the chances of them seeing it are potentially just zero. So then it does this, which is pretty extreme actually. Network extension.
Wow. Let me see if I can bring this over here. Yep. I never knew it did that. That's pretty cool. And we just allow that, because that's pretty serious. And VPN. Yep. This is stuff I know enough about, but I'm never going to go too deep into the weeds, because I'm just not that detailed at this level. So it's doing some VPN stuff, which might break my connection over there. We'll find out in a moment.
Now Tailscale, and authorizing it. Where is my Tailscale to authorize? I'm going to go here. Settings. Accounts. Add account. I was hoping it would just pop into my browser in a moment. Add account. Sign in to your network. There we go. Perfect. That's what I wanted. I use Google to sign in to this guy. And again, you don't have to risk things. You could use your own email, your own whatever.
So that means my Mac should now be part of this network. What's it saying here? Maybe it needs to be restarted, because sometimes after all that stuff you have to restart. Oh, seems like that's good. Seems like this is not good. Please log in. Thought I already did. Login successful. Turn it off and on again. There we go. Let's see if that worked. Still have an exclamation point. You're about to connect. That's fine. Yes. Start on login? No, manually start. I'm going to start it manually just because I want to see what happens. Connect to your devices. Okay. There. Now we're connected.
So at this point, this, which wasn't working, should start working. Actually, it's not the API I want. Let's go do something different here. One moment. Oh yeah, now I have to reconnect here. One moment. All right. And see, no OS is perfect. Even though I have it on do not disturb, it still disturbs me. That's cool though. On my Mac with Tailscale, it gets pretty neat what you can do. All right.
So at this point, I can go here from my Mac. I just wanted to get the name of it: n8n internal. We go to this. And remember, if I jump off Tailscale, I cannot connect to here. Okay, there we go. We're in. I don't think I have the login here, because it's in that Keeper thing. If I turn this off and reload, we don't get there. It's really that simple. I think it's a VPN concept, but to me it's a little more approachable than a VPN.
All right, let's get back to the video for the DNS. We've done the DNS and then this, and the rest of it should make more sense as we start doing the more fun stuff where we install some of the software. Now that we see how the DNS works and that it all points here, when I go to these wildcard domains or subdomains, we can see that domain. So in this case, let's go look at something, because it's done. I don't know if I can zoom in. I'll see if I can do it in the video after. You'll see it says the PDF subdomain on getstrongbox, port 8080. But we're not going to go to 8080. We're just going to go here.
Now, this is going to be one of the complications I'm going to try to fix. If you look, we are here and it's marked not secure. That's fine. We are secure because we're internal. We don't have certificates or HTTPS encryption, which I'll explain later. We will have it, but it might be a little bit less than ideal. Because we're internal, we don't have wildcard certificates from Let's Encrypt, because Let's Encrypt is not allowed to reach our systems to prove that they're there. Otherwise, we'd have to open up to the world. You can do it another way, and I'm going to look into it so that we can still have nice HTTPS no matter where we are. But that will come later.
Now, this guy is saying log in. So with one click in Coolify, I got it ready to go. But let's look at something really quick. If we look here and we look at General, we have that.
Do we have any environment variables? We have this. Okay, I don't see any passwords or usernames. So it says, okay, go use your Stirling PDF. I went here, and that's my domain. So what is the login? I don't know what the login is. Oh, it's right here. Man, that's some horrible UI. Sorry about that. Maybe it's because I'm viewing this in a small desktop view. Stirling admin. Okay, let's see. There we go. Obviously, we want to change that. It's saying welcome. I'm going to paste the default password they gave me.
Now, I'm going to use a password manager. I don't normally use this password manager. I'm just using it for the series so that I don't open up my own password manager. We're going to add a new entry. Any password manager you use is better than none. So now we have our password manager. We have the URL for it. We have the username, which is this admin. And then we're going to generate a password. Again, I don't know any of my passwords. I just generate them and let the system do it. So now when I go back here and put in that new password, it should be happy with it. Sometimes they want certain things. All right, so we go to admin and that. Okay, perfect.
What we're getting out of Stirling is something I will show later on. So let's see. This actually might not even be... yeah, it is Stirling. Cool. How do I get out of here though? Maybe later. Skip for now. Geez. So we're running Stirling. You can do stuff with PDFs, and they have an API. This is cool stuff. We are going to use this a lot. I'm going to come back to this later, though. Okay, here are the API docs. Oh, they really did some updates here. We'll come back to this. What I do want to do is our n8n install, because that's the most exciting one to start with. Now, look at that name. It's crazy. I'm going to just rename that to PDF server.
And this will turn our PDF pages into images, which later on we'll use with AI vision to do more in-depth OCR. Now we're just going to do the basic n8n. There's an n8n with a worker, and an n8n with Postgres. That's the one I typically use, but I want to start off small here. And like I said, we are able to say, in settings, n8n internal. So we're going to have our n8n. Let's give it a name that makes sense. I don't think this is where it matters, though. Sorry about that. Let's go back here and get rid of that weird name. Okay. Now we're going to deploy that.
So again: Coolify setup, Coolify login, create the project and the environment in Coolify, and then start adding our apps. You saw the little clip on how I did the domain name stuff, which was added earlier. Sorry, this is video editing. Now we're able to add this, and we're going to try adding Ollama next. I've never done Ollama in Coolify. I've done Ollama outside of Coolify, but by the time we're done, it should all work.
So this is n8n, and now it's running. If I go here, because of the magic of the DNS stuff, it's just there. Now, it won't like this, because it wants us to be running HTTPS. That's fine. I just don't want to worry about that right now, because then we get bad certs, for no good reason really. It's something we will take care of, but it's not the first thing I want to do right now. So what we want to do is what it said. It said, hey, if you're really going to stick with this, make that false. So I went to the n8n service, I went to the environment variables, and I made a change. I'm going to click save. Then I have to restart. This comes in handy later on with n8n, because you might want to set particular environment variables here. You might want to change a domain name later on. You might want to increase the workers. There are other things you commonly set here for n8n. So we'll come here again later.
So this should have restarted. I think that was pretty quick. If we reload this guy, it should complain less about HTTP. Now, this always gets a little tricky. Chrome does not really like HTTP, and I don't blame it. This is the one flaw in my plan that I'm trying to figure out how to get around. If we go back here and click this, that's where you get your links. Try using current. So we did set it to false. I don't think it really restarted. False. Restart. Let's give it another restart. That's a really quick restart. Let's stop and start it, because that's too quick. But this should work. If not, we'll go at it a bit more. We'll turn on HTTPS and I'll show you what happens.
I'll do a little diagram right now. I'll cut and show you how this is working. Basically, when we set up Coolify, it uses Let's Encrypt to do some nice free SSL stuff that used to cost money back in the day. But by default with Let's Encrypt, and this is in the Coolify docs, the Python script or whatever it is that runs to say, hey, I'm going to set up the certs for this site, has to call home to them, and they validate the cert by calling back to us to ask, hey, is this site here and legit? But they can't, because we're behind our tunnel, sorry, our firewall, basically our network router, and we didn't open up any ports because we don't want to. So at that point it just can't. Now, you can work around that. I forget how they want you to. I'm not sure how to connect all the dots with Coolify and that traffic and everything it runs, so I'll come to that later. It's just not the most important thing right now. Otherwise, it just works, and it's an amazing way to do certs. You never pay for certs, unless you're on AWS or something silly. Bummer. Let's see if I can do that.
Sometimes Chrome can be a little bit aggressive there. But no, I'm going to set up the recommended option. You can run locally. No, I already did this. So we already did this. Okay, so why is it not working? Let me see something really quick. We'll troubleshoot this together.
I'm going to go here and then here. This is the Compose file. If we look here, we have environment variables, and we can load them here. So I'm going to do this. I'm going to just say environment. Does this guy have a clipboard? I can't remember. Let's go grab this again. Let's go back to Chrome. Okay, so it's not here. I'm going to put it here, instead of only updating that variable. And this is good. You start to see how you can do things. We're going to set it to false right here. Then if we click... see, now if we show the whole deploy output, this is what it runs later on, and I bet you... yeah, I thought we would see it twice. We see it where I set it there, and then we see it a second time. So let's see what happens now. Right there. That's as in there as you can get.
So again, this is Docker. It's a website wrapped around all these Docker services to help manage them. I think it itself runs in Docker. So let's go to our terminal and get into the Docker side of things. I don't know if I can even do this as my user. Yeah, sudo su. What is sudo su? When you have to do stuff as the root user for a moment, you "super user do", and you take a moment to do that particular task without running everything as root. Root is the admin-level account that can really break your whole machine, so you stay away from it. But we can see that we have some things here called Coolify. We can grep for coolify and see some of this stuff running. So Coolify is running in Docker as well. And I don't have Docker Desktop installed, because I don't want it to get in the way of anything.
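The terminal check above (list the running containers, then grep for Coolify) can be scripted as well. The sketch below shells out to the standard `docker ps --format "{{.Names}}"` command. The helper names are hypothetical, and on a fresh Ubuntu install you may need to run it with elevated privileges, as the video does, unless your user is in the `docker` group.

```python
import subprocess


def running_container_names():
    """Return the names of all running containers, one string per container.

    Requires the docker CLI on PATH and permission to talk to the daemon.
    """
    result = subprocess.run(
        ["docker", "ps", "--format", "{{.Names}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def filter_names(names, needle):
    """Case-insensitive substring filter, the scripted equivalent of `grep -i`."""
    return [name for name in names if needle.lower() in name.lower()]


# Example call on the server itself:
# filter_names(running_container_names(), "coolify")
```

Keeping the filter separate from the `docker` call makes it easy to reuse the same check for the n8n or PDF containers.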
But the nice thing here is that it manages Docker for me and I don't need to worry about it. It's just running. There are a lot of benefits from that which we'll see later on, because I just don't want to manage the Docker stuff at that level. All right, let's see if that finally worked. Let's go. Here we go. Okay. So I had to put it in the config. Glad we all saw that together. Now we can get started. Let's do this, and then we'll connect it to Ollama.
At this point, I'm going to open up my... I think I have an extension here. I've never used this before. Sorry. Log in. I want to go create something. I have no clue. Hold on. Yeah, I know, it's my email. Remember, you want a system that works. That's just me typing in public. I'm going to go here because it's already opening. I'm going to add a new website, so we're going to add this to the list here. We're going to say create new. If you learn anything from this, manage your passwords. Just take a moment. If you have Chrome, it can do it for you. Things like that. It's risky not to use a password manager. So I'm going to do Alfred at Daily AI Studio, and then we'll make it generate the password. I won't even care what it is. Then we should be able to copy that. I don't know if I'm copying. Let's see. Okay. And then Alfred, new.
Now, when we do this, we're going to get a license from them, and this is going to be important. One cap. All right. They have regulations and rules on this. This isn't really public-facing, so I don't need to worry about this. That's weird. That's a pretty good password. Maybe when I pasted the... yeah, I think I just pasted it wrong. So now let's answer a few of their questions, and then this one. You want to make sure to get this license. This is actually key. Okay, I'm going to pause the video for a moment and go get that.
I'm going to go to settings, enter key, and going to go enter it. Now, I'm going to pause for a moment. You need to do this, otherwise you won't have particular features which are so key. And it's just, you have to do it. So, let me go do it, one moment. Okay. So, it was in my email. I could even click the link, but I might not be able to because it's this. Okay. Boom. We got a license key. So, now we have n8n running in our system in our office and it's going to connect to our Ollama. So typically I wouldn't run Ollama in Coolify, but we're going to. But here we are. We're done with the n8n part of it. So what that means is we could say trigger manually. Sorry, actually let me do this one. AI chat, and we could do a whole little world with our local running. Let's do that next. Struggle through it with me. All right. I don't know if this is going to work. I've never done this one before. I've done Ollama in a few other ways. I just want to do this with you so we can just see if it works together. So, I'm gonna click new, Ollama. Can I spell? Again, we're getting all this stuff from the Coolify interface on how to do services. We're just doing a new one. I'm going to do Ollama. Now, it comes with Open WebUI, which is fine. If this works, then we'll go to making it an actual service, which is what I really want. Now we have the weird... No, it's right. We got this. So, let's see. But see, the thing is I need Ollama running on a port I can connect to. So, let's just do this first and then we'll do that after. Okay. I think this is just the web UI. So, this is the web UI. It's a cool way to chat with your models. I was going to set this up anyways, but I think now that they're connected, it's just an easy thing. I'll just keep it there. So, what is this going to do? How does it work? I don't know. I've always done this at the command line. So, let's see what happens here. Docker has a lot going on for it with local LLMs and running them.
If I have to run them on the side via Docker, I'll do that. I was trying not to, but I'll just have to. And otherwise, let's see what happens. May take a while. It's probably a big download, but I don't know if it comes with a model or not. Maybe it does. Oh, let me start recording on this guy, too, just in case I have any issues. So, when you click build on this guy, it can be sometimes hard to see it stop. Obviously, I had to scroll. It will get to a point where it started and you can see the green back here. If you close out of this too soon, I've seen it just not really work. So, just remember to be patient and keep an eye on that. Coolify is great, though. Again, I can't say enough good about it. So, let's see. I'm going to go back and I'm going to go here. So, we're set to go. Now, we had some issues here with the, what do you call it? Oh, good. It's just HTTP. Good, good. And we talked about this in the first part of this video. So, now I'm going to create my username and password. I'm using this guy here. So, let's keep using this. Have a password manager and then try to remember your password. Hold on a moment. Let me pause. All right, that took a moment. So, I'm going to put in my name and then I'm going to use Keeper to set the password. Again, you want to get in a habit of having good passwords that you don't even know. So, this is probably just getting confused by the subdomain. I'm not used to this software. Here we go. Create new record. It's not right. And I don't really care, but I'm going to copy that. It just gave me a warning, but I'm going to do that. So, I think we're set. Okay. So, does this just work? Let's just see what happens. Ah, no models. Okay. It should have set it up for us. Let's go just dig into the docs. There aren't any docs, but let's just go look at settings. So, in Coolify, you have your service, and your service is running multiple services or applications, which are Docker containers.
Sometimes you'll see information here like passwords and stuff. You could click on environment variables and see what's going on here. Okay, nothing. Now, we'll come back to here because I'm curious what that is set to. Okay, but let's now go look at general and let's see. Here's where Ollama's running. That's cool. And that's the Docker network, which is interesting. So, the question will become, can I do this outside of Docker? That would be tricky. Not impossible, just tricky. We'll test that in a moment, actually. So, yeah, I don't see anything there. And I've never used this in here. Let's see what happens. I'm just curious. So then we have functions and settings. I don't think that's it. Admin panel, settings. See, where are the models? Here we go. Oh, that's cool. So, set up our local Ollama. So users, evaluation, settings, general, connections. Obviously, we're not using OpenAI for this one, though I don't know why it's checked. So then it says, "Hey, go connect to us locally here." And then direct connection. And what is this? Manage. That's interesting. I've never seen this with this guy here. This is cool. Go away. Model ID tags. Model ID. So the question becomes, where are our models? I know how to pull them down the old-fashioned way. What do we got here? Pull a... Oh, look at that. That's nice. Okay, so let's go to Ollama. I'm trying to learn how they want me to do it. I'm going to do a simple one. This is not your best one, but let's just do this one anyways. So, let's go get this guy and let's go back to here. And again, we can do a lot of this at the command line, but I'm trying to just try something here. So, I'm going to pull that down and we'll give it a moment. It's nice. I've never done it through this UI. We'll do LM Studio as well before long. No, did I just break it? Hold on. Let's go look. Nope. It's still going. Cool. Cool. It has to download a big file. So, let's just give it a moment.
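The "old-fashioned way" of pulling a model goes through Ollama itself rather than the web UI. As a hedged sketch, Ollama's HTTP API has a pull endpoint that streams one JSON object per line while it downloads. The base URL and model name here are placeholders, and the request field is assumed to be `model` (older Ollama versions used `name`), so check the API docs for your version.

```python
import json
import urllib.request


def pull_progress(line):
    """Turn one JSON line from the streaming pull response into a short status string."""
    event = json.loads(line)
    status = event.get("status", "")
    total = event.get("total")
    completed = event.get("completed", 0)
    if total:
        # Layer downloads report byte counts; show them as a whole-number percentage.
        return f"{status}: {100 * completed // total}%"
    return status


def pull_model(base_url, model):
    """POST to /api/pull and print progress until the stream ends (needs a running Ollama)."""
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/api/pull",
        data=json.dumps({"model": model}).encode("utf-8"),  # assumed field name
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        for raw in response:
            if raw.strip():
                print(pull_progress(raw.decode("utf-8")))
```

Nothing runs until you call `pull_model(...)` against your own server. The long pause in the UI is the same download, followed by a digest verification step.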
And then the trick will be, can n8n call to that? So, let's go find out. Let's just get ahead of the game here. So, that would mean let's just go open up a new one. It seems like I could close that window just fine, honestly. But, I'm going to just keep it open. So, we went to settings and general. Did we not go to settings? Admin panel, settings. Okay. Models, connections. This guy. This is what I want. I'm just curious. Can I do this? So, I'm going to go to here. I'm going to say, actually, I'll do this one because it shouldn't matter. I'm going to say Ollama and then I'm going to go set it up, and instead of this I'm going to put in this. Now remember, it has to talk to it through Docker or through the network connections, and it couldn't connect, and that's okay. I'm going to go look into that. We're going to make this work. I just got to remember how this guy works and then we'll come back to that, because I want them to connect inside of the Coolify destinations, and I'll come back to that in a moment. We're almost ready to show this is working and I'm curious how the speeds will be. It's just doing some hash checking. All right, here we go. So, I should be able to select a model at this point. Let's go see if it's there. Yeah, there it is. Hello. There you go. Okay, so we got the basics. Now I need to connect them. So at this point we need to work on that. And so I'll explain that in a moment. I'm going to show it as a chart and then I'm going to go pause for a moment and read something to remember how to do destinations so I can show you. And then they're going to talk to each other. And this is going to be cool because n8n will talk to our Ollama. Be right back. Okay. So how do we get n8n to talk to the Ollama system?
So if we look here, let's see if we can, oh sorry, different machine, let's see if we can visualize this. Right, so we have, right, and then in there, so let's just type Coolify, and in there we have n8n, and that's running on Docker. Okay, so if we were to just say, where is this guy, here it is, we have n8n. Now n8n is a service, so we have to remember that, because when we read these docs we're going to see that it's a service. And I'm far from a Docker guy. I actually didn't like Docker for a number of years just because I thought it was overkill. But a service is just a bunch of services. Now I don't think we have more than one service here. But let's just say we'll call this the n8n worker, because later on we'll have a queue system. So this becomes our Docker image, our Docker service, which is just a bunch of applications. Okay. But this is in and of itself a network. Now I think, reading their docs, I'm keeping it simple for now, but I could see taking this further later on and I'll explain that. But we got over here, then we have the service we made, which is the Open WebUI one. Okay, so now this is a network of its own and this is a network of its own, and this guy has the two services running. So that would be the Docker service Open WebUI and then the Docker service Ollama. And we can see that when we go back to this guy here and we look at our office systems, production, and we see the service here with Open WebUI, and we see that there's two running services. Now sometimes there'll be more, and if we look at our edit compose, which I altered thanks to help from Claude or something, we were able to then do a little bit more with the network, and I'll explain that in a moment again. But see, this is a production service that has these. Technically, and when I say technically, if you look at the API for Coolify, it calls them applications, so that's confusing because we also have applications outside of this.
So, if we look then at the n8n one, we also see that it's a service with one service running. I don't know if this guy had more than one. No, this one does not. We're going to set up Supabase in a moment, and that has 10 or more applications running in that service. So you'll see a more complex setup in a moment. But here we have Ollama with Open WebUI. Now I cheated here, and I want to eventually get into this, but I'm not sure how important it is for this particular series, but I was able to connect to a predefined network. This service, L8GWCC, becomes a network, and so that means that this network where these two things can talk to each other is self-contained. But the moment we open it up, and the moment we say, hey, that server localhost has destinations, is the moment we can say, hey, let's make those different services, their different networks, communicate together. So we have the destination of the Coolify server itself. Its Docker network, because it's inside of Docker. And then we have the other two, where one of these is the n8n and one of these is the Open WebUI. But now they are in the same destination area. So they have a certain amount of networking ability to each other and that's pretty cool. I'll try to visualize it later, but we're basically saying, let these guys talk to each other even though they're different services, and we don't have to expose their information to the outside world. Now, we don't necessarily care because this is an intranet. Okay? But you could see, if it wasn't an intranet, the value there. It's a tough one, but let's go back and let's just keep it simple for a moment. So, what I did was I went here and then I went to here and then I went to here, which didn't have a URL before, and I just said, you know what? Let's put a domain here. And once I did that, we could not connect to it still because I didn't forward the port.
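To make the "separate networks versus shared destination" picture concrete, here is a toy model in Python. It is only an illustration of the rule that two containers can reach each other when they sit on at least one common Docker network. The network and container names are invented, and nothing here talks to Docker or Coolify.

```python
def can_talk(networks, a, b):
    """True if containers `a` and `b` share at least one network."""
    return any(a in members and b in members for members in networks.values())


# Each Coolify service starts out as its own self-contained network (hypothetical names).
networks = {
    "n8n-service": {"n8n"},
    "ollama-service": {"open-webui", "ollama"},
}

# Open WebUI and Ollama live in the same service, so they can already talk.
same_service = can_talk(networks, "open-webui", "ollama")

# n8n is in a different service, so it cannot reach Ollama yet.
before = can_talk(networks, "n8n", "ollama")

# Attaching both services to one shared network changes that,
# without exposing anything to the outside world.
networks["shared-destination"] = {"n8n", "open-webui", "ollama"}
after = can_talk(networks, "n8n", "ollama")
```

The alternative taken in the video, giving Ollama a domain and forwarding its port, works too, but it exposes the service on the office network instead of keeping the traffic inside Docker.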
And so again, just giving things to ChatGPT or Claude, having it set up the networking so that port 11434 will work was all it had to do. And so at this point it was able to route traffic to the URL, which defaults to port 80, to then be mapped to 11434. And so I think I could have done it easier by just doing 11434 like this. But it's so tricky right now, because ops can be hard. But if you just give ChatGPT, if you give these tools what they need to answer the question, they typically can. You just got to remember to keep them on task with the docs and whatever other context you can give them. But anyways, I'm going to remove that for now because I'm happy with what it did. And so now if we go to our n8n, which lives here, but it also lives in its own network. Okay. And if we say to it here, connect to Ollama using that particular URL, sorry, it can connect. Now, this isn't a big deal because we're no longer dealing with internal Docker networks. You've exposed it. So, in a real world situation, you don't want to do this. Not even might. So, that's why I would then head towards the more internal networking. Now, can I internally network this guy? Yes, I could if I just spent a bit more time. So, if I was to do something, and this won't work, but if I did something like this and I spent more time doing this, it would eventually connect. I don't think it will now. You can tell because it's taking so long. It would connect to the Ollama running internally in that Docker container. Hopefully during this video, I'll spin up some visuals so we can see as I talk. But because we're just dealing at the intranet level, I'm going to leave this as is. Okay, it's pretty nice really. So, now we can connect. It's pretty quick, but that's neither here nor there. It's just IP at that point, internal IP. So, we see that it can connect. We see that I can say hello world. We see that it talks back to us with that particular model.
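If you want to check the same connection outside of n8n, a minimal sketch in Python looks like this. It assumes Ollama's standard HTTP API on port 11434 (`/api/tags` to list models, `/api/generate` for a one-shot prompt). The hostname and model name in the usage note are placeholders, not the ones from the video.

```python
import json
import urllib.request


def build_generate_request(base_url, model, prompt):
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    body = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/generate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def model_names(tags_payload):
    """Extract model names from a parsed /api/tags response."""
    return [entry["name"] for entry in tags_payload.get("models", [])]


def list_models(base_url, timeout=10):
    """Fetch /api/tags and return the installed model names (needs a reachable Ollama)."""
    url = f"{base_url.rstrip('/')}/api/tags"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return model_names(json.loads(response.read()))


def ask(base_url, model, prompt, timeout=120):
    """Send one prompt and return the generated text (needs a reachable Ollama)."""
    request = build_generate_request(base_url, model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

For example, `list_models("http://ollama.example.internal:11434")` should show the model you pulled, and `ask(..., "some-model", "hello world")` is the script version of the hello world test. If the list call times out, the port is probably not reachable from where you are running it, which matches the slow failed connection shown above.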
Now, when I did set this up, I had to close this window and open it up again before it would actually show me the model. So, if you get stuck, just do that. And at this point, you can start using them like normal. So, now if we were to go back here and add a model, let's see. I wonder how I wanted. Let's see. Let's just do this one. There's so many. I got to learn some more about them, obviously. But this is six gig. Let's do this one. I don't think I'll have trouble running it. And that's where we'll hit some details later. But now I'm going to say all models. I'm going to go to admin panel, settings. And let me reload this just to make sure I didn't lose some settings. I did. Okay. One moment. Let's see. Fill. I don't use this tool much, this Keeper thing. So we go models and then we go to our oss latest, import presets, export, download, manage models. Let's go get a model from them. This is cool. And we'll download it. And that will take a moment. Okay. So we have our n8n working and we have it using local data. And part of the goal here is to do real stuff in an office. So we're not going to focus on benchmarking. We're going to just focus on real needs that customers have. So maybe the first need I'll do is a common scenario, and again, for the customers who want privacy this means a lot. Whether it's turning invoices or on-site work into data in the database, it just depends on the need. Now in our case we're going to take data, we'll take it from email. I'll make a fake email we can use and send data there, and in doing so then we'll have n8n capture that and bring it into the system, put it into our database to store, and then we'll ragify it so we can chat with it. And in doing that, we'll start our first moment of taking data from somewhere, moving it into our system's memory or database, and then, as a real user, processing it and putting it into a database somewhere. So then they can see and use that data.
We're going to introduce Supabase in a moment so that we can do all that. We're going to introduce a well-known RAG workflow that was released by a YouTuber. And then we're going to have a UI in front of that that we don't have to build, because we'll use something like Budibase. Yeah, probably Budibase, because NocoBase isn't here yet for me to use. I don't think so. I'm going to look if it's here. Someone posted a PR for it. And then we'll just keep going at it. All right, hang in there. Okay, so the last part of episode one, otherwise this will go on forever. In episode two, I got to go get that going. And it's going to be built on top of everything as we go. And we're building real systems. Before long, episode three or four, we'll have a UI using Budibase or one of those tools so that people can use the system. And we'll see how that goes shortly. But episode one, we're going to wrap up right here and we're going to set up Supabase. We're going to show that they connect, and then we're going to eventually move into episode two, where we can do a RAG system, and we'll email the RAG system meeting notes and

Original Description

Join this channel to get access to perks: https://www.youtube.com/channel/UCZa3QWzy1z1G9FIw02pytdA/join

#n8n #Ubuntu #Ollama #Supabase #Coolify #localllms

Welcome to Video 1 of the 2026 Private LLM & Self-Hosting Series. This year-long deep dive will show you exactly how to build and run your own fully private AI stack, on hardware you control, using Linux, Coolify, Ollama, n8n, Supabase, and more.

In this first video, we focus on The Why:
• Why companies are suddenly asking for on-prem AI
• Why privacy and cost are driving the shift away from cloud APIs
• Why change-control matters when models break your prompts
• Why a single office server can replace thousands per year in API spend
• Why you should start learning this workflow now (even as a solo dev)

Then we walk through the base setup:
• Installing Ubuntu
• Setting up Coolify
• Running n8n inside your own network
• Installing Ollama inside Coolify
• Directly connecting n8n ↔ Ollama
• Adding Supabase as your internal database
• Using Tailscale to securely access your server remotely
• Preparing the stack for real automations in Episode 2

By the end of this video, you’ll have the full foundation in place to start building internal automations, RAG pipelines, reporting tools, and AI agents, all privately, securely, and without cloud API costs.

If you’re part of the paid channel, thank you. You make this full 2026 program possible. If you’re watching via a preview: join the channel to unlock the full course.

# Chapters
00:00 – Introduction to the 2026 Series
00:35 – Why This Matters (for you + your clients)
01:10 – Privacy: Why Companies Want AI In-House
02:10 – Cost: How API Fees Add Up Fast
03:05 – Change Control: Escaping the “AI Upgrade Hamster Wheel”
04:05 – Overview of the System We’re Building
05:15 – Office Network, Firewalls & Internal Access
06:05 – Remote Access with Tailscale
07:00 – What This Local AI Server Will Run (Ollama, n8n, Supabase, etc.)
07:45 – Getting Ready to Instal


This video series teaches how to build private LLMs and infrastructure from scratch, covering cost savings, control over AI upgrades, and internal infrastructure. It provides hands-on experience with tools like n8n, Ollama, Tailscale, and Coolify.

Key Takeaways

Set up a Linux server
Configure n8n and Ollama
Set up a VPN using Tailscale
Host various services such as PDF services, Ollama, and database storage
Install and configure Coolify
Deploy Next.js on Coolify
Create a new project on Coolify
Install open-source applications and databases on Coolify

💡 Building private LLMs and infrastructure from scratch can provide cost savings, control over AI upgrades, and improved security and privacy.
