“Automation 2.0 coming…No more boring data entry job”

AI Jason · Beginner ·🛠️ AI Tools & Apps ·2y ago

Key Takeaways

This video teaches how to automate data entry using GPT to read invoices and enter data into Xero and HubSpot

Full Transcript

Today I want to show you how you can use GPT to automate boring data entry jobs. Last week I spent almost a whole weekend doing my tax reimbursement, and it was such a pain: I reviewed around 60 to 100 different invoices, extracted which company each was from and how much it was for, and then manually entered it into Xero for the reimbursement. I felt so exhausted by this process. On the other hand, I also feel lucky, because I only need to do this once a year, but in a business setting this process happens every single day.

Any operating company, especially a traditional business like retail or manufacturing, receives tons of documents every day. They normally have one or two people doing data entry into the different systems they use. This whole process could not be automated before, because every company's document format is different. There are literally millions of different formats for even just invoices, so there was no standard way to process all that information. But with large language models like GPT, it's actually possible now. As long as we can extract text data accurately from a PDF file, we can feed that text to GPT, let it extract structured information, and then automatically sync it into different business systems like Xero or Salesforce.

But one problem you will quickly find is that it's very hard to extract text from a PDF file accurately, because a PDF file is not just text. Part of the PDF could be an image from a scanned document or a screenshot, and the text can be in a table or in two columns. So it is not an easy job, but luckily, after a bunch of research, I finally found a kind of bulletproof way, and this is what I want to show you today. You will be able to create an AI app where you can define the list of data points, drag and drop PDF files, extract all the structured information, and send that data to an integration platform like make.com to trigger different workflows in business systems like Xero, Salesforce, and many others. And if you don't know what make.com is,
it's one of the most flexible integration platforms, and it lets you connect to the business apps that you already use. If you don't have an account on the Make integration platform yet, click on the link in my description below to sign up.

The first thing we're going to do is extract text from PDF files accurately. As I mentioned before, it is not as easy as it sounds, but luckily I learned a lot from a Medium article written by Zoumana about how to extract text from any PDF file accurately, so definitely go give it a read if you want to dive deep. At a high level, the method we're going to use is a combination of two different Python libraries: pypdfium2, which turns any PDF file into image files, and pytesseract, which does a really good job of extracting text from images. So our process is basically to convert every PDF file into images and extract the text from them. This might sound like a lot of unnecessary work, but it works the best of all the methods we tried. LangChain's default PDF loader and UnstructuredFileLoader did a really bad job of extracting information from images, and the same goes for the other library, PyPDF. But with this method it gets almost 99% of the text. So let's get into it.

As always, let's create a folder in Visual Studio Code and create a .env file where you add your OpenAI API key. Once you've done that, we'll create a second file called app.py, and let's break down how we're going to implement this. There will be four steps: first we convert the PDF files into images; then we extract the text from those images with an OCR Python library; then we extract structured information from that text using a large language model like GPT; and in the end we send the data to make.com via a webhook.

The first thing we need to do is import a list of different libraries. I'll explain later how we're going to use each library. Once you've done that, the first step is to convert PDF files into images with a library called pypdfium2. If you haven't installed this
library yet, click on the button in the top right corner and type in pip install pypdfium2, and this should install the library that we're going to use.

So let's create a function that converts PDF files into images. The way this function works is that it breaks the PDF file down into separate pages, and then converts each page into an image with the Python library we're using. Once that's done, we'll create a second function, extract text from images, and give it the list of images that we got from the previous function. It basically runs through each image and tries to extract text from it, and in the end it returns the whole content. For this function we need another Python library called pytesseract, and again, if you don't have pytesseract installed, you can open the terminal and run pip install pytesseract. Once we finish these two functions, they give us everything we need to extract information from any PDF file. So in the end I will create a final function that connects the two functions together, called extract content from URL, where we give it the URL path of the PDF file; it first generates a list of images with the first function, then extracts the text with the second function, and returns the content that we extracted. That's pretty much it.

Let's try this out. I have this invoice PDF from GitHub, so I can drag and drop the file into the folder. Once you've done that, move down to the bottom and create if __name__ == "__main__". You will need to add the multiprocessing.freeze_support() call, and then call a function main; inside the main function is where you write the execution code. So we'll create a variable called content, call the function that we just created, point it to the URL of the PDF file that we uploaded, and then print it. So let's try this: python app.py. Okay, so we got results, and as you can see, it successfully extracted all the information inside my PDF file. If you compare, it extracts almost everything,
including the money I paid, the date I paid, and the item that I paid for. We will also try something like this, which is a screenshot image inside a PDF file, and as you can see, even for this image it extracts all the information, including the receipt number, the amount, and the payment date. So this is working really well.

The next thing is to feed that information to a large language model like GPT and let it extract structured information like the invoice ID and the amount of the payment. So we will create a new function here called extract structured data, where we pass in two variables: one is the content that we extracted from the PDF file, and the second is data points. Data points basically allow users to define what kind of data points they want to extract, so it can be a bit more flexible. For example, users will be able to define a data structure like this that they want to extract: invoice item, the amount of money, which company issued the invoice, and the invoice date. So we will pass in these two variables and create a variable, large language model, equal to ChatOpenAI with temperature 0 and the gpt-3.5-turbo-0613 model. Then I will create a prompt template: you are an expert admin person who will extract core information from documents. Then we pass in the content that we extracted: above is the content; please try to extract all the data points from the content above and export them in a JSON array format. Then we pass in the data points that the user provides: now please extract details from the content and export in a JSON array format; return only the JSON array. We'll create a prompt template from that, create a variable called chain equal to an LLMChain, and then run it by passing in the content and the data points, and we return the results.

So I'll move down to our main function here and add a new variable called data, which equals extract structured data, passing in the content and the default data points, and we will print out data. So let's
try this. All right, so we got this result, which is in JSON format. It extracted the invoice item, payment to Launch House, the amount, 2500, as well as the company name and the invoice date. It is working really well.

Okay, so now let's add a UI layer with Streamlit. If you don't know what Streamlit is, it's a framework that lets you create a web app from your Python code very easily, as it provides a wide range of UI components. So we import streamlit as st, move down to the main function, and I will replace this part with the Streamlit app UI. First we set up the title for the website: st.set_page_config with page title equal to Doc extraction. Then I will add a header, and I will create a text area where users can change the data points, with the default value that we defined above. Second, we will have a component for users to upload files, and I will turn on accept multiple files. Then, if uploaded files is not None and data points is not None, we trigger the function to extract information.

So we'll create results, which is an array in case people upload multiple PDF files. For each file we create a temporary file first, so that we get a temporary path that we can pass to our functions. So I'll do f.write with file.getbuffer, and then we create the variable content by extracting content from the URL, using f.name, which is the file path, and then run extract structured data. We will also convert the data into JSON, because even though the result the large language model returns looks like JSON, it is actually a string, which means just text, so we need to convert it into proper JSON. Then we do results equals results plus the JSON data, which means we combine all the structured data that we extracted and put it into results.

Okay, and once we've done all that, we also want a UI component to display the structured information. I plan to use the data editor, which is one type of
UI component that Streamlit provides; it is basically an interactive table. I'll give it a header with st.subheader, Results, and then st.data_editor with df, which is a data frame object that we created from the results. I also do the error handling here. So this should create a web app where we can upload a file and review the results.

Let's try that. I will click on the top right corner and then run streamlit run app.py. All right, so we got this web app running, and you can see there are two parts. One is the data points, and I can modify it, so if I don't want the invoice date I can just remove it. And then I can upload a file here. This time I will try another invoice that is a lot more complicated, as it has a few different items. So I will drag and drop this file in, and in the top right corner you can see it is running. All right, great: you can see it extracted the list of all the invoice items as well as the amount of money each item cost. So this is doing really well, and if I want to change something I can also just click on it and change the name.

All right, so we've got everything working. The last thing is that we want to add a button here at the bottom which will create the invoice in Xero or in Salesforce automatically. To do that you can use the Xero or Salesforce API endpoints. They both provide documentation where you can take a look and set up the integrations. However, it is quite a bit of work. Those systems normally have a complicated authentication process, which makes the integration a lot more difficult than just calling an API endpoint. That's why I prefer to leverage a platform like make.com. If you don't have an account yet, click on the link in the description below and you will get a one-month Pro plan for free.

After you sign up for an account, click on the create a new scenario button in the top right corner, click on this plus button, and start searching for webhook. We will choose custom webhook, and I will click on this create a webhook button. You can rename it to Doc extraction GPT and click save. So now we have this
API endpoint that we can send data to. Click on copy address to clipboard and return to our Python code, where we will write a function to send data to make.com. To do that we need a library called requests, which allows us to send API requests. I will move down here, before the Streamlit app, and create a function called send to make with one variable, data. I'll create a webhook URL variable with the address that we copied from the webhook on make.com, then create the JSON data and try to send a POST request to the webhook URL with our data. Once it's finished, we print out a message that the data was sent successfully; otherwise we have the error handling here.

The last thing is that I will move down to the Streamlit app, and just after the data editor table I will add a new component, a Streamlit button with the title sync to make. If this button is clicked, it triggers the send to make function that we created above, and once it's finished it shows a message, synced to make successfully.

Let's run this again: streamlit run app.py. This time I will change the data points a little bit, because I want them to match the information that the Xero API needs to create those invoices, which is item description, quantity, and unit price. I drag and drop in the invoice I have here, and once it's finished you will see a new button here called sync to make. But before you click on it, make sure you go back to make.com and click on the run once button. Then we come back here and click on the sync to make button. Once it's finished, you will see a little bubble in the top right corner here, which means it actually received new data, and if you click on that, it is exactly the invoice data that we sent. So we have successfully built the connection.

Next we're going to build out the whole workflow on make.com to create invoices in Xero. Click on flow control, which is the gear button at the bottom, and add an iterator. The reason we need an iterator is that most likely you will drag in multiple
different invoice files, so there will be an array of invoices to be created, and an iterator is basically a way for you to run a for loop over an array. So we click on this, I'll choose the array to be data, and click save. Then I will click on this plus button and choose Xero. The first thing we want to do is search whether the company name on the invoice actually exists in our system; if it doesn't exist, we create a contact in Xero first. I'll select search by field, choose name equal to the company name at the top, and click OK.

The next thing is to create an if condition. You click on this little tool icon and then select add a router, which is basically an if condition, so you can add two routes. One route is if the contact exists: that uses the total number of bundles from the search, which is the number of search results, and if it is greater than or equal to one, that means the contact exists. For the other one you can click on add, choose set up a filter, and give it a label, if contact doesn't exist.

For the first one here, if the contact does exist, I'll choose create invoice and then select whether it is a bill or a sales invoice. For the contact I'll choose the contact ID, and then add a line item, and this is where I'll map the invoice information received from our web app: description, quantity, and unit amount. Then I click save. For the other one, if the contact doesn't exist, I choose create a contact, and the name of the contact that we're going to create will be the company name. Click OK. After that we repeat the process of creating an invoice, except this time the contact ID comes from the create a contact step, so I will choose its contact ID. We do the same mapping and click OK.

So this simple workflow should do the job of sending the data from the webhook to your Xero. Let's try this. I will click on the run once button and go back to our web app. We can actually add a few more invoice items, then click on sync to
make, and this time you can see here that it actually triggered two different paths: one path for GitHub, for which we already have the contact, and the other, where it creates a contact first and then creates the invoice. And if you go back to Xero, there are two new invoices created, already containing all those details. So this is an example of how you can automatically create Xero invoices from the data that you extracted, but there are many other systems that you can integrate, like Salesforce, HubSpot, Dropbox, and many others. I'm pretty keen to see what kinds of things you start building on your end.

Another quick tip I want to share: for this specific type of document extraction use case, you actually don't have to build this whole Python code and Streamlit app, because there are no-code platforms like Relevance AI that handle document extraction extremely well. So in the Relevance AI platform I can click on new chain, call it document extraction, and then add some inputs. First I create a file to URL input, which allows us to upload a file, and then I create a table input that allows me to add different data points. Okay, so I just quickly fill in those data points, and I can save them as the default value as well. Then I move down here and add a step that converts PDF to text, and I choose the PDF URL to be the file URL that we upload here. Below here you will see this option called use OCR, and I will turn this on. This basically means it will automatically do the conversion from PDF to image and then do the scanning.

The next thing I do is create a large language model step with this prompt: you are an expert admin person who can extract core information from documents; please try to extract all the data points from the content below and output the results in JSON format. So this is the data points to be extracted, which points to the table data that we put in above, and this is the content, which points to the text that we extracted from the PDF, and then: export
the extracted data points in a JSON array format like below for each data point. So I will give an example, which is an array, and inside the array there can be multiple JSON objects for different invoice items. Now please extract details from the content and export in a JSON array format; return only the JSON array.

Once we've done this, the last part is to use the API step with the URL that we got from the webhook on make.com, with a POST request, and I'll add a header with the content type application/json. For the body I will select edit as object and click on this button, which lets us insert the variable; this sends the large language model data in JSON format. So let's try this: click run all. Once it's finished, we go back to make.com, and you can see it successfully sent the results and it actually created the invoice in Xero. The good thing about using Relevance AI is that once you finish, you already have a deployed app, and you can click on this button to share it with others as well.

So this is a quick example of how you can automate those data entry jobs by leveraging a large language model and a platform like make.com. I'm really keen to see what kinds of use cases you create, so please comment below. As I mentioned, if you don't have a make.com account yet, click the link in my description below and you can get a one-month Pro plan for free. I will continue sharing the different AI experiments I'm doing, so please subscribe, and I'll see you next time.
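
Editor's sketch of steps 1 and 2 (PDF to images, then OCR). The transcript names the two libraries but does not show the code, so this is only one plausible shape, assuming a recent pypdfium2 release (where each page is rendered with page.render(...).to_pil(), which also needs Pillow) and pytesseract with the Tesseract binary installed. The function names, the scale value, and the optional ocr parameter are illustrative assumptions, not the video's exact code.

```python
from typing import Any, Callable, Iterable, List, Optional


def convert_pdf_to_images(pdf_path: str, scale: float = 300 / 72) -> List[Any]:
    """Render every page of a PDF to a PIL image (one image per page).

    PDF units are 1/72 inch, so scale=300/72 renders at roughly 300 DPI.
    """
    import pypdfium2 as pdfium  # third-party: pip install pypdfium2 pillow

    pdf = pdfium.PdfDocument(pdf_path)
    # page.render() returns a bitmap; to_pil() converts it to a PIL image.
    return [pdf[i].render(scale=scale).to_pil() for i in range(len(pdf))]


def extract_text_from_images(
    images: Iterable[Any],
    ocr: Optional[Callable[[Any], str]] = None,
) -> str:
    """Run OCR on each page image and join the page texts with newlines."""
    if ocr is None:
        # third-party: pip install pytesseract (plus the Tesseract binary)
        import pytesseract

        ocr = pytesseract.image_to_string
    return "\n".join(ocr(image) for image in images)


def extract_content_from_path(pdf_path: str) -> str:
    """Chain both steps: PDF file -> page images -> OCR text."""
    return extract_text_from_images(convert_pdf_to_images(pdf_path))
```

With both libraries installed, print(extract_content_from_path("invoice.pdf")) would print the OCR text of a local file named invoice.pdf (a hypothetical file name). The ocr parameter exists only so the joining logic can be exercised without Tesseract.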
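
Editor's sketch of step 3 (prompting for structured data and turning the reply into real JSON). The transcript explains that the model's reply only looks like JSON and is really a string, so it has to be parsed before the results from several files are combined. The prompt wording below is paraphrased from the transcript; the data point keys, the function names, and the llm callable (any function that takes a prompt string and returns the model's text reply) are assumptions made for illustration, and the actual ChatOpenAI and LLMChain calls are deliberately left out.

```python
import json
from typing import Any, Callable, Dict, Iterable, List

# Hypothetical default data points; users would edit these in the text area.
DEFAULT_DATA_POINTS = """{
    "invoice_item": "what is the item that was charged",
    "amount": "how much does the invoice item cost in total",
    "company_name": "company that issued the invoice",
    "invoice_date": "when the invoice was issued"
}"""

PROMPT_TEMPLATE = """You are an expert admin person who will extract core information from documents.

{content}

Above is the content; please try to extract all data points from the content above
and export them in a JSON array format:
{data_points}

Now please extract details from the content and export in a JSON array format.
Return ONLY the JSON array:"""


def build_prompt(content: str, data_points: str = DEFAULT_DATA_POINTS) -> str:
    """Fill the OCR text and the requested data points into the prompt."""
    return PROMPT_TEMPLATE.format(content=content, data_points=data_points)


def parse_llm_json(raw: str) -> List[Dict[str, Any]]:
    """Convert the model's text reply into a list of dicts.

    Raises json.JSONDecodeError if the reply is not valid JSON.
    """
    text = raw.strip()
    fence = "`" * 3
    # Some models wrap the answer in a Markdown code fence; strip it if present.
    if text.startswith(fence):
        text = text.strip("`")
        if text.startswith("json"):
            text = text[len("json"):]
    data = json.loads(text)
    # A single object is wrapped so callers can always extend a list.
    return data if isinstance(data, list) else [data]


def extract_structured_data(
    content: str, data_points: str, llm: Callable[[str], str]
) -> List[Dict[str, Any]]:
    """Build the prompt, call the model, and parse its reply."""
    return parse_llm_json(llm(build_prompt(content, data_points)))


def combine_results(raw_outputs: Iterable[str]) -> List[Dict[str, Any]]:
    """Merge the parsed rows from several files into one results list."""
    results: List[Dict[str, Any]] = []
    for raw in raw_outputs:
        results.extend(parse_llm_json(raw))
    return results
```

Keeping the parsing separate from the model call makes it easy to wrap in the error handling the transcript mentions, since an invalid reply surfaces as a single exception type.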
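
Editor's sketch of step 4 (sending the extracted rows to the make.com webhook). The transcript describes a send to make function that posts the data with the requests library and reports success or failure; the version below follows that description under a few assumptions: the webhook URL is a placeholder rather than a real address, the payload is wrapped under a "data" key because the iterator in the scenario is pointed at an array called data, and the optional post parameter is an addition so the function can be tried without a network call.

```python
from typing import Any, Callable, Dict, List, Optional

# Placeholder only; paste the address copied from your own custom webhook.
WEBHOOK_URL = "https://hook.example.invalid/your-webhook-id"


def send_to_make(
    data: List[Dict[str, Any]],
    webhook_url: str = WEBHOOK_URL,
    post: Optional[Callable[..., Any]] = None,
) -> bool:
    """POST the extracted rows to the webhook; return True on success."""
    if post is None:
        import requests  # third-party: pip install requests

        post = requests.post
    try:
        # json= serializes the payload and sets the Content-Type header.
        response = post(webhook_url, json={"data": data}, timeout=10)
        response.raise_for_status()
    except Exception as exc:  # broad on purpose: network and HTTP errors alike
        print(f"Failed to send data: {exc}")
        return False
    print("Data sent successfully")
    return True
```

In the Streamlit app this would be called from the button handler, for example if st.button("Sync to Make"): send_to_make(results), with the returned boolean deciding which message to show.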

Original Description

The real AI Automation is coming - Let GPT reads invoices and enter data into Xero - The step by step guide from extracting structured data from docs, to send data to Xero, HubSpot and more;

🤘 Get 1 month Pro plan on make.com free: https://www.make.com/en/register?pc=jason&utm_source=jason-ai&utm_medium=influencer&utm_campaign=data-entry-gpt

🔗 Links
- Join my community: https://www.skool.com/ai-builder-club/about
- Follow me on twitter: https://twitter.com/jasonzhou1993
- Join my AI email list: https://www.ai-jason.com/
- My discord: https://discord.gg/eZXprSaCDE
- Github link: https://github.com/JayZeeDesign/gpt-data-extraction
- Zoum’s video for extract data from PDF: https://www.youtube.com/watch?v=nnZRBAzW3CA&t=23s&ab_channel=ZoumDataScience
- No code alternative: https://relevanceai.com/

⏱️ Timestamps
0:00 Intro
1:35 Quick demo
2:05 Step1: PDF to Text
6:05 Step2: LLM extract structured data
7:55 Step3: Streamlit GUI
10:48 Step4: Xero integration
16:00 No code alternative

👋🏻 About Me
My name is Jason Zhou, a product designer who shares interesting AI experiments & products. Email me if you need help building AI apps! ask@ai-jason.com

#gpt #autogpt #ai #artificialintelligence #tutorial #stepbystep #openai #llm #largelanguagemodels #largelanguagemodel #langchain #nocode #langflow #flowise #chatgpt #automation #aiautomation #aiautomationagency




Chapters (7)

0:00 Intro

1:35 Quick demo

2:05 Step1: PDF to Text

6:05 Step2: LLM extract structured data

7:55 Step3: Streamlit GUI

10:48 Step4: Xero integration

16:00 No code alternative
