Jump to content

Using Document AI to automate procurement workflows


Recommended Posts

Earlier this month we announced the Document AI platform, a unified console for document processing. Document AI is a platform and a family of solutions that help businesses to transform documents into structured data with machine learning. With custom parsers and Natural Language Processing, DocAI can automatically classify, extract, and enrich data within your documents to unlock insights. 

We showed you how to visually inspect a parsed document in the console. Now let's take a look at how you can integrate parsers in your app or service. You can use any programming language of your choice to integrate DocAI and we have client libraries in Java, Node.js, Go and more. Today, I'll show you how to use our Python client library to extract information from receipts with the Procurement DocAI solution. 

Step 1: Create the parser

After enabling the API and service account authentication (instructions), navigate to the Document AI console and select Create Processor.
create the parser

We'll be using the Receipt Parser, click on it to create an instance.

create

Next you'll be taken to the processor details page,  copy your processor ID.

Step 2: Set up your processor code

In this code snippet, we show how to create a client and reference your processor. This code snippet shows how to create a client and reference your processor. You might want to try one of our quickstarts before integrating this into production code.

Note how simple the actual API call looks. You only have to specify the processor and the content of your document. No need to memorize a series of parameters, we've done the hard work for you. 

You can also process large sets of documents with asynchronous batch calls. This is beneficial because you can choose to use a non-blocking background process and poll the operation for its status. This also integrates with GCS and can process more than one document per call. 

Step 3: Use your data 

Inspect your results, each of the fields extracted per processor are relevant to the document type. For our receipt parser, Document AI will correctly identify key information like currency, supplier information (name, address, city) and line items. See the full list here. Across all the parsers, data is grouped naturally where it would be otherwise difficult to parse out with only OCR. For example, see how a receipt's line items attributes are grouped together in the response. 

receipt

Use the JSON output to extract the data you need and integrate into other systems. With this structure, you can easily create a schema to use with one of our storage solutions such as BigQuery. With the receipt parser, you'll never have to manually create an expense report again!

output

Get started today! Check out our documentation for more information on all the parser types or contact the Google Cloud sales team.

Link to comment
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...