Google Cloud (GCP) Document AI allows you to create document processors that help automate tedious tasks, improve data extraction, and gain deeper insights from unstructured or structured document information. Document AI helps developers create high-accuracy processors to extract, classify, and split documents.

This guide provides an overview of setting up the Google Cloud Document AI Connector in Peliqan. For technical assistance or custom integration requirements, please contact our support team.

Setup

Perform these steps:

1. Create a project on Google Cloud (GCP) and copy your Project ID.
2. Enable the Document AI API.
3. Set up billing in your Google Cloud account.
4. Create a processor (e.g. “OCR”) in Document AI:
   a. Go to https://console.cloud.google.com/ai/document-ai
   b. Click on Explore Processors.
   c. Select a processor, give it a name and click on “Create”.
   d. Copy the Processor ID.
5. Add a connection in Peliqan for “Google Document AI” and authorize Peliqan to access the Google Cloud project.
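
The Project ID and Processor ID copied during setup are what identify the processor on the Google side. As a rough illustration of how they fit together, Document AI addresses a processor by a resource name of the form projects/PROJECT/locations/LOCATION/processors/PROCESSOR. The sketch below assembles that name and the matching REST URL. The function names and the example IDs are hypothetical, and the Peliqan connection form may ask for these values as separate fields rather than as a combined name.

```python
def processor_resource_name(project_id: str, location: str, processor_id: str) -> str:
    """Build the Document AI resource name for a processor (illustrative helper)."""
    parts = (
        ("project_id", project_id),
        ("location", location),
        ("processor_id", processor_id),
    )
    for label, value in parts:
        # Each part is a plain ID; a slash would corrupt the resource path.
        if not value or "/" in value:
            raise ValueError(f"{label} must be a non-empty ID without slashes")
    return f"projects/{project_id}/locations/{location}/processors/{processor_id}"


def process_endpoint(project_id: str, location: str, processor_id: str) -> str:
    """Build the regional REST URL for the processor's process method."""
    name = processor_resource_name(project_id, location, processor_id)
    return f"https://{location}-documentai.googleapis.com/v1/{name}:process"


# Example with made-up IDs; "us" is the region chosen when the processor was created.
print(process_endpoint("my-project", "us", "abc123"))
```

The location is the region selected when the processor was created (for example us or eu), so it is worth noting it down together with the two IDs.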

Example script

import base64

# pq (Peliqan) and st (Streamlit) are assumed to be preloaded in Peliqan scripts
google_document_ai_api = pq.connect('Google Document AI')

# Let the user upload a single PDF file
allowed_upload_file_type = 'pdf'
uploaded_file = st.file_uploader(
    f"Upload {allowed_upload_file_type} file",
    accept_multiple_files=False,
    type=[allowed_upload_file_type],
)

if uploaded_file is not None:
    # The connector takes the file content as a base64-encoded string
    file_contents = uploaded_file.read()
    file_base64 = base64.b64encode(file_contents).decode('utf-8')

    params = {
        'base64_content': file_base64,
        'mimetype': 'application/pdf',
    }

    # Send the document to the processor and display the raw response
    result = google_document_ai_api.get('process_document', params)
    st.json(result)
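
The request parameters built in the script can also be factored into a small helper that rejects obviously invalid uploads before anything is sent to the processor. This is only a sketch: build_process_params is a hypothetical name, the keys base64_content and mimetype are taken from the script above, and the check for a %PDF- header is an added assumption about what should count as a valid PDF upload.

```python
import base64

# Most PDF files begin with this marker; used here as a cheap sanity check.
PDF_MAGIC = b"%PDF-"


def build_process_params(file_contents: bytes, mimetype: str = "application/pdf") -> dict:
    """Return the params dict for 'process_document' (hypothetical helper)."""
    if not file_contents:
        raise ValueError("file is empty")
    if mimetype == "application/pdf" and not file_contents.startswith(PDF_MAGIC):
        raise ValueError("file does not look like a PDF")
    return {
        # Raw bytes are base64-encoded and decoded to str so they can be sent as text.
        "base64_content": base64.b64encode(file_contents).decode("utf-8"),
        "mimetype": mimetype,
    }


# Example with a tiny fake PDF payload
params = build_process_params(b"%PDF-1.7\nexample")
print(params["mimetype"], len(params["base64_content"]))
```

In the script, the call would become params = build_process_params(uploaded_file.read()), with a ValueError shown to the user instead of sending an unusable file.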

