Auto-completing Insurance Applications

Today I am going to look at the first of several scenarios where we have captured customer information within our database, and we need to send a down-stream insurer the applicant's information. However, the information is in the insurer's own proprietary format. For those that don't know the process, here's a typical scenario. The agency creates a template, usually in Word, PDF or Excel, and the questions are "customer friendly". The agent either works with the customer or the customer alone completes this "agent questionnaire", which is then sent to the agent. The agent uses the information in the questionnaire to complete one of several proprietary insurance applications, where there is a unique format for each of the insurers with whom the agent is appointed. For each application, the agent must copy and paste the information from their questionnaire into the application. This copying and pasting is the purpose of this article.

One of the template formats that I just received is Excel. This particular insurer uses Excel as an application template and includes questions to capture all of the information required for this insurer to be able to underwrite an applicant. This is a proprietary format, and this insurer will not accept applications in any other format but their own (which seems silly because they are otherwise turning away business, but that is a different issue). Here, we are going to use the information that I captured from the front-end web application and stored in my database to populate the Excel template automatically.

Excel now uses a open document format for its file structure, but adds Excel "features" to the format in order to leverage Excel's built-in capabilities. What this means is that the format is more or less just a zip file with a bunch of XML documents within the zip folder. You can see this by changing the name of the Excel file from .xlsx to .zip and then opening the file in Windows Explorer. The same can be done with Apache's OpenOffice .ods files. Within the new zip file, you will see the various .xml files that correspond to spreadsheets, layout, themes, colors, functions, and a lot of other features. Because our template is already written for us, we just need to unzip the file, find the correct spreadsheet and cell reference for the value we want to change, change it to our value, re-zip the folder with our changes into a new Excel file.

Before getting into all of that Python stuff, I want to try to accomplish the same actions using a Python module called "openpyxl" by Eric Gazoni and Charlie Clark. As always, I'll start with the instructions and then write the test case before writing out the code. Our goal is to transpose a record from my database into the Excel template provided by my insurance carrier. For now, the record will be in either .csv or JSON format to mimic the database.

ExcelWriter install XlsxWriter

Here is a snippet of the Excel template into which I am going to be writing:

ExcelWriter Template Snippet

And, here is my JSON data source file (for now):

{
    "applicationdata": [{
        "id": 1,
        "template_label": "Brokerage Name",
        "template_value_cell": "D9",
        "value": "Sample Broker Company, Inc."
    }]
}

From the sample JSON, you can see that I am going to attempt to change cell C9 to "Sample Broker Company, Inc.", so I will write a test case that can read the Excel file and check the value in cell D9.

import unittest
import ew1
from openpyxl import Workbook, load_workbook

class TestChatbot1(unittest.TestCase):

    def test_check_cell_value(self):
        workbook=load_workbook(filename="test_excel.xlsm") # read the file
        sheet=workbook["Broker & Insured Information"]
        self.assertEqual(sheet["D9"].value, "Sample Broker Company, Inc.") # test

if __name__ == '__main__':
    unittest.main()

This test case will open the file, select the worksheet, and test the value of cell D9 to see if it matches "Sample Broker Company, Inc."

For the code, we need to add a couple of steps. First, we need to open and read the JSON file. Second, we need to iterate through the JSON file to populate the variables for our cell and value combination.

import json
from openpyxl import Workbook, load_workbook

filename = "data_source.json"

with open(filename, "r") as applicationfile:
    data = json.load(applicationfile)

for line in data['applicationdata']:
    cell_to_change = line['template_value_cell']
    value_to_change = line['value']

workbook=load_workbook(filename="application.xlsm")
sheet=workbook["Broker & Insured Information"]
sheet[cell_to_change] = value_to_change
workbook.save("test_excel.xlsm")

And running this code against the test gives the following:

ExcelWriter Template Snippet

We are on the right track! Next, we want to write the code once, but fire for any value in the JSON source. We need to add the title of the JSON file to include the worksheet title and re-run. This will then select each of the values that we are changing from our datasource. The unit test can remain the same, but we need to modify the JSON to include the following line:

"template_worksheet": "Broker & Insured Information", 

And modify the code to work as follows:

for line in data['applicationdata']:
    target_worksheet = line['template_worksheet']
    cell_to_change = line['template_value_cell']
    value_to_change = line['value']

workbook=load_workbook(filename="application.xlsm")
sheet=workbook[target_worksheet]
sheet[cell_to_change] = value_to_change

Re-running the test case shows all is good:

ExcelWriter Template Snippet

The concept is proven and it looks like the package is going to work for this purpose. NOTE: I started with an .xlsm file that included an image for the company logo at the top, however:

ExcelWriter Template Snippet

and this was causing the .xlsm to fail while trying to open it in Excel. Instead, I wrote to a new file as an .xlsx extension, and although it complained about the dropped image, it seems to be opening ok.

The next step is going to be to write out a function that will iterate over all of the values in the JSON and then write the corresponding file. With a few modifications, we can adjust the code to call a simple function to select the worksheet and cell for multiple values in the JSON (we can still use the same test case to test).

import json
from openpyxl import Workbook, load_workbook

filename = "data_source.json"
workbook=load_workbook(filename="application.xlsm")

def write_value_from_json(workbook, target_sheet, cell, value):
    sheet = workbook[target_sheet]
    sheet[cell] = value

with open(filename, "r") as applicationfile:
    data = json.load(applicationfile)

for line in data['applicationdata']:
    sheet = line['template_worksheet']
    cell = line['template_value_cell']
    value = line['value']
    write_value_from_json(workbook, sheet, cell, value) 

workbook.save("test_excel.xlsx")

And the JSON includes all of the fields for that first box:

{
    "applicationdata": [
    {
        "template_worksheet": "Broker & Insured Information",
        "template_label": "Brokerage Name",
        "template_value_cell": "D9",
        "value": "Sample Broker Company, Inc."
    },    
    {
        "template_worksheet": "Broker & Insured Information",
        "template_label": "Address",
        "template_value_cell": "D10",
        "value": "123 East West St. Suite #567"
    },    
    {
        "template_worksheet": "Broker & Insured Information",
        "template_label": "City",
        "template_value_cell": "D11",
        "value": "Anywhere"
    },    
    {
        "template_worksheet": "Broker & Insured Information",
        "template_label": "Broker Contact Name",
        "template_value_cell": "D12",
        "value": "Robert Dundas"
    },    
    {
        "template_worksheet": "Broker & Insured Information",
        "template_label": "State",
        "template_value_cell": "F11",
        "value": "CA"
    },    
    {
        "template_worksheet": "Broker & Insured Information",
        "template_label": "Zip Code",
        "template_value_cell": "H11",
        "value": "92123"
    },    
    {
        "template_worksheet": "Broker & Insured Information",
        "template_label": "Does your brokerage control the account?",
        "template_value_cell": "G12",
        "value": "YES"
    }
]
}

Running the test case yields the following:

ExcelWriter Template Snippet

And when we open the file:

ExcelWriter Template Snippet

Perfect!

I am not really leveraging the JSON structured format by flattening sheets, cells and values into a single group of rows. I suppose we could create a section for each spreadsheet with a simple cell/value pair for each of the elements in the template. But, I think I am good with this.

So, how does this "automate the agency"? Assuming that the insurance agency is capturing customer application information through a website and saving that information to a database, one could write a very simple program to pull the information out of the database, as I have done with the JSON file, and populate a proprietary application template from one of the agency's list of carriers who is using Excel. Other carriers may use PDF and Word, which will be different articles. If there are three carriers and each uses a different format; i.e. PDF, Word and Excel, we just implemented a 33% efficiency improvement for the activity of cutting and pasting from the questionnaire to the carrier's proprietary application format! From experience, I can say that one of these templates would take about 20 minutes to complete depending on the complexity of the applicant's questionnaire.

Now, if I could just get the carrier to use Google Sheets for collaborative worksheets, or accept a JSON to import into their back-end system, I'd be set! But, for now, I'll consider this battle as "won".

Let me know your thoughts on Twitter.