Generate Insights



The generate_insights tool adds explanatory insights, generated with the OpenAI API, to different kinds of data. The input can be generic data provided by the user, such as a table or a bar chart, or an output file generated by one of our tools. The following tasks are currently implemented:

  • generic_insights: For any table provided by the user, this tool returns a series of bullet points with insights about the data.

  • partial_dependence: Given the data frame containing the partial dependence evaluations, df_pdp.csv, generated in the Train Classification function, this tool provides a textual explanation of each potential one-dimensional partial dependence graph available.

  • drivers_barriers: This tool starts from the table of drivers and barriers, df_db.csv, generated in the Train Classification or Predict Classification functions. To every row, it adds a textual description explaining which inputs contribute the most, both positively and negatively, to the target taking a specific value. Executions are currently limited to 15 rows at a time.
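Because drivers_barriers executions are limited to 15 rows, a larger df_db.csv has to be processed in several executions. The following is a minimal sketch of one way to split a CSV into batches, assuming the limit counts data rows and that each batch should repeat the header. The helper split_csv_rows and the sample column names are hypothetical and not part of the Shimoku SDK.

```python
import csv
from io import StringIO

MAX_ROWS = 15  # per-execution row limit stated for drivers_barriers


def split_csv_rows(csv_text, max_rows=MAX_ROWS):
    """Split CSV text into encoded batches of at most max_rows data rows.

    Hypothetical helper: every batch repeats the header row so it can be
    read as a standalone table.
    """
    rows = list(csv.reader(StringIO(csv_text)))
    if not rows:
        return []
    header, body = rows[0], rows[1:]
    batches = []
    for start in range(0, len(body), max_rows):
        buffer = StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerow(header)
        writer.writerows(body[start:start + max_rows])
        batches.append(buffer.getvalue().encode())
    return batches


# Made-up table with 32 data rows, for illustration only
sample = "input,value\n" + "".join(f"x{i},{i}\n" for i in range(32))
batches = split_csv_rows(sample)
print(len(batches))
```

Each batch is returned as bytes, the same form produced by to_csv(index=False).encode() in Step 2, so every batch could be uploaded and executed separately.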

Version 1.0.0 of the tool requires the user to have access to the OpenAI model gpt-4-1106-preview.

Step 0: Get ready

Make sure you have followed these steps first: Setup and Requirements

Step 1: Initialize the Client and set up your workspace

Import necessary libraries and initialize the Shimoku client with your credentials. Define a workspace and a menu path for organizing your AI models.

import os
import time
from io import StringIO
import pandas as pd
from shimoku import Client


s = Client(


menu_path_name = "insights"

Note: you must have your SHIMOKU_TOKEN, UNIVERSE_ID, WORKSPACE_ID, OPENAI_API_KEY and OPENAI_ORG_ID saved as environment variables.
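Before initializing the client, it can help to confirm that all five variables are actually set. The following is a minimal sketch using only the standard library. The helper missing_env_vars is hypothetical and not part of the Shimoku SDK.

```python
import os

# Variable names taken from the note above
REQUIRED_VARS = (
    "SHIMOKU_TOKEN",
    "UNIVERSE_ID",
    "WORKSPACE_ID",
    "OPENAI_API_KEY",
    "OPENAI_ORG_ID",
)


def missing_env_vars(names=REQUIRED_VARS, env=None):
    """Return the names that are unset or empty in the environment mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("All required environment variables are set.")
```

Failing early with a clear message is usually easier to debug than an authentication error raised later in the run.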

Steps 2 to 5 depend on the task you want to perform. The steps below cover the generic_insights task.

Step 2: Prepare and Upload Your Data

Upload the table you want insights about. No particular format is required.

input_file = pd.read_csv('./input_data.csv')
        input_files={'input_data': input_file.to_csv(index=False).encode()},

Step 3: Execute the Generate Insight Function

Call the insight generator function and adjust the arguments for the generic_insights task.

run_id =

ai_function: str Label for this functionality; it must be 'generate_insights'.

openai_api_key: str Your OpenAI unique API key.

openai_org_id: str Your OpenAI organization id.

task: str 'generic_insights' requests insights about a table in any format.

data: str Name chosen in create_input_files to refer to the table ('input_data' in Step 2).

Step 4: Monitor the Process

Wait for the insights to be generated and the outputs to be uploaded.

attempts = 20
wait = 60

for _ in range(attempts):
    try:
        results =
        if results:
            print("Successfully obtained the output.")
            break  # Exit the loop if results are obtained
    except Exception:
        pass  # Ignore errors and continue
    time.sleep(wait)  # Wait before retrying
else:
    # Runs only if the loop finished without hitting break
    print("Failed to obtain the output after the maximum number of attempts.")
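The retry pattern above can also be wrapped in a reusable function. The following is a minimal sketch, assuming the output retrieval is passed in as a callable. The names poll_for_results and fetch are hypothetical, and fetch stands in for whichever call returns the output files for your run.

```python
import time


def poll_for_results(fetch, attempts=20, wait=60, sleep=time.sleep):
    """Call fetch() up to `attempts` times, pausing `wait` seconds between tries.

    Returns the first truthy result, or None if every attempt fails
    or comes back empty.
    """
    for attempt in range(attempts):
        try:
            results = fetch()
            if results:
                return results
        except Exception as exc:
            # Report the error instead of silently ignoring it
            print(f"Attempt {attempt + 1} failed: {exc!r}")
        if attempt < attempts - 1:
            sleep(wait)  # No need to wait after the final attempt
    return None


# Stand-in fetch that succeeds on the third call, for illustration only
state = {"calls": 0}


def fake_fetch():
    state["calls"] += 1
    return {"insights.txt": [b"ready"]} if state["calls"] >= 3 else {}


results = poll_for_results(fake_fetch, attempts=5, wait=0)
print(results is not None)
```

Unlike the loop above, this version prints each exception rather than discarding it, which makes a misconfigured run easier to diagnose. With the defaults of 20 attempts and a 60-second wait, it spends at most 19 minutes waiting, because it does not pause after the final attempt.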

Step 5: Access the GPT insights

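Once execution is complete, the insights are available in the insights.txt output file. Its content is returned as bytes, so decode it to get the text.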

insights = results['insights.txt'][0].decode()
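Since the generic_insights task returns a series of bullet points, it may be convenient to split the decoded text into a list. The following is a minimal sketch, assuming each bullet starts with "-", "*" or "•". The exact output format is not specified here, so treat the markers as an assumption. The helper split_bullets and the sample text are hypothetical.

```python
def split_bullets(text, markers=("-", "*", "•")):
    """Split decoded insight text into a list of bullet strings.

    Assumption: a bullet starts with one of `markers`. Lines without a
    marker are treated as continuations of the previous bullet.
    """
    bullets = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # Skip blank lines
        if stripped.startswith(markers):
            bullets.append(stripped[1:].strip())
        elif bullets:
            bullets[-1] += " " + stripped
        else:
            bullets.append(stripped)
    return bullets


# Made-up stand-in for the `results` mapping, for illustration only
sample_results = {
    "insights.txt": [b"- Sales peak in Q4.\n- Region B lags\n  behind Region A.\n"]
}
sample_insights = sample_results["insights.txt"][0].decode()
for bullet in split_bullets(sample_insights):
    print(bullet)
```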
