Code Summary with GPT-4oMini
OpenAIClient
Class to interact with the OpenAI API to generate documentation using AI based on a prompt and source code.
The class contains the following methods:
- call_openai: Authenticate to and call the OpenAI API to generate documentation based on the provided prompt and source code.
- save_results: Save the generated results to a specified output table.
Example
from fleming.code_summary.fourO_mini_summary import call_openai
from pyspark.sql import SparkSession
# Not required if using Databricks
spark = SparkSession.builder.appName("openai_client").getOrCreate()
spark_input_df = "your_spark_input_df"
output_table_name = "your_output_table"
prompt = "The following code is the contents of a repository, generate a short summary paragraph describing what the repository purpose is. A paragraph detailing the key functionalities and technologies integrate with and a list of key words associated with this repository underneath. Focus on the purpose of the code contained in the repository, and the technologies, data and platforms it integrates with"
api_key = "your_api_key"
endpoint = "https://api.openai.com/yourendpointhere"
headers = {
"Content-Type": "application/json",
"api-key": api_key,
}
client = OpenAIClient(spark, delta_table, output_table_name, prompt, api_key, endpoint, headers)
client.call_openai()
Parameters:
Name | Type | Description | Default |
---|---|---|---|
spark |
SparkSession
|
Spark Session |
required |
input_spark_df |
DataFrame
|
Source spark DataFrame containing the input data |
required |
output_table_name |
str
|
Name of the output table to save results |
required |
prompt |
str
|
Prompt to send to the OpenAI API |
required |
api_key |
str
|
API key for OpenAI |
required |
endpoint |
str
|
Endpoint for OpenAI API |
required |
headers |
dict
|
Headers for the API request |
required |
Source code in src/fleming/code_summary/fourO_mini_summary.py
6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 |
|
call_openai(title, concatenated_content, total_token_count)
Call the OpenAI API to generate summarised content based on the provided prompt and source content.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
title |
str
|
Column name for column containing summarised text title |
required |
concatenated_content |
str
|
Column name for column containing concatenated content |
required |
total_token_count |
str
|
Column name for column containing total token count |
required |
Returns:
Name | Type | Description |
---|---|---|
results_df |
DataFrame
|
PySpark DataFrame containing summarisation of each entry |
Source code in src/fleming/code_summary/fourO_mini_summary.py
display_results()
Display the generated results.
Returns:
Name | Type | Description |
---|---|---|
results_df |
pyspark_df
|
returns image of dataframe |
Source code in src/fleming/code_summary/fourO_mini_summary.py
save_results()
Save the generated results to the specified output table.
Returns:
Type | Description |
---|---|
None
|
None |