Run Databao
Before you begin
Initialize an agent
An agent in Databao acts as the main interface for database connections and context.
To initialize an agent, use the following code:
import databao
from databao import LLMConfig
llm_config = LLMConfig(name="gpt-4o-mini", temperature=0)
agent = databao.new_agent(llm_config=llm_config)
Add data sources
You can add configured database connections and data frames to the agent as follows:
# Add a database connection
agent.add_db(conn)
# Add a dataframe
agent.add_df(df)
# Add a dataframe with context as a string
agent.add_df(df, context="some context")
# Add a dataframe with context as a file
# If you add context as a file, make sure to import Path (`from pathlib import Path`)
agent.add_df(df, context=Path("context.md"))
Create a thread and ask a question
Threads are conversations within an agent. You can think of each thread as a single chat in ChatGPT or Claude. Every thread has its own message history, so you can ask follow-up questions based on a previous answer in the same thread.
- To start a conversation thread:
    thread = agent.thread()
- To ask a question and get a dataframe as a response, use the df() method:
    df = thread.ask("List all the shows produced in Germany").df()
    print(df.head())
- To ask a question and get a text response, use the text() method in either of the following ways:
    # Option 1: ask first, then read the text from the thread
    thread.ask("List all the shows produced in Germany")
    print(thread.text())
    # Option 2: call text() directly on the result of ask()
    print(thread.ask("List all the shows produced in Germany").text())
- To generate a visualization, use the plot() method on the thread:
    plot = thread.plot("Create a bar chart of shows by country")
  Databao uses Vega-Lite to generate visualizations, and you can specify any supported chart or plot type in your prompt.
  You can also access the generated plot code as follows:
    print(plot.code)
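For readers unfamiliar with Vega-Lite, the following sketch shows roughly what a bar chart specification looks like when written out by hand as a Python dictionary. It is not output produced by Databao: the country counts are made-up illustrative values, and the exact form of the code that Databao generates may differ.

```python
import json

# Hypothetical data: these counts are invented for illustration only.
rows = [
    {"country": "Germany", "shows": 3},
    {"country": "France", "shows": 2},
    {"country": "Japan", "shows": 5},
]

# A minimal Vega-Lite bar chart: one bar per country, bar height = number of shows.
spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "description": "Shows by country (illustrative data)",
    "data": {"values": rows},
    "mark": "bar",
    "encoding": {
        "x": {"field": "country", "type": "nominal"},
        "y": {"field": "shows", "type": "quantitative"},
    },
}

# Serialize the specification to JSON, the format Vega-Lite tools consume.
print(json.dumps(spec, indent=2))
```

Changing the "mark" value (for example to "line" or "point") is how a Vega-Lite specification switches chart types, which is why naming the chart type in your prompt matters.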
Chain questions
Because all questions in a thread share the same context, you can chain them if your data processing flow requires several steps:
thread \
.ask("List all the shows produced in Germany") \
.ask("Sort them by the year") \
.df()
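The chained call above works in the usual fluent style, where each ask() call returns an object that accepts further calls. The sketch below illustrates that pattern with a hypothetical ToyThread class. It is not Databao's implementation and it assumes that ask() returns the thread itself. It also shows that wrapping the chain in parentheses is an alternative to trailing backslashes.

```python
class ToyThread:
    """Hypothetical stand-in for a thread; not Databao's implementation."""

    def __init__(self):
        # Each thread keeps its own message history.
        self.history = []

    def ask(self, question):
        # Record the question, then return the thread so calls can be chained.
        self.history.append(question)
        return self

    def df(self):
        # A real thread would return a dataframe; this toy returns the
        # recorded questions as a list of rows instead.
        return [
            {"step": i, "question": q}
            for i, q in enumerate(self.history, start=1)
        ]


# Parentheses allow a multi-line chain without backslashes.
result = (
    ToyThread()
    .ask("List all the shows produced in Germany")
    .ask("Sort them by the year")
    .df()
)
print(result)

# A second thread starts with an empty history of its own.
other = ToyThread()
print(other.history)
```

Because each question is appended to the same history, the second question can refer to "them" and rely on the first question for meaning.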