Automating Exploratory Data Analysis (EDA) with a Lyzr Agent

Prajjwal Sule
GenAI Agents Unleashed
4 min read · Mar 19, 2024

In today’s data-driven world, organizations constantly seek efficient ways to analyze and derive insights from their data. Traditional data analysis methods often require specialized skills and can be complex and time-consuming.

However, with Lyzr’s Automate EDA (Exploratory Data Analysis), data analysis becomes more intuitive and accessible. In this article, we’ll explore how the Lyzr SDK simplifies the process by providing a conversational interface for interacting with data.

Automating the EDA process with the Lyzr SDK

Setting Up the Environment

Before diving into the development of Lyzr Automate EDA, let’s first set up our environment.

python3 -m venv venv
source venv/bin/activate

Next, we’ll install the necessary libraries: Lyzr, Streamlit for building the web application interface, and python-dotenv for loading the .env file.

pip3 install lyzr streamlit python-dotenv

We’ll also store the OpenAI API key in a .env file. The application loads it at startup and passes it to Lyzr, which needs it for its natural language processing capabilities.

# Create a .env file containing:
apikey = "your OpenAI API key"

# In the application script:
# Importing necessary libraries and modules
import os
from PIL import Image
from pathlib import Path
import streamlit as st
from utils import utils  # project-local helpers; not shown in this article
from dotenv import load_dotenv
from lyzr import DataConnector, DataAnalyzr

# Load the variables defined in .env into the process environment
load_dotenv()

# Setting up environment variables
apikey = os.getenv('apikey')

# Creating directories for data and plots
data = "data"
plot = "plot"
os.makedirs(data, exist_ok=True)
os.makedirs(plot, exist_ok=True)

The script imports the DataConnector and DataAnalyzr classes from the Lyzr library. These classes provide functionality for connecting to data sources and performing exploratory data analysis, respectively.
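The setup code reads the key with os.getenv('apikey'), which returns None when the variable is unset, so a missing key is not caught at that point in the script. A minimal sketch of failing fast instead is shown below. The helper name require_env is hypothetical; it is not part of Lyzr or python-dotenv.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`.

    Hypothetical helper: raises RuntimeError if the variable is unset or empty,
    so a missing key is reported at startup.
    """
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name!r} is not set; add it to your .env file."
        )
    return value
```

In the application this could replace the plain lookup, for example apikey = require_env("apikey"), placed after the load_dotenv() call.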

The script then creates directories named “data” and “plot” using the os.makedirs() function. These directories store the uploaded data files (such as CSV files) and the generated plots, respectively.

The exist_ok=True parameter tells os.makedirs() not to raise an error when a directory already exists. This matters here because Streamlit reruns the script on every user interaction, so these calls execute many times.
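The effect of exist_ok can be seen in a small standalone example. It uses a temporary directory rather than the application's data folder, so it leaves nothing behind.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "data")

    os.makedirs(target, exist_ok=True)  # creates the directory
    os.makedirs(target, exist_ok=True)  # directory exists: no error

    try:
        os.makedirs(target)  # exist_ok defaults to False
    except FileExistsError:
        raised = True
    else:
        raised = False

print(raised)  # True
```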

Creating the User Interface

We’ll design the interface to be user-friendly and intuitive, allowing users to seamlessly upload their data, explore dataset descriptions, receive analysis recommendations, run custom queries, and visualize insights — all through a conversational interface.

# Automate EDA Application
def data_uploader():
    st.title("Data")
    st.subheader("Upload CSV file for analysis")

    # Upload csv file
    uploaded_file = st.file_uploader("Choose csv file", type=["csv"])
    if uploaded_file is not None:
        utils.save_uploaded_file(uploaded_file)
    else:
        utils.remove_existing_files(data)
        utils.remove_existing_files(plot)

This function lets users upload a CSV file for analysis in the Automate EDA application. When a file is selected, utils.save_uploaded_file stores it. When no file is selected, the data and plot directories are cleared, which removes leftovers from an earlier upload.
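The utils helpers called above are not listed in this article. The sketch below shows one way they could look. It is an assumption, not the project's actual code: the function names match the calls in the article, but the bodies, the directory parameter, and the choice to keep only one uploaded file at a time are illustrative.

```python
from pathlib import Path


def remove_existing_files(directory):
    """Delete every regular file directly inside `directory` (hypothetical helper)."""
    folder = Path(directory)
    if not folder.is_dir():
        return
    for entry in folder.iterdir():
        if entry.is_file():
            entry.unlink()


def save_uploaded_file(uploaded_file, directory="data"):
    """Write an uploaded file into `directory` and return its path.

    Assumes `uploaded_file` has a `.name` attribute and a `.getbuffer()` method,
    as Streamlit's uploaded-file objects do. Earlier uploads are removed first,
    so the directory holds at most one file (an assumption of this sketch).
    """
    folder = Path(directory)
    folder.mkdir(parents=True, exist_ok=True)
    remove_existing_files(folder)
    # Keep only the final path component to avoid writing outside `directory`
    target = folder / Path(uploaded_file.name).name
    target.write_bytes(uploaded_file.getbuffer())
    return target


def get_files_in_directory(directory):
    """Return the paths of regular files in `directory` as sorted strings."""
    return sorted(str(p) for p in Path(directory).iterdir() if p.is_file())
```

Keeping a single file in the data directory fits the way the analysis step later picks the first file it finds.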

Automating EDA with Lyzr

Now, let’s delve into the core functionality of our application — the Automate EDA powered by Lyzr. We’ll utilize Lyzr’s DataConnector and DataAnalyzr classes to fetch and analyze data from CSV files.

Users can explore dataset descriptions, receive analysis recommendations, run custom queries, and visualize insights effortlessly.

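# Analyzr Functionality
def analyzr():
    # Use the first file found in the data directory
    path = utils.get_files_in_directory(data)
    path = path[0]
    # Load the CSV into a dataframe and wrap it in a DataAnalyzr instance
    dataframe = DataConnector().fetch_dataframe_from_csv(file_path=Path(path))
    analyzr = DataAnalyzr(df=dataframe, api_key=apikey)
    return analyzr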

Implementation of the DataAnalyzr Functionalities

def display_description(analyzr):
    description = analyzr.dataset_description()
    if description is not None:
        st.subheader("Dataset Description:")
        st.write(description)

def display_recommended_analysis(analyzr):
    analysis = analyzr.analysis_recommendation()
    if analysis is not None:
        st.subheader("Analysis Recommendations:")
        st.write(analysis)

def display_queries(analyzr):
    queries = analyzr.ai_queries_df()
    if queries is not None:
        st.subheader("AI-Generated Queries:")
        st.write(queries)

def display_analysis(analyzr):
    query = st.text_input("Write your query")
    if st.button("Submit"):
        analysis = analyzr.analysis_insights(user_input=query)
        if analysis is not None:
            st.subheader("Analysis Insights:")
            st.write(analysis)

def display_recommendation(analyzr):
    query = st.text_input("Write your query")
    if st.button("Submit"):
        # Note: this calls analysis_insights(), the same method used in
        # display_analysis(); only the heading differs.
        recommendation = analyzr.analysis_insights(user_input=query)
        if recommendation is not None:
            st.subheader("Recommendations:")
            st.write(recommendation)

def visualization_for_analysis(analyzr):
    query = st.text_input("Write your analysis query")
    if st.button("Submit"):
        # Clear old plots so only the new ones are shown
        utils.remove_existing_files(plot)
        visualization = analyzr.visualizations(user_input=query, dir_path=Path('./plot'))
        plot_files = os.listdir("./plot")
        for plot_file in plot_files:
            st.subheader(f'Visualization: {query}')
            st.image(f"./plot/{plot_file}")

These functions provide interactive capabilities for users to explore dataset descriptions, recommended analyses, AI-generated queries, analysis insights, recommendations, and visualizations based on their custom queries, all facilitated through the DataAnalyzr object.
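The visualization function passes every entry returned by os.listdir to st.image, so any non-image file that ends up in the plot directory (a hidden file, for instance) would be handed to st.image as well. A minimal sketch of filtering the directory first is shown below. The helper name list_plot_images and the set of accepted suffixes are assumptions made for illustration.

```python
from pathlib import Path

# Suffixes treated as images in this sketch (an assumption; adjust as needed)
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def list_plot_images(plot_dir):
    """Return image files in `plot_dir`, sorted by path; other entries are ignored."""
    directory = Path(plot_dir)
    if not directory.is_dir():
        return []
    return sorted(
        p for p in directory.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
```

Inside the visualization function, the loop could then iterate over list_plot_images(plot) and call st.image(str(image)) for each result.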

Executing the application

if __name__ == "__main__":
    st.sidebar.title("Automate EDA")
    selection = st.sidebar.radio("Go to", ["Data", "Analysis"])

    if selection == "Data":
        data_uploader()
    elif selection == "Analysis":
        # file_checker() is not defined in this article; it is expected to
        # return the list of uploaded files in the data directory.
        file = file_checker()
        if len(file) > 0:
            # Use a distinct name so the analyzr() function is not shadowed
            data_analyzr = analyzr()
            # create buttons
            st.header("Select an Action")
            options = ["Select", "Description", "Recommended Analysis", "Queries", "Analysis", "Recommendation", "Visualization"]
            selected_option = st.radio("Select an option", options)

            if selected_option == "Description":
                display_description(data_analyzr)
            elif selected_option == "Recommended Analysis":
                display_recommended_analysis(data_analyzr)
            elif selected_option == "Queries":
                display_queries(data_analyzr)
            elif selected_option == "Analysis":
                display_analysis(data_analyzr)
            elif selected_option == "Recommendation":
                display_recommendation(data_analyzr)
            elif selected_option == "Visualization":
                visualization_for_analysis(data_analyzr)
This block drives the navigation flow of the Automate EDA application. The sidebar switches between the Data page, where users upload a CSV file, and the Analysis page, where a radio button selects which DataAnalyzr feature to run on the uploaded data.
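The entry point calls file_checker(), which this article does not define. A minimal sketch of what such a function could look like is shown below, assuming its job is to list the files currently in the data directory. The body and the directory parameter are assumptions, not the project's actual code.

```python
import os


def file_checker(directory="data"):
    """Return the paths of regular files in `directory`.

    Hypothetical sketch: returns an empty list when the directory is missing
    or contains no files, so the caller can test len(result) > 0.
    """
    if not os.path.isdir(directory):
        return []
    return [
        os.path.join(directory, name)
        for name in sorted(os.listdir(directory))
        if os.path.isfile(os.path.join(directory, name))
    ]
```

With a helper like this, the Analysis page only builds a DataAnalyzr instance once at least one file has been uploaded.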

The Lyzr SDK streamlines the data analysis workflow by offering a conversational approach. Users can explore, analyze, and derive insights from their data through a simple interface, with little specialized expertise. Lyzr Automate EDA makes data analysis more accessible.

References

For further exploration and engagement, refer to Lyzr’s website, book a demo, or join the community channels on Discord and Slack.

GitHub — Automate EDA by Lyzr

Lyzr Website: https://www.lyzr.ai/

Book a Demo: https://www.lyzr.ai/book-demo/

Lyzr Community Channels: Discord and Slack
