Generate a GEXF file from text with ChatGPT and visualize it in Gephi and NodeXL
By Dr. Verónica Espinoza, 2024
▪Twitter (X) @Verukita1 ▪LinkedIn: Dra. Verónica Espinoza ▪website: www.nethabitus.org
What will we review in this story?
In this story I will share with you how I used ChatGPT to generate a GEXF file from a text. Also, I’ll show you how I saved the result and how I visualized it in Gephi and NodeXL. I did this exercise with the free version of ChatGPT.
If you want to convert text into a GEXF file, there are specialized tools for the job, such as Nocodefunctions, developed by Clément Levallois [1,2]. Still, I found this exercise interesting, so I decided to share it with you in case you want to explore this alternative.
Limitations: This approach works for short texts; when the text is long, ChatGPT may return an error. The results may vary even when the PROMPT and the text are the same. It may also be less efficient or accurate than specialized tools for generating semantic networks.
Learn more about the Nocodefunctions tool in this tutorial I wrote on Medium.
Tools used in this story
ChatGPT
ChatGPT is a chatbot and virtual assistant developed by OpenAI and launched on November 30, 2022. Based on large language models (LLMs), it enables users to refine and steer a conversation towards a desired length, format, style, level of detail, and language. Successive user prompts and replies are considered at each conversation stage as context [3].
ChatGPT is built on OpenAI’s proprietary series of generative pre-trained transformer (GPT) models and is fine-tuned for conversational applications using a combination of supervised learning and reinforcement learning from human feedback [3].
Gephi
Gephi is a tool for data analysts and scientists keen to explore and understand graphs. Like Photoshop™ but for graph data, the user interacts with the representation and manipulates the structures, shapes and colors to reveal hidden patterns [4].
The goal is to help data analysts form hypotheses, intuitively discover patterns, and isolate structural singularities or faults during data sourcing. It is a complementary tool to traditional statistics, as visual thinking with interactive interfaces is now recognized to facilitate reasoning. It is software for Exploratory Data Analysis, a paradigm that emerged in the Visual Analytics field of research [4, 5].
😉Learn more about Gephi in this story I wrote: What is Gephi? Meet this useful network analysis tool.
🌐 Gephi Website: https://gephi.org/
NodeXL
The Social Media Research Foundation is the home of the network analysis tool NodeXL and is supported by a global network of academics from a wide variety of disciplines [6,7].
- NodeXL-Pro adds menus and features to Excel to simplify the tasks of getting network data, storing it, analyzing and visualizing it, and generating reports that share insights into connected structures.
- NodeXL-Pro supports all the features needed to conduct professional social network analysis (SNA): community clustering, influencer detection, content analysis, sentiment analysis, time series analysis and a lot more.
😉Learn more about NodeXL in this story I wrote: Meet NodeXL-Pro: one tool, many possibilities!
🌐 NodeXL Website: https://www.smrfoundation.org/nodexl/
Summary of the main steps of this tutorial
Below is an illustration that describes the general steps we will carry out in this tutorial. With the help of ChatGPT we will transform a text into a semantic network in GEXF format, then we will save the generated file and finally we will visualize it in Gephi and NodeXL.
🏁Let's start!
STEP 1. Using ChatGPT to generate the GEXF file.
In this step I wrote a PROMPT with precise instructions for ChatGPT to transform a text into a semantic network in GEXF format. Some of the instructions I gave ChatGPT were the following: generate a network based on the number of co-occurrences between terms in the text, omit term pairs whose co-occurrences were equal to or less than 2, use the NetworkX library, apply English stop words, etc.
For this exercise, I used a paragraph from the Wikipedia article called Netnography.
Below, I show my PROMPT and the text I processed:
My PROMPT:
Generate a file that visualizes a semantic network from the English text that I will provide you. You must generate a GEXF file from the co-occurrences that exist between the terms in the text. Omit co-occurrences equal to or less than 2. Perform this analysis using the NetworkX Python library (and the libraries you need). Apply stop words in English to clean up the text. Give me the result in GEXF file format right here (don’t give me the code I just want the result in GEXF format). The text to analyze is the following:
Important: In this step, you can adjust the requests in the PROMPT according to your needs. For example, you can apply stop words in another language, change the threshold for omitted co-occurrences, indicate whether the terms in the network should be unigrams, bigrams, etc., or select the libraries that you want for the process, among other parameters.
Likewise, you can add to the PROMPT an instruction to include a column with the respective occurrences for the nodes, something like this: “Add a column with the respective occurrences for each node”. Adding this column will allow you to rank the nodes according to their occurrence (I already tried it and it works well).
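If you prefer to see what such an occurrences column might look like in code, here is a minimal sketch with NetworkX. The token list, the toy edges and the attribute name "occurrences" are my own assumptions for illustration; this is not the code ChatGPT runs internally.

```python
from collections import Counter

import networkx as nx

# Hypothetical token list, assumed to be already cleaned of stop words.
tokens = ["netnography", "online", "community", "netnography", "online", "netnography"]
occurrences = Counter(tokens)

# Hypothetical toy network built from the same terms.
graph = nx.Graph()
graph.add_edge("netnography", "online", weight=2)
graph.add_edge("netnography", "community", weight=1)

# Store each term's frequency as a node attribute named "occurrences".
nx.set_node_attributes(graph, dict(occurrences), name="occurrences")

# Serialize to GEXF text; the attribute is declared in the output.
gexf_text = "\n".join(nx.generate_gexf(graph))

for node, data in graph.nodes(data=True):
    print(node, data["occurrences"])
```

When a file like this is opened in Gephi, the node attribute should appear as an extra column that can be used for ranking.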
You could also modify the PROMPT to ask ChatGPT to provide the Python code for you to run directly in Jupyter Lab!
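To give you an idea of what that Python code could look like, here is a minimal sketch that I would expect to run in Jupyter Lab. It rests on several assumptions of mine: terms are unigrams, two terms "co-occur" when they appear in the same sentence, the stop-word list is a tiny illustrative one, and the function name build_cooccurrence_graph and the sample text are hypothetical. The code ChatGPT gives you may define these things differently.

```python
import re
from collections import Counter
from itertools import combinations

import networkx as nx

# Tiny illustrative stop-word list (an assumption; a real run would use a fuller list).
STOP_WORDS = {"a", "an", "and", "are", "as", "by", "for", "in", "is",
              "it", "of", "on", "or", "that", "the", "to", "with"}


def build_cooccurrence_graph(text, min_weight=3):
    """Build a sentence-level co-occurrence graph of unigrams.

    Pairs that co-occur fewer than `min_weight` times are dropped, which
    mirrors "omit co-occurrences equal to or less than 2" when min_weight=3.
    """
    pair_counts = Counter()
    for sentence in re.split(r"[.!?]+", text.lower()):
        # Unique, sorted terms so each pair is counted once per sentence.
        terms = sorted({w for w in re.findall(r"[a-z]+", sentence)
                        if w not in STOP_WORDS})
        pair_counts.update(combinations(terms, 2))

    graph = nx.Graph()
    for (term_a, term_b), count in pair_counts.items():
        if count >= min_weight:
            graph.add_edge(term_a, term_b, weight=count)
    return graph


# Hypothetical sample text, not the Wikipedia paragraph used in this story.
sample_text = (
    "Netnography studies online communities. "
    "Netnography adapts ethnography to online communities. "
    "Researchers use netnography to observe online communities. "
    "Ethnography is older."
)
graph = build_cooccurrence_graph(sample_text)
nx.write_gexf(graph, "semantic_network.gexf")
print(sorted(graph.edges(data="weight")))
```

With this sample text only the three terms that appear together in three sentences survive the threshold. Lowering min_weight keeps more of the network.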
😉Try different versions of the PROMPT according to your visualization needs!
Below is the partial view of the GEXF file generated with ChatGPT for this exercise:
STEP 2. Save the file in GEXF format
The file saving process is described as follows:
Copy the GEXF output generated in the previous step and paste it into a plain-text editor such as Notepad > save the file with a name of your choice, replacing the .txt extension (which appears by default) with the .gexf extension. This way you will have the GEXF file ready to visualize in Gephi or NodeXL.
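If you are comfortable with Python, the same save step can be done in a few lines, with a quick check that the pasted text is well-formed XML before you open it in Gephi or NodeXL. This is only a sketch: the pasted_text below is a hypothetical, minimal GEXF that I wrote as a stand-in for the ChatGPT reply, and the file name is my own choice.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Stand-in for the text copied from the ChatGPT reply (hypothetical, minimal GEXF).
pasted_text = """<?xml version="1.0" encoding="UTF-8"?>
<gexf xmlns="http://www.gexf.net/1.2draft" version="1.2">
  <graph mode="static" defaultedgetype="undirected">
    <nodes>
      <node id="0" label="netnography" />
      <node id="1" label="online" />
    </nodes>
    <edges>
      <edge id="0" source="0" target="1" weight="3" />
    </edges>
  </graph>
</gexf>
"""

# Save with the .gexf extension instead of .txt.
path = Path("semantic_network.gexf")
path.write_text(pasted_text.strip(), encoding="utf-8")

# Parsing raises xml.etree.ElementTree.ParseError if the XML is malformed,
# for example when the reply was cut off halfway.
root = ET.parse(path).getroot()
tag = root.tag.split("}")[-1]  # drop the XML namespace prefix
node_count = sum(1 for el in root.iter() if el.tag.split("}")[-1] == "node")
edge_count = sum(1 for el in root.iter() if el.tag.split("}")[-1] == "edge")
print(tag, node_count, edge_count)
```

A root tag other than gexf, or zero nodes, usually means that extra text from the chat was copied along with the file content.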
Below is an illustration describing this process:
The following gif shows the complete process described in STEP 1 and STEP 2:
STEP 3. Visualize the GEXF file in Gephi and NodeXL.
a) Gephi
In Gephi, open the GEXF file generated with ChatGPT > In the Statistics section, run Modularity > In the Appearance section, color the nodes by modularity class, rank the size of the nodes by degree, and adjust the label font size by degree > In the Layout section, apply ForceAtlas 2 and select Stronger Gravity and Prevent Overlap.
Finally, make other adjustments that you consider necessary to have an effective visualization such as Expansion, Contraction, Noverlap, among others.
Note: this is just an example, you can apply other types of statistics, rank the size of the node and the label by other attributes, and even apply other Layouts! This will depend on your visualization needs.
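If you want to sanity-check the communities and degrees outside Gephi, a rough analogue can be sketched with NetworkX. Please note the assumptions: the toy network is hypothetical, and I use NetworkX's greedy modularity method here because it is deterministic, whereas Gephi's Modularity statistic is based on the Louvain method, so the partitions may differ on real data.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities, modularity

# Hypothetical toy network: two triangles joined by a single bridge edge.
graph = nx.Graph()
graph.add_edges_from([
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("x", "y"), ("y", "z"), ("x", "z"),
    ("c", "x"),
])

# Community detection (analogue of coloring nodes by modularity class).
communities = greedy_modularity_communities(graph)
score = modularity(graph, communities)

# Degree per node (analogue of ranking node size by degree).
degrees = dict(graph.degree())

for index, members in enumerate(communities):
    print(index, sorted(members))
print(round(score, 3), degrees)
```

For a file generated in the earlier steps, the graph could be loaded with nx.read_gexf instead of being built by hand.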
The visualization that results from applying the parameters described above is shown below.
Below is the complete process described at this point in which the GEXF file generated in ChatGPT is imported and displayed in Gephi.
b) NodeXL
The process to visualize the file generated by ChatGPT in NodeXL is described as follows:
Open your NodeXL-Pro template > Import the recipe (number 1) > Import the file generated by ChatGPT in GEXF format (number 2) > run the recipe (number 3).
Note: I have created a recipe for this type of data. You can download it here
The result obtained in NodeXL when applying the recipe is shown.
Below is a gif that exemplifies the complete process to import and visualize in NodeXL the GEXF file generated in ChatGPT.
Conclusion
In this exercise we have reviewed how to generate a GEXF file from text with the help of ChatGPT. Also, we have reviewed how to save the result and how to visualize the file in Gephi and NodeXL.
This exercise could be useful if we want to generate a semantic network from a relatively short text. For example, we could use it to generate semantic networks from abstracts of papers, a section of an article on the Internet, a transcript of a video, or a transcript of an interview, among other uses.
I found this exercise interesting, which is why I decided to share it with you. As I have mentioned, it can be useful if you want to generate GEXF files from short texts. However, if you want to transform a long text into GEXF format, and in a more precise way, I recommend using the Nocodefunctions tool, which I mentioned in the first part of this article.
The PROMPT that I share with you in this story is just an example, but please try to generate several versions of the PROMPT according to your visualization needs.
Limitations: With this exercise, you can process short texts; when the text is long, ChatGPT may return an error. The results may vary even when the PROMPT and the text are the same. It may also be less efficient or accurate than specialized tools for generating semantic networks.
I encourage you to explore for yourself and adapt the PROMPT according to your visualization needs!
😉Thanks for reading this story.
👉Find more stories I have written here
✔Follow me on Twitter (X) @Verukita1
✔LinkedIn: Dra. Verónica Espinoza
✔Website: www.nethabitus.org
Resources
🌐ChatGPT Website
🌐 NodeXL website. 👨🎓 Meet NodeXL-Pro: one tool, many possibilities!
🌐 Gephi website. 👨🎓 What is Gephi? Meet this useful network analysis tool
🌐 Nocodefunctions website. 👨🎓 Transform any text into a semantic network with Nocodefunctions App (in just 4 steps)
References
[1] Explore your data at a click [Internet]. Nocode functions. [cited May 24, 2024]. Available from: https://nocodefunctions.com/
[2] Nocode functions blog. Nocode functions is one year old! [Internet]. [cited May 24, 2024]. Available from: https://nocodefunctions.com/blog/nocodefunctions-is-one-year-old/
[3] ChatGPT. In: Wikipedia [Internet]. 2024 [cited May 24, 2024]. Available from: https://en.wikipedia.org/w/index.php?title=ChatGPT&oldid=1225281008#cite_note-guardianpos-2
[4] Bastian M, Heymann S, Jacomy M. Gephi: an open source software for exploring and manipulating networks. International AAAI Conference on Weblogs and Social Media. 2009.
[5] Gephi - The Open Graph Viz Platform [Internet]. [cited Nov 8, 2022]. Available from: https://gephi.org/
[6] Social Media Research Foundation [Internet]. Social Media Research Foundation. [cited Nov 11, 2022]. Available from: https://www.smrfoundation.org/
[7] Smith MA, Rainie L, Shneiderman B, Himelboim I. Mapping Twitter Topic Networks: From Polarized Crowds to Community Clusters [Internet]. Pew Research Center: Internet, Science & Tech. 2014 [cited Nov 11, 2022]. Available from: https://www.pewresearch.org/internet/2014/02/20/mapping-twitter-topic-networks-from-polarized-crowds-to-community-clusters/