Generate a GEXF file from text with ChatGPT and visualize it in Gephi and NodeXL

By Dr. Verónica Espinoza
May 27, 2024

Twitter (X): @Verukita1 | LinkedIn: Dra. Verónica Espinoza | Website: www.nethabitus.org

Network by the author. Semantic network visualized in Gephi from a file previously generated with ChatGPT

What will we review in this story?

In this story I will share with you how I used ChatGPT to generate a GEXF file from a text. Also, I’ll show you how I saved the result and how I visualized it in Gephi and NodeXL. I did this exercise with the free version of ChatGPT.
If you want to convert text into a GEXF file, there are specialized tools for the job, such as the Nocodefunctions tool developed by Clément Levallois [1,2]. Still, I found this exercise interesting, so I decided to share it with you in case you want to explore this alternative.

Limitations: This approach works for short texts; when the text is long, ChatGPT may return an error. The results may vary even when the PROMPT and the text are the same. It may also be less efficient or accurate than specialized tools for generating semantic networks.

Learn more about the Nocodefunctions tool in this tutorial I wrote on Medium.

Tools used in this story

ChatGPT

ChatGPT is a chatbot and virtual assistant developed by OpenAI and launched on November 30, 2022. Based on large language models (LLMs), it enables users to refine and steer a conversation towards a desired length, format, style, level of detail, and language. Successive user prompts and replies are considered at each conversation stage as context [3].

ChatGPT is built on OpenAI’s proprietary series of generative pre-trained transformer (GPT) models and is fine-tuned for conversational applications using a combination of supervised learning and reinforcement learning from human feedback [3].

Gephi

Gephi is a tool for data analysts and scientists keen to explore and understand graphs. Like Photoshop™ but for graph data, the user interacts with the representation and manipulates the structures, shapes and colors to reveal hidden patterns [4].

The goal is to help data analysts form hypotheses, intuitively discover patterns, and isolate structural singularities or faults during data sourcing. It is a complementary tool to traditional statistics, as visual thinking with interactive interfaces is now recognized to facilitate reasoning. It is software for Exploratory Data Analysis, a paradigm that emerged in the Visual Analytics field of research [4, 5].

😉Learn more about Gephi in this story I wrote: What is Gephi? Meet this useful network analysis tool.

🌐 Gephi Website: https://gephi.org/

NodeXL

The Social Media Research Foundation is the home of the network analysis tool NodeXL and is supported by a global network of academics from a wide variety of disciplines [6,7].

  • NodeXL-Pro adds menus and features to Excel to simplify the tasks of getting network data, storing it, analyzing and visualizing it, and generating reports that share insights into connected structures.
  • NodeXL-Pro supports all the features needed to conduct professional social network analysis (SNA): community clustering, influencer detection, content analysis, sentiment analysis, time series analysis, and much more.

😉Learn more about NodeXL in this story I wrote: Meet NodeXL-Pro: one tool, many possibilities!

🌐 NodeXL Website: https://www.smrfoundation.org/nodexl/

Summary of the main steps of this tutorial

Below is an illustration that describes the general steps we will carry out in this tutorial. With the help of ChatGPT we will transform a text into a semantic network in GEXF format, then we will save the generated file and finally we will visualize it in Gephi and NodeXL.

Figure 1. Process to transform a text into GEXF format with the help of ChatGPT and visualize it in Gephi and NodeXL. Image by the author.

🏁Let's start!

STEP 1. Using ChatGPT to generate the GEXF file.

In this step I wrote a PROMPT with precise instructions for ChatGPT to transform a text into a semantic network in GEXF format. Among other things, I asked ChatGPT to build the network from the number of co-occurrences between terms in the text, to omit co-occurrences equal to or less than 2, to use the NetworkX library, and to apply English stop words.

For this exercise, I used a paragraph from the Wikipedia article Netnography.

Below, I show my PROMPT and the text I processed:

My PROMPT:

Generate a file that visualizes a semantic network from the English text that I will provide you. You must generate a GEXF file from the co-occurrences that exist between the terms in the text. Omit co-occurrences equal to or less than 2. Perform this analysis using the NetworkX Python library (and the libraries you need). Apply stop words in English to clean up the text. Give me the result in GEXF file format right here (don’t give me the code I just want the result in GEXF format). The text to analyze is the following:

Figure 2.- The text used for this exercise is displayed. Taken from: https://en.wikipedia.org/wiki/Netnography

Important: In this step, you can adjust the requests in the PROMPT according to your needs. For example, you can apply stop words in another language, change the co-occurrence threshold, indicate whether the terms in the network should be unigrams, bigrams, etc., or choose the libraries you want for the process, among other parameters.
Likewise, you can ask for a column with the number of occurrences of each node, with something like this: “Add a column with the respective occurrences for each node”. This column will allow you to rank the nodes according to their occurrences (I already tried it and it works well).
You could also modify the PROMPT to ask ChatGPT for the Python code, so that you can run it yourself in Jupyter Lab!
😉Try different versions of the PROMPT according to your visualization needs!
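
If you ask ChatGPT for the Python code instead of the finished file, the result might resemble the minimal sketch below. It is not the code ChatGPT ran for this story; it only illustrates the instructions in the PROMPT under some assumptions of mine: terms co-occur when they appear in the same sentence, the tiny STOP_WORDS set stands in for a full English stop-word list, the sample text is made up, and the names tokenize, build_cooccurrence_graph, min_cooccurrence and semantic_network.gexf are hypothetical. nx.write_gexf is NetworkX's own GEXF writer.

```python
import re
from collections import Counter
from itertools import combinations

import networkx as nx

# Assumption: a deliberately tiny stop-word list; a real run would use a fuller one.
STOP_WORDS = {
    "a", "an", "and", "as", "at", "by", "for", "from", "in",
    "is", "it", "of", "on", "or", "that", "the", "to", "with",
}


def tokenize(sentence):
    """Lowercase a sentence and keep alphabetic tokens that are not stop words."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return [word for word in words if word not in STOP_WORDS]


def build_cooccurrence_graph(text, min_cooccurrence=3):
    """Link terms that share a sentence at least `min_cooccurrence` times.

    The default of 3 mirrors "omit co-occurrences equal to or less than 2".
    """
    occurrences = Counter()
    pair_counts = Counter()
    for sentence in re.split(r"[.!?]+", text):
        tokens = tokenize(sentence)
        occurrences.update(tokens)
        # Count each unordered pair of distinct terms once per sentence.
        pair_counts.update(combinations(sorted(set(tokens)), 2))

    graph = nx.Graph()
    for (term_a, term_b), weight in pair_counts.items():
        if weight >= min_cooccurrence:
            graph.add_edge(term_a, term_b, weight=weight)
    # Store how often each term appears, so nodes can be ranked by occurrences.
    for term in graph.nodes:
        graph.nodes[term]["occurrences"] = occurrences[term]
    return graph


if __name__ == "__main__":
    # Made-up sample text, only to exercise the functions.
    sample = (
        "Netnography studies online communities. "
        "Researchers use netnography to observe online communities. "
        "Netnography treats online communities as a cultural field."
    )
    graph = build_cooccurrence_graph(sample)
    nx.write_gexf(graph, "semantic_network.gexf")
    print(graph.number_of_nodes(), "nodes,", graph.number_of_edges(), "edges")
```

Counting pairs per sentence is only one possible definition of co-occurrence; a sliding window over the tokens would be an equally reasonable choice and may give a different network.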

Below is a partial view of the GEXF file generated with ChatGPT for this exercise:

Figure 2. Partial view of the GEXF file obtained in this step.

STEP 2. Save the file in GEXF format

The file saving process is described as follows:

Copy the output generated in the previous step and paste it into a plain-text editor such as Notepad > save the file with a name of your choice, replacing the default .txt extension with .gexf. This gives you a GEXF file ready to visualize in Gephi or NodeXL.
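
Since the output is pasted by hand, it can be worth checking that it is well-formed before opening it in Gephi or NodeXL. The sketch below is one way to do the saving step in Python, using only the standard library; the function name save_gexf and the file name are hypothetical, and the check is limited to XML well-formedness and the name of the root element (it does not validate against the GEXF schema).

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def save_gexf(pasted_text, path):
    """Write the pasted ChatGPT output to `path` and return (node_count, edge_count)."""
    cleaned = pasted_text.strip()
    # Raises xml.etree.ElementTree.ParseError if the text is not well-formed XML.
    root = ET.fromstring(cleaned)
    # Drop the "{namespace}" prefix so the check works for any GEXF version.
    local_names = [element.tag.rsplit("}", 1)[-1] for element in root.iter()]
    if local_names[0] != "gexf":
        raise ValueError("root element is not <gexf>")
    Path(path).write_text(cleaned + "\n", encoding="utf-8")
    return local_names.count("node"), local_names.count("edge")
```

Calling save_gexf(text, "my_network.gexf") with the copied output returns the number of nodes and edges found, which gives a quick sanity check against what ChatGPT displayed.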

Below is an illustration describing this process:

Figure 3. Process to save the file in GEXF format. Image by the author.

The following gif shows the complete process described in STEP 1 and STEP 2:

Figure 4. The complete process of steps 1 and 2 is shown.

STEP 3. Visualize the GEXF file in Gephi and NodeXL.

a) Gephi

In Gephi, open the GEXF file generated with ChatGPT > In the Statistics section, run Modularity > In the Appearance section, color the nodes by modularity class, rank the size of the nodes by degree, and adjust the label font size by degree > In the Layout section, apply ForceAtlas 2 and select Stronger Gravity and Prevent Overlap.

Finally, make any other adjustments you consider necessary for an effective visualization, such as Expansion, Contraction, or Noverlap, among others.

Note: this is just an example. You can apply other statistics, rank the size of the nodes and labels by other attributes, and even apply other layouts, depending on your visualization needs.
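
The same two measures, degree and community membership, can also be computed in Python before the file ever reaches Gephi. The sketch below is only an approximation of the Gephi steps, not Gephi's implementation: it uses NetworkX's louvain_communities, which belongs to the same family of methods as Gephi's Modularity statistic but may return a different partition. The function name annotate_graph and the attribute names "community" and "degree" are hypothetical.

```python
import networkx as nx
from networkx.algorithms.community import louvain_communities


def annotate_graph(graph, seed=0):
    """Store a community id and the degree on every node; return the communities."""
    # Louvain is randomized, so a fixed seed keeps the partition reproducible.
    communities = louvain_communities(graph, weight="weight", seed=seed)
    for community_id, members in enumerate(communities):
        for node in members:
            graph.nodes[node]["community"] = community_id
    for node, degree in graph.degree():
        graph.nodes[node]["degree"] = degree
    return communities
```

A possible use is to load the saved file with nx.read_gexf, call annotate_graph on it, and write it back with nx.write_gexf, so that both attributes are available as node columns when the file is opened.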

Figure 5. GEXF file generated from ChatGPT and imported into Gephi. Image by the author.

The resulting visualization, with the parameters described above applied, is shown below.

Figure 6. Gephi tool: visualization of the GEXF file generated with ChatGPT. Image by the author.

The gif below shows the complete process for this part, in which the GEXF file generated with ChatGPT is imported and displayed in Gephi.

Figure 6. Process to import and visualize the GEXF file generated in ChatGPT.

b) NodeXL

The process to visualize the file generated by ChatGPT in NodeXL is described as follows:

Open your NodeXL-Pro template > Import the recipe (number 1) > Import the file generated by ChatGPT in GEXF format (number 2) > run the recipe (number 3).

Figure 7. Process to visualize the file generated by ChatGPT in NodeXL. Image by the author.

Note: I have created a recipe for this type of data. You can download it here

Figure 8. The location to download the recipe for NodeXL is shown. Image by the author.

The result obtained in NodeXL after applying the recipe is shown below.

Figure 9. Network map in NodeXL from the GEXF file generated with ChatGPT. Image by the author.

Below is a gif that exemplifies the complete process to import and visualize in NodeXL the GEXF file generated in ChatGPT.

Figure 10. Process to import and visualize the GEXF file generated in ChatGPT.

Conclusion

In this exercise we have reviewed how to generate a GEXF file from text with the help of ChatGPT. Also, we have reviewed how to save the result and how to visualize the file in Gephi and NodeXL.

This exercise could be useful if we want to generate a semantic network from a relatively short text. For example, we could use it to generate semantic networks from abstracts of papers, a section of an article on the Internet, or the transcript of a video or an interview, among other uses.

I found this exercise interesting, which is why I decided to share it with you. As I have mentioned, it can be useful if you want to generate GEXF files from short texts. If you want to transform a long text into GEXF format, and in a more precise way, I recommend using the Nocodefunctions tool, which I mentioned in the first part of this article.

The PROMPT that I share with you in this story is just an example, so please try several versions of the PROMPT according to your visualization needs.

Limitations: This exercise works for short texts; when the text is long, ChatGPT may return an error. The results may vary even when the PROMPT and the text are the same. It may also be less efficient or accurate than specialized tools for generating semantic networks.

Thanks for reading this story. I encourage you to explore for yourself and adapt the PROMPT according to your visualization needs!


👉Find more stories I have written here

✔Follow me on Twitter (X) @Verukita1

✔LinkedIn: Dra. Verónica Espinoza

✔Website: www.nethabitus.org

Resources

🌐ChatGPT Website

🌐 NodeXL website. 👨‍🎓 Meet NodeXL-Pro: one tool, many possibilities!

🌐 Gephi website. 👨‍🎓 What is Gephi? Meet this useful network analysis tool

🌐 Nocodefunctions website. 👨‍🎓 Transform any text into a semantic network with Nocodefunctions App (in just 4 steps)

References

[1] Explore your data at a click [Internet]. Nocode functions. [cited May 24, 2024]. Available from: https://nocodefunctions.com/

[2] Nocode functions blog. Nocode functions is one year old! [Internet]. [cited May 24, 2024]. Available from: https://nocodefunctions.com/blog/nocodefunctions-is-one-year-old/

[3] ChatGPT. In: Wikipedia [Internet]. 2024 [cited May 24, 2024]. Available from: https://en.wikipedia.org/w/index.php?title=ChatGPT&oldid=1225281008#cite_note-guardianpos-2

[4] Bastian M., Heymann S., Jacomy M. (2009). Gephi: an open source software for exploring and manipulating networks. International AAAI Conference on Weblogs and Social Media.

[5] Gephi - The Open Graph Viz Platform [Internet]. [cited Nov 8, 2022]. Available from: https://gephi.org/

[6] Social Media Research Foundation [Internet]. Social Media Research Foundation. [cited Nov 11, 2022]. Available from: https://www.smrfoundation.org/

[7] Smith MA, Rainie L, Shneiderman B, Himelboim I. Mapping Twitter Topic Networks: From Polarized Crowds to Community Clusters [Internet]. Pew Research Center: Internet, Science & Tech. 2014 [cited Nov 11, 2022]. Available from: https://www.pewresearch.org/internet/2014/02/20/mapping-twitter-topic-networks-from-polarized-crowds-to-community-clusters/


Dr. Veronica Espinoza

👨‍🎓 PhD Humanities 🧠M. Sc Neurobiology 🧪B.S. Chemistry. 👉 X: @Verukita1 🌐website: www.nethabitus.org