Nocode functions tool: explore your data at a click!

Dr. Veronica Espinoza
Nov 10, 2022

by Dr. Verónica Espinoza, 2022 / ✔Twitter @Verukita1 ✔LinkedIn Dra. Verónica Espinoza

Image by the author

Nocode functions is a web app which makes best-in-class data analysis functions available to all.

Nocode functions is a free, no-registration web app for point-and-click data analysis. Its long-term objective is to offer a free, user-friendly, robust web app that helps a variety of audiences use common but sophisticated data analysis functions in their best-in-class versions.

Nocode functions was developed by Clément Levallois, a professor based in Paris, with a passion for extracting information from social media and networks. He has published studies in academic journals.

Learn more about Clément Levallois in the Facebook group for Gephi, on Twitter or LinkedIn, or visit his website.

How do I start using this tool?

a) First, open the tool at https://nocodefunctions.com/

Figure 1. Nocode functions interface. Screenshot taken on Nov 9, 2022.

b) Prepare your data. All you need is a file (Excel, pdf, csv, txt…) or a Google Spreadsheet containing your data.

c) Select the function. Choose the function that best fits your research project, your objectives and your data.

d) Perform the following 3 steps: upload your data > analyze > view the results.

Figure 2. Three steps to get results in Nocode functions (Image by the Author).

In this section, the main functions of the tool are described.

1.-Sentiment analysis.

This function performs sentiment analysis, also called opinion mining. It analyzes the text and determines whether the sentiment is neutral, positive or negative. It works best on social media content, such as tweets on Twitter, comments on Instagram posts and other very short texts, in English, French and Spanish. In a comparison with 23 alternatives, this tool was found to be the best for sentiment analysis on social media. Born in 2012, this function is under continuous development.

The principles followed by this function are described in this academic publication about Umigon, available in the anthology of the Association for Computational Linguistics.
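To give a feel for what "term lists and heuristics" can mean in practice, here is a minimal sketch of a lexicon-based classifier. It is not Umigon's code: the word lists, the negation rule and the function name `classify_sentiment` are all assumptions made up for illustration, and the real tool relies on far richer lists and rules.

```python
import re

# Hypothetical mini-lexicons, for illustration only (not the tool's term lists).
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "awful", "hate", "terrible", "sad"}
NEGATIONS = {"not", "never", "no"}


def classify_sentiment(text):
    """Return 'positive', 'negative' or 'neutral' for a short text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    for i, token in enumerate(tokens):
        # +1 for a positive term, -1 for a negative term, 0 otherwise.
        polarity = (token in POSITIVE) - (token in NEGATIVE)
        # Assumed heuristic: a negation right before a sentiment term flips it.
        if polarity and i > 0 and tokens[i - 1] in NEGATIONS:
            polarity = -polarity
        score += polarity
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for tweet in ["I love this phone", "This is not good", "The meeting is at noon"]:
    print(tweet, "->", classify_sentiment(tweet))
```

A list-based approach like this one is easy to inspect and to extend to another language by swapping the lists, which may be one reason it suits very short texts.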

Steps for sentiment analysis process:

Figure 3. Example of sentiment analysis results (Screenshot taken on Nov 9, 2022).

2.-Transform text into networks.

The function identifies pairs of terms in each line of the text. These pairs are called co-occurrences. Aggregating all pairs of terms and selecting the most frequent ones, a network of terms is constructed where any two terms are connected if they often appear together in the text.

The model

The principles followed by the tool are described in this academic publication studying how to find communities and topics on Twitter. The technology follows these steps:

  1. cleaning of the text: flattening to ASCII, removal of URLs, removal of punctuation.
  2. lemmatization.
  3. decomposition of the text into n-grams up to four-grams, removal of less relevant n-grams. This step is identical to the one followed by the function for sentiment analysis.
  4. count of co-occurrences: which pairs of n-grams tend to appear frequently in the same lines of the text?
  5. the list of co-occurring n-grams is used to create a network made of the most frequent n-grams. Two n-grams are connected if they frequently co-occur.
  6. the strength of the connections in the network is corrected using a procedure called Pointwise Mutual Information (PMI).
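Steps 4 to 6 above can be sketched in a few lines of Python. This is a simplified illustration, not the tool's implementation: it skips cleaning, lemmatization and n-gram extraction (each whitespace-separated word is treated as a term), it does not filter for the most frequent terms, and the function name `cooccurrence_pmi` is hypothetical. PMI is computed here as log(p(a, b) / (p(a) · p(b))), with probabilities estimated as the share of lines containing the term or the pair.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_pmi(lines):
    """Count term pairs appearing on the same line and weight each pair by PMI."""
    term_counts = Counter()
    pair_counts = Counter()
    for line in lines:
        terms = sorted(set(line.lower().split()))  # unique terms of this line
        term_counts.update(terms)
        pair_counts.update(combinations(terms, 2))  # every pair on the line
    n = len(lines)
    edges = {}
    for (a, b), count in pair_counts.items():
        p_pair = count / n
        p_a = term_counts[a] / n
        p_b = term_counts[b] / n
        edges[(a, b)] = math.log(p_pair / (p_a * p_b))
    return edges


lines = ["data science", "data science", "data network", "social network"]
for pair, weight in sorted(cooccurrence_pmi(lines).items()):
    print(pair, round(weight, 3))
```

In this toy input, "data" and "network" appear together once, but "data" is so common that the pair ends up with a negative weight, while the rarer pair "network" and "social" gets the strongest connection. That is the kind of correction PMI brings compared with raw counts.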

Steps to transform text into semantic networks:

Figure 4. Examples of semantic network visualizations (Networks by the author).

4.-Find key topics in your text.

This function automatically identifies the key topics in a text, an operation called topic extraction or topic modeling. It analyzes the text line by line and determines groups of words and expressions which tend to cluster together, forming topics.
It works on texts written in a large variety of languages (including texts in non-Latin alphabets). The function follows the principles of unsupervised learning, which is a type of machine learning.

How to define the numbers of topics to be found?

The most classic approach to topic detection is based on a clustering technique called "k-means". With it, the user decides how many topics should be found in the text, and then the algorithm finds these topics.
This approach can make sense when we know in advance how many topics there are in the text. But what is the point of topic detection if we know the topics already?
In nocode functions, the number of topics to be found is not predetermined. The analyst will learn a lot by discovering how many topics the algorithm can find in the text, without a preset limit. The analyst remains fully in control thanks to the precision parameter, which helps tune the algorithm to find more or fewer topics, but always with a degree of freedom on the exact number.
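The idea of letting the data decide the number of topics can be illustrated with a toy sketch. This is not the algorithm used by nocode functions, which the article does not detail: here, as an assumption, terms are linked when they co-occur on enough lines, and each connected group of terms is read as a topic. The name `find_topics` and the `min_cooccurrence` threshold are hypothetical; the threshold plays a role loosely comparable to a tuning parameter, since it changes how many terms end up grouped without ever fixing the number of topics.

```python
from collections import Counter, defaultdict
from itertools import combinations


def find_topics(lines, min_cooccurrence=2):
    """Group terms into topics: connected components of the co-occurrence graph."""
    pair_counts = Counter()
    for line in lines:
        pair_counts.update(combinations(sorted(set(line.lower().split())), 2))
    # Keep only the pairs that co-occur often enough.
    neighbors = defaultdict(set)
    for (a, b), count in pair_counts.items():
        if count >= min_cooccurrence:
            neighbors[a].add(b)
            neighbors[b].add(a)
    topics, seen = [], set()
    for start in sorted(neighbors):
        if start in seen:
            continue
        # Depth-first walk collecting every term reachable from `start`.
        component, stack = set(), [start]
        while stack:
            term = stack.pop()
            if term not in component:
                component.add(term)
                stack.extend(neighbors[term] - component)
        seen |= component
        topics.append(sorted(component))
    return topics


lines = [
    "football match goal",
    "football goal referee",
    "election vote parliament",
    "election vote campaign",
]
print(find_topics(lines, min_cooccurrence=2))
print(find_topics(lines, min_cooccurrence=1))
```

With both settings the sketch finds two topics in this input, one about football and one about elections, without being told to look for two. The stricter setting simply keeps fewer terms in each topic.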

Steps to find key topics in the text:

Figure 5. Examples of the results of key topics analysis (Screenshot taken on Nov 9, 2022).

5.-Create networks from lists.

Networks can be created in two ways: 1.-Co-occurrences and 2.-Sources and targets, as shown in the following examples:

Figure 6. Two options for generating networks (Screenshot taken on Nov 9, 2022).
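The difference between the two options can be sketched as follows. The exact input layout expected by the tool is not described here, so this example assumes that each row of the list is a sequence of items, and that in the "sources and targets" option the first item of a row is the source and the remaining items are its targets. Both function names are hypothetical.

```python
from itertools import combinations


def edges_from_cooccurrences(rows):
    """Option 1: every pair of items on the same row becomes an undirected edge."""
    edges = set()
    for row in rows:
        edges.update(combinations(sorted(set(row)), 2))
    return sorted(edges)


def edges_from_sources_and_targets(rows):
    """Option 2 (assumed layout): first item is the source, the others are targets."""
    return [(row[0], target) for row in rows for target in row[1:]]


rows = [["Ana", "Ben", "Cleo"], ["Ben", "Dan"]]
print(edges_from_cooccurrences(rows))
print(edges_from_sources_and_targets(rows))
```

The same row therefore yields different networks: co-occurrences connect everyone on the row to everyone else, while sources and targets only connect the first item to the rest.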

Exploring the graph.

Two free software tools are recommended for exploring the semantic network produced by the function:

· Gephi provides the best features to filter, colorize, resize and run descriptive graph statistics on your network (such as betweenness centrality).

· VOSviewer provides the cleanest and easiest-to-interpret visualizations for semantic networks. VOSviewer was developed with scientometrics as its first use case, but it is useful for any kind of semantic network.

Steps to create networks from lists:

Figure 7. Examples of semantic network visualizations using option 1 (Networks by the author).

6.-Predict new links in a network.

This function is a direct application of the Gephi plugin by Marco Romanutti and Saskia Schüler, supervised by Michael Henninger at FHNW. Their code is visible on GitHub.

The prediction is based on preferential attachment. It is limited to undirected, unweighted networks. The reasoning is simple: the most likely link to be created is the one between two nodes which have the most neighbors, but don't have a connection yet.
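As a rough illustration, preferential attachment is commonly scored as the product of the number of neighbors of the two nodes. The sketch below applies that score to every pair of nodes not yet connected in a small undirected, unweighted network. It is not the plugin's code; the function name `predict_links` and the `top` parameter are assumptions.

```python
from itertools import combinations


def predict_links(edges, top=3):
    """Rank unconnected node pairs by the product of their numbers of neighbors."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    scores = []
    for a, b in combinations(sorted(neighbors), 2):
        if b not in neighbors[a]:  # only pairs without an existing link
            scores.append((len(neighbors[a]) * len(neighbors[b]), a, b))
    # Highest score first; ties are broken alphabetically.
    scores.sort(key=lambda item: (-item[0], item[1], item[2]))
    return scores[:top]


edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]
for score, a, b in predict_links(edges):
    print(f"{a} - {b}: score {score}")
```

In this example, B and D each have two neighbors and are not connected, so the pair gets the top score of 4.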

How to interpret this link prediction? The absence of a link can mean that:

  • There is no potential for this link (it is not “relevant” to the nodes that would be involved).
  • There is a potential for this link to get created, and this potential is not actualized yet.
  • There is a potential for the link but the two nodes choose not to actualize the link.

This means that “predicting” a link can address one of these three cases.

Steps to predict new links in a network:

Figure 8. Example of the results of the predictions (Screenshot taken on Nov 9, 2022).

7.-Find text in PDF files.

This function lets you find a word or phrase of interest in your PDF files and shows each occurrence with its surrounding context.
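The "with context" part can be pictured with a small keyword-in-context sketch. It assumes the text has already been extracted from the PDF into a plain string (the extraction itself is left out), and the function name `find_in_context` and the `width` parameter are hypothetical, not the tool's own.

```python
import re


def find_in_context(text, phrase, width=30):
    """Return each occurrence of `phrase` with up to `width` characters on each side."""
    results = []
    for match in re.finditer(re.escape(phrase), text, flags=re.IGNORECASE):
        start = max(0, match.start() - width)
        end = min(len(text), match.end() + width)
        results.append(text[start:end])
    return results


# Stand-in for text extracted from a PDF page.
page_text = "Gephi is a tool for network analysis. Network analysis reveals communities."
for hit in find_in_context(page_text, "network analysis", width=10):
    print("..." + hit + "...")
```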

Steps to find text in PDF files:

Figure 9. Example of the results (Screenshot taken on Nov 9, 2022).

👍Thanks for reading

😃This is my Twitter

REFERENCES

  1. Explore your data at a click [Internet]. Nocode functions. [cited Nov 9, 2022]. Available from: https://nocodefunctions.com/
  2. Levallois C. Umigon: sentiment analysis for tweets based on terms lists and heuristics. In: Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013) [Internet]. Atlanta, Georgia, USA: Association for Computational Linguistics; 2013 [cited Nov 9, 2022]. p. 414–7. Available from: https://aclanthology.org/S13-2068
  3. Benabdelkrim M, Levallois C, Savinien J, Robardet C. Opening Fields: A Methodological Contribution to the Identification of Heterogeneous Actors in Unbounded Relational Orders. M@n@gement. March 31, 2020;4–18.
  4. Nocode functions blog. Pointwise mutual information and tf idf, when to use them [Internet]. [cited Nov 10, 2022]. Available from: https://nocodefunctions.com/blog/pmi-tf-idf/

Dr. Veronica Espinoza

👨‍🎓 PhD Humanities 🧠M. Sc Neurobiology 🧪B.S. Chemistry. 👉 X: @Verukita1 😉 Support my work here: https://acortar.link/1ZonMU 🌐website: www.nethabitus.org