Network Visualization of Media Coverage of Violence Against Women in Bangladesh

Tahsin Mayeesha
Jul 24, 2016 · 6 min read

I just finished my first network visualization project in Python and Gephi, using data from “Dhaka Tribune”, a prominent newspaper in Bangladesh. The visualization is based on co-occurrence relationships between entities such as locations, organizations and people cited in Dhaka Tribune news articles from 2012–2016.

We wanted to explore media coverage of harassment and violence against women, including rape and murder cases. This project was done with the help of KolpoKoushol, an initiative by MIT alumni from Bangladesh that brings together people from many fields to learn about interdisciplinary projects and prototype some of our own. Our mentor, Syed Arefinul Haque, helped us with data extraction from the news and guided us throughout the project.

Methodology :

  1. The data was gathered by Arefinul Haque by crawling the “Dhaka Tribune”. Using the Stanford NER Tagger, unique entities (Locations, Organizations and Persons) were extracted. A total of 49055 articles were found.
  2. The news dataset was in JSON format. Each news article has the following attributes: news_crawled_date, news_ml_tags, newspaper_url, news_url, news_headline, news_reporters, news_original_tags, news_text, news_ner_tags, news_publish_date, news_naive_tags, news_image_urls, news_location, newspaper_name, news_keywords, id, is_negative.
  3. I filtered the data based on specific keywords such as rape/gang rape, sexual assault and acid victim. The full list of keywords can be found in the associated notebook.
  4. From the news_ner_tags attribute, the Locations, Persons and Organizations generated by the Stanford NER Tagger were reduced to their unique values.
  5. We generated the network from the filtered data set. Locations, Persons and Organizations were used as nodes for the network while their co-occurrence in a specific news article was an edge. For each news article, we generated the complete graph of the co-occurrence of those nodes.
  6. After iterating over the 607 filtered articles, we had the entire network. It included important locations; general organizations working on harassment-related topics, such as the medical colleges that helped the victims and the police and crime teams of Bangladesh; other organizations that worked together, from the Supreme Court to BDR; specific locations with a lot of coverage, such as Dhaka and Chittagong; and small clusters of events that didn’t get much coverage.
  7. We used NetworkX to generate the network, then exported it, with 2777 nodes and 21793 edges, to Gephi and visualized it. After noticing that “Dhaka Tribune” and “Bangladesh” had a disproportionate number of edges without adding much to our understanding, we removed those terms. However, we kept the term “Dhaka” to show that when the location is Dhaka, most of the coverage revolves around it. Places in rural areas just don’t get attention.
  8. The code can be found here, but I’ve only kept the key parts of the scripts needed to reproduce the results, not the exploratory parts.
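To make steps 3 to 7 more concrete, here is a minimal sketch of how the filtering and graph construction could look. It is not the code from the notebook: the keyword list is only the handful named above, the layout assumed for news_ner_tags (a dict of entity type to list of names) is a guess, the helper names are hypothetical, and the edge weight that counts shared articles is my own addition. It also drops “Dhaka Tribune” and “Bangladesh” before building the graph, whereas we removed them after inspecting the network.

```python
from itertools import combinations

import networkx as nx

# Partial keyword list taken from the post; the full list is in the notebook.
KEYWORDS = ["rape", "gang rape", "sexual assault", "acid victim"]

# Terms that appear in almost every article and add little information.
STOP_ENTITIES = {"Dhaka Tribune", "Bangladesh"}


def is_relevant(article):
    """Return True if the article text contains any keyword (crude substring match)."""
    text = article.get("news_text", "").lower()
    return any(keyword in text for keyword in KEYWORDS)


def unique_entities(article):
    """Collect the unique NER entities of one article.

    Assumes news_ner_tags maps an entity type to a list of names,
    e.g. {"LOCATION": [...], "PERSON": [...]}; the real layout may differ.
    """
    entities = set()
    for names in article.get("news_ner_tags", {}).values():
        entities.update(names)
    return entities - STOP_ENTITIES


def build_cooccurrence_graph(articles):
    """Add a complete graph over the entities of every relevant article."""
    graph = nx.Graph()
    for article in articles:
        if not is_relevant(article):
            continue
        entities = sorted(unique_entities(article))
        graph.add_nodes_from(entities)
        for a, b in combinations(entities, 2):
            # Assumption: weight counts the articles mentioning both entities.
            if graph.has_edge(a, b):
                graph[a][b]["weight"] += 1
            else:
                graph.add_edge(a, b, weight=1)
    return graph


# Made-up sample articles, only to show the expected input shape.
sample_articles = [
    {
        "news_text": "Police are investigating a sexual assault case.",
        "news_ner_tags": {
            "LOCATION": ["Dhaka", "Bangladesh"],
            "PERSON": ["Person A"],
            "ORGANIZATION": ["Hospital X", "Dhaka Tribune"],
        },
    },
    {
        "news_text": "A report on cricket scores.",
        "news_ner_tags": {"LOCATION": ["Dhaka"], "PERSON": ["Person B"]},
    },
]

graph = build_cooccurrence_graph(sample_articles)
# To open the result in Gephi, it can be saved as GEXF:
# nx.write_gexf(graph, "network.gexf")
```

Only the first sample article passes the keyword filter, so the resulting graph is a triangle between “Dhaka”, “Hospital X” and “Person A”.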

Visualization :

Here’s a snapshot of the visualization :

Interactive version can be found here :

Media coverage of violence against women in Bangladesh

Highlights :

  1. The network was colored based on modularity class, and I’ve also emphasized the hubs. Interestingly, the clusters tended to correspond to different and very specific cases. For example, the green cluster on the left is mostly about the killing of a girl named Felani by the BSF, India’s border force. Felani Khatun, a 15-year-old Bangladeshi girl, was shot and killed by India’s Border Security Force (BSF) on 7 January 2011, at the India-Bangladesh border.[1][2] A photograph showing Felani Khatun’s dead body hanging on a border fence made of barbed wire was picked up by international media, and the publication of these photographs evoked international concern.[3] Here is the portion of the network about Felani, which has always been a very high-profile case.

2. Similarly, the Tonu case, another rape and murder case that recently got a lot of media attention, had its own class in violet. Her case remains unsolved, as the police have not been able to find the killers.

3. This cluster corresponds to a controversy about a cricketer called Rubel Hossain, who was accused of rape by Naznin Akter Happy (the charge was later dropped). We can see that this case co-occurred with many entities.

4. This is the most interesting cluster: a network of people in Jamaat-I-Islami. As the cluster is dense, some of the people may have been charged with crimes against women in the covered news, but some benign entities such as Ibn-Sina Hospital are also here. Here are some of the people and organizations associated with Motiur Rahman Nizami, who was recently convicted and sentenced to death for war crimes.

5. Dhaka seemed like the most important hub when it comes to the coverage of events. Dhaka does host the headquarters of important organizations such as CID and RAB, which often investigate cases of violence against women, along with the Supreme Court, the police in general and the medical colleges. Dhaka-based events were covered more because they were ‘reported’ more, while the other gray clusters around them were mostly underreported. I don’t have the exact rape/violence stats for Dhaka vs other regions, so I can’t draw an exact comparison. Also, the network actually depicts the ‘co-occurrence’ of entities, and it’s likely that a high-profile case will get a lot of attention from Dhaka media and organizations. So far I’m not sure whether I should draw any conclusions from this, but I’m erring on the side of being skeptical.
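The modularity coloring and hub emphasis mentioned in the first highlight were done in Gephi. For readers who want to stay in Python, a rough equivalent might look like the sketch below. Gephi’s modularity statistic is based on the Louvain method; this sketch swaps in NetworkX’s greedy modularity algorithm, so the classes would not match Gephi’s exactly. The toy graph and the choice of degree as the measure of a hub are assumptions made for illustration.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Toy graph: two triangles joined by a single bridge edge (C-D).
graph = nx.Graph()
graph.add_edges_from([
    ("A", "B"), ("B", "C"), ("A", "C"),
    ("D", "E"), ("E", "F"), ("D", "F"),
    ("C", "D"),
])

# Each community is a frozenset of nodes; give every node a class index.
communities = greedy_modularity_communities(graph)
modularity_class = {
    node: index
    for index, community in enumerate(communities)
    for node in community
}

# Assumption: treat the highest-degree nodes as hubs.
degrees = dict(graph.degree())
hubs = sorted(degrees, key=degrees.get, reverse=True)[:2]
```

On this toy graph each triangle ends up in its own class, and the two endpoints of the bridge, C and D, come out as the hubs.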

For me, the realization that many nodes in the network are victims was very traumatic. Sohagi Jahan Tonu and Felani have disproportionate coverage, and they were killed brutally, but there are many small cases, such as Aklima, Lipi, Parvin, Shilpa and Antara, scattered around the network in very small clusters, mostly colored in gray.

I feel for them, as the small cases seem to be mostly forgotten and overlooked by the media. Suddenly looking at a network of crimes, the associated organizations that were saving the victims in some cases (hospitals, for example), the associated people (who can be either attackers or the police) and the locations was a very different sort of experience for me.

The mentors here guided me out of my comfort zone, and I didn’t expect the network visualization project to turn out so well. I’m glad it did, and I hope that people will also ‘feel’ for the victims after looking at the network.

Learning Machine Learning

Blog posts for my machine learning and data visualization projects!