Projects of the Data Science for the Public Good Program

With the support of the US Embassy, we successfully completed the first training of the Data Science for the Public Good (DSPG) program, which we implemented in partnership with the Istanbul Metropolitan Municipality, the TED University Applied Data Science Center and the University of Virginia Biocomplexity Institute, between November 14, 2020 and January 31, 2021.

Kodluyoruz
Apr 5, 2021

Throughout the program, students from the social sciences, engineering and other disciplines were first trained in data science and then developed projects that use the provided open data to better understand social problems and find solutions to them. Aiming to solve city problems ranging from social distance violations to traffic, students in the Istanbul program carried out 7 projects and students in the Ankara program carried out 5. They presented their projects at the Graduation & Project Presentations event on 24–25 March 2021.

Traffic Density Analysis

Niyazi Ülke (Bogazici University-Department of Computer Engineering) & Esra Şekerci (METU-Sociology)

The project uses machine learning algorithms to predict how planned road maintenance on a given date will affect traffic, so that maintenance can be scheduled at appropriate times.

The project used data on road maintenance carried out between January and June 2019, together with average speed data sets for certain roads in the Beyoğlu, Beşiktaş, Esenler, Kağıthane, and Şişli districts.

District maps were downloaded using the OSMnx (OpenStreetMap NetworkX) library, and roads that were not included in the data provided by the municipality were removed from the maps. After the road maintenance coordinates were matched to the closest roads on the map, machine learning models were tested to estimate the hourly average speed on each road during maintenance work.
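The matching step can be illustrated with a minimal sketch. It assumes each road is simplified to a single straight segment in a projected (metric) coordinate system; the road ids and coordinates below are invented for illustration, and in practice a library such as OSMnx would do this lookup against the real street graph.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b; all are (x, y) in a projected plane."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: both endpoints coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, then clamp to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def nearest_road(point, roads):
    """Return the id of the road segment closest to point.

    roads maps a road id to a (start, end) pair of coordinates.
    """
    return min(roads, key=lambda rid: point_segment_distance(point, *roads[rid]))


# Hypothetical roads and maintenance site, for illustration only.
roads = {
    "road_a": ((0.0, 0.0), (10.0, 0.0)),
    "road_b": ((0.0, 5.0), (10.0, 5.0)),
}
maintenance_site = (4.0, 1.0)
print(nearest_road(maintenance_site, roads))
```

A brute-force scan like this is fine for a handful of roads; a spatial index would be the usual choice for a full district map.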

Variables such as whether the date is a public holiday, whether it falls before or after the COVID-19 precautions, the daily rainfall, and whether maintenance was under way on the road at a given time were added to the model.
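As a rough sketch of how such variables might be encoded for one road and one hour, the function below builds a feature row. The holiday set is a small hypothetical subset, the COVID-19 cut-off date is an assumption, and the `hour` and `weekday` fields are extra illustrative features that the text does not mention.

```python
from datetime import date

# Hypothetical subset of public holidays; a real list would be complete.
HOLIDAYS = {date(2019, 1, 1), date(2019, 4, 23), date(2019, 5, 1)}
# Assumed cut-off for "after the COVID-19 precautions".
COVID_START = date(2020, 3, 11)


def build_features(day, hour, rainfall_mm, maintenance_active):
    """Encode the variables described above as a flat dict of numbers."""
    return {
        "hour": hour,                      # illustrative extra feature
        "weekday": day.weekday(),          # illustrative extra feature, Monday = 0
        "is_holiday": int(day in HOLIDAYS),
        "after_covid_measures": int(day >= COVID_START),
        "rainfall_mm": float(rainfall_mm),
        "maintenance_active": int(maintenance_active),
    }


row = build_features(date(2019, 4, 23), hour=8, rainfall_mm=2.5, maintenance_active=True)
print(row)
```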

The Decision Tree Regressor, Random Forest Regressor, Linear Regression, and Ridge Regression algorithms were used in the project, and a separate model was trained for each district.
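The idea of one model per district can be sketched with a deliberately tiny stand-in: a ridge regression with a single feature (the maintenance flag), fitted separately for each district. This is not the team's pipeline, which used several features and several algorithms; the speeds below are invented, and `fit_ridge_1d` and `fit_per_district` are hypothetical helper names.

```python
from collections import defaultdict


def fit_ridge_1d(xs, ys, alpha=1.0):
    """Ridge regression with one feature and an unpenalized intercept.

    Returns (slope, intercept); alpha is the L2 penalty on the slope.
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)
    intercept = y_mean - slope * x_mean
    return slope, intercept


def fit_per_district(rows, alpha=1.0):
    """rows: iterable of (district, maintenance_flag, avg_speed_kmh)."""
    grouped = defaultdict(lambda: ([], []))
    for district, maintenance, speed in rows:
        grouped[district][0].append(maintenance)
        grouped[district][1].append(speed)
    return {d: fit_ridge_1d(xs, ys, alpha) for d, (xs, ys) in grouped.items()}


# Invented observations, for illustration only.
rows = [
    ("Şişli", 0, 40.0), ("Şişli", 0, 42.0), ("Şişli", 1, 30.0), ("Şişli", 1, 32.0),
    ("Esenler", 0, 50.0), ("Esenler", 0, 52.0), ("Esenler", 1, 46.0), ("Esenler", 1, 48.0),
]
models = fit_per_district(rows, alpha=1.0)
for district, (slope, intercept) in models.items():
    # Predicted speed without maintenance is the intercept; with maintenance add the slope.
    print(district, round(intercept, 2), round(intercept + slope, 2))
```

Fitting per district lets each model learn its own baseline speed and its own sensitivity to maintenance, at the cost of having less data per model.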

As a result, average speed estimates were obtained for a given road, date, and time, with or without road maintenance.

Why implement this project?

Based on the average speed estimates produced by our project, it may be possible to reduce traffic density by better planning the dates and times of road maintenance.

Next steps:

  • One of the challenges we encountered during development was the limited number of maintenance works on the roads we mapped. With more data, the study could produce more consistent results.
  • In addition to the variables used in the model, it would be useful to include data on social events that took place on the same date and at the same location.
  • Average speed was the only variable used to indicate traffic density. The number of vehicles passing through these locations during the same time periods should also be used.

*You can watch the project presentation on this link.

Prior!st (Urgency Classification)

İsmet Özer (Bahçeşehir University-Neuropsychology) & Eda Nur Ersu (Yıldız Technical University-Mathematical Engineering) & Didar Tutan (Koç University-Comparative Studies in History and Society)

Thousands of messages are sent daily to Istanbul Metropolitan Municipality (IBB) in the form of requests, complaints, and suggestions. Among these messages, there may be those that contain danger or those that are critical in terms of response time. However, since IBB deals with these messages in the order they arrive, it is not possible to detect such messages early.

The purpose of the Priorİst project is to highlight urgent messages among non-urgent ones so that they are noticed in a timely manner. The urgent category includes messages concerning the safety of life and property, messages from citizens who cannot pay for medicine, food or rent, and natural disaster reports. Several text classification models are used together with Natural Language Processing methods to separate such messages from the rest. The Naive Bayes, Support Vector Machine, and Logistic Regression models achieved success rates of 85%, 86%, and 85%, respectively. Finally, an interface that can be integrated with related systems was developed to make this classification practical. The outputs of this project have the potential to help the municipality make better use of time, manpower and other resources.
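To make the classification idea concrete, here is a minimal multinomial Naive Bayes sketch with add-one smoothing, written from scratch rather than with the library the team used (which the text does not name). The four training messages are invented English examples, and whitespace tokenization is a simplifying assumption; real municipal messages would need proper Turkish preprocessing.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Very rough tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(text):
                if token not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented training messages, for illustration only.
texts = [
    "gas leak in our building please help",
    "flood water entering houses help",
    "please add a bus stop near the park",
    "thank you for the new park benches",
]
labels = ["urgent", "urgent", "normal", "normal"]
model = NaiveBayes().fit(texts, labels)
print(model.predict("gas leak help"))
```

With so few examples the model is only a toy; the point is the shape of the computation, not the accuracy.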

*You can watch the project presentation on this link.

AnomalyHunter (Call Center Anomaly Detection)

Ayşenur Erbahar (Istanbul University-Political Science and International Relations) & Ezgi Karakuş (Istanbul Technical University-Urban and Regional Planning) & Gizem Dağdeviren (Yıldız Technical University-Mathematical Engineering) & Melih Sarı (Middle East Technical University-Business Administration)

The aim of the project is to detect an anomaly in the municipality's call center when calls about the same issue and the same location generate traffic outside the usual pattern for that issue and location. The anomaly is determined by a background service that analyzes these calls retrospectively at 15-minute intervals. The service to be developed will detect the anomaly, assist municipality employees and enable faster action on the detected issue.
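The 15-minute windowing can be sketched as follows. Note the swap: instead of the Autoencoder the team settled on, this uses a simple z-score against historical counts as a stand-in detector, and the threshold of 3 is an assumed value. The function names are hypothetical.

```python
from collections import Counter
from datetime import datetime
from statistics import mean, pstdev


def bucket_15min(ts):
    """Round a timestamp down to the start of its 15-minute window."""
    return ts.replace(minute=ts.minute - ts.minute % 15, second=0, microsecond=0)


def count_calls(calls):
    """calls: iterable of (timestamp, topic, district).

    Returns a Counter keyed by (topic, district, window_start).
    """
    return Counter((topic, district, bucket_15min(ts)) for ts, topic, district in calls)


def is_anomalous(current_count, history_counts, threshold=3.0):
    """Flag a window whose count is far above the historical mean.

    history_counts holds counts for the same topic and district in past windows.
    """
    mu = mean(history_counts)
    sigma = pstdev(history_counts)
    if sigma == 0:
        return current_count > mu
    return (current_count - mu) / sigma > threshold


# Invented example: a quiet history followed by a sudden burst of calls.
history = [4, 5, 6, 5, 4, 6]
print(is_anomalous(40, history), is_anomalous(6, history))
```

A z-score baseline like this needs no labels, which matters here because the project itself lists a labeled anomaly data set as a missing requirement.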

About the project and results:

  • From a data set of approximately 3 million rows and 11 columns, containing Istanbul Metropolitan Municipality district locations and call center requests, 35 of the 335 distinct topics were selected as likely to cause anomalies.
  • Studies were carried out with Isolation Forest, ARIMA, K-Means Clustering and Autoencoder models, which do not require labeled data. The detection model was built with the Autoencoder, which performed better than the others.

Requirements for implementing the project:

  • A labeled data set with an anomaly definition

*You can watch the project presentation on this link.

Covid-19 Public Transport Simulation

Miray Ercan (Altınbaş University-Electrical-Electronics Engineering) & İlayda Bağdatlı (Dokuz Eylul University-Econometrics) & Sefa Mutlu (Dokuz Eylul University-Department of City and Regional Planning) & Melis Sönmez (Bogazici University-Sociology) & Ege Süalp (Galatasaray University-Economics)

Our goal in the Covid-19 Simulation project is to simulate the spread of the virus on the M1 metro line in Istanbul. The 2020 hourly metro usage data from the İBB Open Data Portal is used, and the numbers of passengers in March and April are compared. Since the virus was first encountered in March and the necessary precautions were not yet consistently applied, the virus was observed to spread more rapidly and aggressively. In April, with lockdown restrictions, increased public awareness and other measures, both the use of public transportation and the spread of the virus were observed to decrease. With this simulation, we achieved our goal of showing how important the individual measures to be taken and the policies to be applied are, for both citizens and IBB, in the fight against the pandemic.
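The text does not say which epidemic model the team used, so the sketch below substitutes a basic discrete-time SIR model purely to illustrate how lower ridership could translate into slower spread. Every number here (population, rates, the 70% ridership drop) is an invented assumption, and scaling the transmission rate by ridership is a simplification.

```python
def simulate_sir(population, initial_infected, beta, gamma, days):
    """Discrete-time SIR model.

    beta is the daily transmission rate, gamma the daily recovery rate.
    Returns a list of (susceptible, infected, recovered) tuples, one per day.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(s, i, r)]
    for _ in range(days):
        new_infections = min(s, beta * s * i / population)
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


# Assumed scenario values, for illustration only.
BASE_BETA = 0.4        # hypothetical transmission rate at normal ridership
RIDERSHIP_RATIO = 0.3  # hypothetical: ridership falls to 30% of normal

busy = simulate_sir(10_000, 10, BASE_BETA, gamma=0.1, days=120)
quiet = simulate_sir(10_000, 10, BASE_BETA * RIDERSHIP_RATIO, gamma=0.1, days=120)

peak_busy = max(i for _, i, _ in busy)
peak_quiet = max(i for _, i, _ in quiet)
print(round(peak_busy), round(peak_quiet))
```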

*You can watch the project presentation on this link.

CrowdZone (Density Analysis with Image Processing)

Yeşim İpek (Istanbul Technical University-Management Engineering) & Fırat Ülgen (Mimar Sinan Fine Arts University-Department of City and Regional Planning | Istanbul Technical University-Geographical Information Systems) & Naz İrem Baz (Istanbul Technical University-Industrial Engineering)

CrowdZone is a density analysis project that aims to help manage crowds in highly populated cities. Its primary aims are to identify risk points for the pandemic and to determine the places that attract the most people. The analyses and studies were carried out mainly for Istanbul, and the contributions to Istanbul can be listed in three points: arrangements in places and streets to ensure safety against the pandemic; optimizing transportation planning by detecting peak times on public transportation lines; and finding the best locations for municipality services (for example Halk Ekmek) based on the popular areas.

Two different image processing models were evaluated during the process. The aims of training these models were to detect the person class and to detect people's locations for density maps. Accordingly, different convolutional neural network models were developed for the population density map and for the determination of interest points. With the developed models, crowd counts and density maps of highly populated squares in Istanbul were predicted.
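Once person locations are available, turning them into a coarse density map is straightforward. The sketch below assumes a detector has already produced pixel coordinates (the detector itself is not shown) and simply counts detections per grid cell; this is a simplification of what a crowd-counting CNN does, and the image size and cell size are invented.

```python
def density_grid(points, width, height, cell):
    """Count detected people per grid cell.

    points: iterable of (x, y) pixel coordinates of detected people.
    width, height, cell: integers, in pixels.
    Returns a list of rows, each a list of counts.
    """
    cols = -(-width // cell)   # ceiling division
    rows = -(-height // cell)
    grid = [[0] * cols for _ in range(rows)]
    for x, y in points:
        if 0 <= x < width and 0 <= y < height:
            grid[int(y // cell)][int(x // cell)] += 1
    return grid


# Hypothetical detections in a 100 x 100 pixel frame.
detections = [(5, 5), (7, 8), (95, 95), (200, 5)]
grid = density_grid(detections, width=100, height=100, cell=50)
crowd_count = sum(sum(row) for row in grid)
print(grid, crowd_count)
```

Detections that fall outside the frame are dropped, so the crowd count here covers only the visible area.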

*You can watch the project presentation on this link.

Sentimap

Yeşim İpek (Istanbul Technical University-Management Engineering) & Didar Tutan (Koc University-Philosophy and International Relations |Comparative Studies in History and Society) & Ege Zeytun (Bogazici University-Politics and International Relations | George Washington University-Politics) & Büşra Öner (Istanbul University Business Management | Marmara University-Product Management and Marketing)

Sentimap aims to measure Istanbulites’ attitudes towards municipal services on a district basis by processing call center data with sentiment analysis models. Sentimap allows the Metropolitan Municipality to expand its service resources and to make better social policies in line with the changing feelings citizens express in call center messages.

SentiMap is a data science project developed in collaboration with the Istanbul Metropolitan Municipality. SentiMap applies Natural Language Processing tools and sentiment analysis models to the Municipality’s Call Center data in order to identify emotions and to measure citizens’ satisfaction with public services. SentiMap facilitates the identification of public services that are in need of improvement and increases Istanbul residents’ rapport with the Istanbul Metropolitan Municipality.

During the project, data analysis and cleaning were carried out. Typos in the text were corrected and proper names were removed to prepare the data for processing and modelling. After sentiment analysis models were trained on the data, the emotions in the Call Center applications were categorized as negative, positive, or neutral. The findings were mapped to show district-based emotion scores.
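The final aggregation step might look like the sketch below. It assumes the sentiment model has already labeled each application; the numeric mapping of labels to -1, 0 and 1 and the averaging rule are assumptions, since the text does not say how the district scores were computed. The records are invented.

```python
from collections import defaultdict

# Assumed numeric encoding of the three sentiment labels.
SCORES = {"negative": -1, "neutral": 0, "positive": 1}


def district_scores(records):
    """records: iterable of (district, label). Returns the mean score per district."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for district, label in records:
        totals[district] += SCORES[label]
        counts[district] += 1
    return {district: totals[district] / counts[district] for district in totals}


# Invented labeled applications, for illustration only.
records = [
    ("Kadıköy", "positive"),
    ("Kadıköy", "negative"),
    ("Kadıköy", "positive"),
    ("Fatih", "negative"),
    ("Fatih", "neutral"),
]
print(district_scores(records))
```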

*You can watch the project presentation on this link.

Social Distance Violation Detection

Tuğçe Köroğlu (Yildiz Technical University-Industrial Engineering) & Ege Süalp (Galatasaray University-Economics) & Furkan Ağca (Istanbul University-Industrial Engineering)

The social distance violation project was carried out by the dISTance team consisting of Furkan, Ege and Tuğçe. The aim of the project is to ensure that people who violate social distance are detected quickly and automatically.

Since the beginning of the pandemic, compliance with the social distance rule has been one of the most emphasized issues in Turkey and around the world. Although the definition of distance varies from country to country, the goal is to prevent the spread of the virus by maintaining a certain distance. The goal of this project is to make people more aware of social distance, to reduce the workload of the officers who monitor the social distance rule, and to make it easier to take special measures for places where social distance is not respected.

In the implementation phase, the team first learned about deep learning and image processing from online platforms such as GitHub and from the courses given by the coding instructors and assistants. The first step of the project was then to select the model and the algorithms. TensorFlow models, which are frequently used in image processing, were tried, and the two most suitable models were selected for the project. After the model selection was completed, the tests were run and the project was completed.
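After people have been detected, the violation check itself reduces to comparing pairwise distances. The sketch below assumes the detector's output has already been converted to ground-plane positions in metres (which in practice requires camera calibration, not shown), and the 1.5 m threshold is an assumed value, since the text notes that the definition varies by country.

```python
import math
from itertools import combinations


def find_violations(positions, min_distance=1.5):
    """Return index pairs of people standing closer than min_distance.

    positions: list of (x, y) ground-plane coordinates in metres.
    """
    return [
        (i, j)
        for (i, p), (j, q) in combinations(enumerate(positions), 2)
        if math.dist(p, q) < min_distance
    ]


# Hypothetical positions of three detected people.
people = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
print(find_violations(people))
```

Checking every pair is quadratic in the number of people, which is acceptable for a single camera frame.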

*You can watch the project presentation on this link.

Employment and Education Based Service Gap Mapping for Refugees #AWARENESS

Alper Tanrıverdi (METU-Political Sciences and Public Administration) & İrem Sezer (Data Analyst) & Fatma Öztel (Industrial Engineer) & Rumeysa Layık (Ankara University-Computer Engineering) & Selçuk Yusuf Arslan (Teacher) & Sercan Gürsoy (Financial Reporting Specialist)

Making the Needs of Refugees and Asylum Seekers Visible #AWARENESS is the result of the joint efforts of six people from different disciplines, some of whom are still studying and some of whom are working professionally. The aim of the project is to develop a platform that informs relevant institutions and organizations by making data-driven work on the problems faced by refugees and asylum seekers living in Ankara visible and accessible. As a meeting point for everyone who wants to benefit and contribute, the project aims to be a data-driven solution to the needs and problems of asylum seekers and refugees, as well as a common mechanism that can be used by institutions, organizations and people working in the field.

The project was born with the aims of enhancing the transparency and accessibility of data on asylum seekers and refugees, using existing data resources more effectively, and creating a common platform for the institutions that work to help refugees and asylum seekers. The positive social effects expected from the project are that it will be an accessible source of information for all, make the problems and needs of asylum seekers and refugees visible, and serve as a useful tool that helps relevant institutions in their efforts to develop essential solutions for refugees and asylum seekers.

*You can watch the project presentation on this link.

Software Industry Job Seekers and Employers’ Expectations Analysis

Dilay Ercan (Cumhuriyet University-Sociology) & Dolunay Gülşah Arıcı (METU-Economics) & Betül Ergin (METU-Gender and Women Studies) & Azem Özdemir (Ege University-Electrical and Electronics Engineering) & Baran Emre Türkmen (Sakarya University-Electrical and Electronics Engineering) & Mubina İpek (Ankara University-Economics)

The project ‘Eliminating Skill Gap and Knowledge Gap in Solving the Youth Unemployment Problem’ emerged from an interdisciplinary group of six people. In fields that are still developing in Turkey and changing rapidly, such as data and software, knowledge and skill gaps are common. This results in a bigger problem: the capacity of young people is not fully utilized in the market. The project aims to inform young people about the current situation in the job market and to provide young people and career planning centers with a tool that supports data-driven decision-making. The target groups are career planning centers and young people who are actively looking for a job or planning a career in the data and software fields.

For this purpose, we developed a web-based platform. On this platform, end-users have access to analyses of the job market’s current situation, filtered by job position, job sector, university, and university major. Part of the job posting data is collected from job search websites using web scraping. The rest of the job posting data is provided by the stakeholders. Critical information such as required skills and preferred university majors is listed in the job posting description, and this information is extracted using text mining. Finally, the results of the analyses are presented in a user-friendly format with the help of data visualization and exploratory data analysis. This web-based portal can help with job searches by guiding young people on which skills to develop or on choosing a sector or position that suits their skills. As a continuation of the project, we aim to cooperate with key stakeholders so that our project can be used as a decision support mechanism in large-scale education or employment policies.
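The text-mining step can be illustrated with a small keyword matcher. The skill list is a hypothetical example, the postings are invented, and matching against a fixed vocabulary is only one simple way to do this; the text does not describe the team's actual method.

```python
import re
from collections import Counter

# Hypothetical skill vocabulary; a real one would be much larger.
SKILLS = ["python", "sql", "java", "javascript", "excel"]


def extract_skills(description, skills=SKILLS):
    """Return the skills from the vocabulary mentioned in a posting description."""
    text = description.lower()
    found = []
    for skill in skills:
        # Require non-alphanumeric boundaries so "java" does not match "javascript".
        pattern = r"(?<![a-z0-9])" + re.escape(skill) + r"(?![a-z0-9])"
        if re.search(pattern, text):
            found.append(skill)
    return found


def skill_demand(postings):
    """Count in how many postings each skill appears."""
    counts = Counter()
    for description in postings:
        counts.update(extract_skills(description))
    return counts


# Invented postings, for illustration only.
postings = [
    "We need JavaScript and SQL experience.",
    "Data analyst with Python, SQL and Excel.",
    "Backend developer: Java, Python.",
]
print(skill_demand(postings).most_common())
```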

*You can watch the project presentation on this link.

Nudging Carbon Emission Reduction Resulting From Waste and Waste Collection

Ayçe İdil Ağca (Ege University-International Relations) & Aykut Aniç (Ankara University-Department of Human and Economic Geography) & Hatice Ceren Şahin (Gazi University-Industrial Engineering) & Berke Yağmur (Hacettepe University-Statistics) & Bertuğ Kaan Demir (TED University-Architecture and Urban Studies)

In the Yenimahalle pilot area, the project aims to create a risk map based on vehicle-level waste and route information, to calculate carbon emissions based on these maps, and to derive settlement typologies and socio-economic patterns according to the amount of waste, in order to raise awareness of waste management.
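A route-based emission estimate usually comes down to distance times fuel use times an emission factor. The sketch below shows that arithmetic only; the fuel consumption and emission factor are illustrative placeholder values, not figures from the project, and the route data is invented.

```python
def route_emissions_kg(distance_km, fuel_l_per_km, kg_co2_per_litre):
    """Estimate CO2 emitted on one collection route.

    All three inputs are assumptions supplied by the caller.
    """
    return distance_km * fuel_l_per_km * kg_co2_per_litre


def emissions_per_tonne(distance_km, waste_tonnes, fuel_l_per_km, kg_co2_per_litre):
    """CO2 emitted per tonne of waste collected on the route."""
    total = route_emissions_kg(distance_km, fuel_l_per_km, kg_co2_per_litre)
    return total / waste_tonnes


# Placeholder parameters, for illustration only.
FUEL_L_PER_KM = 0.5      # hypothetical truck fuel consumption
KG_CO2_PER_LITRE = 2.68  # approximate figure often quoted for diesel; treat as an assumption

print(route_emissions_kg(10.0, FUEL_L_PER_KM, KG_CO2_PER_LITRE))
print(emissions_per_tonne(10.0, 4.0, FUEL_L_PER_KM, KG_CO2_PER_LITRE))
```

Normalizing by tonnes collected makes routes of different lengths comparable, which is what a neighbourhood-level risk map would need.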

*You can watch the project presentation on this link.

Agricultural Production Resistant to Climate Change

Bahar Özdemir (METU-Economics) & Gözde Güldal (METU-Department of City and Regional Planning) & Ayşenur Özaslan (Kırıkkale University-Electrical and Electronics Engineering ) & Pınar Çomak (METU-Mathematics) & Büşra İzkan (METU-Department of City and Regional Planning)

The climate crisis, a global problem, threatens life. Today, we observe the devastating consequences of climate change in many sectors. Unfortunately, agricultural production, the most important part of the food supply chain, is one of the sectors most affected by climate change, because it depends directly on nature and climate. Agricultural production is fragile in the face of extreme weather events such as droughts, floods, frost and hurricanes. We know that these are all among the consequences of global warming and climate change.

Activities to deal with climate change have to be discussed on both macro and micro scales. With this project, we are looking for a solution at the local level in Ankara. We first focused on two grain crops: barley and wheat. We aimed to set up an agricultural mechanism that is more resilient to climate change using data science methods. To this end, we have created a web-based platform, and we aim to enable users from every segment of the sector to benefit from it. Using climatic and production data, we try to estimate future agricultural yield with machine learning models. As a result, we produce useful information for users such as “The Most Significant Climatic Parameters Affecting the Yield of Wheat and Barley” and “Yield Predictions of Wheat and Barley”. Thus, all stakeholders in the agricultural sector, from farmers to policymakers, will be able to access the predictions on our online platform and adapt their activities in a planned way for resilient and sustainable agricultural production in the future.
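The two outputs named above can be illustrated with a very small stand-in: ranking climatic parameters by the strength of their correlation with yield, then fitting a one-variable least-squares line for prediction. The team's actual models are not described, and every number below is invented.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def rank_parameters(climate, yields):
    """Order parameter names by absolute correlation with yield, strongest first."""
    return sorted(climate, key=lambda name: abs(pearson(climate[name], yields)), reverse=True)


def fit_line(xs, ys):
    """Ordinary least squares for one predictor. Returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx


# Invented yearly observations, for illustration only.
climate = {
    "rainfall_mm": [200, 250, 300, 350, 400],
    "frost_days": [12, 9, 14, 10, 11],
}
wheat_yield = [2.0, 2.5, 3.0, 3.5, 4.0]  # hypothetical tonnes per hectare

ranking = rank_parameters(climate, wheat_yield)
slope, intercept = fit_line(climate[ranking[0]], wheat_yield)
print(ranking, round(slope * 320 + intercept, 2))
```

Correlation only captures linear, one-at-a-time relationships, so it is a rough first look rather than a substitute for a proper feature-importance analysis.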

*You can watch the project presentation on this link.

Micro Mobility and Local Transportation Network Integration

Merve Demiröz (Politecnico di Torino-Urban and Regional Development) & Seçkin Çiriş (Istanbul Technical University-Transportation Engineering) & Yücel Torun (Istanbul Technical University-Disaster and Emergency Management) & Güldudu Fırtına (Erciyes University-Industrial Engineering) & Gizem Cemile Çelik (METU-Statistics) & Buse Demir (Kırıkkale University-Computer Engineering)

This project was created to support decision-makers by identifying the optimum routes and hubs for integrating micro-mobility with public transportation in Ankara. For now, an animated visualization has been built that works for a specific region with limited data. One of the main goals of the project is to integrate micro-mobility into public transport and to encourage widespread use of vehicles such as bicycles and scooters. Efforts have been made to create a faster, more economical, accessible, and sustainable transport plan for all. In later stages, expanding the pilot regions and including data on additional factors will help guide decision-makers toward an ideal mobility plan and allow the project to be adopted more widely.
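One common way to find an optimum route through a mixed network is a shortest-path search. The sketch below uses Dijkstra's algorithm on a tiny invented graph in which edge weights are travel minutes; the node names, the weights and the choice of algorithm are all assumptions, since the text does not describe the team's method.

```python
import heapq
import math


def shortest_route(graph, start, goal):
    """Dijkstra's algorithm.

    graph maps each node to a dict of {neighbor: travel_minutes}.
    Returns (total_minutes, path); (inf, []) if the goal is unreachable.
    """
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, weight in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return math.inf, []


# Hypothetical network: a scooter hub and a bus stop both connect home to the metro.
graph = {
    "home": {"scooter_hub": 3, "bus_stop": 10},
    "scooter_hub": {"metro": 5},
    "bus_stop": {"metro": 4},
    "metro": {},
}
print(shortest_route(graph, "home", "metro"))
```

Candidate hub locations could then be compared by how much they shorten routes like this one across many origin and destination pairs.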

*You can watch the project presentation on this link.
