GSoC Project Report: Improve LibreCores.org in Terms of Discoverability
LibreCores.org lists free and open source “IP Cores” on the website for the community to view and use. All listed projects are backed by a git repository. Currently, the LibreCores.org website extracts the project README and LICENSE and renders them on the project page, along with links to the project homepage and git repository.
A user browsing for cores on LibreCores will want to search within a specific category of projects, or to find a particular project in a list of projects. This project indexes all the IP cores and improves the search experience on LibreCores in terms of discoverability: the details of each IP core are classified and indexed to make searching better and more efficient.
In this project I worked on defining a classification hierarchy, classifying the projects into categories, and building a search engine that raises the quality of search results and makes the search experience more interactive.
I worked on my GSoC project from May to August 2018. My work was divided into four phases:
- Defining a classification hierarchy.
- Specifying and deleting classifications for projects.
- Indexing Projects, Organizations and Users data into Algolia.
- Basic search features (Autocomplete and InstantSearch).
The original proposal for the project is available here.
The entire tracking map for my project can be found here.
The list of all issues and PRs (tagged GSoC 2018) for my project can be found here.
The corresponding project on GSoC 2018 Website can be found here.
Defining a Classification Hierarchy
During my community bonding period, I worked on the database structures for the classification hierarchy and the project classifications. The first objective was to decide how to structure the classification hierarchy. My mentor Phillip and I decided on a faceted classification that project owners can assign to their projects. During the community bonding period, Phillip and I worked on the structure of the classification hierarchy, on the Classification Hierarchy entity that stores the categories, and on the Project Classification entity that stores the classifications specified for each project.
During my first month of Google Summer of Code, Phillip created an issue specifying the classification hierarchy (#235), while I worked on the database entity structure for the classification hierarchy, which can be found here, and for the project classification, which can be found here.
In this phase I made the following PR:
- #234 — Create database entities and update the database
Specifying and Deleting Classification for a Project
In the second phase, during the first evaluation period, I worked on adding and deleting classifications for a project. This part included UI as well as backend work.
First, I created the respective issues to work on. The backend work includes validating the classification and inserting the data. I implemented the user interface in the project settings page, so users can now specify and delete classifications there. For specifying classifications I proposed using the HTML `select` element: when you select a parent category, its child categories are then displayed. For security reasons, my mentor and I decided on two-stage validation. The first stage happens in the UI (the `select` element) and the second stage happens in the backend itself, so an incorrect classification cannot be added to a project.
New classifications are added when the user updates the project, and deletion works the same way.
In this phase I made the following PRs:
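The second, backend stage of the validation described above can be sketched as follows. This is only an illustration in Python, not the actual LibreCores code (which is PHP on Symfony); the category names, the `::` separator and the function name are all assumptions made for the example.

```python
# Hypothetical hierarchy: each key is a category, each value holds its children.
HIERARCHY = {
    "Hardware": {
        "Processor": {"RISC-V": {}, "OpenRISC": {}},
        "Memory": {},
    },
    "Tool": {"Simulator": {}},
}

SEPARATOR = "::"  # assumed separator; the real storage format may differ


def is_valid_classification(path: str, hierarchy: dict = HIERARCHY) -> bool:
    """Return True only if every level of `path` exists under its parent."""
    if not path.strip():
        return False
    node = hierarchy
    for level in path.split(SEPARATOR):
        level = level.strip()
        if level not in node:
            # The UI should never offer this choice, but a hand-crafted
            # request could still send it, so the backend rejects it here.
            return False
        node = node[level]
    return True
```

With this sketch, `is_valid_classification("Hardware::Processor::RISC-V")` is accepted, while `"Hardware::Simulator"` is rejected because `Simulator` is not a child of `Hardware`.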
- #236 — Insert and Delete Project Classifications
- #273 (Bug Fix) — Resolve Classification validation Bug
Indexing Projects, Organizations and Users Data into Algolia
I planned to work on indexing data into Algolia and implementing the basic search feature during my second month of Google Summer of Code. Phillip and I decided to install algolia-search-bundle, as our project is based on the Symfony framework. My initial work was divided into two phases:
- Indexing data into Algolia
- Implement basic search engine features
After going through the previous search functionality of LibreCores, I decided to index Project, Organization and User data into Algolia through custom normalizers of the search bundle. After discussing with Phillip, I listed all the fields that needed to be indexed and wrote the normalizers here (Serializer/Normalizer).
I made the required configuration change in the config.yml file.
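To illustrate what a custom normalizer does, here is a minimal Python sketch: it turns an entity into the flat record that gets sent to the search index and leaves out everything that should not be searchable. The `Project` fields and the `normalize_project` name are hypothetical; the real normalizers are PHP classes with a different set of fields. Only `objectID` is taken from Algolia, where it identifies a record.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    # Hypothetical fields; the real entity has different attributes.
    id: int
    name: str
    tagline: str
    owner_email: str  # example of a field that should not be indexed
    classifications: list = field(default_factory=list)


def normalize_project(project: Project) -> dict:
    """Build the flat record for the index, keeping only the chosen fields."""
    return {
        "objectID": f"project_{project.id}",
        "name": project.name,
        "tagline": project.tagline,
        "classifications": list(project.classifications),
    }
```

Listing the indexed fields explicitly, rather than serializing the whole entity, keeps private data out of the index and keeps the records small.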
In this phase I made the following PRs:
- #244 — Algolia integration and Indexing Entities
- #262 (Bug fix) — Index project classification data when project is not being updated.
Basic search features (Autocomplete and InstantSearch)
For this phase I divided my work into three stages:
- Algolia Autocomplete
- Algolia Instant search
- Searching through the classifications
I first worked on the Algolia autocomplete feature. I implemented it on the home page, where a user can search across the different sections, using autocomplete.js. In its current state, the project has autocomplete over users, projects and organizations. The suggestions are shown in a dropdown: project suggestions come first, then organizations, then users. For now I have configured it to show at most five suggestions per section.
You can find the configuration for it here.
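The ordering and the per-section limit described above can be sketched as follows. This is an illustration in Python of the behaviour only, not the autocomplete.js configuration itself; the section keys and the `build_dropdown` name are assumptions.

```python
# Sections in the order they appear in the dropdown.
SECTION_ORDER = ["projects", "organizations", "users"]
MAX_PER_SECTION = 5  # at most five suggestions per section


def build_dropdown(hits_by_section: dict) -> list:
    """Return (section, hit) pairs in display order, capped per section."""
    dropdown = []
    for section in SECTION_ORDER:
        hits = hits_by_section.get(section, [])[:MAX_PER_SECTION]
        dropdown.extend((section, hit) for hit in hits)
    return dropdown
```

A section with no hits simply contributes nothing, so the dropdown never shows an empty group.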
The second part of the search feature was implementing instant search on the search results page. I faced some problems implementing instant search across three different indices and specifying the respective template for each index. To overcome this, I wrote a config that selects the index the user has chosen. Because each index holds different data, it took me some time to filter all the data and render it accordingly.
You can find the backend code here.
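The idea of one config that picks the index and its template can be sketched like this. It is a simplified Python illustration; the index names, hit fields and templates are invented for the example and do not reflect the real configuration.

```python
# Hypothetical mapping from the search type chosen by the user to the
# index to query and the template used to render each hit.
INDEX_CONFIG = {
    "projects": {"index": "projects", "template": "{name}: {tagline}"},
    "organizations": {"index": "organizations", "template": "{name} (organization)"},
    "users": {"index": "users", "template": "{username} (user)"},
}


def render_hits(search_type: str, hits: list) -> list:
    """Render hits with the template that belongs to the selected index."""
    config = INDEX_CONFIG.get(search_type)
    if config is None:
        raise ValueError(f"unknown search type: {search_type}")
    return [config["template"].format(**hit) for hit in hits]
```

Keeping the per-index differences in one table means the rendering code itself does not need to know which index it is dealing with.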
Searching through classifications
The second objective was to implement searching through classifications. I implemented two things for this feature: classifications can be searched through the search box, and they can be browsed through the Algolia hierarchicalMenu. The hierarchicalMenu provides multi-level search over a hierarchical tree. Since this satisfies our requirement, I decided to keep it as part of instant search. The problem is that Algolia’s backend does not support multi-search for the classifications: only one hierarchical tree can be searched at a time.
To build Algolia’s hierarchicalMenu I pushed the data in a specific format, i.e. I specified the data for each level and indexed it in Algolia.
In this phase I made the following PRs:
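The per-level format can be sketched as below. To my understanding, the hierarchicalMenu widget expects one attribute per level (`lvl0`, `lvl1`, ...), where each value contains the full path down to that level joined by a separator (`" > "` by default). The category names and the function name are assumptions for the example, and the attribute names used in LibreCores may differ.

```python
def hierarchical_levels(path: list, separator: str = " > ") -> dict:
    """Expand a classification path into one entry per hierarchy level.

    Each level stores the full path up to and including that level.
    """
    return {
        f"lvl{depth}": separator.join(path[: depth + 1])
        for depth in range(len(path))
    }


# Example record fragment for a project classified three levels deep.
record = {"classifications": hierarchical_levels(["Hardware", "Processor", "RISC-V"])}
```

Storing the full path at every level is what lets the menu tell apart two child categories that share a name but sit under different parents.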
- #246 — Algolia instant search and Autocomplete feature.
- #257 (Bug fix) — Resolve Search for classification bug
- #256 (Bug fix) — Remove space after user or org in search result
- #270 (Bug fix) — Remove space between search box and search suggestion
- #275 (Bug fix) — Remove comma separation in classification
Apart from the main deliverables above, I also contributed a few other patches.
- PR #208 — Add a emailer and send confirmation email
- PR #213 — Make list of GitHub repos searchable in project import
- PR #216 — Styling and content of page after email is verified
What tasks were accomplished
Although this was my task for GSoC, there is still much scope for future work. The Algolia integration and the project classifications can be improved, and projects can be categorized into more sections. The custom ranking formula can be improved. Integrating more of Algolia’s functionality would make the search experience much better.
What did I learn from Google Summer of Code
- Valuable experience in designing database schemas — denormalizing schemas for performance reasons.
- How search engine functionality works, using Algolia.
- The structure and workings of the Symfony framework.
- Observed a lot of patterns, styles, conventions and best-practices of development in the real (open-source) world.
- Learned a lot about the importance of linters, CI, and code styles while working on a project with many contributors.
The actual blog is at sandipbhuyan.com