Building Trending Activity Feeds Using GraphQL And Neo4j
Leveraging the Neo4j Community Graph For Social Discovery
Recently the Neo4j Developer Relations team decided to devote a few hours to work together, hackathon style, to launch a new Neo4j community forum. One feature we decided was needed for launch was a trending activity feed that showed blog posts and projects that are currently trending in the community.
Fortunately, my colleague Mark Needham has built out the Neo4j Community Graph, a series of serverless functions that import data from APIs like GitHub, Twitter, Meetup, and Stack Overflow into a Neo4j instance. We use Neo4j Community Graph to keep track of what’s going on in the Neo4j community. In fact, if you’ve seen Mark’s newsletter This Week In Neo4j (AKA TWIN4j), a few queries to community graph surface each week’s content.
In this post I want to explain how we built out the trending activity feed feature on the Neo4j Community Forum using a GraphQL First development approach.
Community Graph
One of the benefits of a graph database like Neo4j is the ease with which it allows for combining datasets and querying across them. We’ve been using Neo4j to help keep track of what the Neo4j community has been working on. Neo4j Community Graph is powered by a set of serverless functions that run on AWS Lambda, periodically checking the APIs of GitHub, Twitter, Stack Overflow, and Meetup for new Neo4j-related updates. Any matches are imported into Neo4j. Mark has even created a community graph CLI so that anyone can spin up a community graph for their community!
The Awesome Discourse API
The first step was adding content from our new Discourse forum into community graph. Discourse exposes an API to enable developers to build tooling on top of Discourse. In this case Mark registered a webhook on the Discourse API so that events are sent to the serverless community graph ingest Lambda functions. The new Discourse site thus becomes one more developer community data set in community graph. This allows us to capture new Discourse users and new posts, including likes and replies. When a user fills in their Discourse profile with their user names on other services (GitHub, Twitter, Stack Overflow, etc.), we link the Discourse user in the graph with those accounts.
Community Trending Activity Feeds
Trending activity feeds provide an overview of what is going on in the community at a glance and quickly show the posts, topics, and projects that are trending. Rather than just showing the most recent posts, we want to rank items so that the feed surfaces the content that is currently trending in the community.
Exponential Time Decay Functions
Discussion forums such as Hacker News and Reddit make use of exponential time decay functions to decide which posts float to the top and which are banished to the second page and beyond. The basis for these ranking algorithms is a function that calculates a score for each post: a rating divided by the time since the post was created, raised to some exponent. The rating can consist of things like upvotes, views, subjective scoring by a reviewer, or, as in our case, a combination of a reviewer rating, likes, and replies. You can find implementations of the functions used by Hacker News and Reddit, but for our purposes we’ll use a much simpler version.
Here’s how we can query the community graph using Cypher to find our top community blog posts using an exponential decay function:
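const cypherQuery = `MATCH (u:DiscourseUser)-[:POSTED_CONTENT]->(t:DiscourseTopic)
WHERE t.approved AND NOT "Exclude" IN labels(t)
WITH *,
1.0 * (duration.inSeconds(datetime(), t.createdAt)).seconds/10000
AS ago
// ago is negative here (createdAt is in the past); squaring it makes the sign irrelevant
WITH u, t, ( (10.0 * t.rating) + t.likeCount + t.replyCount)/(ago^2) AS score
ORDER BY score DESC
// keep the best-scoring topic per user and carry its score forward for the final ordering
WITH u, COLLECT(t)[0] AS topic, max(score) AS score
RETURN u, topic ORDER BY score DESC LIMIT $first`;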
An exponential time decay function with these parameters will surface almost every new post to the top, giving the community a chance to vote on it. Posts that the community votes up or that attract active discussion stay at the top, while posts that are less captivating to the community drop off the list.
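To make the ranking behaviour concrete, here is a minimal JavaScript sketch of the same scoring formula applied to two made-up posts. The function name `decayScore`, the `MIN_AGO` guard against division by zero, and the sample data are all assumptions added for illustration; the production ranking runs inside the Cypher query, not in JavaScript.

```javascript
// Mirrors the Cypher expression:
//   ago   = seconds since creation / 10000
//   score = (10 * rating + likeCount + replyCount) / ago^2
// MIN_AGO is an assumption added here to avoid dividing by zero for brand-new posts.
const MIN_AGO = 0.001;

function decayScore(topic, now = new Date()) {
  const seconds = (now.getTime() - topic.createdAt.getTime()) / 1000;
  const ago = Math.max(seconds / 10000, MIN_AGO);
  const engagement =
    10.0 * (topic.rating || 0) + (topic.likeCount || 0) + (topic.replyCount || 0);
  return engagement / (ago * ago);
}

// Hypothetical sample data, not taken from the community graph.
const now = new Date("2018-09-01T12:00:00Z");
const topics = [
  {
    title: "Day-old post with lots of activity",
    rating: 3,
    likeCount: 5,
    replyCount: 4,
    createdAt: new Date("2018-08-31T12:00:00Z")
  },
  {
    title: "Hour-old post with one like",
    rating: 0,
    likeCount: 1,
    replyCount: 0,
    createdAt: new Date("2018-09-01T11:00:00Z")
  }
];

// Highest score first, as in the activity feed.
const ranked = topics
  .map(t => ({ title: t.title, score: decayScore(t, now) }))
  .sort((a, b) => b.score - a.score);

console.log(ranked);
```

With these numbers the hour-old post scores roughly 7.7 and the day-old post roughly 0.5, so a single like on a fresh post outweighs much heavier engagement on a post from the previous day. Because the age term is squared, that advantage fades quickly unless the post keeps collecting likes and replies. Note that under this formula a post with no rating, likes, or replies scores zero regardless of its age.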
GraphQL API
We expose a GraphQL API on top of the Neo4j community graph instance hosted on Neo4j Cloud that we can query from Discourse to populate the activity feed on the landing page.
The GraphQL Schema
Defining the GraphQL schema was actually one of the first steps we took. Once we defined the schema we were able to just return mocked data from the GraphQL API so Jennifer and David could work on the UI. Here’s the GraphQL schema for our top blog post query:
type CommunityBlog {
  title: String
  url: String
  author: DiscourseUser
}

type DiscourseUser {
  name: String
  screenName: String
  avatar: String
}

type Query {
  topCommunityBlogsAndContent(first: Int = 10): [CommunityBlog]
}
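As an illustration of the mocked-data step, a resolver map along the following lines could stand in for the real one while the UI is being built. This is only a sketch: the `mockBlogs` entries and the name `mockResolvers` are hypothetical, and the mocks we actually used are not shown in this post.

```javascript
// Hypothetical placeholder data shaped like the CommunityBlog type in the schema.
const mockBlogs = [
  {
    title: "Example post one",
    url: "https://community.neo4j.com/t/example-post-one",
    author: { name: "Ada Example", screenName: "ada", avatar: null }
  },
  {
    title: "Example post two",
    url: "https://community.neo4j.com/t/example-post-two",
    author: { name: "Bob Example", screenName: "bob", avatar: null }
  },
  {
    title: "Example post three",
    url: "https://community.neo4j.com/t/example-post-three",
    author: { name: "Cy Example", screenName: "cy", avatar: null }
  }
];

// Same shape as the real resolver map, but without any database access.
const mockResolvers = {
  Query: {
    // Honour the `first` argument so the UI can exercise pagination limits.
    topCommunityBlogsAndContent: (_, { first = 10 }) => mockBlogs.slice(0, first)
  }
};

console.log(mockResolvers.Query.topCommunityBlogsAndContent(null, { first: 2 }));
```

Because the mock resolver returns objects with exactly the fields declared in the schema, the front end can be written against the final query shape, and swapping in the Neo4j-backed resolver later requires no client changes.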
Resolvers
In a GraphQL server, resolvers are the functions that contain the logic for fetching data from our data layer. In our case each query entry point to the GraphQL API is a single Cypher query, so we can use the Neo4j JavaScript driver to execute our Cypher query and return the results, ordered by our time decay ranking score.
export const resolvers = {
  Query: {
    topCommunityBlogsAndContent: (_, params, context) => {
      let session = context.driver.session();
      const baseUrl = "https://community.neo4j.com/";

      // cypherQuery is defined above
      return session
        .run(cypherQuery, params)
        .then(result => {
          // Map each Neo4j record onto the CommunityBlog type from the schema
          const resData = result.records.map(record => {
            const user = record.get("u").properties,
              topic = record.get("topic").properties;

            return {
              title: topic.title,
              url: baseUrl + "t/" + topic.slug,
              author: {
                name: user.name,
                screenName: user.screenName,
                // getAvatarUrl is a helper that is not shown in this post
                avatar: getAvatarUrl(user.avatarTemplate)
              }
            };
          });
          return resData;
        })
        .catch(error => {
          console.log(error);
        })
        .finally(() => {
          session.close();
        });
    }
  }
};
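The resolver calls a `getAvatarUrl` helper that is not shown here. A minimal sketch of what such a helper might look like follows, assuming that `avatarTemplate` holds a Discourse-style template containing a `{size}` placeholder; the default size and the `FORUM_BASE_URL` constant are assumptions made for this example.

```javascript
// Assumption: avatarTemplate is a Discourse-style template such as
// "/user_avatar/community.neo4j.com/alice/{size}/42_2.png".
const FORUM_BASE_URL = "https://community.neo4j.com";

function getAvatarUrl(avatarTemplate, size = 50) {
  if (!avatarTemplate) return null;

  // Fill in the requested pixel size.
  const path = avatarTemplate.replace("{size}", String(size));

  // Absolute URLs are returned untouched.
  if (/^https?:\/\//.test(path)) return path;
  // Protocol-relative URLs only need a scheme.
  if (path.startsWith("//")) return "https:" + path;
  // Everything else is treated as a path on the forum host.
  return FORUM_BASE_URL + path;
}

console.log(getAvatarUrl("/user_avatar/community.neo4j.com/alice/{size}/42_2.png", 120));
```

Returning `null` for a missing template matches the nullable `avatar: String` field in the schema, so users without an avatar do not cause the whole query to fail.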
Apollo Server
We use Apollo Server in a Node.js app to allow us to build a GraphQL API from a GraphQL schema (type definitions) and resolvers. We inject a Neo4j driver instance into the context so that it is available in the resolver functions.
import { typeDefs, resolvers } from "./graphql-schema";
import { ApolloServer, makeExecutableSchema } from "apollo-server";
import { v1 as neo4j } from "neo4j-driver";
import dotenv from "dotenv";

// Load NEO4J_* and GRAPHQL_LISTEN_PORT from a .env file, if present
dotenv.config();

const schema = makeExecutableSchema({
  typeDefs,
  resolvers
});

const driver = neo4j.driver(
  process.env.NEO4J_URI || "bolt://localhost:7687",
  neo4j.auth.basic(
    process.env.NEO4J_USER || "neo4j",
    process.env.NEO4J_PASSWORD || "neo4j"
  )
);

// The driver is injected into the context so resolvers can open sessions
const server = new ApolloServer({
  context: { driver },
  schema: schema
});

server.listen(process.env.GRAPHQL_LISTEN_PORT, "0.0.0.0").then(({ url }) => {
  console.log(`GraphQL API ready at ${url}`);
});
One of the benefits of using Apollo Server is that we automatically get GraphQL Playground, a tool for exploring GraphQL APIs. GraphQL Playground allows us to explore the schema, including entry points and types, and to execute GraphQL requests against the API.
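Outside of GraphQL Playground, a client such as the forum landing page can send the same kind of request over plain HTTP. The following is a sketch under a few assumptions: the function names `buildFeedRequest` and `fetchFeed` are made up for this example, the runtime is assumed to provide a global `fetch` (browsers, Node.js 18+), and the endpoint URL is left to the caller because the real one is not given in this post.

```javascript
// Builds the JSON body for a GraphQL POST request against the schema above.
function buildFeedRequest(first = 5) {
  return {
    query: `
      query TopBlogs($first: Int) {
        topCommunityBlogsAndContent(first: $first) {
          title
          url
          author {
            name
            screenName
            avatar
          }
        }
      }`,
    variables: { first }
  };
}

// Sends the request and returns the list of blog posts.
// `endpoint` is a placeholder for wherever the GraphQL API is deployed.
async function fetchFeed(endpoint, first) {
  const response = await fetch(endpoint, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildFeedRequest(first))
  });
  const { data, errors } = await response.json();
  if (errors) {
    throw new Error(errors.map(e => e.message).join("; "));
  }
  return data.topCommunityBlogsAndContent;
}

console.log(JSON.stringify(buildFeedRequest(3).variables));
```

Passing `first` as a GraphQL variable rather than interpolating it into the query string keeps the query text constant, and the client only asks for the fields the feed actually renders.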
The Neo4j Community Forum Landing Page
The GraphQL first development approach was a fun and productive way for us to build out this feature quickly: we were able to put it into production after a few hours of collaborative work. When it’s all said and done, here’s how the landing page looks:
I skipped over how we populate the “This Week in Neo4j”, “Popular Community Projects”, and “New Certified Developers” sections, but it all comes from Neo4j Community Graph! You can find the GraphQL API code here to see how those bits work.
neo4j-graphql-js and GRANDstack
If you’re interested in exposing a GraphQL API on top of a graph database like Neo4j (we think GraphQL and graph databases are an obvious match for each other) be sure to check out the neo4j-graphql-js library and GRANDstack for building full stack applications.
Be sure to join the Neo4j community forum on Discourse
You can find the code for the GraphQL API here.
Further Reading
And finally, here are a few more resources about Community Graph:
- Community Graph GitHub org, which includes code for spinning up your own community graph
- GraphQL APIs for Developer Communities
- Five Common GraphQL Problems and How Neo4j-GraphQL Aims to Solve Them
- GRANDstack.io — GraphQL, React, Apollo, Neo4j Database full stack development