Building Trending Activity Feeds Using GraphQL And Neo4j

Leveraging the Neo4j Community Graph For Social Discovery


Trending activity feeds for the Neo4j Community Forum. Data comes from a GraphQL API on top of Neo4j Community Graph.

Recently the Neo4j Developer Relations team decided to devote a few hours to work together, hackathon style, to launch a new Neo4j community forum. One feature we decided was needed for launch was a trending activity feed that showed blog posts and projects that are currently trending in the community.

Fortunately, my colleague Mark Needham has built out the Neo4j Community Graph, a series of serverless functions that import data from APIs like GitHub, Twitter, Meetup, and Stack Overflow into a Neo4j instance. We use Neo4j Community Graph to keep track of what’s going on in the Neo4j community. In fact, if you’ve seen Mark’s newsletter This Week In Neo4j (AKA TWIN4j), a few queries against the community graph surface each week’s content.

In this post I want to explain how we built out the trending activity feed feature on the Neo4j Community Forum using a GraphQL First development approach.

Community Graph

Data model for the Community Graph. Combining data sets and querying across them is a natural fit for a graph database like Neo4j.

One of the benefits of a graph database like Neo4j is the ease with which it allows for combining datasets and querying across them. We’ve been using Neo4j to help keep track of what the Neo4j community has been working on. Neo4j Community Graph is powered by a set of serverless functions that run on AWS Lambda, periodically checking the APIs of GitHub, Twitter, Stack Overflow, and Meetup for new Neo4j-related updates. Any matches are imported into Neo4j. Mark has even created a community graph CLI so that anyone can spin up a community graph for their community!

The Awesome Discourse API

The first step was adding content from our new Discourse forum into the community graph. Discourse exposes an API that lets developers build tooling on top of Discourse. In this case Mark registered a webhook on the Discourse API so that events are sent to the serverless community graph ingest Lambda functions. The new Discourse site thus becomes another developer community data set in the community graph. This allows us to capture new Discourse users and new posts, including likes and replies. We link Discourse users in the graph with their other accounts (GitHub, Twitter, Stack Overflow, etc.) when a user fills in their Discourse profile with their user names across services.

Community Trending Activity Feeds

Trending activity feeds provide an at-a-glance overview of what is going on in the community, quickly showing the posts, topics, and projects that are trending. Rather than just showing the most recent posts, we want to rank items so that the feed surfaces the content viewers are most likely to find interesting.

Exponential Time Decay Functions

Discussion forums such as Hacker News and Reddit make use of exponential time decay functions to decide which posts float to the top and which are banished to the second page and beyond. The basis for these ranking algorithms is a function that calculates a score for each post: a rating divided by the time since the post was created, raised to some exponent. (Strictly speaking, raising time to a fixed exponent gives a power-law decay rather than a true exponential one, but the effect is the same: older posts are penalized heavily.) The rating can consist of things like upvotes, views, subjective scoring by a reviewer, or, as in our case, all three. You can find implementations of the functions used by Hacker News and Reddit, but for our purposes we’ll use a much simpler version.

A simple exponential time decay function for scoring posts in a discussion forum. Functions like this allow new content to appear high on the leaderboard, but quickly drop off if the community does not find it engaging.

Here’s how we can query the community graph using Cypher to find our top community blog posts using an exponential decay function:

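const cypherQuery = `MATCH (u:DiscourseUser)-[:POSTED_CONTENT]->(t:DiscourseTopic)
WHERE t.approved AND NOT "Exclude" IN labels(t)
// Age of the topic, in units of 10,000 seconds
WITH *,
  1.0 * (duration.inSeconds(t.createdAt, datetime())).seconds/10000
  AS ago
// Sort by score so that COLLECT(t)[0] is each user's top-scoring topic
WITH u, t, ( (10.0 * t.rating) + t.likeCount + t.replyCount)/(ago^2) AS score
  ORDER BY score DESC
WITH u, COLLECT(t)[0] AS topic, max(score) AS topScore
// toInteger guards against the JavaScript driver sending $first as a float
RETURN u, topic ORDER BY topScore DESC LIMIT toInteger($first)`;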

An exponential time decay function with these parameters surfaces almost every new post near the top at first, giving the community a chance to vote on it. Posts that the community votes up, or that spark active discussion, stay at the top, while posts that are less captivating to the community drop off the list.
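To get a feel for how quickly scores fall off, here is a minimal JavaScript sketch of the same arithmetic. The weights (10x the rating, a time unit of 10,000 seconds, an exponent of 2) mirror the Cypher query above; the function name `trendingScore` and the sample topics are made up for illustration and are not part of the community graph code.

```javascript
// Same arithmetic as the Cypher query: weighted rating divided by age squared.
// `createdAt` and `now` are epoch milliseconds.
function trendingScore(topic, now = Date.now()) {
  const ageSeconds = (now - topic.createdAt) / 1000;
  const ago = ageSeconds / 10000; // same scaling as the query
  const rating = 10.0 * topic.rating + topic.likeCount + topic.replyCount;
  return rating / Math.pow(ago, 2);
}

// Illustrative data only; a fixed "now" keeps the example deterministic.
const now = Date.UTC(2018, 8, 1, 12, 0, 0);
const HOUR = 60 * 60 * 1000;
const topics = [
  { title: "Week-old hit", rating: 5, likeCount: 40, replyCount: 10, createdAt: now - 7 * 24 * HOUR },
  { title: "Fresh post", rating: 1, likeCount: 0, replyCount: 0, createdAt: now - 2 * HOUR },
  { title: "Day-old discussion", rating: 2, likeCount: 5, replyCount: 8, createdAt: now - 24 * HOUR }
];

// Highest score first, as in the feed.
const ranked = topics
  .map(t => ({ title: t.title, score: trendingScore(t, now) }))
  .sort((a, b) => b.score - a.score);

console.log(ranked.map(t => t.title));
```

With these made-up numbers the two-hour-old post ranks first, ahead of the day-old discussion and the week-old post, even though the week-old post has by far the most likes.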

GraphQL API

We expose a GraphQL API on top of the Neo4j community graph instance hosted on Neo4j Cloud that we can query from Discourse to populate the activity feed on the landing page.

The GraphQL Schema

Defining the GraphQL schema was actually one of the first steps we took. Once we defined the schema we were able to just return mocked data from the GraphQL API so Jennifer and David could work on the UI. Here’s the GraphQL schema for our top blog post query:

type CommunityBlog {
  title: String
  url: String
  author: DiscourseUser
}

type DiscourseUser {
  name: String
  screenName: String
  avatar: String
}

type Query {
  topCommunityBlogsAndContent(first: Int = 10): [CommunityBlog]
}
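As a rough illustration of the mocking step, a resolver map that serves static data for this schema could look like the sketch below. The names `mockBlogs` and `mockResolvers` and every value in the mock data are placeholders invented for this example, not the data we actually used.

```javascript
// Placeholder data shaped like the CommunityBlog type in the schema.
const mockBlogs = [
  {
    title: "Mock post one",
    url: "https://example.com/t/mock-post-one",
    author: { name: "Ada Example", screenName: "ada", avatar: "https://example.com/avatars/ada.png" }
  },
  {
    title: "Mock post two",
    url: "https://example.com/t/mock-post-two",
    author: { name: "Bob Example", screenName: "bob", avatar: "https://example.com/avatars/bob.png" }
  },
  {
    title: "Mock post three",
    url: "https://example.com/t/mock-post-three",
    author: { name: "Cy Example", screenName: "cy", avatar: "https://example.com/avatars/cy.png" }
  }
];

// Same shape as the real resolver map, but with no database behind it.
const mockResolvers = {
  Query: {
    // Honor the `first` argument so the UI sees a realistic list length.
    topCommunityBlogsAndContent: (_, { first = 10 }) => mockBlogs.slice(0, first)
  }
};

console.log(mockResolvers.Query.topCommunityBlogsAndContent(null, { first: 2 }));
```

Because the mock map has the same shape as the real one, swapping in the Cypher-backed resolvers later should not require any change on the UI side.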

Resolvers

In a GraphQL server, resolvers are the functions that contain the logic for fetching data from our data layer. In our case each query entry point to the GraphQL API is a single Cypher query, so we can use the Neo4j JavaScript driver to execute our Cypher query and return the results, ordered by our time decay ranking score.

export const resolvers = {
  Query: {
    topCommunityBlogsAndContent: (_, params, context) => {
      const session = context.driver.session();
      const baseUrl = "https://community.neo4j.com/";
      // cypherQuery is defined above
      return session
        .run(cypherQuery, params)
        .then(result =>
          // Map each record to the shape of the CommunityBlog type
          result.records.map(record => {
            const user = record.get("u").properties;
            const topic = record.get("topic").properties;
            return {
              title: topic.title,
              url: baseUrl + "t/" + topic.slug,
              author: {
                name: user.name,
                screenName: user.screenName,
                // getAvatarUrl is a helper that is not shown in this post
                avatar: getAvatarUrl(user.avatarTemplate)
              }
            };
          })
        )
        .catch(error => {
          console.log(error);
          // Rethrow so GraphQL reports the error instead of returning null
          throw error;
        })
        .finally(() => {
          session.close();
        });
    }
  }
};
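The resolver calls a `getAvatarUrl` helper that is not shown in this post. The following is only a guess at what such a helper might do, assuming the `avatarTemplate` property holds a Discourse-style template containing a `{size}` placeholder; the default size and the handling of absolute URLs are assumptions of this sketch.

```javascript
// Hypothetical helper: turns an avatar template into a usable image URL.
const FORUM_BASE_URL = "https://community.neo4j.com";

function getAvatarUrl(avatarTemplate, size = 45) {
  // Users without an avatar template get no avatar.
  if (!avatarTemplate) return null;
  // Fill in the requested pixel size.
  const path = avatarTemplate.replace("{size}", String(size));
  // Templates that are already absolute URLs are returned unchanged;
  // relative ones are resolved against the forum's base URL.
  return /^https?:\/\//.test(path) ? path : FORUM_BASE_URL + path;
}

console.log(getAvatarUrl("/user_avatar/community.neo4j.com/alice/{size}/1_2.png", 90));
```

Returning null for a missing template is compatible with the schema, since `avatar` is declared as a nullable String.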

Apollo Server

We use Apollo Server in a Node.js app to allow us to build a GraphQL API from a GraphQL schema (type definitions) and resolvers. We inject a Neo4j driver instance into the context so that it is available in the resolver functions.

import { typeDefs, resolvers } from "./graphql-schema";
import { ApolloServer, makeExecutableSchema } from "apollo-server";
import { v1 as neo4j } from "neo4j-driver";
import dotenv from "dotenv";

dotenv.config();

const schema = makeExecutableSchema({
  typeDefs,
  resolvers
});

const driver = neo4j.driver(
  process.env.NEO4J_URI || "bolt://localhost:7687",
  neo4j.auth.basic(
    process.env.NEO4J_USER || "neo4j",
    process.env.NEO4J_PASSWORD || "neo4j"
  )
);

const server = new ApolloServer({
  context: { driver },
  schema: schema
});

server.listen(process.env.GRAPHQL_LISTEN_PORT, "0.0.0.0").then(({ url }) => {
  console.log(`GraphQL API ready at ${url}`);
});

One of the benefits of using Apollo Server is that we automatically get GraphQL Playground, a tool for exploring GraphQL APIs. GraphQL Playground allows us to explore the schema, including entry points and types, and to execute GraphQL requests against the API.

GraphQL Playground makes it easy to explore GraphQL APIs. Try it here.
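For example, a query like the one in the sketch below can be pasted into GraphQL Playground, or sent from a client as a plain HTTP POST. The helper names `buildTrendingRequest` and `fetchTrendingPosts` are hypothetical, the endpoint is a placeholder you would replace with wherever the API is deployed, and the fetch call assumes a runtime with a global `fetch` (browsers or Node.js 18+).

```javascript
// Query against the schema defined earlier in this post.
const TRENDING_QUERY = `
  query TrendingPosts($first: Int) {
    topCommunityBlogsAndContent(first: $first) {
      title
      url
      author {
        name
        screenName
        avatar
      }
    }
  }
`;

// Builds the options for a standard GraphQL-over-HTTP POST request.
function buildTrendingRequest(first = 5) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query: TRENDING_QUERY, variables: { first } })
  };
}

// `endpoint` is a placeholder for the deployed GraphQL API URL.
async function fetchTrendingPosts(endpoint, first = 5) {
  const response = await fetch(endpoint, buildTrendingRequest(first));
  const { data, errors } = await response.json();
  // GraphQL reports failures in an `errors` array rather than via HTTP status.
  if (errors) throw new Error(errors.map(e => e.message).join("; "));
  return data.topCommunityBlogsAndContent;
}
```

Keeping the request builder separate from the network call makes it easy to inspect exactly what is sent without needing a running server.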

The Neo4j Community Forum Landing Page

The GraphQL First development approach was a fun and productive way to build this feature: we were able to put it into production after a few hours of collaborative work. When all was said and done, here’s how the landing page looks:

Not bad for a few hours hacking on a rainy day ☔ Check it out here.

I skipped over how we populate the “This Week in Neo4j”, “Popular Community Projects”, and “New Certified Developers” sections, but it all comes from Neo4j Community Graph! You can find the GraphQL API code here to see how those bits work.

neo4j-graphql-js and GRANDstack

If you’re interested in exposing a GraphQL API on top of a graph database like Neo4j (we think GraphQL and graph databases are an obvious match for each other) be sure to check out the neo4j-graphql-js library and GRANDstack for building full stack applications.

Be sure to join the Neo4j community forum on Discourse.

You can find the code for the GraphQL API here.

Further Reading

GraphQL Community Graph. A GraphQL API for the GraphQL Community Graph on top of Neo4j.
