Ljubica Lazarevic
Neo4j Developer Blog
7 min read · Aug 3, 2020


*Update! Now with hints and solution!*

Summer of Nodes: Week 1 — The Barbecue

Hello everybody!

Summer of Nodes 2020 is now over. If you’ve not had a chance to look at the challenges, you can always have a go at your leisure:

First week’s theme — the Barbecue

Nothing better marks summertime than a barbecue — it is a wonderful combination of food, amazing weather, and great company with friends and family. Whilst the current situation is making the latter a bit harder, it is still possible to share the occasion through social distancing and virtual means.

Our first Summer of Nodes challenges are about learning more about barbecues around the world, and planning for that socially-distanced event!

These challenges should take approximately an hour each, depending on experience. Of course, you are welcome to try both challenges!

Beginner’s challenge: model barbecues around the world!

Introduction

There are many different ways one can barbecue: from the different cooking fuels in use and the different types of cooking implements, down to cooking style (nice and quick, or low and slow), and what actually makes it onto the grill! There are many country and regional variations worldwide, and you can find out more in this Wikipedia article: r.neo4j.com/wiki-bbq

This week’s challenge — based on the article, think about how you’d model all of the variety in a barbecue. The types of questions we might ask are:

  • What is the most common type of barbecue?
  • What countries tend to have vegetarian-friendly barbecues, based on the food?
  • What type of food is the most popular?
  • Is there a relationship between cooking surface, cooking type and fuel?

Potential modelling tools and links:

If you’re new to graph data modelling, you may find this guide helpful. You can also complete this free online training course.

You can use any sensible modelling tool; pen and paper are acceptable! Can’t think of one? We suggest you use Arrows. You can find out how to use it here.

What are we looking for?

Use these labels: Barbecue, Country, CookingType, Fuel, Food, Dietary, CookingSurface. Add details, e.g. appropriate relationship types and properties. An example value for a property is fine. Each label should appear on only one node in your model!

Examples of what we’re looking/not looking for!

Hints please!

Ah, go on then!

Make sure you use all of the node labels we have provided you! We asked you a set of questions further up. Think about how you might connect all of the node labels based on the questions we have asked you. You may find it easier to underline the corresponding labels ‘mentioned’ in the questions.

Don’t forget to add details to help you answer the questions. For example, what property might you need to add to Barbecue so that you can determine what is the most common type?

Alex also gave a quick demo of Arrows. You can watch it here.

Experienced challenge: the socially distanced barbecue

Introduction

Around the world we’re starting to see lockdown easing, which is allowing us to once again see our nearest and dearest. Of course, we need to be careful, and maintain social distancing.

This week’s challenge — you will be provided with data to work with. This data contains the guests you’ve invited to your socially-distanced barbecue. The guests are individuals, or members from the same household, and so forth. You are to use the graph to figure out the optimal seating plan for all your guests, whilst respecting social distancing guidelines.

The rules:

  • There are a total of 6 tables, each with 6 places. Each place is 1 metre from the next
  • All guests not from the same household must be at least 2 metres apart from each other. Guests from the same household may sit closer together
  • A table must seat no more than 2 people from the same household
  • A table must have members of at least two households

The data:

You can get the CSV from here.

What are we looking for?

  • The query/ies you used to load and prepare your data
  • The query you ran on this data to allocate guests to tables
  • The output should be: guest, household, tableNumber
  • You do not have to use pure Cypher, you may use the standard Neo4j plugins
  • The queries must be executable!

Hints please!

Of course, this was a bit of a challenge, but a fun one at that. If we revisit the rules, we discover that we will need to:

  • Think about what to do with households with more than 2 people, as they won’t be able to sit at the same table! Think about how you might pre-process the data
  • Think about the maximum number of guests that can sit at a table, e.g. 4 from 2 households, or 3 from 3 households, etc.
  • Think about using a weighting system of some sort. Something that will allow you to determine when the maximum capacity for the table has been reached, that works for the scenarios mentioned above
  • Don’t be shy to draw out some of your thinking on paper, this may help you determine a way to resolve the challenge!
  • We will need to iteratively allocate guests to tables until they are all done. APOC has some tools to help us with that. For example, could apoc.periodic.commit() help us?

Solutions

We covered potential options for solutions in our live stream on 10th August. Watch the recap here.

Beginner’s challenge

There were a number of ways this could be solved, and what we were looking for was:

  • Using all of the node labels provided
  • Choosing sensible relationship types, with no type repetition
  • Choosing sensible property key names
  • A model that could answer the provided questions in a reasonable way
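The first two criteria above are easy to self-check if you write your model down as (start label, relationship type, end label) triples. The Python sketch below does that for one hypothetical model; the relationship type names are illustrative assumptions only, not the reference solution.

```python
REQUIRED_LABELS = {
    "Barbecue", "Country", "CookingType", "Fuel",
    "Food", "Dietary", "CookingSurface",
}

# Hypothetical model: relationship type names are made up for illustration.
model = [
    ("Barbecue", "ORIGINATES_FROM", "Country"),
    ("Barbecue", "COOKED_BY", "CookingType"),
    ("Barbecue", "USES_FUEL", "Fuel"),
    ("Barbecue", "COOKED_ON", "CookingSurface"),
    ("Barbecue", "SERVES", "Food"),
    ("Food", "SUITABLE_FOR", "Dietary"),
]


def review_model(triples):
    """Return (missing labels, repeated relationship types) for a model."""
    used = {label for start, _type, end in triples for label in (start, end)}
    missing = sorted(REQUIRED_LABELS - used)
    types = [rel_type for _start, rel_type, _end in triples]
    repeated = sorted({t for t in types if types.count(t) > 1})
    return missing, repeated


print(review_model(model))
```

Whether the relationship types and property keys are "sensible", and whether the model can answer the questions, still needs a human eye.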

Here is an example of what your data model could have looked like. Don’t worry if it’s not identical!

Your data model may have looked similar to this

Experienced challenge

Again, there were a number of ways this could be solved, and we received some very novel approaches!

Let’s take a quick look at what our options were. In the diagram below, using the rules, we could have one of four seating options (where the same colour indicates members from the same household at the table):

Seating plan options as per the rules

An approach we found helpful to solve this conundrum was to think about applying weights. This was the system we used:

  • Total weight allowed at a table = total distance / minimum distance separation. In this scenario the total weight is 6 / 2 = 3
  • Household weight = 0.5 + number of members in that household / 2. So for a 1-member household it’s 1, for 2 members it’s 1.5, etc.
  • Use household weights to split out to sub households if they’re larger than 2
  • Allocate households to tables based on these weights
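To see how the arithmetic in the list above plays out, here is a small Python sketch of the weight formulas and of the household split. The function names are hypothetical; the numbers come straight from the rules and the weighting system described above.

```python
TOTAL_DISTANCE_M = 6      # as given in the text: 6 places, 1 metre apart
MIN_SEPARATION_M = 2      # minimum distance between different households
TABLE_CAPACITY = TOTAL_DISTANCE_M / MIN_SEPARATION_M  # 6 / 2 = 3.0


def household_weight(members):
    """Household weight = 0.5 + number of members / 2."""
    return 0.5 + members / 2


def split_household(members, max_size=2):
    """Split a household into sub-households of at most max_size members."""
    parts = []
    while members > max_size:
        parts.append(max_size)
        members -= max_size
    if members:
        parts.append(members)
    return parts


for size in range(1, 6):
    parts = split_household(size)
    weights = [household_weight(p) for p in parts]
    print(size, parts, weights)
```

With these weights, two 2-member households (1.5 + 1.5) or three 1-member households (1 + 1 + 1) exactly fill a table, while a 2-member household plus two singles (1.5 + 1 + 1 = 3.5) does not fit.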

We also relied on an assumption about how internal node IDs are allocated: when a household is split out, the ID of the new household node won’t fall immediately before or after the ID of the original household.

Based on these assumptions, the following set of queries was used to load and pre-process the data:

//load the householders
LOAD CSV WITH HEADERS FROM
'https://raw.githubusercontent.com/summer-of-nodes/2020/master/week1/bbq_households.csv' AS row
CREATE (p:Person {name:row.Name})
MERGE (h:Household {id:row.`Household Group`})
ON CREATE SET h.weight = 1.0
ON MATCH SET h.weight = h.weight+0.5
CREATE (p)-[:IN_HOUSEHOLD]->(h);
//split out the households that are bigger than 2 people
CALL apoc.periodic.commit(
"MATCH (h:Household)
WHERE h.weight>1.5 //get the households with more than 2 people
WITH h limit 1
MATCH (h)<-[r]-(p)
WITH * limit 2
MERGE (h2:Household {id:h.id + h.weight, weight:1.5})
MERGE (p)-[:IN_HOUSEHOLD]->(h2)
WITH h, r
DELETE r
WITH DISTINCT h
SET h.weight = h.weight - 1.0
RETURN count(*)", {limit:1});
//create the tables
WITH range(1,6) as tables
UNWIND tables AS no
CREATE (t:Table {number:no, weight:0});
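The load query never counts household members explicitly: the MERGE with ON CREATE / ON MATCH builds up the weight one CSV row at a time. The Python sketch below mimics that accumulation with a dictionary, using hypothetical rows rather than the real CSV, to show that it ends up at 0.5 + members / 2.

```python
def load_weights(rows):
    """Accumulate household weights the way the MERGE clause does."""
    weights = {}
    for _name, household in rows:
        if household not in weights:
            weights[household] = 1.0      # ON CREATE SET h.weight = 1.0
        else:
            weights[household] += 0.5     # ON MATCH SET h.weight = h.weight + 0.5
    return weights


# Hypothetical rows (name, household group), not the challenge data.
rows = [("Ana", "A"), ("Avi", "A"), ("Ada", "A"), ("Bo", "B")]
weights = load_weights(rows)
print(weights)
```

Household A has three members, so its weight is above 1.5 and the second query would split it into a 2-member and a 1-member household.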

And this is a potential query that can be used to allocate the seats:

//allocate seats!
CALL apoc.periodic.commit(
"MATCH (h:Household) WHERE NOT (h)-->()
WITH h ORDER BY h.weight DESC, id(h) LIMIT $limit
MATCH (t:Table) WHERE t.weight+h.weight <= 3.0
WITH t, h order by t.number LIMIT 1
CREATE (h)-[:AT_TABLE]->(t)
SET t.weight=t.weight+h.weight
RETURN count(*)",{limit:1})
YIELD updates
WITH updates AS ignoreMe
MATCH (p:Person)-->(h:Household)-->(t:Table)
RETURN p.name AS guest, left(h.id,1) AS household, t.number AS tableNumber ORDER BY tableNumber
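If you want to trace the allocation logic without a database, the loop can be imitated in a few lines of Python. This is only a sketch of the same greedy idea (heaviest household first, lowest-numbered table with room), assuming list order stands in for internal node ID order; the function name `allocate` and the sample households are hypothetical.

```python
TABLE_CAPACITY = 3.0   # total weight allowed at a table
N_TABLES = 6


def allocate(households):
    """Assign (household_id, weight) pairs to tables, greedily.

    Returns {household_id: table_number}. Stops at the first household
    that fits nowhere, as the Cypher loop stops when a pass updates nothing.
    """
    table_weight = {n: 0.0 for n in range(1, N_TABLES + 1)}
    # ORDER BY h.weight DESC, id(h): the stable sort keeps list order for ties
    pending = sorted(households, key=lambda h: -h[1])
    seating = {}
    for household_id, weight in pending:
        fits = [n for n in sorted(table_weight)
                if table_weight[n] + weight <= TABLE_CAPACITY]
        if not fits:
            break
        table = fits[0]                    # ORDER BY t.number LIMIT 1
        table_weight[table] += weight
        seating[household_id] = table
    return seating


# Hypothetical households with weights already computed.
households = [("A", 1.5), ("B", 1.0), ("C", 1.5), ("D", 1.0), ("E", 1.0)]
seating = allocate(households)
print(seating)
```

Here the two 2-member households share table 1 and the three singles share table 2, which matches two of the seating options allowed by the rules.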

There are, of course, several things we could do to make this query more robust: handling fewer guests, distributing guests more evenly, ensuring non-consecutive IDs for split households, etc. If you fancy an extension to the challenge, why not think about how you could make this flexible enough to handle more tables or places!

We hope you enjoyed the challenges! Hungry for more? We’ll post the next set of challenges shortly!
