Creation of Graph2Graph Dataset

Line-based 2D polygon dataset for MidcurveNN

Anushkaykulkarni
Technology Hits
3 min read · Jul 6, 2023

Code + Result Plot for ‘Plus’ by Author

This project extends the problem posed by Dr. Yogesh Kulkarni on generating midcurves, a form of dimension reduction for 2D shapes.

Problem Statement

The goal is to convert the existing point-based dataset to a line-based one, because the point-based shape representation has the following problems:

  • Lines are assumed to connect consecutive pairs of listed points in a sliding-window manner. For closed polygons, i.e., profiles, a closing line is also assumed between the last and first points. For an open polygon such as a midcurve, the closing line should not exist. This difference creates ambiguity.
  • For midcurves with intersecting lines, the list of points alone is insufficient and may create undesirable connections.

Thus, line-based shape representation is proposed.

A line-based shape format is nothing but:

Shape = list of lines
line = list of points
point = x and y coordinates for 2D shape

e.g.: [((x1, y1,..),(x2, y2,..)), ..]
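
As a minimal sketch, the format can be written down as Python type aliases. The names `Point`, `Line`, `Shape` and `unique_points`, and the T-shaped coordinates, are illustrative assumptions and are not taken from the dataset.

```python
from typing import List, Tuple

# Hypothetical type aliases mirroring the format described above.
Point = Tuple[float, float]   # (x, y) for a 2D shape
Line = Tuple[Point, Point]    # a line is a pair of end points
Shape = List[Line]            # a shape is a list of lines

# Illustrative T-shaped midcurve (made-up coordinates): a horizontal bar
# with a vertical stem hanging from its middle.
t_midcurve: Shape = [
    ((0.0, 20.0), (12.5, 20.0)),   # left half of the bar
    ((12.5, 20.0), (25.0, 20.0)),  # right half of the bar
    ((12.5, 20.0), (12.5, 0.0)),   # stem
]


def unique_points(shape: Shape) -> List[Point]:
    """Collect the distinct points of a shape in first-seen order."""
    seen: List[Point] = []
    for line in shape:
        for point in line:
            if point not in seen:
                seen.append(point)
    return seen
```

In this sketch the junction point is shared by three lines, which is the kind of connectivity a single ordered list of points cannot express.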

The input data in point-based format is shown below:

# I Profile points (Each line is space separated x and y coordinate)
5 5
10 5
10 20
5 20
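
Reading such a file could look roughly like the sketch below. The function name `parse_points` is hypothetical, and treating blank lines and lines starting with `#` as non-data is an assumption about the file layout.

```python
from typing import List


def parse_points(text: str) -> List[List[float]]:
    """Parse space-separated 'x y' rows into [[x, y], ...]."""
    points = []
    for raw in text.splitlines():
        row = raw.strip()
        # Assumption: blank lines and '#' lines carry no coordinates.
        if not row or row.startswith("#"):
            continue
        x, y = row.split()[:2]
        points.append([float(x), float(y)])
    return points


sample = """# I Profile points
5 5
10 5
10 20
5 20
"""
print(parse_points(sample))
# [[5.0, 5.0], [10.0, 5.0], [10.0, 20.0], [5.0, 20.0]]
```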

The required line-based output format is:

[
  [
    [
      5.0,
      5.0
    ], # Individual Point
    [
      10.0,
      5.0
    ]
  ], # Single Line
  [
    [
      10.0,
      5.0
    ],
    [
      10.0,
      20.0
    ]
  ],
  [
    [
      10.0,
      20.0
    ],
    [
      5.0,
      20.0
    ]
  ],
  [
    [
      5.0,
      20.0
    ],
    [
      5.0,
      5.0
    ]
  ]
] # List of all Profile lines
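
The conversion from the point list to this output can be sketched with a simple sliding window. This is only an illustration for the simple cases: the helper name `points_to_lines` and its `closed` flag are assumptions, and branching shapes such as ‘T’ and ‘Plus’ need extra handling that is not covered here.

```python
from typing import List

Point = List[float]
Line = List[Point]


def points_to_lines(points: List[Point], closed: bool) -> List[Line]:
    """Connect consecutive points; optionally add the closing line."""
    lines = [[points[i], points[i + 1]] for i in range(len(points) - 1)]
    # Closed profiles get one extra line from the last point to the first.
    if closed and len(points) > 2:
        lines.append([points[-1], points[0]])
    return lines


profile = [[5.0, 5.0], [10.0, 5.0], [10.0, 20.0], [5.0, 20.0]]
midcurve = [[7.5, 5.0], [7.5, 20.0]]

profile_lines = points_to_lines(profile, closed=True)     # 4 lines
midcurve_lines = points_to_lines(midcurve, closed=False)  # 1 line
```

Making the open or closed choice an explicit argument removes the ambiguity that the plain point list leaves implicit.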

Proposed approach

  • Read the .dat and .mid files containing the point data into a dictionary with the following format:
    shape_dict = {
        'ShapeName': "I",
        'Profile': [[5.0,5.0], [10.0,5.0], [10.0,20.0], [5.0,20.0]],
        'Midcurve': [[7.5,5.0], [7.5,20.0]]
    }
  • The line formation logic is based on a sliding-window approach, with special care taken for cases such as closing connections and intersecting lines, as in ‘T’ and ‘Plus’.
  • These lines are then added to the same dictionary, as shown:
    shape_dict = {
        'ShapeName': "I",
        'Profile': [[5.0,5.0],[10.0,5.0],[10.0,20.0],[5.0,20.0]],
        'Midcurve': [[7.5,5.0],[7.5,20.0]],
        'Profile_lines': [[[5.0,5.0],[10.0,5.0]],
                          [[10.0,5.0],[10.0,20.0]],
                          [[10.0,20.0],[5.0,20.0]],
                          [[5.0,20.0],[5.0,5.0]]],
        'Midcurve_lines': [[[7.5,5.0],[7.5,20.0]]]
    }
  • The shapes are then visualized, with the profile in black and the midcurve in red.
Program Generated Images by Author
  • To obtain a large number of variations, transformations such as scaling, rotation (about the origin), translation and mirroring (about the x and y axes) are applied to the point-based data before conversion to lines.
  • All shapes are then saved as separate JSON files.
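
The transformation and saving steps above can be sketched as follows. All function names are hypothetical, rounding the rotated coordinates is an assumption made to suppress floating-point noise, and the result is serialized to a string rather than to a particular file name.

```python
import json
import math
from typing import Callable, Dict, List

Point = List[float]


def scale(points: List[Point], factor: float) -> List[Point]:
    return [[x * factor, y * factor] for x, y in points]


def rotate(points: List[Point], degrees: float) -> List[Point]:
    """Rotate counter-clockwise about the origin."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    # Assumption: round to 6 decimals to hide floating-point noise.
    return [[round(x * c - y * s, 6), round(x * s + y * c, 6)]
            for x, y in points]


def translate(points: List[Point], dx: float, dy: float) -> List[Point]:
    return [[x + dx, y + dy] for x, y in points]


def mirror(points: List[Point], axis: str) -> List[Point]:
    """Mirror about the x axis (flip y) or the y axis (flip x)."""
    if axis == "x":
        return [[x, -y] for x, y in points]
    if axis == "y":
        return [[-x, y] for x, y in points]
    raise ValueError("axis must be 'x' or 'y'")


def transform_shape(shape: Dict, fn: Callable) -> Dict:
    """Apply the same point transformation to profile and midcurve."""
    return {
        "ShapeName": shape["ShapeName"],
        "Profile": fn(shape["Profile"]),
        "Midcurve": fn(shape["Midcurve"]),
    }


shape_dict = {
    "ShapeName": "I",
    "Profile": [[5.0, 5.0], [10.0, 5.0], [10.0, 20.0], [5.0, 20.0]],
    "Midcurve": [[7.5, 5.0], [7.5, 20.0]],
}

moved = transform_shape(shape_dict, lambda pts: translate(pts, 10.0, -5.0))
as_json = json.dumps(moved, indent=2)  # one JSON document per shape
```

Applying the same function to both the profile and the midcurve keeps the pair consistent, so the midcurve still belongs to the transformed profile.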

Open Sourcing the Dataset

The populated JSON files have been uploaded to Kaggle for wider use at https://www.kaggle.com/datasets/anushkaykulkarni/midcurvenn-linegraphs

Screenshot of Kaggle dataset by Author

The implementation of the above approach has been open sourced at the MidcurveNN GitHub repo.

What Next?

Graph2Graph transformation neural networks in the form of encoder-decoder architectures do not appear to be readily available, especially for cases where the input and output graphs have different topologies. One reason for their unavailability could be the lack of an appropriate dataset.

With the MidcurveNN LineGraph dataset described above now available, there is hope that such Graph2Graph transformations can be developed.
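
To make the "different topologies" point concrete, the line lists can be viewed as graphs. The sketch below is only one possible reading: the helper `lines_to_graph` and the node/edge layout are assumptions, since the dataset itself stores line lists.

```python
from typing import Dict, List, Tuple


def lines_to_graph(lines: List[List[List[float]]]):
    """Turn a list of lines into (nodes, edges), edges as index pairs."""
    index: Dict[Tuple[float, ...], int] = {}
    edges: List[Tuple[int, int]] = []
    for start, end in lines:
        pair = []
        for point in (start, end):
            key = tuple(point)
            if key not in index:
                index[key] = len(index)  # new node gets the next id
            pair.append(index[key])
        edges.append((pair[0], pair[1]))
    nodes = [list(key) for key in index]  # dicts keep insertion order
    return nodes, edges


profile_lines = [
    [[5.0, 5.0], [10.0, 5.0]],
    [[10.0, 5.0], [10.0, 20.0]],
    [[10.0, 20.0], [5.0, 20.0]],
    [[5.0, 20.0], [5.0, 5.0]],
]
midcurve_lines = [[[7.5, 5.0], [7.5, 20.0]]]

profile_graph = lines_to_graph(profile_lines)    # 4 nodes, 4 edges (a cycle)
midcurve_graph = lines_to_graph(midcurve_lines)  # 2 nodes, 1 edge (a path)
```

For the ‘I’ shape, the input graph is a four-node cycle while the output graph is a two-node path, so a model would have to map between graphs with different node and edge counts.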

The author is a first-year Electronics Engineering student at Cummins College of Engineering for Women. More details here.
