High Resolution 3D Models of Formal Dresses using LIDAR and Photogrammetry

Claire
Queenly Engineering

--

Goal

As a marketplace dedicated to formalwear and pageant dresses, Queenly features a wide range of glamorous floor-length gowns and sequined dresses. However, it’s difficult for standard product photos to fully capture all the embellishments and decorative jewels on many of these dresses.

Even with zoomed-in shots, buyers might still struggle to envision the dress as a whole, or wonder how it looks from angles that aren’t featured. And because many buyers are shopping for big events like pageants, quinceañeras, proms, and weddings, it’s especially important for them to have as much visual information as possible to make an informed decision.

The new 3D model feature offers an innovative way to view collections so buyers can see dresses in their full beauty. Alongside the photos available, users can also view a high resolution 3D scan of a dress and interact with it! They can rotate these 3D dresses to see different angles and zoom in to see minute details. With more visual information about overall dress fit, fabric material, and intricate detailing, buyers are more likely to take the plunge and purchase the dresses they like.

3D model reconstruction approach: Photogrammetry

Photogrammetry is a method of using overlapping photos to reconstruct a 3D model of an object. Essentially, a series of high resolution photos taken from different heights and angles is processed to generate a 3D map with elevation, shape, texture, and color information. From this data, a 3D model can be reconstructed.

To build the dress models through photogrammetry, we experimented with several different software tools.

First, we used Apple’s sample Photogrammetry Command-Line App, which takes a series of photos as input and outputs a 3D model reconstructed from them. To obtain the best results, we photographed each dress against a solid background in a well-lit room and rotated it so that consecutive shots had at least 70% overlap. These steps are key to making sure the program can recognize shared landmarks between photos and ultimately reconstruct the model.
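
Apple’s sample app is built on RealityKit’s PhotogrammetrySession API. As a rough sketch of that flow (the folder and file paths below are placeholders), a command-line reconstruction step could look something like this:

    import Foundation
    import RealityKit

    // Condensed sketch of a macOS command-line reconstruction step; paths are placeholders.
    @main
    struct DressReconstruction {
        static func main() async throws {
            let imagesFolder = URL(fileURLWithPath: "DressPhotos/", isDirectory: true)
            let outputURL = URL(fileURLWithPath: "dress.usdz")

            // The photos were captured by circling the dress, so hint that they are in order.
            var configuration = PhotogrammetrySession.Configuration()
            configuration.sampleOrdering = .sequential

            let session = try PhotogrammetrySession(input: imagesFolder, configuration: configuration)

            // Request a full-detail USDZ model reconstructed from the input photos.
            try session.process(requests: [.modelFile(url: outputURL, detail: .full)])

            // Report progress and stop once reconstruction finishes or fails.
            for try await output in session.outputs {
                switch output {
                case .requestProgress(_, let fractionComplete):
                    print("Progress: \(Int(fractionComplete * 100))%")
                case .requestError(_, let error):
                    print("Reconstruction failed: \(error)")
                    return
                case .processingComplete:
                    print("Saved model to \(outputURL.path)")
                    return
                default:
                    break
                }
            }
        }
    }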

We attempted this process several times with different dresses, but most of the final renders showed holes or extraneous parts.

3D models of formalwear dresses reconstructed using Apple’s Photogrammetry Command-Line App

At first, we thought the holes were a result of the semi-transparent fabric of the dress, but we then attempted the same process with opaque fabrics. Again, we noticed holes or missing chunks near the edges of the dress, and even unexpected meshes that would float or stick out of the dress.

We then turned to Polycam, a popular 3D scanning app that can also reconstruct models through photogrammetry.

3D model of beige ball gown reconstructed using Polycam photogrammetry process

The models generated by Polycam were noticeably more polished than those of Apple’s Command-Line App. The results were consistently high resolution and detailed. Whether scanning opaque, transparent, sequined, or patterned fabrics, Polycam yielded accurate 3D models, and seemed better equipped to handle the photogrammetry process in general.

3D models of orange straight gown (left) and blue prom dress (right) reconstructed using Polycam photogrammetry process

One possible factor behind the higher quality renders generated by Polycam is its streamlined photo capture process. When photographing an object, the app’s Auto feature requires the user to tap the capture button only once and then takes photos automatically based on the user’s movement. This has the potential to produce more consistent photos with better overlap than taking photos manually.

3D model reconstruction approach: LIDAR

Lidar, which stands for “light detection and ranging”, is another approach to creating 3D models. It works by sending laser pulses at objects in a space and sensing the reflected pulses to measure the distances between points. So, instead of using photos as photogrammetry does, Lidar uses a cloud of points with direct distance measurements to features to recreate a 3D scene.

In addition to photogrammetry, we also tested Lidar as an approach to scanning the dresses. However, the models it generated were blurry and low resolution; much of the embroidery and detailing work on the dresses was lost in the Lidar scans.

Comparison of blue ball gown in real world space (left) and 3D model Lidar scan (right)

Winner: photogrammetry advantages for formalwear

After experimenting with both approaches, we found that photogrammetry was the clear winner for generating 3D dress models.

Comparison of Lidar and Photogrammetry scans of beige ball gown: photo of dress in real world space (left), Polycam lidar scanning (middle), Polycam photogrammetry scanning (right)

Because Lidar uses laser beams to reconstruct environments, it gains information about the distances between objects, as well as the shapes and sizes of features. However, since smaller objects don’t reflect as much light back to the sensor, Lidar works better for capturing large spaces, such as buildings and rooms.

Photogrammetry also outperforms Lidar for scanning smaller objects because of how the two differ in their reconstruction processes. Photogrammetry pieces together sequential, overlapping photos, looking for landmarks and ultimately using the visual details of the photos to generate a 3D model.

Lidar uses a more abstract process, measuring distances to an environment’s features and generating a 3D point cloud, which it then uses to reconstruct a model. However, the accuracy of these point clouds is limited by the precision of Lidar sensors, which means the approach yields its best results on large or heavily articulated objects.

Lidar 3D point cloud

Smaller surface details and fine lines are hard for Lidar to pick up, whereas photogrammetry is much more capable since it pulls from the visual data in the photos. Indeed, when comparing the Lidar- and photogrammetry-generated 3D models, we can see that the photogrammetry models are far more detailed in both texture and color.

Beige ball gown 3D model using Polycam’s Lidar
Beige ball gown 3D model using Polycam’s Photogrammetry

The distortion near the top of the Lidar models is most likely a consequence of the nature of the scanning process: Lidar scans are meant to be taken in a single orbit of the object, but a single orbit isn’t sufficient to fully capture formalwear dresses. Because these dresses are on the longer side, it’s difficult to fit the entire dress in every frame of the video capture. In fact, Lidar scans are usually performed with drones, which can cover larger areas, rather than phone cameras.

Beige ball gown Polycam Lidar demo
Beige ball gown Polycam Photogrammetry demo

For Queenly, it thus makes more sense to use photogrammetry, since this approach is better suited to capturing objects in high resolution and vivid detail. The only drawback to photogrammetry is the time needed to take and process the photos to reconstruct the 3D model. Often, we would take about 200–300 photos per dress, which required about an hour to process into a model. With Lidar, on the other hand, we captured a single video and waited only about a minute for processing. While Lidar scans were far more time efficient, they were undeniably less detailed than the photogrammetry scans.

Since many of the dresses featured on Queenly are sequined and covered in jewels and embellishments, displaying high resolution 3D models is important, especially from a UX perspective. Allowing the user to zoom in on details and maneuver the model to see different angles creates a more detailed, fuller image of a dress.

In-app implementation

To render a 3D model in iOS, we can take advantage of Apple’s 3D graphics framework, SceneKit.

After generating a model in Polycam, we can export it as a USDZ file (the recommended format for SceneKit), which is appropriately sized for storage in our cloud storage CDN. In order to render the model in our app, we need to create a scene for it to inhabit. Using the SCNScene class, we can create a displayable 3D scene; an SCNScene is essentially a hierarchy of nodes whose attributes represent 3D visuals. Depending on how we obtained the model in the previous step, we can load a scene from the file either by its name or by its URL, and the SceneKit framework provides convenient initializers for either option.
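
As a rough sketch (the file name and URL here are placeholders), loading the exported model could look like this:

    import Foundation
    import SceneKit

    // Load a USDZ model bundled with the app by name...
    let bundledScene = SCNScene(named: "dress.usdz")

    // ...or load one that was downloaded from the CDN to a local file URL.
    let localURL = URL(fileURLWithPath: "dress.usdz")
    let dressScene = try SCNScene(url: localURL, options: nil)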

We can now create an SCNView object and set its scene property to the scene we’ve just created; an SCNView can display any 3D SceneKit content.

Once we have created our scene, we need to make it visible by adding a light source. Since SCNScenes are composed of nodes, we can do this by creating an SCNNode and setting its light property. To do so, we can create an SCNLight and customize its type and color properties. We can add our newly created node as a child node of our scene’s root node.

Lastly, in order to allow the user to control the camera of the scene, we can configure camera control properties of the SCNView object.
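
Putting these steps together, a minimal sketch (with illustrative names, not our production code) might look like this:

    import SceneKit
    import UIKit

    // Build an SCNView that displays a dress scene with a light and a user-controllable camera.
    func makeDressView(for dressScene: SCNScene) -> SCNView {
        let sceneView = SCNView()
        sceneView.scene = dressScene
        sceneView.backgroundColor = .white

        // The light lives on its own node, added under the scene's root node.
        let light = SCNLight()
        light.type = .omni
        light.color = UIColor.white

        let lightNode = SCNNode()
        lightNode.light = light
        lightNode.position = SCNVector3(x: 0, y: 5, z: 5)
        dressScene.rootNode.addChildNode(lightNode)

        // Built-in camera controls let the user rotate and zoom the model with gestures.
        sceneView.allowsCameraControl = true
        return sceneView
    }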

For a walkthrough of the code needed, check out this tutorial on loading 3D models into an iOS App.

Future Applications

Displaying 3D models in our app offers buyers a high resolution, interactive viewing option for the dresses they’re interested in, which has great promise to increase user engagement and satisfaction.

In order to make this feature available to other sellers, we’re considering implementing an additional step during the listing process: users who wish to display 3D models of their dresses would simply upload a collection of photos that adhere to a few basic photogrammetry requirements. Because of the time needed to reconstruct models, we would then generate their dress model after a one-day processing step.

We can also think about improving the resolution and accuracy of our 3D dress models. NVIDIA’s Instant NeRF rendering model offers a promising approach to faster, higher quality renders. Similar to our current process, it constructs a 3D model from a collection of static photos. It achieves high resolution models by utilizing neural networks that can be queried to describe object properties.

The efficiency of this approach can be attributed to its use of smaller neural networks, which are less costly to query. The training of these networks is augmented by a multi-resolution hash table of feature vectors, which numerically represent visual properties of the object: a cascade of grids, from coarse to fine resolutions, is mapped to corresponding arrays of these feature vectors. This multi-resolution structure allows the training process to concentrate on areas with more important details, leading to fast, high resolution renderings.
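
For a sense of how that hash table is indexed: in the Instant NeRF (Instant-NGP) paper, each integer vertex of a grid level is mapped to a slot by XOR-ing its coordinates multiplied by large primes. A small illustrative sketch of that hash (not NVIDIA’s actual code):

    // Spatial hash from the Instant-NGP paper; &* is Swift's wrapping multiplication.
    func hashTableIndex(x: UInt32, y: UInt32, z: UInt32, tableSize: UInt32) -> UInt32 {
        let (p1, p2, p3): (UInt32, UInt32, UInt32) = (1, 2_654_435_761, 805_459_861)
        return ((x &* p1) ^ (y &* p2) ^ (z &* p3)) % tableSize
    }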

Looking forward and expanding on our 3D feature, the next step in making formalwear shopping even more interactive is to use our models in augmented reality try-ons. The goal for this would be to render the dress model on the user’s body, scaling it to their height and size. This way, buyers can try dresses on virtually and get a better sense of dress fit and overall look, hopefully leading to greater customer satisfaction and fewer returns.

Acknowledgments

Thank you for reading this! Big thank you to Mica M, who has been a fantastic mentor for my summer internship and this project!
