Exploring the Future of XR Content with Generative AI: Point-E and Beyond

Raju K · Published in XRPractices · Jan 1, 2023 · 5 min read

Generative AI is revolutionising the way we create content for XR. In this article, we will explore how these advancements in technology can be applied to the challenges of creating XR content.

Traditionally, creating high-quality XR content has required specialized skills and expensive modeling software. While tools like Blender can assist in this process, they often have a steep learning curve and require a strong understanding of 3D geometry, which can be daunting for many. With the emergence of generative AI, however, we have the opportunity to streamline and simplify the process of creating immersive content for XR.

OpenAI has recently released Point-E, a generative AI model that can create 3D point clouds from reference images or text prompts. The code for Point-E has been made available to the public, allowing anyone to try out this innovative tool. This is a major development in the world of generative AI, as it allows for the creation of complex 3D shapes and structures from simple prompts. Point-E has the potential to revolutionize the way we create 3D content, making it faster and easier to generate immersive experiences for XR.

The workflow for creating 3D content using generative AI can be summarized as follows:

  1. Use a generative art AI to generate an initial image as inspiration.
  2. Feed this image into Point-E to create a 3D point cloud.
  3. Mesh the point cloud and export it for use in XR.

To generate the initial image for our 3D content, I used a generative art AI called DiffusionBee. With the following prompt, I was able to generate an image that more or less met my specifications.

mini cooper car toy, isometric no shadows grey background, low poly,
Cartoon, Unreal Engine, 3D Model, PBR, high quality render, 3D Render

Image generated by DiffusionBee
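
DiffusionBee is a desktop GUI for Stable Diffusion, so this step involves no code. For readers who prefer a scripted pipeline, roughly the same image can be produced with the Hugging Face diffusers library. The sketch below is an alternative I did not use for this article; the checkpoint name and output path are assumptions.

import torch
from diffusers import StableDiffusionPipeline

# Same prompt that was given to DiffusionBee
prompt = ("mini cooper car toy, isometric no shadows grey background, low poly, "
          "Cartoon, Unreal Engine, 3D Model, PBR, high quality render, 3D Render")

# Assumed checkpoint; any Stable Diffusion checkpoint should work
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

image = pipe(prompt).images[0]
image.save("car.png")  # this file is later copied into Point-E's example_data folder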

To use Point-E to create a 3D point cloud from our initial image, we will first need to install the necessary dependencies and launch a Jupyter notebook. To do this, we will follow these steps:

#Clone Point-E repo
git clone https://github.com/openai/point-e.git
cd point-e
#install dependencies
pip install -e .
#install jupyter notebook if not already installed
pip install --upgrade notebook
python3 -m notebook

Next, we will copy our saved image to the “example_data” folder in the Point-E repository. Then, we will open the “image2pointcloud.ipynb” notebook in Jupyter and change the following lines to select the model and point to the file name of our image.

Point-E offers three base models for generating 3D point clouds: a fast 40M-parameter model (base40M), a moderate 300M-parameter model (base300M), and a slower but more accurate 1B-parameter model (base1B). While the 40M model may be sufficient for some applications, the 1B model is recommended for the best results, at the cost of more compute and longer run times. It is important to weigh this trade-off between accuracy and speed when choosing which model to use for your specific needs.

base_name = 'base300M' # options: 'base40M' (fast), 'base300M', or 'base1B' (most accurate)
...
img = Image.open('example_data/car.png') # file name of our DiffusionBee-generated image

At the end of the notebook, add this line to persist the converted point cloud:

pc.save('example_data/car.npz')

Then run all the notebook cells to execute the model.
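
For reference, here is a condensed sketch of what the notebook does end to end once the edits above are in place. I have reconstructed it from the Point-E repository, so the exact cell contents may differ slightly between versions.

import torch
from PIL import Image
from tqdm.auto import tqdm
from point_e.diffusion.configs import DIFFUSION_CONFIGS, diffusion_from_config
from point_e.diffusion.sampler import PointCloudSampler
from point_e.models.configs import MODEL_CONFIGS, model_from_config
from point_e.models.download import load_checkpoint

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Base image-to-point-cloud model: 'base40M', 'base300M' or 'base1B'
base_name = 'base300M'
base_model = model_from_config(MODEL_CONFIGS[base_name], device)
base_model.eval()
base_model.load_state_dict(load_checkpoint(base_name, device))
base_diffusion = diffusion_from_config(DIFFUSION_CONFIGS[base_name])

# Upsampler model that densifies the coarse point cloud
upsampler_model = model_from_config(MODEL_CONFIGS['upsample'], device)
upsampler_model.eval()
upsampler_model.load_state_dict(load_checkpoint('upsample', device))
upsampler_diffusion = diffusion_from_config(DIFFUSION_CONFIGS['upsample'])

sampler = PointCloudSampler(
    device=device,
    models=[base_model, upsampler_model],
    diffusions=[base_diffusion, upsampler_diffusion],
    num_points=[1024, 4096 - 1024],
    aux_channels=['R', 'G', 'B'],
    guidance_scale=[3.0, 3.0],
)

# Our DiffusionBee-generated reference image
img = Image.open('example_data/car.png')

# Sample a point cloud conditioned on the image
samples = None
for x in tqdm(sampler.sample_batch_progressive(batch_size=1, model_kwargs=dict(images=[img]))):
    samples = x

pc = sampler.output_to_point_clouds(samples)[0]
pc.save('example_data/car.npz')  # the extra line that persists the point cloud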

If you encounter import errors when running the model, you may need to uninstall and reinstall certain libraries. In my case, I had to do:

python3 -m pip uninstall Pillow
python3 -m pip install Pillow
python3 -m pip uninstall numpy
python3 -m pip install numpy
python3 -m pip uninstall matplotlib
python3 -m pip install matplotlib
python3 -m pip uninstall kiwisolver
python3 -m pip install kiwisolver
python3 -m pip uninstall scipy
python3 -m pip install scipy

After generating the 3D point cloud using Point-E, the next step is to convert it into a mesh. To do this, we will use the “pointcloud2mesh.ipynb” notebook. First, we need to open the notebook in Jupyter and modify the following line to point to the “car.npz” file generated by Point-E:

pc = PointCloud.load('example_data/car.npz')
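
Apart from that one change, the notebook simply loads Point-E's SDF model and runs marching cubes over the point cloud. Roughly, again reconstructed from the repository and subject to version differences, it does the following (the output path below is my own choice):

import torch
from point_e.models.configs import MODEL_CONFIGS, model_from_config
from point_e.models.download import load_checkpoint
from point_e.util.pc_to_mesh import marching_cubes_mesh
from point_e.util.point_cloud import PointCloud

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# SDF model that estimates a signed distance field around the point cloud
sdf_model = model_from_config(MODEL_CONFIGS['sdf'], device)
sdf_model.eval()
sdf_model.load_state_dict(load_checkpoint('sdf', device))

# The point cloud saved by the previous notebook
pc = PointCloud.load('example_data/car.npz')

# Marching cubes over the SDF; a larger grid_size gives a finer (and slower) mesh
mesh = marching_cubes_mesh(pc=pc, model=sdf_model, batch_size=4096, grid_size=32, progress=True)

# Export the mesh as PLY for Blender or MeshLab
with open('example_data/car.ply', 'wb') as f:
    mesh.write_ply(f)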

Executing the notebook will take some time, as converting the point cloud into a mesh can be computationally intensive. Once the process is complete, you should have a .ply file that represents the mesh of your 3D content. This mesh can then be imported into 3D modeling software like Blender or MeshLab for further editing.

The output from the Point-E meshing model was well below my expectations.

While the initial results from the Point-E meshing model did not meet our expectations, it is important to remember that this technology is still in its early stages and has the potential to improve over time. By using MeshLab's Poisson surface reconstruction to fill in the holes and gaps in the mesh, we were able to slightly improve the quality of the output. However, the final result is still not at a level that is directly usable.
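
The hole-filling above was done interactively in MeshLab, but the same Poisson step can be scripted. The sketch below uses Open3D as an alternative; it is not the exact filter or parameters I used in MeshLab, and the depth value and output path are assumptions.

import numpy as np
import open3d as o3d
from point_e.util.point_cloud import PointCloud

# Load the Point-E point cloud and hand its coordinates to Open3D
pc = PointCloud.load('example_data/car.npz')
pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(np.asarray(pc.coords, dtype=np.float64))

# Poisson reconstruction needs per-point normals
pcd.estimate_normals()

# Screened Poisson surface reconstruction; higher depth gives more detail
mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(pcd, depth=8)
o3d.io.write_triangle_mesh('example_data/car_poisson.ply', mesh)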

Despite this, we have high hopes for the future of Point-E and other generative AI technologies. Just as we have seen with ChatGPT, these models are constantly evolving and improving. It is worth keeping an eye on the progress of Point-E, as it has the potential to revolutionise the way we create 3D content for XR experiences.

Extending this experiment, I also tested Google's "DreamFusion", using a similar text prompt to generate a low-poly Mini Cooper toy car in a cartoon style. The resulting output was more usable than what we had previously obtained with Point-E. DreamFusion builds its 3D models with Neural Radiance Fields (NeRF), which may have contributed to its better performance in this case.

DreamFusion Training parameters
Google DreamFusion 3D model Output

