CAD Archaeology, Part 0

I recently found myself wanting to hack on some SolidWorks CAD files programmatically, preferably with Clojure. However, I can’t find any usable open-source software that is able to read these files.

This was a good start at the overall file format, which is a crazy filesystem inside a file. Apache POI can at least read files at this level and extract the bits inside. In theory, the following should return you a big byte array containing the object data from your file:

(import '[java.io File]
        '[org.apache.poi.poifs.filesystem DocumentInputStream NPOIFSFileSystem])

(defn extract-display-lists
  "Returns the raw bytes of Contents/DisplayLists__ZLB from the given file."
  [file]
  ;; Closing the filesystem also releases the underlying file handle.
  (with-open [fs (NPOIFSFileSystem. (File. file))]
    (let [contents      (.getEntry (.getRoot fs) "Contents")
          display-lists (.getEntry contents "DisplayLists__ZLB")
          in            (DocumentInputStream. display-lists)
          buf           (byte-array (.getSize display-lists))]
      (.readFully in buf)
      buf)))

The trouble then is that the Contents/DisplayLists__ZLB part is some hopeless binary format, and I can’t seem to get anywhere in figuring it out. Chances are good that it’s C structs written directly to disk, so awesome. The only things I can see immediately, given two sample SLDPRT files (one blank, and one containing a single line), are that (1) the first 16 bytes are identical in both, and (2) the first two 4-byte values after this prologue look like little-endian integers. I could dig deeper there, I guess.
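For anyone who wants to repeat that comparison, here is a rough sketch of the two checks using only java.nio. The function names are made up for illustration, and the idea that the values are signed 32-bit integers is only my guess from staring at the bytes, not something the format confirms.

```clojure
(import '[java.nio ByteBuffer ByteOrder])

(defn common-prefix-length
  "Number of leading bytes that are identical in byte arrays a and b."
  [^bytes a ^bytes b]
  (let [n (min (alength a) (alength b))]
    (loop [i 0]
      (if (and (< i n) (= (aget a i) (aget b i)))
        (recur (inc i))
        i))))

(defn le-ints-at
  "Reads n little-endian signed 32-bit integers from buf, starting at offset.
  Treating the data as 32-bit integers is an assumption, not a known fact."
  [^bytes buf offset n]
  (let [bb (doto (ByteBuffer/wrap buf (int offset) (int (* 4 n)))
             (.order ByteOrder/LITTLE_ENDIAN))]
    (vec (repeatedly n #(.getInt bb)))))
```

With the byte arrays from two sample files in hand, (common-prefix-length blank single-line) shows how long the shared prologue really is, and (le-ints-at buf 16 2) pulls out the two candidate integers that follow it.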

I also tried the free eDrawings viewer from SolidWorks, which can open these CAD files and can save them as “eprt” files. These are slightly simpler, in that they are plain old zip files, with a magic header at the beginning (both unzip and java.util.zip.ZipFile can read them without trouble). These files contain three separate files:

  • “eModel”, apparently a “HOOPS Stream File”. This appears to be the file that contains the 3D model itself.
  • “preview.jpg”, a rendering of the object in the file, as, you guessed it, a JPEG.
  • “scene.xml”, which seems to be some basic information about the scene in some simple XML, but which unfortunately doesn’t contain any information about the objects.
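Since java.util.zip.ZipFile copes with these files, listing what is inside one takes only a few lines. This is a minimal sketch; list-eprt-entries is a name I made up, and it assumes nothing about the entries beyond what the zip directory reports.

```clojure
(import '[java.util.zip ZipFile ZipEntry])

(defn list-eprt-entries
  "Returns a vector of {:name ... :size ...} maps, one per entry in the
  zip archive at path. Sizes are the uncompressed sizes in bytes."
  [path]
  (with-open [zf (ZipFile. (str path))]
    ;; mapv is eager, so every entry is read before the archive is closed.
    (mapv (fn [^ZipEntry e] {:name (.getName e) :size (.getSize e)})
          (enumeration-seq (.entries zf)))))
```

Calling it on a saved “eprt” file should list the three entries described above.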

Information about the “HOOPS Stream File” is pretty scant on the Internet. There used to be an “OpenHSF” group, but I can’t find anything concrete on the file format itself through archive.org, and any source code that they provided seems to be long gone.

There isn’t much here, I realize, but it’s hopefully just the start of this exploration. The obvious issue is that many “open” hardware projects rely pretty heavily on proprietary software, so designs that are otherwise freely available end up locked away in formats that only that software can read.

Lastly, I want to complain about how bad Tumblr and Medium are for writing, especially when it comes to code and formatting. Isn’t the point of these sites to write stuff? Or oh, I get it, the point is to herd in reams of DAUs, and improve the company’s bottom line.
