Untangling the GeoDatabase
Data was one of the biggest issues needing to be addressed at PCAO. The situation was a data disaster, and the problems were not restricted to the data itself. Anything to do with the structure, definition, maintenance, or application of best practices associated with maintaining a geo-database had been ignored, and all of it was aggravated by the revolving door at the County's GIS Manager role.
About 5 years prior, ESRI representatives had come in, established the system for enterprise use, and turned it over to the resident GIS Analyst of the time. Several people had passed through that position in the years that followed: some less qualified than others; some attempting to maintain data and data structure integrity; others, not so much. The GIS Manager position had been empty for about 6 months prior to my arrival. The GIS Analyst during this time was a solid Cartographer but admitted to having a very limited understanding of databases. She depended heavily upon the GIS Coordinator in the Planning & Zoning Department (P&Z).
The geo-database was a rather simple arrangement, leveraging ArcGIS SDE v9.1 on an IBM DB2 v8 database and providing unrestricted access to anyone who knew how to connect. It contained all of the features used by the Office of the Assessor and some random features from P&Z. Some of the features had been copied and pasted from one feature to another, thereby duplicating the Shape_Length and Shape_Area fields multiple times. Many other fields had also been duplicated through various attempts to manipulate the data fields. All of the fields in all of the tables had been converted to, or were originally created as, 256-character varChar (text) fields. It was a little hard to believe that ESRI would set it up that way, but I later found that was exactly the case. The geo-database also contained numerous incomplete features, along with various types of analysis features from unknown and unidentified projects. Nothing was documented. Metadata was missing from everything. Field names were nonsensical. There was no rhyme or reason to the organization or naming convention of many of the features or data fields.
Straightening out the situation was rather simple and only took about a month. I exported everything to a File Geo-database (FGDB) for analysis and permanent archiving, then removed anything from SDE for which there was no accounting. P&Z was the only other department storing data in SDE, and the Assessor applications were straightforward in their use of the data.
Once all of the features were exported to the FGDB, I carefully analyzed the duplicated fields to determine which one contained the most accurate information. In most cases, the newly duplicated field had been used for data storage from the point at which it had been duplicated; in other cases, the field and its data were fully duplicated. After completing the analysis and removing all duplicated fields, I restored the newly scrubbed features that were to be retained.
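A comparison of this kind could be sketched in Python roughly as follows. This is only a minimal illustration, not the tooling actually used at the time. It assumes the attribute rows have already been read into plain dictionaries of text values (all fields were text), and the field names OWNER and OWNER_1, like the function names, are hypothetical.

```python
from collections import Counter


def _norm(value):
    """Treat NULL as empty and ignore surrounding whitespace."""
    return "" if value is None else str(value).strip()


def compare_duplicate_fields(rows, original, duplicate):
    """Count, row by row, how a suspected duplicate field relates to its original."""
    counts = Counter()
    for row in rows:
        a = _norm(row.get(original))
        b = _norm(row.get(duplicate))
        if a == b:
            counts["same"] += 1            # identical (or both empty)
        elif a and not b:
            counts["only_original"] += 1   # data lives only in the original
        elif b and not a:
            counts["only_duplicate"] += 1  # data lives only in the duplicate
        else:
            counts["conflict"] += 1        # both populated, values differ
    return counts


def verdict(counts):
    """Suggest which field to keep; anything ambiguous goes to manual review."""
    if counts["conflict"] or (counts["only_original"] and counts["only_duplicate"]):
        return "needs manual review"
    if counts["only_duplicate"]:
        return "keep duplicate"
    if counts["only_original"]:
        return "keep original"
    return "full duplicate"


# Hypothetical rows: the duplicate field took over data entry after it was created.
rows = [
    {"OWNER": "SMITH", "OWNER_1": "SMITH"},
    {"OWNER": "", "OWNER_1": "JONES"},
    {"OWNER": None, "OWNER_1": "LEE "},
]
print(verdict(compare_duplicate_fields(rows, "OWNER", "OWNER_1")))
```

The two outcomes described above map onto "keep duplicate" (the copy became the field people actually used) and "full duplicate" (either field can be dropped). Any pair with conflicting values is deliberately left for a person to decide.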
The data that was to remain on SDE would be strictly for application use and data maintenance, and nothing would be stored there without some sort of documentation about what it was and what it would be used for. This initiated a new and sorely needed policy for SDE.