Only the correct words would need to be stored, not the spelling mistakes themselves. Spelling correction is usually implemented by defining a dictionary of valid words (say the dictionary in /usr/share/dict/words on Unix-y systems). For any word in the input that does not match the dictionary, you can compute all edits within a Levenshtein distance of 1 or 2 and see which of those match the dictionary. See Peter Norvig’s famous article for the simplest approach to this: http://norvig.com/spell-correct.html (it’s naive and uses too much memory, but solves the problem in a few lines of Python). Using libpostal’s training data for the countries you’re interested in could provide a corpus that’s specific to geo queries, along with statistics on which words/phrases are more likely. For most user-facing systems I’d recommend using something like Lucene’s spell checker (in search engines like ElasticSearch/Solr) with a “Did You Mean”-style interface so users can see and correct their mistakes during the autocomplete phase, then sending the query to libpostal for parsing only once the user is done typing. Trying to do automatic spelling correction after the fact and correctly understanding what the user meant 100% of the time is not a solved problem, and its effectiveness varies by language.
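To make the dictionary lookup concrete, here is a minimal sketch in the spirit of Norvig’s article. It is not his exact code: it generates only deletions, substitutions and insertions (plain Levenshtein edits, no transpositions), and it does no frequency ranking. TOY_DICTIONARY, edits1 and candidates are illustrative names, and the tiny word set stands in for a real dictionary such as /usr/share/dict/words or a vocabulary built from geo-specific training data.

```python
import string

ALPHABET = string.ascii_lowercase  # assumption: lowercase ASCII words only


def edits1(word):
    """Return all strings within Levenshtein distance 1 of `word`."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    replaces = [left + c + right[1:]
                for left, right in splits if right
                for c in ALPHABET]
    inserts = [left + c + right for left, right in splits for c in ALPHABET]
    return set(deletes + replaces + inserts)


def candidates(word, dictionary):
    """Return dictionary words at distance 0, else 1, else 2.

    Falls back to the (lowercased) input when nothing matches.
    """
    word = word.lower()
    if word in dictionary:
        return {word}
    near = edits1(word) & dictionary
    if near:
        return near
    far = {e2 for e1 in edits1(word) for e2 in edits1(e1)} & dictionary
    return far or {word}


# Hypothetical stand-in for a real dictionary file.
TOY_DICTIONARY = {"street", "avenue", "road", "square"}
```

In a real system the candidate set would still need ranking, for example by word frequency in a geo-specific corpus, and the distance-2 step gets expensive for long words, which is the memory and speed problem mentioned above.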
I initially thought Bad Wollishofen was a city name rather than a POI. If it’s indeed supposed to be a POI, libpostal can parse all of those forms correctly (I tried {Bad, Seebad, Strandbad, Badi} Wollishofen with libpostal 1.0 and it labels all of them as “house”, our tag for venue name/POI). If a geocoder can’t find a particular POI, the problem is with either the tuning of the geocoder’s underlying search index or its rules for when to return results. The only name for that venue in OSM is “Bad Wollishofen”. The issue can be resolved in one of three ways: by adding the alternate names to OSM (in the alt_name, alt_name_1, etc. tags), by adding synonyms to the geocoder so that “Bad”, “Badi”, etc. are all treated as the same word, or by stripping all of the above prefixes from POI names in your application. Here “name” means the relevant field(s) in your input data at ingestion time, and the “house” component identified by libpostal at query time.
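The prefix-stripping option could look roughly like the sketch below. BATH_PREFIXES and normalize_poi_name are hypothetical names, and the function does not call libpostal itself: it assumes the string passed in is already a POI name, i.e. the name field from your data at ingestion time or the component libpostal labeled “house” at query time.

```python
# Assumption: the prefixes discussed above, lowercased for comparison.
BATH_PREFIXES = {"bad", "seebad", "strandbad", "badi"}


def normalize_poi_name(name):
    """Lowercase a POI name and drop one leading bath-style prefix.

    A name consisting only of the prefix is kept, so it never
    normalizes to an empty string.
    """
    tokens = name.lower().split()
    if len(tokens) > 1 and tokens[0] in BATH_PREFIXES:
        tokens = tokens[1:]
    return " ".join(tokens)
```

Applying the same function on both the indexing side and the query side means “Bad Wollishofen”, “Badi Wollishofen”, “Seebad Wollishofen” and “Strandbad Wollishofen” all reduce to the same key. It should only be applied to POI names, not to city components, where a leading “Bad” can be part of the place name.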
