Take a look at the Project Haystack website (http://project-haystack.org/) for technical resources and community discussions on these topics. An open-source tool called Project Builder Plus may also be helpful. You can find information about it here: http://project-haystack.org/forum/topic/467
As for regex code, there is no single solution, because source data can come in so many different formats. My company, SkyFoundry, offers training materials and sample regex code for importing external data and automatically tagging it based on interpretation of point names. (You can contact us at firstname.lastname@example.org.) The logic can be easily adapted to other software applications. How complete the auto-tagging is always depends on how much descriptive information can be interpreted from the available names and descriptions. In many cases, little can be interpreted from those names without manual effort.
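To give a rough idea of what name-based tagging can look like, here is a minimal Python sketch. It is not SkyFoundry's sample code, and the abbreviations it looks for (DAT, SAT, RAT, OAT, SP, STPT, CMD) are only an assumed naming convention that you would replace with whatever your own source data uses. The helper names `token` and `auto_tag` are made up for this illustration. The tags it emits are ordinary Haystack marker tags.

```python
import re

def token(alternatives):
    # Match an abbreviation only when it is not embedded in a longer word,
    # so "DAT" matches in "AHU1_DAT" but not in "DATA".
    return re.compile(r"(?<![A-Za-z])(?:%s)(?![A-Za-z])" % alternatives, re.I)

# Assumed naming convention: abbreviation -> what the point measures.
MEASUREMENT_RULES = [
    (token("DAT|SAT"), ["discharge", "air", "temp"]),
    (token("RAT"), ["return", "air", "temp"]),
    (token("OAT"), ["outside", "air", "temp"]),
]

# Assumed naming convention: abbreviation -> kind of point.
KIND_RULES = [
    (token("SP|STPT"), "sp"),
    (token("CMD"), "cmd"),
]

def auto_tag(name):
    """Return a list of marker tags inferred from a point name."""
    tags = []
    for pattern, rule_tags in MEASUREMENT_RULES:
        if pattern.search(name):
            tags.extend(t for t in rule_tags if t not in tags)
    if not tags:
        # Nothing recognizable in the name: leave it for manual tagging.
        return tags
    for pattern, kind in KIND_RULES:
        if pattern.search(name):
            tags.append(kind)
            return tags
    # No setpoint or command marker found, so assume a sensor.
    tags.append("sensor")
    return tags

for point_name in ["AHU1_DAT", "AHU1_DAT_SP", "Point_17"]:
    print(point_name, auto_tag(point_name))
```

A name like "Point_17" comes back with no tags at all, which is the situation described above: when the name carries no descriptive information, no pattern can recover it and the point has to be tagged by hand.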