
SALVIAS TaxonScrubber
TaxonScrubber Updates - Sept. 21, 2004
- TaxonScrubber Version 1.2 now available for download
- Peru synonymized checklist now available for download
- See below for more information
What is TaxonScrubber?
SALVIAS TaxonScrubber is a stand-alone application for automated standardization of taxonomic
names. In addition to removing spelling errors in species names, TaxonScrubber splits concatenated
information into separate fields, and can be used to restructure flat-file specimen data prior to importing
to a relational database. Although designed primarily for standardizing inventory
data for the SALVIAS plots database, TaxonScrubber can be used whenever large numbers of taxonomic
records need to be error-checked and reformatted.
How TaxonScrubber works
TaxonScrubber performs four basic actions:
- Splitting of concatenated fields. Epithets and authorities contained in single fields are
split into separate fields. For example, the input string "Quercus alba L." is split into three fields,
Genus = "Quercus", Species_epithet = "alba", Sp_auth = "L.". TaxonScrubber can split up to two
subspecific levels off a single name (e.g., Quercus alba var. gunnisonii Torr. fo. Rugosa).
- Recognition and removal of standard annotations. TaxonScrubber contains an extensive library of Latin
and English botanical annotations, their spelling variants, and abbreviations. Annotations such as "cf.", "aff.",
"vel. sp. aff.", etc., are removed and stored in a separate field. Informal annotations of uncertainty, such as
question marks, are treated as "cf." Any text not recognized as a standard annotation
is stored in an additional annotation field, and flagged for inspection by the user.
- Standardization of spelling. Once fields have been split, and extraneous text removed, TaxonScrubber matches
names to a standard list of validly published names (currently, TaxonScrubber uses a world list of
plant names; however, later releases of TaxonScrubber will have the option of loading name lists for other taxa).
After flagging all names which match to the standard list, TaxonScrubber's "Hand scrub" utility provides pull-down
menus for correcting remaining names to the standard world list. Names still unmatched at the end of the process can then be
flagged as morphospecies names (e.g., Miconia sp.3), or as indets (e.g., Miconia sp.).
- Standardization of higher taxonomy. TaxonScrubber standardizes all family names to match taxonomic concepts
and spellings of the Missouri Botanical Garden's TROPICOS database. Future versions will allow the user to update
higher taxonomy according to alternative taxonomic concepts (for example, APG familial concepts; see The Angiosperm Phylogeny Website).
During the scrubbing process, TaxonScrubber generates new fields containing the results of the splitting
and cleaning process, and various "flag fields" indicating the status of each name component (family, genus, specific epithet, etc.). These fields may be retained or deleted as needed upon export of the cleaned file.
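
The splitting and annotation steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not TaxonScrubber's own code: the function name split_name, the very small ANNOTATIONS and RANKS lists, and the Infraspecific and Annotation keys are hypothetical. The Genus, Species_epithet, and Sp_auth field names are taken from the example above.

```python
import re

# Hypothetical miniature lists; the real annotation library is far larger.
ANNOTATIONS = ["vel. sp. aff.", "cf.", "aff."]
RANKS = {"var.", "subsp.", "fo."}


def split_name(raw):
    """Split a raw name string into separate fields (illustrative sketch)."""
    text = raw
    annotations = []

    # Informal uncertainty markers (question marks) are treated as "cf."
    if "?" in text:
        text = text.replace("?", " ")
        annotations.append("cf.")

    # Try the longest annotations first so that "vel. sp. aff." is not
    # mistaken for a bare "aff."; match whitespace-delimited text only.
    for ann in sorted(ANNOTATIONS, key=len, reverse=True):
        pattern = r"(?<!\S)" + re.escape(ann) + r"(?!\S)"
        if re.search(pattern, text):
            text = re.sub(pattern, " ", text)
            if ann not in annotations:
                annotations.append(ann)

    tokens = text.split()
    fields = {
        "Genus": tokens[0] if tokens else "",
        "Species_epithet": "",
        "Sp_auth": "",
        "Infraspecific": [],  # up to two (rank, epithet, authority) tuples
        "Annotation": "; ".join(annotations),
    }

    rest = tokens[1:]
    # A lowercase token directly after the genus is taken as the epithet.
    if rest and rest[0] not in RANKS and rest[0][0].islower():
        fields["Species_epithet"] = rest.pop(0)

    # Everything up to the next rank marker is treated as the authority.
    auth = []
    while rest and rest[0] not in RANKS:
        auth.append(rest.pop(0))
    fields["Sp_auth"] = " ".join(auth)

    # Split off at most two subspecific levels.
    while rest and len(fields["Infraspecific"]) < 2:
        rank = rest.pop(0)
        epithet = rest.pop(0) if rest else ""
        auth = []
        while rest and rest[0] not in RANKS:
            auth.append(rest.pop(0))
        fields["Infraspecific"].append((rank, epithet, " ".join(auth)))

    return fields
```

Calling split_name("Quercus alba L.") with this sketch fills Genus, Species_epithet, and Sp_auth as in the example above. The sketch does not handle indets such as "Miconia sp." or unrecognized free text, which TaxonScrubber flags for inspection.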
Other TaxonScrubber features
- File management. TaxonScrubber imports, names, backs up, and manages source files within the database
environment. Original files are left untouched until the user has completed the scrubbing process, and chooses
to export the scrubbed file and replace the original.
- Archiving of source names. Prior to scrubbing, TaxonScrubber archives the original names, unchanged, for comparison with the "scrubbed" versions. After scrubbing, these fields can be kept or deleted at the user's discretion.
- Hand-scrubbing. TaxonScrubber features tools for manual inspection of taxonomic
fields, including filters which display only records containing selected standard annotations, and matching to
pull-down menus of standard names or names within the original file.
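
The idea of flagging exact matches and then offering candidate names for the remainder can be sketched as follows. This is a hedged illustration, not TaxonScrubber's matching method: it substitutes fuzzy string matching from Python's standard difflib module for the pull-down menus, and the function name match_name, the REFERENCE_NAMES list, and the cutoff value are all assumptions made for the example.

```python
import difflib

# Hypothetical miniature reference list standing in for a full name list.
REFERENCE_NAMES = ["Quercus alba", "Quercus rubra", "Miconia calvescens"]


def match_name(name, reference=REFERENCE_NAMES, cutoff=0.8, n=3):
    """Flag exact matches; otherwise suggest close candidates for hand scrubbing."""
    if name in reference:
        return {"matched": True, "candidates": [name]}
    # get_close_matches returns up to n names scoring at least `cutoff`,
    # ordered from most to least similar.
    candidates = difflib.get_close_matches(name, reference, n=n, cutoff=cutoff)
    return {"matched": False, "candidates": candidates}
```

A name that returns no candidates would remain unmatched and could then be treated as a morphospecies or an indet, as described earlier.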
New with TaxonScrubber 1.1
- Matching of names to alternative taxonomies and synonymized lists. Version 1.1 of TaxonScrubber
permits loading of alternative reference taxonomies, and provides information on name status and synonymy when the
source taxonomy is also synonymized. Currently available lists include a provisional list of vascular plant species
of the world (invalid names flagged, but no synonymy) and a synonymized checklist of the gymnosperms and flowering plants
of Peru (see below for more details). Other regional and monographic
reference taxonomies in preparation will be released over the coming months.
- Matching to standard names includes authorities. The first version of TaxonScrubber matched names only
down to the specific epithet. Version 1.1 will optionally match authorities as well, when these are present in the source
data and reference taxonomies.
- Improved parsing of name components. Several parsing problems with the first release of TaxonScrubber
have been corrected. For example, TaxonScrubber now correctly distinguishes by context various identical abbreviations
of "filius" ("son of", a component of authority citations such as L.f.), and "forma" (a rank indicator).
- More compact taxonomic reference tables. Reorganization of the taxonomic reference tables and
queries has reduced the size of the taxonomy database by half.
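
Distinguishing "filius" from "forma" by context might look something like the sketch below. The rule used here (an "f." followed by a lowercase epithet is read as forma, anything else as filius) is an assumed heuristic for illustration only; the source does not describe TaxonScrubber's actual rules. The function name classify_f and the RANKS set are hypothetical.

```python
# Hypothetical rank markers that can follow an authority.
RANKS = {"var.", "subsp.", "ssp."}


def classify_f(tokens, i):
    """Classify the abbreviation at tokens[i] as 'forma' or 'filius'.

    Assumed heuristic: if the next token looks like an epithet (purely
    alphabetic, lowercase initial, not a rank marker), the abbreviation
    is a rank indicator; otherwise it belongs to an authority citation.
    """
    nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
    if nxt and nxt.isalpha() and nxt[0].islower() and nxt not in RANKS:
        return "forma"
    return "filius"
```

For example, in the token list for "Quercus alba f. rugosa" (an illustrative name) the "f." is followed by an epithet and is classified as forma, while in "Nepenthes rajah Hook. f." it ends the string and is classified as filius. Genuinely ambiguous strings would still need manual inspection.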

Download the latest version of TaxonScrubber: Version 1.2 (updated September 2004)
TaxonScrubber 1.2 features faster name parsing and matching, and corrects an error in the application of standardized family names.
Installation Notes:
- TaxonScrubber is a PC application for MS Access 2000 or later versions. Access must be installed on your machine to run TaxonScrubber. If you need TaxonScrubber for an earlier version of Access, please contact Brad Boyle: bboyle@email.arizona.edu
- To run TaxonScrubber, you will need to download two files: the main application (filename "TaxonScrubber_v12") and a taxonomic database file (filename beginning with "TS_Taxon_Tables_").
- Taxonomic databases have been reformatted extensively and will not work with the earlier versions of TaxonScrubber (1.0, 1.1 and 1.1b); nor will the old reference tables work with TaxonScrubber version 1.2 (TaxonScrubber_v12). Therefore you will need to download the updated versions of the taxonomic databases as well as the main application.
- Once you have both files on your computer, click on TaxonScrubber. You will be provided with instructions for loading the reference lists into your database.
Download TaxonScrubber version 1.2:
TaxonScrubber (4.2 MB; 27 MB uncompressed)
Download taxonomic databases (updated for TaxonScrubber ver. 1.2):
Currently we offer two taxonomic databases: a world list of vascular plant names from the Missouri Botanical Garden's TROPICOS database (TROPICOS) and a synonymized checklist of the vascular plants of Peru (based on Brako and Zarucchi 1993). In the future we may offer additional taxonomic reference lists, including taxa other than plants.
World plant list. Lookup tables for nearly 1 million plant names. Based on all names in TROPICOS, with additional names of Old World plants from the IPNI source databases. Compilation date: May 2003; reformatted for TaxonScrubber ver. 1.2, Sept. 2004. Warning: this is a very large file (35 MB; 158 MB uncompressed).
Peru plant names. Lookup tables based on the synonymized checklist of the vascular plants of Peru (Brako and Zarucchi 1993). Formatted for TaxonScrubber ver. 1.1, Feb. 2004. (1.5 MB; 5.8 MB uncompressed)
Citing TaxonScrubber 1.2:
Boyle, B.L. 2004 Sep 21. TaxonScrubber Version 1.2. The SALVIAS Project, http://www.salvias.net. (Accessed [date_downloaded]).
Comments or questions concerning TaxonScrubber? Please contact Brad Boyle, bboyle@email.arizona.edu