Bio-TDS Supporting Materials for

Bioinformatic Tools Discovery System

The Bio-TDS (Bioinformatic Tools Discovery System) was developed to help researchers retrieve the most applicable analytic tools by letting them formulate their questions as free text. The Bio-TDS is a flexible retrieval system that lets users from multiple bioscience domains (e.g., genomics, proteomics, bio-imaging) query over 15,000 analytic tool descriptions integrated from well-established community repositories. One of the primary components of the Bio-TDS is its ontology and natural language processing workflow for annotation, curation, query processing, and evaluation. The scientific impact of the Bio-TDS was evaluated using sample questions posed by researchers on Biostars, a site focusing on biological data analysis. The Bio-TDS was compared with five similar bioscience analytic tool retrieval systems and outperformed them in terms of relevance and completeness. The Bio-TDS lets researchers associate their bioscience questions with the most relevant computational toolsets required for data analysis in their knowledge discovery process.

Bioinformatics Elaborated Tools Specifications (BETS) provides a standard for analytic tool descriptions. The analytic tool descriptions (i.e., metadata) gathered from the community tool repositories integrated into the Bio-TDS are stored in JSON format using the BETS standard. This standard consists of core BETS attributes and domain- or repository-specific attributes (see Figure S1). The core BETS attributes are manually mapped to the repository attributes.
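The manual mapping between repository attributes and core BETS attributes can be illustrated with a minimal Java sketch. It is not the Bio-TDS implementation: the class name, the attribute names (`tool_name`, `summary`, `homepage`, `owner`), and the three-part record layout are hypothetical placeholders chosen only to show the idea of splitting a repository record into a core section and a repository-specific section.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: all attribute names are hypothetical and are not taken
// from the BETS standard or from any real repository schema.
class BetsMappingSketch {

    // Hand-written mapping from one repository's attribute names
    // to (assumed) core attribute names.
    static final Map<String, String> CORE_MAPPING = Map.of(
            "tool_name", "name",
            "summary", "description",
            "homepage", "url");

    // Copies mapped attributes into a "core" section and keeps every
    // unmapped attribute in a "repositorySpecific" section.
    static Map<String, Object> toBets(String repository, Map<String, Object> source) {
        Map<String, Object> core = new LinkedHashMap<>();
        Map<String, Object> specific = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : source.entrySet()) {
            String coreName = CORE_MAPPING.get(entry.getKey());
            if (coreName != null) {
                core.put(coreName, entry.getValue());
            } else {
                specific.put(entry.getKey(), entry.getValue());
            }
        }
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("repository", repository);
        record.put("core", core);
        record.put("repositorySpecific", specific);
        return record;
    }

    public static void main(String[] args) {
        Map<String, Object> source = new LinkedHashMap<>();
        source.put("tool_name", "example-aligner");
        source.put("summary", "Aligns short reads to a reference.");
        source.put("owner", "example-user");
        // Prints the nested map; a JSON library would serialize it for storage.
        System.out.println(toBets("example-repo", source));
    }
}
```

Keeping unmapped attributes in a separate section, rather than discarding them, mirrors the split between core and repository-specific attributes described above.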

The Bio-TDS combines bioinformatics tools from five other repositories and stores them in one central location, following BETS (Bioinformatics Elaborated Tool Specification). Six main modules convert the data from each of the five repositories into BETS tools and store the new tools in the Bio-TDS database. The BETS Checker is a Java application that tests whether a tool is compatible with the BETS specification. A tool is considered “compatible” if it is in the format specified by the corresponding BETS converter. For example, the system contains a mapper called Galaxy Converter; a tool from the Galaxy Tool Shed is “compatible” only if it matches the predefined Galaxy format.
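A compatibility test of this kind could look like the following Java sketch. It is an illustration under stated assumptions, not the BETS Checker itself: it reduces "matches the predefined format" to "contains a set of required, non-empty attributes", and the class name and required attribute names are hypothetical placeholders rather than the actual Galaxy format.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch only: the required attribute names are hypothetical and do not
// describe the real format expected by the Galaxy Converter.
class BetsCheckerSketch {

    static final Set<String> REQUIRED_GALAXY_FIELDS = Set.of("name", "description", "owner");

    // Returns the required attributes that are absent or empty, sorted by name.
    static List<String> missingFields(Map<String, ?> tool, Set<String> required) {
        List<String> missing = new ArrayList<>();
        for (String field : required) {
            Object value = tool.get(field);
            if (value == null || value.toString().trim().isEmpty()) {
                missing.add(field);
            }
        }
        Collections.sort(missing);
        return missing;
    }

    // A tool counts as compatible here when no required attribute is missing.
    static boolean isCompatible(Map<String, ?> tool, Set<String> required) {
        return missingFields(tool, required).isEmpty();
    }

    public static void main(String[] args) {
        Map<String, Object> tool = new HashMap<>();
        tool.put("name", "example-aligner");
        tool.put("description", "Aligns short reads to a reference.");
        // "owner" is deliberately left out to show a failing check.
        System.out.println("compatible: " + isCompatible(tool, REQUIRED_GALAXY_FIELDS));
        System.out.println("missing: " + missingFields(tool, REQUIRED_GALAXY_FIELDS));
    }
}
```

Reporting the missing attributes, and not only a yes/no result, makes it easier to see why a tool from a given repository was rejected before conversion.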