This is a line of research that we are not currently pursuing, but that still interests us. Some time ago, with Temple F. Smith, we addressed the problem of finding the query that selects the database subset closest to a given arbitrary subset, a problem that we now term reverse querying. We addressed the problem informally in Guigó et al. (1991), and more rigorously in Guigó and Smith (1993). In the latter paper, we mapped the semantic problem of finding appropriate descriptions in a first-order language (a database query language) into the algebraic problem of finding similar sets in a set algebra. Using the properties of a set similarity measure, we were able to design an efficient algorithm, which was later implemented in a program (Guigó et al., 1993). We were particularly interested in the case in which the given database subset is the set of protein sequences in a database matching a given (possibly randomly generated) pattern, and the query is built on the functional annotation of the database. During its development, the method was tested by automatically searching a protein sequence database for functional amino acid patterns, and a few interesting cases were discovered (Vega et al., 1990; Guigó and Smith, 1992).
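
To make the idea of reverse querying concrete, the following is a minimal sketch and not the algorithm from the papers cited above. It assumes the Jaccard index as the set similarity measure (the text does not name the measure actually used), and it replaces the efficient algorithm with a brute-force search over single annotation terms and pairwise AND/OR combinations. All names (`jaccard`, `reverse_query`, the annotation terms, and the protein identifiers) are hypothetical and chosen only for illustration.

```python
from functools import reduce
from itertools import combinations
import operator


def jaccard(a, b):
    """Jaccard index of two sets (an assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def reverse_query(target, annotations, max_terms=2):
    """Brute-force search for the query whose result set is closest to `target`.

    `annotations` maps each annotation term to the set of database entries
    carrying it. Queries are single terms, or AND/OR combinations of up to
    `max_terms` terms. Returns (best_score, best_query_string).
    """
    best_score, best_query = 0.0, None
    terms = sorted(annotations)  # fixed order keeps the result deterministic
    for k in range(1, max_terms + 1):
        for combo in combinations(terms, k):
            sets = [set(annotations[t]) for t in combo]
            # A single term has only one reading; larger combos get AND and OR.
            ops = [("AND", operator.and_)] if k == 1 else [
                ("AND", operator.and_),
                ("OR", operator.or_),
            ]
            for name, op in ops:
                selected = reduce(op, sets)
                score = jaccard(target, selected)
                if score > best_score:
                    best_score = score
                    best_query = f" {name} ".join(combo)
    return best_score, best_query


# Hypothetical toy database: annotation term -> entries annotated with it.
annotations = {
    "kinase": {"P1", "P2", "P3"},
    "membrane": {"P3", "P4"},
    "atp-binding": {"P1", "P2", "P5"},
}
# Hypothetical target subset, e.g. the entries matching some sequence pattern.
target = {"P1", "P2"}

score, query = reverse_query(target, annotations)
print(score, query)
```

In this toy case the target subset plays the role of the sequences matching a pattern, and the returned query is the annotation-based description whose result set is most similar to it. The exhaustive enumeration grows quickly with the number of terms, which is the cost that an efficient algorithm exploiting the properties of the similarity measure would avoid.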

