Fig. 1 | Journal of Cheminformatics

Fig. 1

From: Advanced SPARQL querying in small molecule databases

SPARQL query using similarity search. This example SPARQL query uses a procedure call named orchem:similaritySearch to identify compounds that are similar to a given structure. The task of the query is to select all compounds that are not annotated as antibiotics but are similar to a compound that is annotated as an antibiotic. In addition to the compounds, the query returns each compound's similarity score to its most similar antibiotic. The first triple pattern (line 6) binds the ATB variable to compounds that are annotated as antibiotics (identified by ChEBI ID 33281). The following triple pattern (line 7) binds the MOLFILE variable to the MOL structures of these compounds. The procedure call is identified by the orchem:similaritySearch IRI and is represented by the triple pattern on lines 10–14. The blank node used in the object position (lines 12–14) represents the parameters of the procedure call. The query structure is denoted by the orchem:query IRI, and its value is specified by MOLFILE. The other parameters are constants: the type of the query structure is denoted by the orchem:queryType IRI, and the cutoff similarity score is denoted by the orchem:cutoff IRI. The blank node used in the subject position (line 10) represents the multi-value results of the procedure call. The COMPOUND variable is bound to the similar compounds found (identified by the orchem:compound IRI), and the SCORE variable is bound to their respective similarity scores (identified by the orchem:score IRI). The minus pattern (lines 17–21) eliminates all identified compounds (to which COMPOUND is bound) that are annotated as antibiotics. Finally, the results are grouped by COMPOUND (line 23), and the compounds (COMPOUND variable) and their maximum similarity scores to any antibiotic (MAXSCORE variable) are returned as the final result (line 3).
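The query structure described in the caption can be sketched roughly as follows. This is an illustrative reconstruction, not the figure's actual text: the orchem: and ex: namespace IRIs, the ex:hasRole and ex:molfile predicates, the "MOL" query type value, and the 0.8 cutoff are all assumptions made for the sketch, and its line layout does not follow the line numbers cited in the caption.

```sparql
# Hypothetical namespaces: only the prefix names orchem: and the ChEBI ID
# come from the caption; the IRIs below for orchem: and ex: are placeholders.
PREFIX orchem: <http://example.org/orchem#>
PREFIX ex:     <http://example.org/schema#>
PREFIX chebi:  <http://purl.obolibrary.org/obo/CHEBI_>
PREFIX xsd:    <http://www.w3.org/2001/XMLSchema#>

# Return each non-antibiotic compound with its best score against any antibiotic.
SELECT ?COMPOUND (MAX(?SCORE) AS ?MAXSCORE)
WHERE {
  # Compounds annotated as antibiotics (ex:hasRole is an assumed predicate).
  ?ATB ex:hasRole chebi:33281 .
  # MOL structure of each antibiotic (ex:molfile is an assumed predicate).
  ?ATB ex:molfile ?MOLFILE .

  # Procedure call: the subject blank node carries the multi-value results,
  # the object blank node carries the parameters.
  [ orchem:compound ?COMPOUND ;
    orchem:score    ?SCORE ]
      orchem:similaritySearch
  [ orchem:query     ?MOLFILE ;
    orchem:queryType "MOL" ;              # assumed constant value
    orchem:cutoff    "0.8"^^xsd:float ] . # assumed cutoff value

  # Drop hits that are themselves annotated as antibiotics.
  MINUS {
    ?COMPOUND ex:hasRole chebi:33281 .
  }
}
GROUP BY ?COMPOUND
```

Because one compound can be similar to several antibiotics, the procedure call may yield several SCORE bindings per COMPOUND; grouping by COMPOUND and taking MAX(?SCORE) keeps only the score to the most similar antibiotic.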
