Text Analytics

I have a table docquery with columns docid,term,count.

This represents how many times a term occurs in a document denoted by docid. I pick a set of search terms from the same table and find out which of these terms occur the most number of times in other documents.

I am not excluding the document(9_txt) in which the search term occurs.

I am sure this query can be simplified.

Sample data

1_txt|industrys|1
2_txt|follow|1
3_txt|yes|1
4_txt|deferring|1
5_txt|wilf|1
6_txt|future|2
7_txt|reason|1
8_txt|dropped|2
9_txt|doesnt|2
9_txt|yes|2
select f1d,sum(s) sim FROM
(select f1d,f1t,sum(f1c) s FROM
(select f1.docid f1d,f1.term f1t,sum(f1.count) f1c
        FROM docquery f1
                WHERE f1.term IN (select term from docquery
                                        WHERE docquery.docid = '9_txt')
        GROUP BY f1.docid,f1.term)
GROUP BY f1d
ORDER BY s)
        GROUP BY f1d
        ORDER BY sim;