**Commenced** in January 2007

**Frequency:** Monthly

**Edition:** International

##### Learning to Order Terms: Supervised Interestingness Measures in Terminology Extraction

**Authors:**
Jérôme Azé,
Mathieu Roche,
Yves Kodratoff,
Michèle Sebag

**Abstract:**

**Keywords:**
Text-mining,
Terminology Extraction,
Evolutionary algorithm,
ROC Curve.

**Digital Object Identifier (DOI):**
doi.org/10.5281/zenodo.1333849

**References:**

[1] T. Bäck, Evolutionary Algorithms in Theory and Practice, 1995.

[2] D. Bourigault and C. Jacquemin, "Term Extraction + Term Clustering: An Integrated Platform for Computer-Aided Terminology," Proc. of EACL, Bergen, pp. 15-22, 1999.

[3] L. Breiman, "Arcing Classifiers," Annals of Statistics, vol. 26, no. 3, pp. 801-845, 1998.

[4] R. Caruana and A. Niculescu-Mizil, "Data Mining in Metric Space: An Empirical Analysis of Supervised Learning Performance Criteria," Proc. of "ROC Analysis in AI" Workshop ECAI, pp. 9-18, 2004.

[5] K.W. Church and P. Hanks, "Word Association Norms, Mutual Information, and Lexicography," Computational Linguistics, vol. 16, pp. 22-29, 1990.

[6] W. Cohen, R. Schapire, and Y. Singer, "Learning to Order Things," Journal of Artificial Intelligence Research, vol. 10, pp. 243-270, 1999.

[7] B. Daille, E. Gaussier, and J.M. Langé, "An Evaluation of Statistical Scores for Word Association," The Tbilisi Symposium on Logic, Language and Computation, CSLI Publications, pp. 177-188, 1998.

[8] P. Domingos, "Meta-Cost: A general method for making Classifiers Cost Sensitive," Knowledge Discovery from Databases, pp. 155-164, 1999.

[9] T.E. Dunning, "Accurate Methods for the Statistics of Surprise and Coincidence," Computational Linguistics, vol. 19, no. 1, pp. 61-74, 1993.

[10] R. Esposito and L. Saitta, "Monte Carlo Theory as an Explanation of Bagging and Boosting," Proc. of International Joint Conference on Artificial Intelligence, pp. 499-504, Morgan Kaufmann Publishers, 2003.

[11] C. Ferri, P. Flach, and J. Hernandez-Orallo, "Learning decision trees using the area under the ROC curve," Proc. of International Conference on Machine Learning (ICML), pp. 139-146, 2002.

[12] D.B. Fogel, E.C. Wasson, and E.M. Boughton, "Evolving Neural Networks for Detecting Breast Cancer," Cancer Letters, vol. 96, pp. 49-53, 1995.

[13] Y. Freund, R. Iyer, R.E. Schapire, and Y. Singer, "An Efficient Boosting Algorithm for Combining Preferences," Journal of Machine Learning Research, vol. 4, pp. 933-969, 2003.

[14] R. Jin, Y. Liu, L. Si, J. Carbonell, and A. Hauptmann, "A New Boosting Algorithm Using Input-Dependent Regularizer," Proc. of International Conference on Machine Learning (ICML), AAAI Press, 2003.

[15] A. Kolcz, A. Chowdhury, and J. Alspector, "Data duplication: An Imbalance Problem?" Workshop on Learning from Imbalanced Data Sets II (ICML), 2003.

[16] G. Nenadic, H. Mima, I. Spasic, S. Ananiadou, and J. Tsujii, "Terminology-based Literature Mining and Knowledge Acquisition in Biomedicine," International Journal of Medical Informatics, vol. 67, pp. 33-48, 2002.

[17] M. Roche, J. Azé, O. Matte-Tailliez, and Y. Kodratoff, "Mining texts by association rules discovery in a technical corpus," Proc. of IIPWM'04, Springer Verlag, pp. 89-98, 2004.

[18] M. Roche, J. Azé, Y. Kodratoff, and M. Sebag, "Learning Interestingness Measures in Terminology Extraction. A ROC-based approach," Proc. of "ROC Analysis in AI" Workshop ECAI, pp. 81-88, 2004.

[19] S. Rosset, "Model Selection via the AUC," Proc. of International Conference on Machine Learning (ICML), 2004.

[20] R.E. Schapire, "Theoretical views of boosting," Proc. of European Conference on Computational Learning Theory, pp. 1-10, 1999.

[21] M. Sebag, N. Lucas, and J. Azé, "ROC-based Evolutionary Learning: Application to Medical Data Mining," Proc. of International Conference on Artificial Evolution (EA), Springer Verlag, pp. 384-396, 2004.

[22] M. Sebag, N. Lucas, and J. Azé, "Impact studies and sensitivity analysis in medical data mining with ROC-based genetic learning," Proc. of IEEE International Conference on Data Mining (ICDM), pp. 637-640, 2003.

[23] F. Smadja, "Retrieving collocations from text: Xtract," Computational Linguistics, vol. 19, no. 1, pp. 143-177, 1993.

[24] F. Smadja, K.R. McKeown, and V. Hatzivassiloglou, "Translating collocations for bilingual lexicons: A statistical approach," Computational Linguistics, vol. 22, no. 1, pp. 1-38, 1996.

[25] V.N. Vapnik, The Nature of Statistical Learning Theory, Springer Verlag, 1995.

[26] J. Vivaldi, L. Marquez, and H. Rodriguez, "Improving Term Extraction by System Combination Using Boosting," Lecture Notes in Computer Science, vol. 2167, pp. 515-526, 2001.

[27] I.H. Witten, G.W. Paynter, E. Frank, C. Gutwin, and C.G. Nevill-Manning, "Kea: Practical automatic keyphrase extraction," Proc. of DL '99, pp. 254-256, 1999.

[28] F. Xu, D. Kurz, J. Piskorski, and S. Schmeier, "A Domain Adaptive Approach to Automatic Acquisition of Domain Relevant Terms and their Relations with Bootstrapping," Proc. of LREC 2002, the Third International Conference on Language Resources and Evaluation, 2002.