Biodiversity Informatics References

Alberga, C. N. 1967. String similarity and misspellings. Communications of the ACM 10(5):302-313.

Baeza-Yates, R., and G. Navarro. 1996. A faster algorithm for approximate string matching in Proceedings of the 7th Annual Symposium on Combinatorial Pattern Matching, (D. Hirschberg, and G. Myers (eds.)), Laguna Beach, California, USA.

Ballou, D. P., and H. L. Pazer. 1985. Modeling Data and Process Quality in Multi-Input, Multi-Output Information Systems. Management Science 31(2):150-162.

Ballou, D. P., R. Wang, H. Pazer, and G. K. Tayi. 1998. Modeling information manufacturing systems to determine information product quality. Management Science 44(4):462-484.

Bossy, R. 2001. An Edition Control Policy Model for Scientific Collaborative Databases in Proceedings of the Second International Conference on Web Information Systems Engineering (WISE'01), (M. T. Özsu, H. Schek, K. Tanaka, Y. Zhang, and Y. Kambayashi (eds.)), IEEE Computer Society Press. Kyoto, Japan. 1:134-141.

Bossy, R. 2002. Édition coopérative de bases de données scientifiques. Laboratoire Informatique et Systématique, École doctorale Logique du Vivant, UFR Sciences de la Nature et de la Vie. Université Pierre et Marie Curie. Doctorat.

Brill, E., and R. C. Moore. 2000. An improved error model for noisy channel spelling correction in Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics (ACL 2000), Hong Kong, China.

Cavnar, W. B., and J. M. Trenkle. 1994. N-Gram-based text categorization in Proceedings of the 3rd Annual Symposium on Document Analysis and Information Retrieval (SDAIR-94).

Chen, P. P. 1976. The entity-relationship model - toward a unified view of data. ACM Trans. Database Syst. 1(1):9-36.

Codd, E. F. 1970. A Relational Model of Data for Large Shared Data Banks. Communications of the ACM 13(6):377-387.

Damerau, F. J. 1964. A technique for computer detection and correction of spelling errors. Communications of the ACM 7(3):171-176.

Dey, D., V. C. Storey, and T. M. Barron. 1999. Improving database design through the analysis of relationships. ACM Trans. Database Syst. 24(4):453-486.

Ehm, M. G., M. Kimmel, and R. W. Cottingham Jr. 1996. Error detection for genetic data using likelihood methods. American Journal of Human Genetics 58(1).

Embury, S. M. 2001. Data quality issues in information systems. Research Report. Cardiff University, School of Computer Science. Cardiff. 41 pp.

Engels, R., and C. Theusinger. 1998. Using a data metric for preprocessing advice for data mining applications in Proceedings of the ECAI 98 - 13th European Conference on Artificial Intelligence, (H. Prade (ed.)), John Wiley & Sons, Ltd.:5.

English, L. P. 1999. Improving data warehouse and business information quality: methods for reducing costs and increasing profits. John Wiley & Sons, Inc., New York. 518 pp.

Fayyad, U., and R. Uthurusamy. 1996. Data Mining and Knowledge Discovery in Databases. Communications of the ACM 39(11):24-26.

Fayyad, U., D. Haussler, and P. Stolorz. 1996. Mining scientific data. Communications of the ACM 39(11):51-57.

Fayyad, U., G. Piatetsky-Shapiro, and P. Smyth. 1996. The KDD Process for extracting useful knowledge from volumes of data. Communications of the ACM 39(11):27-34.

Fisher, C. W., and B. R. Kingma. 2001. Criticality of data quality as exemplified in two disasters. Information & Management 39(2001):109-116.

Fürnkranz, J. 1998. A study using n-gram features for text categorization. Research Report. Report Number OEFAI-TR-98-30. Austrian Research Institute for Artificial Intelligence. Wien. 10 pp.

Gadd, T. N. 1988. 'Fisching fore werds': phonetic retrieval of written text in information systems. Program 22(3):222-237.

Gadd, T. N. 1990. Phonix: The algorithm. Program 24(4):363-366.

Galhardas, H., D. Florescu, D. Shasha, and E. Simon. 1999. An Extensible Framework for Data Cleaning. Technical Report. Report Number RR-3742. INRIA.

Galhardas, H., D. Florescu, D. Shasha, and E. Simon. 2000. AJAX: An extensible data cleaning tool in Proceedings of the SIGMOD 2000, ACM. Dallas, TX, USA:590.

Galhardas, H., D. Florescu, D. Shasha, E. Simon, and C. Saita. 2001. Declarative Data Cleaning: Language, Model, and Algorithms. Rapport de recherche. Report Number 4149. INRIA - Institut National de Recherche en Informatique et en Automatique. 37 pp.

Gallaire, H., J. Minker, and J. Nicolas. 1984. Logic and database: A deductive approach. ACM Computing Surveys 16(2):153-185.

Galli, E. J., and H. Yamada. 1967. An automatic dictionary and the verification of machine-readable text. IBM Systems Journal 6(3):192-207.

Gertz, M., and I. Schmitt. 1998. Data integration techniques based on data quality aspects in Proceedings of the Workshop Föderierte Datenbanken, (I. Schmitt, C. Türker, E. Hildebrandt, and M. Höding (eds.)):1-9.

Golding, A. R., and D. Roth. 1996. Applying winnow to context-sensitive spelling correction in Proceedings of the 13th International Conference on Machine Learning (ICML-96), Bari, Italy:9.

Golding, A. R., and Y. Schabes. 1996. Combining Trigram-based and Feature-based Methods for Context-Sensitive Spelling Correction. Research Report. Mitsubishi Electric Research Laboratories. Cambridge, MA. 71-78 pp.

Han, J., and M. Kamber. 2001. Data mining: concepts and techniques. Morgan Kaufmann Publishers, San Francisco, CA. 550 pp.

Harding, S. M., W. B. Croft, and C. Weir. 1997. Probabilistic retrieval of OCR degraded text using N-Grams in Proceedings of the European Conference on Digital Libraries (ECDL'97), (C. Peters, and C. Thanos (eds.)), Pisa, Italy:345-359.

Hernandez, M. A., and S. J. Stolfo. 1995. The merge/purge problem for large databases in Proceedings of the SIGMOD' 95, ACM. San Jose, CA, USA:127-138.

Hernandez, M. A., and S. J. Stolfo. 1998. Real-world data is dirty: data cleaning and the merge/purge problem. Journal of Data Mining and Knowledge Discovery 2(1):9-37.

Hodge, V. J., and J. Austin. 2001. An evaluation of phonetic spell checkers. Research Report. Report Number YCS 338. Department of Computer Science, University of York. York. 1-8 pp.

Holmes, D., and M. C. McCabe. 2002. Improving Precision and Recall for Soundex Retrieval in Proceedings of the 2002 IEEE International Conference on Information Technology - Coding and Computing (ITCC), Las Vegas, Nevada, USA.

Huffman, S. 1995. Acquaintance: Language-independent document categorization by N-Grams in Proceedings of the 4th Text Retrieval Conference - TREC-4, (D. K. Harman (ed.)), Department of Commerce, National Institute of Standards and Technology. Gaithersburg, Maryland:359-372.

Jain, A. K., M. N. Murty, and P. J. Flynn. 1999. Data Clustering: A Review. ACM Computing Surveys 31(3).

Jin, L., C. Li, and S. Mehrotra. 2002. Efficient similarity string joins in large data sets in Proceedings of the 28th Very Large Databases Conference, Hong Kong.

Kahn, B. K., D. M. Strong, and R. Y. Wang. 2002. Information Quality Benchmarks: Product and Service Performance. Communications of the ACM 45(4):184-192.

Kesh, S. 1995. Evaluating the quality of entity relationship models. Information and Software Technology 37(12):681-689.

Klein, B. D. 2001. Detecting errors in data: Clarification of the impact of base rate expectations and incentives. Omega 29:391-404.

Kleinz, T. 2005. World of Knowledge. Linux Magazine 51:84-86.

Kukich, K. 1992. Techniques for automatically correcting words in text. ACM Computing Surveys 24(4):377-439.

Lee, M. L., H. Lu, T. W. Ling, and Y. T. Ko. 1999. Cleansing Data for Mining and Warehousing. Research Report. School of Computing, National University of Singapore. Singapore. 10 pp.

Lee, M. L., T. W. Ling, and W. L. Low. 2000. IntelliClean: A knowledge-based intelligent data cleaner in Proceedings of the KDD 2000, ACM. Boston:5.

Lee, Y. W., D. M. Strong, B. K. Kahn, and R. Y. Wang. 2002. AIMQ: a methodology for information quality assessment. Information & Management 40(2002):133-146.

Levitin, A., and T. C. Redman. 1995. Quality Dimensions of a Conceptual View. Information Processing and Management 31(1):81-88.

Lowrance, R., and R. A. Wagner. 1975. An extension of the string-to-string correction problem. Journal of the Association for Computing Machinery 22(2):177-183.

Maletic, J. I., and A. Marcus. 1999. Progress report on automated data cleansing. Technical Report. Report Number CS-99-02. The Department of Mathematical Sciences Division of Computer Science, The University of Memphis. Memphis, TN. 13 pp.

Maletic, J. I., and A. Marcus. 2000. Automated identification of errors in data sets. Technical Report. Report Number CS-00-02. The Department of Mathematical Sciences Division of Computer Science, The University of Memphis. Memphis. 25 pp.

Maletic, J. I., and A. Marcus. 2000. Data Cleansing: Beyond Integrity Analysis. Research Report. Report Number IQ2000. Division of Computer Science, Department of Mathematical Sciences, The University of Memphis. Memphis. 10 pp.

Marcus, A., and J. I. Maletic. 2000. Utilizing association rules for the identification of errors in data. Technical Report. Report Number CS-00-04. Division of Computer Science, Department of Mathematical Sciences, The University of Memphis. Memphis. 20 pp.

Marcus, A., J. I. Maletic, and K. I. Lin. 2001. Ordinal Association Rules for Error Identification in Data Sets. Report Number PaperID: 257. Division of Computer Science, Department of Mathematical Sciences, The University of Memphis. Memphis, TN. 15 pp.

Marteleto, R. M. 2001. Análise de redes sociais - Aplicação nos estudos de transferência da informação. Ci. Inf. 30(1):71-81.

Martin, S., J. Liermann, and H. Ney. 1995. Algorithms for bigram and trigram word clustering in Proceedings of the European Conference on Speech Communication and Technology, Madrid, Spain:1253-1256.

Martinho, C. 2003. Redes - uma introdução às dinâmicas da conectividade e da auto-organização, 1st ed. WWF - Brasil, Brasília - DF.

Mathieu, R. G., and O. Khalil. 1998. Data Quality in the Database Systems Course. Data Quality Journal 4(1):12.

Michelis, G., E. Dubois, M. Jarke, F. Matthes, J. Mylopoulos, M. P. Papazoglou, K. Pohl, J. Schmidt, C. Woo, and E. Yu. 1997. Cooperative Information Systems: A Manifesto in Cooperative Information Systems: Trends & Directions (M. P. Papazoglou, and G. Schlageter, eds.). Academic Press.

Model, F., T. König, C. Piepenbrock, and P. Adorján. 2002. Statistical process control for large scale microarray experiments. Bioinformatics 1(1):1-9.

Moerkotte, G., and P. C. Lockemann. 1991. Reactive consistency control in deductive databases. ACM Trans. Database Syst. 16(4):670-729.

Monge, A. E. 2000. An Adaptive and Efficient Algorithm for Detecting Approximately Duplicate Database Records. Research Report. California State University, Long Beach, CECS Department. Long Beach, CA. 17 pp.

Monge, A. E. 2000. Matching Algorithms Within a Duplicate Detection System. Bulletin of the Technical Committee on Data Engineering 23(4):14-20.

Mora, S. L., and M. Palomar. 2001. Reducing Inconsistency in Integrating Data From Different Sources in Proceedings of the International Database Engineering & Applications Symposium, IEEE Computer Society Press:209-218.

Morgan, H. L. 1970. Spelling correction in systems programs. Communications of the ACM 13(2):90-94.

Motro, A. 1989. Integrity = Validity + Completeness. ACM Trans. Database Syst. 14(4):480-502.

Motro, A., and I. Rakov. 1996. Estimating the Quality of Data in Relational Databases in Proceedings of the 1996 Conference on Information Quality:94-106.

Motro, A., and I. Rakov. 1998. Estimating the quality of databases. Lecture Notes in Computer Science 1495:298-308.

Möller, E. 2004. Collective Authoring. Linux Magazine 42:54-59.

Naumann, F. 2001. From Database to Information Systems - Information Quality Makes the Difference. Research Report. IBM Almaden Research Center. San Jose, CA. 17 pp.

Navarro, G. 2001. A guided tour to approximate string matching. ACM Computing Surveys 33(1):31-88.

Navarro, G., R. Baeza-Yates, and J. M. A. Arcoverde. 2001. Matchsimile: A flexible approximate matching tool for personal names searching in Proceedings of the SBBD '01, Sao Paulo, SP, Brazil:228-242.

Orr, K. 1998. Data Quality and Systems Theory. Communications of the ACM 41(2):66-71.

Peterson, J. L. 1980. Computer programs for detecting and correcting spelling errors. Communications of the ACM 23(12):676-687.

Peterson, J. L. 1986. A note on undetected typing errors. Communications of the ACM 29(7):633-637.

Petrakis, E. G. M., and K. Tzeras. 2000. Similarity searching in the CORDIS text database. Software Practice and Experience 30(13):1447-1464.

Pfeifer, U., T. Poersch, and N. Fuhr. 1995. Searching Proper Names in Databases in Proceedings of the Hypertext - Information Retrieval - Multimedia, Synergieeffekte elektronischer Informationssysteme - HIM '95, Universitätsverlag Konstanz. Konstanz:259-276.

Pfeifer, U., T. Poersch, and N. Fuhr. 1996. Retrieval effectiveness of proper name search methods. Information Processing and Management 32(6):667-679.

Piattini, M., M. Genero, C. Calero, M. Polo, and F. Ruiz. 2000. Database Quality. Pages 485-509 in Advanced Database Technology and Design. Artech House, Inc.

Pipino, L. L., Y. W. Lee, and R. Y. Wang. 2002. Data Quality Assessment. Communications of the ACM 45(4):211-218.

Pollock, J. J., and A. Zamora. 1984. Automatic spelling correction in scientific and scholarly text. Communications of the ACM 27(4):358-368.

Porter, M. F. 1980. An algorithm for suffix stripping. Program 14(3):130-137.

Raghavan, V. V., S. G. Jung, and P. Bollmann. 1989. A critical investigation of recall and precision as measures of retrieval system performance. ACM Transactions on Information Systems 7(3):205-229.

Rahm, E., and H. H. Do. 2000. Data Cleaning: Problems and Current Approaches. Bulletin of the Technical Committee on Data Engineering 23(4):3-13.

Raman, V., and J. M. Hellerstein. 2001. Potter's Wheel: An Interactive Data Cleaning System in Proceedings of the 27th VLDB Conference, Rome, Italy.

Redman, T. C. 1996. Data Quality for the Information Age. Artech House, Inc. 303 pp.

Redman, T. C. 1998. The Impact of Poor Data Quality on the Typical Enterprise. Communications of the ACM 41(2):79-82.

Roughton, K. G., and D. A. Tyckoson. 1985. Browsing with sound: Sound-based codes and automated authority control. Information Technology and Libraries 4(2):130-136.

Sennhauser, R. 1993. Improving the recognition accuracy of text recognition systems using typographical constraints. Electronic Publishing 6(3):273-282.

Shen, H., and P. Dewan. 1992. Access Control for Collaborative Environments in Proceedings of the ACM CSCW'92 Conference on Computer-Supported Cooperative Work, Toronto, Ontario, Canada:51-58.

Singh, L., P. Scheuermann, and B. Chen. 1997. Generating Association rules from semi-structured documents using an extended concept hierarchy in Proceedings of the 6th International Conference on Information and Knowledge Management (CIKM'97), ACM. Las Vegas, Nevada:193-200.

Souza, M. I. F., L. G. Vendrusculo, and G. C. Melo. 2000. Metadados para a descrição de recursos de informação eletrônica: utilização do padrão Dublin Core. Ci. Inf. 29(1):93-102.

Storey, V. C., and R. Y. Wang. 1998. Modeling Quality Requirements in Conceptual Database Design in Proceedings of the Conference on Information Quality:64-87.

Strong, D. M., Y. W. Lee, and R. Y. Wang. 1997. 10 Potholes in the Road to Information Quality. IEEE Computer 30(8):38-46.

Strong, D. M., Y. W. Lee, and R. Y. Wang. 1997. Data quality in context. Communications of the ACM 40(5):103-110.

Tayi, G. K., and D. P. Ballou. 1998. Examining Data Quality. Communications of the ACM 41(2):54-57.

Tu, S. Y., and R. Y. Wang. 1993. Modeling Data Quality and Context Through Extension of the ER Model in Proceedings of the WITS-'93 Conference, Orlando, Florida.

Vermeer, B. H. P. J. 2000. How important is data quality for evaluating the impact of EDI on global supply chains? in Proceedings of the 33rd Hawaii International Conference on System Sciences, IEEE Computer Society Press.

Viégas, F. B., M. Wattenberg, and K. Dave. 2004. Studying cooperation and conflict between authors with history flow visualizations in Proceedings of the Conference on Human Factors in Computing Systems - CHI, ACM. Vienna, Austria:1-8.

Wagner, R. A. 1974. The string-to-string correction problem. Journal of the Association for Computing Machinery 21(1):168-173.

Wand, Y., and R. Y. Wang. 1996. Anchoring Data Quality Dimensions in Ontological Foundations. Communications of the ACM 39(11):86-95.

Wang, J. T. L., Q. H. Ma, D. Shasha, and C. H. Wu. 2000. Application of neural networks to biological data mining: A case study in protein sequence classification in Proceedings of the KDD 2000, ACM. Boston.

Wang, R. Y., H. B. Kon, and S. E. Madnick. 1993. Data Quality Requirements Analysis and Modeling in Proceedings of the Ninth International Conference on Data Engineering, Vienna, Austria.

Wang, R. Y., M. P. Reddy, and H. B. Kon. 1995. Toward quality data: An attribute-based approach. Decision Support Systems 13(1995):349-372.

Wang, R. Y., V. C. Storey, and C. P. Firth. 1995. A framework for analysis of data quality research. IEEE Transactions on Knowledge and Data Engineering 7(4):623-639.

Winkler, W. E. 2001. Quality of Very Large Databases. Research Report. Report Number RR2001/04. U.S. Bureau of the Census, Methodology and Standards Directorate, Statistical Research Division. Washington D.C. 12 pp.

Wong, C. K., and A. K. Chandra. 1976. Bounds for the string editing problem. Journal of the Association for Computing Machinery 23(1):13-16.

Wright, P. 1998. Knowledge discovery preprocessing: Determining record usability in Proceedings of the 1998 36th Annual Southeast Conference, ACM, New York, NY, USA. Marietta, GA, USA.

Zaïane, O. R. 1999. Introduction to Data Mining. Research Report. Report Number CMPUT690. University of Alberta, Department of Computing Science. 15 pp.

Zamora, E., J. J. Pollock, and A. Zamora. 1981. The use of trigram analysis for spelling error detection. Information Processing and Management 17(6):305-316.

Zobel, J., and P. Dart. 1996. Phonetic String Matching: Lessons from Information Retrieval in Proceedings of the 19th International Conference on Research and Development in Information Retrieval, (H. P. Frei, D. Harman, P. Schäuble, and R. Wilkinson (eds.)), ACM Press. Zurich, Switzerland:166-172.