An Open Access Approach to Scientific Information Management at the Brazilian Agricultural Research Corporation

Patrícia Rocha Bello Bertin, Isaque Vacari, Victor Paulo Marques Simão, & Marcos Cezar Visoli

Embrapa

Fernando César Lima Leite

University of Brasília & Embrapa

Abstract: This article presents the experience of the Brazilian Agricultural Research Corporation (Embrapa)—a large state-owned company that plays an important global role in research, development, and innovation for tropical agriculture—in the planning and implementation of Open Access to scientific information in the context of a developing country. The aim of this initiative is to provide the necessary mechanisms to capture, store, organize, preserve, retrieve, and widely disseminate the scientific information produced by Embrapa and by agricultural research communities. This report concludes with a discussion of the obstacles encountered and the organizational features, cultural considerations, and political matters that facilitate open access implementation at Embrapa.

Keywords: Embrapa; Scientific communication; Scientific information management; Institutional repository

Patrícia Rocha Bello Bertin is an Assistant-Manager for Information Organization and Diffusion at Embrapa Technological Information, Parque Estação Biológica (PqEB), Avenida W3 Norte (final), Brasília, DF, CEP 70770-901, Brazil. Email: patrcia@sct.embrapa.br. Fernando César Lima Leite is a PhD candidate, Department of Information Science, University of Brasília, as well as a librarian and the coordinator of the Open Access initiative at Embrapa. Email: fernandodfc@gmail.com. Isaque Vacari is a Systems Analyst at Embrapa Agricultural Informatics, Avenida André Tosello, 209, Barão Geraldo, Campinas, SP, CEP 13083-886, Brazil. Email: isaque@cnptia.embrapa.br. Victor Paulo Marques Simão is a Librarian at Embrapa Environment, Rodovia SP 340, Km 127,5, Caixa Postal 69, Jaguariúna, SP, CEP 13820-000, Brazil. Email: victor@cnpma.embrapa.br. Marcos Cezar Visoli is a Systems Analyst at Embrapa Agricultural Informatics. Email: visoli@cnptia.embrapa.br.

Introduction

More than an emergent model for scientific communication, Open Access may be understood as a new paradigm for scientific information management in universities and research institutes. Open access philosophy, strategies, and tools take into consideration the major peculiarities of the science communication process and the needs of scientific information management in a digital environment (identifying, capturing, storing, organizing, digitally preserving, and, principally, widely disseminating information).

Provided that the enabling conditions for scientific information management are in place, Open Access responds satisfactorily to the typical situation of universities and research institutes, especially those located in developing countries: low visibility and poor governance mechanisms for research outputs, which are dispersed over several scientific periodicals (mostly restricted-access journals) and conference proceedings worldwide. Moreover, deficient or absent information management tools appropriate to the context and environment of research institutes and universities obscure the institutional origins of scientific production. As an antidote to these problems, the strategies and tools of Open Access enable the monitoring of institutional scientific production through information management processes that are aligned with scientific communication processes, both internal and external to the institution.

Considering this scenario, Embrapa has directed efforts toward incorporating the philosophy, strategies, and tools of Open Access. This article presents the current stage of open access implementation at the corporation and the strategies that were adopted. It also outlines the relationship between the Embrapa facilities and their objectives, and explains the obstacles encountered due to cultural, political, legal, and organizational matters.

Embrapa and its contribution to agricultural science

Embrapa was created in 1973 to “provide feasible solutions for the sustainable development of Brazilian agribusiness through knowledge and technology generation and transfer.” It has built a national and international reputation as a leading company for tropical agriculture research, development, and innovation (R&D&I) and is present in almost all 27 Brazilian Federative Units and the most diverse biomes through its 54 Central and Decentralized Units (see Figure 1): 38 of them devoted to research, 3 to services, and 13 to administration. The corporation has also intensified its international activities through the creation of Embrapa Virtual Laboratories, Labex, in the United States (http://www.embrapa.br/a_embrapa/labex/labex-estados-unidos/labex-usa), France (http://www.agropolis.fr/international/labex.html), and the Netherlands, as well as the Embrapa Business Offices Abroad in Ghana and Venezuela.

Figure 1: Distribution of Embrapa’s units throughout Brazil

To help build Brazil’s leadership in tropical agriculture, Embrapa has invested primarily in training: 24.8% of its researchers hold master’s degrees and 74.0% hold PhDs. Embrapa also coordinates the National Agricultural Research System (SNPA, in Portuguese), composed of numerous federal- and state-owned institutions, universities, private companies, and foundations that carry out research in agriculture and related areas in the different regions of the country. SNPA-generated technologies have solved century-old problems associated with the production, domestic supply, and insertion of foodstuffs and fibres into international markets, as well as problems related to renewable energy. An 87% increase in land productivity from 1970 to 2006, achieved through the technological development of Brazilian agriculture, prevented the conversion of forested land into farmland. Attaining the current level of agricultural production with the technology available decades ago would have required three times the land area under grain, i.e., clearing 90 million hectares of forest. Such major preservation of natural resources is an invaluable contribution by Embrapa to the mitigation of global warming (Embrapa, 2008).

Agricultural research outputs are now considered a fundamental element of science and technology (S&T) planning in developing countries. That is especially true in Brazil, since agricultural science is the most important national contribution to global scientific production, with 4,139 papers produced and indexed in the Institute for Scientific Information - Web of Science between 2003 and 2007, which represents 4% of the whole world’s production. Embrapa is surely responsible for the majority of the agricultural research outputs in Brazil. Figure 2 shows the scientific production of Embrapa’s researchers indexed in the Web of Science between 2003 and 2007.

Figure 2: Embrapa’s scientific production indexed in the Web of Science, 2003–2007

Source: Secretariat for Management and Strategy, Embrapa.

Table 1, compiled from institutional data, shows Embrapa’s overall scientific production between 2000 and 2007 (complete conference papers, scientific journal articles, book chapters, supervision of theses and dissertations, and short conference papers).

Table 1: Embrapa’s scientific production numbers, 2000–2007

Embrapa’s potential as a world leader in agricultural knowledge production is undeniable. Over the past few years, Embrapa has observed great qualitative and quantitative jumps in institutional scientific production, which were not accompanied by the appropriate evolution of information services. Therefore, implementing Open Access means a great advance with regard to scientific information management at Embrapa.

The institutional project for Open Access at Embrapa

Nevertheless, Embrapa still has a long way to go before the scientific knowledge generated by the corporation is organized and easily retrievable through specific information systems and through the Web. The electronic publishing and bibliographic cataloguing requirements still need to be integrated into interoperable systems. Besides contributing to the management of Embrapa’s research outputs, the proposed strategy will benefit research itself by providing access to scientific knowledge produced externally.

For the reasons set out above, and in accordance with the emerging paradigms of scientific information management and communication, Embrapa has joined the open access movement. The project Open Access at Embrapa: Potentiating Research Impacts, Visibility and the Scientific Information Management, coordinated by two of the corporation’s decentralized units, Embrapa Technological Information and Embrapa Agricultural Informatics, is part of the portfolio of institutional projects. Its main aim is to propose and implement a model for scientific information management based on open access statements and policies to support research and development activities.

As elsewhere in the world, Embrapa has adopted both strategies for Open Access: the so-called golden road, which promotes open access directly through the publishing of scientific journals edited within the organization, and the green road, through self-archiving scientific publications in the institutional repository.

This project’s specific objectives are: i) to describe Embrapa researchers’ scientific communication patterns; ii) to evaluate the features of institutional scientific outputs; iii) to adopt the Open Journal Systems (OJS) model as the management system for institutional scientific journals; iv) to build the institutional open access repository for capturing, storing, organizing, preserving, retrieving, and widely disseminating Embrapa’s scientific production; v) to identify and select external data providers that could be useful for Embrapa’s researchers; and vi) to build a network for open agricultural scientific information through the construction of a service provider (metadata harvester) that will collect information from different internal and external repositories and journals.

Embrapa’s Open Access Model comprises the following elements (see Figure 3):

  • Internal scientific information (data providers): composed of the digital scientific journals edited by the institution and the institutional repository (an essential element whose functions are to store, organize, preserve, retrieve, and widely disseminate the institution’s intellectual production).
  • External scientific information (data providers): channels all scientific production concerning the institutional fields of interest that is available in an open access environment and uses the OAI-PMH protocol.
  • Embrapa’s service provider: collects metadata that describe all the contents stored in the data providers, giving access to the institutional intellectual production and to external information sources.
  • Institutional self-archiving policy: considered the main motivating factor for populating Embrapa’s institutional repository.

Figure 3: Open access model for Embrapa

Preliminary results

Preliminary results of the project are presented in the two following sections emphasizing the informational and technological aspects. The first section approaches the efforts directed to internal scientific information, represented by the scientific journals edited by Embrapa, the institutional repository and the policy of compulsory deposit. The second section takes into consideration the external scientific information as inputs to R&D&I activities at Embrapa. Scientific journals, institutional repositories, subject repositories, and digital libraries represent inputs from different external scientific communities of interest to the institution.

Internal scientific information

Scientific journals

Most Brazilian scientific journals, as is common in developing countries, have never been financially supported by journal subscriptions and are not associated with the business model of giant publisher conglomerates. Undoubtedly, this situation has favoured implementation of the golden road to Open Access in Brazil, especially considering the successful experience of the Brazilian Institute of Scientific and Technological Information (IBICT; www.ibict.br) in diffusing Open Journal Systems (OJS), which has been adopted all around the country, and the pioneering Scientific Electronic Library Online (SciELO; www.scielo.org), which has recently embraced OJS.

The situation at Embrapa is not much different. Although there is no specific regulation, scientific journals edited by the corporation have always been strongly oriented to Open Access, taking advantage of digital technologies. However, the “golden road” was effectively instituted only after the adoption of OJS and the automation of editorial processes, which guarantee interoperability and worldwide dissemination of publications.

The Brazilian Journal of Agricultural Research (PAB; http://seer.sct.embrapa.br/index.php/pab/index) implemented OJS in 2007, even before an institutional model for Open Access was conceived, as a solution for managing the journal’s electronic publishing (see Figure 4). PAB is indexed in the Web of Science and follows a hybrid model of publication: the printed version has been accompanied by the journal’s open access electronic version since 1997. Since October 2007, all publishing procedures have been conducted through OJS, and a project for the migration of the older electronic version (1991–2007) and the digitization of volumes prior to 1991 is currently under way. The journal was also indexed by SciELO in 1999.

Figure 4: Brazilian Journal of Agricultural Research (PAB) in OJS

Besides PAB, Embrapa publishes other scientific journals: the Science & Technology Journal (CC&T), Text for Discussion, the Journal of Agricultural Policy, and the Brazilian Journal of Oleaginous and Fibrous Plants. The initial objective is to centralize the journals’ management by creating a single scientific journal portal using OJS. Building on the experience of adopting OJS at PAB, the implementation of OJS in the other scientific journals published by the institution is being planned.

It is interesting to note that OJS has also been called upon to meet other institutional needs. Each of Embrapa’s 38 research units has a Local Publication Committee (LPC) whose roles are to receive, evaluate, process, and organize manuscripts; to guide employees on the journals’ requirements for manuscript submission; and to follow the process until publication. Embrapa Technological Information is responsible for supervising the LPCs and is currently assisting with the automation of these committees’ workflow to improve publishing processes and the dissemination of publications.

Institutional repository

One of the features that differentiates institutional repositories from other information services is their ability to address researchers’ needs, such as the demand for greater visibility of research outputs (a service that libraries performed only marginally in the past), in addition to managing scientific information appropriately.

Before developing an institutional repository, it is highly recommended to evaluate the existing information organization and retrieval systems so that systems and organizational processes can be integrated correctly. Such an evaluation is essential, especially in large institutions. The survey conducted within Embrapa revealed a number of systems for information organization and retrieval and pointed to the need to integrate processes, and thus technological platforms, with the Embrapa System of Libraries (SEB, in Portuguese) and its library management system, Informática Agropecuária (Ainfo). SEB is composed of 40 libraries, 39 of which are located in research units and one in the central administrative unit. There are 66 information professionals distributed among these libraries. The whole system is coordinated by the Assistant Manager for Information Organization and Diffusion at Embrapa Technological Information, whose principal role is to select, organize, and disseminate the scientific and technological information acquired and produced by Embrapa to society in general. Ainfo (http://www.ainfo), for its part, is the information system for SEB management, developed and maintained by the Open Source Laboratory at Embrapa Agricultural Informatics. This software allows the creation and maintenance of bibliographic databases, management of collections, and automation of library routines.

An important additional function of Ainfo is to support the process of performance evaluation of Embrapa’s research units, which is carried out by the Secretariat for Management and Strategy with assistance of the SEB. One of the standards through which the research units are evaluated is the achievement of quantitative targets related to scientific production. The proof is obligatory, and the information professionals at the libraries are responsible for obtaining the researchers’ scientific production and cataloguing it in Ainfo to support the evidence of goal attainment. Thus, integration with SEB and Ainfo was found to be strategic for the success of the Open Access Model being implemented, since the obligation of evidencing scientific production is already a policy of compulsory deposit.

Embrapa’s institutional repository is one of the main components of the Open Access Model. It functions to put together, store, organize, retrieve, and, finally, broadly disseminate institutional scientific information. In the repository, all central and decentralized units of Embrapa will be represented.

Since there was no digital environment for organizing and storing scientific production until recently, research units at Embrapa commonly publish electronic files on their homepages without much curation. Because of this, software was developed to capture all scientific production available on the 39 homepages of Embrapa’s research units and automatically store these files in the institutional repository. This first pass created a repository with 6,491 items.

The integration with SEB and Ainfo gives Embrapa’s institutional repository some peculiarities, especially the role played by information professionals as moderators of document deposit in the repository, given that the Open Access Model does not advocate self-archiving. All scientific production was already monitored and catalogued in a bibliographic database in Ainfo, and these processes were incorporated into the deposit flow for the institutional repository, as detailed below:

  • Researchers deliver a scientific article to the library, in person or electronically.
  • Librarians upload the electronic file to the institutional repository and fill in only the title and the name of one author. A URL for the document is automatically generated in the institutional repository.
  • Librarians catalogue the article completely in Ainfo and include the URL of the electronic file generated by the institutional repository in a specific field.
  • All metadata describing the article that was entered in Ainfo is automatically migrated to the institutional repository, where it updates the article’s metadata.
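The deposit flow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Repository` and `Catalogue` classes, their methods, the base URL, and the sample article are all hypothetical placeholders, not the actual interfaces of DSpace or Ainfo.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class RepositoryItem:
    """Minimal stand-in for an item held in the institutional repository."""
    url: str
    metadata: dict = field(default_factory=dict)


class Repository:
    """Hypothetical repository: stores items and assigns each one a URL."""

    def __init__(self, base_url):
        self.base_url = base_url
        self.items = {}
        self._ids = count(1)

    def upload(self, title, first_author):
        # Step 2: the librarian supplies only a title and one author name;
        # the repository generates the URL.
        url = f"{self.base_url}/item/{next(self._ids)}"
        self.items[url] = RepositoryItem(url, {"title": title, "authors": [first_author]})
        return url

    def update_metadata(self, url, metadata):
        # Step 4: the complete catalogue record replaces the stub metadata.
        self.items[url].metadata = dict(metadata)


class Catalogue:
    """Hypothetical stand-in for the Ainfo bibliographic database."""

    def __init__(self, repository):
        self.repository = repository
        self.records = []

    def catalogue(self, record, file_url):
        # Step 3: the full record carries the repository URL in its own field.
        full_record = {**record, "file_url": file_url}
        self.records.append(full_record)
        # Step 4: migrate the descriptive metadata back to the repository item.
        descriptive = {k: v for k, v in full_record.items() if k != "file_url"}
        self.repository.update_metadata(file_url, descriptive)
        return full_record


# Invented sample data, for illustration only.
repo = Repository("https://repository.example.org")
ainfo = Catalogue(repo)

url = repo.upload("Soil carbon in no-till systems", "Silva, A.")
ainfo.catalogue(
    {
        "title": "Soil carbon in no-till systems",
        "authors": ["Silva, A.", "Souza, B."],
        "year": 2009,
        "type": "journal article",
    },
    url,
)
```

In this sketch the article is described in full only once, in the catalogue, and the repository item is then updated from that record, which is the property that avoids cataloguing the same document twice.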

The integration between the institutional repository and Ainfo prevents duplicate cataloguing of the same document, semi-automates the deposit of documents, decentralizes metadata entry, links the deposit of scientific production to the processes and tools for institutional performance evaluation, and creates the operational basis of the policy of compulsory deposit.

At present the institutional repository is an experimental installation of DSpace (version 1.5.2) with access restricted to Embrapa’s internal users. The definitive software version will be refined according to desirable functionalities such as the Manakin interface and some add-ons developed by the University of Minho (Portugal) related to embargo operation and the statistics package. A detailed analysis is currently in progress, and several versions of DSpace are being evaluated before a final decision is made.

Building the self-archiving policy

The policy component is considered the main motivation factor for populating institutional repositories around the world. At Embrapa, the institutional policy on self-archiving research outputs will follow the international recommendations. At present, a draft policy is being proposed and defended for mandatory deposit of scientific production in the institutional repository.

Registering the scientific production of researchers is a procedure already established at Embrapa that is related to the performance evaluation processes. Internal policies obligate researchers to communicate and provide proof of their scientific production. The self-archiving policy will thus be accomplished through a single one-step process. The elaboration of the self-archiving policy is being discussed with the Research and Development Department and the Legal Advisory Office (DPD and AJU, respectively, in Portuguese), central units of Embrapa subordinate to the president of Embrapa.

External scientific information

The Open Access Model adopted at Embrapa deals with both information resulting from R&D activities and the information researchers need to do their job. That is, besides allowing scientific information management and enhancing scientific information visibility through the institutional repository, the model also systematizes access to external scientific information on the institution’s fields of interest.

To offer fast and easy access to the different external open access data providers, a service provider is being built using metadata harvesting software. The software was chosen through an analysis comprising the following stages:

  • Identification of open source metadata harvesting tools that use the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). Six tools were identified: PKP Metadata Harvester (http://pkp.sfu.ca), MOD OAI (www.modoai.org), OAI Harvester (http://webservices.itcs.umich.edu/mediawiki/dlxs14/index.php/OAI_Harvester), OAI Arc (http://oaiarc.sourceforge.net), JOAI Harvester (www.dlese.org/dds/services/joai_software.jsp), and OAI Harvester OCLC (www.oclc.org/research/software/oai/harvester2.htm).
  • Analysis of the software tools identified, considering i) the frequency of updates and the availability of new versions; ii) the Web interface and/or command line for data collection; iii) the Web interface for searching the data collected; iv) the documentation and the support offered for setting up and installing the metadata collection tool. After this stage, the PKP Metadata Harvester, JOAI Harvester, and OAI Arc were pre-selected.
  • Setup and installation of the pre-selected software tools.
  • Testing the performance of the data providers and evaluating the functionalities available. The following aspects were considered: the administrator interface for data collection; the format of metadata supported for the collection; the data collected; the storage format for data collected; the log registered for data collected; the data collection for community/collection; the data collection for more than two digital repositories simultaneously; recollecting data from digital repositories collected before; collecting/recollecting data using command line; locking access to data collection module; activating/inactivating the digital repository for data collection; searching parameters and interface; indexation and search mechanisms; search by digital repository/collection; search by Boolean operators; grouping search results (facets); alternatives for classifying search results; the filter options for search results; browsing digital repositories and other ways of navigation; visualizing the original register; highlighting the search result that was encountered; re-indexing database for search; bugs.
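The core task that all of the harvesters evaluated above perform can be illustrated with a minimal sketch of an OAI-PMH `ListRecords` exchange, using only the Python standard library. The verb, the `metadataPrefix`, `set`, and `resumptionToken` arguments, and the XML namespaces come from the OAI-PMH and Dublin Core specifications; the base URL, the function names, and the sample response are assumptions made for illustration and are not taken from any of the tools listed.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, resumption_token=None):
    """Build the URL of an OAI-PMH ListRecords request."""
    if resumption_token:
        # A resumption token is an exclusive argument: it replaces the others.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def parse_list_records(xml_text):
    """Return (records, resumption_token) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": rec.findtext("oai:header/oai:identifier", namespaces=NS),
            "titles": [e.text for e in rec.iterfind(".//dc:title", NS)],
            "creators": [e.text for e in rec.iterfind(".//dc:creator", NS)],
        })
    token = root.findtext(".//oai:resumptionToken", namespaces=NS)
    # An empty or missing token means the list is complete.
    return records, (token or None)


# Invented sample response, trimmed to the elements the parser reads.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example article</dc:title>
          <dc:creator>Example, Author</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

records, token = parse_list_records(SAMPLE)
next_url = list_records_url("https://repo.example.org/oai", resumption_token=token)
```

A real harvester would fetch each URL, repeat the request with the returned token until none is left, and store the records for indexing; several of the evaluation criteria listed above (logging, re-collection, collection by set) concern exactly those steps.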

The JOAI Harvester was chosen because it addressed Embrapa’s needs for creating the data provider. The service provider systematizes the flow and channels all scientific production in areas of institutional interest that is available in the open access environment based on the OAI-PMH protocol. A preliminary survey identified a total of 261 data providers in areas of interest to Embrapa. Among them, 52 are national journals, 74 are foreign scientific journals, 27 are institutional and thematic repositories, 4 are conference repositories, and 104 are national and foreign journals available on SciELO. The goal of the institutional service provider is to provide access to the entire intellectual output of the institution and to external sources of information. Through a single interface, internal and external users will be able to search all journals and institutional and subject repositories previously selected and collected in accordance with the policies established. The service provider will be the element responsible for integrating all internal and external data providers (institutional repositories, scientific journals, digital libraries, and others). Thanks to interoperability standards, it is expected that at a later stage the service provider will be integrated with other scientific information services, such as catalogues and other Embrapa databases.
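As a quick check on the survey figures quoted above, the category counts can be tallied against the stated total. The counts are those given in the text; the dictionary keys and the percentage breakdown are added here only for illustration.

```python
# Data providers by type, as reported in the preliminary survey.
providers = {
    "national journals": 52,
    "foreign scientific journals": 74,
    "institutional and thematic repositories": 27,
    "conference repositories": 4,
    "journals available on SciELO": 104,
}

total = sum(providers.values())
assert total == 261  # matches the total stated in the text

# Share of each category in the total, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in providers.items()}
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {share}%")
```

The breakdown shows that journals account for most of the data providers identified, with repositories making up a much smaller share.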

Final considerations

Some characteristics of Embrapa certainly affect the planning and design of new information services. These characteristics sometimes favour and sometimes hinder the implementation of open access strategies. Although Embrapa’s Open Access Model has been built taking into consideration institutional specificities, its implementation is still directly influenced by cultural, political, legal, and organizational peculiarities.

Some obstacles to the implementation of strategies for open access are related to the fact that Embrapa is a research institution of great size, to the coexistence of different areas of knowledge and scientific cultures, to the geographically dispersed decentralized units, and to regional influences and the need for a decentralized management model. These characteristics give the institution a heterogeneity and complexity in terms of modes of knowledge production, scientific communication patterns of researchers, information flows, and information management processes. Therefore, such a heterogeneous and complex institutional environment requires open access strategies that accommodate different needs and perspectives.

Moreover, the recent international expansion experienced by Embrapa, with the establishment of new laboratories abroad and transnational cooperation projects, requires improved international visibility of the corporation’s research outputs. One of the main benefits of Open Access is to increase the visibility and impact of research outputs, and no doubt its adoption is a strategic action for Embrapa’s international expansion and performance. Similarly, conditions such as an organizational culture oriented to the intensive production of knowledge, an aggressive program of training and development of human resources in Brazil and abroad, a system for institutional performance evaluation that is founded on the pillars of scientific production, and the lack of digital environments for joining together, organizing, and disseminating research outputs make Embrapa’s environment a favourable one for Open Access.

Finally, linear models of the science communication process, with limits, actors, and their respective roles and responsibilities clearly established, are not sufficient to represent the complexity and the diversity of the scientific communication system. In the first instance, the philosophy and strategies of Open Access provide the necessary conditions for the reform of the scientific communication system and the management of scientific institutional information, and they also provide an open network within which this information can be broadly shared. Embrapa has realized that its R&D&I activities are not properly supported by the traditional systems of communication and information management, considering the emergent new digital scenario. Accordingly, the institution has been incorporating the assumptions of Open Access and adopting the necessary strategies to implement it.

References

Ainfo. (Informática Agropecuária). [Webpage]. URL:  http://www.ainfo.cnptia.embrapa.br/index.php/P%C3%A1gina_principal .

Brazil. Ministry of Agriculture, Livestock and Supply. Embrapa. Agricultural Informatics. Ainfo. [Webpage]. URL: http://www.ainfo.cnptia.embrapa.br/index.php/P%C3%A1gina_principal .

Brazilian Institute of Scientific and Technological Information (IBICT). [Website]. URL: http://www.ibict.br .

Brazilian Journal of Agricultural Research (PAB). [Online journal]. URL: http://seer.sct.embrapa.br/index.php/pab/index .

Embrapa. [Website]. URL: http://www.embrapa.br .

Embrapa. (2008). V Plano Diretor da Embrapa: 2008-2011-2023. Brasília: Secretaria de Gestão e Estratégia.

Embrapa Labex Europe. [Website, France]. URL: http://www.agropolis.fr/international/labex.html .

JOAI Harvester. [Website]. URL: www.dlese.org/dds/services/joai_software.jsp .

MOD OAI. [Website]. URL: www.modoai.org .

OAI Arc. [Website]. URL: http://oaiarc.sourceforge.net .

OAI Harvester. [Website]. URL: http://webservices.itcs.umich.edu/mediawiki/dlxs14/index.php/OAI_Harvester .

OAI Harvester OCLC. [Website]. URL: www.oclc.org/research/software/oai/harvester2.htm .

PKP Metadata Harvester. [Website]. URL: http://pkp.sfu.ca .

Scientific Electronic Library Online (SciELO). [Website]. URL: http://www.scielo.org .

CCSP Press
Scholarly and Research Communication
Volume 1, Issue 1, Article ID 010102, 12 pages
Journal URL: www.src-online.ca
Received July 20, 2009, Accepted September 2, 2009, Published December 21, 2009

Bertin, Patrícia Rocha Bello, Leite, Fernando César Lima, Vacari, Isaque, Simão, Victor Paulo Marques, Visoli, Marcos Cezar. (2010). An Open Access Approach to Scientific Information Management at the Brazilian Agricultural Research Corporation. Scholarly and Research Communication, 1(1): 010102, 12 pp.

© 2010 Patrícia Rocha Bello Bertin, Fernando César Lima Leite, Isaque Vacari, Victor Paulo Marques Simão, Marcos Cezar Visoli. This Open Access article is distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc-nd/2.5/ca), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
