Difference between revisions of "Dataverse"

From COPTR
(Updated information further since it hasnt been updated since 2013.)
m (Fixed incorrect links)
Line 1: Line 1:
 
{{Infobox_tool
 
{{Infobox_tool
|purpose=The Dataverse is an open source ([https://github.com/IQSS/dataverse.org code is available on GitHub]) web application to share, preserve, cite, explore and analyze research data.
+
|purpose=The Dataverse is an open source ([https://github.com/IQSS/dataverse code is available on GitHub]) web application to share, preserve, cite, explore and analyze research data.
 
|image=
 
|image=
 
|homepage=http://dataverse.org/
 
|homepage=http://dataverse.org/
Line 16: Line 16:
  
 
= Description =
 
= Description =
'Dataverse' is an open source ([https://github.com/IQSS/dataverse.org code is available on GitHub]) web application to share, preserve, cite, explore and analyze research data. It facilitates making data available to others, and allows you to replicate others' work ([http://dataverse.org/about/see their About page]). Researchers, data authors, publishers, data distributors, and affiliated institutions all receive appropriate credit via a data citation with a persistent identifier (DOI, or Handle).
+
'Dataverse' is an open source ([https://github.com/IQSS/dataverse code is available on GitHub]) web application to share, preserve, cite, explore and analyze research data. It facilitates making data available to others, and allows you to replicate others' work ([http://dataverse.org/about/ see their About page]). Researchers, data authors, publishers, data distributors, and affiliated institutions all receive appropriate credit via a data citation with a persistent identifier (DOI, or Handle).
  
 
A Dataverse repository hosts multiple dataverses ([http://guides.dataverse.org/en/4.0/_images/Dataverse-Diagram.png see diagram]). Each dataverse contains dataset(s) or other dataverses, and each dataset contains descriptive metadata and data files (including documentation and code that accompany the data - [http://guides.dataverse.org/en/4.0/_images/DatasetDiagram.png see diagram]).  
 
A Dataverse repository hosts multiple dataverses ([http://guides.dataverse.org/en/4.0/_images/Dataverse-Diagram.png see diagram]). Each dataverse contains dataset(s) or other dataverses, and each dataset contains descriptive metadata and data files (including documentation and code that accompany the data - [http://guides.dataverse.org/en/4.0/_images/DatasetDiagram.png see diagram]).  
Line 28: Line 28:
 
====Platform and interoperability====
 
====Platform and interoperability====
 
The Dataverse makes use of the following components: Java Server Faces; Enterprise Java Beans; PostgreSQL; Solr; and R and Zelig.
 
The Dataverse makes use of the following components: Java Server Faces; Enterprise Java Beans; PostgreSQL; Solr; and R and Zelig.
Prerequisites for installation include Oracle JDK or OpenJDK 1.7.x, a “virgin” installation of Glassfish Version 4.1+, preferably as part of the NetBeans Web Development bundle, PostgreSQL Version 9.3+, and R.
+
Prerequisites for installation include Oracle JDK or OpenJDK, a “virgin” installation of Glassfish Version 4.1+, preferably as part of the NetBeans Web Development bundle, PostgreSQL Version 9.3+, and R.
 
The software was designed to integrate reCAPTCHA, Google Analystics, ImageMagick, Shibboleth, and DOI registration via EZID if the installer so wishes.
 
The software was designed to integrate reCAPTCHA, Google Analystics, ImageMagick, Shibboleth, and DOI registration via EZID if the installer so wishes.
 
The Dataverse currently has [http://guides.dataverse.org/en/latest/api/index.html multiple open APIs available], which allow for searching, depositing and accessing data.
 
The Dataverse currently has [http://guides.dataverse.org/en/latest/api/index.html multiple open APIs available], which allow for searching, depositing and accessing data.

Revision as of 14:34, 5 January 2016

The Dataverse is an open source (code is available on GitHub) web application to share, preserve, cite, explore and analyze research data.
Homepage: http://dataverse.org/


Description

'Dataverse' is an open source (code is available on GitHub) web application to share, preserve, cite, explore and analyze research data. It facilitates making data available to others, and allows you to replicate others' work (see their About page). Researchers, data authors, publishers, data distributors, and affiliated institutions all receive appropriate credit via a data citation with a persistent identifier (DOI or Handle).

A Dataverse repository hosts multiple dataverses (see diagram). Each dataverse contains datasets or other dataverses, and each dataset contains descriptive metadata and data files (including documentation and code that accompany the data; see diagram).
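
The containment hierarchy described above can be sketched as a small data model. This is an illustrative sketch only, not the Dataverse schema: the class names, field names, and sample values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataFile:
    # A data file, or documentation/code that accompanies the data.
    name: str


@dataclass
class Dataset:
    # Descriptive metadata plus the files that belong to the dataset.
    metadata: Dict[str, str]
    files: List[DataFile] = field(default_factory=list)


@dataclass
class Dataverse:
    # A dataverse holds datasets and may nest other dataverses.
    name: str
    datasets: List[Dataset] = field(default_factory=list)
    children: List["Dataverse"] = field(default_factory=list)

    def count_datasets(self) -> int:
        """Count datasets here and in all nested dataverses."""
        nested = sum(child.count_datasets() for child in self.children)
        return len(self.datasets) + nested


# Hypothetical sample content, for illustration only.
root = Dataverse("Example University")
lab = Dataverse("Example Lab")
lab.datasets.append(
    Dataset(
        metadata={"title": "Survey 2015"},
        files=[DataFile("survey.csv"), DataFile("codebook.pdf")],
    )
)
root.children.append(lab)

print(root.count_datasets())
```

The recursive count shows why the nesting matters: a dataset belongs to exactly one dataverse, but is reachable from every dataverse above it.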

Provider

Institute for Quantitative Social Science at Harvard University, along with many collaborators and contributors worldwide.

Licensing and cost

Apache 2 License – free.

Development activity

Version 4.0 was released in April 2015. The current version (in December 2015) is 4.2.2. The software is under continual development, as shown by an active issue-tracking page. The project is sponsored by Harvard and appears to have support for the foreseeable future.

Platform and interoperability

The Dataverse makes use of the following components: JavaServer Faces; Enterprise JavaBeans; PostgreSQL; Solr; and R and Zelig. Prerequisites for installation include Oracle JDK or OpenJDK, a “virgin” installation of Glassfish Version 4.1+, preferably as part of the NetBeans Web Development bundle, PostgreSQL Version 9.3+, and R. The software was designed to integrate reCAPTCHA, Google Analytics, ImageMagick, Shibboleth, and DOI registration via EZID if the installer so wishes. The Dataverse currently has multiple open APIs available, which allow for searching, depositing and accessing data.
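
As a rough illustration of the search capability mentioned above, the following sketch builds a request URL for the Search API. It assumes the `/api/search` endpoint with its `q`, `type`, and `per_page` parameters as described in the API guide; the base URL and query string are placeholders, and the helper name `build_search_url` is hypothetical. The code only constructs the URL and does not contact a server.

```python
from urllib.parse import urlencode


def build_search_url(base_url: str, query: str,
                     item_type: str = "dataset", per_page: int = 10) -> str:
    """Build a Search API URL (assumed endpoint: /api/search)."""
    params = {
        "q": query,            # search terms
        "type": item_type,     # assumed values: dataverse, dataset, file
        "per_page": per_page,  # number of results per page
    }
    return f"{base_url.rstrip('/')}/api/search?{urlencode(params)}"


# Placeholder installation URL; substitute the repository you want to query.
url = build_search_url("https://demo.dataverse.org", "climate data")
print(url)
```

The resulting URL can be fetched with any HTTP client; the response is expected to be JSON, but its exact structure should be checked against the API guide for the installed version.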

Functional notes

Dataverses can be configured for multiple levels of access (at the dataverse, dataset and file level). Dataverse will accept any format, but gives full support to tabular data and FITS files (an astronomy format). SPSS, Stata, R and CSV are the preferred formats; data in these formats are eligible for subsetting, pre-defined measurements, download in multiple formats, and a Universal Numerical Fingerprint (UNF). Dataverse can register DOIs through EZID, which allows the repository to assign persistent identifiers to datasets.
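
A persistent identifier is useful because it can be turned into a stable link through a public resolver. The sketch below assumes identifiers are written with a `doi:` or `hdl:` prefix and maps them to the doi.org and hdl.handle.net resolvers; the function name and the sample identifiers are hypothetical.

```python
# Public resolvers for the two identifier schemes mentioned in this article.
RESOLVERS = {
    "doi": "https://doi.org/",
    "hdl": "https://hdl.handle.net/",
}


def resolver_url(persistent_id: str) -> str:
    """Turn 'doi:...' or 'hdl:...' into a resolvable URL."""
    scheme, sep, value = persistent_id.partition(":")
    if not sep or not value:
        raise ValueError(f"not a prefixed identifier: {persistent_id!r}")
    try:
        return RESOLVERS[scheme.lower()] + value
    except KeyError:
        raise ValueError(f"unsupported scheme: {scheme!r}") from None


# Hypothetical identifiers, for illustration only.
print(resolver_url("doi:10.1234/EXAMPLE"))
print(resolver_url("hdl:1902.1/00000"))
```

Because the resolver, not the repository's own address, is what gets cited, the link keeps working even if the hosting installation moves.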

Documentation and user support

The website contains extensive software documentation, including user, installer, and developer guides. A users' Google Group appears to be reasonably active, and a dedicated support email address is also available.

Usability

The Dataverse software provides a web-based interface for both administrators and users. The package includes an installer, which is run from the command line; the basic install is designed to be very quick. Comfort with the command-line interface and general systems knowledge appear to be crucial for configuration and for installing any add-ons.

Expertise required

To take full advantage of the archival management features in the software, users should have a firm grasp on the metadata expectations for their field.

Standards compliance

The software supports numerous metadata standards, including DDI, Dublin Core, DataCite, Virtual Observatory (for astronomy), and ISA-Tab (for biomedical data). Each dataset is also given a data citation with a persistent, globally unique identifier that complies with DataCite and the Joint Declaration of Data Citation Principles.

Influence and take-up

Current installations include

User Experiences

Development Activity