Scholars Portal Dataverse Guide

Research Data Management Services At Your Library

Need help managing your research data? Staff at your local institution’s library can provide assistance with all phases of the
data life cycle, including:

  • developing data management plans
  • documenting research data
  • sharing and long-term preservation of data
  • using online research data repositories (including Scholars Portal Dataverse)
  • publishing options
  • author rights

Please click on the institutional logos below to contact research data services at your institution. 



  • Brock University
  • Carleton University
  • Lakehead University
  • Laurentian University
  • McMaster University
  • Nipissing University
  • OCAD University
  • Queen's University
  • Ryerson University
  • Trent University
  • University of Guelph
  • University of Toronto
  • University of Waterloo
  • University of Windsor
  • University of Ontario Institute of Technology
  • Western University Canada
  • Wilfrid Laurier University
  • York University


Cases - These are the units of analysis, things that have certain characteristics or properties. For example, the cases could
be individuals in a statistics class, all residents of Chicago, hamburgers, cities, countries, organizations, or lakes.  We want
to reach a conclusion about their characteristics.

Data refers to quantitative data *and* research files broadly (e.g. field notes, ethnographic descriptive text, images, etc.). Dataverse accepts all kinds of data and files.

  • Tabular data is quantitative data (numbers) arranged in a table. Dataverse can run statistical analyses only on
    tabular data files. Accepted file formats are: SPSS/POR, SPSS/SAV, Stata, CSV (w/SPSS card), and TAB (w/DDI).
    Dataverse will maintain usability of tabular data files over time. For example, if .sav files become obsolete,
    Dataverse will republish deposited data in new usable formats.
  • Network data is represented in XML files. These files contain information about network properties (nodes,
    edges). Network data is used for network analysis (e.g. social network analysis). Dataverse can visualize
    network data from GraphML files.
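
To make the idea of nodes and edges in an XML file concrete, here is a minimal sketch that reads a tiny GraphML document with Python's standard library. The sample network (three people, two ties) and the function name `read_network` are made up for illustration; this is not how Dataverse itself processes network files.

```python
import xml.etree.ElementTree as ET

# GraphML elements live in this XML namespace.
GRAPHML_NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# A made-up network: three nodes joined by two undirected edges.
SAMPLE_GRAPHML = """\
<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="friendships" edgedefault="undirected">
    <node id="alice"/>
    <node id="bob"/>
    <node id="carol"/>
    <edge source="alice" target="bob"/>
    <edge source="bob" target="carol"/>
  </graph>
</graphml>
"""


def read_network(graphml_text):
    """Return (node ids, (source, target) pairs) from GraphML text."""
    root = ET.fromstring(graphml_text)
    nodes = [n.get("id") for n in root.findall("g:graph/g:node", GRAPHML_NS)]
    edges = [
        (e.get("source"), e.get("target"))
        for e in root.findall("g:graph/g:edge", GRAPHML_NS)
    ]
    return nodes, edges


nodes, edges = read_network(SAMPLE_GRAPHML)
print(len(nodes), "nodes;", len(edges), "edges")
```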

Representation of data, network data, and tabular data

Frequency - The frequency for a value is the number of cases that fall into the category and is also called a "count".
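
As a small illustration of a frequency, the sketch below counts how many cases fall into each category of a variable. The six responses are invented for the example.

```python
from collections import Counter

# Made-up values of the variable COOKED for six hamburgers (the cases).
cooked = ["rare", "medium", "well-done", "medium", "medium", "rare"]

# The frequency (count) of each value is the number of cases with that value.
frequencies = Counter(cooked)

for value, count in frequencies.most_common():
    print(value, count)
```

The frequencies always add up to the total number of cases, which is a quick way to check that no case was missed.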

Metadata - text that describes your research study. Metadata fields include the abstract, keywords, and data
collection mode (among others). All metadata fields in Dataverse are defined on the site itself and are compliant
with DDI standard schema version 2. For an overview of DDI standards, visit the DDI Alliance website. To view a
complete list of DDI fields in Dataverse, see this document (PDF).

Values - These are the possible outcomes for a single variable. They are different for the different cases. Values can
be numbers or named categories. For example, the variable GENDER traditionally has two values, "man" and "woman".
Some people (cases) are men, and some are women.

Variable - This is the characteristic or property in which we are interested. It is a characteristic that pertains to the
cases. A variable must be able to take on different values for different cases. Variables include characteristics like people's
GENDER, people's HEIGHT, the DEPTHS of lakes, lake TEMPERATURES, organizations' REVENUES, and whether a
hamburger is COOKED rare, medium, or well-done. Often we look at two or more different variables at a time and ask
whether they are related for a specific set of cases. For example, we might want to know if GENDER is related to HEIGHT
among human beings or if TEMPERATURE is related to DEPTH for lakes.
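
The relationship between cases, variables, and values can be sketched as a small table in which each row is a case and each column is a variable. The three lakes and their measurements below are invented, and the helper names `values_of` and `varies` are hypothetical.

```python
# Each dictionary is one case (a lake); each key is a variable.
cases = [
    {"LAKE": "North", "DEPTH": 12.5, "TEMPERATURE": 18.0},
    {"LAKE": "Mirror", "DEPTH": 40.0, "TEMPERATURE": 9.5},
    {"LAKE": "Reed", "DEPTH": 3.25, "TEMPERATURE": 21.0},
]


def values_of(variable, cases):
    """Collect the value each case takes on the given variable."""
    return [case[variable] for case in cases]


def varies(variable, cases):
    """True if the variable takes more than one value across these cases."""
    return len(set(values_of(variable, cases))) > 1


# Pair two variables for the same cases, ordered from shallowest to deepest,
# as a first step toward asking whether they are related.
pairs = sorted(zip(values_of("DEPTH", cases), values_of("TEMPERATURE", cases)))
print(pairs)
```

Note that `varies` only checks the cases at hand; a characteristic that happens to be the same for every observed case may still be a variable in a larger set of cases.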

Variable: Character - In this level of measurement, the values of the variable are "qualities" or categoric pigeonholes, which
may or may not be orderable. These categoric values can be given code numbers, but the numbers do not refer to an
equal-interval scale or to real quantities. Generally, we cannot compute a mean or other quantitative summary measures for
the variable. These categories should be exhaustive and mutually exclusive.
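
A short sketch of why code numbers for a character variable are labels rather than quantities: the codes are translated back to their categories, and the summary reported is the most common category instead of a mean. The codebook and the coded responses are made up for illustration.

```python
from collections import Counter

# Hypothetical codebook: each code number stands for exactly one category.
CODEBOOK = {1: "rare", 2: "medium", 3: "well-done"}

# Made-up coded values of the variable COOKED for six cases.
coded = [2, 1, 2, 3, 2, 1]

# Exhaustive: every recorded code must appear in the codebook.
unknown = [code for code in coded if code not in CODEBOOK]

# Translate codes to categories and report the most common one (the mode).
labels = [CODEBOOK[code] for code in coded]
most_common_label, most_common_count = Counter(labels).most_common(1)[0]
print(most_common_label, most_common_count)
```

Averaging the code numbers would run without error, but the result would depend entirely on which arbitrary number was assigned to which category.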

Variable: Continuous - This level of measurement is like the interval-ratio level. The values of the variable are quantitative,
definite meaningful numbers on a scale. Furthermore, we can think of them as points along a continuum that can be
subdivided forever. Measuring length or distance with a ruler is a simple example of collecting data at a continuous level of
measurement. Most researchers treat percentages and other kinds of proportions as continuous data. It makes sense to
compute a mean and other quantitative summary measures for these data.
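
For continuous data, quantitative summaries are meaningful. The sketch below computes a mean and a range for four invented lake depths in metres.

```python
from statistics import mean

# Made-up DEPTH measurements (metres); any fractional value is possible.
depths = [12.5, 3.25, 40.0, 7.25]

mean_depth = mean(depths)
depth_range = max(depths) - min(depths)

print(mean_depth, depth_range)
```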

Variable: Discrete - These are quantitative variables whose values fall along a scale or metric, often with a true 0, but they
are not really continuous. The units of measurement are whole numbers, and it makes little sense to indefinitely subdivide
the units. Only a whole number makes sense for the value. For instance, generally people don't have a fraction of a sibling
or a fractional number of body piercings--only whole numbers. These data are discrete, but notice that the numbers do refer
to a real scale (not just code numbers), and most researchers end up treating them as if they were continuous data. It makes
sense to compute a mean (and other quantitative summary measures). For example, we can talk about the "mean number
of children born to women in the Yukon" and come up with a fractional amount although each woman has only a whole number
of children.
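
The point about fractional means of whole-number data can be seen in a few lines. The five counts below are invented and do not describe any real population.

```python
from statistics import mean

# Made-up number of children for five women; every value is a whole number.
children = [0, 1, 2, 2, 3]

mean_children = mean(children)
print(mean_children)
```

Every individual value is a whole number, yet the mean (8 children across 5 women) is fractional.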

from Garner, R. (2005). The joy of stats: A short guide to introductory statistics in the social sciences. Peterborough, Ont: Broadview Press.