Help & Resources

Data Terminology

Data vs. Statistics

Data are numeric files created and organized for further analysis. There are two types of data: aggregate data and microdata.

Aggregate data are statistical summaries organized in a specific data file structure that permits further analysis. Aggregate data are delivered in a variety of formats, including CANSIM tables, Beyond 2020 files, and Excel spreadsheets.

Microdata consist of the raw data observed or collected from a specific unit of observation (e.g. individual respondents, households, or families). A microdata file is composed of individual records, each consisting of a row of numbers; the columns correspond to the variables that describe each record. Microdata require processing before they are ready for interpretation. Microdata in <odesi> are available for download and subsetting in formats suited to a variety of statistical analysis software. Statistical analysis software is a comprehensive system for analysing data, and many different packages exist. Three of the more common packages used by researchers are SPSS (Statistical Package for the Social Sciences), SAS (Statistical Analysis System), and Stata.

This is an example of microdata from the Canadian Tobacco Use Monitoring Survey (2008). Each row (left to right) represents a respondent. Each column (top to bottom) represents the responses to a particular question. The question codes appear along the top.

Statistics are the summarized tables and cross-tabulations that have been formulated from the raw data files. Statistics are often produced for ready-use and published in the form of e-publications, e-tables, and databases. (adapted from the Statistics Canada DLI Survival Guide)
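The step from microdata to statistics can be sketched in a few lines of Python. The records, variable names, and codes below are invented for illustration and do not come from any actual survey file; the sketch only shows how individual records are counted into a cross-tabulation.

```python
from collections import Counter

# Hypothetical microdata: one record per respondent, one coded value per question.
# Illustrative codes only: SEX 1 = male, 2 = female; SMOKER 1 = yes, 2 = no.
microdata = [
    {"ID": 1, "SEX": 1, "SMOKER": 2},
    {"ID": 2, "SEX": 2, "SMOKER": 1},
    {"ID": 3, "SEX": 2, "SMOKER": 2},
    {"ID": 4, "SEX": 1, "SMOKER": 1},
    {"ID": 5, "SEX": 2, "SMOKER": 2},
]


def cross_tabulate(records, row_var, col_var):
    """Count the records that fall into each (row_var, col_var) combination."""
    return Counter((record[row_var], record[col_var]) for record in records)


# The resulting counts are a statistic: a summary derived from the raw records.
table = cross_tabulate(microdata, "SEX", "SMOKER")

for (sex, smoker), count in sorted(table.items()):
    print(f"SEX={sex} SMOKER={smoker}: {count}")
```

In practice this kind of tabulation is usually done in a statistical package such as SPSS, SAS, or Stata rather than by hand.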

What is a Variable?

“A variable is a characteristic of a statistical unit being observed that may assume more than one of a set of values to which a numerical measure or a category from a classification can be assigned.”
(from Statistics Canada “Definitions, data sources, and methods”)

More generally, a variable is a set of factors, traits, or conditions that make sense together as a unit of analysis. For example, in this question, the variable is “marital status,” and it is made up of the conditions divorced, legally married and not separated, separated and legally married, never legally married, and widowed.
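In a microdata file, a categorical variable like this is typically stored as numeric codes, with the labels documented separately. A minimal sketch, assuming made-up code numbers (real files define their own codes in the accompanying documentation):

```python
# Hypothetical numeric codes for the "marital status" variable.
# The code numbers are assumptions for illustration, not taken from any survey.
MARITAL_STATUS = {
    1: "Divorced",
    2: "Legally married and not separated",
    3: "Separated and legally married",
    4: "Never legally married",
    5: "Widowed",
}


def label_for(code):
    """Return the category label for a code, or a marker for unknown codes."""
    return MARITAL_STATUS.get(code, "Unknown code")


# Four hypothetical respondents' coded answers, translated to labels.
responses = [2, 4, 4, 1]
labels = [label_for(code) for code in responses]
print(labels)
```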

Data Terminology Resources

  • Statistics Canada’s definitions, data sources, and methods: The information is provided to ensure an understanding of the basic concepts that define the data, including variables and classifications; the underlying statistical methods and surveys; and key aspects of the data quality. Direct access to questionnaires is also provided.
  • Statistics Canada Power from the Data! Glossary: The definitions provided here are, in some cases, oversimplifications of highly complex concepts. They provide information for those who have questions about statistics but who do not need highly technical explanations.
  • StatSoft Statistics Glossary: StatSoft has freely provided the Electronic Statistics Textbook as a public service for more than 12 years. This textbook offers training in the understanding and application of statistics. The material was developed by the StatSoft R&D department based on many years of teaching undergraduate and graduate statistics courses and covers a wide variety of applications.

Citing <odesi> Data in APA

Author. (Year of publication/production). Title (Version number if relevant) [Data type]. Name of Producer if different from author [Producer]. Name of Distributor [Distributor]. Retrieved from URL

Microdata example:

Statistics Canada. (1993). Survey of Persons Not in the Labour Force, 1992 [Data file]. Data Liberation Initiative [Distributor]. Retrieved from http://search1.odesi.ca/details/view.html?q=survey+of+persons+not+in+the+labour+force&field=TI&coll=odesi&date-gt=1871&date-lt=2011&uri=/odesi/spnlf_71M0014_E_1992.xml

Aggregate data example:

Statistics Canada. (2008). Census of Population, 2006: Profile for Canada, Provinces, Territories, Census Divisions, Census Subdivisions and Dissemination Areas, Profile Series [Table]. Data Liberation Initiative [Distributor]. Retrieved from http://odesi.scholarsportal.info/documentation/CENSUS/2006/PROFILES/B2020/RAWDATA/CUMM/CAN/94-581-XCB2006002.IVT
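The citation template above can also be assembled programmatically. The following is a minimal sketch, not an official tool: the function name and parameters are hypothetical, and the URL in the usage example is a placeholder rather than a real dataset address.

```python
def format_apa_data_citation(author, year, title, data_type, distributor, url,
                             version=None, producer=None):
    """Assemble a data citation following the template given in this section."""
    title_part = title
    if version:
        # The version number is included only when relevant.
        title_part += f" ({version})"
    title_part += f" [{data_type}]."

    parts = [f"{author}.", f"({year}).", title_part]
    if producer and producer != author:
        # The producer is named only when it differs from the author.
        parts.append(f"{producer} [Producer].")
    parts.append(f"{distributor} [Distributor].")
    parts.append(f"Retrieved from {url}")
    return " ".join(parts)


citation = format_apa_data_citation(
    author="Statistics Canada",
    year=1993,
    title="Survey of Persons Not in the Labour Force, 1992",
    data_type="Data file",
    distributor="Data Liberation Initiative",
    url="https://example.org/dataset",  # placeholder URL for illustration
)
print(citation)
```

Optional elements (version and producer) are simply omitted when not supplied, which mirrors the "if relevant" and "if different from author" conditions in the template.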

Data Citation Resources

Data Literacy Resources

More Ways to Find Data

PDF Handouts

Other Scholars Portal Data Resources

Scholars GeoPortal: The Scholars GeoPortal tool provides access to geospatial datasets, including land-based vector data, census geography, and orthophotography. From the GeoPortal, you can download geospatial data such as the census boundary files for a census metropolitan area, download the associated census data from <odesi>, and pull these datasets together in a GIS tool to create a map based on your data.

Dataverse: The Scholars Portal Dataverse network is a repository for research data collected by individuals and organizations associated with subscribing Canadian universities. The Dataverse platform allows researchers to deposit data, create appropriate metadata, and version documents. Access to data and supporting documentation can be controlled down to the file level, and researchers can choose to make content available publicly, only to select individuals, or to keep it completely locked.

Get Help at Your School

Each OCUL school has a local data services librarian who can help you search for and identify data sets, work with microdata files, and get in touch with Statistics Canada liaisons as necessary.

To get help at your institution, find and email your local data librarian.

If you are having technical problems, you can email <odesi> technical support.