Datasets

This section describes the general procedure to create a new dataset. However, each participating institution is able to set up this process based on its institutional policies. Some functions outlined below may not be available to all users. Please refer to your institution’s policies and procedures, or check with your institution’s support contact(s), to determine which functions are available to you.

Creating a Dataset

  1. Once you’re logged into Borealis, navigate to either your institutional collection or your personal collection to create a new dataset.
  2. Click on the Add Data button on the right side of the collection’s main page and select New Dataset.
    1. If you cannot see the Add Data button, verify you are logged in and have the appropriate permissions. You may need to contact your collection administrator.
      An example of a collection page, with the Add Data button clicked.
  3. Enter information for all the required fields (those with a red asterisk) on the new dataset page.
    1. The new dataset page has three main sections: the Dataset Template, Citation Metadata, and File Uploads. Note that additional metadata fields can be completed once the dataset has been saved (see Editing or Adding Metadata).
      The page and fields that appear when you select to add a new dataset.
  4. Select the appropriate template from the drop-down list under Dataset Template or select “Custom Licence.”
    1. Borealis provides templates that pre-fill information related to the licensing of the dataset that will appear in the Terms of Use field.
    2. Keep in mind that this licence will apply to all files in the dataset. If you need to assign different licences to different files, either create a separate dataset for each type of licence, or select Custom Licence so you can provide details on the licences associated with each file.
    3. Ensure you understand what each licence allows users to do before you select an option for your dataset.
    4. The default template is normally CC0 Public Domain (CC0 1.0).
      The drop-down menu that appears for the Dataset Template field when creating a new dataset.
  5. Enter the name of your dataset in the Title field.
    1. This field is pre-populated based on the template selected. Replace the title with your own dataset title.
  6. The Name and Affiliation fields in the Author section are pre-filled based on your account information.
    1. If the dataset should be attributed to an author other than yourself, replace your name and affiliation with the appropriate information.
  7. Select an Identifier Scheme from the drop-down list for the author, if you’d like, then enter the proper information for that scheme in the Identifier field.
  8. If you have additional authors you’d like to add to the dataset, click the plus (+) button to the right of the Author section.
    1. A second set of the four Author fields — Name, Affiliation, Identifier Scheme, and Identifier — will be added below the first author’s information.
    2. Add as many authors as you’d like by continuing to click the plus (+) button.
    3. Remove an author by clicking the minus (-) button beside that author’s entry.
    4. When entering details into the citation metadata block, you can indicate the identifier type (e.g., ORCID) and then enter the identifier. Do not include the URL in this field. Once the dataset is saved, the identifier link will be clickable.
  9. The Contact section is also pre-filled based on your account information, but can be changed if you are not the dataset’s main contact.
    1. The person or persons included in the Contact section will receive any messages sent by users about the dataset.
    2. Contacts and authors do not have to be the same individuals.
    3. Like the Author section, you can add as many contacts for the dataset as you’d like by clicking the plus (+) button beside that section.
  10. Enter information about your dataset in the Text field in the Description section.
    1. If needed, you can also add a date in the Date field related to the description you’ve entered.
    2. Click the plus (+) button beside the Description section if you need to add more than one description for your dataset.
      An example of how more than one Description can be added to a dataset, and the Date field available for each Description.
  11. Select one or more disciplines associated with your dataset from the Subject drop-down list by clicking on the checkboxes.
    1. If the appropriate discipline is not listed, select the Other option at the bottom of the list.
  12. Enter a word or phrase associated with your dataset in the Term field in the Keyword section.
    1. If the term you enter is associated with a specific controlled vocabulary, you can enter the vocabulary’s name in the Vocabulary field and add a website for it to the Vocabulary URL field.
    2. If you’d like to include more than one term, click the plus (+) button beside the Keyword section to add fields. Only add one keyword in a single Term field.
      An example of multiple Keywords being added to a new dataset.
  13. If your dataset is associated with one or more publications, you can enter citations for those publications in the Related Publications section.
    1. You can also add an identifier for the publication, if needed, by selecting the identifier type from the ID Type drop-down list. Then enter the ID Number in the next field.
    2. If the publication you’ve included has a webpage, you can include it in the URL field.
    3. Click the plus (+) button to the right of the Related Publication section to add more than one publication, if required.
  14. Enter any additional information about the dataset in the Notes field, if required.
  15. Both the Depositor and Deposit Date fields are pre-filled with your account information and the current date, but can be edited.
  16. The Files tab allows you to upload one or more files into your new dataset immediately, but this is not required. Files can be added to a dataset at any point after it’s been created. Refer to the Adding Files to a Dataset section for more information.
  17. Click the Save Dataset button at the bottom of the page to create your new dataset. Your DOI is now reserved, but will not resolve until you publish your dataset.
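
For depositors who prefer scripting, the required fields from the steps above can also be expressed as a JSON payload. Borealis runs on the Dataverse software, whose native API accepts new datasets at /api/dataverses/{alias}/datasets with an X-Dataverse-key header. The following is only a minimal sketch: the helper names, the sample values, and the example server address are hypothetical, and whether API deposit is available to you depends on your institution's policies.

```python
import json
import urllib.request


def field(type_name, value, type_class="primitive", multiple=False):
    """Build one metadata field entry in the Dataverse JSON layout."""
    return {"typeName": type_name, "typeClass": type_class,
            "multiple": multiple, "value": value}


def build_dataset_payload(title, author_name, affiliation,
                          contact_email, description, subjects):
    """Assemble the citation fields marked as required on the new dataset page."""
    fields = [
        field("title", title),
        field("author", [{
            "authorName": field("authorName", author_name),
            "authorAffiliation": field("authorAffiliation", affiliation),
        }], "compound", True),
        field("datasetContact", [{
            "datasetContactEmail": field("datasetContactEmail", contact_email),
        }], "compound", True),
        field("dsDescription", [{
            "dsDescriptionValue": field("dsDescriptionValue", description),
        }], "compound", True),
        field("subject", list(subjects), "controlledVocabulary", True),
    ]
    return {"datasetVersion": {"metadataBlocks": {"citation": {"fields": fields}}}}


def build_create_request(base_url, collection_alias, api_token, payload):
    """Prepare (but do not send) the POST request that creates a draft dataset."""
    url = f"{base_url}/api/dataverses/{collection_alias}/datasets"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"X-Dataverse-key": api_token,
                 "Content-Type": "application/json"},
    )


# Hypothetical sample values; replace them with your own information.
payload = build_dataset_payload(
    "Sample title", "Doe, Jane", "Example University",
    "jane@example.org", "A short description.", ["Other"])
request = build_create_request(
    "https://demo.example.org", "mycollection", "YOUR-API-TOKEN", payload)
# urllib.request.urlopen(request) would submit the dataset as a draft.
```

As with the Save Dataset button, a dataset created this way remains a draft until it is published.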

Editing a Dataset

Once a dataset is created, you can edit several pieces of information about that dataset. These edits can be done to either an Unpublished (Draft) or Published dataset.

If you make these changes to a Published dataset, you will need to re-publish that dataset in order for the changes to take effect.

Note: The following screenshots are from a published dataset. Draft datasets may look slightly different, but the functionality is the same.

Editing or Adding Metadata
  1. To edit or add additional metadata for a dataset, click on the Edit Dataset drop-down menu and select Metadata.
    The Metadata option in the drop-down menu for the Edit Dataset button.
  2. On the Edit Dataset Metadata page, add or change the information in the citation metadata block and add additional metadata to any of the other disciplinary metadata blocks.
    The Edit Dataset Metadata page, where all metadata fields are available for editing.
  3. Click the Save Changes button at the top or bottom of the screen when you’re done making changes.

The metadata fields displayed will be based on the dataset template you used when you created the dataset.

Geospatial Metadata Block

The Geospatial Metadata Block allows users to enter information about the Geographic Coverage of the data. 

Once the dataset is saved, go to the Metadata tab and select Add + Edit Metadata. Navigate to the Geospatial Metadata block to review the optional fields and enter information. 

The Geographic Bounding Box fields are indexed by the system and can be searched using the Geospatial Search API. Important: Invalid metadata entered into the fields will cause issues with saving and publishing the dataset. Do not include degree signs (°), commas, or other special characters. 

Please note that metadata validation and corrected tooltips in the user interface will be available soon.

Details about the metadata fields and the expected coordinates for the Geographic Bounding Box, along with examples, are provided in the table below.

Metadata field | Description in tooltip (corrected) | Valid range | Example
Westernmost (Left) Longitude | Westernmost coordinate delimiting the geographic extent of the dataset. A valid range of values, expressed in decimal degrees, is -180.0 <= West Bounding Longitude Value <= 180.0 | -180.0 to 180.0 | -79.639283
Easternmost (Right) Longitude | Easternmost coordinate delimiting the geographic extent of the dataset. A valid range of values, expressed in decimal degrees, is -180.0 <= East Bounding Longitude Value <= 180.0 | -180.0 to 180.0 | -79.113219
Northernmost (Top) Latitude | Northernmost coordinate delimiting the geographic extent of the dataset. A valid range of values, expressed in decimal degrees, is -90.0 <= North Bounding Latitude Value <= 90.0 | -90.0 to 90.0 | 43.855442
Southernmost (Bottom) Latitude | Southernmost coordinate delimiting the geographic extent of the dataset. A valid range of values, expressed in decimal degrees, is -90.0 <= South Bounding Latitude Value <= 90.0 | -90.0 to 90.0 | 43.579608

If you are having difficulty with the geospatial metadata fields, please contact your institutional collection administrator or the Borealis team.  You may also want to review bounding box tools (e.g., Klokantech or ArcGIS).
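
Because invalid values can prevent a dataset from being saved or published, it may help to check your coordinates before entering them. The sketch below applies the valid ranges listed for the Geographic Bounding Box fields and rejects degree signs, commas, and other special characters. The function and dictionary names are hypothetical, and the final check that the southernmost latitude does not exceed the northernmost one is an added sanity check, not a documented Borealis rule.

```python
import re

# Plain decimal degrees only: optional minus sign, digits, optional fraction.
DECIMAL = re.compile(r"-?\d+(\.\d+)?")

# Valid ranges for each Geographic Bounding Box field.
LIMITS = {
    "west": (-180.0, 180.0),
    "east": (-180.0, 180.0),
    "north": (-90.0, 90.0),
    "south": (-90.0, 90.0),
}


def check_bounding_box(box):
    """Return a list of problems; an empty list means the values look valid."""
    problems = []
    for name, (low, high) in LIMITS.items():
        text = box.get(name, "").strip()
        if not DECIMAL.fullmatch(text):
            problems.append(f"{name}: '{text}' is not a plain decimal number")
            continue
        if not low <= float(text) <= high:
            problems.append(f"{name}: {text} is outside {low} to {high}")
    # Assumption: a box whose south edge lies above its north edge is a typo.
    if not problems and float(box["south"]) > float(box["north"]):
        problems.append("south latitude is greater than north latitude")
    return problems


# Example values matching the Example entries for the four fields.
good = {"west": "-79.639283", "east": "-79.113219",
        "north": "43.855442", "south": "43.579608"}
bad = dict(good, west="79.6°W", north="91")

print(check_bounding_box(good))
print(check_bounding_box(bad))
```

The first call reports no problems; the second flags the west value (special characters) and the north value (out of range).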

Computational Workflow Metadata Block

A computational workflow describes a process that coordinates multiple computational tasks and their data dependencies to produce the finalized dataset, for example: running code, using command-line tools, accessing a database, submitting a job to a cloud compute resource, and executing data processing scripts (see also Advanced Guide). The Computational Workflow Metadata block links to an external code repository where the related code and workflow steps are stored.

Once the dataset is saved, go to the Metadata tab and select Add + Edit Metadata.

Navigate to the Computational Workflow Metadata block and add details to link to external code repositories that contain the code and related details about computational workflow steps. The fields can contain details about:

  • the type of computational workflow framework
  • the external code repository URL where the related code is located
  • the URL to documentation or text describing the Computational Workflow and its use.

Three fields are nested under the Computational Workflow Metadata tab: Workflow type, External Code/Repository URL, and Documentation.
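
If you manage metadata through the Dataverse API rather than the web form, the three fields can be represented as a metadata block in the dataset JSON. This is a hedged sketch only: the block name (computationalworkflow) and the field names (workflowType, workflowCodeRepository, workflowDocumentation) are assumptions based on the standard Dataverse metadata block and should be confirmed against your installation. The function name and the sample values are hypothetical.

```python
import json


def build_workflow_block(workflow_type, repository_url, documentation):
    """Represent the three Computational Workflow fields as a metadata block."""

    def entry(type_name, value, type_class):
        # Each field is treated as repeatable, so its value is a list.
        return {"typeName": type_name, "typeClass": type_class,
                "multiple": True, "value": [value]}

    return {
        # Assumed block and field names; verify them before relying on this.
        "computationalworkflow": {
            "fields": [
                entry("workflowType", workflow_type, "controlledVocabulary"),
                entry("workflowCodeRepository", repository_url, "primitive"),
                entry("workflowDocumentation", documentation, "primitive"),
            ]
        }
    }


# Placeholder values for illustration only.
block = build_workflow_block(
    "Snakemake",
    "https://example.org/my-lab/analysis-code",
    "https://example.org/my-lab/analysis-code/README",
)
print(json.dumps(block, indent=2))
```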

Editing Dataset Terms

Selecting a Creative Commons Licence or entering Custom Dataset Terms will prompt a user to accept those terms before they can preview or download the files in that dataset. If a user does not accept the terms, they will not be able to preview or download the file(s).

To edit the Dataset Terms:

  1. Click on the Edit Dataset drop-down menu and select Terms.
    The Terms option in the drop-down menu for the Edit Dataset button.
  2. The Edit Dataset Terms page will display the Dataset Terms, Restricted Files + Terms of Access, and Guestbook information. Some metadata may have been pre-populated based on the dataset template when the dataset was created.
    The Edit Dataset Terms page with Terms of Use, Restricted Files + Terms of Access, and Guestbook sections.
  3. Make edits on this page, then click the Save Changes button to update your dataset’s terms.
    1. Edit the Dataset Terms section.
    2. Edit the Restricted Files + Terms of Access section.
    3. Edit the Guestbook section.

Dataset Terms

Under Dataset Terms, the drop-down menu allows users to apply one of several standardized Creative Commons Licenses or enter Custom Dataset Terms. Note: the Borealis dataset license templates are available on the dataset creation page for selection and use, and will pre-populate the license selected. We recommend working with your institutional Dataverse collection contact or collection administrator to determine the appropriate license or custom terms of use.

Standardized Creative Commons licences provide structured metadata using standardized language, logos, and external links to the Creative Commons License information within the License/Data Use Agreement field. The text is also translated to be used in both English and French.

To select a Creative Commons Licence:

  1. Navigate to the Dataset page or the Dataset Terms tab.
  2. Use the drop-down menu to select one of several standardized Creative Commons Licences.
    A drop-down list of the available Creative Commons licences that can be selected from the Dataset Terms tab.
  3. Select “Save Changes” at the bottom of the page.

Users can also choose to use Custom Dataset Terms or a Data Use Agreement if the supplied standardized Creative Commons licences do not meet their needs. To enter Custom Dataset Terms:

  1. Use the drop-down menu to select “Custom Dataset Terms”
  2. Enter the information about how the dataset could be re-used once downloaded in the “Terms of Use” field (this field is required).
  3. Other Dataset Terms fields (e.g., “Confidentiality Declaration,” “Special Permissions,” “Restrictions”) are optional.
  4. Select “Save Changes” at the bottom of the page.

If Custom Dataset Terms is selected, a non-blank entry in the Terms of Use field is required.

  1. If the Terms of Use field is left blank, a validation error message will appear when you save the dataset, indicating that the required field is missing.
  2. Entries in other Terms fields are optional.

Custom Terms of Use cannot be used in conjunction with a standard licence.

  • When a standard licence is selected, the Terms fields are no longer shown, since these fields are intended for terms/conditions that would potentially conflict with or modify the terms in a standard licence.

The Terms of Access field is unavailable when a standardized Creative Commons Licence is selected in the Dataset Terms tab.
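
The rules above can be summarized as a small check. This is an illustrative sketch of the logic described in this section, not the validation code Borealis itself runs. The class, field, and function names are hypothetical, and only one of the optional custom fields (Restrictions) is modelled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatasetTerms:
    # Name of a standard licence, or None when Custom Dataset Terms is selected.
    licence: Optional[str] = None
    terms_of_use: str = ""   # required for Custom Dataset Terms
    restrictions: str = ""   # one of the optional custom terms fields


def validate_terms(terms: DatasetTerms) -> List[str]:
    """Return validation errors; an empty list means the terms are consistent."""
    errors = []
    has_custom_text = bool(terms.terms_of_use.strip() or terms.restrictions.strip())
    if terms.licence is None:
        # Custom Dataset Terms: a non-blank Terms of Use entry is required.
        if not terms.terms_of_use.strip():
            errors.append("Terms of Use is required for Custom Dataset Terms")
    elif has_custom_text:
        # A standard licence cannot be combined with custom terms.
        errors.append("Custom terms cannot be combined with a standard licence")
    return errors


print(validate_terms(DatasetTerms(licence="CC BY 4.0")))
print(validate_terms(DatasetTerms(terms_of_use="   ")))
print(validate_terms(DatasetTerms(terms_of_use="Non-commercial use only.")))
```

The first and third calls report no errors; the second reports the missing Terms of Use entry, because a value containing only spaces counts as blank.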

Publishing a dataset with Creative Commons License or Custom Terms

A pop-up window provides information about the licence selection (or custom terms), which users must confirm before publishing (or submitting for review), for improved transparency. This pop-up window will appear every time a user clicks “Publish” and/or “Submit for Review.”

A pop-up appears confirming the Licence/Data Use Agreement and the fields defined for the Dataset Terms.

Downloading files from a dataset with Creative Commons License or Custom Terms

For datasets published with Creative Commons Licenses (except for CC0) and custom terms of use, users will be prompted with a pop-up window to confirm the license or custom terms of use before previewing or downloading the files.

Pop-up window asking users to accept the Creative Commons terms before downloading a file from a dataset.

Restricted Files + Terms of Access Section

The second section on the Edit Dataset Terms page is where you can document information about accessing restricted files within your dataset, if applicable. The fields in this section are typically used if a user must request access before they can view or download files, or if the use of the files is restricted to certain individuals (e.g., students and staff at a specific institution).

You can make a file restricted during or after it’s uploaded by editing the metadata associated with the file. You can restrict just one file in your dataset, some of the files, or all of the files. The information in this section will only apply to the files you’ve restricted within this dataset. If you have different terms of access for different restricted files in your dataset, make sure to state that explicitly in these fields.

Enabling Guestbooks

The third section on the Edit Dataset Terms page allows you to select a guestbook for your dataset. If you opt to include a guestbook, it will be applied to all files in your dataset. Whatever information you request as part of your guestbook will be required from every user every time they preview or download a file from your dataset (even if they’ve completed the guestbook previously).

If you want to enable one of the listed guestbooks for your dataset, click the radio button beside its name. Use the Preview Guestbook button to see which items are required from users when they want to view or download a file.

A Preview Guestbook page listing the information a user is required to provide to view or download a file from your dataset.

The Edit Dataset Terms page will only allow you to select or deselect a previously created guestbook. Guestbooks can only be created at the collection level. If you do not have access to add a guestbook at the collection level, contact your institutional support contact(s) for assistance.

If the only change you’ve made between versions of your dataset is to the guestbook, those changes will not appear in the details when comparing multiple versions.

Adding Thumbnails and Widgets

To edit thumbnails or widgets for a dataset, click on the Edit Dataset drop-down menu and select Thumbnails + Widgets. These are different from the themes and widgets you can include at the collection level.

The drop-down menu that appears when you click the Edit Dataset button.

Under the Thumbnail tab, you can select an image (JPG, TIF, or PNG) to be displayed as the main image or logo for your dataset. You can select from the images you’ve added to your dataset, or you can upload a new image. The image you select will be displayed at the top of your dataset page, as well as beside the name of your dataset in the list of datasets within the collection.

The Widgets tab contains two HTML scripts that can be added to your website in order to display the proper citation for your dataset, or to provide easy access to your dataset from a website. Copy and paste the scripts provided to the HTML code on your website in order to display the dataset citation.

Creating a Private URL for a Dataset

Unpublished datasets are only viewable by users who have certain roles assigned to them for that dataset. Other users cannot see draft datasets. However, there may be a situation where you need to share your unpublished dataset with someone without assigning a specific role to their user account (e.g., anonymous reviewers). Instead of publishing your dataset, dataset admins and curators can create a Private URL. That private URL can then be shared with others and will give them the ability to view your unpublished dataset without needing to log in. Once a dataset is published, the private URL will be deactivated.

There are two types of private URLs: the general Private URL, and the URL for Anonymized Access. The general Private URL allows users to see all details about your dataset. The URL for Anonymized Access will have some metadata fields hidden from the viewer. When creating an Anonymized URL, make sure the collection your dataset is in does not have any identifying information in its name, as the breadcrumbs will reveal the collection name. Reach out to your institutional support if this is a concern for your dataset.

To create a private URL:

  1. Open the dataset for which you want to create a Private URL.
  2. Click the Edit Dataset button in the top right corner and select Private URL from the drop-down menu.
    The drop-down menu that appears when the Edit Dataset button is clicked.
  3. Depending on how you want to use this URL, click either the Create Private URL button or the Create URL for Anonymized Access button.
    The Unpublished Dataset Private URL window where private URLs can be created.
  4. Once the private URL has been generated, it will be displayed in the window with a Success! message. You can copy this URL to share your dataset.
    An Unpublished Dataset Private URL window that appears after the private URL is created.
  5. Click the Close link to close the window.
  6. Send the private URL to anyone who needs access to the unpublished dataset.
  7. Once you no longer need to share a private URL, go back to the Unpublished Dataset Private URL window and click the Disable Private URL button.
  8. Confirm that you want to disable the URL by clicking the Yes, Disable Private URL button.
  9. Alternatively, once you publish the dataset, the private URL will be disabled.
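
The same create, view, and disable actions are also exposed by the Dataverse native API at /api/datasets/{id}/privateUrl, with anonymizedAccess=true requesting the anonymized variant. The sketch below only prepares the requests and does not send them. The function name, the example server address, and the dataset ID are hypothetical, and API access may be limited by your institution.

```python
import urllib.request

# HTTP method used for each private URL action.
METHODS = {"create": "POST", "get": "GET", "disable": "DELETE"}


def private_url_request(base_url, dataset_id, api_token,
                        action="create", anonymized=False):
    """Prepare (but do not send) a private URL request for a draft dataset."""
    url = f"{base_url}/api/datasets/{dataset_id}/privateUrl"
    if action == "create" and anonymized:
        # Request the URL for Anonymized Access instead of the general one.
        url += "?anonymizedAccess=true"
    return urllib.request.Request(
        url, method=METHODS[action],
        headers={"X-Dataverse-key": api_token})


# Hypothetical server, dataset ID, and token.
create = private_url_request("https://demo.example.org", 42, "YOUR-API-TOKEN",
                             anonymized=True)
disable = private_url_request("https://demo.example.org", 42, "YOUR-API-TOKEN",
                              action="disable")
print(create.get_method(), create.full_url)
print(disable.get_method(), disable.full_url)
# urllib.request.urlopen(create) would send the request to the server.
```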