Overview
The Project Guest Collections Proof-of-Concept (POC) provides Globus-accessible Collections that let users transfer data between endpoints using either the Globus Transfer web interface or HTTPS. Access to Collections is managed by the PI, who can delegate it to a group of Access Managers. Collections can be configured for collaboration among UC San Diego affiliates, individuals at external institutions, and the public.
Project Guest Collections can be integrated into streamlined workflows, either stand-alone or within portals. Reporting capabilities can track weekly usage, and quotas may be enforced to avoid overuse of disk space. For projects that outgrow the limits of the Project Guest Collections POC, Research IT can help the project migrate to an external storage resource offering a larger allocation (for example, AWS S3, Google Drive, SDSC's Universal Scale Storage (USS)) while maintaining access via Globus.
These collections are intended to promote data access by lab members, collaborators, or the public. (They are not intended primarily as backups or secondary copies of data.) By default, 1-year allocations are initially set to 500GB (somewhat expandable upon request) and are reviewed annually for renewal. Priority for renewal will be given to active projects, i.e., collections where data is frequently accessed.
Requesting a Project Guest Collection
PIs may request a Project Guest Collection by emailing a short project description and intended usage to research-it@ucsd.edu. You may also include a short name for the collection and the UCSD email addresses of any additional Access Managers. If you think you will need more than the initial 500GB allocation, please include a brief description of the data planned for the collection.
Roles & Managing Access
Each Project Guest Collection has a set of Access Managers who control the permissions of folders within the collection. PIs are Access Managers by default, and additional designees can be added upon their request. To request a delegate Access Manager be added to your allocation, contact research-it@ucsd.edu.
Sharing Data with Collaborators
If your data is within the Project Guest Collection system, you can easily share it with collaborators who are at UC San Diego or elsewhere. You have full control over which files your collaborators can access, and whether they have read-only or read-write permissions. You can share with their institutional identity (someone@example.edu) or email. The collaborator can use the Globus web interface to download the data, or use Globus Transfer to move the data to their machine.
To share data with collaborators who have either a Globus account or a UC San Diego account, log in to Globus.org, click 'Endpoints', open the 'Shareable By You' tab, select your Guest Collection, and go to the 'Permissions' tab. Then click 'Add Permissions - Share With'.
Directories (not individual files) can be shared with other Globus users or Globus Groups (for more information on Groups, see the section below). Collaborators can be given read or read+write permissions. Note that users with write access can delete files within a directory tree - be careful when granting write access. (Note: if other collaborators have shared Collections with you, those are listed under the 'Shared With You' tab in Endpoints.)
PIs can choose among four ways to share their data: with a specific individual ('user'), with a Globus 'group', with 'all users' (individuals logged in to Globus), or with the 'public (anonymous)'. Anonymous read access allows anyone who can reach the data to read or download it without authenticating.
Returning to the Permissions tab in the File Manager, you should see the folder and the people you have shared it with. You can terminate access to the directory by clicking the trash can next to the user on this page.
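For projects that want to script sharing rather than use the web interface, the four sharing targets above can be expressed as access-rule records. The following is a minimal sketch, not a description of how Research IT configures collections. It assumes the Globus Python SDK's TransferClient.add_endpoint_acl_rule method and the principal_type values shown, which should be checked against the current Globus documentation. The folder name, identity ID, and helper names are hypothetical.

```python
from typing import Any, Dict

# Assumed mapping of the four sharing targets to access-rule principal types:
# a specific user, a Globus group, all logged-in users, and the public.
PRINCIPAL_TYPES = ("identity", "group", "all_authenticated_users", "anonymous")


def make_access_rule(principal_type: str, path: str,
                     permissions: str = "r", principal: str = "") -> Dict[str, Any]:
    """Build one access rule (a plain dict) for a shared directory."""
    if principal_type not in PRINCIPAL_TYPES:
        raise ValueError(f"unknown principal_type: {principal_type!r}")
    if permissions not in ("r", "rw"):
        raise ValueError("permissions must be 'r' (read) or 'rw' (read+write)")
    # Only directories can be shared, so require a leading and trailing slash.
    if not (path.startswith("/") and path.endswith("/")):
        raise ValueError("path must be a directory that starts and ends with '/'")
    if principal_type in ("identity", "group"):
        if not principal:
            raise ValueError("an identity or group ID is required")
    else:
        # 'All users' and 'public' rules do not name a specific principal.
        principal = ""
    # The text above describes public sharing as anonymous *read* access.
    if principal_type == "anonymous" and permissions != "r":
        raise ValueError("anonymous access is read-only in this sketch")
    return {
        "DATA_TYPE": "access",
        "principal_type": principal_type,
        "principal": principal,
        "path": path,
        "permissions": permissions,
    }


def share(transfer_client: Any, collection_id: str, rule: Dict[str, Any]) -> Any:
    """Submit a rule through an authenticated client (assumed SDK method)."""
    return transfer_client.add_endpoint_acl_rule(collection_id, rule)


# Hypothetical example: give one collaborator read+write on a folder.
rule = make_access_rule("identity", "/Collab_Folder1/", "rw",
                        principal="HYPOTHETICAL-IDENTITY-UUID")
print(rule["path"], rule["permissions"])
```

Building the rule separately from submitting it makes it easy to review what will be shared before any permission changes.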
Additional information on Project Guest Collections POC
- Globus allows folders to be shared as either read or read/write. Any child folders/subdirectories contained within a parent folder carry the same permissions as the parent. Permissions cannot be set on individual files; they are inherited from the folder.
- Since Globus supports setting permissions at the folder level, multiple Guest Collections are unnecessary for a project. Create appropriate sub-directories and share with collaborators by assigning desired permissions.
- Ownership of a Guest Collection cannot be transferred. Collections are therefore created for the project owner at the highest level, the PI. If the project transitions to a new PI, a new Guest Collection will have to be created and permissions re-assigned to collaborating individuals.
- A Guest Collection is active initially for a 1-year period and will be reviewed for renewal annually.
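The folder-level inheritance described above can be illustrated with a small model. This is a simplified sketch for planning a directory layout, not Globus code. It assumes that when several folder rules apply to the same path, the most permissive one wins. The function name and folder names are hypothetical.

```python
from typing import Dict

# Rank permissions so that rules can be compared: none < read < read/write.
_RANK = {"": 0, "r": 1, "rw": 2}


def effective_permission(rules: Dict[str, str], path: str) -> str:
    """Return '', 'r', or 'rw' for `path`, given one collaborator's folder rules.

    `rules` maps a shared folder (e.g. '/shared/') to 'r' or 'rw'.
    """
    best = ""
    for folder, permission in rules.items():
        prefix = folder if folder.endswith("/") else folder + "/"
        # A rule applies to the folder itself and to everything beneath it.
        if path.startswith(prefix) or path + "/" == prefix:
            if _RANK[permission] > _RANK[best]:
                best = permission
    return best


# Hypothetical layout: read access to a folder, write access to one subfolder.
rules = {"/shared/": "r", "/shared/uploads/": "rw"}
for p in ("/shared/report.pdf", "/shared/uploads/new.csv", "/private/notes.txt"):
    print(p, repr(effective_permission(rules, p)))
```

A layout like this lets a single Guest Collection serve several audiences by sharing different sub-directories with different permissions.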
Creating a Globus Group
When working with large groups of collaborators or varying levels of permissions, the Groups feature in Globus makes access easier to manage.
- Log in to the Globus web interface
- Go to Groups on the left panel
- Click on ‘Create a new group’ at the top
- Give the group a descriptive name and add a description for more information
- Make sure you select the ‘group members only’ radio button
- Click on ‘Create Group’
Using Groups to Manage Permissions
Learn how to manage permissions for your new group(s) by following the suggested best practices on this Globus How To page (begin on step 5).
How to Use HTTPS (web upload and download) Access to Files
Project Guest Collections support HTTPS access for file upload and download per the following instructions. HTTPS access can be automated, but may require the use of OAuth tokens. For help, contact research-it@ucsd.edu.
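As a rough illustration of automated access, the sketch below builds an HTTPS request that carries an OAuth access token. It assumes the collection accepts the token in a standard "Authorization: Bearer" header. How the token is obtained (the Globus Auth login flow and the scope required for the collection) is outside this sketch, and the URL and token shown are placeholders.

```python
import urllib.request


def authenticated_request(url: str, access_token: str) -> urllib.request.Request:
    """Build a GET request that presents an OAuth access token as a Bearer credential."""
    request = urllib.request.Request(url, method="GET")
    # Assumption: the token is passed in the standard Authorization header.
    request.add_header("Authorization", f"Bearer {access_token}")
    return request


# Placeholder values for illustration only.
url = "https://g-b0978f.0ed28.75bc.data.globus.org/Demo_Testing/Collab_Folder1/TestDocHTTPS.docx"
req = authenticated_request(url, "PLACEHOLDER_TOKEN")
print(req.get_method(), req.get_full_url())

# To send the request once a real token is available:
#     with urllib.request.urlopen(req) as response:
#         data = response.read()
```

The request is only constructed here, not sent, so the example can be read and adapted without valid credentials.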
Using the Globus Web Interface
Within the web app, go to File Manager and select a file by checking the box to the left of the filename. Note: if you do not select a specific file in this window, the link will point to the parent folder. Any permissions that have been applied to this folder continue to apply. In other words, anyone who tries to download a file will still need to authenticate to Globus, unless the file is publicly accessible (see the next section).
A new window containing a URL will be presented. From this link the file can be directly downloaded.
Links have a consistent syntax, and can be embedded in other pages, such as data portals. Links use this format: https://<collectionid>.data.globus.org/<path>
The <collectionid> (e.g., g-b0978f.0ed28.75bc) is one or more strings separated by periods (subdomains) used to represent the host (server) and the specific guest collection. The <path> is the relative path to the file under the collection, e.g., Demo_Testing/Collab_Folder1/TestDocHTTPS.docx
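Because links follow a consistent syntax, a portal or script can generate them rather than copying each one from the web app. The following is a minimal sketch of that idea. The function name is hypothetical, and percent-encoding of special characters in the path is an assumption about what a valid link requires.

```python
from urllib.parse import quote


def build_https_url(collection_id: str, path: str) -> str:
    """Return https://<collectionid>.data.globus.org/<path> for a file or folder."""
    # Drop any leading slash so the path is relative to the collection root.
    relative = path.lstrip("/")
    # Percent-encode characters such as spaces, keeping '/' as the separator.
    return f"https://{collection_id}.data.globus.org/{quote(relative, safe='/')}"


print(build_https_url("g-b0978f.0ed28.75bc",
                      "Demo_Testing/Collab_Folder1/TestDocHTTPS.docx"))
```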
Here is a link to a test file with restricted access. You will be prompted to authenticate, but will not be able to download the file.
Public (Anonymous) Access
Files can also be made completely public, not requiring a login at all. For this example, we have a folder public/ in a demo Guest Collection. The folder name makes it clear that this data is publicly accessible. Access managers should consider limiting who can write data to public folders to reduce the chance that private data is accidentally made accessible.
Here is a link to a text file in the public folder: https://g-b0978f.0ed28.75bc.data.globus.org/public/README.txt. You should be able to download this without authenticating to Globus.
Public files can also be downloaded via the command line:
$ wget https://g-b0978f.0ed28.75bc.data.globus.org/public/README.txt
...
$ cat README.txt
Hello, Public!
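The same public download can be scripted. The sketch below is one possible Python equivalent of the wget command above, using only the standard library. The function name is hypothetical, and it assumes the file is publicly readable so that no token is needed.

```python
import shutil
import urllib.request
from pathlib import Path


def download(url: str, destination: str) -> Path:
    """Stream the resource at `url` into `destination` and return the saved path."""
    target = Path(destination)
    # Copy in chunks so large files are not held in memory all at once.
    with urllib.request.urlopen(url) as response, target.open("wb") as out:
        shutil.copyfileobj(response, out)
    return target


# Example call (requires network access):
#     saved = download(
#         "https://g-b0978f.0ed28.75bc.data.globus.org/public/README.txt",
#         "README.txt",
#     )
#     print(saved.read_text())
```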
Requesting More Space
If you reach the limits of your current allocation, please contact us (research-it@ucsd.edu). If possible, we'll accommodate the request within the POC. If your needs grow beyond what the POC can provide, we can assist you with acquiring more storage and migrating your guest collection.
Questions?
For more information or help with your Project Guest Collection (like adding Access Managers or managing permissions), please contact research-it@ucsd.edu.