NRP commandline tools

ℹ️

NRP commandline tools can currently be used only with NRP repositories based on the Invenio technology stack. They cannot be used to access repositories based on other technology stacks, such as DSpace, ARL or Islandora.

Prerequisites

Python 3.12 or higher
curl

Example

# add a repository, will ask for token
nrp-cmd add repository https://data.narodni-repozitar.cz/ catchall --default
 
# create a new record and upload a file to it; store the record URL in the variable @rec
nrp-cmd create record --community brno \
        '{"title": "My Record"}' \
        /tmp/myfile.txt '{"title":"My file"}' @rec
 
# or with metadata in file
nrp-cmd create record --community brno \
        metadata.json \
        /tmp/myfile.txt /tmp/myfile-metadata.json @rec
 
# publish the record
nrp-cmd publish record @rec

Bulk changes to records:

# get all records with publisher being 'myorg'
nrp-cmd scan records "metadata.publisher:myorg" @ids
 
# download them together with files to the current directory as a backup
nrp-cmd download record @ids
 
# After carefully inspecting them, 
# change the publisher to 'myorg2' inside all the records
nrp-cmd update record @ids --path publisher myorg2

Installation

Binary distribution for Linux/64bit

⚠️

Not yet released, use the pip version instead.

Download the latest release from GitHub and place it somewhere on your PATH. Then run nrp-cmd.

Using pip and virtualenv

Run the following commands to install the tool:

python3 -m venv nrp-cmd
source nrp-cmd/bin/activate
pip install nrp-invenio-client

For better comfort, create an alias in your .bashrc/.zshrc file:

alias nrp-cmd="<path-to-nrp-invenio-client>/bin/nrp-cmd"

Create a new terminal and try running nrp-cmd.

Note: The nrp-cmd command stores its configuration inside the ~/.nrp/invenio-config.json file. The file contains sensitive information, such as tokens, so make sure it is not accessible by other users.
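
The note above can be acted on with a few lines of Python. This is a minimal sketch, assuming a POSIX system; the helper names restrict_to_owner and is_private are hypothetical and are not part of nrp-cmd.

```python
import os
import stat
from pathlib import Path


def restrict_to_owner(path: Path) -> None:
    """Make the file readable and writable by its owner only (mode 0600)."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)


def is_private(path: Path) -> bool:
    """Return True if neither the group nor other users have any permission bits set."""
    mode = stat.S_IMODE(path.stat().st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0


if __name__ == "__main__":
    # Location taken from the note above.
    config = Path.home() / ".nrp" / "invenio-config.json"
    if config.exists() and not is_private(config):
        restrict_to_owner(config)
```

The same effect can be achieved from the shell with chmod 600 on the configuration file.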

Getting help

Call the nrp-cmd --help command to get a list of available commands, or nrp-cmd <command> --help to get help for a specific command.

Managing repositories

Adding a repository

The tool can work with multiple repositories. Each repository is identified by an alias and one of the repositories is set as a default.

To add a repository, invoke:

nrp-cmd add repository <servername> [<alias>] [--token token] [--anonymous] [--default] [--no-tls-verify]

where servername is either the full URL of the server or just the host name (e.g. myserver.cesnet.cz), and alias is an optional short name for the repository. If you do not provide an alias, the first part of the servername (up to the first dot) will be used.
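
The default-alias rule can be illustrated with a short Python sketch. The function default_alias is hypothetical, and the handling of full URLs (stripping the scheme before taking the first label of the host name) is an assumption; the tool itself may behave differently.

```python
from urllib.parse import urlparse


def default_alias(servername: str) -> str:
    """Return the part of the host name up to the first dot."""
    # Assumption: for a full URL, only the host name is considered.
    if "://" in servername:
        host = urlparse(servername).hostname or servername
    else:
        host = servername
    return host.split(".", 1)[0]


print(default_alias("myserver.cesnet.cz"))
print(default_alias("https://data.narodni-repozitar.cz/"))
```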

If you do not specify the token, the command will try to open the repository login page in your default browser. After you log in, it will redirect you to the Personal tokens page, where you can create a token. Copy the token and paste it into the terminal. The token will be stored in the configuration file and used for subsequent requests.

You can skip this step by using the --anonymous option. If you do so, you will get anonymous access to the repository - you will be able to read public records, but not to create or modify them.

If you specify the --default option, the repository will be set as the default one.

If you specify the --no-tls-verify option, the tool will not verify the TLS certificate of the server. This is useful for testing purposes, but not recommended for production use.

Selecting the default repository

A default repository is used when no repository alias is specified in the command. The first repository added will be set as default. To change the default repository, run:

nrp-cmd select repository <alias>

Listing known repositories

To get a list of repositories, invoke:

 
nrp-cmd list repositories
 
Repositories
 
  Alias   URL                       Default
 ───────────────────────────────────────────
  repo    https://127.0.0.1:5000/   ✓

Removing a repository

To remove a repository, run:

nrp-cmd remove repository <alias>

Repository introspection

Basic information

To get information about the repository, invoke:

nrp-cmd describe repository [--output-format=table,json,yaml] [--refresh] <alias>

This command will return machine-readable information about the repository, such as its name, description, metadata schemas and URLs.

By default, the output is a table; use the --output-format option to select a different format.

The information is cached because it is not expected to change frequently. To refresh it, use the --refresh option.

Sample output:

Repository 'repo'                                                
                                                                 
  Name                  NR Document repository                   
  URL                   https://127.0.0.1:5000/                  
  Token                 ***                                      
  TLS Verify            skip                                     
  Retry Count           5                                        
  Retry After Seconds   5                                        
  Default               ✓                                        
  Version               local development                        
  Invenio Version       12.1.35                                  
  Transfers             local-file, url-fetch                    
  Records url           https://127.0.0.1:5000/api/search/       
  User records url      https://127.0.0.1:5000/api/user/search/  
                                                                 
Model 'documents'                                                                                                                 
                                                                                                                                  
  Name                    documents                                                                                               
  Description                                                                                                                     
  Version                 1.0.0                                                                                                   
  Features                requests, drafts, files                                                                                 
  Schemas                 {'application/json': 'local://documents-1.0.0.json'}                                                    
  API                     https://127.0.0.1:5000/api/docs/                                                                        
  HTML                    https://127.0.0.1:5000/docs/                                                                            
  Schemas                 {'application/json': URL('https://127.0.0.1:5000/.well-known/repository/schema/documents-1.0.0.json')}  
  Model Schema            https://127.0.0.1:5000/.well-known/repository/models/documents                                          
  Published Records URL   https://127.0.0.1:5000/api/docs/                                                                        
  User Records URL        https://127.0.0.1:5000/api/user/docs/

Variables

The tool supports variables which are used mostly for storing record identifiers. The variables are stored inside the local directory in a file '.nrp/variables.json'. The file is created automatically when the first variable is stored.

The following operations are available:

nrp-cmd list variables [--output <fname>] [--output-format table|json|yaml] - list all variables.
nrp-cmd set variable <name> <value> [...<value>] - sets the variable to the values specified
nrp-cmd get variable <name> [--output-format table|json|yaml] [--single-value] - prints the variable to stdout
nrp-cmd remove variable <name> - removes the variable

Searching the repository

To search the repository, invoke one of the following commands (search and list are synonyms):

nrp-cmd search records [--repository <alias>] [modifiers] [<query>] [@variable]
nrp-cmd list records [--repository <alias>] [modifiers] [<query>] [@variable]

The optional query is a query string in the OpenSearch format. For example:

nrp-cmd search records "title:mytitle" @mytitle_ids

Modifiers

The following modifiers are supported:

--model - name of the model which will be searched. If not specified, all models will be searched. This option can be specified multiple times.
--community - name (slug) of the community whose records will be searched. If not specified, all communities will be searched. This option can be specified multiple times.
--size - number of results to return. Default is 10.
--page - page number to return. Default is 1.
--sort - sort order. The default is bestmatch. The supported values are bestmatch, newest, oldest.
--drafts - if specified, only draft records will be returned.
--mine - if specified, only user's records will be returned.
--published - if specified, only published records will be returned.
--output-format <format> - format of the output. The default is yaml. The supported formats are json and yaml.

If multiple models, communities or collections are specified, the search will return results from all of them in undefined order.

If the user is logged in, the search will also return private records.

Variables

If you specify the @variable, the urls of found records will be stored in the variable and can be used in subsequent commands, such as get or update.

If you specify the '@+variable', the identifiers of found records will be appended to the variable.
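
The difference between '@variable' and '@+variable' can be sketched as follows. This is only an illustration on an in-memory dictionary; store_variable is a hypothetical helper and says nothing about how the tool actually lays out its variables file.

```python
def store_variable(variables: dict[str, list[str]], target: str, urls: list[str]) -> None:
    """Apply an '@name' (overwrite) or '@+name' (append) target to a variable table."""
    if not target.startswith("@"):
        raise ValueError("variable targets start with '@'")
    name = target[1:]
    if name.startswith("+"):
        # '@+name': keep the existing values and add the new ones at the end.
        variables.setdefault(name[1:], []).extend(urls)
    else:
        # '@name': replace whatever was stored before.
        variables[name] = list(urls)


variables: dict[str, list[str]] = {}
store_variable(variables, "@ids", ["https://myserver.cesnet.cz/api/datasets/1"])
store_variable(variables, "@+ids", ["https://myserver.cesnet.cz/api/datasets/2"])
print(variables["ids"])
```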

Output

hits:
- id: 1
  links:
    self: https://myserver.cesnet.cz/api/datasets/1
    self_html: https://myserver.cesnet.cz/datasets/1
  metadata:
    title: My dataset
- id: 2
  ...
total: 100
links:
  self: ...
  next: ...
  prev: ...

Pagination

To get the next page, run the command with --page option. For example:

nrp-cmd search records --repository myserver --page 2 "title:mytitle"

Note that when paging through results, some records might be repeated on different pages and some might be skipped.
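
If you combine several pages yourself, repeated records can be dropped by their id. The following Python sketch assumes each page has been parsed into a dictionary with a hits list, as in the output shown above; merge_pages is a hypothetical helper. It cannot recover records that were skipped between pages.

```python
def merge_pages(pages: list[dict]) -> list[dict]:
    """Concatenate the hits of several result pages, keeping the first copy of each id."""
    seen = set()
    merged = []
    for page in pages:
        for hit in page.get("hits", []):
            if hit["id"] in seen:
                continue  # the same record already appeared on an earlier page
            seen.add(hit["id"])
            merged.append(hit)
    return merged


page1 = {"hits": [{"id": 1}, {"id": 2}], "total": 3}
page2 = {"hits": [{"id": 2}, {"id": 3}], "total": 3}
print([hit["id"] for hit in merge_pages([page1, page2])])
```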

Scanning records

If you want to get all records, use the following call:

nrp-cmd scan records [--repository <alias>] [modifiers] [<query>]

The modifiers are the same as for the search command, except for --sort and --page, which are not supported. The output will always be sorted from the oldest to the newest record.

The query is optional and can be used to filter the records.

For yaml output, the records will be written as a sequence of YAML documents separated by ---:

id: 1
...
---
id: 2
...

If you choose the json format, the output will be a list of records:

[
  {
    "id": 1,
    ...
  },
  {
    "id": 2,
    ...
  }
]

For easier processing, you can use the jsonl output format. Each record will be written as a separate JSON document on its own line:

{"id": 1, ...}
{"id": 2, ...}

Bash example:

# read one record per line; -r keeps backslashes, IFS= keeps whitespace
nrp-cmd scan records --output-format jsonl | while IFS= read -r REC; do
  echo "$REC"   # process the record
done

Note: the records are output as soon as they are received. If there is a communication error or the command is interrupted, the list of records will be incomplete and, in the case of json output, the closing square bracket might be missing.
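
The jsonl format can also be consumed from Python. The sketch below parses one record per line and stops at the first line that is not valid JSON, which is how a line cut off by an interrupted scan would appear. The function iter_records is a hypothetical helper, not part of the tool.

```python
import json
from typing import Iterable, Iterator


def iter_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed record per non-empty line of jsonl input."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            # A truncated line cannot be parsed; treat it as the end of usable input.
            break


sample = ['{"id": 1}\n', '{"id": 2}\n', '{"id": 3, "meta']
print([record["id"] for record in iter_records(sample)])
```

To process real output, pipe the scan into the script and pass sys.stdin to iter_records instead of the sample list.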

Record identifiers

Several representations can be used to identify a record:

id - this is the id field from search hit.
https://repository/api/model/id - the self url of the record
https://repository/api/user/model/id - the self url of the draft record
https://repository/model-ui/id - the self_html url of the record. The HTML page must contain api metadata inside the head tag.
doi:10.3323/1234567 - a DOI allocated by the repository (or any other supported external persistent identifier). The DOI must be resolvable. A call to the DOI resolver will be made to get the actual location of the record metadata.
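
The representations listed above can be told apart by simple prefix checks. The following Python sketch is an illustration only: identifier_kind and its return labels are hypothetical, and the URL patterns are taken from the examples above rather than from the tool's implementation.

```python
from urllib.parse import urlparse


def identifier_kind(pid: str) -> str:
    """Classify a record identifier into one of the forms listed above."""
    if pid.startswith("doi:"):
        return "doi"
    if pid.startswith(("http://", "https://")):
        path = urlparse(pid).path
        # Check the more specific draft prefix before the general API prefix.
        if path.startswith("/api/user/"):
            return "draft api url"
        if path.startswith("/api/"):
            return "api url"
        return "html url"
    return "id"


for pid in ["1", "https://repository/api/model/id", "doi:10.3323/1234567"]:
    print(pid, "->", identifier_kind(pid))
```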

CRUD operations

Reading record metadata

To get record metadata, invoke

nrp-cmd get record <pid> <pid> ...

where the pid is the record identifier in any of the supported formats, as defined above.

The command will output the record to stdout, for example in yaml:

id: 1
links:
  self: https://myserver.cesnet.cz/api/datasets/1
  self_html: https://myserver.cesnet.cz/datasets/1
metadata:
  title: My dataset

You can specify additional parameters:

-o, --output fn - will save the record as this output file
--repository <alias> - will use the specified repository
--expand - will add a list of files and requests to the output
--output-format - will change the output format to json or yaml

The value of --output can contain placeholders such as {id}, which will be replaced with the model name and the record id. Placeholders can also reference metadata from within the record, for example {metadata[title]}.json. Subdirectories are allowed in the --output value and will be created if necessary. The substituted values will be sanitized: any .. and a leading / will be removed.
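
The placeholder syntax matches Python's str.format field syntax, so the expansion and sanitization can be sketched with string.Formatter. This is a minimal illustration under that assumption; SanitizingFormatter and expand_output are hypothetical names, and the tool's actual sanitization rules may be stricter.

```python
import string


class SanitizingFormatter(string.Formatter):
    """Formatter that cleans every substituted value before it reaches the path."""

    def format_field(self, value, format_spec):
        text = super().format_field(value, format_spec)
        # Drop parent-directory references and any leading slash.
        return text.replace("..", "").lstrip("/")


def expand_output(template: str, record: dict) -> str:
    """Fill the placeholders of an --output template from a record."""
    return SanitizingFormatter().vformat(template, (), record)


record = {"id": "z2rrt-pz252", "metadata": {"title": "blah"}}
print(expand_output("data/{id}/{metadata[title]}.json", record))
```

Only the substituted values are cleaned here; the literal parts of the template are left as written.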

Note: You can use the @variable to get the record url from the variable. To download multiple records with their files, use:

nrp-cmd search records "title:mytitle" @mytitle_ids
nrp-cmd get record @mytitle_ids --download -o 'data/{id}'

Downloading a complete record

To download a complete record, invoke

nrp-cmd download record <pid> [-o directory]

where the pid is the record identifier in any of the supported formats, as defined above.

You can specify additional parameters:

-o, --output directory - will save the record into this output directory
--repository <alias> - will use the specified repository
--model <model> - will use the specified model in case the id is ambiguous
--expand - will add a list of files and requests to the output
--output-format - will change the output format to json or yaml

As above, the value of the --output can contain placeholders. If the output is not provided, the record id will be used as the output directory.

This call will create the following structure:

<output>
  metadata.json
  files
    file1.txt
    file2.zip
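
A downloaded record can then be inspected with a few lines of Python. The sketch assumes exactly the directory layout shown above; load_download is a hypothetical helper.

```python
import json
from pathlib import Path


def load_download(directory: Path) -> tuple[dict, list[str]]:
    """Return the parsed metadata.json and the sorted names of the downloaded files."""
    metadata = json.loads((directory / "metadata.json").read_text(encoding="utf-8"))
    files_dir = directory / "files"
    # A record without files may have no files directory at all (assumption).
    names = sorted(p.name for p in files_dir.iterdir()) if files_dir.is_dir() else []
    return metadata, names
```

For example, load_download(Path("z2rrt-pz252")) would read a record that was downloaded without the -o option, since the record id is then used as the output directory.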

Creating a record

To create a record, invoke

nrp-cmd create record [--model <model>] <metadata> [<file> <file_metadata> ...] [@variable]

where the model is the name of the model and the metadata is either a json string beginning with { containing the metadata, or path on the filesystem to a file containing the metadata in either json or yaml format. A - can be used to read the metadata from stdin.

If model is not specified, either the repository must contain just one model or the model must be specified in the metadata (for example, using the $schema attribute).

All options:

--repository <alias> - will use the specified repository
--model <model> - will create the record in the specified model
-o, --output fn - will save the record as this output file
-f, --output-format json|yaml|table|jsonl - will change the output format
--community <community> - will create the record in the specified community
--workflow <workflow> - will create the record with the specified workflow
--metadata-only - will create the record with metadata only, files will not be uploaded

The command will output the created record to stdout, for example in yaml:

# nrp-cmd create record --model documents '{"metadata":{"title": "blah"}}' @blah
 
id: z2rrt-pz252
$schema: local://documents-1.0.0.json
created: '2024-03-14T13:30:01.133502+00:00'
errors:
- field: metadata.creators
  messages:
  - Missing data for required field.
- field: metadata.resourceType
  messages:
  - Missing data for required field.
files:
  enabled: true
links:
  draft: https://127.0.0.1:5000/api/docs/z2rrt-pz252/draft
  # more links here ...
parent:
  id: rcpyq-rx266
request_types:
- type_id: publish_draft
  links:
    actions:
      create: https://127.0.0.1:5000/api/requests
revision_id: 3
updated: '2024-03-14T13:30:01.154516+00:00'
metadata:
  title: blah

If you specify the @variable, the identifier of the created record will be stored in the variable and can be used in subsequent commands, such as get or update:

# nrp-cmd get record @blah
 
id: z2rrt-pz252
# ... same result as above

Updating a record

To update a record, invoke

nrp-cmd update record <pid> [--merge|--replace] <metadata>
nrp-cmd update record <pid> --path=<pth> <metadata>

where the pid is the record identifier in any of the supported formats, as defined above and metadata is either a json string beginning with { containing the metadata, or path on the filesystem to a file containing the metadata in either json or yaml format.

All options:

--repository <alias> - will use the specified repository
--model <model> - will use the specified model in case the id is ambiguous
--replace - will replace the metadata of the record. This is the default operation
--merge - will merge the metadata of the record with the new metadata
--path <pth> - will update the metadata at the specified path, not the whole metadata
-o, --output fn - will save the record as this output file
--output-format - will change the output format to json or yaml

The command will fetch the current version of the record and update it with the new metadata. If --merge is used, the new and old metadata will be merged as follows:

if the value is a scalar, the new value will replace the old value
if the value is a list, the new list will be appended to the old list
if the value is a dictionary, then for each key:
- if the key is not present in the old dictionary, it will be added
- if the key is present in the old dictionary, the value will be updated as above

null values will be removed from the metadata after the merge operation.
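
The merge rules above can be written down as a short recursive function. This is a sketch of the described behaviour, not the tool's code: merge and drop_nulls are hypothetical names, a type mismatch between old and new values is resolved in favour of the new value (an assumption), and null removal is applied to dictionary entries only (also an assumption).

```python
def merge(old, new):
    """Merge new metadata into old metadata following the rules listed above."""
    if isinstance(old, dict) and isinstance(new, dict):
        merged = dict(old)
        for key, value in new.items():
            # Missing keys are added, existing keys are merged recursively.
            merged[key] = merge(old[key], value) if key in old else value
        return merged
    if isinstance(old, list) and isinstance(new, list):
        return old + new  # the new list is appended to the old one
    return new  # scalars (and mismatched types) are replaced


def drop_nulls(value):
    """Remove dictionary entries whose value is None, at any depth."""
    if isinstance(value, dict):
        return {k: drop_nulls(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [drop_nulls(v) for v in value]
    return value


old = {"title": "blah", "keywords": ["a"], "publisher": "myorg"}
new = {"title": "blah2", "keywords": ["b"], "publisher": None}
print(drop_nulls(merge(old, new)))
```

Under these rules, passing null for a key is the way to delete it during a merge.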

The command will output the updated record to stdout.

The following two are equivalent and both update the {'metadata': {'title': '...'}} part of the record:

nrp-cmd update record @blah '{"title": "blah2"}'
nrp-cmd update record @blah --path title blah2

Note: the command always updates the "metadata" part of the record. You cannot update other parts of the record using this command.
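
The effect of --path can be pictured as setting a single value inside the metadata dictionary. The sketch below is an illustration only: set_at_path is a hypothetical helper, and the use of dots to address nested fields is an assumption, since the examples above only show top-level keys.

```python
import copy


def set_at_path(metadata: dict, path: str, value) -> dict:
    """Return a copy of metadata with the value at the given path replaced."""
    keys = path.split(".")  # assumption: nested fields are separated by dots
    updated = copy.deepcopy(metadata)
    node = updated
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value
    return updated


metadata = {"title": "blah", "publisher": "myorg"}
print(set_at_path(metadata, "title", "blah2"))
```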

Deleting a record

TODO Not yet implemented

To delete a record, invoke

nrp-cmd delete record <pid>

where the pid is the record identifier in any of the supported formats, as defined above.

Validating a record

TODO Not yet implemented

To validate a record, invoke

nrp-cmd validate record <pid>

The command will perform a validation of the record and output the validation errors to stdout. If the record is invalid, the command will return a non-zero exit code.

Note: the validation is performed by saving the record, so the internal revision number is increased by one. This implementation might change in the future.

Files

Listing files on a record

To list files on a record, invoke

nrp-cmd list files <pid>

where the pid is the record identifier in any of the supported formats, as defined above. You can use the @variable to get the record url from the variable. Also, all options from the get record command are supported.

# nrp-cmd list files @blah
 
bucket_id: 73c3d9ec-e4f2-4109-9212-de9e85ecd302
checksum: md5:d70b90d02e92e8ccb83100a324ac0f29
created: '2024-03-14T13:50:24.671439+00:00'
file_id: aae1b651-2245-4853-bfb5-03f42fd41703
key: p8.toml
links:
  commit: https://127.0.0.1:5000/api/docs/z2rrt-pz252/draft/files/p8.toml/commit
  content: https://127.0.0.1:5000/api/docs/z2rrt-pz252/draft/files/p8.toml/content
  self: https://127.0.0.1:5000/api/docs/z2rrt-pz252/draft/files/p8.toml
mimetype: application/octet-stream
size: 578
status: completed
storage_class: L
updated: '2024-03-14T13:50:24.804383+00:00'
version_id: 813511ba-fd02-4b13-888d-19cc145a7072
metadata:
  description: project file
---
# ... more files here

Alternatively, you can use the --expand option when getting a record and then take the files section from the output:

# nrp-cmd get record @blah --expand
 
mid: draft/documents/z2rrt-pz252
id: z2rrt-pz252
$schema: local://documents-1.0.0.json
created: '2024-03-14T13:30:01.133502+00:00'
files:
- key: p8.toml
  storage_class: L
  checksum: md5:d70b90d02e92e8ccb83100a324ac0f29
  size: 578
  created: '2024-03-14T13:50:24.671439+00:00'
  updated: '2024-03-14T13:50:24.804383+00:00'
  status: completed
  metadata:
    description: project file
  mimetype: application/octet-stream
  version_id: 813511ba-fd02-4b13-888d-19cc145a7072
  file_id: aae1b651-2245-4853-bfb5-03f42fd41703
  bucket_id: 73c3d9ec-e4f2-4109-9212-de9e85ecd302
  links:
    commit: https://127.0.0.1:5000/api/docs/z2rrt-pz252/draft/files/p8.toml/commit
    content: https://127.0.0.1:5000/api/docs/z2rrt-pz252/draft/files/p8.toml/content
    self: https://127.0.0.1:5000/api/docs/z2rrt-pz252/draft/files/p8.toml
links:

Downloading a file

To download a file, invoke

nrp-cmd download file <pid> <file-key> <file-key> ... [-o output-file]

where the pid is the record identifier in any of the supported formats, as defined above and file-key is the key of the file to download. The output-file is the path to the file where the file will be saved. If you do not specify the output-file, the file will be saved to the current directory with the same name as the key.

If you specify multiple file-keys, each file will be saved to the current directory under the same name as its key. In this case, you can use a {var} placeholder in the output file name. The placeholder will be replaced with the corresponding value from the file's metadata (for example, {title}.pdf saves each file under the title from its metadata, with the .pdf extension). To keep the original extension, use the {ext} placeholder.

To download all files, use the * as the file-key. Do not forget to escape it in the shell.

Note: If you download all files, it is preferable to use nrp-cmd download record ... command instead. You will get all the metadata and it might be useful for provenance tracking.

Uploading a file

To upload a file, invoke

nrp-cmd upload file <pid> <file> [<metadata>]

where the pid is the record identifier in any of the supported formats, as defined above. If the metadata is not passed, the file will be uploaded with empty metadata.

Everything after this is TODO for both implementation and checking

Updating file metadata

To update file metadata, invoke

nrp-cmd update file <pid> <file-key> <metadata>

where the pid is the record identifier in any of the supported formats, as defined above, file-key is the key of the file to update and metadata is either a json string beginning with { containing the metadata, or path on the filesystem to a file containing the metadata in either json or yaml format.

Alternatively, you can use the following command:

nrp-cmd update file <pid> <file-key> --set <path>=<value> --set <path>=<value> ...

where the path is the path to the metadata field and value is the new value.

Re-uploading a file

To reupload a file, invoke

nrp-cmd replace file <pid> <file-key> <file> [<metadata>]

Deleting a file

To delete a file, invoke

nrp-cmd delete file <pid> <file-key>

Publish and edit published document

TODO Not yet implemented

Publishing a draft

To publish a draft, invoke

nrp-cmd publish draft <pid> [version]

where the pid is the record identifier in any of the supported formats, as defined above. The command will output the published record to stdout.

If you use a variable instead of the pid, the variable will be updated with the new identifier.

Note: the record is validated before publishing. If the validation fails, the command will fail and print the validation errors instead of the record.

Editing published record

To edit a published record, invoke

nrp-cmd edit record <pid>

where the pid is the record identifier in any of the supported formats, as defined above. The command will create a draft record and output it to stdout.

If you use a variable instead of the pid, the variable will be updated with the new identifier.

Requests

TODO Not yet implemented

Listing requests on a record

To list requests on a record, invoke

nrp-cmd list requests <pid>

where the pid is the record identifier in any of the supported formats, as defined above.

documents_publish_draft:
  type_id: documents_publish_draft
  links:
    actions:
      create: https://127.0.0.1:5000/api/docs/b465x-wz855/draft/requests/documents_publish_draft
  requests:
  - id: 0c224d67-f8a1-4a51-bf15-a75bfd53ceb2
    created: '2024-03-16T16:36:42.538694+00:00'
    updated: '2024-03-16T16:36:42.545675+00:00'
    links:
      actions:
        submit: https://127.0.0.1:5000/api/requests/0c224d67-f8a1-4a51-bf15-a75bfd53ceb2/actions/submit
      self: https://127.0.0.1:5000/api/requests/extended/0c224d67-f8a1-4a51-bf15-a75bfd53ceb2
      comments: https://127.0.0.1:5000/api/requests/extended/0c224d67-f8a1-4a51-bf15-a75bfd53ceb2/comments
      timeline: https://127.0.0.1:5000/api/requests/extended/0c224d67-f8a1-4a51-bf15-a75bfd53ceb2/timeline
    revision_id: 2
    type: documents_publish_draft
    title: ''
    number: '2'
    status: created
    is_closed: false
    is_open: false
    expires_at: null
    is_expired: false
    created_by:
      user: '1'
    receiver:
      group: curator
    topic:
      documents_draft: b465x-wz855

Creating a request

To create a request, invoke

nrp-cmd create request <pid> <type> <metadata> [--submit] [@variable]

where the pid is the record identifier in any of the supported formats, as defined above, type is the type of the request and metadata is either a json string beginning with { containing the metadata, or path on the filesystem to a file containing the metadata in either json or yaml format.

You can store the identifier of the created request in a variable by appending @variable to the command. It will store the <pid>/<request_id> into the variable, so later on you'll use this single variable instead of pid and request_id.

If you specify the --submit option, the request will be submitted immediately after creation.

Getting a request

To get a request, invoke

nrp-cmd get request <request-id>
 
# or
 
nrp-cmd get request @variable

where request-id is the identifier (url) of the request. The command will output the request to stdout. The command takes output file and format options as well, see the get record command for details.

Updating a request with new metadata

If the request has not been submitted, you can update the metadata of the request by invoking

nrp-cmd update request <request-id> <metadata>
 
# or
 
nrp-cmd update request @variable <metadata>

where request-id is the identifier of the request and metadata is either a json string beginning with { containing the metadata, or path on the filesystem to a file containing the metadata in either json or yaml format.

You might also use the --set option to update the metadata:

nrp-cmd update request <request-id> --set <path>=<value> --set <path>=<value> ...

where the path is the path to the metadata field and value is the new value.

Submitting a request

To submit a request, invoke

nrp-cmd submit request <pid> <request-id> [<message>]
 
# or
 
nrp-cmd submit request @variable [<message>]

where the pid is the record identifier in any of the supported formats as defined above and request-id is the identifier of the request. You can pass a message to the submit command, which will be stored in the request.

Cancelling a request

To cancel a request, invoke

nrp-cmd cancel request <request-id> [<message>]
 
# or
 
nrp-cmd cancel request @variable [<message>]

where request-id is the identifier of the request. You can pass a message to the cancel command, which will be stored in the request.

Accepting a request

To accept a request, invoke

nrp-cmd accept request <request-id> [<message>]
 
# or
 
nrp-cmd accept request @variable [<message>]

where request-id is the identifier of the request. You can pass a message to the accept command, which will be stored in the request.

Declining a request

To decline a request, invoke

nrp-cmd decline request <request-id> [<message>]
 
# or
 
nrp-cmd decline request @variable [<message>]

where request-id is the identifier of the request. You can pass a message to the decline command, which will be stored in the request.