NRP commandline tools easy
NRP commandline tools currently work only with NRP repositories based on the Invenio technology stack. They cannot be used to access repositories based on other technology stacks, such as DSpace, ARL or Islandora.
Prerequisites
- python 3.12 or higher
- curl
Example
# add a repository, will ask for token
nrp-cmd add repository https://data.narodni-repozitar.cz/ catchall --default
# create a new record and upload file to it. Store the record url to variable @rec
nrp-cmd create record --community brno \
'{"title": "My Record"}' \
/tmp/myfile.txt '{"title":"My file"}' @rec
# or with metadata in file
nrp-cmd create record --community brno \
metadata.json \
/tmp/myfile.txt /tmp/myfile-metadata.json @rec
# publish the record
nrp-cmd publish record @rec
Bulk changes to records:
# get all records with publisher being 'myorg'
nrp-cmd scan records "metadata.publisher:myorg" @ids
# download them together with files to the current directory as a backup
nrp-cmd download record @ids
# After carefully inspecting them,
# change the publisher to 'myorg2' inside all the records
nrp-cmd update record @ids --path publisher myorg2
Installation
Binary distribution for Linux/64bit
Not yet released; use the pip version instead.
Once a release is available, download the latest release from GitHub and place it somewhere on your PATH.
Then just run nrp-cmd.
Using pip and virtualenv
Run the following commands to install the tool:
python3 -m venv nrp-cmd
source nrp-cmd/bin/activate
pip install nrp-invenio-client
For convenience, create an alias in your .bashrc/.zshrc file:
alias nrp-cmd="<path-to-nrp-invenio-client>/bin/nrp-cmd"
Open a new terminal and try running nrp-cmd.
Note: The nrp-cmd command stores its configuration in the ~/.nrp/invenio-config.json file.
The file contains sensitive information, such as tokens, so make sure it is not accessible by other users.
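A minimal sketch, assuming a POSIX system, of how the permissions on the configuration file could be tightened so that only the owner can read it. The helper name restrict_to_owner is hypothetical and is not part of nrp-cmd; only the file location comes from the note above.

```python
import os
import stat
from pathlib import Path


def restrict_to_owner(path: Path) -> bool:
    """Make `path` readable and writable by its owner only (mode 0600).

    Returns True if the permissions had to be changed.
    """
    current = stat.S_IMODE(path.stat().st_mode)
    wanted = stat.S_IRUSR | stat.S_IWUSR  # 0o600
    if current == wanted:
        return False
    os.chmod(path, wanted)
    return True
```

It could be called as restrict_to_owner(Path.home() / ".nrp" / "invenio-config.json") after the first repository has been added.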
Getting help
Call nrp-cmd --help to get a list of available commands, or nrp-cmd <command> --help to get help for a specific command.
Managing repositories
Adding a repository
The tool can work with multiple repositories. Each repository is identified by an alias and one of the repositories is set as a default.
To add a repository, invoke:
nrp-cmd add repository <servername> <alias> [--token token] [--anonymous] [--default] [--no-tls-verify]
where servername is either the full URL of the server or just the host name (e.g. myserver.cesnet.cz), and alias is an optional alias.
If you do not provide the alias, the first part of the servername (up to the first dot) will be used.
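The default alias rule can be illustrated with a short Python sketch. This is an illustration of the rule as described above, not the tool's actual code; the function name default_alias is hypothetical.

```python
from urllib.parse import urlparse


def default_alias(servername: str) -> str:
    """Return the part of the host name before the first dot."""
    # Accept both a full URL ("https://host/...") and a bare host name.
    if "://" in servername:
        host = urlparse(servername).hostname or servername
    else:
        host = servername
    return host.split(".", 1)[0]
```

With this rule, https://data.narodni-repozitar.cz/ would get the alias data and myserver.cesnet.cz would get the alias myserver.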
If you do not specify the token, the command will try to open a repository login page
in your default browser. After you log in, it will redirect you to the Personal tokens
page, where you can create a token. Then copy the token to the terminal. The token will
be stored in the configuration file and used for subsequent requests.
You can skip this step by using the --anonymous option. If you do so, you will get anonymous access to the repository: you will be able to read public records, but not to create or modify them.
If you specify the --default option, the repository will be set as the default one.
If you specify the --no-tls-verify option, the tool will not verify the TLS certificate of the server. This is useful for testing purposes, but not recommended for production use.
Selecting the default repository
A default repository is used when no repository alias is specified in the command. The first repository added will be set as default. To change the default repository, run:
nrp-cmd select repository <alias>
Listing known repositories
To get a list of repositories, invoke:
nrp-cmd list repositories
Repositories
Alias URL Default
───────────────────────────────────────────
repo https://127.0.0.1:5000/ ✓
Removing a repository
To remove a repository, run:
nrp-cmd remove repository <alias>
Repository introspection
Basic information
To get information about the repository, invoke:
nrp-cmd describe repository [--output-format=table,json,yaml] [--refresh] <alias>
This command returns machine-readable information about the repository, such as its name, description, metadata schemas, URLs, etc.
By default it outputs a table, but you can select the output format using the --output-format option.
The information is cached because it is not expected to change frequently. To refresh it, use the --refresh option.
Sample output:
Repository 'repo'
Name NR Document repository
URL https://127.0.0.1:5000/
Token ***
TLS Verify skip
Retry Count 5
Retry After Seconds 5
Default ✓
Version local development
Invenio Version 12.1.35
Transfers local-file, url-fetch
Records url https://127.0.0.1:5000/api/search/
User records url https://127.0.0.1:5000/api/user/search/
Model 'documents'
Name documents
Description
Version 1.0.0
Features requests, drafts, files
Schemas {'application/json': 'local://documents-1.0.0.json'}
API https://127.0.0.1:5000/api/docs/
HTML https://127.0.0.1:5000/docs/
Schemas {'application/json': URL('https://127.0.0.1:5000/.well-known/repository/schema/documents-1.0.0.json')}
Model Schema https://127.0.0.1:5000/.well-known/repository/models/documents
Published Records URL https://127.0.0.1:5000/api/docs/
User Records URL https://127.0.0.1:5000/api/user/docs/
Variables
The tool supports variables which are used mostly for storing record identifiers. The variables are stored inside the local directory in a file '.nrp/variables.json'. The file is created automatically when the first variable is stored.
The following operations are available:
- nrp-cmd list variables [--output <fname>] [--output-format table|json|yaml] - lists all variables
- nrp-cmd set variable <name> <value> [...<value>] - sets the variable to the specified values
- nrp-cmd get variable <name> [--output-format table|json|yaml] [--single-value] - prints the variable to stdout
- nrp-cmd remove variable <name> - removes the variable
Searching the repository
To search the repository, invoke one of the following commands (search and list are synonyms):
nrp-cmd search records [--repository <alias>] [modifiers] <query> [@variable]
nrp-cmd list records [--repository <alias>] [modifiers] <query> [@variable]
The optional query is a query string in the OpenSearch format. For example:
nrp-cmd search records "title:mytitle" @mytitle_ids
Modifiers
The following modifiers are supported:
- --model - name of the model which will be searched. If not specified, all models will be searched. This option can be specified multiple times.
- --community - name (slug) of the community whose records will be searched. If not specified, all communities will be searched. This option can be specified multiple times.
- --size - number of results to return. The default is 10.
- --page - page number to return. The default is 1.
- --sort - sort order. The default is bestmatch. The supported values are bestmatch, newest, oldest.
- --drafts - if specified, only draft records will be returned.
- --mine - if specified, only the user's records will be returned.
- --published - if specified, only published records will be returned.
- --output-format <format> - format of the output. The default is yaml. The supported formats are json and yaml.
If multiple models, communities or collections are specified, the search will return results from all of them in undefined order.
If the user is logged in, the search will also return private records.
Variables
If you specify the @variable, the urls of found records will be stored in the variable and can be used in subsequent commands, such as get or update.
If you specify '@+variable', the identifiers of found records will be appended to the variable.
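The difference between the two forms can be sketched in Python as follows. This only models the overwrite versus append behaviour described above; the function name store_results and the in-memory dictionary are assumptions, not the tool's real storage code.

```python
def store_results(variables: dict[str, list[str]], target: str, urls: list[str]) -> None:
    """Store `urls` under the variable named by `target` ("@name" or "@+name")."""
    if target.startswith("@+"):
        # "@+name": keep the existing values and append the new ones.
        variables.setdefault(target[2:], []).extend(urls)
    elif target.startswith("@"):
        # "@name": replace whatever was stored before.
        variables[target[1:]] = list(urls)
    else:
        raise ValueError(f"not a variable reference: {target!r}")
```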
Output
hits:
- id: 1
links:
self: https://myserver.cesnet.cz/api/datasets/1
self_html: https://myserver.cesnet.cz/datasets/1
metadata:
title: My dataset
- id: 2
...
total: 100
links:
self: ...
next: ...
prev: ...
Pagination
To get the next page, run the command with the --page option. For example:
nrp-cmd search records --repository myserver --page 2 "title:mytitle"
Note that records might be repeated on different pages or some of the records can be skipped.
Scanning records
If you want to get all records, use the following call:
nrp-cmd scan records [--repository <alias>] [modifiers] <query>
The modifiers are the same as for the search command, except for --sort and --page, which are not supported.
The output will be always sorted from the oldest to the newest record.
The query is optional and can be used to filter the records.
For yaml output, the output will be a list of record documents:
id: 1
...
---
id: 2
...
If you choose the json format, the output will be a list of records:
[
{
"id": 1,
...
},
{
"id": 2,
...
}
]
For easier processing, you can use the jsonl output format. Each record will be output as a separate json document on its own line:
{"id": 1, ...}
{"id": 2, ...}
Bash example:
nrp-cmd scan records --output-format jsonl | while IFS= read -r REC ; do
  echo "$REC" # process the record
done
Note: the records will be output as soon as they are received. If there is any communication error or the command is interrupted, the list of records will be incomplete and, in the case of json output, the closing square bracket might be missing.
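If you prefer to process the jsonl output in Python rather than in the shell, a reader along these lines might be used. It is a sketch only: the function name iter_records is hypothetical, and stopping at the first unparsable line is one possible way of handling a truncated final line after an interrupted scan.

```python
import json
from collections.abc import Iterable, Iterator


def iter_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one record per non-empty line of jsonl output.

    Stops at the first line that is not valid JSON, which is what a
    truncated final line looks like when the scan was interrupted.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            break
```

A script using it could read sys.stdin and be placed at the end of the pipe, for example nrp-cmd scan records --output-format jsonl | python process.py, where process.py is a file name chosen for illustration.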
Record identifiers
Several representations can be used to identify a record:
- id - this is the id field from a search hit.
- https://repository/api/model/id - the self url of the record
- https://repository/api/user/model/id - the self url of the draft record
- https://repository/model-ui/id - the self_html url of the record. The HTML page must contain api metadata inside the head tag.
- doi:10.3323/1234567 - a DOI allocated by the repository (or any other supported external persistent identifier). The DOI must be resolvable. A call to the DOI resolver will be made to get the actual location of the record metadata.
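As a rough illustration, the representations listed above can be told apart by their shape. The sketch below relies only on the example URL patterns from the list; the function name and the returned labels are hypothetical, and the way the tool itself resolves identifiers may differ.

```python
from urllib.parse import urlparse


def classify_identifier(pid: str) -> str:
    """Return a rough label for the kind of record identifier."""
    if pid.startswith("doi:"):
        return "doi"
    if pid.startswith(("http://", "https://")):
        path = urlparse(pid).path
        if path.startswith("/api/user/"):
            return "draft-api-url"
        if path.startswith("/api/"):
            return "api-url"
        # Anything else is assumed to be the HTML page of the record.
        return "html-url"
    return "id"
```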
CRUD operations
Reading record metadata
To get record metadata, invoke
nrp-cmd get record <pid> <pid> ...
where pid is the record identifier in any of the supported formats, as defined above.
The command will output the record to stdout, for example in yaml:
id: 1
links:
self: https://myserver.cesnet.cz/api/datasets/1
self_html: https://myserver.cesnet.cz/datasets/1
metadata:
title: My dataset
You can specify additional parameters:
- -o, --output fn - will save the record as this output file
- --repository <alias> - will use the specified repository
- --expand - will add a list of files and requests to the output
- --output-format - will change the output format to json or yaml
The value of --output can contain placeholders (for example {id}), which will be replaced with the model name and the record id. The placeholders can also reference metadata from within the record, such as {metadata[title]}.json.
Subdirectories are allowed within the --output value and will be created if necessary.
The placeholder values will be sanitized: .. and a leading / will be removed.
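The placeholder syntax shown above matches Python's format-string syntax, so the expansion can be sketched with string.Formatter. This is an assumption-laden illustration, not the tool's implementation: the class and function names are hypothetical, and the sanitization is a simplified reading of the rule above.

```python
import string


class SanitizingFormatter(string.Formatter):
    """Formatter that cleans every substituted value before inserting it."""

    def format_field(self, value, format_spec):
        text = super().format_field(value, format_spec)
        # Drop ".." sequences and any leading "/" so that a value taken
        # from the record cannot point outside the output directory.
        return text.replace("..", "").lstrip("/")


def expand_output(template: str, record: dict) -> str:
    """Fill placeholders such as {id} or {metadata[title]} from `record`."""
    return SanitizingFormatter().vformat(template, (), record)
```

For a record with id z2rrt-pz252 and the title ../secret, the template data/{id}/{metadata[title]}.json would expand to data/z2rrt-pz252/secret.json under these assumptions.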
Note: You can use the @variable to get the record url from the variable. To download multiple records with their files, use:
nrp-cmd search records "title:mytitle" @mytitle_ids
nrp-cmd get record @mytitle_ids --download -o 'data/{id}'
Downloading a complete record
To download a complete record, invoke
nrp-cmd download record <pid> [-o directory]
where pid is the record identifier in any of the supported formats, as defined above.
You can specify additional parameters:
- -o, --output fn - will save the record into this output directory
- --repository <alias> - will use the specified repository
- --model <model> - will use the specified model in case the id is ambiguous
- --expand - will add a list of files and requests to the output
- --output-format - will change the output format to json or yaml
As above, the value of --output can contain placeholders. If the output is not provided, the record id will be used as the output directory.
This call will create the following structure:
<output>
  metadata.json
  files
    file1.txt
    file2.zip
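A downloaded record can then be inspected with a few lines of Python. The sketch below assumes exactly the directory layout shown above (a metadata.json file next to a files directory); the function name load_downloaded_record is hypothetical.

```python
import json
from pathlib import Path


def load_downloaded_record(directory: Path) -> tuple[dict, list[Path]]:
    """Return the parsed metadata and the sorted list of downloaded files."""
    metadata = json.loads((directory / "metadata.json").read_text(encoding="utf-8"))
    files_dir = directory / "files"
    if files_dir.is_dir():
        files = sorted(p for p in files_dir.iterdir() if p.is_file())
    else:
        # A record without files may have no "files" directory at all.
        files = []
    return metadata, files
```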
Creating a record
To create a record, invoke
nrp-cmd create record [--model <model>] <metadata> [<file> <file_metadata> ...] [@variable]
where model is the name of the model and metadata is either a json string beginning with { containing the metadata, or a path on the filesystem to a file containing the metadata in either json or yaml format.
A - can be used to read the metadata from stdin.
If the model is not specified, either the repository must contain just one model or the model must be specified in the metadata (for example, using the $schema attribute).
All options:
- --repository <alias> - will use the specified repository
- --model <model> - will create the record in the specified model
- -o, --output fn - will save the record as this output file
- -f, --output-format json|yaml|table|jsonl - will change the output format
- --community <community> - will create the record in the specified community
- --workflow <workflow> - will create the record with the specified workflow
- --metadata-only - will create the record with metadata only, files will not be uploaded
The command will output the created record to stdout, for example in yaml:
# nrp-cmd create record documents '{"metadata":{"title": "blah"}}' @blah
id: z2rrt-pz252
$schema: local://documents-1.0.0.json
created: '2024-03-14T13:30:01.133502+00:00'
errors:
- field: metadata.creators
messages:
- Missing data for required field.
- field: metadata.resourceType
messages:
- Missing data for required field.
files:
enabled: true
links:
draft: https://127.0.0.1:5000/api/docs/z2rrt-pz252/draft
# more links here ...
parent:
id: rcpyq-rx266
request_types:
- type_id: publish_draft
links:
actions:
create: https://127.0.0.1:5000/api/requests
revision_id: 3
updated: '2024-03-14T13:30:01.154516+00:00'
metadata:
title: blah
If you specify the @variable, the identifier of the created record will be stored in the variable and can be used in subsequent commands, such as get or update:
# nrp-cmd get record @blah
id: z2rrt-pz252
# ... same result as above
Updating a record
To update a record, invoke
nrp-cmd update record <pid> [--merge|--replace] <metadata>
nrp-cmd update record <pid> --path=<pth> <metadata>
where pid is the record identifier in any of the supported formats, as defined above, and metadata is either a json string beginning with { containing the metadata, or a path on the filesystem to a file containing the metadata in either json or yaml format.
All options:
- --repository <alias> - will use the specified repository
- --model <model> - will use the specified model in case the id is ambiguous
- --replace - will replace the metadata of the record. This is the default operation
- --merge - will merge the metadata of the record with the new metadata
- --path <pth> - will update the metadata at the specified path, not the whole metadata
- -o, --output fn - will save the record as this output file
- --output-format - will change the output format to json or yaml
The command will fetch the current version of the record and update it with the new metadata.
If --merge is used, the new and old metadata will be merged as follows:
- if the value is a scalar, the new value will replace the old value
- if the value is a list, the new list will be appended to the old list
- if the value is a dictionary, then for each key:
  - if the key is not present in the old dictionary, it will be added
  - if the key is present in the old dictionary, the value will be updated as above
null values will be removed from the metadata after the merge operation.
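The merge rules above can be sketched in Python as follows. This is an illustration of the described behaviour, not the tool's own code; the function names are hypothetical, and the handling of mismatched types (for example a list replaced by a scalar) is an assumption.

```python
def merge_metadata(old, new):
    """Merge `new` into `old` following the rules listed above."""
    if isinstance(old, dict) and isinstance(new, dict):
        merged = dict(old)
        for key, value in new.items():
            # Keys missing from the old dictionary are added as they are,
            # existing keys are merged recursively.
            merged[key] = merge_metadata(old[key], value) if key in old else value
        return merged
    if isinstance(old, list) and isinstance(new, list):
        return old + new
    # Scalars (and, by assumption, mismatched types) are replaced.
    return new


def drop_nulls(value):
    """Recursively remove dictionary entries whose value is None."""
    if isinstance(value, dict):
        return {k: drop_nulls(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [drop_nulls(v) for v in value]
    return value
```

Under these rules, passing a null for a key is a way to remove that key: drop_nulls(merge_metadata(old, new)) no longer contains it.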
The command will output the updated record to stdout.
The following two commands are equivalent and both update the {'metadata': {'title': '...'}} part of the record:
nrp-cmd update record @blah '{"title": "blah2"}'
nrp-cmd update record @blah --path title blah2
Note: the command always updates the "metadata" part of the record. You cannot update other parts of the record using this command.
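The effect of --path can be pictured as setting a single value inside the metadata while leaving the rest untouched. The sketch below assumes a dot-separated path syntax for nested fields, which the text above does not confirm (it only shows single-key paths such as title and publisher); the function name set_at_path is hypothetical.

```python
def set_at_path(metadata: dict, path: str, value) -> dict:
    """Return a copy of `metadata` with `value` stored at dot-separated `path`."""
    result = dict(metadata)
    node = result
    keys = path.split(".")
    for key in keys[:-1]:
        # Copy each nested dictionary on the way down so the input is untouched.
        child = node.get(key)
        node[key] = dict(child) if isinstance(child, dict) else {}
        node = node[key]
    node[keys[-1]] = value
    return result
```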
Deleting a record
TODO Not yet implemented
To delete a record, invoke
nrp-cmd delete record <pid>
where pid is the record identifier in any of the supported formats, as defined above.
Validating a record
TODO Not yet implemented
To validate a record, invoke
nrp-cmd validate record <pid>
The command will perform a validation of the record and output the validation errors to stdout. If the record is invalid, the command will return a non-zero exit code.
Note: the validation is performed by saving the record, so the internal revision number is increased by one. This implementation might change in the future.
Files
Listing files on a record
To list files on a record, invoke
nrp-cmd list files <pid>
where pid is the record identifier in any of the supported formats, as defined above.
You can use the @variable to get the record url from the variable.
Also, all options from the get record command are supported.
# nrp-cmd list files @blah
bucket_id: 73c3d9ec-e4f2-4109-9212-de9e85ecd302
checksum: md5:d70b90d02e92e8ccb83100a324ac0f29
created: '2024-03-14T13:50:24.671439+00:00'
file_id: aae1b651-2245-4853-bfb5-03f42fd41703
key: p8.toml
links:
commit: https://127.0.0.1:5000/api/docs/z2rrt-pz252/draft/files/p8.toml/commit
content: https://127.0.0.1:5000/api/docs/z2rrt-pz252/draft/files/p8.toml/content
self: https://127.0.0.1:5000/api/docs/z2rrt-pz252/draft/files/p8.toml
mimetype: application/octet-stream
size: 578
status: completed
storage_class: L
updated: '2024-03-14T13:50:24.804383+00:00'
version_id: 813511ba-fd02-4b13-888d-19cc145a7072
metadata:
description: project file
---
# ... more files here
Alternatively, you can use the --expand option when getting a record and then take the files section from the output:
# nrp-cmd get record @blah --files
mid: draft/documents/z2rrt-pz252
id: z2rrt-pz252
$schema: local://documents-1.0.0.json
created: '2024-03-14T13:30:01.133502+00:00'
files:
- key: p8.toml
storage_class: L
checksum: md5:d70b90d02e92e8ccb83100a324ac0f29
size: 578
created: '2024-03-14T13:50:24.671439+00:00'
updated: '2024-03-14T13:50:24.804383+00:00'
status: completed
metadata:
description: project file
mimetype: application/octet-stream
version_id: 813511ba-fd02-4b13-888d-19cc145a7072
file_id: aae1b651-2245-4853-bfb5-03f42fd41703
bucket_id: 73c3d9ec-e4f2-4109-9212-de9e85ecd302
links:
commit: https://127.0.0.1:5000/api/docs/z2rrt-pz252/draft/files/p8.toml/commit
content: https://127.0.0.1:5000/api/docs/z2rrt-pz252/draft/files/p8.toml/content
self: https://127.0.0.1:5000/api/docs/z2rrt-pz252/draft/files/p8.toml
links:
Downloading a file
To download a file, invoke
nrp-cmd download file <pid> <file-key> <file-key> ... [-o output-file]
where pid is the record identifier in any of the supported formats, as defined above, and file-key is the key of the file to download. The output-file is the path where the file will be saved.
If you do not specify the output-file, the file will be saved to the current directory with the same name as the key.
If you specify multiple file keys, the files will be downloaded to the current directory with the same names as their keys.
In this case, you might use the {var} placeholder in the output file. The placeholder will be replaced with the value from the file's metadata (for example, {title}.pdf will save the file under its title with the .pdf extension). To keep the original extension, use the {ext} placeholder.
To download all files, use * as the file-key. Do not forget to escape it in the shell.
Note: If you download all files, it is preferable to use the nrp-cmd download record ... command instead.
You will get all the metadata as well, which might be useful for provenance tracking.
Uploading a file
To upload a file, invoke
nrp-cmd upload file <pid> <file> [<metadata>]
where pid is the record identifier in any of the supported formats, as defined above.
If the metadata is not passed, the file will be uploaded with empty metadata.
Everything after this is TODO for both implementation and checking
Updating file metadata
To update file metadata, invoke
nrp-cmd update file <pid> <file-key> <metadata>
where pid is the record identifier in any of the supported formats, as defined above, file-key is the key of the file to update and metadata is either a json string beginning with { containing the metadata, or a path on the filesystem to a file containing the metadata in either json or yaml format.
Alternatively, you can use the following command:
nrp-cmd update file <pid> <file-key> --set <path>=<value> --set <path>=<value> ...
where path is the path to the metadata field and value is the new value.
Re-uploading a file
To re-upload a file, invoke
nrp-cmd replace file <pid> <file-key> <file> [<metadata>]
Deleting a file
To delete a file, invoke
nrp-cmd delete file <pid> <file-key>
Publish and edit published document
TODO Not yet implemented
Publishing a draft
To publish a draft, invoke
nrp-cmd publish draft <pid> [version]
where pid is the record identifier in any of the supported formats, as defined above.
The command will output the published record to stdout.
If you use a variable instead of the pid, the variable will be updated with the new identifier.
Note: the record is validated before publishing. If the validation fails, the command will fail and print the validation errors instead of the record.
Editing published record
To edit a published record, invoke
nrp-cmd edit record <pid>
where pid is the record identifier in any of the supported formats, as defined above.
The command will create a draft record and output it to stdout.
If you use a variable instead of the pid, the variable will be updated with the new identifier.
Requests
TODO Not yet implemented
Listing requests on a record
To list requests on a record, invoke
nrp-cmd list requests <pid>
where pid is the record identifier in any of the supported formats, as defined above.
documents_publish_draft:
type_id: documents_publish_draft
links:
actions:
create: https://127.0.0.1:5000/api/docs/b465x-wz855/draft/requests/documents_publish_draft
requests:
- id: 0c224d67-f8a1-4a51-bf15-a75bfd53ceb2
created: '2024-03-16T16:36:42.538694+00:00'
updated: '2024-03-16T16:36:42.545675+00:00'
links:
actions:
submit: https://127.0.0.1:5000/api/requests/0c224d67-f8a1-4a51-bf15-a75bfd53ceb2/actions/submit
self: https://127.0.0.1:5000/api/requests/extended/0c224d67-f8a1-4a51-bf15-a75bfd53ceb2
comments: https://127.0.0.1:5000/api/requests/extended/0c224d67-f8a1-4a51-bf15-a75bfd53ceb2/comments
timeline: https://127.0.0.1:5000/api/requests/extended/0c224d67-f8a1-4a51-bf15-a75bfd53ceb2/timeline
revision_id: 2
type: documents_publish_draft
title: ''
number: '2'
status: created
is_closed: false
is_open: false
expires_at: null
is_expired: false
created_by:
user: '1'
receiver:
group: curator
topic:
documents_draft: b465x-wz855
Creating a request
To create a request, invoke
nrp-cmd create request <pid> <type> <metadata> [--submit] [@variable]
where pid is the record identifier in any of the supported formats, as defined above, type is the type of the request and metadata is either a json string beginning with { containing the metadata, or a path on the filesystem to a file containing the metadata in either json or yaml format.
You can store the identifier of the created request in a variable by appending @variable to the command.
It will store <pid>/<request_id> in the variable, so later on you can use this single variable instead of pid and request_id.
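Since the stored value combines both identifiers, it can be split back into its two parts when needed. The sketch below assumes that the request id itself contains no slash, which holds for the UUID-style ids shown in the listing above; the function name split_request_reference is hypothetical.

```python
def split_request_reference(reference: str) -> tuple[str, str]:
    """Split "<pid>/<request_id>" into its two parts."""
    # Split at the last slash because the pid may itself be a URL.
    pid, sep, request_id = reference.rpartition("/")
    if not sep or not pid or not request_id:
        raise ValueError(f"expected '<pid>/<request_id>', got {reference!r}")
    return pid, request_id
```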
If you specify the --submit
option, the request will be submitted immediately after creation.
Getting a request
To get a request, invoke
nrp-cmd get request <request-id>
# or
nrp-cmd get request @variable
where request-id is the identifier (url) of the request.
The command will output the request to stdout.
The command takes output file and format options as well; see the get record command for details.
Updating a request with new metadata
If the request has not been submitted, you can update the metadata of the request by invoking
nrp-cmd update request <request-id> <metadata>
# or
nrp-cmd update request @variable <metadata>
where request-id is the identifier of the request and metadata is either a json string beginning with { containing the metadata, or a path on the filesystem to a file containing the metadata in either json or yaml format.
You might also use the --set option to update the metadata:
nrp-cmd update request <request-id> --set <path>=<value> --set <path>=<value> ...
where path is the path to the metadata field and value is the new value.
Submitting a request
To submit a request, invoke
nrp-cmd submit request <pid> <request-id> [<message>]
# or
nrp-cmd submit request @variable [<message>]
where pid is the record identifier in any of the supported formats, as defined above, and request-id is the identifier of the request.
You can pass a message to the submit command, which will be stored in the request.
Cancelling a request
To cancel a request, invoke
nrp-cmd cancel request <request-id> [<message>]
# or
nrp-cmd cancel request @variable [<message>]
where request-id is the identifier of the request.
You can pass a message to the cancel command, which will be stored in the request.
Accepting a request
To accept a request, invoke
nrp-cmd accept request <request-id> [<message>]
# or
nrp-cmd accept request @variable [<message>]
where request-id is the identifier of the request.
You can pass a message to the accept command, which will be stored in the request.
Declining a request
To decline a request, invoke
nrp-cmd decline request <request-id> [<message>]
# or
nrp-cmd decline request @variable [<message>]
where request-id is the identifier of the request.
You can pass a message to the decline command, which will be stored in the request.