Snakemake report plugin: nanopub

Author: Christian Meesters <meesters@uni-mainz.de>

A nanopublication is a small knowledge graph snippet with metadata that is treated as an independent (scientific) publication. The information in a nanopublication can be about anything, for example a relation between a gene and a disease, or an opinion.

Nanopublications are hence established in bioinformatics and other disciplines. With nanopublications, it is possible to disseminate individual data as independent publications, with or without an accompanying research article. Furthermore, because nanopublications can be attributed and cited, they provide incentives for researchers to make their data available in standard formats that drive data accessibility and interoperability.

Installation

Install this plugin directly with pip or mamba, e.g.:

pip install snakemake-report-plugin-nanopub

Or, if you are using pixi, add the plugin to your pixi.toml. Be careful to put it under the right dependency type based on the plugin’s availability, e.g.:

snakemake-report-plugin-nanopub = "*"

Usage

In order to use the plugin, run Snakemake (>=8.5) with the corresponding value for the reporter flag:

snakemake --reporter nanopub ...

with ... being any additional arguments you want to use.

Settings

The report plugin has the following settings (which can be passed via command line, the workflow or environment variables, if provided in the respective columns):

| CLI argument | Description | Default | Choices | Required | Type |
|---|---|---|---|---|---|
| --report-nanopub-workflow-id VALUE | NanoPub ID of a workflow for which this metadata NanoPub should be published. | None | | | |
| --report-nanopub-output-path VALUE | Optional JSON output path for extracted workflow metadata. | None | | | |
| --report-nanopub-main-server VALUE | Publish to nanopub main server (defaults to test server). | False | | | |
| --report-nanopub-dry-run VALUE | Perform a dry run (do not publish the nanopub, just generate and print the nanopub content). | False | | | |

Further details

How this Plugin works

This is a reporter plugin. It publishes a nanopublication containing all metadata necessary to reproduce a given workflow. The resulting nanopublication contains the configuration and all job specifications.

The idea is to allow linking all metadata of a workflow into the Materials & Methods section of a scientific paper as a nanopublication: a worldwide accessible, persistent and unique Wikidata link. Additionally, the plugin can create a graphical representation of a knowledge graph consisting of a workflow, its input configuration, a report and input data to illustrate the work done.

Installation

Installing this plugin into your Snakemake base environment using pip or conda will ensure that the nanopub-py library is installed as a dependency as well.

Setup

The plugin will check whether a nanopub setup is present. You are advised to follow the introduction in the nanopub-py library documentation and perform the setup step described there after the installation. Basically, run

$ np setup
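A quick way to see whether a setup already exists is to look for the profile file. The sketch below assumes that `np setup` stores its profile as `profile.yml` in `~/.nanopub`; this location is an assumption based on nanopub-py defaults and is not stated on this page. `has_nanopub_profile` is a hypothetical helper and not part of the plugin, whose own check may work differently.

```python
from pathlib import Path

# Assumed default location of the nanopub-py profile; adjust if yours differs.
DEFAULT_NANOPUB_DIR = Path.home() / ".nanopub"


def has_nanopub_profile(config_dir: Path = DEFAULT_NANOPUB_DIR) -> bool:
    """Return True if a profile.yml file exists in `config_dir`."""
    return (Path(config_dir) / "profile.yml").is_file()


if __name__ == "__main__":
    if has_nanopub_profile():
        print("nanopub profile found")
    else:
        print("no nanopub profile found; run 'np setup' first")
```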

Contributions

We welcome bug reports, feature requests, and pull requests! Please report issues specific to this plugin in the plugin’s GitHub repository.

Usage

Preliminaries: Registering your Workflow as a Nanopublication

Before registering metadata for a workflow, the workflow itself ought to be registered manually using this template (https://w3id.org/np/RAOT7z3RA0XYlHIikne8rfUUYZrtHyrzXBD1HpI_GvcRk).

Registering your Workflow Metadata

To register your workflow metadata with this plugin, run Snakemake with

$ snakemake ... --reporter nanopub --report-nanopub-workflow-id <registered workflow nanopub>

The plugin will report how much of the metadata is registered:

  • the configuration and description will be registered in any case

  • to avoid hitting the nanopublication size limit, the rule information is most likely removed (it is redundant), and the job information is stripped of execution times and rule information until the size limit is met.
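The shrinking strategy described above can be illustrated with a short Python sketch. It is not the plugin's implementation: the metadata layout (the keys `config`, `rules`, `jobs`, `runtime` and `rule`), the function name `shrink_metadata` and the value of `SIZE_LIMIT` are all assumptions made for illustration.

```python
import json

# Hypothetical size limit in bytes; the real nanopublication limit may differ.
SIZE_LIMIT = 1000


def shrink_metadata(metadata: dict, limit: int = SIZE_LIMIT) -> dict:
    """Return a reduced copy of `metadata` whose JSON form fits `limit` if possible."""

    def size(obj) -> int:
        return len(json.dumps(obj).encode("utf-8"))

    # Work on a deep copy so the caller's data stays untouched.
    result = json.loads(json.dumps(metadata))
    if size(result) <= limit:
        return result

    # Step 1: drop the (redundant) rule information.
    result.pop("rules", None)
    if size(result) <= limit:
        return result

    # Step 2: strip execution times first, then rule information, job by job,
    # stopping as soon as the size limit is met. The configuration is never removed.
    for key in ("runtime", "rule"):
        for job in result.get("jobs", []):
            job.pop(key, None)
            if size(result) <= limit:
                return result
    return result
```

In this sketch the configuration always survives, which mirrors the first bullet above; the order in which job fields are removed is a design choice of the example only.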

In the end you will see a line like:

Nanopub published successfully: ('https://w3id.org/np/<nanopub ID>', 'https://test.registry.knowledgepixels.com/np/')
Report created.

You can navigate to the test registry and check your nanopub. To publish to the main server instead, add --report-nanopub-main-server. Publishing to the test server by default is a safety measure to avoid registering undesired nanopubs (e.g. accidentally, from ill-configured runs).

Optional parameters are:

  • --report-nanopub-dry-run prints the nanopublication graph information to the terminal (before shrinking) without publishing the nanopub

  • --report-nanopub-output-path writes the extracted workflow metadata to an optional JSON file
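When the plugin is called from a script, the options above can be assembled programmatically. The following is a minimal sketch, assuming that the dry-run and main-server options are given as bare flags (as in the text above) and that `snakemake` is on the PATH; `build_reporter_command` is a hypothetical helper, not part of the plugin.

```python
import shlex


def build_reporter_command(workflow_id, main_server=False, dry_run=False,
                           output_path=None, extra_args=()):
    """Assemble a Snakemake command line that uses the nanopub reporter."""
    cmd = ["snakemake", *extra_args,
           "--reporter", "nanopub",
           "--report-nanopub-workflow-id", workflow_id]
    if dry_run:
        # Only generate and print the nanopub content, do not publish.
        cmd.append("--report-nanopub-dry-run")
    if main_server:
        # Publish to the main server instead of the default test server.
        cmd.append("--report-nanopub-main-server")
    if output_path is not None:
        cmd += ["--report-nanopub-output-path", str(output_path)]
    return cmd


if __name__ == "__main__":
    # "<registered workflow nanopub>" is a placeholder, as in the text above.
    command = build_reporter_command("<registered workflow nanopub>", dry_run=True)
    print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run`. Starting with a dry run and only then switching to the main server keeps accidental publications to a minimum.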

The Command Line Tool for plotting Knowledge Graphs

The plugin offers a stand-alone command line tool, too: plot-nanopub-knowledge-graph.

When you run it, you get a graphical representation of your workflow and its inputs and outputs, like this:

Small knowledge graph of a workflow, its dataset, report and configuration.

In order to accomplish this, an uploaded report HTML (generated with Snakemake’s --report flag) can be registered with this template.

If you want to, register your data as a nanopub, too, e.g. using this template. Any other dataset template is fine and might offer a more fine-grained description than this simple one.

Then running the command line tool will yield a knowledge graph like the one shown above:

$ plot-nanopub-knowledge-graph \
--dataset-nanopub-id <dataset nanopub id> \
--workflow-nanopub-id <workflow nanopub id> \
--report-nanopub-id <report nanopub id> \
--workflow-configuration-id <configuration nanopub id> \
-o example_knowledgegraph.png

You can change

  • the output format with --format (e.g. svg, png, pdf). It defaults to the extension of the output file.

  • the line color with --line-color (in HTML format).

  • the text width with --text-width (as a number, it defaults to 60)

The *-nanopub-id parameters are mandatory and may be given with their https://w3id.org/np/ prefix.
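Because the IDs may be given with or without the prefix, a script that calls the tool can normalize them first. The following is a minimal sketch using only the options listed above; `to_nanopub_uri` and `build_plot_command` are hypothetical helpers, and the IDs in the usage part are placeholders rather than real nanopublications. Whether the tool performs the same normalization internally is not stated here.

```python
import shlex

NANOPUB_PREFIX = "https://w3id.org/np/"


def to_nanopub_uri(nanopub_id: str) -> str:
    """Return the ID with the https://w3id.org/np/ prefix, adding it if absent."""
    nanopub_id = nanopub_id.strip()
    if nanopub_id.startswith(NANOPUB_PREFIX):
        return nanopub_id
    return NANOPUB_PREFIX + nanopub_id


def build_plot_command(dataset_id, workflow_id, report_id, configuration_id,
                       output, fmt=None, line_color=None, text_width=None):
    """Assemble the argument list for plot-nanopub-knowledge-graph."""
    cmd = ["plot-nanopub-knowledge-graph",
           "--dataset-nanopub-id", to_nanopub_uri(dataset_id),
           "--workflow-nanopub-id", to_nanopub_uri(workflow_id),
           "--report-nanopub-id", to_nanopub_uri(report_id),
           "--workflow-configuration-id", to_nanopub_uri(configuration_id),
           "-o", str(output)]
    # Optional settings are only added when explicitly requested.
    if fmt is not None:
        cmd += ["--format", fmt]
    if line_color is not None:
        cmd += ["--line-color", line_color]
    if text_width is not None:
        cmd += ["--text-width", str(text_width)]
    return cmd


if __name__ == "__main__":
    # Placeholder IDs for illustration only.
    command = build_plot_command("RAdataset", "RAworkflow", "RAreport",
                                 "RAconfig", "example_knowledgegraph.png")
    print(shlex.join(command))
```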