empd_admin.query module

Functions

query_meta(meta, query[, columns, count, …])

Query the metadata of a data contribution

query_samples(meta_df, query)

Query the samples based on their metadata

empd_admin.query.query_meta(meta, query, columns='notnull', count=False, output=None, commit=False, local_repo=None, distinct=False)

Query the metadata of a data contribution

This function uses query_samples() to return a subset of the EMPD metadata. The query it performs has the form:

SELECT columns FROM meta WHERE query

Parameters
  • meta (str) – The path to the metadata that shall be queried (see read_empd_meta())

  • query (str) – The WHERE clause of the SQL query

  • columns (list of str) – The columns that shall be returned. It can either be a list of columns, 'all' to return all columns, or 'notnull' (default) to return the non-empty columns

  • count (bool) – If True, return the number of valid entries per column instead of the values (i.e. SELECT COUNT(*) FROM meta WHERE query)

  • output (str) – The path where the tab-delimited result of the query is saved. If None and commit is True, the result is saved to queries/query.tsv, relative to local_repo

  • commit (bool) – If True, commit the changes in the repository local_repo

  • local_repo (str) – The path of the local EMPD-data repository. If None, it will be assumed to be the directory of the given meta.

  • distinct (list of str) – If given (the default, False, disables it), return a distinct query based on the columns listed in this parameter. For example, distinct=['Country', 'SampleContext'] results in SELECT DISTINCT ON ('Country', 'SampleContext') ...
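The count and distinct options can be illustrated with a small sketch in plain Python. This is not the code used by empd_admin: the helper names count_valid and distinct_on are hypothetical, the sample records are invented, and keeping the first row of each combination is an assumption of the sketch.

```python
def count_valid(records, columns):
    """Count the non-empty entries per column (hypothetical helper)."""
    return {
        col: sum(1 for rec in records if rec.get(col) not in (None, ""))
        for col in columns
    }


def distinct_on(records, columns):
    """Keep the first record for each unique combination of `columns`.

    Assumption: the first matching row wins.
    """
    seen = set()
    result = []
    for rec in records:
        key = tuple(rec.get(col) for col in columns)
        if key not in seen:
            seen.add(key)
            result.append(rec)
    return result


# Invented example records; only the column names come from the text above.
records = [
    {"SampleName": "sample_a", "Country": "France", "SampleContext": "forest"},
    {"SampleName": "sample_b", "Country": "France", "SampleContext": "forest"},
    {"SampleName": "sample_c", "Country": "Norway", "SampleContext": None},
]

counts = count_valid(records, ["SampleName", "Country", "SampleContext"])
unique = distinct_on(records, ["Country", "SampleContext"])
```

With these records, count_valid reports one entry fewer for SampleContext than for the other columns, and distinct_on drops sample_b because it repeats the Country and SampleContext combination of sample_a.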

Returns

  • str – The path where the query has been saved (see output and commit) or None

  • str – The result of the query as a markdown table, with at most 200 rows
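As a rough illustration of the 'notnull' column selection and of the 200-row markdown result, the following sketch works on a list of dictionaries. It is a minimal stand-in, not the implementation in empd_admin: the helpers notnull_columns and to_markdown are hypothetical, the exact table layout is an assumption, and the records are invented.

```python
def notnull_columns(records, columns):
    """Keep only the columns with at least one non-empty value."""
    return [
        col for col in columns
        if any(rec.get(col) not in (None, "") for rec in records)
    ]


def to_markdown(records, columns, max_rows=200):
    """Render records as a markdown table with at most `max_rows` rows."""
    lines = [
        "| " + " | ".join(columns) + " |",
        "| " + " | ".join("---" for _ in columns) + " |",
    ]
    for rec in records[:max_rows]:
        cells = [
            "" if rec.get(col) is None else str(rec.get(col))
            for col in columns
        ]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)


# Invented example records; SampleContext is empty everywhere.
records = [
    {"SampleName": "sample_a", "Country": "France", "SampleContext": None},
    {"SampleName": "sample_b", "Country": "Norway", "SampleContext": None},
]
all_columns = ["SampleName", "Country", "SampleContext"]

kept = notnull_columns(records, all_columns)
table = to_markdown(records, kept)
```

Here the empty SampleContext column is dropped before rendering, and any input longer than max_rows is truncated to the first 200 data rows.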

empd_admin.query.query_samples(meta_df, query)

Query the samples based on their metadata

This function saves the given meta_df to a sqlite database and queries it based on the given filter. The query it performs has the form:

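SELECT SampleName FROM meta_df WHERE query

Parameters

  • meta_df – The metadata of the samples, which is saved to the sqlite database

  • query (str) – The WHERE clause of the SQL query

Returns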

The samples that have been selected by the given query

Return type

np.ndarray
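The approach described above (store the table in sqlite, then filter it with a WHERE clause) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the function's actual code: query_sample_names is a hypothetical helper, it takes plain tuples instead of a data frame, it returns a list rather than an np.ndarray, and the rows are invented.

```python
import sqlite3


def query_sample_names(rows, columns, query):
    """Return the SampleName values of the rows matching a WHERE clause.

    `rows` is a list of tuples and `columns` the matching column names;
    both stand in for the metadata table (hypothetical helper).
    """
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        col_defs = ", ".join('"%s"' % col for col in columns)
        conn.execute("CREATE TABLE meta_df (%s)" % col_defs)
        placeholders = ", ".join("?" for _ in columns)
        conn.executemany(
            "INSERT INTO meta_df VALUES (%s)" % placeholders, rows)
        # The query string is pasted in verbatim as the WHERE clause,
        # so it must come from a trusted source.
        cursor = conn.execute(
            "SELECT SampleName FROM meta_df WHERE " + query)
        return [name for (name,) in cursor.fetchall()]
    finally:
        conn.close()


# Invented example rows; only the column names come from this page.
columns = ["SampleName", "Country", "SampleContext"]
rows = [
    ("sample_a", "France", "forest"),
    ("sample_b", "Norway", "forest"),
    ("sample_c", "France", "lake"),
]

selected = query_sample_names(
    rows, columns, "Country = 'France' AND SampleContext = 'lake'")
```

Because the filter is interpolated directly into the SQL statement, any valid sqlite expression over the column names can be used, which also means the query text has to be trusted.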