empd_admin.common module

Data

DATADIR

Path to the local directory of the cloned EMPD2/EMPD-data repository.

DATA_LOCKFILE

Lock file to lock the repository DATADIR.

NUMERIC_COLS

Columns in the EMPD-data metadata sheet that hold numeric values

Functions

dump_empd_meta(meta[, fname])

Dump the EMPD meta data to a file

get_empd_master_repo()

Get the repository to the data directory and download it if necessary

get_psql_scripts()

The path to the postgres scripts in the EMPD-data repository

get_test_dir()

The path to the tests directory in the data directory

lock_empd_master()

Lock the data repository

read_empd_meta([fname, addokexcept])

Read an EMPD-data metadata file into a pandas DataFrame

wait_for_empd_master([timeout])

Wait until the data repository is available

empd_admin.common.DATADIR = '/opt/empd-data'

Path to the local directory of the cloned EMPD2/EMPD-data repository. The path can be set through the EMPDDATA environment variable. Otherwise, it is assumed to be '$HOME/.local/share/EMPD-data'.
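The lookup order described for DATADIR can be sketched in a few lines. This is only an illustration: resolve_datadir is a hypothetical helper that is not part of empd_admin, and the module itself may compute the default differently.

```python
import os
import os.path as osp


def resolve_datadir(environ=None):
    """Hypothetical helper that mimics the documented lookup for DATADIR."""
    environ = os.environ if environ is None else environ
    # Documented fallback: a directory below the user's home directory
    default = osp.join(osp.expanduser('~'), '.local', 'share', 'EMPD-data')
    # The EMPDDATA environment variable takes precedence when it is set
    return environ.get('EMPDDATA', default)
```

Passing a plain dict instead of os.environ, e.g. resolve_datadir({'EMPDDATA': '/opt/empd-data'}), makes the behaviour easy to check without touching the real environment.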

empd_admin.common.DATA_LOCKFILE = '/var/lib/postgresql/cloning_master.lock'

Lock file to lock the repository DATADIR. By default, this is at '$HOME/cloning_master.lock' and is used by the lock_empd_master() and wait_for_empd_master() functions.
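To illustrate how a lock file can guard a directory, here is a minimal sketch using only the standard library. It is an assumption about the general technique, not the implementation behind lock_empd_master(); the name lock_directory is hypothetical.

```python
import contextlib
import os


@contextlib.contextmanager
def lock_directory(lockfile):
    """Hypothetical stand-in for a lock-file based context manager."""
    # O_EXCL makes the creation fail with FileExistsError if the lock file
    # already exists, so only one process can hold the lock at a time
    fd = os.open(lockfile, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        yield lockfile
    finally:
        # Always release the lock, even if the guarded code raised
        os.close(fd)
        os.remove(lockfile)
```

The lock file only exists while the with block is running, so other processes can test for its presence to find out whether the directory is in use.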

empd_admin.common.NUMERIC_COLS = ['Latitude', 'Longitude', 'Elevation', 'AreaOfSite', 'AgeBP', 'count', 'percentage']

Columns in the EMPD-data metadata sheet that hold numeric values

empd_admin.common.dump_empd_meta(meta, fname=None, **kwargs)

Dump the EMPD meta data to a file

This function dumps the meta data of the EMPD to a file with some standard formatting.

Parameters
  • meta (pandas.DataFrame) – The dataframe holding the meta data (see read_empd_meta())

  • fname (str) – The name of the file to save the meta data to (see pandas.DataFrame.to_csv())

  • **kwargs – Any other argument that is passed to the pandas.DataFrame.to_csv() function

Examples

Read the EMPD meta file and dump it again:

from empd_admin.common import read_empd_meta, dump_empd_meta
meta = read_empd_meta('EMPD-data/meta.tsv')
dump_empd_meta(meta, 'EMPD-data/meta.tsv')

empd_admin.common.get_empd_master_repo()

Get the repository to the data directory and download it if necessary

This function returns a git.Repo instance for the DATADIR.

Returns

The local repository of the EMPD-data. If necessary, it has been cloned from https://github.com/EMPD2/EMPD-data.git.

Return type

git.Repo

empd_admin.common.get_psql_scripts()

The path to the postgres scripts in the EMPD-data repository

Returns

The path to the postgres scripts of the data repository

Return type

str

See also

get_empd_master_repo()

To get the data repository

empd_admin.common.get_test_dir()

The path to the tests directory in the data directory

Returns

The path to the tests of the data repository

Return type

str

See also

get_empd_master_repo()

To get the data repository

empd_admin.common.lock_empd_master()

Lock the data repository

This locks the data repository and blocks any access to it. The locking is done through a lock file (usually '$HOME/cloning_master.lock', see DATA_LOCKFILE).

Use this function as a context manager, for example:

with lock_empd_master():
    # now the repository is locked
    do_something()
# now it is not locked anymore

empd_admin.common.read_empd_meta(fname=None, addokexcept=True)

Read an EMPD-data metadata file into a pandas DataFrame

This function is the same as pandas.read_csv() but it also ensures the correct dtype for the various columns.

Parameters

fname (str) – The path to the (tab-delimited) meta data file. If None, it will default to the meta data in the DATADIR, i.e. DATADIR + '/meta.tsv'

Returns

The given fname as a data frame. The index column will be the SampleName column in fname.

Return type

pandas.DataFrame

Examples

Read the meta data of the EMPD-data repository:

import git
from empd_admin.common import read_empd_meta
# clone_from needs a target path; 'EMPD-data' matches the path read below
git.Repo.clone_from('https://github.com/EMPD2/EMPD-data.git', 'EMPD-data')
meta = read_empd_meta('EMPD-data/meta.tsv')

See also

dump_empd_meta()

To save the meta data
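The dtype handling that read_empd_meta() describes can be approximated with plain pandas. The sketch below is an assumption about the general approach, not the function's actual code: read_meta_sketch is a hypothetical name, the Country column is only illustrative, and the sample row is made up.

```python
import io

import pandas as pd

# Copied from the documented value of empd_admin.common.NUMERIC_COLS
NUMERIC_COLS = ['Latitude', 'Longitude', 'Elevation', 'AreaOfSite',
                'AgeBP', 'count', 'percentage']


def read_meta_sketch(source):
    """Hypothetical reader: tab-delimited file indexed by SampleName."""
    # Read everything as text first so that no column type is guessed
    df = pd.read_csv(source, sep='\t', index_col='SampleName', dtype=str)
    # Convert only the known numeric columns; invalid entries become NaN
    for col in NUMERIC_COLS:
        if col in df.columns:
            df[col] = pd.to_numeric(df[col], errors='coerce')
    return df


# Made-up sample data for illustration only
sample = ("SampleName\tCountry\tLatitude\tElevation\n"
          "site_a1\tNorway\t60.5\t120\n")
meta = read_meta_sketch(io.StringIO(sample))
```

Reading all columns as strings and converting a known list afterwards keeps text columns untouched while still giving the numeric columns a numeric dtype.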

empd_admin.common.wait_for_empd_master(timeout=120)

Wait until the data repository is available

This convenience function makes sure that no process is locking the EMPD-data repository that is accessed through the EMPD-admin.
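Waiting for a lock file to disappear can be sketched as a simple polling loop. This is a hedged illustration rather than the code of wait_for_empd_master(): the helper name, the poll_interval parameter and the choice to raise TimeoutError when the timeout expires are all assumptions.

```python
import os
import time


def wait_for_lockfile(lockfile, timeout=120, poll_interval=0.5):
    """Hypothetical helper: block until lockfile is gone or timeout expires."""
    # A monotonic clock is not affected by changes to the system time
    deadline = time.monotonic() + timeout
    while os.path.exists(lockfile):
        if time.monotonic() >= deadline:
            # Assumed behaviour: give up loudly instead of waiting forever
            raise TimeoutError(
                'Lock file %s still exists after %s seconds'
                % (lockfile, timeout))
        time.sleep(poll_interval)
```

If the lock file does not exist, the function returns immediately, so calling it before every repository access costs almost nothing in the common case.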