dbrepo package

Subpackages

Submodules

dbrepo.RestClient module

class dbrepo.RestClient.RestClient(endpoint: str = 'http://gateway-service', username: str = None, password: str = None, secure: bool = True)

Bases: object

The RestClient class for communicating with the DBRepo REST API. All parameters can also be set via environment variables, e.g. set endpoint with REST_API_ENDPOINT, username with REST_API_USERNAME, etc. Environment variables override the constructor parameters.

Parameters:
  • endpoint – The REST API endpoint. Optional. Default: “http://gateway-service”.

  • username – The REST API username. Optional.

  • password – The REST API password. Optional.

  • secure – When set to false, the requests library will not verify the authenticity of your TLS/SSL certificates (i.e. when using self-signed certificates). Default: true.
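
The override behaviour described above can be sketched as follows. This is a minimal illustration and not the library's actual implementation: the helper resolve_settings is hypothetical, and the variable names REST_API_PASSWORD and REST_API_SECURE are assumed by analogy with the two names documented above.

```python
import os


def resolve_settings(endpoint="http://gateway-service", username=None,
                     password=None, secure=True, environ=None):
    """Return constructor settings, letting environment variables win."""
    env = os.environ if environ is None else environ
    settings = {
        "endpoint": env.get("REST_API_ENDPOINT", endpoint),
        "username": env.get("REST_API_USERNAME", username),
        # Assumed name, by analogy with the two documented variables.
        "password": env.get("REST_API_PASSWORD", password),
        "secure": secure,
    }
    if "REST_API_SECURE" in env:
        # Assumption: any value other than "false" keeps verification on.
        settings["secure"] = env["REST_API_SECURE"].strip().lower() != "false"
    return settings
```

The resulting dictionary has the same keys as the constructor parameters, so it could be passed on as RestClient(**resolve_settings()).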

analyse_datatypes(file_path: str, separator: str, enum: bool = None, enum_tol: int = None, upload: bool = True) DatatypeAnalysis

Import a csv dataset from a file and analyse it for the possible enums, line encoding and column data types.

Parameters:
  • file_path – The path of the file that is imported on the storage service.

  • separator – The csv column separator.

  • enum – If set to true, enumerations should be guessed, otherwise no guessing. Optional.

  • enum_tol – The tolerance for guessing enumerations (ignored if enum=False). Optional.

  • upload – If set to true, the file from file_path will be uploaded, otherwise no upload will be performed and the file_path will be treated as S3 filename and analysed instead. Optional. Default: true.

Returns:

The determined data types, if successful.

Raises:
analyse_keys(file_path: str, separator: str, upload: bool = True) KeyAnalysis

Import a csv dataset from a file and analyse it for the possible primary key.

Parameters:
  • file_path – The path of the file that is imported on the storage service.

  • separator – The csv column separator.

  • upload – If set to true, the file from file_path will be uploaded, otherwise no upload will be performed and the file_path will be treated as S3 filename and analysed instead. Optional. Default: true.

Returns:

The determined ranking of the primary key candidates, if successful.

Raises:
analyse_table_statistics(database_id: int, table_id: int) TableStatistics

Analyses the numerical contents of a table in a database with given database id and table id.

Parameters:
  • database_id – The database id.

  • table_id – The table id.

Returns:

The table statistics, if successful.

Raises:
check_database_access(database_id: int) bool

Checks access to a database with given database id.

Parameters:

database_id – The database id.

Returns:

The access type, if successful.

Raises:
create_container(name: str, host: str, image_id: int, sidecar_host: str, sidecar_port: int, privileged_username: str, privileged_password: str, port: int = None, ui_host: str = None, ui_port: int = None) Container

Register a container instance executing a given container image. Note that this does not create a container, but only saves it in the metadata database to be used within DBRepo. The container still needs to be created through e.g. docker run image:tag -d.

Parameters:
  • name – The container name.

  • host – The container hostname.

  • image_id – The container image id.

  • sidecar_host – The container sidecar hostname.

  • sidecar_port – The container sidecar port.

  • privileged_username – The container privileged user username.

  • privileged_password – The container privileged user password.

  • port – The container port bound to the host. Optional.

  • ui_host – The container hostname displayed in the user interface. Optional. Default: value of host

  • ui_port – The container port displayed in the user interface. Optional. Default: default_port of image.

Returns:

The container, if successful.

Raises:
create_database(name: str, container_id: int, is_public: bool) Database

Create a database in a container with given container id.

Parameters:
  • name – The name of the database.

  • container_id – The container id.

  • is_public – The visibility of the database. If set to true everything will be visible, otherwise only the metadata (schema, identifiers) will be visible to the public.

Returns:

The database, if successful.

Raises:
create_database_access(database_id: int, user_id: str, type: AccessType) AccessType

Create access to a database with given database id and user id.

Parameters:
  • database_id – The database id.

  • user_id – The user id.

  • type – The access type.

Returns:

The access type, if successful.

Raises:
create_identifier(database_id: int, type: IdentifierType, titles: List[CreateIdentifierTitle], publisher: str, creators: List[CreateIdentifierCreator], publication_year: int, descriptions: List[CreateIdentifierDescription] = None, funders: List[CreateIdentifierFunder] = None, licenses: List[License] = None, language: Language = None, subset_id: int = None, view_id: int = None, table_id: int = None, publication_day: int = None, publication_month: int = None, related_identifiers: List[CreateRelatedIdentifier] = None) Identifier

Create an identifier draft.

Parameters:
  • database_id – The database id of the created identifier.

  • type – The type of the created identifier.

  • titles – The titles of the created identifier.

  • publisher – The publisher of the created identifier.

  • creators – The creator(s) of the created identifier.

  • publication_year – The publication year of the created identifier.

  • descriptions – The description(s) of the created identifier. Optional.

  • funders – The funder(s) of the created identifier. Optional.

  • licenses – The license(s) of the created identifier. Optional.

  • language – The language of the created identifier. Optional.

  • subset_id – The subset id of the created identifier. Required when type=SUBSET, otherwise invalid. Optional.

  • view_id – The view id of the created identifier. Required when type=VIEW, otherwise invalid. Optional.

  • table_id – The table id of the created identifier. Required when type=TABLE, otherwise invalid. Optional.

  • publication_day – The publication day of the created identifier. Optional.

  • publication_month – The publication month of the created identifier. Optional.

  • related_identifiers – The related identifier(s) of the created identifier. Optional.

Returns:

The identifier, if successful.

Raises:
  • MalformedError – If the payload is rejected by the service.

  • ForbiddenError – If something went wrong with the authorization.

  • NotExistsError – If the database, table/view/subset or user does not exist.

  • ServiceConnectionError – If something went wrong with connection to the search service.

  • ServiceError – If something went wrong with obtaining the information in the search service.

  • ResponseCodeError – If something went wrong with the creation of the identifier.
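
The rule that subset_id, view_id and table_id are each required for their own type and invalid otherwise can be checked before a request is sent. The sketch below is an illustration only: check_identifier_target is a hypothetical helper, and it takes the type as a plain string instead of an IdentifierType value, whose members are not listed here.

```python
def check_identifier_target(type_name, subset_id=None, view_id=None, table_id=None):
    """Raise ValueError unless only the id matching type_name is set."""
    given = {"SUBSET": subset_id, "VIEW": view_id, "TABLE": table_id}
    for name, value in given.items():
        if name == type_name and value is None:
            # The id that belongs to the chosen type is missing.
            raise ValueError(f"{name.lower()}_id is required when type={name}")
        if name != type_name and value is not None:
            # An id was given that belongs to a different type.
            raise ValueError(f"{name.lower()}_id is invalid when type={type_name}")
```

A type name other than SUBSET, VIEW or TABLE passes only when none of the three ids is set.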

create_subset(database_id: int, query: str, page: int = 0, size: int = 10, timestamp: datetime = None, df: bool = False) Result | DataFrame

Executes a SQL query in a database with given database id, where the current user has at least read access. The result set can be paginated by setting both page and size. Historic data can be queried by setting timestamp.

Parameters:
  • database_id – The database id.

  • query – The query statement.

  • page – The result pagination number. Optional. Default: 0.

  • size – The result pagination size. Optional. Default: 10.

  • timestamp – The timestamp at which the data validity is set. Optional. Default: <current timestamp>.

  • df – If true, the result is returned as Pandas DataFrame. Optional. Default: False.

Returns:

The result set, if successful.

Raises:
create_table(database_id: int, name: str, columns: List[CreateTableColumn], constraints: CreateTableConstraints, description: str = None) Table

Create a table in a database with given database id.

Parameters:
  • database_id – The database id.

  • name – The name of the created table.

  • constraints – The constraints of the created table.

  • columns – The columns of the created table.

  • description – The description of the created table. Optional.

Returns:

The table, if successful.

Raises:
create_table_data(database_id: int, table_id: int, data: dict) None

Insert data into a table in a database with given database id and table id.

Parameters:
  • database_id – The database id.

  • table_id – The table id.

  • data – The data dictionary to be inserted into the table, in the form column=value.

Raises:
  • MalformedError – If the payload is rejected by the service (e.g. LOB could not be imported).

  • ForbiddenError – If something went wrong with the authorization.

  • NotExistsError – If the table does not exist.

  • ServiceError – If something went wrong with obtaining the information in the metadata service.

  • ResponseCodeError – If something went wrong with the insert.
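
Since each call inserts one data dictionary, several rows can be inserted by looping over them. The helper below is a hypothetical sketch that accepts any object offering create_table_data with the signature documented above; the column names in the comment are made up for illustration.

```python
def insert_rows(client, database_id, table_id, rows):
    """Insert each row dictionary with one call; return the number inserted."""
    inserted = 0
    for row in rows:
        # Each row has the form column=value, e.g. {"id": 1, "name": "a"}.
        client.create_table_data(database_id=database_id,
                                 table_id=table_id,
                                 data=dict(row))
        inserted += 1
    return inserted
```

An error raised by the client stops the loop, so rows inserted before the failure remain in the table.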

create_user(username: str, password: str, email: str) UserBrief

Creates a new user.

Parameters:
  • username – The username of the new user. Must be unique.

  • password – The password of the new user.

  • email – The email of the new user. Must be unique.

Returns:

The user, if successful.

Raises:
create_view(database_id: int, name: str, query: str, is_public: bool) View

Create a view in a database with given database id.

Parameters:
  • database_id – The database id.

  • name – The name of the created view.

  • query – The query of the created view.

  • is_public – The visibility of the view. If set to true everything will be visible, otherwise only the metadata (schema, identifiers) will be visible to the public.

Returns:

The created view, if successful.

Raises:
delete_container(container_id: int) None

Deletes a container with given id. Note that this does not delete the container, but deletes the entry in the metadata database. The container still needs to be removed, e.g. docker container stop hash and then docker container rm hash.

Parameters:

container_id – The container id.

Raises:
delete_database_access(database_id: int, user_id: str) None

Deletes the access for a user to a database with given database id and user id.

Parameters:
  • database_id – The database id.

  • user_id – The user id.

Raises:
delete_table(database_id: int, table_id: int) None

Delete a table with given database id and table id.

Parameters:
  • database_id – The database id.

  • table_id – The table id.

Raises:
delete_table_data(database_id: int, table_id: int, keys: dict) None

Delete data in a table in a database with given database id and table id.

Parameters:
  • database_id – The database id.

  • table_id – The table id.

  • keys – The key dictionary matching the rows in the form column=value.

Raises:
delete_view(database_id: int, view_id: int) None

Deletes a view in a database with given database id and view id.

Parameters:
  • database_id – The database id.

  • view_id – The view id.

Raises:
endpoint: str = None
get_concepts() List[Concept]

Get list of concepts known to the metadata database.

Returns:

List of concepts, if successful.

get_container(container_id: int) Container

Get a container with given id.

Returns:

The container, if successful.

Raises:
get_containers() List[ContainerBrief]

Get all containers.

Returns:

List of containers, if successful.

Raises:

ResponseCodeError – If something went wrong with the retrieval.

get_database(database_id: int) Database

Get a database with given id.

Parameters:

database_id – The database id.

Returns:

The database, if successful.

Raises:
get_database_access(database_id: int) AccessType

Get access to a database with given database id.

Parameters:

database_id – The database id.

Returns:

The access type, if successful.

Raises:
get_databases() List[DatabaseBrief]

Get all databases.

Returns:

List of databases, if successful.

Raises:

ResponseCodeError – If something went wrong with the retrieval.

get_databases_count() int

Count all databases.

Returns:

Count of databases, if successful.

Raises:

ResponseCodeError – If something went wrong with the retrieval.

get_identifiers(database_id: int = None, subset_id: int = None, view_id: int = None, table_id: int = None) List[Identifier] | str

Get list of identifiers, filter by the remaining optional arguments.

Parameters:
  • database_id – The database id. Optional.

  • subset_id – The subset id. Optional. Requires database_id to be set.

  • view_id – The view id. Optional. Requires database_id to be set.

  • table_id – The table id. Optional. Requires database_id to be set.

Returns:

List of identifiers, if successful.

Raises:
  • NotExistsError – If the accept header is neither application/json nor application/ld+json.

  • FormatNotAvailable – If the service could not represent the output.

  • ResponseCodeError – If something went wrong with the retrieval of the identifiers.

get_jwt_auth(username: str = None, password: str = None) JwtAuth

Obtains a JWT auth object from the auth service containing e.g. the access token and refresh token.

Parameters:
  • username – The username used to authenticate with the auth service. Optional. Default: username from the RestClient constructor.

  • password – The password used to authenticate with the auth service. Optional. Default: password from the RestClient constructor.

Returns:

JWT auth object from the auth service, if successful.

Raises:
get_licenses() List[License]

Get list of licenses allowed.

Returns:

List of licenses, if successful.

get_queries(database_id: int) List[Query]

Get queries from a database with given database id.

Parameters:

database_id – The database id.

Returns:

List of queries, if successful.

Raises:
  • ForbiddenError – If something went wrong with the authorization.

  • NotExistsError – If the database or user does not exist.

  • ServiceError – If something went wrong with obtaining the information in the data service.

  • ResponseCodeError – If something went wrong with the retrieval.

get_subset(database_id: int, subset_id: int) Query

Get query from a database with given database id and query id.

Parameters:
  • database_id – The database id.

  • subset_id – The subset id.

Returns:

The query, if successful.

Raises:
get_subset_data(database_id: int, subset_id: int, page: int = 0, size: int = 10, df: bool = False) Result | DataFrame

Re-executes a query in a database with given database id and query id.

Parameters:
  • database_id – The database id.

  • subset_id – The subset id.

  • page – The result pagination number. Optional. Default: 0.

  • size – The result pagination size. Optional. Default: 10.

  • df – If true, the result is returned as Pandas DataFrame. Optional. Default: False.

Returns:

The result set, if successful.

Raises:
  • MalformedError – If the payload is rejected by the service.

  • ForbiddenError – If something went wrong with the authorization.

  • NotExistsError – If the database, query or user does not exist.

  • ServiceError – If something went wrong with obtaining the information in the data service.

  • ResponseCodeError – If something went wrong with the retrieval.
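
The page and size parameters can be combined with get_subset_data_count to walk through a whole result set. The sketch below is hypothetical and rests on two assumptions that are not stated above: pages are numbered from 0 (suggested by the default), and the count is the total number of rows of the subset.

```python
import math


def iter_subset_pages(client, database_id, subset_id, size=10):
    """Yield one result page at a time until the reported count is covered."""
    if size < 1:
        raise ValueError("size must be at least 1")
    # Assumption: the count is the total number of rows in the subset.
    total = client.get_subset_data_count(database_id=database_id,
                                         subset_id=subset_id)
    for page in range(math.ceil(total / size)):
        yield client.get_subset_data(database_id=database_id,
                                     subset_id=subset_id,
                                     page=page, size=size)
```

Because this is a generator, each page is only requested when the caller asks for it.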

get_subset_data_count(database_id: int, subset_id: int, page: int = 0, size: int = 10) int

Re-executes a query in a database with given database id and query id and only counts the results.

Parameters:
  • database_id – The database id.

  • subset_id – The subset id.

  • page – The result pagination number. Optional. Default: 0.

  • size – The result pagination size. Optional. Default: 10.

Returns:

The result count, if successful.

Raises:
  • MalformedError – If the payload is rejected by the service.

  • ForbiddenError – If something went wrong with the authorization.

  • NotExistsError – If the database, query or user does not exist.

  • ServiceError – If something went wrong with obtaining the information in the data service.

  • ResponseCodeError – If something went wrong with the retrieval.

get_table(database_id: int, table_id: int) Table

Get a table with given database id and table id.

Parameters:
  • database_id – The database id.

  • table_id – The table id.

Returns:

The table, if successful.

Raises:
get_table_data(database_id: int, table_id: int, page: int = 0, size: int = 10, timestamp: datetime = None, df: bool = False) Result | DataFrame

Get data of a table in a database with given database id and table id.

Parameters:
  • database_id – The database id.

  • table_id – The table id.

  • page – The result pagination number. Optional. Default: 0.

  • size – The result pagination size. Optional. Default: 10.

  • timestamp – The query execution time. Optional.

  • df – If true, the result is returned as Pandas DataFrame. Optional. Default: False.

Returns:

The result of the table query, if successful.

Raises:
get_table_data_count(database_id: int, table_id: int, page: int = 0, size: int = 10, timestamp: datetime = None) int

Get data count of a table in a database with given database id and table id.

Parameters:
  • database_id – The database id.

  • table_id – The table id.

  • page – The result pagination number. Optional. Default: 0.

  • size – The result pagination size. Optional. Default: 10.

  • timestamp – The query execution time. Optional.

Returns:

The result count of the table query, if successful.

Raises:
get_table_history(database_id: int, table_id: int, size: int = 100) Database

Get the table history of insert/delete operations.

Parameters:
  • database_id – The database id.

  • table_id – The table id.

  • size – The number of operations. Optional. Default: 100.

Raises:
get_table_metadata(database_id: int) Database

Generate metadata of all system-versioned tables in a database with given id.

Parameters:

database_id – The database id.

Raises:
get_tables(database_id: int) List[TableBrief]

Get all tables.

Parameters:

database_id – The database id.

Returns:

List of tables, if successful.

Raises:
get_units() List[Unit]

Get all units known to the metadata database.

Returns:

List of units, if successful.

Raises:

ResponseCodeError – If something went wrong with the retrieval.

get_user(user_id: str) User

Get a user with given user id.

Returns:

The user, if successful.

Raises:
get_users() List[UserBrief]

Get all users.

Returns:

List of users, if successful.

Raises:

ResponseCodeError – If something went wrong with the retrieval.

get_view(database_id: int, view_id: int) View

Get a view of a database with given database id and view id.

Parameters:
  • database_id – The database id.

  • view_id – The view id.

Returns:

The view, if successful.

Raises:
get_view_data(database_id: int, view_id: int, page: int = 0, size: int = 10, df: bool = False) Result | DataFrame

Get data of a view in a database with given database id and view id.

Parameters:
  • database_id – The database id.

  • view_id – The view id.

  • page – The result pagination number. Optional. Default: 0.

  • size – The result pagination size. Optional. Default: 10.

  • df – If true, the result is returned as Pandas DataFrame. Optional. Default: False.

Returns:

The result of the view query, if successful.

Raises:
get_view_data_count(database_id: int, view_id: int) int

Get data count of a view in a database with given database id and view id.

Parameters:
  • database_id – The database id.

  • view_id – The view id.

Returns:

The result count of the view query, if successful.

Raises:
get_views(database_id: int) List[View]

Gets views of a database with given database id.

Parameters:

database_id – The database id.

Returns:

The list of views, if successful.

Raises:
get_views_metadata(database_id: int) Database

Generate metadata of all views in a database with given id.

Parameters:

database_id – The database id.

Raises:
import_table_data(database_id: int, table_id: int, file_name_or_data_frame: str | DataFrame, separator: str = ',', quote: str = '"', skip_lines: int = 0, line_encoding: str = '\n') None

Import a csv dataset from a file into a table in a database with given database id and table id. ATTENTION: the import is column-ordering sensitive! The csv dataset must have the same columns in the same order as the target table.

Parameters:
  • database_id – The database id.

  • table_id – The table id.

  • file_name_or_data_frame – The path of the file to import on the storage service, or a pandas DataFrame.

  • separator – The csv column separator. Optional.

  • quote – The column data quotation character. Optional.

  • skip_lines – The number of lines to skip. Optional. Default: 0.

  • line_encoding – The encoding of the line termination. Optional. Default: LF (“\n”).

Raises:
  • MalformedError – If the payload is rejected by the service (e.g. LOB could not be imported).

  • ForbiddenError – If something went wrong with the authorization.

  • NotExistsError – If the table does not exist.

  • ServiceError – If something went wrong with obtaining the information in the metadata service.

  • ResponseCodeError – If something went wrong with the insert.
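
Because the import is sensitive to column order, it can help to compare the csv header with the expected table columns before importing. The helper below is a hypothetical sketch using only the Python standard library. It assumes the file starts with a header row and that the caller already knows the column names of the target table, in order.

```python
import csv


def csv_header_matches(file_path, expected_columns, separator=",", quote='"'):
    """Return True if the first csv row equals expected_columns, in order."""
    with open(file_path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter=separator, quotechar=quote)
        # An empty file yields an empty header, which matches no columns.
        header = next(reader, [])
    return header == list(expected_columns)
```

If the check passes, the file would presumably be imported with skip_lines=1 so that the header row itself is not inserted as data.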

password: str = None
publish_identifier(identifier_id: int) Identifier

Publish an identifier with given id.

Parameters:

identifier_id – The identifier id.

Returns:

The identifier, if successful.

Raises:
  • MalformedError – If the payload is rejected by the service.

  • ForbiddenError – If something went wrong with the authorization.

  • NotExistsError – If the database, table/view/subset or user does not exist.

  • ServiceConnectionError – If something went wrong with connection to the search service.

  • ServiceError – If something went wrong with obtaining the information in the search service.

  • ResponseCodeError – If something went wrong with the creation of the identifier.
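
Identifiers are first created as drafts with create_identifier and only then published. The two steps can be chained as in the hypothetical sketch below, which assumes that the returned Identifier exposes its id as an attribute named id; that attribute name is not confirmed here.

```python
def create_and_publish(client, **identifier_fields):
    """Create an identifier draft, then publish it and return the result."""
    draft = client.create_identifier(**identifier_fields)
    # Assumption: the returned Identifier exposes its id as `.id`.
    return client.publish_identifier(identifier_id=draft.id)
```

The keyword arguments are passed through unchanged, so they must match the parameters of create_identifier.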

refresh_jwt_auth(refresh_token: str) JwtAuth

Refreshes a JWT auth object from the auth service containing e.g. the access token and refresh token.

Parameters:

refresh_token – The refresh token.

Returns:

JWT auth object from the auth service, if successful.

Raises:
save_identifier(identifier_id: int, database_id: int, type: IdentifierType, titles: List[CreateIdentifierTitle], publisher: str, creators: List[CreateIdentifierCreator], publication_year: int, descriptions: List[CreateIdentifierDescription] = None, funders: List[CreateIdentifierFunder] = None, licenses: List[License] = None, language: Language = None, subset_id: int = None, view_id: int = None, table_id: int = None, publication_day: int = None, publication_month: int = None, related_identifiers: List[CreateRelatedIdentifier] = None) Identifier

Save an existing identifier and update the metadata attached to it.

Parameters:
  • identifier_id – The identifier id.

  • database_id – The database id of the created identifier.

  • type – The type of the created identifier.

  • titles – The titles of the created identifier.

  • publisher – The publisher of the created identifier.

  • creators – The creator(s) of the created identifier.

  • publication_year – The publication year of the created identifier.

  • descriptions – The description(s) of the created identifier. Optional.

  • funders – The funder(s) of the created identifier. Optional.

  • licenses – The license(s) of the created identifier. Optional.

  • language – The language of the created identifier. Optional.

  • subset_id – The subset id of the created identifier. Required when type=SUBSET, otherwise invalid. Optional.

  • view_id – The view id of the created identifier. Required when type=VIEW, otherwise invalid. Optional.

  • table_id – The table id of the created identifier. Required when type=TABLE, otherwise invalid. Optional.

  • publication_day – The publication day of the created identifier. Optional.

  • publication_month – The publication month of the created identifier. Optional.

  • related_identifiers – The related identifier(s) of the created identifier. Optional.

Returns:

The identifier, if successful.

Raises:
  • MalformedError – If the payload is rejected by the service.

  • ForbiddenError – If something went wrong with the authorization.

  • NotExistsError – If the database, table/view/subset or user does not exist.

  • ServiceConnectionError – If something went wrong with connection to the search service.

  • ServiceError – If something went wrong with obtaining the information in the search service.

  • ResponseCodeError – If something went wrong with the creation of the identifier.

secure: bool = None
update_database_access(database_id: int, user_id: str, type: AccessType) AccessType

Updates the access for a user to a database with given database id and user id.

Parameters:
  • database_id – The database id.

  • user_id – The user id.

  • type – The access type.

Returns:

The access type, if successful.

Raises:
update_database_owner(database_id: int, user_id: str) Database

Updates the database owner of a database with given database id.

Parameters:
  • database_id – The database id.

  • user_id – The user id of the new owner.

Returns:

The database, if successful.

Raises:
update_database_schema(database_id: int) Database

Updates the database table and view metadata of a database with given database id.

Parameters:

database_id – The database id.

Returns:

The updated database, if successful.

Raises:
update_database_visibility(database_id: int, is_public: bool) Database

Updates the database visibility of a database with given database id.

Parameters:
  • database_id – The database id.

  • is_public – The visibility of the database. If set to true everything will be visible, otherwise only the metadata (schema, identifiers) will be visible to the public.

Returns:

The database, if successful.

Raises:
update_subset(database_id: int, subset_id: int, persist: bool) Query

Save query or mark it for deletion (at a later time) in a database with given database id and query id.

Parameters:
  • database_id – The database id.

  • subset_id – The subset id.

  • persist – If set to true, the query will be saved and visible in the user interface, otherwise the query is marked for deletion in the future and not visible in the user interface.

Returns:

The query, if successful.

Raises:
update_table_column(database_id: int, table_id: int, column_id: int, concept_uri: str = None, unit_uri: str = None) Column

Update semantic information of a table column by given database id and table id and column id.

Parameters:
  • database_id – The database id.

  • table_id – The table id.

  • column_id – The column id.

  • concept_uri – The concept URI. Optional.

  • unit_uri – The unit URI. Optional.

Returns:

The column, if successful.

Raises:
  • MalformedError – If the payload is rejected by the service.

  • ForbiddenError – If something went wrong with the authorization.

  • NotExistsError – If the accept header is neither application/json nor application/ld+json.

  • ServiceConnectionError – If something went wrong with connection to the search service.

  • ServiceError – If something went wrong with obtaining the information in the search service.

  • ResponseCodeError – If something went wrong with the retrieval of the identifiers.

update_table_data(database_id: int, table_id: int, data: dict, keys: dict) None

Update data in a table in a database with given database id and table id.

Parameters:
  • database_id – The database id.

  • table_id – The table id.

  • data – The data dictionary with the updated values, in the form column=value.

  • keys – The key dictionary matching the rows in the form column=value.

Raises:
  • MalformedError – If the payload is rejected by the service (e.g. LOB data could not be imported).

  • ForbiddenError – If something went wrong with the authorization.

  • NotExistsError – If the table does not exist.

  • ServiceError – If something went wrong with obtaining the information in the metadata service.

  • ResponseCodeError – If something went wrong with the update.
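
An update needs two dictionaries: keys to select the rows and data for the new values. When a full row is at hand, it can be split into the two as in the hypothetical sketch below. Leaving the key columns out of data is an assumption made for this example, not documented behaviour.

```python
def split_row(row, key_columns):
    """Split a row dictionary into (keys, data) for update_table_data."""
    missing = [column for column in key_columns if column not in row]
    if missing:
        raise KeyError(f"row lacks key column(s): {missing}")
    # keys selects the rows to change, data holds the remaining columns.
    keys = {column: row[column] for column in key_columns}
    data = {column: value for column, value in row.items() if column not in keys}
    return keys, data
```

The result could then be passed on as update_table_data(database_id, table_id, data=data, keys=keys).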

update_user(user_id: str, theme: str, language: str, firstname: str = None, lastname: str = None, affiliation: str = None, orcid: str = None) User

Updates a user with given user id.

Parameters:
  • user_id – The user id of the user that should be updated.

  • theme – The user theme. One of “light”, “dark”, “light-contrast”, “dark-contrast”.

  • language – The user language localization. One of “en”, “de”.

  • firstname – The updated given name. Optional.

  • lastname – The updated family name. Optional.

  • affiliation – The updated affiliation identifier. Optional.

  • orcid – The updated ORCID identifier. Optional.

Returns:

The user, if successful.

Raises:
update_user_password(user_id: str, password: str) User

Updates the password of a user with given user id.

Parameters:
  • user_id – The user id of the user that should be updated.

  • password – The updated user password.

Returns:

The user, if successful.

Raises:
upload(file_path: str) str

Uploads a file located at file_path to the Upload Service.

Parameters:

file_path – The location of the file on the local filesystem.

Returns:

Filename on the S3 backend of the Upload Service, if successful.
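
The returned filename can be reused with upload=False in analyse_datatypes and analyse_keys, so that one file is uploaded once and analysed twice. The helper below is a hypothetical sketch that works with any object offering these three methods with the signatures documented above.

```python
def upload_and_analyse(client, file_path, separator=","):
    """Upload a csv file once and run both analyses on the stored copy."""
    stored_name = client.upload(file_path)
    # upload=False makes the analyses treat the name as an S3 filename.
    datatypes = client.analyse_datatypes(stored_name, separator, upload=False)
    keys = client.analyse_keys(stored_name, separator, upload=False)
    return stored_name, datatypes, keys
```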

username: str = None
whoami() str | None

Returns the username of the client.

Returns:

The username, if set.

Module contents