Database

The clld database models are declared using SQLAlchemy’s declarative extension. In particular, we follow the approach of mixins and a custom base class to provide building blocks with enough shared functionality for custom data models.

Declarative base and mixins

class clld.db.meta._Base[source]

The declarative base for all our models.

active = Column(None, Boolean(), table=None, default=ColumnDefault(True))

The active flag is meant as an easy way to mark records as obsolete or inactive, without actually deleting them. A custom Query class could then be used which filters out inactive records.
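The filtering idea can be sketched with the standard-library sqlite3 module. This is only an illustration under assumptions, not clld's Query class: the table name `language` and the helper `active_rows` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE language ("
    " pk INTEGER PRIMARY KEY,"
    " name TEXT,"
    " active BOOLEAN DEFAULT 1)"
)
conn.executemany(
    "INSERT INTO language (name, active) VALUES (?, ?)",
    [("first", 1), ("obsolete", 0), ("third", 1)],
)


def active_rows(connection):
    """Return the names of all records that are not marked inactive."""
    rows = connection.execute(
        "SELECT name FROM language WHERE active = 1 ORDER BY pk"
    )
    return [name for (name,) in rows]


# The obsolete record is still stored, it is just hidden from this query.
total = conn.execute("SELECT COUNT(*) FROM language").fetchone()[0]
visible = active_rows(conn)
```

Marking a record inactive keeps its primary key and any references to it intact, which is the main advantage over deleting the row.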

created = Column(None, DateTime(timezone=True), table=None, default=ColumnDefault(<function <lambda> at 0x3efd320>))

To allow for timestamp-based versioning - as opposed to, or in addition to, the version number approach implemented in clld.db.meta.Versioned - we store a timestamp for the creation of an object.

classmethod get(value, key=None, default=<NoDefault>, session=None)[source]

Convenience method to query a model where exactly one result is expected, e.g. to retrieve an instance by primary key or id.

Parameters:
  • value – The value used in the filter expression of the query.
  • key (str) – The key or attribute name to be used in the filter expression. If None is passed, defaults to pk if value is int otherwise to id.
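The key-defaulting rule described above can be sketched in plain Python over a list of dicts. This is a hedged illustration, not the clld implementation: the names `NoResult` and `_MISSING` are hypothetical, and the handling of a missing record (return `default` if given, otherwise raise) is an assumption.

```python
_MISSING = object()  # sentinel standing in for the <NoDefault> marker


class NoResult(Exception):
    """Hypothetical error raised when there is no unique match."""


def get(records, value, key=None, default=_MISSING):
    """Return the single record whose `key` field equals `value`."""
    if key is None:
        # Integers are treated as primary keys, anything else as ids.
        key = "pk" if isinstance(value, int) else "id"
    matches = [record for record in records if record.get(key) == value]
    if len(matches) == 1:
        return matches[0]
    if not matches and default is not _MISSING:
        return default
    raise NoResult("expected exactly one record with %s=%r" % (key, value))


records = [
    {"pk": 1, "id": "a", "name": "First"},
    {"pk": 2, "id": "b", "name": "Second"},
]
by_pk = get(records, 1)                      # key defaults to "pk"
by_id = get(records, "b")                    # key defaults to "id"
by_name = get(records, "First", key="name")  # explicit key
fallback = get(records, "zzz", default=None)
```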

history()[source]

Returns: Result proxy to iterate over previous versions of a record.

jsondata = Column(None, JSONEncodedDict(), table=None)

To allow storage of arbitrary key-value pairs with typed values, each model provides a column to store JSON encoded dicts.

classmethod mapper_name()[source]

To make implementing model class specific behavior across the technology boundary easier - e.g. specifying CSS classes - we provide a string representation of the model class.

Return type: str

pk = Column(None, Integer(), table=None, primary_key=True, nullable=False)

All our models have an integer primary key which has nothing to do with the kind of data stored in a table. ‘Natural’ candidates for primary keys should be marked with unique constraints instead. This adds flexibility when it comes to database changes.
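A small sqlite3 sketch of this design choice follows. The table and the `code` column are hypothetical; the point is that the natural key is protected by a unique constraint while the integer primary key stays stable when the natural key changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE language ("
    " pk INTEGER PRIMARY KEY,"      # surrogate key, no meaning of its own
    " code TEXT NOT NULL UNIQUE,"   # 'natural' key, enforced by a constraint
    " name TEXT)"
)
conn.execute("INSERT INTO language (code, name) VALUES ('abc', 'Example')")

# A second record with the same natural key is rejected by the constraint.
try:
    conn.execute("INSERT INTO language (code, name) VALUES ('abc', 'Copy')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Changing the natural key does not touch the primary key, so rows that
# reference this record by pk would remain valid.
pk_before = conn.execute(
    "SELECT pk FROM language WHERE code = 'abc'").fetchone()[0]
conn.execute("UPDATE language SET code = 'xyz' WHERE pk = ?", (pk_before,))
pk_after = conn.execute(
    "SELECT pk FROM language WHERE code = 'xyz'").fetchone()[0]
```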

update_jsondata(**kw)[source]

Since we use the simple JSON encoded dict recipe without mutation tracking, we provide a convenience method to update the jsondata dict.
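Why such a method is useful can be shown with a toy model. The `Record` class below is a hypothetical stand-in that, like an ORM without mutation tracking, only notices changes made by attribute assignment; it is not clld code, and the copy-then-reassign body of `update_jsondata` is an assumption about how such a helper could work.

```python
import copy


class Record:
    """Toy model: only assigning to `jsondata` marks the record dirty."""

    def __init__(self, jsondata=None):
        self._jsondata = jsondata
        self.dirty = False

    @property
    def jsondata(self):
        return self._jsondata

    @jsondata.setter
    def jsondata(self, value):
        self._jsondata = value
        self.dirty = True

    def update_jsondata(self, **kw):
        # Copy, update, and reassign so that the change is detected.
        data = copy.copy(self.jsondata) or {}
        data.update(kw)
        self.jsondata = data


record = Record({"a": 1})
record.jsondata["b"] = 2          # in-place mutation goes unnoticed
dirty_after_mutation = record.dirty
record.update_jsondata(c=3)       # reassignment is noticed
dirty_after_update = record.dirty
```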

updated = Column(None, DateTime(timezone=True), table=None, onupdate=ColumnDefault(<function <lambda> at 0x3efd410>), default=ColumnDefault(<function <lambda> at 0x3efd398>))

Timestamp for latest update of an object.

class clld.db.meta.CustomModelMixin[source]

Mixin for customized classes in our joined table inheritance scheme.

Note

With this scheme there can be only one specialized mapper class per inheritable base class.

class clld.db.models.common.IdNameDescriptionMixin[source]

Mixin for ‘visible’ objects, i.e. anything that has to be displayed (to humans or machines); in particular all Resources fall into this category.

Note

Only one of clld.db.models.common.IdNameDescriptionMixin.description or clld.db.models.common.IdNameDescriptionMixin.markup_description should be supplied, since these are used mutually exclusively.

description = Column(None, Unicode(), table=None)

A description of the object.

id = Column(None, String(), table=None)

A str identifier of an object which can be used for sorting and as part of a URL path. It should therefore be limited to characters valid in URLs, and should not contain ‘.’ or ‘/’, since these may trip up route matching.
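A check along these lines could be applied before assigning an id. The helper `is_safe_id` and its character whitelist are assumptions made for illustration; the whitelist is deliberately stricter than "any character valid in a URL".

```python
import re

# Letters, digits, underscore and hyphen only: no '.', no '/'.
_SAFE_ID = re.compile(r"[A-Za-z0-9_-]+")


def is_safe_id(identifier):
    """Return True if `identifier` is non-empty and uses only safe characters."""
    return _SAFE_ID.fullmatch(identifier) is not None


checks = {
    "wals-1a": is_safe_id("wals-1a"),
    "a.b": is_safe_id("a.b"),
    "a/b": is_safe_id("a/b"),
    "": is_safe_id(""),
}
```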

markup_description = Column(None, Unicode(), table=None)

A description of the object containing HTML markup.

name = Column(None, Unicode(), table=None)

A human readable ‘identifier’ of the object.

While the above mixin only adds columns to a model, the following mixins also add relations between models and thus have to be used in combination, tied together by naming conventions.

class clld.db.models.common.DataMixin[source]

This mixin provides a simple way to attach arbitrary key-value pairs to another model class identified by class name.

class clld.db.models.common.HasDataMixin[source]

Adds a convenience method to retrieve the key-value pairs from data as dict.

Note

It is the responsibility of the programmer to make sure conversion to a dict makes sense, i.e. the keys in data are actually unique, thus usable as dictionary keys.

datadict()[source]

Returns: dict of associated key-value pairs.

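The caveat about unique keys can be made concrete with plain Python. Both helpers below are hypothetical sketches operating on a list of (key, value) tuples rather than on model instances; the stricter variant is one possible safeguard, not something the mixin is documented to do.

```python
def datadict(pairs):
    """Naive conversion: a repeated key silently keeps only the last value."""
    return {key: value for key, value in pairs}


def checked_datadict(pairs):
    """Stricter variant: refuse to convert if a key occurs more than once."""
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError("duplicate key: %r" % (key,))
        result[key] = value
    return result


pairs = [("colour", "red"), ("size", "large"), ("colour", "blue")]
naive = datadict(pairs)  # the first 'colour' entry is lost

try:
    checked_datadict(pairs)
    duplicate_detected = False
except ValueError:
    duplicate_detected = True
```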
class clld.db.models.common.FilesMixin[source]

This mixin provides a way to associate files with instances of another model class.

Note

The file itself is not stored in the database but must be created in the filesystem, e.g. using the create method.

create(dir_, content)[source]

Write content to a file using dir_ as file-system directory.

Returns: File-system path of the file that was created.

mime_type = Column(None, String(), table=None)

Mime-type of the file content.

ord = Column(None, Integer(), table=None, default=ColumnDefault(1))

Ordinal to control sorting of files associated with one db object.

relpath[source]

OS file path of the file relative to the application’s file-system directory.

class clld.db.models.common.HasFilesMixin[source]

Mixin for model classes which may have associated files.

files[source]

Returns: dict of associated files keyed by id.

Typical usage looks like this:

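from clld.db.meta import Base
from clld.db.versioned import Versioned
from clld.db.models.common import DataMixin, HasDataMixin, FilesMixin, HasFilesMixin

# Companion classes are named after the owning model, here with _data and _files suffixes.
class MyModel_data(Base, Versioned, DataMixin):
    pass

class MyModel_files(Base, Versioned, FilesMixin):
    pass

class MyModel(Base, HasDataMixin, HasFilesMixin):
    pass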

Core models

The CLLD data model includes the following entities commonly found in linguistic databases and publications:

class clld.db.models.common.Dataset(**kwargs)[source]

Each project (e.g. WALS, APiCS) is regarded as one dataset; thus, each app will have exactly one Dataset object.

get_stats(resources, **filters)[source]
Parameters:
  • resources
  • filters
Returns:

class clld.db.models.common.Language(**kwargs)[source]

Languages are the main objects of discourse. We attach a geo-coordinate to them to be able to put them on maps.

class clld.db.models.common.Parameter(**kwargs)[source]

A measurable attribute of a language.

class clld.db.models.common.ValueSet(**kwargs)[source]

The intersection of Language and Parameter.

class clld.db.models.common.Value(**kwargs)[source]

A measurement of a parameter for a particular language.
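How these four entities relate can be sketched with plain dataclasses. This is a simplified illustration, not the SQLAlchemy models: the attribute names are hypothetical, and attaching a list of values to a value set is an assumption about how measurements are grouped.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Language:
    id: str
    name: str
    latitude: Optional[float] = None   # geo-coordinate for maps
    longitude: Optional[float] = None


@dataclass
class Parameter:
    id: str
    name: str


@dataclass
class Value:
    name: str


@dataclass
class ValueSet:
    """All measurements for one (language, parameter) pair."""
    language: Language
    parameter: Parameter
    values: List[Value] = field(default_factory=list)


language = Language(id="l1", name="Example language", latitude=0.0, longitude=0.0)
parameter = Parameter(id="p1", name="Example parameter")
valueset = ValueSet(language=language, parameter=parameter)
valueset.values.append(Value(name="example value"))
```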

class clld.db.models.common.Contribution(**kwargs)[source]

A set of data contributed within the same context by the same set of contributors.

class clld.db.models.common.Contributor(**kwargs)[source]

Creator of a contribution.

last_first()[source]

Ad hoc (and possibly incorrect) way of formatting the name as “last, first”.
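One naive way to do this, and the reason it can go wrong, is sketched below. The function is a hypothetical stand-alone version that treats the final whitespace-separated token as the last name; whether the actual method works this way is not stated here.

```python
def last_first(name):
    """Format 'First Middle Last' as 'Last, First Middle' (naive heuristic)."""
    parts = name.strip().split()
    if len(parts) < 2:
        # Nothing to reorder for single-token names.
        return name.strip()
    return "{0}, {1}".format(parts[-1], " ".join(parts[:-1]))


simple = last_first("Jane Q. Doe")
single = last_first("Plato")
# Multi-word family names are split in the wrong place by this heuristic.
particle = last_first("Ludwig van Beethoven")
```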

class clld.db.models.common.Source(**kwargs)[source]

A bibliographic record, cited as source for some statement.

class clld.db.models.common.Unit(**kwargs)[source]

A linguistic unit of a language.

class clld.db.models.common.UnitParameter(**kwargs)[source]

A measurable attribute of a unit.

class clld.db.models.common.UnitValue(**kwargs)[source]

validate_parameter_pk(key, unitparameter_pk)[source]

We have to make sure that the parameter a value is tied to and the parameter a possible domainelement is tied to stay in sync.

Versioning

Versioned model objects are supported via the clld.db.versioned.Versioned mixin, implemented following the corresponding SQLAlchemy ORM Example.

Support for per-record versioning; based on an sqlalchemy recipe.
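The general idea of per-record versioning with a history table can be sketched with sqlite3. This is a simplified illustration of the technique, not the clld or SQLAlchemy code: the table names and the helpers `update_name` and `history` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE language (
    pk INTEGER PRIMARY KEY,
    name TEXT,
    version INTEGER NOT NULL DEFAULT 1
);
CREATE TABLE language_history (
    pk INTEGER,
    name TEXT,
    version INTEGER,
    PRIMARY KEY (pk, version)
);
""")


def update_name(connection, pk, new_name):
    """Archive the current row, then update it and bump its version."""
    connection.execute(
        "INSERT INTO language_history (pk, name, version) "
        "SELECT pk, name, version FROM language WHERE pk = ?",
        (pk,),
    )
    connection.execute(
        "UPDATE language SET name = ?, version = version + 1 WHERE pk = ?",
        (new_name, pk),
    )


def history(connection, pk):
    """Return previous versions of a record, oldest first."""
    return connection.execute(
        "SELECT name, version FROM language_history "
        "WHERE pk = ? ORDER BY version",
        (pk,),
    ).fetchall()


conn.execute("INSERT INTO language (name) VALUES ('draft name')")
update_name(conn, 1, "revised name")
update_name(conn, 1, "final name")
current = conn.execute(
    "SELECT name, version FROM language WHERE pk = 1").fetchone()
previous = history(conn, 1)
```

The main table always holds the latest state, so ordinary queries are unaffected, while older states can be read from the history table when needed.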

Migrations

Migrations provide a mechanism to update the database model (or the data) in a controlled and repeatable way. CLLD apps use alembic to implement migrations.