The clld database models are declared using SQLAlchemy’s declarative extension. In particular we follow the approach of mixins and custom base class, to provide building blocks with enough shared commonality for custom data models.

Declarative base and mixins

class clld.db.meta.Base(jsondata=None, **kwargs)[source]

The declarative base for all our models.

classmethod get(value, key=None, default=<NoDefault>, session=None)[source]

Convenience method to query a model where exactly one result is expected.

e.g. to retrieve an instance by primary key or id.

  • value – The value used in the filter expression of the query.
  • key (str) – The key or attribute name to be used in the filter expression. If None is passed, defaults to pk if value is int otherwise to id.
jsondata = Column(None, JSONEncodedDict(), table=None)

To allow storage of arbitrary key,value pairs with typed values, each model provides a column to store JSON encoded dicts.


Deprecated convenience function.

Use jsondata directly instead, which is guaranteed to be a dictionary.

pk = Column(None, Integer(), table=None, primary_key=True, nullable=False)

All our models have an integer primary key which has nothing to do with the kind of data stored in a table. ‘Natural’ candidates for primary keys should be marked with unique constraints instead. This adds flexibility when it comes to database changes.


Convenience function.

Since we use the simple JSON encoded dict recipe without mutation tracking, we provide a convenience method to update

class clld.db.meta.CustomModelMixin[source]

Mixin for customized classes in our joined table inheritance scheme.


With this scheme there can be only one specialized mapper class per inheritable base class.

class clld.db.models.common.IdNameDescriptionMixin[source]

Mixin for ‘visible’ objects, i.e. anything that has to be displayed.

In particular all Resources fall into this category.


Only one of clld.db.models.common.IdNameDescriptionMixin.description or clld.db.models.common.IdNameDescriptionMixin.markup_description should be supplied, since these are used mutually exclusively.

description = Column(None, Unicode(), table=None)

A description of the object.

id = Column(None, String(), table=None)

A str identifier of an object which can be used for sorting and as part of a URL path; thus should be limited to characters valid in URLs, and should not contain ‘.’ or ‘/’ since this may trip up route matching.

markup_description = Column(None, Unicode(), table=None)

A description of the object containing HTML markup.

name = Column(None, Unicode(), table=None)

A human readable ‘identifier’ of the object.

While the above mixin only adds columns to a model, the following mixins do also add relations between models, thus have to be used in combination, tied together by naming conventions.

class clld.db.models.common.DataMixin[source]

Provide a simple way to attach key-value pairs to a model class given by name.

class clld.db.models.common.HasDataMixin[source]

Adds a convenience method to retrieve the key-value pairs from data as dict.


It is the responsibility of the programmer to make sure conversion to a dict makes sense, i.e. the keys in data are actually unique, thus usable as dictionary keys.


return dict of associated key-value pairs.

class clld.db.models.common.FilesMixin[source]

This mixin provides a way to associate files with instances of another model class.


The file itself is not stored in the database but must be created in the filesystem, e.g. using the create method.

create(dir_, content)[source]

Write content to a file using dir_ as file-system directory.

Returns:File-system path of the file that was created.
mime_type = Column(None, String(), table=None)

Mime-type of the file content.

ord = Column(None, Integer(), table=None, default=ColumnDefault(1))

Ordinal to control sorting of files associated with one db object.


OS file path of the file relative to the application’s file-system dir.

class clld.db.models.common.HasFilesMixin[source]

Mixin for model classes which may have associated files.


return dict of associated files keyed by id.

Typical usage looks like

class MyModel_data(Base, Versioned, DataMixin):

class MyModel_files(Base, Versioned, FilesMixin):

class MyModel(Base, HasDataMixin, HasFilesMixin):

Core models

The CLLD data model includes the following entities commonly found in linguistic databases and publications:

class clld.db.models.common.Dataset(jsondata=None, **kwargs)[source]

Represents a database.

Each project (e.g. WALS, APiCS) is regarded as one dataset; thus, each app will have exactly one Dataset object.


primary key


date of publication




place of publication

class clld.db.models.common.Language(jsondata=None, **kwargs)[source]

Languages are the main objects of discourse.

We attach a geo-coordinate to them to be able to put them on maps.


geographical latitude in WGS84


geographical longitude in WGS84


primary key

class clld.db.models.common.Parameter(jsondata=None, **kwargs)[source]

A measurable attribute of a language.


primary key

class clld.db.models.common.ValueSet(jsondata=None, **kwargs)[source]

The intersection of Language, Parameter, and optionally Contribution.


Represents a column in a database table.


primary key


textual description of the source for the valueset

class clld.db.models.common.Value(jsondata=None, **kwargs)[source]

A measurement of a parameter for a particular language.


textual assessment of the reliability of the value assignment


Languages may have multiple values for the same parameter. Their relative frequency can be stored here.

class clld.db.models.common.Contribution(jsondata=None, **kwargs)[source]

A set of data contributed within the same context by the same contributors.


primary key

class clld.db.models.common.Contributor(jsondata=None, **kwargs)[source]

Creator of a contribution.


primary key

class clld.db.models.common.Source(jsondata=None, **kwargs)[source]

A bibliographic record, cited as source for some statement.


primary key

class clld.db.models.common.Unit(jsondata=None, **kwargs)[source]

A linguistic unit of a language.


primary key

class clld.db.models.common.UnitParameter(jsondata=None, **kwargs)[source]

A measurable attribute of a unit.


primary key

class clld.db.models.common.UnitValue(jsondata=None, **kwargs)[source]

primary key

validate_parameter_pk(key, unitparameter_pk)[source]

Validator to sync related parameter.

We have to make sure, the parameter a value is tied to and the parameter a possible domainelement is tied to stay in sync.