Lib
Reading delimiter-separated-values dsv
Support for reading and writing delimiter-separated value files.
- clld.lib.dsv.normalize_name(s)
This function converts strings to something that can pass as a Python attribute name, for use with namedtuples.
>>> assert normalize_name('class') == 'class_'
>>> assert normalize_name('a-name') == 'a_name'
>>> assert normalize_name('a näme') == 'a_name'
>>> assert normalize_name('Name') == 'Name'
>>> assert normalize_name('') == '_'
>>> assert normalize_name('1') == '_1'
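The behaviour shown in the doctests above could be reproduced roughly as follows. This is a minimal sketch, not the clld implementation: the name normalize_name_sketch and the exact steps (Unicode decomposition, replacing other characters with underscores, keyword and leading-digit handling) are assumptions derived only from the examples.

```python
import keyword
import re
import unicodedata


def normalize_name_sketch(s):
    """Hypothetical stand-in for normalize_name, derived from its doctests."""
    # Decompose accented characters and drop what is not ASCII: 'ä' -> 'a'.
    s = unicodedata.normalize('NFKD', s).encode('ascii', 'ignore').decode('ascii')
    # Replace every run of characters that cannot appear in an identifier.
    s = re.sub(r'[^A-Za-z0-9_]+', '_', s)
    # Reserved words get a trailing underscore: 'class' -> 'class_'.
    if keyword.iskeyword(s):
        return s + '_'
    # Empty names and names starting with a digit get a leading underscore.
    if not s or s[0].isdigit():
        return '_' + s
    return s
```

Note that names with a leading underscore, such as '_1', pass as attribute names but are rejected as namedtuple field names unless rename=True is used.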
- clld.lib.dsv.reader(lines_or_file, namedtuples=False, dicts=False, encoding=u'utf8', **kw)
Parameters:
- lines_or_file – Content to be read. Either a file handle, a file path or a list of strings.
- namedtuples – Yield namedtuples.
- dicts – Yield dicts.
- encoding – Encoding of the content.
- kw – Keyword parameters are passed through to csv.reader. Note that, unlike csv.reader, delimiter defaults to a tab (‘\t’), not ‘,’.
Returns: A generator over the rows.
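The parameters above can be illustrated with a small stand-in built on the standard csv module. This is only a sketch under stated assumptions: read_rows is a hypothetical name, it accepts only an iterable of strings (not file handles or paths), it assumes a tab as the default delimiter, and it assumes that the first row supplies the field names when namedtuples or dicts are requested.

```python
import csv
from collections import namedtuple


def read_rows(lines, namedtuples=False, dicts=False, **kw):
    """Hypothetical sketch of a reader over an iterable of strings."""
    # Assumption: tab-separated input unless the caller says otherwise.
    kw.setdefault('delimiter', '\t')
    rows = csv.reader(lines, **kw)
    if not (namedtuples or dicts):
        # Plain mode: yield each row as a list of strings.
        yield from rows
        return
    # Assumption: the first row is the header.
    header = next(rows, None)
    if header is None:
        return
    if namedtuples:
        # rename=True swaps invalid field names for positional ones (_0, _1, ...).
        Row = namedtuple('Row', header, rename=True)
        for row in rows:
            yield Row(*row)
    else:
        for row in rows:
            yield dict(zip(header, row))
```

For example, list(read_rows(['id\tname', '1\tabc'], dicts=True)) gives one dict per data row, keyed by the header fields.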
iso
Functionality to gather information about ISO 639-3 codes from sil.org.
- clld.lib.iso.get(path)
Retrieve a resource from the SIL site and return its representation.
rdf
This module provides functionality for handling our data as RDF.
- class clld.lib.rdf.ClldGraph(*args, **kw)
Augments the standard rdflib.Graph by making sure our standard namespace prefixes are always bound.
bibtex
Functionality to handle bibliographical data in the BibTeX format.
- class clld.lib.bibtex.Database(records)
A class to handle BibTeX databases, i.e. a container class for Record instances.
- class clld.lib.bibtex.EntryType
- article
  An article from a journal or magazine.
  Required fields: author, title, journal, year.
  Optional fields: volume, number, pages, month, note, key.
- book
  A book with an explicit publisher.
  Required fields: author/editor, title, publisher, year.
  Optional fields: volume/number, series, address, edition, month, note, key.
- booklet
  A work that is printed and bound, but without a named publisher or sponsoring institution.
  Required fields: title.
  Optional fields: author, howpublished, address, month, year, note, key.
- conference
  The same as inproceedings, included for Scribe compatibility.
- inbook
  A part of a book, usually untitled. May be a chapter (or section or whatever) and/or a range of pages.
  Required fields: author/editor, title, chapter/pages, publisher, year.
  Optional fields: volume/number, series, type, address, edition, month, note, key.
- incollection
  A part of a book having its own title.
  Required fields: author, title, booktitle, publisher, year.
  Optional fields: editor, volume/number, series, type, chapter, pages, address, edition, month, note, key.
- inproceedings
  An article in a conference proceedings.
  Required fields: author, title, booktitle, year.
  Optional fields: editor, volume/number, series, pages, address, month, organization, publisher, note, key.
- manual
  Technical documentation.
  Required fields: title.
  Optional fields: author, organization, address, edition, month, year, note, key.
- mastersthesis
  A Master’s thesis.
  Required fields: author, title, school, year.
  Optional fields: type, address, month, note, key.
- misc
  For use when nothing else fits.
  Required fields: none.
  Optional fields: author, title, howpublished, month, year, note, key.
- phdthesis
  A Ph.D. thesis.
  Required fields: author, title, school, year.
  Optional fields: type, address, month, note, key.
- proceedings
  The proceedings of a conference.
  Required fields: title, year.
  Optional fields: editor, volume/number, series, address, month, publisher, organization, note, key.
- techreport
  A report published by a school or other institution, usually numbered within a series.
  Required fields: author, title, institution, year.
  Optional fields: type, number, address, month, note, key.
- unpublished
  A document having an author and title, but not formally published.
  Required fields: author, title, note.
  Optional fields: month, year, key.
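The required fields listed above lend themselves to a simple completeness check. The following is a sketch only: REQUIRED_FIELDS and missing_fields are hypothetical names and not part of clld.lib.bibtex, and a slash such as author/editor is read here as "at least one of these", which is an assumption.

```python
# Required fields per entry type, copied from the list above.
REQUIRED_FIELDS = {
    'article': ['author', 'title', 'journal', 'year'],
    'book': ['author/editor', 'title', 'publisher', 'year'],
    'booklet': ['title'],
    'inbook': ['author/editor', 'title', 'chapter/pages', 'publisher', 'year'],
    'incollection': ['author', 'title', 'booktitle', 'publisher', 'year'],
    'inproceedings': ['author', 'title', 'booktitle', 'year'],
    'manual': ['title'],
    'mastersthesis': ['author', 'title', 'school', 'year'],
    'misc': [],
    'phdthesis': ['author', 'title', 'school', 'year'],
    'proceedings': ['title', 'year'],
    'techreport': ['author', 'title', 'institution', 'year'],
    'unpublished': ['author', 'title', 'note'],
}
# 'conference' is documented as the same as 'inproceedings'.
REQUIRED_FIELDS['conference'] = REQUIRED_FIELDS['inproceedings']


def missing_fields(genre, fields):
    """Return the required field specs of `genre` that `fields` does not satisfy."""
    missing = []
    # Unknown entry types are treated like 'misc', which requires nothing.
    for spec in REQUIRED_FIELDS.get(genre, []):
        # Assumption: 'a/b' is satisfied if any one alternative has a value.
        if not any(fields.get(name) for name in spec.split('/')):
            missing.append(spec)
    return missing
```

A call such as missing_fields('article', {'author': 'x', 'title': 't'}) then reports the journal and year fields as missing.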
- class clld.lib.bibtex.Record(genre, id_, *args, **kw)
A BibTeX record is basically an ordered dict with two special properties: id and genre.
To overcome BibTeX's limitation of a single value per field, we allow fields, i.e. values of the dict, to be iterables of strings as well. To support this use case comprehensively, the various methods of retrieving values behave differently. Values are
- joined to a string in __getitem__,
- retrievable as assigned with get (i.e. only use get if you know how a value was assigned),
- retrievable as list with getall
Note
Unknown genres are converted to “misc”.
>>> r = Record('article', '1', author=['a', 'b'], editor='a and b')
>>> assert r['author'] == 'a and b'
>>> assert r.get('author') == r.getall('author')
>>> assert r['editor'] == r.get('editor')
>>> assert r.getall('editor') == ['a', 'b']
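The three access paths described above can be modelled with a small OrderedDict subclass. This is a sketch of the documented semantics only, not the clld class: RecordSketch, KNOWN_GENRES and SEP are hypothetical names, and both the ' and ' separator and the splitting of plain strings in getall are assumptions inferred from the doctest.

```python
from collections import OrderedDict

# Entry types from the EntryType list; anything else becomes 'misc'.
KNOWN_GENRES = {
    'article', 'book', 'booklet', 'conference', 'inbook', 'incollection',
    'inproceedings', 'manual', 'mastersthesis', 'misc', 'phdthesis',
    'proceedings', 'techreport', 'unpublished',
}
SEP = ' and '  # assumption: multiple values are joined BibTeX-style


class RecordSketch(OrderedDict):
    """Hypothetical model of a record with multi-valued fields."""

    def __init__(self, genre, id_, **fields):
        super().__init__(**fields)
        self.genre = genre if genre in KNOWN_GENRES else 'misc'
        self.id = id_

    def __getitem__(self, key):
        # Item access always yields a single string.
        value = super().__getitem__(key)
        return value if isinstance(value, str) else SEP.join(value)

    def get(self, key, default=None):
        # get returns the value exactly as it was assigned.
        if key in self:
            return OrderedDict.__getitem__(self, key)
        return default

    def getall(self, key):
        # getall always yields a list, splitting joined strings if needed.
        value = self.get(key)
        if value is None:
            return []
        return value.split(SEP) if isinstance(value, str) else list(value)
```

Keeping get as the "raw" accessor is what makes it unsafe unless the caller knows how a value was assigned, as the list above points out.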
- text()
Linearize the bib record according to the rules of the unified style:
Book: author. year. booktitle. (series, volume.) address: publisher.
Article: author. year. title. journal volume(issue). pages.
Incollection: author. year. title. In editor (ed.), booktitle, pages. address: publisher.
See also
http://celxj.org/downloads/UnifiedStyleSheet.pdf
https://github.com/citation-style-language/styles/blob/master/unified-style-linguistics.csl
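The Book and Article patterns listed for text() can be sketched as plain string formatting. The function names and signatures below are hypothetical and only mirror the patterns as written; how text() itself treats missing fields, multiple authors or editors is not covered here.

```python
def _sentence(part):
    """Terminate a part with exactly one full stop."""
    return str(part).rstrip('.') + '.'


def linearize_article(author, year, title, journal, volume, issue=None, pages=None):
    """Article: author. year. title. journal volume(issue). pages."""
    source = '{0} {1}'.format(journal, volume)
    if issue:
        source += '({0})'.format(issue)
    parts = [author, year, title, source]
    if pages:
        parts.append(pages)
    return ' '.join(_sentence(part) for part in parts)


def linearize_book(author, year, booktitle, address, publisher, series=None, volume=None):
    """Book: author. year. booktitle. (series, volume.) address: publisher."""
    text = ' '.join(_sentence(part) for part in [author, year, booktitle])
    if series:
        # The series note is optional and carries the volume if there is one.
        note = series if volume is None else '{0}, {1}'.format(series, volume)
        text += ' ({0}.)'.format(note)
    return text + ' {0}: {1}.'.format(address, publisher)
```

For instance, linearize_article('Doe, Jane', 2001, 'A title', 'Some Journal', 12, issue=3, pages='45-67') yields 'Doe, Jane. 2001. A title. Some Journal 12(3). 45-67.'.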
- clld.lib.bibtex.stripctrlchars(string)
Remove invalid Unicode characters (control characters).
>>> stripctrlchars(u'a\u0008\u000ba')
u'aa'
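One way to get the behaviour in the example above is a regular expression over the control-character range. This is a sketch under an assumption: the exact set of characters that stripctrlchars removes is not stated here, so strip_ctrl_chars (a hypothetical name) drops the C0 control characters and DEL but keeps tab, line feed and carriage return.

```python
import re

# Assumption: remove C0 controls and DEL, but keep \t (0x09), \n (0x0a), \r (0x0d).
_CTRL_CHARS = re.compile(r'[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]')


def strip_ctrl_chars(string):
    """Return `string` without the control characters matched above."""
    return _CTRL_CHARS.sub('', string)
```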
coins
fmpxml
Functionality to retrieve data from a FileMaker server using the ‘Custom Web Publishing with XML’ protocol.
- class clld.lib.fmpxml.Client(host, db, user, password, limit=1000, cache=None, verbose=True)
Client for FileMaker’s ‘Custom Web Publishing with XML’ feature.
- clld.lib.fmpxml.normalize_markup(s)
Normalize markup in FileMaker data.
>>> assert normalize_markup('') is None
>>> assert normalize_markup('<span>bla</span>') == 'bla'
>>> s = '<span style="font-style: italic;">bla</span>'
>>> assert normalize_markup(s) == s
>>> s = '<span style="font-weight: bold;">bla</span>'
>>> assert normalize_markup(s) == s
>>> s = '<span style="font-variant: small-caps;">bla</span>'
>>> assert normalize_markup(s) == s
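The doctests above suggest that spans without meaningful styling are unwrapped, while italic, bold and small-caps spans are kept. A regex-based sketch of that reading follows; normalize_markup_sketch and KEPT_STYLES are hypothetical names, nested spans are not handled, and anything beyond the cases in the doctests is an assumption.

```python
import re

# Styles whose spans are kept, taken from the doctests above.
KEPT_STYLES = (
    'font-style: italic',
    'font-weight: bold',
    'font-variant: small-caps',
)
# Matches one non-nested span with its attributes and body.
_SPAN = re.compile(r'<span(?P<attrs>[^>]*)>(?P<body>.*?)</span>', re.DOTALL)


def normalize_markup_sketch(s):
    """Unwrap spans that carry none of the kept styles; empty input gives None."""
    if not s:
        return None

    def unwrap(match):
        if any(style in match.group('attrs') for style in KEPT_STYLES):
            return match.group(0)  # keep the styled span untouched
        return match.group('body')  # drop the tags, keep the text

    # Assumption: a result that ends up empty is also reported as None.
    return _SPAN.sub(unwrap, s) or None
```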