# Big Friendly Datastore Organic data collaboration, from the ground up. This is an experiment in APIs, data storage, aggregation and discovery. ## Developer Setup This project uses Python 3.8+. 1. Clone the repository. 2. Create and start a new virtual environment. 3. `pip install -r requirements.txt` 4. Type `make` to see a list of common developer related tasks. For instance, to run the full test suite and code checks type: `make check`. ## Core Concepts * Objects represent things and have a unique unicode name. BFG doesn't impose any further constraints on the name, except uniqueness. However, naming conventions are likely to evolve and/or be specified by mutual agreement of users working together across domains. * Namespaced tags annotate data on openly writeable objects. The combination of namespace (who) and tag (what) provide meaningful context for the data tagged onto the object. * Tag values containing data about objects are typed and can be queried via predicate based comparison operations (`and`, `or`, `not`, `<`, `>`, `=`, `<=`, `>=`, `(`, `)`), or naive string matching a la SQL `like` (and case insensitive `ilike`) pattern matching operator on string values. Queries return specified tags on matching objects. * Namespaces have admins and tags have users and readers. Admins configure the namespaces and tags belonging to their namespaces, users may annotate objects with the namespaces/tags and readers can see the namespaces/tags and their associated values. * Interrogate individual objects for readable namespaces/tags (that may match a pattern). * Events are raised when specific changes happen in the datastore. These are configured to call web-hooks so third parties can follow what's going on. The event log can be used observe how the object and associated values changed through time (i.e. versioning). * Data types understood by BFD: string, boolean, integer, floats, datetime and duration. Geospatial types may also be added soon. Blobs of arbitrary bytes may also be stored (as a URL that references raw data identified by mime-type). There is no such thing as "null". If a value isn't known, the tag is removed (but its historic presence is retained in the event log). ## Implementation * Delivered via a REST API. Query results returned as either JSON or CSV. * Admins, users and readers are expressed as a "whitelist", where an empty list means "everyone". For instance, if the readers are set to, `[]`, then everyone can see the namespace/tag. If the users are set to, `["nicholas", ]`, then only the user identified as "nicholas" can annotate with the namespace/tag. If the admins are set to, `["mary", "penelope", ]`, then only the users identified as "mary" and "penelope" may change the behaviour of the namespace and the tags contained therein. # Acknowledgements Many of the ideas found herein have evolved from those used in FluidDB by [Fluidinfo](https://fluidinfo.com/), a defunct startup project I was involved with between 2009-2012 (when it folded). Special mention to [Terry Jones](https://github.com/terrycojones) for much of the original thought behind this, and to [Nick Radcliffe](https://github.com/njr0) for subsequent stimulating exploration of the concepts involved. Why this? Why now? I find myself in need of such a data store, and since FluidDB is no more, I need to reheat the ocean with my own plastic kettle. 🤨 ## Contents ```eval_rst .. toctree:: :maxdepth: 2 contributing.md code_of_conduct.md api.md query.md architecture.md adr.md license.md authors.md ```