The Organization of Digital Information Systems
"The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point." (Claude Shannon)
The Information Age has revolutionized economy, politics, and culture.
DIS (Digital Information Systems) store, transmit, and transform data with incredible efficiency.
The challenge: Make sense of vast, rapidly growing information.
DIS are digital, interconnected, and concerned with information.
We interact with DIS through apps, webpages, and programs.
(And for that, we need computers and phones)
The purpose of this treatise: Understand DIS systematically.
My hypothesis: the methods presented in this treatise can yield a 10x speedup and a 2-4x improvement in quality and value.
My goal: by making DIS simpler, more humans will be empowered to bring their best.
What is information? Why do we care about DIS?
Information = Data + Context. Understanding transforms data into information.
(From now on: data = information)
DIS only do three things: store, transmit, and transform data.
Speed, accuracy, and cost set digital systems apart from their predecessors.
Digital systems represent data with electrons, not carvings or ink.
The challenge: organize vast amounts of information. Organization is the key.
Most DIS challenges stem from understanding how parts interrelate.
Simple systems are easier to understand than complex ones.
The art of system design is the making of systems that are as simple as possible.
(The limiting factor is complexity, not energy)
The main thesis: to understand the system, focus on the data.
When designing or understanding a system, focus on the data, not The List.
(The painting, not the palette and brushes!)
The List
Why not look at the data?
(This entire treatise topples these two myths)
We can build on five pillars, each of them a practical concept that removes a major obstacle to looking at the data.
The five pillars
Pillar 1: Single representation of data
Overcomes not being able to look at and describe data in unambiguous terms.
Pillar 2: Single dataspace
Overcomes having parts of the system floating around instead of being part of one whole picture.
Pillar 3: Call and response
Overcomes the invisibility of how data is transformed inside a DIS.
Pillar 4: Logic is what happens between call and response
Overcomes doubts about the shape of the solution for a clearly specified problem.
Pillar 5: Interface is call and response
Overcomes separateness between system and user and between data and time.
Pillar 1: Single representation of data
Digital data is binary. We need to find a better way to represent it than zeroes and ones.
Introducing fourdata
A textual representation for all data.
(Why text? Because it is linear, compact and portable)
Fourdata represents four types of data:
1234
Hi
Data types can be combined and nested
And can represent data as diverse as an HTTP call
Or the state of a CPU
Or a row in a database
Or a simple web page
Fourdata can represent any conceivable data directly and without ambiguity, just using text and a few rules.
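As a rough illustration of how nested data can bottom out in a few simple types, here is a minimal Python sketch. It is not fourdata's own syntax. It assumes that, besides the number and text shown above, the two remaining types are a list and a hash (modeled here as a Python list and dict). The names `kind`, `leaves`, and the sample `http_call` are hypothetical.

```python
def kind(value):
    """Classify a value as one of four assumed types."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool):
        raise TypeError("booleans are not one of the four assumed types")
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "text"
    if isinstance(value, list):
        return "list"
    if isinstance(value, dict):
        return "hash"
    raise TypeError(f"unsupported value: {value!r}")


def leaves(value):
    """Yield every number or text found inside a nested value."""
    k = kind(value)
    if k == "list":
        for item in value:
            yield from leaves(item)
    elif k == "hash":
        for item in value.values():
            yield from leaves(item)
    else:
        yield value


# A hypothetical HTTP call expressed only with the four assumed types.
http_call = {
    "method": "GET",
    "path": "/eggs",
    "headers": {"accept": "text/plain"},
    "status": 200,
    "body": ["two", "eggs"],
}
```

However deep the nesting goes, `leaves(http_call)` only ever yields numbers and texts, which is what makes a single representation plausible.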
Pillar 2: Single dataspace
Every piece of data in our system has a path to it.
A path is a sequence of texts and numbers.
The path to eggs is breakfast 2
Paths are themselves data because they consist of numbers and texts.
Paths don't just point to data, they are the data!
Paths make places memorable, associative, even permanent.
To the left, there is context. To the right, detail.
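A minimal Python sketch of paths, under stated assumptions: list positions are counted from 1 (so that the path breakfast 2 reaches eggs), hashes are Python dicts, and the entries "toast" and "guests" are made up for illustration. The function names `to_paths` and `get` are hypothetical.

```python
def to_paths(data, prefix=()):
    """Flatten nested data into a list of paths, each ending in a value."""
    if isinstance(data, dict):
        items = data.items()
    elif isinstance(data, list):
        # Assumption: list positions are counted from 1.
        items = enumerate(data, start=1)
    else:
        # A number or a text: the path ends with the value itself.
        return [prefix + (data,)]
    paths = []
    for key, value in items:
        paths.extend(to_paths(value, prefix + (key,)))
    return paths


def get(data, path):
    """Follow a path (a sequence of texts and numbers) to its value."""
    for step in path:
        data = data[step - 1] if isinstance(data, list) else data[step]
    return data


dataspace = {"breakfast": ["toast", "eggs"], "guests": 2}
```

Here `get(dataspace, ("breakfast", 2))` returns "eggs", and `to_paths(dataspace)` lists every path with its value at the right end: context on the left, detail on the right.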
Every DIS stores its data in two primary forms: files and databases.
Both can be placed in the dataspace.
For files, put the path to the file as a hash, followed by its content:
Files can also be represented in binary format
For databases:
The dataspace is not where the data is.
The dataspace is the data.
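One way to picture files and database rows living in the same dataspace is the sketch below. It is an assumption-laden illustration, not the treatise's notation: file paths are split on "/", each segment becomes a key in a nested hash, and a table becomes a list of hashes. The names `place_file` and `place_rows`, and the sample file and table, are hypothetical.

```python
def place_file(dataspace, file_path, content):
    """Store content under nested hashes, one level per path segment."""
    segments = [s for s in file_path.split("/") if s]
    node = dataspace
    for segment in segments[:-1]:
        node = node.setdefault(segment, {})
    node[segments[-1]] = content
    return dataspace


def place_rows(dataspace, table, rows):
    """Store database rows as a list of hashes under the table name."""
    dataspace[table] = [dict(row) for row in rows]
    return dataspace


dataspace = {}
place_file(dataspace, "notes/todo.txt", "buy eggs")
place_rows(dataspace, "users", [{"id": 1, "name": "Ada"}])
```

After these two calls, the file content and the row sit side by side in one nested structure and can be reached the same way.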
Pillar 3: Call and response
The combination of a call and a response can be used to express any data transformation.
The formula of a call:
A reference to a variable:
A function call:
A database query:
An HTTP call:
An assembler instruction:
Call and response represent the dynamic nature of data while still being data.
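A minimal sketch of call and response in Python. It assumes a call is made of a destination plus an optional message, and that the response is either the value stored at the destination or the result of the logic stored there. The function `call` and the sample entries are hypothetical.

```python
def call(dataspace, destination, message=None):
    """Send a message to a destination and return the response."""
    target = dataspace[destination]
    if callable(target):
        # The destination holds logic: the response is its result.
        return target(message)
    # The destination holds a plain value: the response is that value.
    return target


dataspace = {
    "greeting": "Hi",
    "double": lambda n: n * 2,
}
```

A reference to a variable (`call(dataspace, "greeting")`) and a function call (`call(dataspace, "double", 21)`) share the same shape; only the destination differs.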
Pillar 4: Logic is what happens between call and response
Logic is how a call creates a response.
(Logic is the intentional transformation of data.)
The five elements of logic:
The first three are essential, the last two are nice to have.
A reference is the destination of a call.
References are links between parts of the dataspace.
A reference can point to a mere value:
It can also point to a call:
Resolving a reference is finding what part of the dataspace it refers to.
Here is one way to do it:
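As one possible illustration (an assumption, not necessarily the approach intended here), a reference can be resolved by searching from the most specific context outward: look for the name at the current path, then at each enclosing path, up to the top of the dataspace. The function `resolve` and the sample data are hypothetical.

```python
def resolve(dataspace, context, name):
    """Look for name at the context path, then at each enclosing path."""
    for depth in range(len(context), -1, -1):
        node = dataspace
        try:
            for step in context[:depth]:
                node = node[step]
            if isinstance(node, dict) and name in node:
                # Return the full path found, together with its value.
                return context[:depth] + (name,), node[name]
        except (KeyError, IndexError, TypeError):
            # This context path does not exist; try the enclosing one.
            continue
    raise LookupError(f"unresolved reference: {name}")


dataspace = {"price": 10, "shop": {"price": 12, "cart": {"items": 3}}}
```

From inside `("shop", "cart")`, the reference price resolves to the path shop price, because the nearest enclosing definition wins.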
A sequence is a list of calls.
The concept of sequence works at any level of abstraction.
Function, flow, procedure, operation, definition: all express the same thing, a list of calls.
A good analogy for a sequence is a recipe:
The colon (:) freezes the sequence so that it can be expanded only when it is called.
When we call a sequence, we can see its expansion:
Note that the colon (:) now contains the expansion of the calls.
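The recipe analogy can be sketched in Python. This is not the treatise's notation: it assumes that each call in the sequence receives the previous response as its message, and it records the expansion as a list of (call name, response) pairs. `make_sequence` and the recipe steps are hypothetical names.

```python
def make_sequence(*calls):
    """Freeze a list of calls; nothing runs until the sequence is called."""
    def run(message):
        expansion = []
        for step in calls:
            # Assumption: each response becomes the next call's message.
            message = step(message)
            expansion.append((step.__name__, message))
        return message, expansion
    return run


def crack_eggs(n):
    return f"{n} cracked eggs"


def whisk(text):
    return f"whisked {text}"


def fry(text):
    return f"fried {text}"


# Defining the recipe runs nothing; calling it produces the expansion.
omelette = make_sequence(crack_eggs, whisk, fry)
```

Calling `omelette(2)` returns the final response together with the step-by-step expansion, so the path from call to response stays visible.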
Conditionals let you choose between sequences based on a condition:
The simplest conditional only has one sequence, which will only be expanded if the condition is true.
An example with two sequences:
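A minimal Python sketch of both forms, assuming a sequence is a list of zero-argument calls. The function `conditional` and the sample steps are hypothetical.

```python
def conditional(condition, then_sequence, else_sequence=None):
    """Expand then_sequence if condition is true, else the alternative."""
    if condition:
        return [step() for step in then_sequence]
    if else_sequence is not None:
        return [step() for step in else_sequence]
    # One-sequence form with a false condition: nothing is expanded.
    return []


serve = [lambda: "serve"]
cook_then_serve = [lambda: "cook", lambda: "serve"]
```

With one sequence, `conditional(False, serve)` expands nothing; with two, `conditional(False, serve, cook_then_serve)` expands the second.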
Loops are conditional repetition of sequences:
This simple type of loop will likely account for at least 50% of the loops in your logic.
Loops can be used as filters:
Or as accumulators:
Recursion can be understood as loops with depth:
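The four uses of loops named above can be sketched in Python. The function names are hypothetical and the code is only an illustration of the ideas, not the treatise's notation.

```python
def loop(items, sequence):
    """Simple loop: expand the sequence once per item, collect responses."""
    return [sequence(item) for item in items]


def loop_filter(items, keep):
    """Loop as filter: keep only the items for which keep(item) is true."""
    return [item for item in items if keep(item)]


def loop_accumulate(items, combine, start):
    """Loop as accumulator: fold every item into one running result."""
    total = start
    for item in items:
        total = combine(total, item)
    return total


def deep_sum(value):
    """Recursion as a loop with depth: sum numbers at any nesting level."""
    if isinstance(value, list):
        return sum(deep_sum(item) for item in value)
    return value
```

`deep_sum` is the same loop as the accumulator, except that it calls itself whenever an item is itself a list.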
If calls can represent transformation at any level, we can go beyond declarative vs imperative.
The "what" is the interface, the "how" is its logic. Every call is declarative on the outside, imperative on the inside.
Pillar 5: Interface is call and response
Everyday notion of interface: something made for humans, with graphics.
Formal notion of interface: a boundary between two parts of the system.
A more general way to look at it: an interface is the combination of a call and its response.
Implications:
Calls are reactive.
When something changes (destination, message, logic), the response is updated.
When a part of the system changes, the system updates itself to stay in sync. This is the true meaning of reactivity.
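A tiny sketch of reactivity in Python, under stated assumptions: each derived value depends on exactly one source, and dependencies form no cycles. The class `Reactive` and its methods `put` and `derive` are hypothetical names.

```python
class Reactive:
    """A tiny store that re-runs dependent calls when a value changes."""

    def __init__(self):
        self.values = {}
        self.watchers = {}  # source destination -> list of (target, logic)

    def put(self, destination, value):
        """Store a value and update everything derived from it."""
        self.values[destination] = value
        # Re-run every call that depends on this destination.
        for target, logic in self.watchers.get(destination, []):
            self.put(target, logic(value))

    def derive(self, target, source, logic):
        """Keep target equal to logic(source value), now and after changes."""
        self.watchers.setdefault(source, []).append((target, logic))
        if source in self.values:
            self.put(target, logic(self.values[source]))


store = Reactive()
store.put("eggs", 2)
store.derive("omelettes", "eggs", lambda n: n // 2)
```

After `store.put("eggs", 6)`, the value at omelettes is recomputed without anyone asking for it: the response follows the change.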