In 2019, the Faculty of Computer Science at HSE (personally, Anton Ayzenberg and Vsevolod Chernyshev) organized the Laboratory for Applied Geometry and Topology. The main project of the laboratory was inspired by a paper by Curto and Itskov and aimed to reconstruct the topology of a room from the neuronal activity of an animal.
This task is indeed complex, substantial enough for a dedicated laboratory. The laboratory started in an educational form, hence one of the first actions taken by the organizers was to open recruitment of research interns. I was among the first interns selected. The laboratory also collaborated with a faculty at MSU, who were responsible for the experimental data.
A few words must be said about the main task.
Assume there is a genetically modified mouse (rats were not available) walking around a room with several obstacles. During her walks, place cells in her brain spike when she is in certain regions of the room. We can detect these spikes and, theoretically, map them to the coordinates of the mouse.
Assume that the regions covered by single neurons are good enough to be thought of as open disks of a fixed radius. Then, as the mouse moves and eventually covers the whole room, these regions also cover the whole room. A covering by open disks is good in the sense that all n-wise intersections of the covering sets are contractible. And the room is modeled by a closed bounded set in the plane, hence it is compact. Hence the conditions of the Čech theorem are satisfied. The covering is known from the experimental data, hence the Čech complex is also (theoretically) known. And this is the way to the declared aim.
There are two ways in which persistence comes into play:
If one wants the most reliable data about the room, one should pay attention to both. The second is much harder to measure, so the lab focused on the first.
I am not sure if there are reliable results out of this work. I know of a preprint published by Konstantin Sorokin, but I have not read it. In any case, there were obstacles along the way, in particular the computational complexity of computing persistent homology.
As a pure math student, I was mostly excluded from the work with data, or at least I did not think about it much. But I was assigned a specific problem: explore ways to reduce the dimension of the Čech complexes we work with. I was also given a hint to look at the Quillen-McCord theorem (or Quillen fiber lemma, or Quillen's theorem A for posets; I am never sure which name is best known).
In my second year of studies, I was not technically skilled enough for the task. By the end (in May '20) I thought I had made an advance, namely a persistent version of the theorem. But at the same time I became a father and started to create my department in MCCME from scratch. I had to quit the lab, I was mentally unable to continue with the theorem, and I finally stopped thinking about it.
Two and a half years later I decided I was ready to finish this job. A basic examination of the text convinced me that the strategy was correct, but the proof was actually missing. I told Anton that there were enough changes to be made to turn it into a diploma work, and started to rewrite it. This preprint is an important milestone: it is substantially the same as my graduation thesis and, to my current understanding, contains the theorem with a correct proof.
The process of the work is reflected on GitHub. I find such open development a very good practice.
The work was presented outside of the HSE Faculty of Mathematics for the first time as a poster at YTM 2023 at EPFL on July 25.
My friend has a mathematical education and is familiar with algebra, discrete optimization, and real analysis, but he has never taken a course in topology and does not work as a mathematician. I tried to sketch a path to the subject. In the end we agreed that an exercise sheet would support it a lot.
So here is this sketch of a path towards classifying spaces, extended and supplied with exercises. Let’s run.
Let’s consider a finite set \(X\) of points in the Euclidean space \(\mathbb{R}^n\). There are many closed shapes in the space such that \(X\) lies on the boundary of the shape. We want to pick one standard shape.
What could we want from such a shape? First, we want it to be constructible directly from \(X\). Second, we want a one-to-one correspondence between shapes and sets. Hence the points of \(X\) must have some special geometric property which other points of the shape do not have.
Natural choices are balls and polytopes. Balls pass the first demand but fail the second. Polytopes pass both, since they have a naturally distinguished set of points: their vertices. So let’s choose a polytope.
A polytope can be informally defined as a shape with flat sides. It means that each face of a polytope contains all geodesics between pairs of points of its boundary. In the Euclidean case, a face is the convex hull of its vertices. That is, for a set \(X = \{x_i\}\), its convex hull is \(\operatorname{Conv}(X) = \{\sum_i a_ix_i \mid \sum_i a_i = 1,\ a_i \geq 0\}\).
Exercise: play with the conditions of this definition and check that it matches the proposed informal definition with geodesics, and that all the conditions match the intuition of a face.
The convex hull of \(k+1\) points (\(k \leq n\)) is called a geometric simplex. It is a polytope. If its set of vertices does not coincide with the initial set of points, the simplex is called degenerate; otherwise we say that the simplex has dimension \(k\).
Each polytope can be decomposed into a union of convex polytopes intersecting at faces. We could admit this as a definition, but there is still something to think about. Each convex polytope is the convex hull of its vertices \(V\). Each point in the convex hull of a set \(V \subset \mathbb{R}^n\) lies in the convex hull of some subset \(W \subset V\) of cardinality at most \(n+1\), i.e. in a simplex. This statement is called Carathéodory’s theorem. It gives us a decomposition of a polytope into a union of simplices of dimension at most \(n\). This decomposition (simplicial decomposition, or triangulation) has two good properties: all faces of the simplices lie in the polytope, and the simplices intersect at faces.
Hence by triangulating a polytope we obtain a first example of a geometric simplicial complex: a union of simplices satisfying the two conditions at the end of the last paragraph.
Exercise: Details of the decomposition using Carathéodory’s theorem:
We can observe that our geometric simplicial complex is fully determined by the vertices of its faces. Hence we can abstract away from geometry here and define an abstract simplicial complex. Let \(X\) be a finite set and \(\mathcal{P}(X)\) be its powerset. Then \(S \subset \mathcal{P}(X)\) is called an abstract simplicial complex if for any \(V \in S\) and \(W \subset V\) we have \(W \in S\). Elements of \(S\) are called simplices; the number of elements in a simplex minus one is called its dimension.
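The downward-closure condition is easy to verify by brute force for small complexes. A minimal Haskell sketch; the representation of simplices as sets and all names here are my choices for illustration, not anything fixed by the text (note that we include the empty simplex, a convention that varies between authors):

```haskell
import Data.List (subsequences)
import qualified Data.Set as Set

-- An abstract simplicial complex as a set of simplices,
-- each simplex being a set of vertices.
type Simplex a = Set.Set a
type Complex a = Set.Set (Simplex a)

-- Downward closure: every subset of a simplex is again a simplex.
isComplex :: Ord a => Complex a -> Bool
isComplex s = all (\v -> all (`Set.member` s) (faces v)) (Set.toList s)
  where
    faces = map Set.fromList . subsequences . Set.toList

-- The dimension of a simplex is its cardinality minus one.
dim :: Simplex a -> Int
dim v = Set.size v - 1

main :: IO ()
main = do
  let boundaryOfTriangle = Set.fromList (map Set.fromList
        [[], [1], [2], [3], [1,2], [1,3], [2,3 :: Int]])
  print (isComplex boundaryOfTriangle)                          -- True
  print (isComplex (Set.fromList [Set.fromList [1,2 :: Int]])) -- False: an edge without its vertices
```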
We can also define a geometric realization, which maps points of \(X\) to points in a Euclidean space of appropriate dimension in such a way that each simplex maps to a geometric simplex of the same dimension. There can be plenty of geometric realizations; the most commonly referred to is the standard geometric realization, which maps simplices to regular simplices.
Finally, let \((P,\leq)\) be a finite partially ordered set (poset). We can construct an abstract simplicial complex from it by taking chains of \(n+1\) elements to be \(n\)-simplices. Note that equalities are excluded from this construction: chains are strictly increasing.
Exercise: check that it is a simplicial complex.
This complex is called the order complex of a poset. We denote its geometric realization by \(BP\).
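For a small poset, the order complex can be computed by brute force. A Haskell sketch, with the poset given by its underlying list and order relation (the function names are illustrative):

```haskell
import Data.List (subsequences)

-- The order complex of a finite poset: simplices are chains,
-- i.e. subsets totally ordered by the strict relation.
-- `le` is the partial order; elements of `xs` are assumed distinct.
orderComplex :: Eq a => [a] -> (a -> a -> Bool) -> [[a]]
orderComplex xs le = filter isChain (subsequences xs)
  where
    lt a b = le a b && a /= b
    isChain c = and [ lt a b || lt b a | (a, b) <- pairs c ]
    pairs c = [ (a, b) | (i, a) <- zip [0 :: Int ..] c
                       , (j, b) <- zip [0 ..] c, i < j ]

-- Example: the divisibility poset on {1,2,3,6}.
divides :: Int -> Int -> Bool
divides a b = b `mod` a == 0

main :: IO ()
main = mapM_ print (orderComplex [1, 2, 3, 6] divides)
```

The example yields two triangles \(\{1,2,6\}\) and \(\{1,3,6\}\) glued along the edge \(\{1,6\}\), together with all their faces.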
Any order relation satisfies two properties: \(a \leq b,\ b \leq c\) imply \(a \leq c\), and \(a \leq a\) holds for all elements of the poset.
Let’s introduce a definition: a (small) category is a set, called the set of objects, together with sets \(\operatorname{Hom}(A,B)\) (hom-sets) of arrows between objects \(A\) and \(B\) for each pair \((A,B)\). The following two properties must be satisfied: for each pair of arrows \(f : A \to B\) and \(g : B \to C\) there exists an arrow \(g \circ f : A \to C\), and for each object \(A\) there exists an arrow \(Id_A : A \to A\).
Each poset forms a category with \(P\) as the set of objects and an arrow between objects \(a\) and \(b\) present if and only if \(a \leq b\). This category has the property that each hom-set has at most one element. The order complex construction remains valid on this category simply by a change of terms: we replace chains of \(n+1\) elements with sequences of \(n\) composable non-identity morphisms. The only change is that we do not forget the orientation. The result of this general construction is called the nerve of a category.
Categories with finite sets of objects and finite sets of morphisms can be displayed as diagrams. Consider some examples (compositions and identity arrows are omitted):
All of them have more than one sequence of morphisms on the same set of objects. Hence the nerves of these categories cannot be represented as simplicial complexes. But they have geometric realizations, defined by a similar construction and comprised of geometric simplices and their continuous deformations. Objects like these are called simplicial sets. Let’s think of them exactly as nerves of categories and refer to an expository text by Greg Friedman for the formal definition and an overview.
Let’s define the geometric realization inductively, by a procedure of attaching \(n\)-simplices to simplices of smaller dimensions. Vertices define the set of geometric vertices. Assume we have a geometric realization of the subset of simplices of dimension at most \(n\). Then we can attach \((n+1)\)-simplices by deforming them and gluing them along their boundaries, which are already in the structure.
Let’s examine this with the third example drawn: we have two vertices and three segments of dimension one glued to them, exactly as the diagram shows. This drawing can be deformed into a wedge of two circles by contracting the segment in the middle.
Note that in general this realization is not a subset of any Euclidean space.
Denote a category by \(\mathcal{C}\). The geometric realization of its nerve, \(B\mathcal{C}\), is called the classifying space of the category.
Exercise: Construct a simplicial decomposition of a circle such that it has two simplices in each dimension.
Exercise: Construct a simplicial decomposition of a sphere such that it has two simplices in each dimension.
Exercise: Check that the classifying space of the second category is a simplicial decomposition of the infinite-dimensional real projective space (the space of lines through the origin in \(\mathbb{R}^{\infty}\)).
In a previous section, we noted without explanation that the geometric realization of a simplicial set is not necessarily a subset of a Euclidean space. Then what is it? Here we have to introduce topology.
Let \(X\) be a set and \(\mathcal{P}(X)\) be its powerset. We say that \(\tau \subset \mathcal{P}(X)\) is a topology if it satisfies the following properties: \(X \in \tau\); \(\emptyset \in \tau\); the union of any collection of sets from \(\tau\) lies in \(\tau\); and the intersection of any finite (why?) collection of sets from \(\tau\) lies in \(\tau\). A set \(X\) with a topology is called a topological space.
Elements of \(\tau\) are called open sets. A family \(B\) of open sets such that every open set can be represented as a union of sets from \(B\) is called a base of the topology.
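On a finite set the axioms can be checked mechanically; there it suffices to check pairwise unions and intersections, since arbitrary unions of a finite family reduce to pairwise ones by induction. A Haskell sketch (the representation choices and names are mine):

```haskell
import qualified Data.Set as Set

type Topology a = Set.Set (Set.Set a)

-- Check the axioms of a topology on a finite set `x`: the whole set and
-- the empty set are open, and the family is closed under unions and
-- intersections.  Pairwise closure is enough on a finite set.
isTopology :: Ord a => Set.Set a -> Topology a -> Bool
isTopology x tau =
     Set.member x tau
  && Set.member Set.empty tau
  && and [ Set.member (Set.union u v) tau
             && Set.member (Set.intersection u v) tau
         | u <- opens, v <- opens ]
  where opens = Set.toList tau

main :: IO ()
main = do
  let x   = Set.fromList [1, 2, 3 :: Int]
      tau = Set.fromList (map Set.fromList [[], [1], [1,2], [1,2,3]])
  print (isTopology x tau)  -- True: a chain of open sets is a topology
```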
Exercise: Check that open balls in a metric space form a base of a topology.
Analogously one can define a subbase: a collection of open sets such that they, together with their finite intersections, form a base of the topology.
We can study functions between topological spaces. We define a function to be continuous if the preimage of any open set is open.
Exercise: Check that for metric spaces this definition is equivalent to an \(\varepsilon-\delta\)-definition.
Two topological spaces \(X\) and \(Y\) are said to be homeomorphic if there exist continuous functions \(f : X \to Y\) and \(g : Y \to X\) such that \(f \circ g = Id_Y\) and \(g \circ f = Id_X\).
Exercise: Write a homeomorphism between an open unit interval and a line.
Exercise: Draw a homeomorphism between a circle and a square.
We have defined geometric simplices as subsets of \(\mathbb{R}^n\). Each of them can be turned into a topological space with the induced topology: we say that a set in a subspace is open if it is an intersection of the subspace with an open set of the ambient space.
We can now formalize the notion of a “continuous deformation of a simplex”: it is a space homeomorphic to the given simplex. For example, a closed ball of the corresponding dimension fits.
Exercise: draw such a homeomorphism for the planar case.
Let’s now play with a construction kit. We have a variety of simplices of different dimensions; we can deform them to homeomorphic spaces; and we can attach any simplex to simplices of smaller dimension by continuous attaching maps defined everywhere on its boundary. The union of all these simplices (cells) glued by the attaching maps is called a cellular complex. By construction, we see that a classifying space of a category is a cellular complex.
To make the statement that a classifying space of a category is a topological space, we have to introduce a topology on cellular complexes: a subset \(U\) is open if and only if the intersection of \(U\) with each cell is open.
Note that a cellular complex is defined up to homeomorphism by purely combinatorial data. This data is encoded in the notion of a simplicial set, similarly to how a geometric simplicial complex is encoded in its abstract counterpart.
Exercise: Provide examples of two abstract simplicial complexes with homeomorphic geometric realizations. Hint: barycentric subdivision.
We also used the word “contracting” without a proper definition. Indeed, it deserves one.
First, we have to define a product of topological spaces. As a set, it is the Cartesian product. We have to define the topology. It is natural to require the projections onto the components to be continuous, and the usual choice is to take the preimages of all open sets under these projections as a subbase of the topology. The specific topology is not important for this discussion, but we want to be able to speak about continuous functions from \(X \times [0,1]\).
Now assume we have two continuous functions \(f,g : X \to Y\). They are said to be homotopic (denoted by \(f \simeq g\)) if there exists a continuous function \(F : X \times [0,1] \to Y\) such that \(F(\_,0) = f\) and \(F(\_,1) = g\).
Exercise: Give a formula for homotopy between \(Id : [0,1] \to [0,1]\) and a constant map \(pt_0 : [0,1] \to [0,1]\).
Two spaces \(X, Y\) are said to be homotopy equivalent if there exist continuous functions \(f : X \to Y\) and \(g : Y \to X\) such that \(f \circ g \simeq Id_Y\) and \(g \circ f \simeq Id_X\).
Exercise: Prove that a real line is homotopy-equivalent to a point.
Exercise: Prove that infinite-dimensional sphere is homotopy-equivalent to a point. Instead of its cellular decomposition, it’s better to use explicit definition: \(S^{\infty} = \{(x_0,\ldots,x_i,\ldots) \in \mathbb{R}^{\infty}|\; \sum_i x_i^2 = 1\}\).
A topological space is said to be contractible if it is homotopy equivalent to a point. Thus we have a formal meaning for the operation we performed on example 3: we replaced a contractible cell by a point, with a corresponding update of the attaching maps.
As motivation, we can formulate a first theorem which essentially uses classifying spaces.
First, define an open covering: a family \(\mathcal{U}\) of open sets in \(X\) is called an open covering if \(\bigcup \mathcal{U} = X\). We can associate a poset \(P(\mathcal{U})\) with a covering by considering all possible nonempty intersections of sets from \(\mathcal{U}\), ordered by inclusion. This poset is a category, and this category has a classifying space.
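For a finite cover, the simplicial complex this construction generates (the nerve of the cover) can be computed by brute force. A Haskell sketch with finite sets standing in for open sets; the function `nerve` and the example are illustrative:

```haskell
import Data.List (subsequences)
import qualified Data.Set as Set

-- The nerve of a finite cover: a simplex (a list of indices) for every
-- subfamily of the cover with nonempty common intersection.
nerve :: Ord a => [Set.Set a] -> [[Int]]
nerve cover =
  [ ix | ix@(_ : _) <- subsequences [0 .. length cover - 1]
       , not (Set.null (foldr1 Set.intersection (map (cover !!) ix))) ]

main :: IO ()
main = do
  -- Three arcs covering a discrete "circle" of six points:
  -- consecutive arcs overlap, but no point lies in all three.
  let u0 = Set.fromList [0, 1, 2 :: Int]
      u1 = Set.fromList [2, 3, 4]
      u2 = Set.fromList [4, 5, 0]
  -- Three vertices and three edges but no 2-simplex: the nerve is the
  -- boundary of a triangle, itself a circle, matching the nerve theorem.
  print (nerve [u0, u1, u2])
```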
We say that the space is compact if every open cover has a finite subcover.
Exercise (long): prove the criterion of compactness for Euclidean spaces: a subset of a Euclidean space is compact if and only if it is closed and bounded.
Theorem (Borsuk 1948; referred to as the Čech theorem, the Alexandrov-Čech theorem, or the nerve theorem):
Let \(X\) be a compact topological space with a cover \(\mathcal{U}\) such that all nonempty intersections of sets from \(\mathcal{U}\) are contractible. Then \(BP(\mathcal{U}) \simeq X\).
Exercise: If you know some other formulation of the theorem, prove its equivalence with the given one (with the compactness requirement on both sides).
Now we know something about the structure of classifying spaces and what they are. We know a statement which essentially uses this notion (or an equally powerful one). But we can barely compute them once loops appear; the infinite-dimensional projective space is our simplest example.
We need some other technique to get our hands on some classifying spaces. Classifying spaces of categories are a generalization of a more established notion with the same name. Let’s meet it as well and hope it helps.
Every topological space \(X\) has its group \(\operatorname{End}(X)\) of invertible continuous transformations. Let \(\rho : G \to \operatorname{End}(X)\) be a homomorphism. It is called a group action.
Let’s endow \(G\) with a topology. For example, every group can be supplied with the discrete topology: we declare each subset of \(G\) to be open. Then we can fix an arbitrary point \(x \in X\) and consider its orbit under the action of \(G\), assumed continuous as a map \(G \times X \to X\). This orbit is embedded into \(X\) via a continuous map.
Exercise: If two orbits of an action have a common point, they coincide.
By the exercise, we see that being in the same orbit is an equivalence relation. Hence we can take the quotient of \(X\) by it. We have not discussed the quotient topology yet: sets that have an open preimage under the quotient map are declared to be open.
If \(X = EG\) is a contractible space with a free action of \(G\), the quotient \(BG\) is called a classifying space of the group \(G\).
Exercise: Let \(G = \mathbb{Z}_2\) and \(EG = S^{\infty}\). Describe the action \(G \hookrightarrow \operatorname{End}(EG)\) and its quotient space.
Exercise: What is the classifying space of \(\mathbb{Z}\)?
By looking at this action on a contractible space, it is easy to catch how a group can be represented as a category: we have a single (homotopically trivial) object with endomorphisms indexed by the elements of the group. By drawing it we come up with a category with one object and morphisms indexed by the elements of the group.
For instance, \(\mathbb{Z}_2\) is represented by the diagram of example 1; denote it by \(\mathcal{D}_{\mathbb{Z}_2}\). We see (given the last exercise) that \(B\mathbb{Z}_2 = B\mathcal{D}_{\mathbb{Z}_2}\).
There is a general fact that \(BG \simeq B\mathcal{D}_G\). It is approachable with the technique given in this text. Almost: up to the fact that a classifying space is unique only up to (weak) homotopy equivalence.
Last exercise: Prove, independently of this fact, that \(B\mathcal{D}_{\mathbb{Z}}\) is homotopy equivalent to a circle.
This proof may also require some methods which were not mentioned in the text.
I leave several important questions not covered for now:
The intention of the text was to run to classifying spaces and here we are, surrounded by questions.
We have only considered binary products and coproducts. We can consider a family of products of arbitrary arity, defining them as representing objects of \(\prod_{i \in I}\operatorname{Hom}(\_,A_i)\) for finite ordinals \(I\). This definition has a problem with \(I = 0\); we treat this case specially and define the empty product to be the terminal object. The universal properties of these objects can be drawn easily from the binary case.
\(Hask\) has all the finite products; for instance, the product `(a,(b,c))` satisfies the universal property. This is an instance of a general statement that a category with binary products and a terminal object has all finite products.
The usual binary product `(,)` serves as the tensor product in \((Hask, (,), ())\), hence it satisfies the associativity law and these products can be written without too many parens. Haskell has this notion: there are tuples of length bigger than 2. But each of them is implemented separately in GHC, and GHC does not allow tuples of length above some finite number.
This number is big enough for most applications, but here we only need it to notice that this is not the notion Haskell uses for the parens-free presentation of finite products. The notion used is the following:
data P3 a b c = P3 a b c
This type is naturally isomorphic to `(a,b,c)` and is an example of a product type.
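For concreteness, here is the isomorphism written out; the block redeclares P3 with explicit type parameters so it compiles standalone, and the function names are mine:

```haskell
-- Writing down the isomorphism witnesses that P3 is a product type.
data P3 a b c = P3 a b c deriving (Eq, Show)

toTuple :: P3 a b c -> (a, b, c)
toTuple (P3 a b c) = (a, b, c)

fromTuple :: (a, b, c) -> P3 a b c
fromTuple (a, b, c) = P3 a b c

-- The round trip is the identity, in both directions.
main :: IO ()
main = print (toTuple (fromTuple (1 :: Int, 'x', True)))  -- (1,'x',True)
```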
Dually, \(Hask\) has all finite coproducts (the initial object is the empty coproduct), which follows from the existence of `Either`. Again, from the monoidal structure \((Hask, Either, Void)\) we derive that the parens-free notation is legal.
In Haskell, coproducts are written as
data C3 = Cons1 a | Cons2 b | Cons3 c
This is an example of a sum type.
Now consider two types:
type Mix1 a b c = (a, Either b c)
and
type Mix2 a b c = Either (a, b) (a, c)
It’s easy to see that they are canonically isomorphic; let’s write the isomorphism down.
iso :: (Mix1 a b c -> Mix2 a b c, Mix2 a b c -> Mix1 a b c)
iso = (mix1toMix2, mix2toMix1)
mix1toMix2 :: Mix1 a b c -> Mix2 a b c
mix1toMix2 (a, Left b) = Left (a,b)
mix1toMix2 (a, Right c) = Right (a,c)
mix2toMix1 :: Mix2 a b c -> Mix1 a b c
mix2toMix1 (Left (a,b)) = (a, Left b)
mix2toMix1 (Right (a,c)) = (a, Right c)
This isomorphism inductively generalizes to arbitrary finite coproducts.
We have shown that the tensor product `(,)` distributes over finite coproducts.
Assume we have an arbitrary symmetric monoidal category whose tensor product is distributive over finite coproducts (frequently called a symmetric monoidal category with finite coproducts). Consider the classes of isomorphic objects and denote the binary product by \(\cdot\), the binary coproduct by \(+\), the terminal object by \(1\), and the initial object by \(0\).
The laws of monoidal categories we have built can be rewritten as follows:
Distributivity gives us equation \(a \cdot (b + c) = a \cdot b + a \cdot c\).
Proposition: For any object \(a\), \(0 \cdot a = 0\), i.e. \(0 \cdot a\) is an initial object.
Proof: Consider \(\operatorname{Hom}(0 \cdot a, b)\) for some \(b\). It is non-empty due to the composition \(0 \cdot a \xrightarrow{\pi_0} 0 \to b\). Let \(f, g \in \operatorname{Hom}(0 \cdot a, b)\). Consider \((0 \cdot a) + (0 \cdot a) = (0 + 0) \cdot a = 0 \cdot a\). There is a unique morphism from \(0\) to \(0\), hence the inclusions \(i_1\) and \(i_2\) of the left and right summands coincide. Consider \(f + g : (0 \cdot a) + (0 \cdot a) \to b + b\). The compositions \((f + g) \circ i_1\) and \((f + g) \circ i_2\) are equal since \(i_1 = i_2\). But \((f + g) \circ i_1 = f + 0 = f\) and \((f + g) \circ i_2 = 0 + g = g\), hence \(f = g\).
Reference: ncatlab: Proposition 2.2.
For the category of finite sets, this construction yields the semiring of natural numbers, as Qiaochu Yuan points out. This is an easy exercise and we can compare it with the notation of finite ordinals used to compactly denote finite products at the start of the post.
We have also just shown that isomorphism classes of objects in \(Hask\) as a symmetric monoidal category with finite coproducts form a commutative semiring.
So do Haskell types. This construction seems to be a good answer to the question of why algebraic types are algebraic.
Actually, we have not constructed all algebraic types, since we excluded recursive types. And there is a valid question whether the given semiring structure extends to the whole class of algebraic types. But we did not rely on the nature of the objects while proving the general statement, so there is nothing to raise a problem. Recursive types deserve at least a separate post.
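A concrete low-degree instance of this semiring: `Bool` is \(1 + 1\), and counting inhabitants of finite types sends \(+\) to addition and \(\cdot\) to multiplication, exactly as in the finite-sets example. A small Haskell sketch (all names are mine):

```haskell
-- Bool is the sum 1 + 1: an explicit isomorphism with Either () ().
boolAsSum :: Bool -> Either () ()
boolAsSum False = Left ()
boolAsSum True  = Right ()

sumAsBool :: Either () () -> Bool
sumAsBool (Left ())  = False
sumAsBool (Right ()) = True

-- |(Bool, Bool)| = 2 * 2: counting inhabitants respects the product.
countPairs :: Int
countPairs = length [ (a, b) | a <- [False, True], b <- [False, True] ]

main :: IO ()
main = do
  print (map (sumAsBool . boolAsSum) [False, True])  -- the round trip is the identity
  print countPairs                                   -- 4
```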
In the previous post, I promised to justify the change of the signature of `fmap` from `(a -> b) -> f a -> f b` (as in `Prelude`) to `(a -> b) -> (f a -> f b)`, and occasionally explain the former notation. There is a categorical notion to help; it is called Cartesian closedness.
Here we leave the indices of \(\operatorname{Hom}\)-sets to context, and the same for the definitions of objects: each subsection operates in a fixed category.
A category \(\mathrm{C}\) is called cartesian closed if it has a terminal object and, for all \(A,B \in \operatorname{Ob}(\mathrm{C})\), there exist a product \(A \times B\) and an exponential object \(B^A\).
To define all the relevant notions it is convenient (following lectures in Russian by A. Gorodentsev) to define representable functors first.
A functor \(F : \mathrm{C}^{op} \to Set\) is said to be representable if it is naturally isomorphic to \(\operatorname{Hom}(\_,A)\) for some \(A\). We call \(A\) the representing object of the functor \(F\). A representing object is unique up to canonical isomorphism.
Any representing object has an associated universal property. Let’s see it by example.
Consider objects \(A,B \in \mathrm{C}\). If the functor \(\operatorname{Hom}(\_,A) \times \operatorname{Hom}(\_,B)\) is representable, its representing object \(A \times B\) is called the product of \(A\) and \(B\).
Let’s write this definition out: \(\forall Y\; \operatorname{Hom}(Y, A \times B) \cong \operatorname{Hom}(Y,A) \times \operatorname{Hom}(Y,B)\).
By setting \(Y = A \times B\), we obtain via this isomorphism the pair of maps \(\pi_A : A \times B \to A\) and \(\pi_B : A \times B \to B\), the image of \(Id_{A \times B}\) under the isomorphism. They are called the canonical projections. For arbitrary \(Y\) this isomorphism guarantees the existence and uniqueness of a map \(\phi : Y \to A \times B\) for any pair of maps \(\psi_A : Y \to A\) and \(\psi_B : Y \to B\). There are maps \(\pi_A \circ \phi\) and \(\pi_B \circ \phi\); since the isomorphism is natural, \(\pi_A \circ \phi = \psi_A\) and \(\pi_B \circ \phi = \psi_B\).
Thus we obtain the universal property of a product:
For any \(Y\), \(f : Y \to A\), and \(g : Y \to B\) there exists a unique map \(\phi : Y \to A \times B\) such that the following diagram commutes:
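In Haskell terms this universal property is witnessed by a pairing function; the name `fork` below is mine (for plain functions the same combinator exists in `Control.Arrow` as `(&&&)`):

```haskell
-- The unique map phi induced by f and g: the pairing function.
fork :: (y -> a) -> (y -> b) -> y -> (a, b)
fork f g y = (f y, g y)

-- Commutativity of the two triangles:
--   fst . fork f g = f   and   snd . fork f g = g.
main :: IO ()
main = do
  let f   = (* 2) :: Int -> Int
      g   = show  :: Int -> String
      phi = fork f g
  print (fst (phi 21), f 21)  -- (42,42): the first triangle commutes
  print (snd (phi 21), g 21)  -- ("42","42"): the second triangle commutes
```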
Consider the constant functor \(\overline{\{0\}}\) sending all objects to a fixed singleton set and all morphisms to the identity. We call the representing object of \(\overline{\{0\}}\) the terminal object of the category \(\mathrm{C}\). By evaluating its universal property we see that it is the object \(T\) with exactly one morphism \(Y \to T\) for any \(Y\).
Assume \(\mathrm{C}\) has all binary products. Consider the functor \(\operatorname{Hom}(\_ \times A, B)\). Its representing object \(B^A\) is called the exponential object. In fact, we defined it via the isomorphism \(\operatorname{Hom}(Y, B^A) \cong \operatorname{Hom}(Y \times A, B)\), which may look familiar.
By taking \(Y = B^A\) we obtain the map \(eval : B^A \times A \to B\). We can draw the universal property:
Usually, terminal objects and products are defined in terms of limits. But limits are examples of representing objects and are not really important here. We will come back to them later.
To prove the statement in the header we have to exhibit the terminal object, exponentials, and products.
Since morphisms in \(Hask\) do not have to preserve any structure, the only candidates for the terminal object are types with a single term. These are the types isomorphic to `()`. They are all isomorphic (admit invertible functions between each other), hence there is no need to check universality.
The evaluation map of the exponential object suggests an object satisfying the universal property for types `a, b`: it is `a -> b`. Products are given by the type with two natural projections: `(a,b)`. Since universal objects are defined up to natural isomorphism, product types such as `data Cons = Cons B C` are also products, and the same goes for newtypes over arrow types, etc.
The definition of the exponential object gives us a natural isomorphism `(a,b) -> c` \(\cong\) `a -> (b -> c)` for any types `a, b, c`. This isomorphism is called currying. In Haskell it is pervasive and is reflected in both implementation and notation: the type `a -> b -> c` is literally equivalent to `a -> (b -> c)` (which is what makes partial application work) and is isomorphic to `(a,b) -> c` via the mutually inverse functions `curry` and `uncurry`.
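The two directions of the isomorphism in action, on a toy example:

```haskell
-- An uncurried function, i.e. an element of Hom((a,b), c).
addPair :: (Int, Int) -> Int
addPair (x, y) = x + y

-- Its curried form, i.e. the corresponding element of Hom(a, b -> c).
addCurried :: Int -> Int -> Int
addCurried = curry addPair

main :: IO ()
main = do
  print (addPair (2, 3))             -- 5
  print (addCurried 2 3)             -- 5
  print (uncurry addCurried (2, 3))  -- 5: the round trip through the iso
```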
There is an important proposition that any cartesian closed category is a closed symmetric monoidal category with the tensor product given by the product in the sense of the given universal property.
Closedness (informally) means that hom-sets can be considered objects of the category. Again informally, we can see it from the existence of exponentials: their elements are functions from `a` to `b`, and the elements of \(\operatorname{Hom}\)-sets are morphisms from `a` to `b`. The term “closed monoidal category” encapsulates some compatibility conditions between the tensor product and the exponentials. Rigorous definitions and a proof of the claimed proposition are worth a separate post, to be written.
So let’s proceed to the definition of the symmetric monoidal category.
A category \(\mathrm{C}\) is called monoidal if
The associativity law is the commutativity of the following diagram:
The identity law is the commutativity of the following diagram:
A monoidal category is called symmetric if for all \(A,B\) there exists an isomorphism \(s_{AB} : A \otimes B \to B \otimes A\) such that the following diagram commutes:
From cartesian closedness it automatically follows that \(Hask\) is symmetric monoidal with \(\otimes =\) `(,)` and unit `()`. The associator and unitor functors are obvious. Denote this structure as \((Hask, (,), ())\).
Consider \(End(Hask)\), the category of functors from \(Hask\) to itself. It is a monoidal category with the tensor product given by composition, the unit given by the `Identity` functor, and with identity unitors and associator. Denote this structure as \((End(Hask),\circ,Id)\).
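Both ingredients of this structure are available in base; a small sketch using `Data.Functor.Compose` and `Data.Functor.Identity` (the names `example` and `leftUnitor` are mine):

```haskell
import Data.Functor.Compose (Compose (..))
import Data.Functor.Identity (Identity (..))

-- The tensor product on End(Hask) is functor composition:
-- Compose f g is the composite functor, and Identity is the unit.
example :: Compose Maybe [] Int
example = Compose (Just [1, 2, 3])

-- One of the unitors: Compose Identity f ≅ f.
leftUnitor :: Functor f => Compose Identity f a -> f a
leftUnitor = runIdentity . getCompose

main :: IO ()
main = do
  print (getCompose (fmap (* 10) example))           -- Just [10,20,30]: the composite is a functor
  print (leftUnitor (Compose (Identity (Just 'x'))))  -- Just 'x'
```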
These two statements will be central in the next several posts.
Universal properties allow us to construct dual objects by formally reversing all arrows.
Doing this with the already defined representing objects, we obtain the following definitions:
A category with an initial object, all coexponentials, and all binary coproducts is called cocartesian coclosed.
\(Hask\) does not have all coexponentials and is not cocartesian coclosed.
But it has all coproducts, given by `Either a b`, and it has the initial object `Void`.
This data is enough to equip \(Hask\) with another structure of symmetric (since coproducts are symmetric) monoidal category. The tensor product is given by the coproduct and the unit is given by the initial object. Denote this structure as \((Hask, Either, Void)\).
These universal objects with reversed arrows can be defined via duals to representable functors.
A functor \(F : \mathrm{C} \to Set\) is said to be corepresentable if it is naturally isomorphic to \(\operatorname{Hom}(A,\_)\) for some \(A\). \(A\) is called the corepresenting object of the functor \(F\). A corepresenting object is unique up to canonical isomorphism.
The initial object corepresents \(\operatorname{Hom}(\emptyset, \_)\), \(A \coprod B\) corepresents \(\operatorname{Hom}(A,\_) \times \operatorname{Hom}(B,\_)\), and the coexponential corepresents \(\operatorname{Hom}(B, \_ \coprod A)\).
In \(Hask\), \(\operatorname{Hom}(A,\_)\) is written as `(->) A`. We can restrict functors to instances of the `Functor` type class and obtain the following definition of a representable functor: a functor `F` is said to be representable if there exists a natural isomorphism between `F a` and `(->) A a` for every type `a`; `A` is the corepresenting object of `F`. But `F` must be considered as a functor to Set whose values are the sets of terms of the resulting types of the `Functor` we started with, instead of the types themselves.
This definition is captured in the package representable-functors. The isomorphism `Cons a ~ A -> a` in terms of this library is written as `f a ~ Key f -> a`; the direction to the left is given by `tabulate`, to the right by `index`.
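To keep this self-contained, here is a minimal hand-rolled sketch of the same tabulate/index pattern; the real library’s API differs in details, and the class and instance below are illustrative:

```haskell
{-# LANGUAGE TypeFamilies #-}

-- A functor f is representable by Key f when f a ≅ Key f -> a,
-- with tabulate and index the two directions of the isomorphism.
class Functor f => Representable f where
  type Key f
  tabulate :: (Key f -> a) -> f a
  index    :: f a -> Key f -> a

-- Pair is represented by Bool: Pair a ≅ Bool -> a.
data Pair a = Pair a a deriving (Eq, Show)

instance Functor Pair where
  fmap h (Pair x y) = Pair (h x) (h y)

instance Representable Pair where
  type Key Pair = Bool
  tabulate h = Pair (h False) (h True)
  index (Pair x _) False = x
  index (Pair _ y) True  = y

main :: IO ()
main = do
  print (tabulate not :: Pair Bool)  -- Pair True False
  print (index (Pair 'a' 'b') True)  -- 'b'
```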
All the content was restored by 22.02. The site has moved to GitHub Pages + Jekyll instead of Wordpress, since no one used the dynamic features here anyway.
Hypermath is a client-server application receiving requests via HTTP. It is connected to a remote relational database and to a remote in-memory storage mostly used as a database cache. The application is solidly orchestrated (thanks to Egor) and supports rolling releases.
The system is dedicated to serving as a teaching assistant for school teachers of mathematics. Its target audience is quite big, hence the system must work with high throughput. The specifics of the system yield an obvious critical point: checking submitted answers.
When student press button “Check” the following actions must happen:
At first glance there is no problem with consecutive execution of these steps in a handler thread (the server uses warp). But if we perform load testing, we immediately become quite disappointed by the result.
The hardest computation is the answer test itself. This step is absolutely unavoidable, since the student needs to see the result immediately. If there were no way to simplify this step, the post would end here.
But the answer determines the check result, hence check results can be cached. Every math problem is tested by people before publication to the system, so in practice all frequent answers are known to the system prior to publication. Hence step 1 is actually very fast in the overwhelming majority of cases.
This solution unlocks the problem I want to talk about. Given this sequence of operations performed by a handler, we have to redesign it and reduce the latency of a request, at an acceptable cost of delays in statistics updates. Importantly, we must guarantee data safety in case of emergency or, more frequently, new releases.
The title of the post suggests that some actions can be performed asynchronously.
In particular, both operations 2 and 3 qualify.
Let’s talk about 3 first. Since statistics is stored in a table with several indices, write operations are better performed infrequently. So we write the data to cache in the same action as 2, and then a separate job periodically flushes it to the database.
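A minimal sketch of such a periodic flush job; the two callbacks (readAndClear, writeBatch) are hypothetical stand-ins for the real cache and database operations:

```haskell
import Control.Concurrent (ThreadId, forkIO, threadDelay)
import Control.Monad (forever)

-- Wake up every `periodMicros` microseconds, drain the cached
-- statistics, and write them to the database in one batch.
startFlushJob :: Int -> IO [stat] -> ([stat] -> IO ()) -> IO ThreadId
startFlushJob periodMicros readAndClear writeBatch =
  forkIO . forever $ do
    threadDelay periodMicros
    stats <- readAndClear
    writeBatch stats
```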
The naive approach to 2 is to simply put forkIO before the action (modulo explicit passing of the computation context, accurate usage of MonadBaseControl IO, or a similar technique).
But wait. Consider the basic version of the handler and its behavior under load.
Assume we apply a load of 1000 rps while the throughput of a handler is 100 rps. Then at first we see normal dynamics at 100 rps, and near the response timeout a massive, exponentially growing list of 504s. At some point the application freezes. It is a freeze of the web server running the app, which is at capacity with waiting requests. A good model of a server crash under load can be found here.
A substantial part of the server latency comes from communication with the database. Regarding PostgreSQL, there is a setting max_connections which limits the number of simultaneously active sessions. It is recommended to keep it small because each connection is maintained by a separate system process on the database host.
So assume we run the computation which connects to the database asynchronously and apply a constant load of 1000 rps. Indeed, we see the desired 1000 rps in the testing protocol for some time. But in the background we have quickly reached max_connections. Hence all other handlers requesting the database are put under stress and slowed down: we have actually allowed more connection attempts than in the consecutive implementation, since the web server lets more requests reach the application at any moment. Moreover, we have moved the exponential growth in the number of requests to the application side. Hence we have created a little fork bomb which eventually explodes and causes the application to freeze in a storm of context switching. Green threads are lightweight, but there are a lot of them.
This is the natural setting to introduce thread pools. However, this approach is not widely used in Haskell, since new threads can be created with almost no overhead. The solution we adopted is to limit the number of simultaneously running computations using QSem.
Explicitly it is written as follows:
sem <- getThreadPool
liftBaseWith $ \runInBase ->
  forkIO . bracket_ (waitQSem sem) (signalQSem sem) $ runInBase action
where sem is a semaphore stored in the global application context. We actually allow creation of new threads and only block before running heavy actions. But this is enough to avoid excess context switching, since most of the threads are locked. And indeed we limit the burden on the database.
From this point the handler works as expected. The cost: delays in receiving students’ statistics may become significant.
But what about data safety? Assume a release comes during operation under stress. The orchestration system stops the replaced service only when all requests to it are answered, but it ignores silent computations. We get a risk of losing statistical data before the write to cache, and the increased delays increase this risk.
To deal with it we had to implement graceful termination. That is, we track the number of active explicitly forked threads and, upon receiving an interceptible termination signal, wait for them all to finish. Regarding handler threads, we rely on warp’s graceful termination.
Example code of how threads can be counted is here. Note that semaphores also count threads, but we need some concurrency primitive anyway to block the signal handler.
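A minimal sketch of such counting with a TVar (hypothetical names, not the linked code): every fork increments a counter, every exit decrements it, and the shutdown path waits for zero.

```haskell
import Control.Concurrent (ThreadId, forkIO)
import Control.Concurrent.STM
import Control.Exception (finally)

-- Fork a thread, incrementing the counter before it starts and
-- decrementing it when the action finishes (even via an exception).
forkCounted :: TVar Int -> IO () -> IO ThreadId
forkCounted counter action = do
  atomically $ modifyTVar' counter (+ 1)
  forkIO $ action `finally` atomically (modifyTVar' counter (subtract 1))

-- Block until every counted thread has finished; called from the
-- termination signal handler before the process exits.
waitForAll :: TVar Int -> IO ()
waitForAll counter = atomically $ readTVar counter >>= check . (== 0)
```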
So here it is: caching, periodic flushes of cached data to the database, asynchronous actions with a limit on the number running simultaneously, and a graceful termination primitive.
]]>This post does not cover encoding and decoding of cyclic codes, existing implementations, or many mathematical aspects of codes. Maybe I will write more on this topic.
Let’s start with a practical problem. People or machines frequently have to send messages over some physical data transmission channel. Any physical channel is error-prone, hence we have to be able to clean messages from errors. One of the adopted solutions is error-correcting codes.
Most information now is transmitted in binary form, so our message is a sequence of bits. Let’s split a message into chunks of length \(n\). We will call a subset of the set of all words of length \(n\) a code of length \(n\).
A binary word of length \(n\) is an element of \(\mathbb{F}_2^n\), the \(n\)-dimensional vector space over the field with two elements \(0\) and \(1\). Multiplication by scalars is trivial over \(\mathbb{F}_2\), hence the only operation we obtain by introducing this structure is addition. For simplicity we require all codes we consider to be closed under addition: the sum of any two words in a code must lie in the code. Such codes are called linear.
Now each of our codes is a subspace of \(\mathbb{F}_2^n\). Denote the number of nonzero coordinates of a vector \(a\) as \(w(a)\). This function is called the weight.
Using this function we can define the Hamming distance. Let \(a\) and \(b\) be arbitrary words in \(\mathbb{F}_2^n\). The Hamming distance \(d(a,b)\) is defined as \(w(a+b)\). It is symmetric by definition and is zero if and only if the arguments coincide.
Note that
\(w(a+b) = w(a) + w(b) - 2\,\#\{i:\;a_i = b_i = 1\}\),
since coordinates where both words have a \(1\) cancel in the sum. The last term can be written as \(2w(a \cdot b)\) with \(a\) and \(b\) multiplied coordinate-wise. We can check the triangle inequality:
Let \(a,b,c\) be three vectors. Then \(w(a+b) + w(b+c) = w(a) + 2w(b) + w(c) - 2w(a \cdot b) - 2w(b \cdot c)\)
and \(w(a+c) = w(a) + w(c) - 2w(a \cdot c)\).
The difference \(w(a+b) + w(b+c) - w(a+c) = 2\left(w(b) - w(a \cdot b) - w(b \cdot c) + w(a \cdot c)\right)\)
is non-negative, since in each coordinate \(b_i + a_ic_i \geq a_ib_i + b_ic_i\) (check the cases \(b_i = 0\) and \(b_i = 1\)).
Thus the Hamming distance turns \(\mathbb{F}_2^n\) into a metric space \(\mathbb{B}_2^n\). The metric induces the metric topology of open balls; we shall use these terms.
We are ready to define correcting codes. Consider a message \(m\) of length \(n\) with errors. It is a vector, and it has a distance from the code subspace it was expected to lie in, defined as the minimum of distances to words of the code. If there is a single closest word \(r\), we say (assuming errors in the channel are rare) that \(r\) is a correction of \(m\). Hence a code \(C\) corrects \(e\) errors if the closed balls of radius \(e\) centered at words of \(C\) are pairwise disjoint.
Let \(d\) be the minimal distance between distinct words of \(C\). The condition above is equivalent to \(e = \lfloor\frac{d-1}{2}\rfloor\). Since \(w(a+b) = w(a+c+b+c)\), we can isometrically shift the code by any vector. Hence \(d = \min_{a \neq 0 \in C}w(a)\).
Every vector space has a basis, and given a basis we can already compute the minimal weight of a code. Every word of the code is determined by the characteristic function of its decomposition into basis vectors, so we can enumerate all \(2^k\) words, where \(k\) is the dimension of the code, without any complex operations. This algorithm is more efficient than the one we give in the end, since its complexity depends on the dimension of the code and not of the ambient space.
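A sketch of this linear-algebra brute force, with basis words packed into bit words (using Word64 limits it to codes of length at most 64):

```haskell
import Data.Bits (popCount, xor)
import Data.List (foldl', subsequences)
import Data.Word (Word64)

-- Enumerate all 2^k - 1 nonzero codewords as XOR-sums of subsets of
-- the k basis words and take the minimal weight (popCount).
minWeight :: [Word64] -> Int
minWeight basis =
  minimum [ popCount word
          | subset <- tail (subsequences basis)  -- tail drops the zero word
          , let word = foldl' xor 0 subset ]
```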
But we want to introduce another approach.
Consider the ring \(R = \mathbb{F}_2[x]/(x^n-1)\). It is a vector space over \(\mathbb{F}_2\) of finite dimension \(n\). Any two vector spaces of the same dimension over the same field are isomorphic, so \(R \cong \mathbb{F}_2^n\). This isomorphism \(\phi\) is given by identifying a polynomial with the vector of its coefficients.
Consider codes generated by linear combinations of cyclic shifts of a single vector \(g\). They are called cyclic codes. Cyclic shifts form a cyclic group generated by the one-position shift \(\sigma\). In \(\mathbb{F}_2[x]\), \(\phi(\sigma(g)) = x \cdot \phi(g)\) by construction, and it is a straightforward check that this relation holds in \(R\).
\(x\) generates \(R\) as an \(\mathbb{F}_2\)-algebra. Hence every element \(f \cdot \phi(g)\) is a sum \(\phi(g) \cdot \sum x^i\) for some \(i\)'s. Hence cyclic codes are in 1:1 correspondence with ideals of \(R\) generated by a single polynomial. Every ideal of \(R\) is principal, since \(R\) is a quotient of the principal ideal domain \(\mathbb{F}_2[x]\), hence these are all the ideals.
An immediate consequence of this form of cyclic codes is easy encoding: to encode a word in a cyclic code, we only have to multiply it by the generating polynomial. It is just one example of why this presentation is useful.
There are several bounds on the minimal weight. For many codes, especially for the best codes (I use this adjective for Hamming codes and simplicial codes), it is known exactly. But we are interested in a brute-force algorithm. We have one from linear algebra, and it has the best performance in most cases. But let’s use the polynomial presentation.
The idea is to filter the list of all polynomials of degree below \(n\) over \(\mathbb{F}_2\) by the condition of lying in \(C\) and then compute the weights. Unlike in the linear-algebra approach, we have no fast way to list only the polynomials of the code.
The most interesting question is how to represent a polynomial.
The naive representation is to store a polynomial as the list of degrees of its monomials and reduce modulo \(x^n-1\) by polynomial long division. The algorithm is here.
This algorithm runs about a week on a code of length \(31\) with a generating polynomial of degree \(16\) (i.e. a code of dimension \(15\)). For reference, the linear algorithm succeeds in \(3\) seconds; this is the obvious cost of the choice of the parameter which controls complexity. In the linear algorithm we have to evaluate the weight of each of \(2^{15}\) words. In the polynomial algorithm we have to list the code first by checking each of \(2^{31}\) polynomials for membership in \(C\).
But recall that \(\phi\) is an isomorphism of vector spaces. Hence we can store a polynomial as a bit word. It cannot reduce complexity asymptotically, but does it give us a cheaper single operation?
And it does.
The sum of vectors over \(\mathbb{F}_2\) is computationally a simple XOR. It is implemented at the hardware level and is fast.
Multiplication by a monomial of degree \(d\) is a bitwise shift by \(d\) bits. Hence the product of \(f\) and \(g\) can be computed in a single loop with at most \(\operatorname{deg}(f) \cdot \operatorname{deg}(g)\) hardware-level operations: one factor counts the shifts, the other the additions, since we iterate over the monomials of the second polynomial.
Modulo is the hardest of the three. Consider \(f \operatorname{mod} g\). \(f \operatorname{mod} g = \sum_i a_ix^i \operatorname{mod} g\), where \(x^i\) are the monomials of \(f\). \(x^i \operatorname{mod} g = x^i\) if the degree of \(x^i\) is less than the degree of \(g\). If the degrees are equal, the remainder is \(x^i + g\). Otherwise divide \(i\) by \(t = \operatorname{deg}(g)\):
we obtain \(i = qt + r\) and \(x^i \operatorname{mod} g = (x^t)^q \cdot x^r \operatorname{mod} g = (g + x^t)^q \cdot x^r \operatorname{mod} g\).
The polynomial on the right has degree strictly less than the initial one, hence we can reduce it again and eventually compute the result.
Implementation is here. It runs about six times faster than the naive one.
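The three operations described above might look as follows at the bit level (a hypothetical sketch in the spirit of, but not identical to, the linked implementation; polynomials are packed into Word64, least significant bit is the constant term):

```haskell
import Data.Bits (countLeadingZeros, shiftL, testBit, xor)
import Data.Word (Word64)

-- Addition over F_2 is XOR.
polyAdd :: Word64 -> Word64 -> Word64
polyAdd = xor

degree :: Word64 -> Int
degree 0 = -1
degree p = 63 - countLeadingZeros p

-- Schoolbook multiplication: shift f by the degree of every monomial of g.
polyMul :: Word64 -> Word64 -> Word64
polyMul f g = foldl xor 0 [ f `shiftL` i | i <- [0 .. degree g], testBit g i ]

-- Remainder by long division: repeatedly cancel the leading monomial.
polyMod :: Word64 -> Word64 -> Word64
polyMod f g
  | degree f < degree g = f
  | otherwise = polyMod (f `xor` (g `shiftL` (degree f - degree g))) g
```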
Implementing operations on binary polynomials at the level of bits is an important side effect. The latter algorithm can easily be rewritten in C and run on specific hardware. And these operations are important not only for discrete signal transmission but are also crucial for all of symmetric cryptography.
]]>We have constructed the category of restricted Haskell types. It gave us a coherent notion of composition. But it’s not enough. The major strength of Haskell is its separation of computations of different natures. So we need at least to be able to cluster types into objects with some common property. Let’s develop machinery to deal with it.
Consider categories \(\mathrm{C}\) and \(\mathrm{D}\) with a pair of mappings \(F_{\operatorname{Ob}} : \operatorname{Ob}(\mathrm{C}) \to \operatorname{Ob}(\mathrm{D})\) and \(F_{\operatorname{Hom}}\) with one of the following signatures: \(\operatorname{Hom}_{\mathrm{C}}(A,B) \to \operatorname{Hom}_{\mathrm{D}}(F_{\operatorname{Ob}}(A),F_{\operatorname{Ob}}(B))\) or \(\operatorname{Hom}_{\mathrm{C}}(A,B) \to \operatorname{Hom}_{\mathrm{D}}(F_{\operatorname{Ob}}(B),F_{\operatorname{Ob}}(A))\) — a mapping of all morphisms of the category, defined on each Hom-set.
We can construct a pair \(F = (F_{\operatorname{Ob}}, F_{\operatorname{Hom}})\). Its definition contains all data necessary to define a map between categories. But the composition of such maps is not well-defined (check it). It can be fixed by the following definitions:
\(F\) is called a covariant functor or functor if the following diagram commutes:
or a contravariant functor if the following diagram commutes:
Here are several useful definitions:
Let \(F : \mathrm{C} \to \mathrm{D}\) be a functor (covariant, contravariant definitions are similar). Consider \(F_{\operatorname{Hom}}\).
If \(\forall A, B \in \mathrm{C}\; F_{\operatorname{Hom}} : \operatorname{Hom}_{\mathrm{C}}(A,B) \to \operatorname{Hom}_{\mathrm{D}}(F_{\operatorname{Ob}}(A),F_{\operatorname{Ob}}(B))\) is injective, \(F\) is called faithful. If surjective — full. If bijective — fully faithful.
Categories \(\mathrm{C}\) and \(\mathrm{D}\) are said to be equivalent if there exists a fully faithful functor \(F : \mathrm{C} \to \mathrm{D}\) such that every object of \(\mathrm{D}\) is isomorphic to \(F(A)\) for some \(A \in \operatorname{Ob}(\mathrm{C})\).
We have constructed the category \(Hask\) and we have a notion of composable mappings between categories. Functors from category to itself are called endofunctors. However, it may be convenient to talk about subcategories in \(Hask\) and about functors between them.
Category \(\mathrm{D}\) is a subcategory of \(\mathrm{C}\) if \(\operatorname{Ob}(\mathrm{D}) \subseteq \operatorname{Ob}(\mathrm{C})\) and \(\forall A,B \in \operatorname{Ob}(\mathrm{D})\; \operatorname{Hom}_{\mathrm{D}}(A,B) \subseteq \operatorname{Hom}_{\mathrm{C}}(A,B)\).
If \(\forall A,B \in \operatorname{Ob}(\mathrm{D})\; \operatorname{Hom}_{\mathrm{D}}(A,B) = \operatorname{Hom}_{\mathrm{C}}(A,B)\), then \(\mathrm{D}\) is called a full subcategory of \(\mathrm{C}\).
Every subcategory gives rise to faithful embedding functor \(Emb : \mathrm{D} \to \mathrm{C}\) with identical actions both on objects and morphisms. If \(\mathrm{D}\) is a full subcategory, then \(Emb\) is fully faithful.
Now let’s take a look at functors in Hask.
Consider the declaration of a new data type like data Either a b = Left a | Right b.
There are several possible constructions of a \(Hask\)-endofunctor arising from this definition. The two most natural are the following: \(Left_{\operatorname{Ob}}(a)\) = Left a with \(Left_{\operatorname{Hom}}(f : a \to c)\) = (\(Left a) -> Left (f a)), and \(Right_{\operatorname{Ob}}(a)\) = Right a with \(Right_{\operatorname{Hom}}(f : b \to c)\) = (\(Right a) -> Right (f a)).
Both of them are well-defined covariant faithful endofunctors in \(Hask\). More specifically, \(Right\) is a functor to the category \(Either\;a\;\_\) and \(Left\) is a functor to the category \(Either\;\_\;b\).
However, only \(Right\) is supported by a valid Functor instance in Haskell.
The instance for Either:
instance Functor (Either c) where
  fmap :: (a -> b) -> (Either c a -> Either c b)
  fmap _ (Left x) = Left x
  fmap f (Right y) = Right (f y)
Note: fmap defines the action of the functor on morphisms. We change the Prelude definition for now; it is valid and will be justified in the next post.
Can \(Left\) functor be expressed? Yes:
swap :: Either b a -> Either a b
swap (Right a) = Left a
swap (Left a) = Right a
fmap' :: (a -> c) -> (Either a b -> Either c b)
fmap' f = swap . fmap f . swap
But it cannot be expressed in terms of the Functor typeclass, as long as there is no single-parameter type Either _ b in Haskell. In particular, we see that not every subcategory of \(Hask\) is encapsulated in a type.
Note that the uniqueness and derivability of Functor instances is not an elementary question. Since it’s not a question of category theory, let me refer to SO.
The laws of the Functor typeclass represent the usual definition of a functor via the following diagram:
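In equations, the commutativity of this diagram amounts to fmap id == id and fmap (g . f) == fmap g . fmap f. A spot check on sample values (not a proof), using the Functor instance for Either:

```haskell
-- The two Functor laws specialized to Either String Int.
lawIdentity :: Either String Int -> Bool
lawIdentity x = fmap id x == id x

lawComposition :: Either String Int -> Bool
lawComposition x = fmap ((+ 1) . (* 2)) x == (fmap (+ 1) . fmap (* 2)) x
```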
Yet another restriction of the Haskell Functor typeclass is that it does not allow functors between nontrivial subcategories of \(Hask\).
An example of such a functor: \(LM : [] \to Maybe\); \(LM_{\operatorname{Ob}}\) = listToMaybe; \(LM_{\operatorname{Hom}}\) = \f -> listToMaybe . f . maybeToList.
This functor is fully faithful. It admits a faithful functor in the other direction: \(ML_{\operatorname{Ob}}\) = maybeToList; \(ML_{\operatorname{Hom}}\) = \f -> maybeToList . f . listToMaybe.
It’s easy to check that the functor \(toList : Vector \to []\) with a similar definition makes the subcategories of vectors and lists equivalent.
None of these three functors is an endofunctor in \(Hask\), since they are not everywhere defined.
For any category \(\mathrm{C}\) and object \(A\) there exist two canonical functors.
The first, \(\operatorname{Hom}(A,\_) : \mathrm{C} \to Set\), is a covariant functor moving \(X\) to \(\operatorname{Hom}(A,X)\). The second is the contravariant \(\operatorname{Hom}(\_,A)\) with the same signature, moving \(X\) to \(\operatorname{Hom}(X,A)\).
Both functors matter a lot for future constructions and obviously exist in \(Hask\).
Morphisms are functions between types, hence arrows a -> b. They are ordinary types, hence their prefix form is (->) a b, and there can only exist a Functor instance for the covariant \(\operatorname{Hom}\). Here it is (note that a fully applied (->) is a function):
instance Functor ((->) a) where
fmap f g = f . g
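Assuming this instance (it coincides with the one in base), fmap on functions is just post-composition:

```haskell
-- fmap g h == g . h for functions: first apply h, then g.
addThenDouble :: Int -> Int
addThenDouble = fmap (* 2) (+ 1)
```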
Let’s define an alternative type <-, with (<-) a b isomorphic to (->) b a.
For this type we can define an instance of Contravariant, which represents contravariant functors.
instance Contravariant ((<-) b) where
  contramap :: (a -> c) -> ((<-) b c -> (<-) b a)
  contramap f g = g . f
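Since <- is reserved syntax, GHC will not accept such a declaration; in real code the same functor is expressed with a newtype (the contravariant package calls it Op). A self-contained sketch:

```haskell
-- Op b a wraps a function a -> b; it is contravariant in a.
newtype Op b a = Op { getOp :: a -> b }

class Contravariant f where
  contramap :: (a -> c) -> f c -> f a

instance Contravariant (Op b) where
  contramap f (Op g) = Op (g . f)
```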
The laws of Contravariant form the following familiar diagram:
All these constructions are relevant to \(Hask\).
Let’s take a look at the introduced structures. At the level of types, we have types and morphisms between them. Morphisms can be surjective, injective, or bijective, in the last case they are isomorphisms. Now we turn to the level of subcategories of \(Hask\) and we have functors that can be full, faithful, or fully faithful. These properties are finer than properties of Set-level morphisms but they are similar in spirit.
At the moment we can take one of two steps:
For now, we follow the second path.
Consider covariant functors \(F,G\) from \(\mathrm{C}\) to \(\mathrm{D}\). We will call a family \(\eta\) of morphisms in \(\mathrm{D}\), with a component \(\eta_X : F(X) \to G(X)\) for every object \(X \in \operatorname{Ob}(\mathrm{C})\), a natural transformation from \(F\) to \(G\) if for every morphism \(f \in \operatorname{Hom}_{\mathrm{C}}(X,Y)\) the following diagram commutes:
If both functors are contravariant, vertical arrows are reversed.
This definition lets us see the \(LM\) functor from the other side: as a natural transformation between the endofunctors \([]\) and \(Maybe\). Naturality is checked by the same reasoning as functoriality.
The Haskell ecosystem contains several packages trying to express natural transformations, for example the natural-transformation package.
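In GHC a natural transformation between Functors is usually written as a rank-2 polymorphic function (essentially what the natural-transformation package wraps); naturality then holds automatically by parametricity:

```haskell
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE TypeOperators #-}

-- One polymorphic function provides every component at once.
type f ~> g = forall a. f a -> g a

-- The LM transformation from the example above.
lm :: [] ~> Maybe
lm []      = Nothing
lm (x : _) = Just x
```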
It’s worth noting that the popular servant web framework (v0.10, link to the enter function) used explicitly typed natural transformations very close to its user interface for a long time. Here is how it was used: v0.10 tutorial
Functors (covariant, without loss of generality) between two categories \(\mathrm{C}\) and \(\mathrm{D}\) form a category, with functors as objects and natural transformations as morphisms.
This is a well-known statement, not specific to \(Hask\), with an obvious proof by construction, so it will not be given.
Note that in the example above \(LM \circ ML = Id_{Maybe}\).
This category of functors is denoted as \(\operatorname{Fun}(\mathrm{C},\mathrm{D})\). \(\operatorname{Fun}(\mathrm{C},\mathrm{C})\) has a more convenient synonym — \(\operatorname{End}(\mathrm{C})\) and is called category of endofunctors of \(\mathrm{C}\).
We have come up with another notable object we will use in the future: the category \(\operatorname{End}(Hask)\) of endofunctors of \(Hask\).
]]>Let’s start with a basic definition:
An entity \(\mathrm{C}\) is a category if all of the following hold:
A category whose class \(\operatorname{Ob}\) and all \(\operatorname{Hom}\)-classes are sets is called small.
A basic example of a category is \(Set\), the category with sets as objects and functions between sets as morphisms.
We want to construct a reasonable category out of Haskell types, or, more generally, out of abstract types. For this purpose Haskell is a good yardstick and support: it was designed with respect to category theory, and we can go far by judging the existence of some constructions in our category by the existence of their GHC implementations.
First attempt
Let’s consider \(Hask'\) with \(\operatorname{Ob}(Hask')\) the types of Haskell and \(\operatorname{Hom}_{Hask'}(A,B)\) all functions (closed expressions) of type \(A \to B\).
It is not a category: seq undefined () = _|_ while seq (undefined . id) () = (), hence undefined \(\neq\) undefined . id. See the post by Andrej Bauer and the discussion.
The Haskell wiki knows several more examples where the bottom breaks abstractions.
Failure of this attempt makes whole categorical reasoning about Haskell limited but still useful. Unfortunately, the real world is contradictory.
Second attempt
Consider \(\operatorname{Ob}(Hask)\) — Haskell types without \(\bot\) — with natural \(\operatorname{Hom}\)-sets: all functions between these platonic types, excluding partial and nonterminating functions. To shorten notation, here we overload the terms of the Haskell Wiki: Wiki.Hask = Hask'; Wiki.Platonic Hask = Hask.
Note: I’m not using the term “maximal total subset of Haskell” in the sense of the Wiki article, since I’m not talking about provability here. The set of functions we take seems not to have a constructive definition and is broader.
\(Hask\) is a category.
With the assumption of totality and termination of all functions, equational reasoning is legal (the Church–Rosser property holds — it was the exact property broken by \(\bot\)).
We only have to check the properties of the composition:
f . (g . h) =(1) \y -> f (\x -> g (h x) $ y) =(2) \y -> f (g (h y))
(f . g) . h =(1) \y -> (\x -> f (g x) $ h y) =(2) \y -> f (g (h y))
seq in the previous attempt violated the identity law; the proof of the identity law itself by equational reasoning is trivial.
In the equations, (1) denotes unfolding a definition and (2) refers to \(\beta\)-reduction; in both cases we follow the applicative order of evaluation. Equality is the \(\alpha\)-congruence relation.
The following posts will operate in \(Hask\).
The category \(Hask\) is not equivalent to \(Set\): consider the type data Foo = Foo (Foo -> Bool). The map Foo :: (Foo -> Bool) -> Foo is an injective map from \(2^{Foo}\) to \(Foo\). This situation is impossible in \(Set\). Here is an older reference.
It is also common to denote \(Hom_{\mathrm{C}}(A,A)\) as \(End_{\mathrm{C}}(A)\) — set of endomorphisms of \(A\).
I will frequently use the following notation:
f : A for an element \(f\) of \(A \in \operatorname{Ob}(Hask)\);
f : A -> B for \(f \in \operatorname{Hom}_{Hask}(A,B)\).
This post starts a series of texts trying to explore the category of Haskell types without impredicativity (platonic Hask).
Although the series is probably more interesting for those who adopt Haskell after category theory, I will try to recall all the necessary definitions. I expect some level of mathematical culture to be comfortable with this series; otherwise it may become hard to read.
Some may read it as a sort of intro to category theory by example.
]]>