status
TLDR, there has been no code release so far. The principal developer
signed with a commercial venture in early 2021. There will be a period of
inactivity for this project.
This document describes project status layer by layer, and is current
as of 15 January 2021.
Conceptual layers of the Cthulix operating system
------------------------
|  Application Layer   |
------------------------    ------------------------
|     Locate Layer     |    |   Community Layer    |
------------------------    ------------------------
|     Driver Layer     |    | Documentation Layer  |
------------------------    ------------------------
^ Technology stack          ^ Adoption stack
Development stages
Design -> Development -> Draft -> Release
-- Driver Layer
Summary
This layer concerns netbooting Linux into the physical hosts that
serve as Cthulix nodes.
Why this exists
If you have thousands of hosts, you do not want the operating system
of each one to be maintained as a unique snowflake.
Rather, we network-boot an operating system into system RAM. At each
boot, a host loads the current OS version.
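As a hedged illustration of this property, the sketch below (Python,
standard library only) shows how a freshly booted host might confirm
that its root filesystem is RAM-backed and report which image version
it booted. The /etc/cthulix-release path and the accepted filesystem
types are assumptions made for illustration, not part of Cthulix/LPD.

    # Hypothetical post-boot check: confirm the root filesystem lives in
    # RAM and report the booted image version. Paths and filesystem types
    # are assumptions; the real Cthulix/LPD layout may differ.

    from pathlib import Path

    def root_is_ram_backed() -> bool:
        # /proc/mounts lists "<source> <mountpoint> <fstype> ..." per line.
        for line in Path("/proc/mounts").read_text().splitlines():
            fields = line.split()
            if len(fields) >= 3 and fields[1] == "/":
                return fields[2] in ("tmpfs", "ramfs", "overlay")
        return False

    def booted_image_version() -> str:
        release = Path("/etc/cthulix-release")   # hypothetical version stamp
        return release.read_text().strip() if release.exists() else "unknown"

    if __name__ == "__main__":
        print("root in RAM:", root_is_ram_backed())
        print("image version:", booted_image_version())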
Subsystems
Opserver
There is a dedicated host, the Opserver. This is a non-Cthulix
host that serves DHCP/PXE and related services.
This can be made redundant within an infrastructure.
Cthulix Linux PXE Distribution (Cthulix/LPD)
This is the Linux distribution that Cthulix physical hosts boot,
via PXE.
Current Status: Draft
This layer moved into Draft on 10 January 2021. It requires polish,
particularly for construction of the Cthulix/LPD images.
Opserver
Working reference implementation, built on Debian 10.
Decent documentation coverage for setup and troubleshooting.
Cthulix Linux PXE Distribution (Cthulix/LPD)
Working. Verified against physical-host and qemu configurations.
This needs to be streamlined, scripted and documented.
We could use
Linux distro developer
The current Arch/mkinitcpio approach is solid. We do not need to
evolve beyond it. Still, there is room to create something tighter
via the following combination,
Busybox/Toybox
Eudev and no systemd
Maintained via a crafted tree of Dracut scripts.
Ideal: evolve Cthulix/LPD into a toolkit that serves even non-Cthulix
users who want to RAM-boot tight Linux systems.
Linux kernel developer
Directly participate in the Linux kernel effort to extend io_uring
coverage. Create a pure-async kernel derived from Linux that
offers only io_uring services. Seek to significantly reduce the
code footprint of the pure-async kernel whilst retaining full
driver compatibility with Linux.
Core library guru
Create a build-tree well-suited to benchmarking and optimising
glibc, gmp, mpfr, mpc and BLAS across Intel/Ryzen/Epyc.
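As a hedged sketch of the kind of harness such a build-tree might
drive, the Python below times a callable repeatedly and reports
min/median timings. The workload is a placeholder; the real targets
would be glibc, gmp, mpfr, mpc and BLAS routines exercised through
proper bindings, and every name here is illustrative.

    # Illustrative benchmark harness: time a callable repeatedly and
    # report min/median. The workload below is a stand-in only.

    import math
    import statistics
    import time

    def bench(label, fn, repeats=20):
        samples = []
        for _ in range(repeats):
            t0 = time.perf_counter()
            fn()
            samples.append(time.perf_counter() - t0)
        print(f"{label:<20} min={min(samples)*1e3:8.3f}ms "
              f"median={statistics.median(samples)*1e3:8.3f}ms")

    def placeholder_workload():
        # Stand-in for a real library call (e.g. an mpfr or BLAS routine).
        return sum(math.exp(i * 1e-6) for i in range(100_000))

    if __name__ == "__main__":
        bench("placeholder", placeholder_workload)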
-- Locate Layer
Summary
This layer is responsible for the space between (1) a Cthulix physical
host that has booted and (2) a Cthulix node connecting to its kernel.
The key purpose of this layer is to allow newly-booted hosts to work
out who they are and which Cthulix kernels they should attach to.
Why this exists
At boot, a host does not know which Cthulix Systems to associate with.
Through Locate, it determines this and then launches the appropriate
processes.
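A minimal sketch of that flow, in Python with asyncio. The Opserver
endpoint, the record format and the agent command line are assumptions
made for illustration; they are not the actual Cthulix interfaces.

    # Illustrative locate flow: ask a service on the Opserver which
    # Cthulix kernels this host should attach to, then launch one agent
    # process per kernel. Endpoint, record format and agent command line
    # are assumptions.

    import asyncio
    import json
    import socket

    OPSERVER = ("opserver.example", 4000)   # hypothetical locate service

    async def fetch_kernel_records():
        reader, writer = await asyncio.open_connection(*OPSERVER)
        writer.write(json.dumps({"op": "locate",
                                 "host": socket.gethostname()}).encode() + b"\n")
        await writer.drain()
        line = await reader.readline()
        writer.close()
        await writer.wait_closed()
        # e.g. [{"kernel": "k0", "addr": "10.0.0.5:5000"}, ...]
        return json.loads(line)

    async def main():
        records = await fetch_kernel_records()
        agents = []
        for rec in records:
            # One agent process per kernel this host should serve.
            agents.append(await asyncio.create_subprocess_exec(
                "cthulix-agent", "--kernel", rec["addr"]))
        await asyncio.gather(*(a.wait() for a in agents))

    if __name__ == "__main__":
        asyncio.run(main())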
Subsystems
Cthulix Locate
This is a binary included in the TFTP image. It contacts services
hosted on an Opserver in order to find kernels to connect to. It
launches an Agent Process for each. Cthulix Locate also maintains
a connection to a set of Ophub instances (see Ophub for context).
Locate Toolkit
This is a set of introspection mechanisms built into the Locate.
Ophub
Runs on the Opserver. Keeps track of the set of Cthulix Locates and
the status of each. Keeps track of the live kernels. When a kernel
raft changes membership, the raft keeps relevant Ophubs
informed. Ophub hosts a shell that allows you to have individual
Locates run commands from their Locate Toolkit. Releases are
published to and downloaded from the Ophub.
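A minimal sketch of the registry side of that role: an asyncio service
that accepts Locate registrations and heartbeats and records last-seen
status per Locate. The wire format and port are invented for
illustration; the real Ophub protocol is not published.

    # Illustrative Ophub registry: accept newline-delimited JSON messages
    # from Locates ("register", "heartbeat") and keep last-seen status
    # for each. Message format and port are assumptions.

    import asyncio
    import json
    import time

    LOCATES = {}   # host name -> {"addr": ..., "last_seen": ...}

    async def handle_locate(reader, writer):
        peer = writer.get_extra_info("peername")
        try:
            while line := await reader.readline():
                msg = json.loads(line)
                if msg.get("op") in ("register", "heartbeat"):
                    LOCATES[msg["host"]] = {"addr": peer,
                                            "last_seen": time.time()}
                    writer.write(b'{"ok": true}\n')
                    await writer.drain()
        finally:
            writer.close()

    async def main():
        server = await asyncio.start_server(handle_locate, "0.0.0.0", 4001)
        async with server:
            await server.serve_forever()

    if __name__ == "__main__":
        asyncio.run(main())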
Current Status: Development
The current momentum is to move this into Draft.
We could use
Rust developer experienced with Kerrisk/Stevens, to evolve the Locate
Toolkit.
Python developer to evolve the Ophub shell.
Engineer fluent in Cisco IOS to guide Ophub's expansion to robustly
manage switches. In particular, it would be useful to easily control
VLAN configurations.
Engineer fluent with IPMI to guide Ophub's integration with
remote-hands interfaces.
Rust/network developer to raft the Ophub.
-- Application Layer
Summary
This is the layer of the Cthulix Kernel and its gridapps.
Why this exists
This is the essence of the Cthulix system, and the focus of the
/preview/ page on this website.
Subsystems
Kernel Raft
Kernel API
Agent Process
This is a unix process that connects to a specific Cthulix Kernel,
and acts as its agent on that host. When the kernel seeks to start
a process on the host, it does this via the Agent (see the sketch
after this list).
Resource Process
This is a unix process that has been launched by an agent for the
purpose of doing work that serves application-layer business logic.
Init API
Gridapps
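A hedged sketch of the Agent Process / Resource Process relationship
described above: the agent keeps a connection to its kernel and, when
asked, launches a unix process to carry out the work. Addresses,
message format and the resource-process binary name are assumptions
made for illustration only.

    # Illustrative agent loop: stay connected to one Cthulix kernel and,
    # when the kernel asks for a process on this host, spawn a resource
    # process via a subprocess. All names here are assumptions.

    import asyncio
    import json

    KERNEL_ADDR = ("kernel0.example", 5000)   # hypothetical kernel endpoint

    async def agent():
        reader, writer = await asyncio.open_connection(*KERNEL_ADDR)
        writer.write(b'{"op": "agent_hello"}\n')
        await writer.drain()
        while line := await reader.readline():
            msg = json.loads(line)
            if msg.get("op") == "spawn":
                # Launch a resource process to run application-layer work.
                proc = await asyncio.create_subprocess_exec(
                    "cthulix-resource", "--task", msg["task"])
                writer.write(json.dumps({"op": "spawned",
                                         "pid": proc.pid}).encode() + b"\n")
                await writer.drain()

    if __name__ == "__main__":
        asyncio.run(agent())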
Current Status: Design
There is a work-in-progress to model the complete system in Wandle
DSL (https://github.com/cratuki/wandle), a design language for async
systems.
We could use
Developer with Rust fluency and experience building ETLs.
There is an opportunity to create a generalised ETL gridapp. This may
be the killer-app for this platform.
Protocol developer, fluent in Rust
The RPC protocol used to communicate between nodes is a neat,
isolated problem. Someone may wish to focus on building a custom
RPC mechanism for this, or even to replace it with an approach
based on Cap'n Proto.
Developer, fluent in Rust or Python3 asyncio, to evolve the Shell
The draft shell for the Cthulix kernel is built using the python3
asyncio library. There is an open opportunity to find ways to make
this shell more powerful and polished, whilst maintaining the low
barrier to entry. (We do not seek to recreate bash.)
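A minimal sketch of the shape such a shell can take: a line-oriented
asyncio REPL dispatching a small command table, kept deliberately
simpler than bash. The command names and the stubbed kernel call are
assumptions for illustration, not the draft shell's actual interface.

    # Illustrative kernel-shell skeleton: an asyncio line REPL dispatching
    # a small table of commands. The commands and the stubbed kernel query
    # are assumptions; this is not the draft shell's interface.

    import asyncio
    import sys

    async def cmd_help(_args):
        print("commands:", ", ".join(sorted(COMMANDS)))

    async def cmd_nodes(_args):
        # Placeholder for a call into the Cthulix kernel API.
        print("(would list nodes known to the attached kernel)")

    COMMANDS = {"help": cmd_help, "nodes": cmd_nodes}

    async def repl():
        loop = asyncio.get_running_loop()
        while True:
            # Read stdin in a worker thread so the event loop stays free.
            line = await loop.run_in_executor(None, sys.stdin.readline)
            if not line or line.strip() in ("quit", "exit"):
                break
            parts = line.split()
            if not parts:
                continue
            name, args = parts[0], parts[1:]
            handler = COMMANDS.get(name)
            if handler is None:
                print("unknown command:", name)
            else:
                await handler(args)

    if __name__ == "__main__":
        asyncio.run(repl())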
-- Documentation Layer
Summary
Platform manual, tutorials, Generally Accepted Cthulix Principles.
This platform is BSD-like in the sense that documentation is part of
the platform, rather than Linux-like, where documentation is
maintained independently.
Why this exists
Help new-starters to become fluent with the system.
Steer community discussion towards first-principles reasoning.
Current Status: Development
-- Community Layer
Summary
Incubate a Cthulix Community.
Why this exists
The project seeks to dethrone MPI. It needs to draw an audience to
succeed.
Current Status: Design
The current plan is to hold off on evangelising the project until there is
a draft reference-implementation available, including a decent set of
documentation.
We could use
Inquiries from firms/projects who work on interesting
distributed-compute problems and want to talk about Cthulix.