cthulix.com index preview status consulting


status

    TLDR, there has been no code release so far. The principal developer
    signed with a commercial venture in early 2021. There will be a period of
    inactivity for this project.


    This document explores project status through a focus on the layers of the
    platform, and is current as of 15 January 2021.
    
    Conceptual layers of the Cthulix operating system

        ------------------------
        | Application Layer    |
        ------------------------    ------------------------
        | Locate Layer         |    | Community Layer      |
        ------------------------    ------------------------
        | Driver Layer         |    | Documentation Layer  |
        ------------------------    ------------------------
         ^ Technology stack          ^ Adoption stack

    Development stages

        Design -> Development -> Draft -> Release


    -- Driver Layer

    Summary

        This layer concerns netbooting Linux into the physical hosts that
        serve as Cthulix nodes.

    Why this exists

        If you have thousands of hosts, you do not want the operating system
        of each being maintained as a unique snowflake.

        Rather, we network-boot an operating system into system RAM. At each
        boot, a host loads the current OS version.

    Subsystems

        Opserver

            There is a dedicated host, the Opserver. This is a non-Cthulix
            host that serves DHCP/PXE and related services.
            
            This can be made redundant within an infrastructure.

        Cthulix Linux PXE Distribution (Cthulix/LPD).
        
            This is the Linux distribution that Cthulix physical hosts boot,
            via PXE.

    Current Status: Draft

        This layer moved into Draft on 10 January 2021. It requires polish,
        particularly for construction of the Cthulix/LPD images.

        Opserver

            Working reference implementation, built on Debian 10.
            
            Decent documentation coverage for setup and troubleshooting.

        Cthulix Linux PXE Distribution (Cthulix/LPD)

            Working. Verified against physical-host and qemu configurations.

            This needs to be streamlined, scripted and documented.


    We could use

        Linux distro developer

            The current Arch/mkinitcpio approach is solid. We do not need to
            evolve beyond it. Still, there is room to create something tighter
            via the following combination:

                Busybox/Toybox
            
                Eudev and no systemd

                Maintained via a crafted tree of Dracut scripts.

            Ideal: evolve Cthulix/LPD into a toolkit that serves even
            non-Cthulix users who want to ram-boot tight Linux systems.

        Linux kernel developer

            Directly participate in the Linux kernel effort to extend io_uring
            coverage. Create a pure-async kernel derived from Linux that
            offers only io_uring services. Seek to significantly reduce the
            code footprint of the pure-async kernel whilst retaining full
            driver compatibility with Linux.

        Core library guru

            Create a build-tree well-suited to benchmarking and optimising
            glibc, gmp, mpfr, mpc, blas across Intel/Ryzen/EPYC.


    -- Locate Layer

    Summary

        This layer is responsible for the space between (1) a Cthulix physical
        host that has booted and (2) a Cthulix node connecting to its kernel.

        The key purpose of this layer is to allow newly-booted hosts to work
        out who they are and which Cthulix kernels they should attach to.

    Why this exists

        At boot, a host does not know which Cthulix Systems to associate with.
        Through Locate, it determines this and then launches the appropriate
        processes.

    Subsystems

        Cthulix Locate
        
            This is a binary included in the TFTP image. It contacts services
            hosted on an Opserver in order to find kernels to connect to. It
            launches an Agent Process for each. Cthulix Locate also maintains
            a connection to a set of Ophub instances (see Ophub for context).
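
            The flow above can be sketched roughly as follows. This is an
            illustration only, written in Python for brevity. No Cthulix code
            has been released, so every name here (KernelRef, plan_agents,
            the cthulix-agent command, the lookup function) is a hypothetical
            stand-in rather than the project's actual interface.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class KernelRef:
    """Hypothetical record describing one Cthulix kernel to attach to."""
    name: str
    host: str
    port: int


def plan_agents(host_id: str,
                lookup: Callable[[str], Iterable[KernelRef]]) -> List[List[str]]:
    """Ask an Opserver-side lookup which kernels this host should attach to,
    and return one Agent Process command line per kernel."""
    commands = []
    for kernel in lookup(host_id):
        commands.append([
            "cthulix-agent",              # hypothetical agent binary name
            "--kernel", kernel.name,      # hypothetical flags
            "--connect", f"{kernel.host}:{kernel.port}",
        ])
    return commands


def fake_opserver_lookup(host_id: str) -> List[KernelRef]:
    """Stand-in for the Opserver service; a real one would be a network call."""
    table = {
        "host-a": [KernelRef("k1", "10.0.0.5", 7000),
                   KernelRef("k2", "10.0.0.6", 7000)],
    }
    return table.get(host_id, [])
```

            Returning command lines rather than spawning processes directly
            keeps the lookup step easy to test without a network or an agent
            binary.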

        Locate Toolkit
        
            This is a set of introspection mechanisms built into Cthulix Locate.

        Ophub
        
            Runs on the Opserver. Keeps track of the set of Cthulix Locates
            and the status of each. Keeps track of the live kernels. When a
            kernel raft changes membership, the raft keeps relevant Ophubs
            informed. Ophub hosts a shell that allows you to have individual
            Locates run commands from their Locate Toolkit. Releases are
            published to and downloaded from the Ophub.
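
            As a rough sketch of the bookkeeping described above, an Ophub
            could hold state along these lines. The class and method names
            are hypothetical. Treating an empty raft as "kernel no longer
            live" is an assumption made for the example, not documented
            Cthulix behaviour.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class OphubState:
    """Hypothetical in-memory model of what an Ophub tracks."""
    # Locate id -> last status that Locate reported.
    locates: Dict[str, str] = field(default_factory=dict)
    # Kernel name -> hosts currently in that kernel's raft.
    kernel_rafts: Dict[str, Set[str]] = field(default_factory=dict)

    def report_locate(self, locate_id: str, status: str) -> None:
        """Record (or refresh) the status of one Cthulix Locate."""
        self.locates[locate_id] = status

    def raft_membership_changed(self, kernel: str, members: Set[str]) -> None:
        """Called when a kernel raft informs the Ophub of new membership."""
        if members:
            self.kernel_rafts[kernel] = set(members)
        else:
            # Assumption: a raft with no members means the kernel is gone.
            self.kernel_rafts.pop(kernel, None)

    def live_kernels(self) -> Set[str]:
        """Names of kernels that currently have at least one raft member."""
        return set(self.kernel_rafts)
```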

    Current Status: Development

        The current momentum is to move this into Draft.

    We could use

        Rust developer experienced with Kerrisk/Stevens, to evolve the Locate
        Toolkit.

        Python developer to evolve the Ophub shell.

        Engineer fluent in Cisco IOS to guide Ophub's expansion to robustly
        manage switches. In particular, it would be useful to easily control
        VLAN configurations.

        Engineer fluent with IPMI interfaces to guide Ophub's integration with
        remote-hands interfaces.

        Rust/network developer to raft the Ophub.


    -- Application Layer

    Summary

        This is the layer of the Cthulix Kernel and its gridapps.

    Why this exists

        This is the essence of the Cthulix system, and the focus of the
        /preview/ page on this website.

    Subsystems
        
        Kernel Raft

        Kernel API

        Agent Process

            This is a unix process that connects to a specific Cthulix Kernel,
            and acts as its agent on that host. When the kernel seeks to start
            a process on the host, it does this via the Agent.

        Resource Process

            This is a unix process that has been launched by an agent for the
            intent of doing work that serves application layer business logic.
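
            The Agent/Resource relationship can be illustrated with a small
            asyncio sketch. The request format ({"op": "start", "argv":
            [...]}) and the function names are invented for this example.
            The real protocol between kernel and agent is still at the
            Design stage.

```python
import asyncio
import sys
from typing import List


async def launch_resource_process(argv: List[str]) -> int:
    """Start a child process on this host on behalf of the kernel and
    return its exit code once it finishes."""
    proc = await asyncio.create_subprocess_exec(*argv)
    return await proc.wait()


async def agent_handle_request(request: dict) -> dict:
    """Handle one hypothetical 'start' request sent by the kernel."""
    if request.get("op") != "start":
        return {"ok": False, "error": "unsupported op"}
    code = await launch_resource_process(request["argv"])
    return {"ok": code == 0, "exit_code": code}


if __name__ == "__main__":
    # Use the running Python interpreter as a stand-in Resource Process.
    demo = {"op": "start",
            "argv": [sys.executable, "-c", "print('resource work')"]}
    print(asyncio.run(agent_handle_request(demo)))
```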

        Init API

        Gridapps

    Current Status: Design
        
        Work is in progress to model the complete system in Wandle DSL
        (https://github.com/cratuki/wandle), a design language for async
        systems.

    We could use

        Developer with Rust fluency and experience building ETLs

            There is opportunity to create a generalised ETL gridapp. This may
            be the killer-app for this platform.

        Protocol developer, fluent in Rust

            The RPC protocol used to communicate between nodes is a neat,
            isolated problem. Someone may wish to focus on building a custom
            RPC mechanism for this, or even to replace it with an approach
            based on Cap'n Proto.

        Developer, fluent in Rust or Python 3 asyncio, to evolve the Shell

            The draft shell for the Cthulix kernel is built using the Python 3
            asyncio library. There is an open opportunity to find ways to make
            this shell more powerful and polished, whilst maintaining the low
            barrier to entry. (We do not seek to recreate bash.)
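
            To give a feel for the scale of the problem, here is a minimal
            asyncio command dispatcher. It is not the Cthulix shell, which
            has not been released. MiniShell, register, run_line and
            cmd_echo are hypothetical names chosen for the sketch.

```python
import asyncio
import shlex
from typing import Awaitable, Callable, Dict, List

# A handler takes the command's arguments and returns text to display.
Handler = Callable[[List[str]], Awaitable[str]]


class MiniShell:
    """Hypothetical command dispatcher; not the actual Cthulix shell."""

    def __init__(self) -> None:
        self.commands: Dict[str, Handler] = {}

    def register(self, name: str, handler: Handler) -> None:
        """Make a coroutine available under the given command name."""
        self.commands[name] = handler

    async def run_line(self, line: str) -> str:
        """Parse one input line and await the matching handler."""
        parts = shlex.split(line)
        if not parts:
            return ""
        handler = self.commands.get(parts[0])
        if handler is None:
            return f"unknown command: {parts[0]}"
        return await handler(parts[1:])


async def cmd_echo(args: List[str]) -> str:
    """Example command: return its arguments joined by spaces."""
    return " ".join(args)
```

            Because each command is a coroutine, a handler can await network
            calls to a kernel without blocking the rest of the shell.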


    -- Documentation Layer

    Summary

        Platform manual, tutorials, Generally Accepted Cthulix Principles.

        This platform is BSD-like in the sense that documentation is part of
        the platform, rather than Linux-like, where documentation is
        independent.

    Why this exists

        Help new-starters to become fluent with the system.

        Steer community discussion towards first-principles reasoning.

    Current Status: Development


    -- Community Layer

    Summary

        Incubate a Cthulix Community.

    Why this exists

        The project seeks to dethrone MPI. It needs to draw an audience to
        succeed.

    Current Status: Design

        Current plan is to hold off on evangelising the project until there is
        a draft reference-implementation available, including a decent set of
        documentation.

    We could use

        Inquiries from firms/projects that work on interesting
        distributed-compute problems and want to talk about Cthulix.