schmonz.com is a Fediverse instance that uses the ActivityPub protocol. In other words, users at this host can communicate with people that use software like Mastodon, Pleroma, Friendica, etc. all around the world.
This server runs the snac software and there is no automatic sign-up process.
Giving up for today, running into the #c #coding "worst case":
* some bug leading to a crash is hidden without compiler optimizations
* the stack where it happens only contains two frames, one from libthr.so and one from libc.so
Only way forward will be disabling features to see whether it stops crashing ... tomorrow 🤯
Solved! 🥳
This was a pretty "interesting" bug. Remember when I invented a way to implement #async / #await in #C, for jobs running on a threadpool. Back then I said it only works when completion of the task resumes execution on the *same* pool thread.
Trying to improve overall performance, I found the complex logic to identify the thread job to put on a pool thread a real deal-breaker. Just having one single MPMC queue with a single semaphore for all pool threads to wait on is a lot more efficient. But then, a job continued after an awaited task will resume on a "random" thread.
It theoretically works by making sure to restore the CORRECT context (the original one of the pool thread) every time after executing a job, whether partially (up to the next await) or completely.
Only it didn't, at least here on #FreeBSD, and I finally understood the reason for this was that I was using #TLS (thread-local storage) to find the context to restore.
Well, most architectures store a pointer to the current thread metadata in a register. #POSIX user #context #switching saves and restores registers. I found a source claiming that the #Linux (#glibc) implementation explicitly does NOT include the register holding a thread pointer. Obviously, #FreeBSD's implementation DOES include it. POSIX doesn't have to say anything about that.
In short, avoiding TLS accesses when running with a custom context solved the crash. 🤯
Finally getting somewhere working on the next evolution step for #swad. I have a first version that (normally 🙈) doesn't crash quickly (so, no release yet, but it's available on the master branch).
The good news: It's indeed an improvement to have *multiple* parallel #reactor (event-loop) threads. It now handles 3000 requests per second on the same hardware, with overall good response times and without any errors. I uploaded the results of the stress test here:
https://zirias.github.io/swad/stress/
The bad news ... well, there are multiple.
1. It got even more memory hungry. The new stress test still simulates 1000 distinct clients (trying to do more fails on my machine as #jmeter can't create new threads any more...), but with delays reduced to 1/3 and doing 100 iterations each. This now leaves it with a resident set of almost 270 MiB ... tuning #jemalloc on #FreeBSD to return memory more promptly reduces this to 187 MiB (which is still a lot) and reduces performance a bit (some requests run into 429, overall response times are worse). I have no idea yet where to start trying to improve *this*.
2. It requires tuning to manage that load without errors, mainly using more threads for the thread pool, although *these* threads stay almost idle ... which probably means I have to find ways to make putting work on and off these threads more efficient. At least I have some ideas.
3. I've seen a crash which only happened once so far, no idea as of now how to reproduce. *sigh*. Massively parallel code in C really is a PITA.
Seems the more I improve here, the more I find that *should* also be improved. 🤪
Just released: #swad 0.11 -- the session-less swad is done!
Swad is the "Simple Web Authentication Daemon", it adds cookie/form #authentication to your reverse #proxy, designed to work with #nginx' "auth_request". Several modules for checking credentials are included, one of which requires solving a crypto challenge like #Anubis does, to allow "bot-safe" guest logins. Swad is written in pure #C, compiles to a small (200-300kiB) binary, has minimal dependencies (zlib, OpenSSL/LibreSSL and optionally libpam) and *should* work on many #POSIX-alike systems (#FreeBSD tested a lot, #Linux and #illumos also tested)
This release is the first one not to require a server-side session (which consumes a significant amount of RAM on really busy sites), instead signed Json Web Tokens are now implemented. For now, they are signed using HMAC-SHA256 with a random key generated at startup. A future direction could be support for asymmetric keys (RSA, ED25519), which could open up new possibilities like having your reverse proxy pass the signed token to a backend application, which could then verify it, but still not forge it.
Read more, grab the latest .tar.xz, build and install it ... here: 😎
Just released: #swad 0.10
https://github.com/Zirias/swad/releases/tag/v0.10
Swad is the "Simple Web Authentication Daemon". If you're looking for a way to add #authentication (and/or proof-of-work access as known from #anubis) to your #nginx reverse proxy -- without adding yet another reverse proxy -- swad could be for you! It's written in pure #C, has few external dependencies (just zlib, and optionally OpenSSL/Libressl and/or libpam) and compiles to a pretty small binary. It's designed for usage with nginx' 'auth_request'.
Swad is tested on #FreeBSD, some basic functionality tests were also done on #Linux and #illumos (descendant from #solaris). It *should* build and work on most #POSIX-alike systems.
This release mainly brings performance improvements and a few bugfixes. It's now stress-tested with Apache jmeter, verifying it can deal with at least 1000 requests per second on my personal (somewhat limited) FreeBSD host machine.
Now that #swad 0.7 is released, it's time to prepare a new release of #poser, my own lib supporting #services on #POSIX systems, following a #reactor with #threadpool design.
During development of swad, I moved poser from using strictly only POSIX APIs (with the scalability limits of e.g. #select) to auto-detected support for #kqueue, #epoll, #eventports, #signalfd and #timerfd (so now it could, in theory(!), "compete" with e.g. libevent). I also fixed quite some hidden bugs, and added more base functionality, like a #dictionary using nested hashtables internally, or #async tasks mimicking the async/await pattern known from e.g, #csharp. I also deprecated two features, the periodic and global "service tick" (superseded by individual timers) and the "resolve hosts" property of a "connection" (superseded by a separate resolve class).
I'll have to decide on a few things, e.g. whether I'll remove the deprecated stuff immediately and bump the major version of the "posercore" lib. I guess I'll do just that. I'd also like to add all the web-specific stuff (http 1.0/1.1 server) that's currently part of the swad code as a "poserweb" lib. This would get a major version of 0, indicating a generally unstable API/ABI as of now....
And then, I'd have to decide where certain utility classes belong to. The rate limiter is probably useful for things other than web, so it should probably go to core. What about url encoding/decoding, for example? 🤔
Stay tuned, something will come here, maybe helping you to write a nice service in plain #C 😎:
I'd boost it (the blog is federated - @ltning), but @writefreely doesn't render nicely for some reason.
#Fail #DOS #C #Retrocomputing
(Also @obsoletemediauk - got any new updates lately? :)
When writing a #daemon that follows best practices (handling of #detaching with a locked #pidfile, and #SIGHUP for #configuration #reload), an extremely simple "init script" will do (reliably!) for #FreeBSD's mewburn-rc. 😎
I'm trying to add "genric" #signal handling to #poser. Ultimate goal is to provide a way for #swad to handle #SIGHUP, although signal handling must be done in poser's main event loop (signals are only ever unblocked while waiting for file descriptor events).
Okay, I could just add explicit handling for SIGHUP. But a generic solution would be nicer. Just for example, a consumer might be interested in #SIGINFO which doesn't even exist on all platforms ... 🤔
Now, #POSIX specs basically just say signal constants are "integer values". Not too helpful here. Is it safe to assume an upper bound for signal numbers on "real world" OS implementations, e.g. 64 like on #Linux? Should I check #NSIG and, if not defined, just define it to 64? 🙈
Revisiting #async / #await in #POSIX C, trying to "add some #security" 🙈
Recap: Consider a classic #reactor-style service in C with a #threadpool attached to run the individual request handlers. When such a handler needs to do some I/O, it'll have to wait for its completion, and doing so is kind of straight forward by just blocking the worker thread executing the job until whatever I/O was needed completes.
Now, blocking a thread is never a great thing to do and I recently tooted about an interesting alternative I found: Make use of the (unfortunately deprecated) POSIX user context switching to enable releasing the worker thread while waiting. In a nutshell, you create a context with #makecontext that has its own private #stack, and then you can use #swapcontext to get off the thread, and later again to get back on the thread. A minor issue is: It must be the *same* thread ... so you might have to wait until it completes something else before you can resume your job. But then, that's probably okayish, you can make sure in your job scheduling to only use worker threads with awaited tasks attached when no other thread is available.
In my first implementation, I just used #malloc to create a 64kiB private stack for each thread job. That's perfectly fine if you can guarantee your job will never consume more stack space, AND it won't have any vulnerabilities allowing some attacker to mess with the stack. But in practice, especially for a library offering this async/await implementation, it's nothing but a wild #CVE generator.
So, I now improved on that:
* Allocate a much larger stack of now 2MiB. That alone makes issues at least less likely. And on a sane modern OS, we can still assume pages will only be mapped "on demand".
* Only allocate the stack directly before running the thread job, and delegate allocation to some internal "stack manager" that keeps track of all allocated stacks and reuses them, only freeing them on exit. This should avoid most of the allocation overhead.
* If MAP_ANON / MAP_ANONYMOUS is available, use #mmap for allocating the stack. That at least gives a chance to stay away from other allocations ....
* But finally, if MAP_STACK is available, use this flag! From my research, #FreeBSD, #OpenBSD and #NetBSD will for example make sure there's at least one "guard page" below a stack mapped with this flag, so a stack overflow consistently takes the SIGSEGV emergency exit 😆. #Linux knows this flag as well, but doesn't seem to implement such protection at this time ... 🤔
Just released: #swad 0.5
swad is the "Simple Web Authentication Daemon", meant to add authentication using a #cookie and a #login form to your reverse proxy. It's designed for #nginx' "auth_request" module. It's written in pure #C with very few external dependencies (zlib, and depending on build options OpenSSL/LibreSSL and #PAM).
And with this release, it also allows guest logins using the crypto puzzle you may already know from #Anubis!
Read more in the release notes, grab the .tar.xz and build/install it 😎
Just released: #swad v0.3!
https://github.com/Zirias/swad/releases/tag/v0.3
swad is the "Simple Web Authentication Daemon", your tiny, efficient and (almost) dependency-free solution to add #cookie + login #form #authentication to whatever your #reverse #proxy offers. It's written in pure #C, portable across #POSIX platforms. It's designed with #nginx' 'auth_request' in mind, example configurations are included.
This release brings a file-based credential checker in addition to the already existing one using #PAM. Also lots of improvements, see details in the release notes.
I finally added complete build instructions to the README.md:
https://github.com/Zirias/swad
And there's more documentation available: manpages as well as a fully commented example configuration file.
I need some advise: Is there a good portable and free (really free, not GPL!) #implementation of #bcrypt in #C around?
There's #OpenBSD source I could use, but integrating that would probably be quite a hassle...
Background: I want to start creating a second credential checker for #swad using files. And it probably makes sense to support a sane subset of #Apache's #htpasswd format here. Looking at the docs:
https://httpd.apache.org/docs/current/misc/password_encryptions.html
... the "sane subset" seems to be just bcrypt. *MAYBE* also this apache-specific flavor of "iterated" MD5, although that sounds a bit fishy ...
Today, I implemented the #async / #await pattern (as known from #csharp and meanwhile quite some other languages) ...
... in good old #C! 😎
Well, at least sort of.
* It requires some standard library support, namely #POSIX user context switching with #getcontext and friends, which was deprecated in POSIX-1.2008. But it's still available on many systems, including #FreeBSD, #NetBSD, #Linux (with #glibc). It's NOT available e.g. on #OpenBSD, or Linux with some alternative libc.
* I can't do anything about the basic language syntax, so some boilerplate comes with using it.
* It has some overhead (room for extra stacks, even extra syscalls as getcontext unfortunately also always saves/restores the signal mask)
But then ... async/await in C! 🥳
Here are the docs:
https://zirias.github.io/poser/api/latest/class_p_s_c___async_task.html
I finally eliminated the need for a dedicated #thread controlling the pam helper #process in #swad. 🥳
The building block that was still missing from #poser was a way to await some async I/O task performed on the main thread from a worker thread. So I added a class to allow exactly that. The naive implementation just signals the main thread to carry out the requested task and then waits on a #semaphore for completion, which of course blocks the worker thread.
Turns out we can actually do better, reaching similar functionality like e.g. #async / #await in C#: Release the worker thread while waiting to do other jobs. The key to this is user context switching support like offered by #POSIX-1.2001 #getcontext and friends. Unfortunately it was deprecated in POSIX-1.2008 without an obvious replacement (the docs basically say "use threads", which doesn't work for my scenario), but still lots of systems provide it, e.g. #FreeBSD, #NetBSD, #Linux (with #glibc) ...
The posercore lib now offers both implementations, prefering to use user context switching if available. It comes at a price: Every thread job now needs its private stack space (I allocated 64kiB there for now), and of course the switching takes some time as well, but that's very likely better than leaving a task idle waiting. And there's a restriction, resuming must still happen on the same thread that called the "await", so if this thread is currently busy, we have to wait a little bit longer. I still think it's a very nice solution. 😎
In any case, the code for the PAM credential checker module looks much cleaner now (the await "magic" happens on line 174):
https://github.com/Zirias/swad/blob/57eefe93cdad0df55ebede4bd877d22e7be1a7f8/src/bin/swad/cred/pamchecker.c
On a #coding mission to improve my #poser lib 😎.
In the current implementation of #swad, I don't really like that I need an extra thread, just to control a child #process. A first piece to add to poser is generic "child process support", which I'm testing right now. I realized I could reuse my #Connection class, which was built for #sockets, but works just as well with #pipes 🙃
TODO now is mostly testing. See screenshots for some mimimal testing code and its output ... would you like this kind of interface? 🤔
More #poser improvements:
* Use arc4random() if available, avoids excessive syscalls just to get high-quality random data
* Add a "resolver" to do #reverse #DNS lookups in a batch, remove the reverse lookup stuff from the connection which was often useless anyways, when a short-lived connection was deleted before resolving could finish 🙈
As a result, #swad can now reliably log requests with reverse lookups enabled 🥳
Still working on #swad, and currently very busy with improving quality, most of the actual work done inside my #poser library.
After finally supporting #kqueue and #epoll, I now integrated #xxhash to completely replace my previous stupid and naive hashing. I also added a more involved #dictionary class as an alternative to the already existing #hashtable. While the hashtable's size must be pre-configured and collissions are only ever resolved by storing linked lists, the new dictionary dynamically nests multiple hashtables (using different bits of a single hash value). I hope to achieve acceptable scaling while maintaining also acceptable memory overhead that way ...
#swad already uses both container classes as appropriate.
Next I'll probably revisit poser's #threadpool. I think I could replace #pthread condition variables by "simple" #semaphores, which should also reduce overhead ...
First change since #swad 0.2 will actually be a (huge?) improvement to my #poser lib. So far, it was hardwired to use the good old #POSIX #select call. This is perfectly fine for handling around up to 100 (or at least less than 1000, YMMV) clients.
Some #select implementations offer defining the upper limit for checked file descriptors. Added support for that.
POSIX also specifies #poll, which has very similar #scalability issues, but slightly different. Added support for this as well.
And then, I went on to add support for the #Linux-specific #epoll and #BSD-specific #kqueue (#FreeBSD, #NetBSD, #OpenBSD, ...) which are both designed to *solve* any scalability issues 🥳
A little thing that slightly annoyed me about kqueue was that there's no support for temporarily changing the signal mask, so I had to do the silly dance shown in the screenshot. OTOH, it offers changing event filters and getting events in a single call, which I might try to even further optimize ... 😎
Just released: #swad v0.2
SWAD is the "Simple Web Authentication Daemon", meant to add #cookie #authentication with a simple #login form and configurable credential checker modules to a reverse #proxy supporting to delegate authentication to a backend service, like e.g. #nginx' "auth_request". It's a very small piece of software written in pure #C with as little external dependencies as possible. It requires some #POSIX (or "almost POSIX", like #Linux, #FreeBSD, ...) environment, OpenSSL (or LibreSSL) for TLS and zlib for response compression.
Currently, the only credential checker module available offers #PAM authentication, more modules will come in later releases.
swad 0.2 brings a few bugfixes and improvements, especially helping with security by rate-limiting the creation of new sessions as well as failed login attempts. Read details and grab it here:
Released: #swad v0.1 🥳
Looking for a simple way to add #authentication to your #nginx reverse proxy? Then swad *could* be for you!
swad is the "Simple Web Authentication Daemon", written in pure #C (+ #POSIX) with almost no external dependencies. #TLS support requires #OpenSSL (or #LibreSSL). It's designed to work with nginx' "auth_request" module and offers authentication using a #cookie and a login form.
Well, this is a first release and you can tell by the version number it isn't "complete" yet. Most notably, only one single credentials checker is implemented: #PAM. But as pam already allows pretty flexible configuration, I already consider this pretty useful 🙈
If you want to know more, read here:
https://github.com/Zirias/swad
Trying to come up with my own little self-hosted #http #authentication #daemon to work with #nginx' "authentication request" facility ... first step done! 🥳
Now I have a subset of HTTP 1.x implemented in #C, together with a dummy handler showing nothing but a static hello-world root document.
I know it's kind of stubborn doing that in C, but hey, #coding it is great fun 🙈
I just decided my new tool needs protection against #CSRF. It's surprisingly little code, once the generic tooling is in place 😎
https://github.com/Zirias/swad/commit/ecfeb68f87245d621c53d7ca440f9c36d909a18a
Result of today's #C #coding session: I can now authenticate with #PAM 🥳
https://github.com/Zirias/swad/commit/8983ae30955a407c4732c6e3e3a4aeba6db77a93
This will soon be "production-ready" at least for me 😎
Once you have your happy path working, it's time to deal with all the "unhappy stuff" (aka proper error handling) to reach production quality ... 🙈
https://github.com/Zirias/swad/commit/a0417bbc1db4cb5ca9f99534d04d1723492107f9
First "production test" successful 💪 ... after band-aid "deployment" (IOW, scp binaries to the prod jail).
#swad integrates with #nginx exactly as I planned it. And #PAM authentication using a child process running as root also just works (while the main process dropped privileges). 🥳
So, I guess I can say goodbye to #AI #bots hammering my poor DSL connection just to download poudriere build logs.
Still a lot to do for #swad: Make it nicer. So many ideas. Best start would probably be to implement more credentials checking modules besides PAM.