# Static Blog Generator for ptrace.dev

This is a highly tailored HTML generator and search server for my blog, so it
probably won't fit your needs. But you're welcome to explore the code and use
parts of it.

Since I wanted to avoid writing 30 lines of JavaScript, I opted for writing
1600+ lines of Jai code to have a search function.

The search server is built with sockets and a very, very minimal HTTP

Since it's so low-level, there's no hand-holding with regard to security and
stability. To guarantee those attributes, you need to make some things happen:

- Test the connection with several types of clients (described below)
- Load test via `wrk` and fuzz test via `libFuzzer`
- Sandbox with `seccomp`, `landlock`, and `namespaces` (more details below)

The code would be smaller if all clients had a good connection.
But the reality is different. Clients could be:

- Dripping (giving us only a few bytes every `n` seconds)
- Stalling (just refusing to send ACKs back)
- Spammy (flooding us with requests, aka DoS or DDoS)

The next issue is how you design your server. This server is synchronous and
single-threaded, which sounds horrible at first, but you'll be surprised how performant
it is, and it solves the problems mentioned above!

If you design your server with just the classic `accept()` and `recv()`, clients will
block your server thread, resulting in an unresponsive website for other clients, since
your code will block at `recv()`!

One might just™ open new threads, but depending on your machine, this concept will fail

But we can add `epoll` into the mix!

Instead of our thread waiting until the client sends its data, the kernel does the waiting
and reports which connections have data that is ready to process.
I recommend looking into `man epoll` for more details.
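
As a rough illustration of this readiness model (in Python rather than the project's Jai), the sketch below uses the standard `selectors` module, which is backed by epoll on Linux. The `socketpair()` stand-in clients and the names `ready_sockets` and `demo` are made up for this example and are not taken from the server's code.

```python
import selectors
import socket


def ready_sockets(selector, timeout):
    """Ask the kernel which registered sockets are readable right now."""
    return [key.fileobj for key, _events in selector.select(timeout)]


def demo():
    selector = selectors.DefaultSelector()  # picks epoll on Linux
    # Two stand-in clients: each socketpair is (client end, server end).
    slow_client, slow_conn = socket.socketpair()
    fast_client, fast_conn = socket.socketpair()
    for conn in (slow_conn, fast_conn):
        conn.setblocking(False)
        selector.register(conn, selectors.EVENT_READ)

    # Only the fast client sends a request; the slow one stays silent.
    fast_client.sendall(b"GET /search?q=jai HTTP/1.1\r\n\r\n")

    ready = ready_sockets(selector, timeout=1.0)
    only_fast_is_ready = ready == [fast_conn]
    # recv() cannot block here: the kernel already told us data is waiting.
    received = [conn.recv(4096) for conn in ready]

    selector.close()
    for sock in (slow_client, slow_conn, fast_client, fast_conn):
        sock.close()
    return only_fast_is_ready, received
```

The silent client never shows up in the ready list, so a single thread can keep serving everyone else while it waits.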

Clients no longer stall our thread, but they could still linger around and
waste a file descriptor. For this case, you have to implement a timeout that disconnects
the client after some time.
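
One possible way to track such a timeout, sketched in Python under my own assumptions: keep a timestamp per file descriptor and sweep for expired ones after every poll. `IdleTracker` and its methods are hypothetical names, and the timeout value is arbitrary.

```python
import time


class IdleTracker:
    """Remembers when each connection was last active."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self.last_seen = {}  # fd -> timestamp of the last activity

    def touch(self, fd, now=None):
        """Call on accept and whenever the client sends data."""
        self.last_seen[fd] = time.monotonic() if now is None else now

    def forget(self, fd):
        """Call after the connection has been closed."""
        self.last_seen.pop(fd, None)

    def expired(self, now=None):
        """Return the fds that have been idle longer than the timeout."""
        now = time.monotonic() if now is None else now
        return [fd for fd, seen in self.last_seen.items()
                if now - seen > self.timeout_seconds]
```

In an event loop you would close and `forget()` every fd returned by `expired()`. Note that refreshing the timestamp on every received byte lets a dripping client stay connected indefinitely; calling `touch()` only once at accept time turns this into a hard deadline per request.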

This server expects this HTTP request:

`GET /search?q=<search term> HTTP/1.x`

You could do the full validation routine that checks every method kind. You could also
implement some logic that handles the HTTP version.
Or you could ask yourself: what does _my_ server really need?

If you know what requests you expect, you can omit whole chunks of code, which is also

That being said, I know that my server only accepts GET requests, and it doesn't support
long-lived connections. It's just request in, response out, and close connection aka

The validation for that becomes really simple. You can straight up check whether the request
even starts with GET, and reject it early before doing any parsing.

Having this tiny validation routine makes it easier to spot bugs.
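
A minimal sketch of what such a validation routine could look like, written in Python for illustration. The function name `parse_search_request` and the exact rejection rules are assumptions, not the server's actual logic, and the query term is returned without URL decoding.

```python
def parse_search_request(raw: bytes):
    """Return the raw query term, or None if the request should be rejected."""
    # Cheap early rejection: anything that is not a GET is dropped unparsed.
    if not raw.startswith(b"GET "):
        return None
    line_end = raw.find(b"\r\n")
    if line_end == -1:
        return None  # request line is incomplete
    parts = raw[:line_end].split(b" ")
    if len(parts) != 3:
        return None  # expected exactly: method, target, version
    _method, target, version = parts
    if not version.startswith(b"HTTP/1."):
        return None
    prefix = b"/search?q="
    if not target.startswith(prefix):
        return None
    return target[len(prefix):]
```

Because every branch either rejects or narrows the input, a function of this shape is also a convenient target for a fuzzer.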

Now we can add fuzzing. I'm using LLVM's `libFuzzer` to run tests on
certain functions, which reveals whether your program blows up or does not handle specific

Since anyone can throw requests at my server, it is vital not only to ensure that the
server handles weird requests, but also to handle the case where someone takes over
the process through some unknown vulnerability in the program.

For this case, we use tools to create a sandboxed environment.

The easiest way would be to use `systemd.exec` and just start your program in such a
restrictive environment.

If you don't like the dependency on systemd, you could also implement the sandboxing
yourself. Or implement partial sandboxing and close gaps with systemd -

I'm using two concepts for restricting my program:

There's also a third concept, called `namespaces`. But for this tiny project I didn't
implement it in code, since it would require way more care and LOC
(which is the reason why I reached for `systemd.exec`).

At the top of the Markdown file we have to declare some metadata.
This is what an entry looks like:

published: 2026-01-01-12-00

Some text that gets rendered as HTML later.

Let's walk through each element.

If `hidden: true` is provided, the entry is ignored entirely.
Providing `hidden: false` or omitting it will render the entry.

Some blog posts use code blocks that overflow the standard width.
You can set the width in pixels individually per blog post here.

The title of the blog post will be presented as big orange text.

When the blog post will be published, in the form `YYYY-MM-DD-H-M`.
You could omit hours and minutes and just supply `YYYY-MM-DD`.

It is also possible to omit it or use `now` - in both cases it will

After running this program, it will add the datetime as milliseconds.
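
To make the date handling concrete, here is a hedged Python sketch of converting such a value to milliseconds. It assumes UTC, assumes midnight when hours and minutes are omitted, and assumes that a missing value or `now` means the current time; none of these details are confirmed by the text above. `to_epoch_millis` is a hypothetical name.

```python
from datetime import datetime, timezone


def to_epoch_millis(value=None, now=None):
    """Convert a `published`/`updated` value to milliseconds since the epoch."""
    if value is None or value.strip() == "now":
        # Assumption: omitted or `now` resolves to the current time.
        moment = now if now is not None else datetime.now(timezone.utc)
    else:
        fields = [int(part) for part in value.strip().split("-")]
        if len(fields) == 3:
            fields += [0, 0]  # date only: assume midnight
        if len(fields) != 5:
            raise ValueError(
                f"expected YYYY-MM-DD or YYYY-MM-DD-H-M, got {value!r}")
        year, month, day, hour, minute = fields
        # Assumption: timestamps are interpreted as UTC.
        moment = datetime(year, month, day, hour, minute, tzinfo=timezone.utc)
    return int(moment.timestamp() * 1000)
```

Under those assumptions, `to_epoch_millis("2026-01-01-12-00")` yields `1767268800000`.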

When the article was last updated. You could also omit it or
use `now` as mentioned above.

It is important to close our 'header' with `META;` so it gets parsed correctly.
If this is missing, the program will complain.
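
The following Python sketch shows one way a header like this could be split from the body. It assumes one `key: value` pair per line and `META;` on a line of its own, which is a guess based on the fragments shown above; `split_header` is a hypothetical name and not the generator's actual function.

```python
def split_header(text):
    """Split a post into (metadata dict, body) at the `META;` line."""
    lines = text.splitlines()
    for index, line in enumerate(lines):
        if line.strip() == "META;":
            break
    else:
        # No terminator found: complain instead of guessing where the body starts.
        raise ValueError("header is not closed with `META;`")

    meta = {}
    for line in lines[:index]:
        if not line.strip():
            continue  # skip blank lines inside the header
        key, separator, value = line.partition(":")
        if not separator:
            raise ValueError(f"malformed header line: {line!r}")
        meta[key.strip()] = value.strip()

    body = "\n".join(lines[index + 1:])
    return meta, body
```

Failing loudly on a missing terminator avoids silently rendering metadata lines as part of the article.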

- 2026-04-27: When viewing a search result that contains an image, the image does not
  load, since the path is wrong.
  This happens because `documents_html: *[]Entry` is not filtered,
  so it doesn't contain updated image paths.