# Static Blog Generator for ptrace.dev
This is a very tailored HTML generator and search server for my blog, in the sense
that it probably won't fit your needs. But you're welcome to explore the code and
use parts of it.

Since I wanted to avoid writing 30 lines of JavaScript, I opted for writing
1600+ lines of Jai code to have a search function.


## Search Server
The search server is built with sockets and a very, very minimal HTTP
implementation.
Since it's so low-level, there's no hand-holding with regard to security and
stability. To get those properties, you need to make some things happen:

- Test the connection with several types of clients (described below)
- Load test via `wrk` and fuzz via `libFuzzer`
- Sandbox with `seccomp`, `landlock`, and `namespaces` (more details below)


### Clients
The code would be smaller if all clients had a good connection.
But reality is different. Clients could be:

- Slow
- Dripping (giving us only a few bytes every `n` seconds)
- Stalling (just refusing to send ACKs back)
- Spammy (flooding us with requests, aka DoS or DDoS)

The next issue is how you design your server. This server is synchronous and
single-threaded. That sounds horrible at first, but you'll be surprised how performant
it is, and it solves the problems mentioned above!

#### Epoll
If you design your server with just the classic `accept()` and `recv()`, clients will
block your server thread. The result is an unresponsive website for other clients, since
your code will block at `recv()`!

One might just™ open new threads, but depending on your machine, this concept will fail
very quickly.

But we can add `epoll` into the mix!

Instead of our code waiting until a client sends its data, the kernel does the waiting
and tells us which sockets are ready to be processed.
I recommend looking into `man epoll` for more details.
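
To make the idea concrete, here is a minimal sketch of one iteration of such a loop.
It is written in Python (via `select.epoll`) rather than Jai to keep it short, and the
names `make_listener` and `poll_once` are made up for this example; they are not taken
from this repository.

```python
import select
import socket


def make_listener(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Create a non-blocking listening TCP socket (port 0 picks a free port)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen()
    listener.setblocking(False)
    return listener


def poll_once(ep: select.epoll, listener: socket.socket, conns: dict,
              timeout: float = 0.5) -> list:
    """Handle one batch of ready events and return the (fd, data) pairs read."""
    received = []
    # The kernel does the waiting; we only get file descriptors that are ready.
    for fd, events in ep.poll(timeout):
        if fd == listener.fileno():
            # The listener is readable: a client is waiting to be accepted.
            client, _addr = listener.accept()
            client.setblocking(False)
            conns[client.fileno()] = client
            ep.register(client.fileno(), select.EPOLLIN)
            continue
        client = conns[fd]
        try:
            data = client.recv(4096) if events & select.EPOLLIN else b""
        except OSError:
            data = b""
        if data:
            received.append((fd, data))
        else:
            # Empty read, hang-up, or error: drop the connection.
            ep.unregister(fd)
            del conns[fd]
            client.close()
    return received
```

A real server would register the listener once with `ep.register(listener.fileno(), select.EPOLLIN)`
and then call `poll_once` in an endless loop, handing every received chunk to the request
parser. A slow client never blocks the thread, because `recv()` is only called on
sockets the kernel has reported as readable.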

#### Timeout
Clients are not stalling our thread anymore, but they could still linger around and
waste a file descriptor. For this case, you have to implement a timeout that disconnects
the client after some time.
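
One possible way to sketch this, again in Python: give every connection a fixed deadline
at accept time and sweep for expired ones on each loop iteration. The `Deadlines` class
and the 5 second value in the usage below are assumptions for illustration, not this
server's actual bookkeeping. The deadline is deliberately not extended on partial reads,
so a dripping client cannot keep its connection alive forever.

```python
import time
from typing import Callable, Dict, List, Optional


class Deadlines:
    """Track one fixed deadline per connection (hypothetical helper)."""

    def __init__(self, timeout_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout_s = timeout_s
        self.clock = clock
        self._deadline: Dict[int, float] = {}

    def start(self, fd: int) -> None:
        """Call once when the connection is accepted."""
        self._deadline[fd] = self.clock() + self.timeout_s

    def stop(self, fd: int) -> None:
        """Call when the connection is closed for any reason."""
        self._deadline.pop(fd, None)

    def expired(self) -> List[int]:
        """Return the fds whose deadline has passed and should be disconnected."""
        now = self.clock()
        return [fd for fd, deadline in self._deadline.items() if now >= deadline]

    def next_timeout(self) -> Optional[float]:
        """Seconds until the nearest deadline, usable as the epoll wait timeout."""
        if not self._deadline:
            return None
        return max(0.0, min(self._deadline.values()) - self.clock())
```

In an event loop this could be used as `deadlines = Deadlines(5.0)`, calling `start(fd)`
after `accept()`, closing every fd returned by `expired()` after each wake-up, and passing
`next_timeout()` to the wait call so the loop wakes up even when no client sends anything.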


### Malformed Requests
This server expects this HTTP request:

`GET /search?q=<search term> HTTP/1.x`

You could do the full validation routine that checks every kind of method. You could also
implement some logic that handles the HTTP version.
Or you could ask yourself: what does _my_ server really need?

If you know what requests you expect, you can omit whole chunks of code, which also
makes the server easier to test.

That being said, I know that my server only accepts GET requests, and it doesn't support
long-lived connections. It's just request in, response out, and close the connection,
aka HTTP/1.0.

The validation for that becomes really simple. You can straight up check whether the
request even starts with GET, and reject it early before doing any parsing.

Having this tiny validation routine makes it easier to spot bugs.
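
As an illustration of how small such a routine can be, here is a sketch in Python. The
function name `extract_term`, the 1024 byte limit, and the decision to reject terms
containing raw spaces are assumptions made for this example; the real Jai code may
differ.

```python
from typing import Optional

PREFIX = b"GET /search?q="
MAX_REQUEST_LINE = 1024  # hypothetical limit, not taken from the real server


def extract_term(request: bytes) -> Optional[bytes]:
    """Return the raw search term, or None if the request has any other shape."""
    # Cheapest check first: anything that is not our one request is rejected
    # before any parsing happens.
    if not request.startswith(PREFIX):
        return None
    line, crlf, _rest = request.partition(b"\r\n")
    if not crlf or len(line) > MAX_REQUEST_LINE:
        return None
    # What remains is "<search term> HTTP/1.x"; split at the last space.
    term, space, version = line[len(PREFIX):].rpartition(b" ")
    if not space or not term or b" " in term:
        return None
    if (len(version) != 8 or not version.startswith(b"HTTP/1.")
            or not version[7:8].isdigit()):
        return None
    return term
```

Everything that is not the one expected request line ends in the same `None` branch,
which the server can answer with a single error response before closing the connection.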

Now we can add fuzzing. I'm using LLVM's `libFuzzer` to test certain functions,
which reveals whether your program blows up or does not handle specific
scenarios correctly.
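
To show the basic idea without any tooling, here is a naive mutation loop in Python.
This is explicitly not libFuzzer: there is no coverage feedback and no corpus
management, and `mutate` and `fuzz` are hypothetical names. It only demonstrates the
principle of throwing damaged inputs at a function and collecting the ones that make
it blow up.

```python
import random
from typing import Callable, List


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Return a copy of seed with one to four random byte-level changes."""
    data = bytearray(seed)
    for _ in range(rng.randint(1, 4)):
        choice = rng.randrange(3)
        if choice == 0 and data:
            data[rng.randrange(len(data))] = rng.randrange(256)  # overwrite a byte
        elif choice == 1:
            data.insert(rng.randint(0, len(data)), rng.randrange(256))  # insert a byte
        elif data:
            del data[rng.randrange(len(data))]  # drop a byte
    return bytes(data)


def fuzz(target: Callable[[bytes], object], seed: bytes,
         iterations: int = 10_000, rng_seed: int = 0) -> List[bytes]:
    """Feed mutated inputs to target and return every input that raised."""
    rng = random.Random(rng_seed)
    crashes = []
    for _ in range(iterations):
        data = mutate(seed, rng)
        try:
            target(data)
        except Exception:
            crashes.append(data)
    return crashes
```

The target would typically be the request validation function, seeded with one valid
request; every input in the returned list is a bug worth looking at.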


### Sandboxing
Since anyone can throw requests at my server, it is vital not only to ensure that the
server handles weird requests, but also to handle the case where someone takes over
the process through some unknown vulnerability in the program.

In this case, we use tools to achieve a sandboxed environment.

The easiest way is to use `systemd.exec` and just start your program in such a
restrictive environment.

If you don't like the dependency on systemd, you could also implement the sandboxing
yourself. Or implement partial sandboxing and close the gaps with systemd,
which was my approach.

I'm using two concepts for restricting my program:
- seccomp
- landlock

There's also a third concept, called `namespaces`. But for this tiny project I didn't
implement it in code, since it requires way more care and lines of code
(which is the reason why I reached for `systemd.exec`).


## Markdown Metadata
At the top of the markdown file we have to declare some metadata.
This is what an entry looks like:

```
hidden: false
width: 700
title: test image
published: 2026-01-01-12-00
updated: now
META;

Some text that gets rendered as html later.
```

Let's walk through each element.

### hidden [Optional]
If `hidden: true` is provided, the entry is ignored entirely.
Providing `hidden: false` or omitting the field will render it.

### width [Optional]
Some blog posts use code blocks that overflow the standard width.
You can set the width in pixels individually per blog post here.

### title
The title of the blog post, which is presented as big orange text.

### published
When the blog post is published, in the format `YYYY-MM-DD-H-M`.
You can omit hours and minutes and just supply `YYYY-MM-DD`.

It is also possible to omit it or use `now`; in both cases the current
date is used.

After running this program, it will add the datetime as milliseconds.
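
The rules above could be sketched like this in Python. Two things are assumptions
here, since the README does not spell them out: that "milliseconds" means milliseconds
since the Unix epoch, and that the values are interpreted as UTC. The function name
`to_millis` is hypothetical as well.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_millis(value: Optional[str], now: Optional[datetime] = None) -> int:
    """Convert a `published` value to milliseconds since the Unix epoch (assumed UTC)."""
    if value is None or value.strip() in ("", "now"):
        # Omitted or `now`: fall back to the current date and time.
        moment = now if now is not None else datetime.now(timezone.utc)
    else:
        parts = [int(part) for part in value.strip().split("-")]
        if len(parts) == 3:
            parts += [0, 0]  # YYYY-MM-DD only: hours and minutes default to 0
        if len(parts) != 5:
            raise ValueError(f"expected YYYY-MM-DD or YYYY-MM-DD-H-M, got {value!r}")
        moment = datetime(*parts, tzinfo=timezone.utc)
    # Integer division of two timedeltas avoids float rounding.
    return (moment - EPOCH) // timedelta(milliseconds=1)
```

Malformed values such as a month of 13 or non-numeric parts raise `ValueError`, so a
broken header is reported instead of silently producing a wrong date.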

### updated
When the article was last updated. You can also omit it or
use `now`, as mentioned above.


### META;
It is important to close our 'header' with `META;` so it gets parsed correctly.
If it is missing, the program will complain.
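
For illustration, splitting an entry into header and body might look like the following
Python sketch. The function name `split_entry`, the strictness about unknown keys, and
the error messages are assumptions; only the field names and the `META;` terminator
come from the format described above.

```python
from typing import Dict, Tuple

TERMINATOR = "META;"
KNOWN_KEYS = {"hidden", "width", "title", "published", "updated"}


def split_entry(text: str) -> Tuple[Dict[str, str], str]:
    """Split an entry into (metadata, body); raise ValueError on a bad header."""
    meta: Dict[str, str] = {}
    lines = text.splitlines()
    for index, line in enumerate(lines):
        stripped = line.strip()
        if stripped == TERMINATOR:
            # Everything after the terminator is the markdown body.
            body = "\n".join(lines[index + 1:]).lstrip("\n")
            return meta, body
        if not stripped:
            continue
        key, colon, value = stripped.partition(":")
        if not colon or key.strip() not in KNOWN_KEYS:
            raise ValueError(f"line {index + 1}: unexpected header line {line!r}")
        meta[key.strip()] = value.strip()
    raise ValueError("metadata header is not closed with META;")


def is_hidden(meta: Dict[str, str]) -> bool:
    """An entry is skipped only if `hidden: true` is given explicitly."""
    return meta.get("hidden", "false") == "true"
```

Because body text usually does not look like `key: value`, a forgotten `META;` tends to
fail at the first line of prose rather than swallowing the whole post as metadata.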


## Known Issues

- 2026-04-27: When viewing a search result that contains an image, the image does
  not load, since the path is wrong.
  This happens because `documents_html: *[]Entry` is not filtered,
  so it doesn't contain updated image paths.

Copyright 2026  E766CB298A6D1E64 | Git-Thing heavily inspired by cgit