Author: Eryk Trzeciakiewicz
-

Caching in GenAI powered systems – Semantic Caching
Recap: What is caching? Caching is the process of saving a once-computed value so that it can be retrieved later without recomputing it. This is a complex subject with many pitfalls; you can read about them in a separate article of mine. When we need to get the value, we first check…
-

The hidden challenges of caching
Caching is often associated with fast, performant applications. When done correctly, it gives you amazing performance results. However, there are several traps that can break your app in subtle ways. I’ll explain them in this post.
-

Counting the uncountable with HyperLogLog – an algorithm that estimates billions of unique visitors with kilobytes of memory
The Magic of Approximation Imagine you’re running a social media platform with millions of users and want to know how many unique visitors you had this month. However, storing every IP address would require gigabytes of memory – a bit too much for a simple counter, isn’t it? Well, there is an algorithm that can…
-

Cost optimization of LLM-Based systems
Introduction Most modern GenAI-based systems are powered by a Large Language Model (LLM) under the hood. If you are a tech startup, are working on a pet project, or even operate in a consulting company, chances are you are not using a self-hosted model (they have great use cases but the maintenance overhead…
-

Integers and floating point numbers in memory – definitive intermediate guide
Have you ever wondered how a floating point number like *3.14159* is internally stored in computer memory? This article explains the industry-standard solution to this problem: IEEE 754.
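As a small taste of the idea, here is a minimal sketch (not taken from the article itself) that splits a number into the three fields of an IEEE 754 single-precision value. The helper name `float32_fields` is hypothetical, and the choice of the 32-bit format is an assumption made for brevity; Python's own floats are 64-bit.

```python
import struct

def float32_fields(x):
    """Return (sign, exponent, fraction) of x rounded to IEEE 754 single precision."""
    # Pack x as a big-endian 32-bit float, then reinterpret the same 4 bytes
    # as an unsigned 32-bit integer so the individual bits can be inspected.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                # 1 bit: 0 = positive, 1 = negative
    exponent = (bits >> 23) & 0xFF   # 8 bits, stored with a bias of 127
    fraction = bits & 0x7FFFFF       # 23 bits; normal numbers have an implicit leading 1
    return sign, exponent, fraction

sign, exponent, fraction = float32_fields(3.14159)
# Print the three fields as binary strings: sign, exponent, fraction.
print(f"{sign:01b} {exponent:08b} {fraction:023b}")
```

For a normal number, the value can be recovered as `(-1)**sign * (1 + fraction / 2**23) * 2**(exponent - 127)`, which gives back 3.14159 up to single-precision rounding.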