How to cache well in Go
Optimize Go app performance with caching strategies. Learn local vs distributed cache, memory management, and eviction policies. Enhance efficiency with Go's new unique package.
Cache is essential for speeding up application APIs, so it is indispensable in the initial design stage if there are high performance requirements.
If caching is needed during the design phase, the most important thing is to estimate how much memory it will require.
We first need to clarify which data we actually need to cache.
For an application whose user base keeps growing, it is not advisable to cache all the data it uses.
The application's local memory is limited by the physical resources of a single machine, so caching data without limit will eventually cause an OOM and force the application to exit.
If a distributed cache is used instead, the high hardware cost also requires us to make a trade-off.
If there were no limits on physical resources, it would be best to put everything on the fastest physical device.
But real business scenarios do not allow that, so we need to split data into hot and cold, and even archive and compress the cold data appropriately and store it on cheaper media.
Analyzing which data can be stored in local memory is the first step in doing a good job of local caching.
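As a rough illustration (all numbers below are hypothetical), the estimate can be as simple as multiplying the expected number of hot entries by their average in-memory size:

package main

import "fmt"

func main() {
	// Hypothetical figures; replace them with your own measurements.
	const (
		hotEntries   = 500_000 // keys worth keeping in local memory
		avgEntrySize = 256     // average bytes per key, value and bookkeeping
	)
	estimatedBytes := hotEntries * avgEntrySize
	fmt.Printf("estimated local cache size: ~%d MB\n", estimatedBytes/(1<<20))
}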
Balancing with stateless applications
Since the data is stored locally in the application, in a distributed system, our application is no longer stateless.
Take a web backend as an example: if we deploy 10 Pods and add a cache in the Pod that handles a request, then when the same request is later forwarded to a different Pod, the cached data cannot be found there.
There are three solutions:
1. Use a distributed cache such as Redis.
2. Forward the same request to the same Pod.
3. Cache the same data in every Pod.
The first method does not need to be elaborated here, as storage has also become centralized.
The second method requires specific identifying information, such as the user's uid, to implement the forwarding logic, so it is constrained by the actual scenario.
The third method consumes more storage, because we have to keep a copy of the data in every Pod. Although it cannot be called completely stateless either, the probability of cache breakdown is lower than with the second method: if the gateway has a problem and cannot forward a request to the Pod that holds the specific data, any other Pod can still handle it normally.
There is no silver bullet; choose according to the actual scenario. But keep in mind that the farther away the cache sits, the longer access takes.
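As a rough sketch of the second approach (the pod addresses, uid values, and hash choice here are just for illustration), a gateway can hash the user's uid to pick a backend, so the same user always lands on the same Pod:

package main

import (
	"fmt"
	"hash/fnv"
)

// pickPod routes a uid to a stable backend index. The pod addresses are
// placeholders; a real gateway would also have to handle pods being added
// or removed (for example with consistent hashing) so that scaling does
// not reshuffle every key.
func pickPod(uid string, pods []string) string {
	h := fnv.New32a()
	h.Write([]byte(uid))
	return pods[h.Sum32()%uint32(len(pods))]
}

func main() {
	pods := []string{"pod-0:8080", "pod-1:8080", "pod-2:8080"}
	fmt.Println(pickPod("user-42", pods)) // always the same pod for user-42
}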
Goim is one example: it also uses memory alignment to make CPU cache hits as likely as possible.
When the CPU performs an operation, it first searches for the required data in L1, then in L2, and finally in L3. If none of these caches are available in the end, the required data needs to be retrieved from the main memory.
The further down this hierarchy the lookup has to go, the longer the operation takes.
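One common Go form of such alignment (a sketch, not taken from Goim's actual source) is padding hot fields out to the CPU cache-line size, so that two counters written by different cores do not share, and keep invalidating, the same cache line:

package main

import "sync/atomic"

const cacheLineSize = 64 // typical x86-64 cache line size

// counters keeps the two hot fields on separate cache lines so that cores
// writing one of them do not repeatedly invalidate the line holding the
// other (false sharing).
type counters struct {
	hits uint64
	_    [cacheLineSize - 8]byte // padding out to the next cache line
	miss uint64
	_    [cacheLineSize - 8]byte
}

func main() {
	var c counters
	atomic.AddUint64(&c.hits, 1)
	atomic.AddUint64(&c.miss, 1)
}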
Eviction strategy
If the cache's memory size must be strictly controlled, an LRU (least recently used) policy can be used to manage it. Let's take a look at a Go implementation of an LRU cache.
LRU cache
It is suitable for scenarios that need a bounded cache size with automatic eviction of infrequently used entries.
For example, if you only want to keep 128 keys, the LRU cache simply grows until it reaches that limit; whenever an existing key is accessed or a new one is added, it is moved to the front of the list so that it avoids eviction.
https://github.com/hashicorp/golang-lru is an LRU cache implemented in Go.
Let's look at how it is used through a test:
package cache_test

import (
	"testing"

	lru "github.com/hashicorp/golang-lru/v2" // v2 provides the generic API used below
	"github.com/stretchr/testify/assert"
)

func TestLRU(t *testing.T) {
	l, _ := lru.New[int, any](128)
	for i := 0; i < 256; i++ {
		l.Add(i, i+1)
	}

	// Key 200 has not been evicted, so it can still be read
	value, ok := l.Get(200)
	assert.Equal(t, true, ok)
	assert.Equal(t, 201, value.(int))

	// Key 1 has been evicted
	value, ok = l.Get(1)
	assert.Equal(t, false, ok)
	assert.Equal(t, nil, value)
}
You can see that key 200 has not been evicted yet, so it can still be retrieved, while key 1 has been pushed out because the cache exceeded its size limit of 128, so it can no longer be read.
This applies when the amount of data we want to keep exceeds the cache size: frequently used entries are constantly moved to the head, which raises the cache hit rate.
Internally, the package maintains all cached elements in a doubly linked list. On every Add, if the key already exists it is moved to the head:
func (l *LruList[K, V]) move(e, at *Entry[K, V]) {
	if e == at {
		return
	}
	// Unlink e from its current position
	e.prev.next = e.next
	e.next.prev = e.prev

	// Re-link e right after at (the head)
	e.prev = at
	e.next = at.next
	e.prev.next = e
	e.next.prev = e
}
If the key does not exist, it is inserted through the insert method:
func (l *LruList[K, V]) insert(e, at *Entry[K, V]) *Entry[K, V] {
	// Link e between at and at.next
	e.prev = at
	e.next = at.next
	e.prev.next = e
	e.next.prev = e
	e.list = l
	l.len++
	return e
}
If the linked list exceeds the cache size, the last element of the list is removed; that is the entry that was added earliest and has not been used since:
func (c *LRU[K, V]) removeOldest() {
	if ent := c.evictList.Back(); ent != nil {
		c.removeElement(ent)
	}
}

func (c *LRU[K, V]) removeElement(e *internal.Entry[K, V]) {
	c.evictList.Remove(e)
	delete(c.items, e.Key)
	// Callback after deleting the key
	if c.onEvict != nil {
		c.onEvict(e.Key, e.Value)
	}
}

func (l *LruList[K, V]) Remove(e *Entry[K, V]) V {
	e.prev.next = e.next
	e.next.prev = e.prev
	// To prevent memory leaks, set the pointers to nil
	e.next = nil
	e.prev = nil
	e.list = nil
	l.len--
	return e.Value
}
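The onEvict callback seen above is supplied when the cache is created. A minimal sketch of wiring it up (assuming the same hashicorp/golang-lru v2 module used in the earlier test):

package main

import (
	"fmt"

	lru "github.com/hashicorp/golang-lru/v2"
)

func main() {
	// The callback runs each time a key is pushed out of the cache.
	l, _ := lru.NewWithEvict[int, string](2, func(key int, value string) {
		fmt.Printf("evicted key=%d value=%s\n", key, value)
	})
	l.Add(1, "a")
	l.Add(2, "b")
	l.Add(3, "c") // evicts key 1, the least recently used entry
}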
Cache update
Timely updating cache in distributed systems can reduce data inconsistency issues.
Different methods also apply to different scenarios.
Cached data is obtained in different situations. A popular-rankings list, for example, is not tied to any particular user; with multiple Pods we have to maintain it in each Pod's local cache, and when a write updates it we have to notify every Pod to refresh.
If the data belongs to a single user, we would rather keep it in one fixed Pod and route that user's requests steadily to the same Pod via the user ID (UID); that way we do not need to maintain multiple copies across Pods, which reduces memory consumption.
Most of the time we also want the application to stay stateless, so we put this cached data into Redis.
There are three main strategies for updating a distributed cache: the bypass update strategy, writing to the database after the cache (write-through), and the write-back strategy.
The bypass update strategy is the one we use most often: when updating data, delete the cache first and then write to the database; on a later read, if the cache is empty, the value is loaded from the database and the cache is refilled.
This strategy can produce inconsistency when read QPS is very high: after the cache is deleted but before the database update lands, a read may come in, miss the cache, load the old value from the database, and write it back into the cache, so subsequent reads keep seeing the old value.
The probability of this actually happening is not high, but we still need to evaluate the specific scenario; if such an inconsistency would be devastating to the system's data, this strategy cannot be used.
If the situation is acceptable but you want to keep the window of inconsistency as short as possible, you can set a cache expiration time, so that even when no write triggers an update, the cache expires on its own and the data gets refreshed.
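A minimal sketch of this bypass (cache-aside) flow, where the Store interface and the writeDB/loadFromDB helpers are stand-ins rather than a real client:

package cache

import (
	"context"
	"errors"
	"time"
)

// ErrMiss is what the cache is assumed to return on a miss.
var ErrMiss = errors.New("cache miss")

// Store abstracts the distributed cache (for example Redis).
type Store interface {
	Get(ctx context.Context, key string) (string, error)
	Set(ctx context.Context, key, val string, ttl time.Duration) error
	Del(ctx context.Context, key string) error
}

// Update follows the order described above: delete the cache entry first,
// then write the database.
func Update(ctx context.Context, c Store, key, val string) error {
	if err := c.Del(ctx, key); err != nil {
		return err
	}
	return writeDB(ctx, key, val)
}

// Read tries the cache first and, on a miss, loads from the database and
// refills the cache with a TTL so stale entries eventually expire even if
// an invalidation is missed.
func Read(ctx context.Context, c Store, key string) (string, error) {
	v, err := c.Get(ctx, key)
	if err == nil {
		return v, nil
	}
	if !errors.Is(err, ErrMiss) {
		return "", err
	}
	v, err = loadFromDB(ctx, key)
	if err != nil {
		return "", err
	}
	_ = c.Set(ctx, key, v, 5*time.Minute) // best-effort refill
	return v, nil
}

// Placeholders standing in for the real database layer.
func writeDB(ctx context.Context, key, val string) error { return nil }

func loadFromDB(ctx context.Context, key string) (string, error) {
	return key + "-value", nil
}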
The write-through and write-back strategies both update the cache first and then write to the database; the only difference is whether they flush one entry at a time or a batch.
The obvious drawback is that data is relatively easy to lose. Redis does have persistence strategies that flush to disk, but for a high-QPS application, losing even one second of data after a power failure is still a very large amount, so whether to adopt this strategy has to be decided based on the actual business scenario.
And if Redis still cannot meet our performance requirements, we need to keep the cached content directly in application variables, that is, a local cache that is returned to the caller immediately on access with no network request at all.
So what we are discussing below is the strategy of updating local cache in a distributed situation.
1. Proactively notify updates, similar to the bypass update strategy.
In a distributed setup this can be done by broadcasting through etcd, which spreads the new cache data quickly instead of waiting for the next query to load it.
But there is a problem here. Suppose a cache update notification is sent at T1, and before every downstream service has finished applying it, another update notification is generated at T2 = T1 + 1s while the T1 update is still propagating. Depending on how quickly each Pod applies them, the new value from T2 may end up being overwritten by the old value from T1.
This can be solved by attaching a monotonically increasing version to each update. Once the data carrying T2's version has taken effect, an update carrying T1's version can no longer overwrite the cache, which avoids the problem of new values being replaced by old ones.
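A minimal sketch of that version check (the type and method names are invented for illustration): each notification carries its version, and a Pod applies it only if the version is strictly newer than the one it already holds.

package cache

import "sync"

type entry struct {
	value   string
	version int64
}

// VersionedCache only accepts an update whose version is strictly newer,
// so a slow T1 notification can no longer overwrite the value already
// applied by T2.
type VersionedCache struct {
	mu    sync.Mutex
	items map[string]entry
}

func NewVersionedCache() *VersionedCache {
	return &VersionedCache{items: make(map[string]entry)}
}

func (c *VersionedCache) Apply(key, value string, version int64) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if cur, ok := c.items[key]; ok && version <= cur.version {
		return false // stale notification, ignore it
	}
	c.items[key] = entry{value: value, version: version}
	return true
}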
In the proactive notification we can also specify the affected key, so that only the corresponding cache entries are refreshed instead of reloading all cached data, which avoids unnecessary load.
This update strategy is similar to the bypass strategy, except that it updates the local cache rather than the distributed cache.
2. Wait for the cache to automatically expire
This approach suits scenarios where data consistency requirements are not high. For a local cache, the cost of maintaining update propagation to every Pod is also relatively high.
In Go we can use the open-source package https://github.com/patrickmn/go-cache to maintain in-memory expiration times without implementing them ourselves.
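A minimal usage sketch (the intervals here are arbitrary): a default TTL and a cleanup interval are passed when the cache is created, and expired entries are swept by the janitor we will look at below.

package main

import (
	"fmt"
	"time"

	"github.com/patrickmn/go-cache"
)

func main() {
	// 5-minute default TTL, janitor sweep every 10 minutes.
	c := cache.New(5*time.Minute, 10*time.Minute)
	c.Set("foo", "bar", cache.DefaultExpiration)
	if v, found := c.Get("foo"); found {
		fmt.Println(v) // bar
	}
}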
Next, let's take a look at how go-cache implements local caching.
Go Cache
https://github.com/patrickmn/go-cache is an open-source local cache package for Go.
Internally, data is stored in a map:
type Cache struct {
	*cache
}

type cache struct {
	defaultExpiration time.Duration
	items             map[string]Item
	mu                sync.RWMutex
	onEvicted         func(string, interface{})
	janitor           *janitor
}
items holds all the cached data; every Set and Get operates on items.
The janitor deletes expired keys at a fixed interval, and the interval can be specified by the caller.
func (j *janitor) Run(c *cache) {
	ticker := time.NewTicker(j.Interval)
	for {
		select {
		case <-ticker.C:
			c.DeleteExpired()
		case <-j.stop:
			ticker.Stop()
			return
		}
	}
}
A Ticker fires at the configured interval, and each tick calls the DeleteExpired method to remove expired keys:
func (c *cache) DeleteExpired() {
	// The evicted KV pairs
	var evictedItems []keyAndValue
	now := time.Now().UnixNano()
	c.mu.Lock()
	// Find expired keys and delete them
	for k, v := range c.items {
		if v.Expiration > 0 && now > v.Expiration {
			ov, evicted := c.delete(k)
			if evicted {
				evictedItems = append(evictedItems, keyAndValue{k, ov})
			}
		}
	}
	c.mu.Unlock()
	// Run the eviction callback, if any, for each deleted key
	for _, v := range evictedItems {
		c.onEvicted(v.key, v.value)
	}
}
From the code we can see that cache expiration relies on this periodic sweep loop.
So what happens if we read a key that has expired but has not yet been deleted by the sweep?
On every Get, the key's expiration is checked as well:
func (c *cache) Get(k string) (interface{}, bool) {
	c.mu.RLock()
	// If not found, return directly
	item, found := c.items[k]
	if !found {
		c.mu.RUnlock()
		return nil, false
	}
	// If the item has expired, return nil directly and let the
	// janitor loop delete the key later
	if item.Expiration > 0 {
		if time.Now().UnixNano() > item.Expiration {
			c.mu.RUnlock()
			return nil, false
		}
	}
	c.mu.RUnlock()
	return item.Object, true
}
As you can see, the check is made on every read, so expired key-value pairs are never returned even before the janitor gets around to deleting them.
Cache preheating
How should data be preloaded at startup? Should we wait for initialization to finish before starting? Can we load in stages? Will concurrent loading put pressure on the middleware? All of these questions need to be considered when preheating the cache at startup.
If waiting for all initialization to complete before preloading makes startup too expensive, we can run initialization and preloading in parallel, but we must ensure that key components (such as database connections and network services) are ready so that resources are not unavailable during preloading.
If a request reaches the application before the cache is fully loaded, there needs to be a fallback strategy so the request can still be served normally.
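One possible shape for such a fallback (a sketch; loadFromSource and the sync.Map cache are stand-ins) is to go back to the source on a miss while using golang.org/x/sync/singleflight to collapse concurrent identical requests into a single back-to-source call:

package cache

import (
	"context"
	"sync"

	"golang.org/x/sync/singleflight"
)

var (
	local sync.Map // the warm local cache; concurrency-safe
	group singleflight.Group
)

// GetWithFallback serves from the local cache when possible and otherwise
// goes back to the source, collapsing concurrent misses for the same key
// into a single source call so a cold cache at startup does not flood
// the database.
func GetWithFallback(ctx context.Context, key string) (string, error) {
	if v, ok := local.Load(key); ok {
		return v.(string), nil
	}
	v, err, _ := group.Do(key, func() (interface{}, error) {
		val, err := loadFromSource(ctx, key)
		if err != nil {
			return nil, err
		}
		local.Store(key, val) // refill for later requests
		return val, nil
	})
	if err != nil {
		return "", err
	}
	return v.(string), nil
}

// loadFromSource is a placeholder for the real back-to-source query.
func loadFromSource(ctx context.Context, key string) (string, error) {
	return key + "-value", nil
}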
The advantage of staged loading is that concurrency shortens initialization time, but while concurrent loading speeds up preheating, it also puts pressure on the middleware (cache servers, databases, and so on).
When implementing this, evaluate the system's concurrency capacity and set reasonable concurrency limits; a rate-limiting mechanism can relieve the pressure and avoid overloading the middleware.
In Go, a channel can also be used to limit concurrency, as in the sketch below.
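A minimal sketch of such a limit, where the key list and the warmOne loader are placeholders: a buffered channel acts as a semaphore so that at most maxConcurrent preload requests hit the middleware at once.

package main

import (
	"fmt"
	"sync"
)

func main() {
	keys := []string{"a", "b", "c", "d", "e", "f"} // keys to preload (placeholder)
	const maxConcurrent = 2

	sem := make(chan struct{}, maxConcurrent) // buffered channel as a semaphore
	var wg sync.WaitGroup

	for _, k := range keys {
		wg.Add(1)
		sem <- struct{}{} // blocks once maxConcurrent loads are in flight
		go func(key string) {
			defer wg.Done()
			defer func() { <-sem }()
			warmOne(key)
		}(k)
	}
	wg.Wait()
	fmt.Println("cache preheated")
}

// warmOne is a placeholder for loading one key from the database or
// distributed cache into local memory.
func warmOne(key string) {
	fmt.Println("loaded", key)
}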
Cache preheating plays a very important role in real production scenarios. During a release, the application's local cache disappears with the restart. Even with a rolling update, at least one Pod has to go back to the source, and under very high QPS the peak traffic hitting that Pod can drag down the database and trigger an avalanche effect.
There are two ways to deal with this situation. One is to minimize version upgrades during peak periods and instead opt for low traffic periods, which can be easily detected through monitoring.
Another is to preload the data at startup and only serve traffic once loading is complete. The drawback is that if the release has problems and we need to roll back, the longer startup time works against a quick rollback.
Both methods have their pros and cons; choose according to your own needs in practice. The most important thing is still to minimize special-case dependencies as much as possible: the more dependencies a release needs, the more likely something is to go wrong when publishing.