New from O’Reilly: The memory architecture behind adaptive AI agents

Read the report

Blog

Speeding Up Geographic Commands in Redis 7

February 21, 20239 minute read
Filipe OliveiraAmhdal would say, “The overall performance improvement gained by optimizing a single part of a system is limited by the fraction of time that the improved part is actually used.”

So, let’s focus on the largest possible optimization.

Figure 1: Initial profile information for a simple GEOSEARCH command, showcasing 55% of CPU cycles spent within trigonometric functions.

The first optimization round: reduce wasteful computation

As the profile data above demonstrates, 54.78% of CPU cycles are generated by geohashGetDistanceIfInRectangle. Within it, we call geohashGetDistance three times. The first two times, we call the routine to produce intermediate results (such as to check if a point is out of range in longitude or latitude). This avoids CPU-expensive distance calculations if the conditions are false. It makes sense to check that data is in range, but we don’t need to do so repeatedly.

Our first optimization attempt to reduce the wasteful computation of intermediate results in two steps (and described in detail in the PR #11535):

  • Reduce expensive computation on intermediate geohashGetDistance calculations.
  • Avoid the expensive longitude distance calculation if the latitude distance condition fails.

That optimization made a significant difference. For that GEOSEARCH query, using a single benchmark client with no pipeline, we decreased the average latency (including round trip time) from 93.598ms to 73.046ms, representing a latency drop of approximately 22%.

The second round: wasteful computation

The same performance analysis of GEOSEARCH queries also showed that, in some cases, Redis performs spurious calls to sdsdup and sdsfree. These commands allocate and deallocate strings’ memory, respectively. This happens with large datasets, where many elements are outside the search’s range.

Figure 2: In this process, over 28% of CPU attention was spent on the sdsdup command. Anything we can do to reduce this transaction speeds up the geospatial search results.

The performance improvement we proposed in PR 11522 was simple: Instead of pre-allocating the string’s memory, we changed the role of geoAppendIfWithinShape to let the caller create the string only when needed. This resulted in approximately 14% latency drop and 25% improvement in the achievable ops/sec.

The third round: simplifying algorithms

To optimize geographic index querying, we focused on data pattern information. The intent is to simplify the number of calculations required for the Haversine distance. This is purely a mathematics exercise.


When the longitude difference is 0:

  • Given a longitude difference of zero, the asin(sqrt(a)) on the Haversine is asin(sin(abs(u))).
  • Set arcsin(sin(x)) equal to x when x ∈[−𝜋/2,𝜋/2].
  • Given a latitude between [−𝜋/2,𝜋/2] we can simplify arcsin(sin(x)) to abs(x).

The PR #11579 describes the optimization and obtained performance improvement of this simplification. The bottom line: It resulted in 55% increase in the achievable operations per sec of the Redis GEOSEARCH command.

The fourth round: a representation issue

All use cases that heavily rely on converting a double to a string representation (such as the conversion of a double-precision floating point number (1.5) to a string (“1.5”), can benefit by replacing the call to snprintf(buf,len,”%.4f“,value) with an equivalent fixed-point double to string optimized version.

The GEODIST command, shown in Figure 3, is a good example. About a quarter of the CPU time is devoted to the type conversion.

Figure 3: The profile information for the GEODIST command, showcasing 26% of CPU cycles spent converting a double to a string representation.

PR 11552 describes in detail the optimization we suggested, which resulted in a 35% increase in the achievable operations per second of the GEODIST command.

Putting it all together in Redis 7.0

A primary reason for Redis’s popularity as a database is its performance. We generally measure queries in sub-millisecond response time – and we want to keep improving it!

The cumulative effect of the optimizations we describe in this blog post: We increased the performance of Redis geospatial commands by up to four times their previous speed. You can already benefit from these enhancements, as they are part of Redis 7.0.7. And that should get you where you’re going, faster than you did before.

CommandTest caseAchievable ops/sec Redis 7.0.5Achievable ops/sec Redis 7.0.7Improvement factor
GEODIST key …geo-60M-elements-geodist-pipeline-107755249936321.3 X
GEOSEARCH … FROMLONLAT … BYRADIUSgeo-60M-elements-geosearch-fromlonlat11.813.81.2 X
GEOSEARCH … FROMLONLAT … BYBOXgeo-60M-elements-geosearch-fromlonlat-bybox13.249.63.8 X
CommandTest caseOverall p50 latency including RTT (ms) Redis 7.0.5Overall p50 latency including RTT (ms) Redis 7.0.7Improvement factor
GEODIST key …geo-60M-elements-geodist-pipeline-102.5752.0071.3 X
GEOSEARCH … FROMLONLAT … BYRADIUSgeo-60M-elements-geosearch-fromlonlat679.935598.0971.1 X
GEOSEARCH … FROMLONLAT … BYBOXgeo-60M-elements-geosearch-fromlonlat-bybox605.739161.7913.7 X

Note that this is not an isolated performance improvement – quite to the contrary. This set of improvements is a real-life example of the impact of the performance efforts that we call the “Redis benchmarks specifications.” We are constantly pushing for greater visibility and better performance across the entire Redis API.

Going the distance

Want to learn more about geospatial computing? This five-minute video gives you the lay of the land.

Find out more about in the performance blogs in this series, most of them co-written with Intel:

*Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries.

Get started with Redis today

Speak to a Redis expert and learn more about enterprise-grade Redis today.