What fascinates me most is not the protocol itself, but the potential to add a thin scripting layer (similar to Redis' EVAL or the newer modules API) to enable complex logic with atomic transactions and high QPS.
In the past, we relied heavily on EVAL[SHA] with 200+ LOC Lua scripts to implement high-throughput, atomic transactions for realtime systems. We also used JSON and the Redis Query Language (previously named "full-text search") to build a more maintainable and strongly consistent system than we could with raw key-values and manually built secondary indexes.
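To make the EVAL idea concrete, here is a minimal sketch of why server-side scripts give you atomic multi-key updates. It is a toy in-memory model, not the Redis API: `ToyStore`, its `eval` method, and the `transfer` script are all hypothetical names, and a single lock stands in for the way a single-node server runs one script at a time.

```python
import threading


class ToyStore:
    """In-memory stand-in for a key-value server (hypothetical, not Redis)."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def eval(self, script, keys, args):
        # Run the whole script under one lock so no other script can
        # interleave with it; this models the atomicity of a server-side script.
        with self._lock:
            return script(self._data, keys, args)


def transfer(data, keys, args):
    """Hypothetical script: move args[0] units from keys[0] to keys[1]."""
    src, dst = keys
    amount = args[0]
    if data.get(src, 0) < amount:
        return False  # insufficient balance; nothing is modified
    data[src] = data.get(src, 0) - amount
    data[dst] = data.get(dst, 0) + amount
    return True


store = ToyStore()
store.eval(lambda d, k, a: d.update({"alice": 100, "bob": 0}), [], [])
ok = store.eval(transfer, ["alice", "bob"], [30])
rejected = store.eval(transfer, ["alice", "bob"], [500])
balances = store.eval(lambda d, k, a: dict(d), [], [])
```

The check and both writes happen inside one script call, so a concurrent caller never sees the debit without the credit. With a real server the script body would be Lua and the lock would be the server's own execution model.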
We’ve since migrated to a native FoundationDB and SQLite hybrid setup, but this approach would have been really helpful for early-stage prototyping with a higher performance ceiling (thanks to FDB sharding) than a single-node Redis with AOF.
Related: Redis Cluster is a world of pain when handling clustering keys and cross-node queries and orchestration. DragonflyDB is chasing after the market of companies considering sharding Redis because of performance issues by providing better single-node performance. There's probably an alternative approach that could work by using an architecture like this.
This is exactly what [Tarantool](https://www.tarantool.io/en/doc/latest/platform/app/lua_tuto...) does. It's a shame, I think, that it never got much attention outside of Russia. I'm not sure why.
The awesome thing about Redis is that it's less a "cache" and more a set of fast network-shared, generic data structures. To have that then also be transactional and durable would be really handy sometimes.
> more a set of fast network-shared, generic data structures
Exactly. It's a distributed list, map, etc that is often used as a cache, and sometimes as a queue, but it's bigger than all that.
"sometimes as a queue"
Oh? I'd be interested in hearing more about that. Is this common?
Just google "langname redis queue"
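For anyone wondering what the queue pattern looks like: producers push onto one end of a list and workers pop from the other, which Redis exposes as LPUSH and RPOP (or the blocking BRPOP). The sketch below only models those semantics in memory with a `deque`; `ToyList` is a hypothetical class, not a Redis client, and the job names are made up.

```python
from collections import deque


class ToyList:
    """In-memory stand-in for a single list key (hypothetical, not a client)."""

    def __init__(self):
        self._items = deque()

    def lpush(self, value):
        # Push on the left (head) and return the new length.
        self._items.appendleft(value)
        return len(self._items)

    def rpop(self):
        # Pop from the right (tail); None when the list is empty.
        return self._items.pop() if self._items else None


jobs = ToyList()
for job in ("resize:1", "resize:2", "email:7"):
    jobs.lpush(job)

# A worker drains the queue; pushing left and popping right gives FIFO order.
processed = []
while (job := jobs.rpop()) is not None:
    processed.append(job)
```

A real worker would use the blocking pop so it sleeps until a job arrives instead of polling.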
If only it wasn't Java.. but it looks very cool anyway.
I was discussing this with a colleague: Java is supposed to be one of the most performant runtimes, and yet every time someone introduces anything in Java, literally every tech person in the company goes 'aw shit no Java please'. You can show me a running top process list without the names of the processes and I'll point out the Java processes to you. And it seems that companies can build viable businesses around porting (Apache) Java projects to Rust/Go/C++ because the originals are so resource intensive and slow (unless you spin up 10000 nodes). Why is anyone still using it outside legacy?
Not sure if this is correct, but I'll tell you my point of view. Many people nowadays are programming not in Java, but in Spring. And it is slow and resource intensive, and in the long term development speed is badly affected too. You have a simple tutorial service? Easy. You have a real business case and want to do something a little bit skewed from "the best way"? You are screwed: in the debugger, deep in Spring code, praying to find a workaround.
The JVM is fast, but raw Java is not seen so often in the corpo-rat world. And if, for a single endpoint that gets data from a database and encodes it to JSON, you have to schedule 2 cores and 4G of RAM for every few hundred QPS, something is wrong.
Java can be performant if you are mindful of performance when writing it. But that requires way more effort than simply writing the thing in Rust/Go.
The ecosystem of Java is so huge. Most people who use Java barely have any idea what the rest of the software industry is doing.
> The ecosystem of Java is so huge. Most people who use Java barely have any idea what the rest of the software industry is doing.
I think herein lies the problem. The Java enterprise software (usually spring) world and the everything else under the sun world are very separate. Java devs usually have minimal visibility of other ecosystems and their patterns, and are very reliant on extremely mature (and heavy) tooling that other languages don't tend to use. Other devs hate how bloated the Java ecosystem feels, and that they can't use any of their usual tools. Neither tends to understand that their approach isn't the only way.
Even inside Java, projects like Quarkus don’t see fast adoption because Spring is the “professional” way to work. The Spring developer experience is awful; I miss hot reload.
> Most people who use Java barely have any idea what the rest of the software industry is doing.
Not sure this is really true any more, but it definitely brings back memories from when I was learning OO programming (let's say, a couple decades ago, pre-github).
At the time, it seemed to be an industry-wide assumption that "software engineering" is exclusively done in Java. Every learning resource I could find at the time was deep, deep into inheritance and OO design patterns. Things seem better these days!
How to treat Java-phobia is a question for medical professionals, not for IT professionals.
Java is the most performant runtime outside of C/C++/Rust. It is a first choice for any project.
Just starting up the trivial Spring application I work on takes multiple seconds. Firing up other services for an integration test can take 30+ seconds. It's ridiculous.
The latency makes integration testing unnecessarily tedious. Don't even get me started on Maven -- dev tooling has to reimplement the build system rather than invoking it because the performance is so poor.
Well, give me a real-world software example (so not a hello world thing) I can download, test, and compare. I have dev & devops experience with Java, so let's take something that's used a lot: ZooKeeper. ClickHouse ported it to C++ for a reason; the original sucks resources, while CH Keeper you don't even notice. Maybe you will say it's written badly, but if an Apache poster child is written badly, what does that say? Same for Apache Cassandra; even starting it for dev is a crime, let alone using it unless you have many nodes. Scylla, the C++ implementation, runs circles around a cluster of it on one box. But again, you might say it was badly done. Elasticsearch... best not mentioned, as it's a very well-known resource hog compared to native search engine implementations. As others said: anything Apache Spring for any real-life application.
So what's a good example so I can compare it to a non Java version?
Depends on the performance metrics you care about, and what the application actually does.
JavaScript can be faster in some tasks. Even if JS is 2x slower in another task, it uses 3x less memory than Java.
Which costs more, double CPU time or triple memory requirement?
Can you show me some evidence that Java uses 3x the memory? Sure, OO heap spam with Java is certainly possible, but the same code written the same way in both languages won't have that much memory difference.
https://programming-language-benchmarks.vercel.app/java-vs-j...
That just notes the max amount of memory used, but not the actual memory it needs. If Java has memory available it makes sense that it uses it instead of spending time cleaning up unused objects.
In addition I have a feeling that these benchmarks are comparing short lived processes. Java does take a bit longer than some other languages to start unless you use native. But that doesn't matter much when you are running long lived services.
> Java is the most performant runtime outside of C/C++/Rust.
Add Go to that list. Or any popular compiled language for that matter.
Java is faster than Go and has a much more advanced GC.
Is it actually, in practical applications? Just firing up the JVM can take seconds, while Go is often used for CLI programs that only run for milliseconds.
> You can show me a running top process list without the names of the processes and I'll point out the Java processes to you.
This was quite eye-opening to think about. I am aware of how underappreciated the performance of the JVM is, but I never thought about how widely deployed it is.
One thing I wish Redis had is the ability to extend itself beyond its memory capacity. So if I run out of memory, just move some stuff to disk. Or if I'm using persisted Redis, just free up some memory.
Maybe this is it.
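The "move some stuff to disk" wish can be sketched as a small tiered store: keep the most recently used keys in memory and spill the coldest ones to files. This is a toy illustration only, with a hypothetical `SpillingStore` class; it assumes keys are safe to use as file names and values are JSON-serializable, and it says nothing about how any real product implements tiering.

```python
import json
import os
import tempfile
from collections import OrderedDict


class SpillingStore:
    """Toy tiered key-value store (hypothetical): hot keys in RAM, cold on disk."""

    def __init__(self, max_in_memory, spill_dir):
        self.max_in_memory = max_in_memory
        self.spill_dir = spill_dir
        self.hot = OrderedDict()  # ordered from least to most recently used

    def _path(self, key):
        # Assumption: keys are plain strings that are valid file names.
        return os.path.join(self.spill_dir, key + ".json")

    def _evict(self):
        # Write the least recently used entries to disk until we fit in memory.
        while len(self.hot) > self.max_in_memory:
            cold_key, cold_value = self.hot.popitem(last=False)
            with open(self._path(cold_key), "w") as f:
                json.dump(cold_value, f)

    def set(self, key, value):
        if os.path.exists(self._path(key)):
            os.remove(self._path(key))  # drop any stale spilled copy
        self.hot[key] = value
        self.hot.move_to_end(key)
        self._evict()

    def get(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)
            return self.hot[key]
        path = self._path(key)
        if not os.path.exists(path):
            return None
        with open(path) as f:
            value = json.load(f)
        os.remove(path)
        self.hot[key] = value  # promote back to memory
        self._evict()
        return value


store = SpillingStore(max_in_memory=2, spill_dir=tempfile.mkdtemp())
store.set("a", 1)
store.set("b", 2)
store.set("c", 3)  # "a" is now the coldest key and gets spilled
in_memory_before = list(store.hot)
value_a = store.get("a")  # read back from disk, which pushes "b" out
in_memory_after = list(store.hot)
```

The trade-off is the obvious one: reads of cold keys pay a disk round trip, so this only helps when the working set fits in memory.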
Redis on Flash [1] is one of the key ways Redis gets people on their enterprise plan (esp before the SSPL relicense). I've spoken with their enterprise team before; it's ungodly expensive, even compared to AWS MemoryDB (wish I had the numbers on hand).
[1] https://redis.io/resources/building-large-databases-redis-en...
If you're looking for a Redis-compatible key-value store, kvrocks[1][2] is an excellent choice. We've used it in many projects, and it's proven to be very stable. Since it's based on RocksDB, it doesn't have the memory limitations of Redis and the license delirium ;-)
[1] https://kvrocks.apache.org/ [2] https://github.com/apache/kvrocks
Other API-compatible reimplementations (others?):
https://github.com/dragonflydb/dragonfly DragonflyDB (not open source, BUSL-1.1), aims for higher performance
https://github.com/apache/kvrocks Apache Kvrocks (Apache-2.0), uses a disk-based store (RocksDB) to lower memory usage
The naming implies it’s a KDE desktop process monitoring tool with history graphs.
Damnit this has been in my ideas list for a long time
Oh it’s in Java we’re good
There's a version written in Go, though clearly just a hobbyist project: https://forums.foundationdb.org/t/introducing-the-redis-prot...
Much easier to do OLAP. OLTP is way harder.