There is a general push in the software industry towards using more natural language. Generative AI tools allow people to build software using natural language. Behavior Driven Development methodologies use natural language to shape requirements and write tests. This allows more people to participate in these activities. You don’t necessarily have to become an expert in a programming language to meaningfully contribute to a software project or communicate with software developers.
However, natural language has some serious drawbacks as a vessel for communication. It is often ambiguous. Words have multiple meanings, and different people use different words for the same concepts. Even when people are using the same word with the same ‘meaning’, those words may have very different connotations for different people, which can lead to very different outcomes.
Poor communication can have tremendous costs. Many of us have been involved in projects where weeks or months of work were wasted because important requirements were not adequately communicated. Natural language frequently doesn’t cut it when it comes to correctly building systems.
Words as Messages for Communication
Language is composed of words. Words are symbols. A symbol represents something else; it isn’t the thing itself. The word ‘sun’ isn’t going to melt my mouth when I speak it, or melt my fingers when I write it. Let’s look at the word ‘eat’. ‘Eat’ is a little symbol that represents a tremendously complex process. If we start breaking that down into what actually happens, we find that we can fill hundreds of books, with hundreds of pages each, describing all the details that actually occur. You move your arms to get food to your mouth. That involves muscle contractions, which involve the brain. There are a whole lot of details to dive into there. You take a bite, and chew. More muscles. You taste the food. There’s another biological system we could describe. You swallow. Your stomach breaks down the food. Now we’ve got chemical processes we can get into. And so on, and so on.
To be a little more specific, words are symbols that map to your accumulated real world experiences that are stored in your brain.
Words are dense. These symbols pack a tremendous amount of information into a small space. This is compression. Why is compression important? For efficiency. When you compress information, you can transmit more data, faster. Compression is why you can watch HD video on your phone, or store all of your thousands of photos.
When you compress and transmit information, you have…
- A source of data that the sender compresses…
- …into a message that the sender sends…
- …to a receiver who decompresses it.
Lossless compression keeps all the data. It uses patterns in the data or alternative representations of the data to send the same information in a different, smaller format. People do this all the time with acronyms. ‘LOL’ is a lossless compression of ‘Laugh Out Loud’. Computers do this all the time with repeated data. For a very basic example, the string of letters ‘aaaaaaaaaaaa’ can be shortened to ‘a12’ (the letter a, 12 times) without losing any information. We can talk about how ‘good’ compression techniques are by comparing how small the messages they produce are relative to the source information.
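The a12 example is a form of run-length encoding. As a rough sketch (the function names here are made up for illustration, and the decoder assumes the source text contains no digits), it might look like this in Python:

```python
import re
from itertools import groupby


def rle_encode(text: str) -> str:
    """Replace each run of a repeated character with <character><run length>."""
    return "".join(f"{char}{len(list(run))}" for char, run in groupby(text))


def rle_decode(message: str) -> str:
    """Rebuild the source text. Assumes the source itself contained no digits."""
    pairs = re.findall(r"(\D)(\d+)", message)
    return "".join(char * int(count) for char, count in pairs)


def compression_ratio(source: str, message: str) -> float:
    """Message size relative to source size; smaller means better compression."""
    return len(message) / len(source)


source = "a" * 12                          # the letter a, 12 times
message = rle_encode(source)

print(message)                             # a12
print(rle_decode(message) == source)       # True: nothing was lost
print(compression_ratio(source, message))  # 0.25
```

Note that this only pays off when the data really does repeat. Text with no runs, like ‘abc’, would encode to ‘a1b1c1’ and get bigger.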
Lossless compression is essential when you have to send all the data, but you just need it to be sent faster. But sometimes you don’t need to send all the data. Some compression techniques are lossy. Whenever you send messages, there is a trade off between speed and information. Maybe you can send 80% of the information in only 20% of the time. There are plenty of situations where that is a worthwhile trade off.
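To make the lossy trade off concrete, here is a small sketch that keeps only every fifth value from a list of readings. The readings and function names are invented for illustration, and real lossy formats use far smarter techniques than simply dropping samples.

```python
def lossy_compress(samples: list[float], step: int) -> list[float]:
    """Keep only every `step`-th sample; the others are discarded for good."""
    return samples[::step]


def decompress(message: list[float], step: int, length: int) -> list[float]:
    """Best-effort reconstruction: repeat each kept sample `step` times."""
    rebuilt = [value for value in message for _ in range(step)]
    return rebuilt[:length]


# Made-up readings, purely for illustration.
source = [20.0, 20.5, 21.0, 21.5, 22.0, 22.5, 23.0, 23.5, 24.0, 24.5]
message = lossy_compress(source, step=5)
rebuilt = decompress(message, step=5, length=len(source))

print(message)                     # [20.0, 22.5]
print(len(message) / len(source))  # 0.2, a fifth of the original size
print(rebuilt == source)           # False: the receiver cannot recover the rest
```

The message is a fifth of the size, and the receiver still gets the general trend, but the dropped readings are gone.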
Let’s get back to words. Words are compressed messages. They are used to efficiently convey information between two parties. They are lossy compression, because much of the data the sender is compressing from isn’t carried in the message itself and can’t be recovered by the receiver. When I say the word ‘eat’, that symbol is mapped in some way to my accumulated set of experiences with that activity. Information that I associate with that word, or details I may be thinking in my head when I say ‘eat’, are not carried along in the message itself. When you hear the word ‘eat’, you decompress it – you map it to your accumulated set of experiences. You and I have had different experiences, so you are not going to decompress ‘eat’ to what I had in my head when I said it. The more similar our experiences are, the less is lost in translation. When people from different cultures communicate, information is often lost. If my cultural norm is to eat a lot of raw beef with my hands, and your cultural norm is to eat a lot of green vegetables with chopsticks, at least one of us is going to be in for a surprise when we say ‘Let's go eat!’
One term for all the extra information that is lost when you condense experiences into words is context. When I say ‘I ate it’ – what is the context? Where is the information about what ‘it’ is? Was there a previous message, like ‘Where is the cookie?’ Am I even talking about food, or am I saying I just tripped and slammed my face into the ground? Or maybe I ‘ate the cost’ of a bad investment? Context can be critical for correctly interpreting words – for correctly decompressing the messages. This ties into LLMs. They have a context window that is used to interpret your probably ambiguous statements based on the previous message exchanges. People do the same thing. You interpret statements based on the physical world around you, the person who is talking and your history with them, and the rest of the conversation you are currently having.
Words can also have varying degrees of precision. How narrow or broad is the set of experiences that the word maps to? Words that map to a narrower set of experiences are more precise; words that map to a broader set of experiences are less precise. Color words are a good illustration of this. ‘Blue’ refers to a wide range of colors – light blues, dark blues, purplish blues, greenish blues, grayish blues, etc. ‘Cobalt’ is more precise; it refers to a specific shade of blue. In the digital world, we could get even more precise and use a hex ‘word’ – #0047AB. The more precise a word is, the less lossy it is likely to be when communicated.
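One way to picture precision is to count how many distinct things a word could decompress to. The sketch below uses a small, hand-picked table of hex colors. Apart from #0047AB, the entries and the near-cobalt variants are assumptions chosen for illustration, not an authoritative color taxonomy.

```python
# Hypothetical mapping from a color "word" to the hex colors a receiver
# might decompress it to. Hand-picked for illustration only.
POSSIBLE_MEANINGS = {
    "blue": {
        "#000080",  # a navy blue
        "#4169E1",  # a royal blue
        "#87CEEB",  # a sky blue
        "#4682B4",  # a grayish steel blue
        "#0047AB",  # cobalt
    },
    # Assumed: people might accept a few very close shades as "cobalt".
    "cobalt": {"#0047AB", "#0046AA", "#0048AC"},
    # A hex code names exactly one color.
    "#0047AB": {"#0047AB"},
}


def candidate_count(word: str) -> int:
    """How many colors the receiver has to choose between for this word."""
    return len(POSSIBLE_MEANINGS[word])


for word in ("blue", "cobalt", "#0047AB"):
    print(word, candidate_count(word))
```

The fewer candidates the receiver has to choose between, the less room there is for the message to be decompressed into something the sender did not mean.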
The trade off between speed and information comes into play all the time when we communicate with words. Because words are lossy compression, we are always losing information when we use them. We constantly have to balance how much time to spend communicating against how much information we communicate. The more information we need to convey, the longer it is going to take. Another aspect of this decision process is the cost of losing information. If the cost of losing information is small, I’ll probably value speed – fewer words, and less precise words. If the cost of losing information is high, I’ll value information – more words, and more precise words. If I tell you to ‘get cookies’ and you come back with snickerdoodles instead of chocolate chip, no harm done. If I tell you to ‘get cookies’ and fail to mention I have a deathly allergy to peanuts, and you come back with a cookie with peanuts, well, that’s a big cost to losing information about what kind of cookies I need you to get.
Natural Language can be Good for Communication…
Yes, the article title is clickbait. It has plenty of truth, but as good software engineers, we evaluate things in terms of trade offs. As we just discussed, there is a trade off in communication between speed and information. Natural language tends to favor speed. It tends to be spoken and experienced in a richer context, and doesn’t need to convey all the details about the environment, because the participants in the communication channel can get those through other means, like body language, tone of voice, smell, culture, shared experiences, and so on. Many natural language communications are low stakes. If data is missing, you can ask for clarification. If data is misinterpreted, maybe you offend someone a little bit and you can apologize.
…But We Need Something More
When we build systems that need to be durable, reliable, scalable, and long-lived, we have different communication needs than what natural language is typically used for. If something is done wrong, the cost is often high. This could be a monetary loss, when an online service goes down and customers go to a competitor. This could be a physical loss, when a temperature control system breaks and perishable goods are spoiled. This could be a loss of human life, when a safety system fails. Requirements need to be precise and unambiguous. High stakes professions usually develop their own technical jargon and domain specific languages for exactly this reason. Lawyers use legalese, because if they don’t, someone is going to lose a lot of money. Medical professionals use medspeak, because if they don’t, in a critical situation, someone is going to die.
This is where things like Domain Specific Languages and Ubiquitous Language come in. Legalese and medspeak are examples of Domain Specific Languages. People in those professions have a shared Ubiquitous Language for certain critical concepts and details. These languages provide precise definitions of terms and broader context, and compress information for efficient communication. Every domain and every situation is different. But they all have communication needs. Language is the means of communication, and language can be optimized to fit the speed and information needs of a given situation. The better your language is – the faster you can communicate more information accurately – the faster you can work and the better the outcomes will be.
Conclusion
Nothing is perfect. Communication between people and across time is always going to lose data and be misinterpreted. But we can always do better. The more we understand the nature of communication, the more we can account for its shortcomings, and build better communication systems – better languages – to get better results.