Stop Saying Entropy is Disorder

We misunderstand entropy because we’re taught it’s a measure of disorder, but the real definition of entropy is far more useful and intuitive.

Look, I get it. You can’t throw a textbook at someone with no formal understanding of statistical mechanics when they ask a question. An answer that someone cannot understand is worth as much to them as no answer at all. Even taking this fact into account, you shouldn’t give someone a wrong answer because you think they can’t understand the right answer.

In this article, I’m going to explain entropy and how it has nothing to do with disorder. Since others have written about the history of the association, I won’t cover it in this article. I will also refrain from using any math that a middle school student couldn’t understand. I will explain a few technical terms to help you better understand entropy in case someone wants to know more. Although I will focus on a few specific examples, these concepts apply to far more than the scenarios I present.

Microstates and Macrostates

If you’re talking to someone who wants a quick answer, you don’t need to go into the differences between macrostates and microstates. Still, I want to make sure you understand the concept inside and out in case you’re talking to someone who wants the full answer.

A microstate is an exact description of all parts of a system. For example, if you were to flip a penny, a nickel, a dime, and a quarter, an example microstate would look like

  • Penny: Heads
  • Nickel: Tails
  • Dime: Tails
  • Quarter: Tails

With a few coins, you can specify a microstate without trouble. To describe all the atoms of an everyday system such as a gas, you would need around 420 times the world’s current total data center storage capacity. Most of this information would be irrelevant to what you actually want to know, such as the internal energy of the gas or its temperature. For this reason, we describe systems with macroscopic properties. We call such a description a macrostate as it uses macroscopic properties to describe a system. Here are some example macrostates:

  • Exactly 1,000,000 gas particles in the left half of a box.
  • Between 900,000 and 1,100,000 gas particles in the left half of a box.
  • 1,000 Joules of heat in a solid.
  • A mixture of 50% ice and 50% liquid water.

If we go back to the coin-flipping example, “three tails” would be a macrostate. Note that four microstates correspond to the macrostate three tails: { HTTT, THTT, TTHT, TTTH }. In fact, one macrostate can correspond to any number of microstates. If all microstates are equally likely, macrostates with a lot of microstates are more likely. We call the number of microstates that correspond to a macrostate the multiplicity of the macrostate. We represent multiplicity with an omega (Ω) symbol.
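To check the count of microstates for yourself, here is a minimal Python sketch that lists every microstate of the four coins and groups them by macrostate (the number of tails). The names `microstates` and `multiplicities` are made up for this illustration, and the sketch assumes fair coins so that every microstate is equally likely.

```python
from collections import Counter
from itertools import product

COINS = ["penny", "nickel", "dime", "quarter"]


def microstates(n_coins):
    """Return every exact heads/tails assignment for n_coins coins."""
    return ["".join(flips) for flips in product("HT", repeat=n_coins)]


def multiplicities(n_coins):
    """Map each macrostate (number of tails) to its number of microstates."""
    return Counter(state.count("T") for state in microstates(n_coins))


omega = multiplicities(len(COINS))
total = 2 ** len(COINS)
for tails in sorted(omega):
    # Assumes fair coins, so probability = multiplicity / total microstates.
    print(f"{tails} tails: multiplicity {omega[tails]}, probability {omega[tails] / total:.4f}")
```

The macrostate with two tails has the largest multiplicity, which is why it is the one you should expect to see most often.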

What is Entropy?

Like almost every concept in physics, entropy connects concepts together:

  • Entropy measures the probability¹ of a macrostate. The more likely the macrostate, the higher the entropy.
  • Changes in entropy relate temperature to changes in internal energy.

If you can find out how likely each macrostate is, you can then find out how the system responds to changes in temperature and internal energy. You can then use that information to come up with more information, such as the ideal gas law.

If you don’t want to go through the definition of a macrostate, you could give the definitions:

  • Entropy measures how likely a system will be in some configuration/state.
  • Entropy relates temperature to energy.

After the first definition, I would recommend giving some example states.

The Second Law of Thermodynamics

If you’ve heard of entropy, you’ve likely heard that the Second Law of Thermodynamics states that entropy never decreases. For entropy to decrease, a system would have to move from a more likely state to a less likely state. While possible, doing so is generally so improbable that you would need to wait an absurd amount of time for it to happen.

Example: Gas in a Box

Let’s say I have a box with a divider in the middle. Let’s also say that I have 1,000,000 gas particles in the left half of the box and 0 particles in the right half. As soon as I lift the divider, each particle has a 1% chance of moving to the other half every second. The box is large enough such that each gas particle has negligible interactions with any other gas particle.

If we look at the system at the start, the number of particles in the left half can only stay the same or decrease. If every particle has a 1/100 chance of switching sides, then you would expect around 1/100 of the 1,000,000 particles (i.e. 10,000 particles) to move to the right. More or fewer than 10,000 can move to the right, but on average 10,000 particles will move to the right. After that point, around 1/100 of the remaining 990,000 particles in the left half (i.e. 9,900) will move to the right. On the other hand, 1/100 of the 10,000 particles in the right half (i.e. 100) will move back to the left. We end up with a net flow of 9,800 particles from the left to the right.

You can find the expected net flow of particles by taking the difference between the number of particles in each half and multiplying by 1%. Going back to the previous example, during the first second, you would expect (1,000,000 – 0) × 1/100 = 10,000 particles to move from left to right. During the next second, you would expect (990,000 – 10,000) × 1/100 = 9,800 particles to move from left to right. As you can see, particles tend to move from the side with more particles to the side with fewer particles. If you wait long enough, both sides will have about the same number of particles. Of course, once it reaches 500,000 on both sides, it won’t stay at 500,000 exactly, but it won’t deviate too far from 500,000. I don’t want to go too deep in the math, so I’ll leave it here. If you want to know more, the system becomes like a Binomial Distribution once it reaches thermal equilibrium.
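The net-flow rule above is easy to step through in code. This is a minimal sketch that tracks only the expected (average) counts, not the random scatter around them. The function name `expected_counts` and its parameters are hypothetical, chosen for this illustration.

```python
def expected_counts(total=1_000_000, hop_percent=1, seconds=5):
    """Step the expected particle counts forward one second at a time.

    Each second, the expected net flow from left to right is
    (left - right) * hop_percent / 100, as described in the text.
    Returns a list of (net_flow, left, right) tuples, one per second.
    """
    left, right = float(total), 0.0
    history = []
    for _ in range(seconds):
        net_flow = (left - right) * hop_percent / 100
        left -= net_flow
        right += net_flow
        history.append((net_flow, left, right))
    return history


for second, (flow, left, right) in enumerate(expected_counts(), start=1):
    print(f"second {second}: net flow {flow:,.0f}, left {left:,.0f}, right {right:,.0f}")
```

The first two net flows reproduce the 10,000 and 9,800 from the worked example. If you raise `seconds`, the net flow keeps shrinking and both counts creep towards 500,000.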

This system will move towards macrostates with equal numbers of particles in each half. To bring this back to entropy, remember that entropy measures the probability of the macrostate. You’ll have a higher probability of around equal numbers of particles in both halves than you will of one half having more particles than the other.
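To put rough numbers on how lopsided these probabilities are, here is a small sketch using the binomial picture mentioned earlier. It assumes the system has reached equilibrium, so each particle is independently equally likely to be in either half. It also uses 100 particles rather than 1,000,000 so that the sums stay small. The function names are made up for this example.

```python
from math import comb


def probability_left_count(n_particles, n_left):
    """Probability that exactly n_left of n_particles sit in the left half.

    Assumes each particle is independently 50/50 between the halves.
    """
    return comb(n_particles, n_left) / 2**n_particles


def probability_left_between(n_particles, low, high):
    """Probability that the left-half count lies in [low, high], inclusive."""
    favourable = sum(comb(n_particles, k) for k in range(low, high + 1))
    return favourable / 2**n_particles


n = 100
print(f"exactly half on the left:  {probability_left_count(n, n // 2):.4f}")
print(f"between 40 and 60 on left: {probability_left_between(n, 40, 60):.4f}")
print(f"all 100 on the left:       {probability_left_count(n, n):.3e}")
```

Even with only 100 particles, "roughly even" macrostates soak up nearly all of the probability, and the gap only widens as the particle count grows.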

Isolated Systems

You might think that we could establish a scientific law that states “a system will tend to its most probable macrostate,” and you’d be almost right. If we make a slight change, we have a statement of the Second Law of Thermodynamics. If we find some way of forcing a system into a less probable macrostate, we can figure out the restrictions on our new law.

In the case of the Gas in a Box example, we want a large difference between the number of particles on each side. We could force the system back into a less probable macrostate by pushing all the particles to the left half and then sliding the divider down. We could also create an unlikely macrostate by constantly adding particles to the left side of the box and removing particles from the right side of the box. We could also cover the inside of the left side of the box with some material that would stick the particles to the sides of the box and trap them. You can come up with more examples, but these suffice. What do all these examples have in common?

In all cases, we’re adding something to or removing something from the system, whether it be energy or matter. We need to refine the Second Law of Thermodynamics to only apply to isolated systems. Isolated systems don’t exchange energy or particles with their environment. In contrast, closed systems exchange energy with their environment and open systems exchange energy and matter with their environment. For this reason, the Second Law of Thermodynamics states that an isolated system tends to move towards its most probable macrostate.

Isn’t it Still Possible to go back to the Original State?

Note that in the Gas in the Box example, all particles could move back into the left side of the box, but it’s almost guaranteed to never happen. For example, how long do you think it would take on average for the Gas in the Box example to return to its original state?

A million years? No.

A billion years? No.

A trillion trillion trillion trillion years? Not even close.

It would take around 10³⁰¹⁰²² years. The heat death of the universe will occur in about 10¹⁰⁰ years for comparison. The sheer amount of time is so large that I don’t even know what to compare it to.
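One way to arrive at a number of this size is a back-of-the-envelope estimate. The sketch below is an assumption about how such a figure can be reached, not necessarily the exact calculation behind it. It assumes each particle is equally likely to be in either half and that the box shows you one fresh, independent arrangement per second. The chance of finding all particles on the left is then 1 in 2^1,000,000, and the expected wait is about 2^1,000,000 seconds. The number is far too large for a float, so the code works with its base-10 logarithm.

```python
from math import log10

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def log10_years_until_all_left(n_particles, arrangements_per_second=1.0):
    """Base-10 logarithm of the expected wait, in years.

    Assumes independent 50/50 particles and a fixed number of fresh,
    independent arrangements per second (both simplifying assumptions).
    """
    log10_seconds = n_particles * log10(2) - log10(arrangements_per_second)
    return log10_seconds - log10(SECONDS_PER_YEAR)


exponent = log10_years_until_all_left(1_000_000)
print(f"about 10^{int(exponent)} years")
```

Changing the assumed rate of arrangements barely matters here: even a trillion arrangements per second only knocks 12 off an exponent that has six digits.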

Even if you only wanted 10,000 (1% of a million) more particles on one side than the other, you would still have to wait around 10 trillion trillion trillion years. After waiting that long, only black holes and the remains of stars will exist in the universe. The box would have disintegrated long before then and none of this would matter.

And you have to keep in mind that a million particle system is nothing in the grand scheme of things. A system with a million particles is so small that it’s not even a drop in the bucket. A drop would have around a billion trillion particles. If you were to make the box with the density of air at standard temperature and pressure (STP), you could only fit 7 viruses in it.

If I let my fingers wander idly over the keys of a typewriter it might happen that my screed made an intelligible sentence. If an army of monkeys were strumming on typewriters they might write all the books in the British Museum. The chance of their doing so is decidedly more favourable than the chance of the molecules returning to one half of the vessel.
— Arthur Eddington, referring to a system like the gas in a box scenario

In short, it can, but you, the Earth, and your system will be long gone before it does.

Poincaré Recurrence Theorem

If a system has a finite volume and infinite time, it will almost surely eventually return to its original state. Remember that nothing is making sure that there are around the same number of particles distributed between both sides of the box. An uneven distribution tends not to happen because it is improbable. That being said, by the Infinite Monkey Theorem, any event with a non-zero probability will almost surely happen given enough time.

Why not use Entropy in the Statement of the Second Law?

I consider the probability statement to be the best statement of the Second Law of Thermodynamics. You can derive everything else from it and it’s the most intuitive way of saying it. You don’t introduce any complex words and every word you use makes sense. From there, you can introduce entropy as a measure of probability and everything falls into place. If you start with the entropy statement of the Second Law of Thermodynamics, people who don’t understand entropy (including those who think entropy is disorder) will misunderstand the law. On the other hand, even if you kind of understand probability, the probability statement of the Second Law of Thermodynamics makes sense.

Heat Flow and the Second Law of Thermodynamics

If you go back to the Gas in a Box example, imagine we fill both sides of the box with an equal number of particles. Let’s say that the particles on the left are much faster than the particles on the right. If we remove the divider, the faster particles and the slower particles will mix and both sides will have the same average particle speed.

The average speed of the particles determines the temperature. Since the left side of the box has faster particles than the right side before we remove the divider, it is hotter. Once we remove the divider, the particles will spread throughout the box and both sides will have the same temperature. In other words, the heat from the left side of the box flows into the right side of the box. This fact leads to another way of expressing the Second Law of Thermodynamics: heat flows from hot things to cold things in an isolated system.

Of course, you don’t need to exchange molecules for heat to move from hot to cold. Say we left the divider up between the two sides and that the divider could store and emit energy. If a particle hits the divider on a spot with less energy, then the particle will give some of its energy to the divider. If a particle hits the divider on a spot with more energy, then the divider will give some of its energy to the particle. Since the hot side has more energy than the cold side, the hot side will give energy to the divider and the cold side will take energy from the divider. The energy will flow from the hot side to the divider to the cold side as heat.

Entropy as a Measure of Used Up Energy

This view of entropy is more of a side effect than either of the definitions I gave you earlier. As a system moves from a less likely state to a more likely state, the system will transfer energy around. You can use that energy to do work. Once it’s in the most likely state, it will stop moving energy around and you can no longer use the system to do work.

If we replace the divider in the Gas in a Box with a sliding divider, the gas will expand like a piston and put energy into moving the divider. If we were to put a weight on top of the divider, the expanding gas would push the weight up until the force of the expanding gas matches the weight. At this point, we can’t get any more work out of the system. Now, you still could use the system to convert energy to work, but the system itself has run out of available energy.

Entropy as a Measure of Lost Information

Information is the resolution of uncertainty. The more information you have, the more questions you can answer. We measure information in bits, and every bit is an answer to a yes or no question.

The entropy of a macrostate corresponds to the number of microstates. As the system moves from a lower entropy macrostate to a higher entropy macrostate, you will lose information about the system.

Let’s go back to the Gas in a Box example. If the system were in the macrostate “all particles are in the left half of the box,” we could only be in one possible microstate. If the system were in the macrostate “each half has the same number of particles,” you would need about 1,000,000 more bits of information to know which microstate you’re in.
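If you want to see where a figure like this comes from, here is a short sketch. It assumes the particles are distinguishable, so the number of microstates with a given number of particles on the left is a binomial coefficient. The number of extra yes-or-no answers you need is then the base-2 logarithm of that count. The function name `log2_multiplicity` is made up for this example, and it uses `math.lgamma` to avoid building the enormous number itself.

```python
from math import lgamma, log


def log2_multiplicity(n_particles, n_left):
    """Bits needed to single out one microstate of the given macrostate.

    Computes log2 of "n_particles choose n_left" via log-gamma,
    assuming distinguishable particles and two halves of the box.
    """
    ln_omega = (
        lgamma(n_particles + 1)
        - lgamma(n_left + 1)
        - lgamma(n_particles - n_left + 1)
    )
    return ln_omega / log(2)


n = 1_000_000
print(f"all on the left: {log2_multiplicity(n, n):,.0f} bits")
print(f"even split:      {log2_multiplicity(n, n // 2):,.0f} bits")
```

The all-on-the-left macrostate needs zero bits because it has a single microstate, while the even split needs just under a million.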

Like the view of entropy as a measure of used up energy, the loss of information is a side effect of the probability definition of entropy.

What About Entropy in Information Theory?

Entropy in information theory refers to the average amount of information per possible outcome in a variable. For example, if someone told you a word started with the letter e, you would have a low chance of guessing the next letter. If someone told you a word started with the letter q, the next letter is going to be u unless the word is a transliteration like Qatar. In this sense, knowing the first letter is a q resolves most of the uncertainty of the next letter. Put simply, q is a high information letter and e is a low information letter. If you take the average amount of information per letter (taking the frequency of the letters into account) for some text, you end up with the information-theoretical entropy of the text. In the case of the English language, there is around 1 bit of information per character. If you want a basic overview of this version of entropy, then check out this article about the number of possible unique English tweets.
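For a hands-on feel, here is a minimal sketch of the simplest version of this calculation: the entropy of a text based only on how often each character appears. The function name is made up for this example. Note the limitation: because this sketch ignores context (such as a u usually following a q), it gives noticeably higher values for English than the roughly 1 bit per character quoted above, which takes the surrounding letters into account.

```python
from collections import Counter
from math import log2


def shannon_entropy_bits(text):
    """Average bits per character, using single-character frequencies only."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * log2(n / total) for n in counts.values())


print(shannon_entropy_bits("aaaa"))  # one possible character: no uncertainty
print(shannon_entropy_bits("aabb"))  # two equally likely characters
print(shannon_entropy_bits("stop saying entropy is disorder"))
```

A text made of one repeated character carries no information per character, and two equally common characters carry one bit each, matching the idea of a bit as the answer to a yes or no question.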

These concepts have the same name because they have the same mathematical formula. They might also share the name because of a suggestion from von Neumann.

I’m not going to go into any more depth here. I dislike entropy as a measure of lost information because people have enough misconceptions about Information Theory and entropy already. We don’t need misconceptions about one leading to misconceptions about the other.

Why do we Keep Talking About Gases?

The Second Law of Thermodynamics and entropy apply everywhere in the universe and all states of matter. I chose to use ideal gases in my examples because ideal gases have a lot of simple properties that make them good for explaining entropy and the Second Law of Thermodynamics.

What About Disorder?

When you hear the term “disorder,” you think disorganization. You have this image of everything lined up in a grid system falling apart into an unstructured goop. This disorder description of entropy falls apart when considering:

  • Oil and water separating (Read through the top three answers as they’ll go into more depth about why all these systems form something more “ordered”).
  • The formation of crystals. Crystals are so ordered that you can describe them with the positions of a few atoms and three primitive translation vectors.
  • Atomic dipoles (magnetic or electric) in a material aligning.
  • Refrigerating things.
  • Reverse osmosis.
  • Anything a biological cell does.
  • Anything humans build, from the pyramids to the computer or smartphone you’re using to read this article.

All these scenarios involve releasing energy into the environment, and none of them seems to make the universe more disordered. A crystal that forms by making the air particles move a little faster seems like a net gain of order. If everything tends towards disorder, then none of these things should be possible. So what gives? In all of these cases, the total entropy of the entire universe increases while the local entropy decreases. As we see in this answer to the oil and water question, the more “disorganized” macrostates need more energy than the more “organized” macrostates. By moving to a lower energy macrostate, the system can release energy into its surroundings. The energy released increases the entropy of its surroundings enough to offset the local decrease in entropy. In short, the entire universe is moving towards more likely macrostates, even if some parts of it are moving towards less likely macrostates.

To make entropy relate to disorder, you have to take disorder to mean randomness, but that’s still not enough. You then have to take randomness to mean probability. It’s as if people say entropy is disorder for no reason other than that was how they heard it defined. Here’s an example of such an article. My college-level Physics I textbook is another. At that point, why not say probability? Why use a vague word that you have to redefine as a synonym for another word? It’s not like the concept of disorder simplifies anything. In fact, the vagueness of disorder makes entropy even harder to understand. What does it mean for a system to be disorderly? How do you measure disorder? Does someone look at a system and say “Yep, that’s about seven disorder”? Why is disorder measured in units of energy divided by temperature? Unlike the disorder “definition,” the two definitions of entropy and the Second Law of Thermodynamics I’ve given in this article are simple but exact, powerful but intuitive.

In short, disorder is vague, meaningless, unintuitive, and contributes to misconceptions. Probability is exact, meaningful, intuitive, and promotes understanding.

The Problem with Bad Definitions of Entropy

As with all misconceptions in science, people use them to make dumb jokes or add credibility to some psychological phenomenon. (Building habits has nothing to do with Newton’s First Law.)

While a few bad jokes are a minor problem, scientific illiteracy is a much bigger problem. As holes in a piece of fabric make it easier to tear new holes, misconceptions lead to more misconceptions. The vagueness of “disorder” allows people to substitute the misguided intuitions of their audience for proof.

While all pseudoscientists abuse scientific illiteracy, no one gets as much out of defining entropy as disorder as creationists. To keep this article short, I moved that discussion into a separate article, Creationism and Entropy. In that article, I show how the misunderstanding of entropy fuels the argument.


Pseudoscientists are going to lie about whatever they need to lie about to sell you whatever they want to sell. As members of the scientific community, we can at least make it harder for them. We should not use vague or meaningless definitions to define our terms. Instead, we should take Einstein’s advice to make things “as simple as possible, but no simpler.” If someone asks you what entropy is, do not tell them it’s something to do with disorder. Instead, give them these two definitions:

  • Entropy measures the probability of a state.
  • Entropy relates energy to temperature.

If someone asks you about the Second Law of Thermodynamics, don’t say “entropy always increases.” Most people still think entropy relates to disorder and they won’t distinguish between isolated and open systems. Instead, tell them

  • An isolated system will tend to move towards a more probable state.
  • Heat flows from hot to cold unless you put energy into the system to force it to go the other way.
  • Energy tends to spread out. As it spreads out, we can use less and less of it.

The first statement is the most accurate and should make intuitive sense. The second statement fits into what most people already know but doesn’t explain why it happens. The last statement explains why we can’t reuse energy and fits into the second statement.

Now that we know what entropy is, let’s put our understanding to use in my other article, Let’s Derive the Ideal Gas Law From Scratch!


¹More precisely, entropy measures the number of microstates that correspond to a macrostate, a.k.a. the multiplicity. I will defend defining entropy in terms of probability for this article because the multiplicity for a macrostate is directly related to the probability of a macrostate and the total number of microstates for the system, which turns out to be a constant added to every entropy.
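The footnote's claim can be written out in one line. This sketch assumes Boltzmann's formula for entropy, S = k_B ln Ω, which the article does not state explicitly, and it assumes all microstates are equally likely. The symbols k_B (Boltzmann's constant) and Ω_total (the total number of microstates of the system) are introduced here for illustration.

```latex
\begin{align*}
S &= k_B \ln \Omega, \qquad P = \frac{\Omega}{\Omega_{\text{total}}} \\
S &= k_B \ln\left(P \, \Omega_{\text{total}}\right)
   = k_B \ln P + k_B \ln \Omega_{\text{total}}
\end{align*}
```

Since the last term is the same for every macrostate of a given system, ranking macrostates by probability and ranking them by entropy give the same order.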

Currently an Undergrad Student Majoring in Physics, Computer Science, and Math with a Minor in High Performance Computing. I intend to get a PhD in Physics.
