Essay:Painting the World

Space is big. Really big. You just won't believe how vastly hugely mindbogglingly big it is...

The scale of the universe
I tend to refer back to this Douglas Adams quote, taken from The Hitchhiker's Guide to the Galaxy, a lot. This is because, in my most humble of opinions, it is perhaps the closest to a true statement we have in the entire wide world. I doubt evidence is going to dramatically alter it as a statement, and I'm pretty sure new observations won't turn it on its head - no matter how we look at the universe, and what we do to make these observations, its size and scale are going to remain absolutely incredible. The scale of the universe is literally impossible to grasp, and I don't use the words "literally impossible" here lightly as a mere rhetorical flourish; I think it truly is impossible to do. The universe always has been, and always will be, bigger than we are, and that very property is what makes it so unfathomable.

There are about 10^80 atoms in the observable universe. I don't need to tell anyone that this is "a fucking lot" but let's just lay it out. That number is:


 * 100,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000

80 zeros in there, and there's no way to put it truly in scale with anything else; once we're past the first ten or so orders of magnitude it becomes a bit difficult to visualise. And that's just the estimated number of atoms too; for particles of energy it would be more, though it's a pain to calculate and the number might not look terribly more impressive when written out like that, adding perhaps a dozen more zeroes on the end. It would seem, in our limited way of writing down numbers, to be relatively negligibly bigger despite being billions upon billions of times larger.

To even attempt a bit of context is fraught with difficulty. For instance, we can say that the Bible has about 180,000 words in it... but that doesn't help at all, as that's about 6 orders of magnitude accounted for out of 80. What about two Bibles? Again, that doesn't actually add an order of magnitude - for that we need 10 Bibles. Then 100, then 1000. Even if we take what I think are self-gratifying overestimates, the claim that there have been 7.5 billion Bibles, that makes about 10^15 words:


 * 1,000,000,000,000,000

That still leaves over 60 orders of magnitude more atoms. That's 10 times larger, sixty times over, not 60 times larger as we might instinctively think about it. Take every book ever written or published, and every word on the internet, and you would still not be able to approach 10^80. It defies analogy.
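The arithmetic behind that gap is easy to check. What follows is only a rough sketch using the figures quoted above (180,000 words per Bible, 7.5 billion copies, 10^80 atoms); the variable names are mine and purely illustrative.

```python
import math

WORDS_PER_BIBLE = 180_000   # figure quoted in the text
BIBLES_PRINTED = 7.5e9      # the (probably generous) claim quoted in the text
ATOMS_EXPONENT = 80         # 10^80 atoms in the observable universe

# Total words across every copy, expressed as a power of ten.
total_words = WORDS_PER_BIBLE * BIBLES_PRINTED
words_exponent = math.log10(total_words)

# Orders of magnitude still separating "all those words" from "all those atoms".
gap = ATOMS_EXPONENT - words_exponent

print(f"total words is about 10^{words_exponent:.1f}")
print(f"orders of magnitude still missing: {gap:.1f}")
```

Each missing order of magnitude is another factor of ten, which is why the gap can't be closed by piling up more books.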

This is hardly surprising. A word printed on a page is made of atoms, assembled into molecules of ink printed down onto the page. We could never reach a word count of 10^80, ever, because we'd simply run out of stuff long before reaching it.

Co-ordinates
Suppose we wanted to record the position of each atom in the universe, to make an accurate "snapshot" of what it really looked like at any one time. This would, of course, be still a very limited picture. It wouldn't display any of the laws of physics or tell us anything time dependent because it would only be a single, lonely keyframe, but it's a nice thought experiment to think about, and something that is capable of doing that much is often referred to as "Laplace's Demon".

To do this, we'd need to record at least four pieces of information. Each atom will need a unique identifying number, although I suppose in principle such a thing is optional, and this would be at least 80 digits long, as we'd need to be able to count to 10^80 as shown above. Then we'd need X, Y and Z co-ordinates - or radial co-ordinates if you prefer - which comprise three essential pieces of information to locate the atom in 3-dimensional space. For this, each co-ordinate would need to contain sufficient information to place the atom precisely within the error region created by the uncertainty principle, out of something the size of the universe. The radius of the observable universe is about 45 billion light years (we can take something of a shortcut by defining an arbitrary centre and slicing the information in half with a + or - co-ordinate, and this is my one and only nod to base-2 in this section, though in strict information theory terms this doesn't actually do anything), which is about 4x10^26 metres, but the radius of the atomic nucleus is in the range of 10^-14 to 10^-15 metres, so we need about 41 numeric digits to place our atoms with sufficient precision. And we need three numbers of that size. Oh, we'd probably need to say what element it was too, and that's another 3 digit number to identify up to the 120 or so currently known. So just one atom would be written as


 * ID: 001,2406,457,068,127,175,865,847,234,234,246,300,240,055,066,246,564,425,853,009,235,660,246,501


 * El: 45


 * X: +452,643,622,601,602,691,001,027,012,532,496,261,460,300


 * Y: -002,446,352,357,877,799,246,375,350,112,573,886,236,236


 * Z: +136,134,246,466,132,112,248,258,246,246,352,537,090,246

Repeat 100,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000 times.
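For anyone who wants to check the digit counts in the record above, here is a rough sketch. The constants are the ones quoted in the text; the helper name `digits_needed` is my own invention, and the result lands on 41 or 42 digits per co-ordinate depending on which end of the nuclear radius range you pick.

```python
import math

UNIVERSE_RADIUS_M = 4e26    # about 45 billion light years, from the text
NUCLEUS_RADIUS_M = 1e-15    # small end of the 10^-14 to 10^-15 m range
ATOM_COUNT_EXPONENT = 80    # 10^80 atoms
ELEMENT_DIGITS = 3          # enough for the 120 or so known elements


def digits_needed(span: float, resolution: float) -> int:
    """Decimal digits needed to tell apart span/resolution distinct positions."""
    return math.ceil(math.log10(span / resolution))


coord_digits = digits_needed(UNIVERSE_RADIUS_M, NUCLEUS_RADIUS_M)
id_digits = ATOM_COUNT_EXPONENT  # IDs 0 .. 10^80 - 1 fit in 80 digits
digits_per_atom = id_digits + ELEMENT_DIGITS + 3 * coord_digits

# Total digits to write down, as a power of ten: 10^80 records of that size.
total_digits_exponent = ATOM_COUNT_EXPONENT + math.log10(digits_per_atom)

print(coord_digits, digits_per_atom, round(total_digits_exponent, 1))
```

The last number is the point: the snapshot needs more written digits than there are atoms to write them with.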

Quickly you come to realise that to do this is like writing a book with a word count larger than 10^80 - you exceed the amount of material in the universe in order to just record sufficient information about the universe. The entire project is just a little bit futile. It's simply not possible. We'd need something larger than the universe to do it, and who knows how much time to process that information. We can throw Turing completeness into the mix to complicate things, but even given that, the only thing capable of storing all the information in the universe is the universe itself, and the only thing capable of computing, processing and storing 100 years of the universe's existence is the universe itself over a period of 100 years (or more).

Storing information
To change track slightly, we can look at how we could conceive of storing information. Can we really store more bits of information in less space? Consider the following string of 1s and 0s that make up some random piece of information.


 * 0 1 1 0 1 1 0 0 1 1 0 0 1 0 1 0

That's 16 characters, or 16 bits. Can we represent that in fewer? Not really. We could try a clever compression algorithm that notices the repeating units in places. For instance, the sequence 1 1 0 0 appears twice, so we could treat that as a redundancy. But can we encode both the code and the compression algorithm that explains the redundancies in the same space? Not likely. We'd find ourselves pretty stumped if we tried to maintain all the relevant information and pack it down. At least we can't do this in a model where each bit actually does represent a genuine bit of information - as a true bit of information possesses no redundancy anyway. We can compress redundant bits, but we can't compress actual defined information in a strict "information theory" sense. If we have only 8 bit spaces available, we simply can't encode those 16 bits.
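The reason is a plain counting argument, which the sketch below tries to make concrete. The `truncate` encoder is a hypothetical stand-in for "any scheme that maps 16 bits to 8"; whatever scheme you pick, there are only 256 possible outputs to share among 65,536 inputs.

```python
from itertools import product


def truncate(bits: tuple) -> tuple:
    """Hypothetical 16-to-8 encoder: simply keep the first eight bits."""
    return bits[:8]


# Every possible 16-bit message, and the set of codes they collapse onto.
all_messages = list(product((0, 1), repeat=16))
distinct_codes = {truncate(message) for message in all_messages}

# On average, this many different messages must share each 8-bit code.
messages_per_code = len(all_messages) // len(distinct_codes)

print(len(all_messages), len(distinct_codes), messages_per_code)
```

Swapping `truncate` for something cleverer changes which messages collide, not whether they do.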

We can't squeeze these 16 bits:


 * 0 1 1 0 1 1 0 0 1 1 0 0 1 0 1 0

...into these eight spaces:


 * _ _ _ _ _ _ _ _



...without losing something of value. This is the case where there is no redundancy, but what about when there is some clear redundancy and so we don't actually lose real information when we do the squeeze? Although this is going to move the goalposts a bit and assume that the compression algorithm doesn't need to be represented in the same space, the compression algorithm is going to be the same in all cases. Consider this string:


 * 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0

In this case there are all 1s on the left side of the string and all 0s on the right. There's quite a bit of redundancy here so it seems obvious how we can represent it. Say, if we took the first 8 bits and averaged them, and took the last 8 bits and averaged them, we can quite neatly express it in just two:


 * 1 0

And when we apply our rule backwards, we get a string of eight 1s and eight 0s again. This is how compression and information work; it's fairly simple stuff.
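As a minimal sketch of that averaging rule (the function names and the "ties round up to 1" choice are my assumptions, not anything standard):

```python
def compress(bits, block=8):
    """Replace each block with its rounded average (ties round up to 1)."""
    return [
        1 if 2 * sum(bits[i:i + block]) >= block else 0
        for i in range(0, len(bits), block)
    ]


def decompress(code, block=8):
    """Apply the rule backwards: repeat each code bit across a whole block."""
    return [bit for bit in code for _ in range(block)]


original = [1] * 8 + [0] * 8
code = compress(original)
restored = decompress(code)

print(code)
print(restored == original)
```

For this particular string the round trip is exact, because every bit in a block already agrees with the block's average.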

Problems of compression
Our string of 1s and 0s above collapses neatly down into a simple two bit code. It's clear and beautiful. But what about something with just a slight bit more complexity?


 * 1 0 1 1 0 1 1 1 0 0 0 0 1 0 1 0

Here we have an issue in trying to create the same level of compression. The first eight still possess more 1s than 0s, the latter eight still possess more 0s than 1s, so we can just about get away with compressing it down into the two bit code:


 * 1 0

Here's the problem, though: when we run our routine backwards, we get a simplified version back. We get:


 * 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0

No matter what we do, we've lost the positions of those exceptions. We've compressed them out and lost the subtleties of the code. Should we compress down, then uncompress, we will find that what we get back doesn't quite match the original code. We've lost the subtle bits of information.
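To see exactly which bits go missing, here is a small sketch that runs the same block-averaging rule down and back up in one step and lists the positions that no longer match. `round_trip` is a hypothetical helper written for this illustration.

```python
def round_trip(bits, block=8):
    """Compress each block to its majority bit, then expand it straight back."""
    restored = []
    for start in range(0, len(bits), block):
        chunk = bits[start:start + block]
        majority = 1 if 2 * sum(chunk) >= len(chunk) else 0
        restored.extend([majority] * len(chunk))
    return restored


original = [1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0]
restored = round_trip(original)

# Zero-based positions where the decompressed string disagrees with the original.
lost = [i for i, (a, b) in enumerate(zip(original, restored)) if a != b]

print(restored)
print(lost)
```

The two-bit code carries no record of where the exceptions sat, so nothing in the decompression step could ever put them back.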

While we're on compression, I'll mention a slight peeve of mine: people who put MPEGs and JPEGs into ZIP files. Why? They've already been compressed! You can't actually squeeze them any further. It's futile and just gives us an extra layer of crap to deal with just for the sake of a file that's less than 1% smaller. We're not going to miss that extra 1% of hard drive space or extra 1% of download time. Anyway...
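If you'd like to see the peeve in numbers, the sketch below uses Python's `zlib` module (DEFLATE, the method ZIP files normally use). The seeded random bytes are only my stand-in for an already-compressed JPEG or MPEG payload, on the assumption that well-compressed data looks statistically like noise; real files will differ in the details.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable

# Highly redundant data: the same four characters over and over.
redundant = b"1100" * 25_000

# Stand-in for an already-compressed file: bytes with no pattern to exploit.
already_dense = bytes(rng.getrandbits(8) for _ in range(100_000))

redundant_ratio = len(zlib.compress(redundant)) / len(redundant)
dense_ratio = len(zlib.compress(already_dense)) / len(already_dense)

print(f"redundant data shrinks to {redundant_ratio:.2%} of its size")
print(f"dense data ends up at {dense_ratio:.2%} of its size")
```

The redundant input collapses to almost nothing, while the dense input stays essentially the same size once the container overhead is counted.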

Compressing the universe
This is where we stop banging on about slightly abstract bits and pieces and start talking about the real problem at hand. We know we can't cram 16 bits of information (that is, true information with no redundancy) into 8 bits without losing something. We certainly can't break it down into just 2 bits and then re-expand it again thinking it will look the same. So, going back to the top with our 10^80 atoms, we know for a fact we could never expect to know all their locations, or even some of their locations, with the accuracy we need to represent the universe. That's a lot of atoms, and a lot of information to process, so we cannot possibly expect our brains, composed of a mere 100 billion neurons, to hold it all. It's a pretty big universe, and we have pretty small heads. Even everyone combined makes up a tiny fraction of it. We don't even need to consider the universe; our brains are a small fraction of our whole bodies, so could never expect to hold the atomic data for the fleshy bits that transport our minds about the place.

We need to compress, and we need to take certain short cuts. There is no fucking way we can understand the whole universe as it "really" is, or even small portions of it, because there's just too much information to process - and for the most part, it's not really relevant to us. That's not the same as redundancy from an information theory perspective, but there's information we can afford to lose and there's information we certainly cannot. It wouldn't do us any good to try and calculate the locations and trajectories of every atom in a bus in order to cross a road, as by the time we've finished dealing with barely a byte or two of the very first atom, those very atoms will have turned us into a flat bloody pulp by the side of the road. Similarly with nice little evolutionary stories about predators eating us - the people who thought too much about it would get eaten first, the ones who thought about it the least ran away in time.

To do this compression, we need to look at properties of the things around us. Although even these things we call "properties" are already compressions of the physical atoms and data that are constantly assaulting our brains. For instance "colour" isn't a real thing, it's photons of specific wavelengths causing photochemical reactions in the receptor cells of our eyes - the real property behind "colour" (or at least the best representation of it) would be a wavelength dependent spectrum of the photons in a particular sample. Yet we compress this force of nature to something we call "colour" and treat it very much like a real thing. It's the same thing behind other properties like "hard" and "soft", or "light" and "heavy" - they're all just atoms that are bonded differently, and then we interpret them as textures and feelings and sights and sounds.

This particular layer of grouping and modeling aside, we have to ask ourselves what properties do things in the universe share, what do they do, how do they appear to us, and so on. If something possesses a property, it gets a 1 in that box, if it doesn't, it gets 0. Perhaps there's a bit of fuzziness, but that could tick a different box to say "it's part way between the two", or it ticks both, or neither. Or perhaps one person will have a clean cut off with the two properties at one point and another will cut them at a different point. Thus is the problem of asking a yes/no question to something with a fuzzy definition. And broadly speaking we start to generate new categories like that. The problem occurs when we try to use multiple properties together to make a category. Say we have eight properties, if it ticks most of the boxes it gets a 1 in our compressed version, if it doesn't tick most of the boxes it gets a 0.


 * 1 0 1 1 0 1 1 1

...will easily compress down and give us:


 * 1

Compressing down from however many bits to just one or two like this is almost what we do instinctively with everything. However, sometimes we forget about properties and just skip straight to using the compressed model. We categorise and effectively ask the question "does it fit in this compressed category". Yes, 1. No, 0. This is essential, just because it's faster. Sometimes you might think it's two categories, A and B, but it's the same question. For example, asking "is this person a man or a woman" (M or F) is the same as asking is this person a man, or not (M or ¬M) or is this person a woman, or not (F or ¬F).

Except...

Compression artifacts of the reality
If the above sets off interesting questions about the "gender binary", then you're thinking precisely right. We compress down to a binary and it doesn't work in some cases where we uncompress again. The same thing happens with religion; we separate the world into "believers" and "non-believers" (B or ¬B) and suddenly get stuck when not all atheists have the same ideals and not all religions follow the same gods.

So this isn't all simply waxing lyrical about information theory and the scale of the universe, it does generate true effects that we see and is the root cause of many fallacies. It leads to what we think the universe looks like not matching up with reality very well. We've instinctively compressed down a complex series of questions about objects, individuals and identities into a binary, 1 or 0, and when we expand it again our pattern doesn't always match reality. Our brains project the simple model into reality, and so we suddenly become surprised when we expand:


 * 1 0

...into:


 * 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0

...and find that it doesn't quite match up with:


 * 1 0 1 1 1 0 1 1 0 0 0 1 0 0 0 0

When we talk of consciousness raising, we're talking about pointing such things out and asking people to alter their compression algorithms. The aim is that when we compress down, uncompressing our ideas no longer mismatches reality. For this we need to look at more specific categories and properties, then recognise their limits and account for them correctly. This doesn't demand that we stop categorising and compressing, just that when we choose to compress down, we lose as little information as possible. Broad categories like race, nationality and sex fail to take into account so much more information, nuance and individuality that they result in unhealthy stereotypes. They're useful in some ways, but limited in so many others. I once went to an "alternative" music night and was passed around various people based on what music everyone listened to; "hey, they listen to that band too, you're sure to get on" seemed to be the idea. Yet I can't say I enjoyed the experience too much, it was too forced. Granted, I had more in common with those people than if it was matched on less precise categories - can you imagine someone trying "hey, you're a man too, you're bound to get on with each other!" or "you're both from country X" instead? We see the latter all the time. We judge things by the information we assume about them, and this is all because once we compress down, we can't get that detail back again.

Observed to inferred
Compression and decompression like this is how we have to work, out of necessity, but it also lets us make certain deductive leaps. Well, I say "lets us", when I could easily say "forces us" too. It all depends on how useful those leaps are. Such deductions are a blessing and a curse to human thought; they're why we perform certain things better than programmed computers (like object recognition, faces for example) and other things far worse (11 dimensional calculus). The trick is to figure out whether such a leap of deduction is justified. If we need to tick 8 boxes in a row to be sure that something fits in a category, is it okay to tick just 7 and assume the last one?

Say, and these properties are left deliberately blank to stop any unwanted inferences, we observe the following:


 * 1 1 1 1 1 1 1 ?

...then compress it down to our master category:


 * 1

...and decompress it back to:


 * 1 1 1 1 1 1 1 1

We have inferred the final property, but not (yet) observed it. Was that reasonable? In short, it entirely depends on whether we're willing to change our mind if we observe it to be wrong. That's what science and falsifiability are about. The obvious example of inference is that if we see something that looks squishy, has two arms, two legs, a face, talks to you, then they're human, and so if you shoot them in the head they will die. We don't have to kill someone to assume the reasonableness of that leap of logic, observation gives us enough evidence to be sure enough not to risk it. Another inference is that if we see something that looks squishy, has two arms, two legs, a face, talks to you, then they're human and will have similar feelings to you. If you insult them, they'll feel hurt, if you smile at them, they'll feel more welcome - precisely because this is what you'd expect. Again, this is the root of psychological projection; you have a lot more information about the state of your own mind than the state of someone else's, so you really have little choice but to fill in those blank ? bits with your own experience. The Golden Rule states to treat others as you'd like them to treat you precisely because you can infer the feelings of someone else from your own in exactly this way.

Again, this remains reasonable so long as you're willing to replace your inferred assumptions with reality once they've been observed.
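One way to picture the difference between observed and inferred boxes is to keep a flag next to each value, so that an inference can be overwritten the moment a real observation turns up. This is only an illustrative sketch; `categorise`, `fill_in` and `update` are names I've made up for it.

```python
UNKNOWN = None  # marks a property we have not observed yet


def categorise(observed):
    """Majority vote over the properties we actually observed."""
    known = [bit for bit in observed if bit is not UNKNOWN]
    return 1 if 2 * sum(known) >= len(known) else 0


def fill_in(observed, category_bit):
    """Return (value, was_inferred) pairs: keep observations, infer the gaps."""
    return [
        (category_bit, True) if bit is UNKNOWN else (bit, False)
        for bit in observed
    ]


def update(beliefs, index, observation):
    """Replace an inferred value with what was actually observed."""
    updated = list(beliefs)
    updated[index] = (observation, False)
    return updated


observed = [1, 1, 1, 1, 1, 1, 1, UNKNOWN]
beliefs = fill_in(observed, categorise(observed))

# Later we finally observe the eighth property, and it turns out to be 0.
revised = update(beliefs, 7, 0)

print(beliefs[7], revised[7])
```

The inference itself is harmless; the trouble only starts if the `was_inferred` flag gets forgotten and the guess is treated as an observation.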

Properties and categorisation
Because of the compression, we need to lump things together in suitably similar groups, ones where decompression artifacts cause the fewest possible problems. This lets us quickly grasp ideas and infer properties without having to wait to observe them. Let's consider two well-known (or, at least, apparently well-known) categories people use every day.

The ticks and crosses there are observed, and given the particular properties this would be self explanatory. The others are our inferences - we mark them with "?" to suggest "do they really have this property? Or do we think that they do?". Is the table above sexist? Actually, not so much in and of itself. We have two categories of properties, that could represent any arbitrary groupings, headed by a combination of vowels and consonants. Would anyone immediately perceive the above as sexist if the titles were different? So to accuse it of sexism off hand would just be to react, in a knee-jerk manner, with pure indignation rather than to ask the more nuanced question: "do these properties actually belong together?" Sometimes they do, and sometimes they really don't. This is why we need to categorise meaningfully, so that when we compress and decompress again we don't make incorrect judgments about it - merely reacting badly to a grouping on its own, as many anti-sexists and anti-racists tend to do, doesn't help solve the decompression issue at its source.

Our "isms", what we normally call "prejudice", happen when we use the table above in either (or both) of two specific ways:


 * Firstly, by applying a No True Scotsman principle, we assume that to be a member of the category, one must possess all properties. This actually has wider implications than mere prejudice. Consider properties that we might never observe, but are nevertheless strongly implied by membership of a category. Human mortality, for instance. One wouldn't suggest that humans can be immortal, yet one wouldn't similarly insist that we die before we can truly be counted as "human". And more besides, we can subdivide such mortality into further properties - mortality to poisons or mortality to bullets, for example, and we usually only get to see one. Unless you're Rasputin, apparently.


 * Secondly, in a prescriptive sense. In fact, this applies the same fallacy as the first but from a different angle. In this case we presume that all people possess all properties. Either they must possess those properties, or they should possess these properties. Membership of the set demands it.

The difference between these two uses of what is effectively the same fallacious assumption becomes more clear if we simply add a new property to the bottom of each category:

Seems a little redundant, but this probably more closely reflects reality when we're talking about categorising people and objects. In general, to be a member of category X, we need the property "is a member of category X". In the first version of the fallacy, this property is triggered after all the observed qualities are filled: if, and only if, someone ticks all the boxes do they get a tick here and earn the label. In the second version of it, the box is ticked as soon as a certain number of the most appropriate and most salient properties are satisfied - well, we hope that they're appropriate. Arguably, this is the most damaging version from the point of view of society. This is effectively what is happening when we use stereotypes: we immediately infer certain sets of properties from just a few simple observed ones and ignore the potential for these inferences to be wrong. These are the extra connotations that we think of but might not necessarily admit to when defining the category - simply because they're either not obvious or we don't consider them to be mere inferences, and mistake them for being directly observed. After all, we've triggered the category and all properties, we know these properties must be true "by definition".

Often these connotations and unobserved characteristics, whether we admit to them or not, are far more important than the observed characteristics. This is especially true in looking at the motivation as to why we want the categories sorted in a certain way. A straight, single man, out in a bar looking for someone to have a one-night-stand with, is interested in sorting out the "is female" property mostly because of the inferred property "I can have sex with them". Sounds a crude example, but it's illustrative of the point. In this case it would be silly for such a person to go around trying to have sex with everything - although I'm assured some people have tried - to test that characteristic. Instead, he would view a few simple observed properties and make an inference - hence the absolute hiiiilarity of picking up one of the Lady Boys of Bangkok by accident.

Another pertinent example of motivated categorisation is in the abortion debate when people ask the question "is a fetus a human life?" We take an adult human of 20-30 years of age, and list their properties side by side with the properties of a fetus. Lining the two up like this we can demonstrate that they're totally different, and we can even demonstrate by certain properties that a freshly fertilised embryo has more commonalities with a carrot seed than with a grown human. Yet the question still remains: "is it a human life?". The motive here is often overlooked, but it is clear: what is under discussion is not the physically observable properties but the connotation and implication "can we kill it?" - or to use a less emotive, but eerily synonymous, phrase "can we prevent its existence as an emergent entity from continuing?" Often the argument is about the properties, or about the name of the category, but it is what we want to achieve that guides it.

Motive for why someone would like to categorise is important, and often lies with the connotations and inferences more than the observed properties. It's desirable to think that, when walking through a deserted path in the dark of night, the rustling in the bushes is just the wind; it's useful (from a fight or flight perspective) to think that it might be a stranger out to kill us. These are both common and salient deductions, but that doesn't detract from the fact that the only observed property we've experienced, so far, is a rustling bush.

Probabilistic properties
Slightly more useful than this "all-or-nothing" approach to categorisation is to admit a certain degree of probability to each property. While more accurate, it also produces a massive level of complication - to such a degree that it may lose some of its usefulness. Consider a table representing characteristics of nationality. A simple search on Google brings up some stereotypical characteristics of the English, for instance, and instead of ticks, crosses and question marks, we can apply something of a probability. These probabilities can have some basis in empirical fact, too, which helps us populate the list with some degree of accuracy, although I don't claim to have done sufficient research for the figures below - i.e., the numbers are pretty much PIDOOMA.

There are a few problems here. Of course, the "is English" gets 100% by definition - this at least contrasts it with the "born in England" property, as not every red blooded Englishman may have been born on the soil. It's this question that triggers the category and brings up this list of features - as people are mostly interested in that question, we have to trigger this as a proper "tick" for the category to exist. The major problem from a systematic point of view is that these probabilities are not independent. If one hates the French, but likes tea, is the probability of having a stiff upper lip the same as if one likes the French but hates tea? They may seem unrelated, but the thorough statistical analysis we need to use to give meaningful numbers might show correlations. The probabilities, therefore, will be averaged at the start, and dynamic as new properties are observed to be true and ticked off.

Yet there is another problem in simply using this approach. Giving a probability that someone has a property because they're part of a category, P(A), implies the existence of P(¬A). And P(¬A) is composed of competing hypotheses, some of which might be mutually exclusive and some of which might not. Why, in a list of national characteristics, simply give P(likes tea), and not P(likes coffee), or P(likes orange juice) as well? Indeed, we can have literally as many properties as we like listed, all with certain empirical probabilities associated with them. The fact that we'd list only a certain number of properties under a category and not consider others is giving undue privilege to a particular hypothesis, and so biases our thinking a lot. We need to include all possible properties equally if we want to best describe reality. Here's the conundrum: if this is the case, why categorise at all? Grouping similar properties together is rendered completely redundant by the fact that we need to give all possible properties a probability, and make them dynamic relative to each other. In short, the only way of meaningfully reflecting the world is to consider property A, and then ask what's the probability that that thing also possesses property B - naturally, the relationships implied by Bayes' theorem become useful here, but that doesn't solve the problem that categorisation is rendered redundant through this method - even though it most closely reflects reality.
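For reference, Bayes' theorem says P(B|A) = P(A|B) x P(B) / P(A). The sketch below applies it to the tea example; every number in it is made up purely for illustration (very much PIDOOMA), and the function name is my own.

```python
def bayes(p_a_given_b: float, p_b: float, p_a: float) -> float:
    """Bayes' theorem: P(B | A) = P(A | B) * P(B) / P(A)."""
    if p_a <= 0:
        raise ValueError("P(A) must be positive to condition on A")
    return p_a_given_b * p_b / p_a


# Entirely hypothetical figures for some imagined population.
p_tea = 0.40                 # P(likes tea)
p_english = 0.10             # P(born on English soil)
p_english_given_tea = 0.15   # P(born on English soil | likes tea)

p_tea_given_english = bayes(p_english_given_tea, p_tea, p_english)

print(f"P(likes tea | born on English soil) = {p_tea_given_english:.2f}")
```

Notice that answering this one question already needed three separate numbers, which hints at how quickly the bookkeeping grows.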

Instead of asking "what's the probability that an Englishman likes tea", we're asking "what's the probability that someone likes tea, given that they're born on English soil". Then "what's the probability that they have a stiff upper lip given that they like tea", and then "what's the probability that they have a stiff upper lip given that they're born on English soil". If we need to look at probabilistic relationships between P(A) and P(B), then we also need relationships between P(A) and P(C), then P(B) and P(C)... then again for D, E, F and so on. The number of relationships increases dramatically, to the point where we can't process this. As shown in the first half, we simply can't compress this much information.
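To put rough numbers on "increases dramatically", here is a small counting sketch (helper names are mine). Pairwise relationships between n yes/no properties grow as n(n-1)/2, which is bad enough, but if the properties aren't independent then a complete description needs a probability for every combination of ticks and crosses, and that grows as 2^n.

```python
import math


def pairwise_relationships(n: int) -> int:
    """Number of distinct pairs among n properties: n choose 2."""
    return math.comb(n, 2)


def joint_table_entries(n: int) -> int:
    """Number of tick/cross combinations across n yes/no properties."""
    return 2 ** n


for n in (3, 10, 20, 40):
    print(n, pairwise_relationships(n), joint_table_entries(n))
```

Forty properties is hardly a rich description of a person, yet the full table for them already has over a trillion entries.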

A map like this can't fit into our small heads so we're actually back to the starting point - where we have a set of properties, which are selected for us, and a trigger that brings them up for us.

Tight categorisation
Instead of altering the way we use groups and sets of properties to define the world, we need to be able to use them properly. By this, I mean make sure that they satisfy a certain number of conditions. Most generally, that the categories we use are tight, finely tuned machines that enable us to make good predictions about the world. I propose the following as a rough guide, though by no means do I claim it's perfect, it's just that lists often sound more profound:


 * 1) Associated properties are empirically justifiable: Such properties are often shown together, like 99% of the time, or preferably higher. Human appearance and mortality, for instance. This means that our inferences have as much basis in fact as possible, and aren't just thrown together based on what we're taught. We see someone on Fox News, we can be pretty sure they'll be right-wing.
 * 2) Exceptions are a minority: Given a set of probabilities for, say, 10 observed/inferred characteristics, then an object triggering that category but not possessing all 10 properties should be in a minority. We can't think that "terrorist" is one of these primary characteristics under "Muslim" because it's empirically demonstrable that such an overlap is a minority.
 * 3) We note a distinction between observed and inferred: While there's no hard and fast distinction between these two realms, we need to be clear on what we're actually viewing based on evidence and what we're actually inferring based on our categories. We see a rustling bush, we're inferring a murderous stranger standing behind it.
 * 4) We adapt our inferences to new observations, and new observations should be useful and specific: You can't get more information just by multiplying broad categories together. You can't really refine "man" down just by adding "gay" or "white" or "middle class" or even all three to it. If you look at an individual person and into their history and what they actually say, what they actually believe, and what they do, you can make a much more informed inference.
 * 5) We don't enforce the all-or-nothing effect of our category: If we see exceptions to our categories, we allow for them. If we observe that someone is female, then infer that she must like pink and play with dolls, we don't continue to believe this inference when proved otherwise. We don't use No True Scotsman because broader categories should be a guide to get us to more refined categories and for us to look for more information.

Of course, broad strokes like "male", "female", and then "black", "white", "gay", "straight" and so on don't end up following these rules unless we restrict their usage. What can we reasonably infer from "gay" other than sexual orientation? Musical taste? Driving skill? It's all about what predictive qualities a label or a category has, as predictions are exactly what inferred characteristics are. If we observe that someone has dark skin, what can we reliably predict about their behaviour based on this trait alone? The fact that the answer is "very little, if anything" is the rationalist justification for why racial prejudice is wrong. The fact that the overlap of "Muslims" and "terrorists" is small makes the statement "all Muslims are terrorists" nonsensical and irrational. The casual disregard with which people use such broad categories to make tight inferences sometimes shocks us, and sometimes doesn't. We chastise certain people for saying "Muslims" in a way that implies "all Muslims are X", yet we don't tend to chastise others for saying "all men are X" in the same way - even though both make the same mistake of being careless about the precision with which they're grouping things together, and about the implications they draw because of it.

It requires more mental effort to use finer categories, but less effort than processing an atom by atom, 1-to-1 scale map of the world. Yet we get out of it what we pay for in such mental effort.

The painting analogy


I called this "painting the world" because I originally planned on spending a bit more time on what I view as an apt analogy for how best to resolve these problems, something so often overlooked when discussing logical fallacies and rationality.

Consider an artist, or less pretentiously, a painter. Given a blank canvas, a painter is given the option to represent the world as they like. This isn't a comment on "what is art" or some criticism of "art", and I would certainly hope no one takes it as such - although I imagine anyone who does so wouldn't be capable of grasping this point anyway, so I don't particularly care. A cubist, or other abstract painter, might take a vast canvas and then make broad strokes to create an obscure-looking figure; perhaps it's a person, perhaps it isn't. Either way, it seems to set the art world on fire with debate. What is it? What does it mean? What are they trying to say? Perhaps in the art world, this is more interesting: a world where there can be no wrong answers and disagreements are preferred. Nightmarish, I know. They have taken their subject matter and presented it as they view the world, and now people argue over it.

Then we can look at more realist painters: those who take their time, and use their tools appropriately to make the best representation of reality they can. It takes time, it takes skill. It takes practice in both how to craft the work and how to observe the world correctly. But at the end of it all, they have captured the world far closer to how it really is - and there is no one to argue over what it is or what it represents except for those who don't find the real world, or accurate representations of it, desirable. They can be dismissed as not particularly creative in the art world, but it's a great skill and far more apt for rational discussion. No artist can make an atom-for-atom replica of a real landscape - they can't get detail so fine as to have individual blades of grass placed perfectly in the right place. Yet despite this, it's still a better representation of reality to use skill and technique to add the right hues, shades and strokes to represent the grass, and dot the occasional overgrown tuft or stray weed, than it is to get the widest brush you can find, dip it in some thick green gloop, smear it all over the page and call it "the ground". Indeed, if your representation was too abstract, you'd have to insist on calling it "the ground" in order for people to recognise it as such - analogous to those who have to insist that a category has been triggered and its inferred properties are ticked, when they don't have the observational evidence to back that up.

To achieve a good picture of reality, we need fine brush strokes, not broad ones. We need tools that are appropriate to paint such a model of the world, and we need eyes trained to see what is there and not what we think is there or what we want to be there. It takes more time, but it cannot be done with the wrong tools or the wrong approach. Just as we can't paint like Constable with a brush more suited for wallpaper paste, we can't rationally discuss the world with categories more suited to quick and dirty judgments that play to our preconceptions.

To bring this back to the discussion above, this painting is about losing as little information as possible when we're forced to compress down, so there's less to debate when we decompress afterwards. We want to paint a category where reality has already ticked as many properties as possible, and where our reconstructions and decompressions are reasonable. For this we need fine brushes, because we need them to represent, as closely as we are able, what is really there, and to stop us inferring wrong things because our brushes are too big.