The Signal and the Noise

October 23, 2022

I recently re-read Nate Silver’s book, The Signal and the Noise (The Penguin Press, 2012).

From a value investing point of view, it’s crucial to bear in mind that trying to forecast the stock market will typically cause you to make less money than you otherwise would.  It’s far more reliable and profitable over the long term to stay focused on individual businesses without ever trying to predict the market or the economy.

Yet it’s worth reviewing Silver’s book because it discusses Bayes’ rule, which is essential for anyone trying to make predictions.

Most of us, even many scientists, do a poor job both of making predictions and of updating our beliefs.  To understand why we make these errors so often, it helps to recall that we have two different mental systems:

System 1:  Operates automatically and quickly;  makes instinctual decisions based on heuristics.

System 2:  Allocates attention (which has a limited budget) to the effortful mental activities that demand it, including logic, statistics, and complex computations.

Usually we rely on System 1 to make predictions.  Most of the time, these predictions are accurate because they concern highly predictable aspects of life, such as the constant presence of gravity.

But when we encounter complex phenomena where only the careful use of proper statistical thinking can help us make good decisions, System 1 nearly always makes mistakes.  In these situations, we have to slow down and consciously activate our System 2.

Once we’ve learned to activate System 2 when it is required, there are two separate steps we need to learn:

  • First, we must train ourselves to make good predictions on the basis of all available evidence.
  • Second, we must train ourselves to test our predictions and to update our hypotheses on the basis of new information.  This is where Bayes’ rule comes in.

Here is an outline for this blog:

  • Ignore Macro Forecasting; Focus on Individual Businesses
  • Scientific Progress
  • Out of Sample Events
  • Foxes vs. Hedgehogs
  • Thinking Very Big and Very Small
  • Chaos Theory
  • Earthquakes
  • Economic Forecasting
  • Bayes’ Rule
  • The Problem of False Positives
  • Conclusion

Note:  Bayes’ rule is in the running as the most important formula in artificial intelligence.  Although a market-beating AI value investor may be 10-20 years away, it’s interesting to follow some of the developments.


Ignore Macro Forecasting; Focus on Individual Businesses

No one has ever been able to predict the stock market with any sort of reliability.  Ben Graham, the father of value investing, had an IQ of about 200.  Buffett calls Graham “the smartest man I ever knew.”  Here is what Graham said about market forecasting:

… if I have noticed anything over these 60 years on Wall Street, it is that people do not succeed in forecasting what’s going to happen to the stock market.

If you’re a value investor buying individual businesses when their stocks are cheap, then macroeconomic variables generally aren’t relevant.  Furthermore, most investors and businesspeople who pay attention to political and economic forecasts end up worse off as a result.  Here are a few good quotes from Buffett on forecasting:

Market forecasters will fill your ear but never fill your wallet.

We will continue to ignore political and economic forecasts, which are an expensive distraction for many investors and businessmen.

Charlie and I never have an opinion on the market because it wouldn’t be any good and it might interfere with the opinions we have that are good.

If we find a company we like, the level of the market will not really impact our decisions.  We will decide company by company.  We spend essentially no time thinking about macroeconomic factors.  In other words, if somebody handed us a prediction by the most revered intellectual on the subject, with figures for unemployment or interest rates or whatever it might be for the next two years, we would not pay any attention to it.  We simply try to focus on businesses that we think we understand and where we like the price and management.

The great economist John Maynard Keynes developed an investment philosophy similar to that of Buffett and Munger.  Though Keynes was a true genius, he failed twice trying to invest on the basis of macro predictions.  Finally, he realized that a concentrated value investment approach was far more effective.

Keynes did very well over decades as a focused value investor.  His best advice:

  • Buy shares when they are cheap in relation to probable intrinsic value;
  • Ignore macro and market predictions, and stay focused on a few individual businesses that you understand and whose management you believe in;
  • Hold those businesses for many years as long as the investment theses are intact;
  • Try to hold some negatively correlated investments (Keynes suggested, for example, the stock of a gold miner).


Scientific Progress

One of Silver’s chief points in the book is that we have more data than ever before, but the signal is often overwhelmed by the noise.  Says Silver:

Data-driven predictions can succeed – and they can fail.  It is when we deny our role in the process that the odds of failure rise.  Before we demand more of our data, we need to demand more of ourselves. 

When it comes to demanding more of ourselves, the work that Philip Tetlock and Barbara Mellers are doing with The Good Judgment Project is very worthwhile.

Silver points out that if the underlying incidence of true hypotheses is low, then it’s quite likely we will have many false positives.  John Ioannidis has already shown this – with respect to medical research – in his 2005 paper, ‘Why Most Published Research Findings Are False.’  Silver spoke with Ioannidis, who said:

I’m not saying that we haven’t made any progress.  Taking into account that there are a couple of million papers, it would be a shame if there wasn’t.  But there are obviously not a couple of million discoveries.  Most are not really contributing much to generating knowledge.


Out of Sample Events

There were many failures of prediction related to the 2008 financial crisis.  Silver observes that there is a common thread to these failures:

  • The confidence that homeowners had about housing prices may have stemmed from the fact that there had not been a substantial decline in U.S. housing prices in the recent past. However, there had never before been such a widespread increase in U.S. housing prices like the one that preceded the collapse.
  • The confidence that the banks had in Moody’s and S&P’s ability to rate mortgage-backed securities may have been based on the fact that the agencies had generally performed competently in rating other types of financial assets. However, the ratings agencies had never before rated securities as novel and complex as credit default options.
  • The confidence that economists had in the ability of the financial system to withstand a housing crisis may have arisen because housing price fluctuations had generally not had large effects on the financial system in the past. However, the financial system had probably never been so highly leveraged, and it had certainly never made so many side bets on housing before.
  • The confidence that policy makers had in the ability of the economy to recuperate quickly from the financial crisis may have come from their experience of recent recessions, most of which had been associated with rapid, ‘V-shaped’ recoveries. However, those recessions had not been associated with financial crisis, and financial crises are different.

Silver explains that these events were out of sample, which was a major reason for the failed forecasts.  The problem is that few forecasters ever want to look for examples and evidence outside of what their models have already considered:

We will be forced to acknowledge that we know less about the world than we thought we did.  Our personal and professional incentives almost always discourage us from doing this.

We forget – or we willfully ignore – that our models are simplifications of the world.  We figure that if we make a mistake, it will be at the margin.

In complex systems, however, mistakes are not measured in degrees but in whole orders of magnitude…

One of the pervasive risks that we face in the information age… is that even if the amount of knowledge in the world is increasing, the gap between what we know and what we think we know may be widening.  This syndrome is often associated with very precise-seeming predictions that are not at all accurate.


Foxes vs. Hedgehogs

Silver tabulates Philip Tetlock’s descriptions of foxes versus hedgehogs. (Tetlock is the author of Expert Political Judgment: How Good Is It? How Can We Know? and also Superforecasting: The Art and Science of Prediction.)

How Hedgehogs Think

  • Specialized: Often have spent the bulk of their careers on one or two great problems.  May view the opinions of ‘outsiders’ skeptically.
  • Stalwart: Stick to the same ‘all-in’ approach – new data is used to refine the original model.
  • Stubborn: Mistakes are blamed on bad luck or idiosyncratic circumstances – a good model had a bad day.
  • Order-seeking: Expect that the world will be found to abide by relatively simple governing relationships once the signal is identified through the noise.
  • Confident: Rarely hedge their predictions and are reluctant to change them.
  • Ideological: Expect that solutions to many day-to-day problems are manifestations of some grander theory or struggle.

How Foxes Think

  • Multidisciplinary: Incorporate ideas from different disciplines and regardless of their origin on the political spectrum.
  • Adaptable: Find a new approach – or pursue multiple approaches at the same time – if they aren’t sure the original one is working.
  • Self-critical: Sometimes willing (if rarely happy) to acknowledge mistakes in their predictions and accept the blame for them.
  • Tolerant of complexity: See the universe as complicated, perhaps to the point of many fundamental problems being irresolvable or inherently unpredictable.
  • Cautious: Express their predictions in probabilistic terms and qualify their opinions.
  • Empirical: Rely more on observation than theory.

Foxes are better forecasters than hedgehogs.  But hedgehogs – because of their big, bold predictions – are much more likely to be interviewed on television.

Silver describes three broad principles that he relies on in the FiveThirtyEight forecasting model:

Principle 1:  Thinking Probabilistically

Each forecast comes with a range of possible outcomes.  The distribution of possible outcomes is an honest expression of the uncertainty that exists in the real world.  What typifies a good forecaster is that the range of possible outcomes is itself supported by the later results of the forecasts.  In other words, if you examine all the times when a good forecaster said there was a 90 percent chance of an event happening, those predicted events should have happened about 90 percent of the time.

Foxes very often give a range of possible outcomes, while hedgehogs rarely do.
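Calibration can be checked directly from a forecaster's track record.  The sketch below (the track record is hypothetical, and the function name is mine) buckets forecasts by stated probability and compares each bucket's stated probability with the frequency at which the predicted events actually occurred:

```python
from collections import defaultdict

def calibration_table(forecasts):
    """Bucket (stated_probability, outcome) pairs and compare each stated
    probability with the observed frequency of the event occurring."""
    buckets = defaultdict(list)
    for prob, happened in forecasts:
        buckets[round(prob, 1)].append(happened)
    return {p: sum(outcomes) / len(outcomes)
            for p, outcomes in sorted(buckets.items())}

# Hypothetical track record: a well-calibrated forecaster's 90% calls
# should come true about 90% of the time.
record = ([(0.9, True)] * 9 + [(0.9, False)] * 1 +
          [(0.6, True)] * 6 + [(0.6, False)] * 4)
print(calibration_table(record))  # {0.6: 0.6, 0.9: 0.9}
```

A forecaster whose buckets line up with the observed frequencies, as here, is well calibrated in Silver's sense.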

Principle 2:  Update Your Forecasts

When good forecasters get new information that changes the probabilities associated with their prediction, they update their prediction accordingly.  A fox has no trouble changing her mind if that’s what the new evidence suggests.

Unfortunately, some people think changing one’s mind on the basis of new evidence is a sign of weakness.  But if the forecaster is simply incorporating new information as well as possible, that’s a sign of strength, not weakness.  Silver quotes John Maynard Keynes:

When the facts change, I change my mind.  What do you do, sir?

Principle 3:  Look for Consensus

Very often the consensus estimate is better than most (and sometimes all) individual forecasts:

Quite a lot of evidence suggests that aggregate or group forecasts are more accurate than individual ones, often somewhere between 15 and 20 percent more accurate depending on the discipline.

A common experiment is to present a group of at least thirty people with a jar of pennies, and then ask each person in the group to guess how many pennies are in the jar.  In nearly every case, the average guess of the group is more accurate than every individual guess.
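The penny-jar effect is easy to simulate.  In this sketch (all numbers assumed), each guesser is noisy but unbiased; by the triangle inequality, the group average's error can never exceed the average individual error, and in practice it usually beats nearly every individual guess:

```python
import random

random.seed(42)
TRUE_COUNT = 1000  # assumed number of pennies in the jar

# Thirty guessers, each individually noisy but unbiased.
guesses = [random.gauss(TRUE_COUNT, 300) for _ in range(30)]

group_error = abs(sum(guesses) / len(guesses) - TRUE_COUNT)
individual_errors = [abs(g - TRUE_COUNT) for g in guesses]

# The average guess can never err more than the guessers do on average.
print(group_error, sum(individual_errors) / len(individual_errors))
```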

Stock prices can be thought of in this way, though occasionally there are exceptions.

The lesson for the fox – in addition to recognizing when the aggregate is likely the best estimate – is to attempt to implement a process of aggregation within your own mind.  Try to incorporate as many different types of information and points of view as possible in the process of developing a prediction.


Thinking Very Big and Very Small

Sometimes innovation is very incremental, while other times it involves a big jump forward:

Good innovators think very big and they think very small.  New ideas are sometimes found in the most granular details of a problem where few others bother to look.  And they are sometimes found when you are doing your most abstract and philosophical thinking, considering why the world is the way that it is and whether there might be an alternative to the dominant paradigm.  Rarely can they be found in the temperate latitudes between these two spaces, where we spend 99 percent of our lives.  The categorizations and approximations we make in the normal course of our lives are usually good enough to get by, but sometimes we let information that might give us a competitive advantage slip through the cracks.

Most great forecasters constantly innovate and improve.


Chaos Theory

Silver explains how chaos theory applies to systems in which two properties hold:

  • The systems are dynamic, meaning that the behavior of the system at one point in time influences its behavior in the future;
  • And they are nonlinear, meaning they abide by exponential rather than additive relationships.

Trying to predict the weather is trying to predict a chaotic system:

The problem begins when there are inaccuracies in our data… Imagine that we’re supposed to be taking the sum of 5 and 5, but we keyed in the second number wrong.  Instead of adding 5 and 5, we add 5 and 6.  That will give us an answer of 11 when what we really want is 10.  We’ll be wrong, but not by much: addition, as a linear operation, is pretty forgiving.  Exponential operations, however, extract a lot more punishment when there are inaccuracies in our data.  If instead of taking 5 to the 5th power – which should be 3,125 – we instead take 5 to the 6th power, we wind up with an answer of 15,625.  That’s way off: we’ve missed our target by 500 percent.

This inaccuracy quickly gets worse if the process is dynamic, meaning that our outputs at one stage of the process become our inputs in the next.  For instance, say that we’re supposed to take five to the fifth, and then take whatever result we get and apply it to the fifth power again.  If we’d made the error described above, and substituted a 6 for the second 5, our results will now be off by a factor of more than 3,000.  Our small, seemingly trivial mistake keeps getting larger and larger.

The weather is the epitome of a dynamic system, and the equations that govern the movement of atmospheric gases and fluids are nonlinear – mostly differential equations.  Chaos theory therefore most definitely applies to weather forecasting, making the forecasts highly vulnerable to inaccuracies in our data.

Sometimes these inaccuracies arise as the result of human error.  The more fundamental issue is that we can only observe our surroundings with a certain degree of precision.  No thermometer is perfect, and if it’s off in even the third or the fourth decimal place, this can have a profound impact on the forecast.
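The arithmetic in Silver's example can be verified in a few lines.  The same keyed-in error (a 6 instead of a 5) is harmless under addition, costly under exponentiation, and explosive once the process is dynamic:

```python
# The same keyed-in error (6 instead of 5) under three operations:
linear_error = (5 + 6) - (5 + 5)        # additive: off by just 1
exponential_ratio = 5 ** 6 / 5 ** 5     # exponential: off by a factor of 5
# Dynamic and nonlinear: feed the result back into the same operation,
# and the error compounds to a factor of 5**5 = 3,125.
dynamic_ratio = (5 ** 6) ** 5 // (5 ** 5) ** 5

print(linear_error, exponential_ratio, dynamic_ratio)  # 1 5.0 3125
```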

Silver notes that perhaps the most impressive improvements have been in hurricane forecasting.  Twenty-five years ago, the National Hurricane Center missed by an average of 350 miles when it forecasted a hurricane’s landfall three days in advance.  Today the average miss is only about one hundred miles.  (Forecasters have not gotten much better at forecasting hurricane intensity, however, since the forces that govern intensity occur at a much smaller scale.)


Earthquakes

Seismologists have specific definitions for prediction and forecast:

  • A prediction is a definitive and specific statement about when and where an earthquake will strike: a major earthquake will hit Kyoto, Japan, on June 28.
  • Whereas a forecast is a probabilistic statement, usually over a longer time scale: there is a 60 percent chance of an earthquake in Southern California over the next thirty years.

The United States Geological Survey (USGS) holds that earthquakes cannot be predicted, but they can be forecasted.  Silver includes a table of how frequently a major earthquake is expected in various U.S. cities:


Anchorage 1 per 30 years
San Francisco 1 per 35 years
Los Angeles 1 per 40 years
Seattle 1 per 150 years
Sacramento 1 per 180 years
San Diego 1 per 190 years
Salt Lake City 1 per 200 years
Portland, OR 1 per 500 years
Charleston, SC 1 per 600 years
Las Vegas 1 per 1,200 years
Memphis 1 per 2,500 years
Phoenix 1 per 7,500 years
New York 1 per 12,000 years
Boston 1 per 15,000 years
Philadelphia 1 per 17,000 years
St. Louis 1 per 23,000 years
Atlanta 1 per 30,000 years
Denver 1 per 40,000 years
Washington, DC 1 per 55,000 years
Chicago 1 per 75,000 years
Houston 1 per 100,000 years
Dallas 1 per 130,000 years
Miami 1 per 140,000 years


According to the Gutenberg-Richter law, for every increase of one point in magnitude, an earthquake is ten times less frequent.  Thus, given data on past earthquakes and their magnitudes in a given area, it’s straightforward to forecast the frequency of more powerful earthquakes in the same area.
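A minimal sketch of that extrapolation (the function name is mine, and the default b-value of 1.0 — which gives the tenfold drop per magnitude point — is an assumption; b varies somewhat by region):

```python
def expected_rate(observed_rate, observed_mag, target_mag, b=1.0):
    """Gutenberg-Richter extrapolation: each one-point rise in magnitude
    makes earthquakes about 10**b times less frequent.  Rates are
    events per year."""
    return observed_rate * 10 ** (-b * (target_mag - observed_mag))

# Hypothetical region: one magnitude-5 quake per year implies roughly
# one magnitude-7 quake per century.
print(expected_rate(1.0, 5.0, 7.0))  # 0.01
```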

As far as specific predictions are concerned, however, weather forecasters are much further along than seismologists.  Weather forecasters have been able to develop a good theoretical understanding of the earth’s atmosphere because they can observe a great deal of it.  Seismologists, on the other hand, are trying to predict the results of events that mostly occur fifteen kilometers below the earth’s surface.  So it’s far more difficult for seismologists to develop a model of what is actually happening.

Overfitting: The Most Important Scientific Problem You’ve Never Heard Of

Overfitting means mistaking noise for signal: the model fits past observations too closely.  If the model fits past observations too loosely, it is underfitting.  Overfitting is a much more common error than underfitting, as Silver describes:

This seems like an easy mistake to avoid, and it would be if only we were omniscient and always knew about the underlying structure of the data.  In almost all real-world applications, however, we have to work by induction, inferring the structure from the available evidence.  You are most likely to overfit a model when the data is limited and noisy and when your understanding of the fundamental relationships is poor;  both circumstances apply in earthquake forecasting.

…Overfitting represents a double whammy: it makes our model look better on paper but perform worse in the real world.  Because of the latter trait, an overfit model eventually will get its comeuppance if and when it is used to make real predictions.  Because of the former, it may look superficially more impressive until then, claiming to make very accurate and newsworthy predictions and to represent an advance over previously applied techniques.  This may make it easier to get the model published in an academic journal or to sell to a client, crowding out more honest models from the marketplace.  But if the model is fitting noise, it has the potential to hurt science.

… To be clear, these mistakes are usually honest ones.  To borrow the title of another book, they play into our tendency to be fooled by randomness.
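Overfitting is easy to demonstrate with synthetic data.  In this sketch (the signal and noise levels are assumed), a high-degree polynomial hugs the noisy training points more tightly than a low-degree one — it always scores at least as well in sample, because the low-degree polynomials are a subset of the high-degree ones — yet it typically tracks the true underlying curve worse:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 20)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, x.size)  # signal + noise

x_dense = np.linspace(0, 1, 200)
y_true = np.sin(2 * np.pi * x_dense)                    # noiseless truth

train_error, true_error = {}, {}
for degree in (3, 9):
    coefs = np.polyfit(x, y, degree)
    train_error[degree] = float(np.mean((np.polyval(coefs, x) - y) ** 2))
    true_error[degree] = float(np.mean((np.polyval(coefs, x_dense) - y_true) ** 2))

# Lower in-sample error does not mean a better model of the signal.
print(train_error, true_error)
```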


Economic Forecasting

Economists have a poor track record of predicting recessions.  But many of them may not have good incentives to improve.

Silver examined the GDP forecasts that economists gave in the Survey of Professional Forecasters from 1993 to 2010.  The Survey is unique in that it asks economists to give a range of outcomes and associated probabilities.  If economists’ 90 percent prediction intervals were as accurate as claimed, then only about 2 of the 18 annual forecasts should have fallen outside those intervals.  In fact, actual GDP fell outside the prediction intervals 6 times out of 18.

If you examine how economic forecasts have actually performed, writes Silver, then a 90 percent prediction interval spans about 6.4 points of GDP:

When you hear on the news that GDP will grow by 2.5 percent next year, that means it could quite easily grow at a spectacular rate of 5.7 percent instead.  Or it could fall by 0.7 percent – a fairly serious recession.  Economists haven’t been able to do any better than that, and there isn’t much evidence that their forecasts are improving.

Silver met with the economist Jan Hatzius, who has been somewhat more accurate in his forecasts (in 2007, he warned about the 2008 crisis).  Silver quotes Hatzius:

Nobody has a clue.  It’s hugely difficult to forecast the business cycle.  Understanding an organism as complex as the economy is very hard.

Silver lists three fundamental challenges economists face, according to Hatzius:

  • First, it is very hard to determine cause and effect from economic statistics alone.
  • Second, the economy is always changing, so explanations of economic behavior that hold in one business cycle may not apply to future ones.
  • Third, as bad as their forecasts have been, the data that economists have to work with isn’t much good either.

Some data providers track four million statistics on the U.S. economy.  But there have only been eleven recessions since the end of World War II.  Silver:

If you have a statistical model that seeks to explain eleven outputs but has to choose from among four million inputs to do so, many of the relationships it identifies are going to be spurious.  (This is another classic case of overfitting – mistaking noise for a signal…)

For example, the winner of the Super Bowl correctly ‘predicted’ the direction of the stock market in 28 out of 31 years (from 1967 through 1997).  A test of statistical significance would have said that there was only a 1 in 4,700,000 possibility that the relationship was due to chance alone, says Silver.

…of the millions of statistical indicators in the world, a few will have happened to correlate especially well with stock prices or GDP or the unemployment rate.  If not the winner of the Super Bowl, it might be chicken production in Uganda.  But the relationship is merely coincidental.
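A quick simulation shows why.  In this sketch (all numbers assumed), both the market's yearly direction and a large pile of candidate "indicators" are pure coin flips, yet screening enough of them still turns up an impressive-looking match:

```python
import random

random.seed(0)
YEARS = 31

# The market's direction in each of 31 years, encoded as 31 random bits.
market = random.getrandbits(YEARS)

# Screen 200,000 random, meaningless binary "indicators" against that record.
best = 0
for _ in range(200_000):
    indicator = random.getrandbits(YEARS)
    matches = YEARS - bin(market ^ indicator).count("1")
    best = max(best, matches)

# By luck alone, some indicator "calls" the market in the large majority
# of the 31 years.
print(best)
```

No indicator here carries any information at all; the best match is pure selection effect.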

Economic variables that are leading indicators in one economic cycle are often lagging indicators in the next economic cycle.

An Economic Uncertainty Principle

Feedback loops between economic forecasts and economic policy can be particularly problematic for economic forecasters.  If the economy looks like it’s at risk of going into recession, then the government and the Federal Reserve will take steps to lessen that risk, perhaps even averting a recession that otherwise would have occurred.

Not only do you have to forecast both the economy and the policy responses to it; even when you examine past economic data, you have to take into account the government policy decisions in place at the time, notes Silver.  This issue was first highlighted by economist Robert Lucas in 1976.  Silver continues:

Thus, it may not be enough to know what current policy makers will do;  you also need to know what fiscal and monetary policy looked like during the Nixon administration.  A related doctrine known as Goodhart’s law, after the London School of Economics professor who proposed it, holds that once policy makers begin to target a particular variable, it may begin to lose its value as an economic indicator….

At its logical extreme, this is a bit like the observer effect (often mistaken for a related concept, the Heisenberg uncertainty principle):  once we begin to measure something, its behavior starts to change.  Most statistical models are built on the notion that there are independent variables and dependent variables, inputs and outputs, and they can be kept pretty much separate from one another.  When it comes to the economy, they are all lumped together in one hot mess.

An Ever-Changing Economy

An even more fundamental problem is that the American and global economies are always evolving.  Even if you correctly grasp the relationships between different economic variables in the past, those relationships can change over the course of time.

Perhaps you correctly account for the fact that the U.S. economy now is dominated more by the service sector.  But how do you account for the fact that major central banks have printed trillions of dollars?  How do you account for interest rates near zero (or even negative)?

The U.S. stock market seems high based on history.  But if rates stay relatively low for the next 5-10 years, the U.S. stock market could gradually move higher from here.  U.S. stocks may even turn out, in retrospect, to be cheap today if there has been a structural shift to lower interest rates.

Furthermore, as Silver points out, you never know the next paradigm shift that will occur.  Will the future economy, or the future stock market, be less volatile or more?  What if breakthroughs in technology create a much wealthier economy where the need for many forms of human labor is significantly curtailed?  Is that the most likely way that debt levels can be reduced and interest rates can move higher?  Or will central banks inflate away most of the current debt by printing even more money and/or by keeping rates very low for many more years?  No one really knows.

Economic Data is Very Noisy

Most economic data series are subject to substantial revision.  An initial reading of average GDP growth could later be revised up to very strong growth, or down to a recession.  Silver:

So we should have some sympathy for economic forecasters.  It’s hard enough to know where the economy is going.  But it’s much, much harder if you don’t know where it is to begin with.


Bayes’ Rule

Eliezer Yudkowsky of the Machine Intelligence Research Institute provides an excellent intuitive explanation of Bayes’ rule.  Yudkowsky begins by discussing a situation that doctors often encounter:

1% of women at age forty who participate in routine screening have breast cancer.  80% of women with breast cancer will get positive mammographies.  9.6% of women without breast cancer will also get positive mammographies.  A woman in this age group had a positive mammography in a routine screening.  What is the probability that she actually has breast cancer?

Most doctors estimate the probability between 70% and 80%, which is wildly incorrect.

In order to arrive at the correct answer, Yudkowsky asks us to think of the question as follows.  We know that 1% of women at age forty who participate in routine screening have breast cancer.  So consider 10,000 women who participate in routine screening:

  • Group 1: 100 women with breast cancer.
  • Group 2: 9,900 women without breast cancer.

After the mammography, the women can be divided into four groups:

  • Group A: 80 women with breast cancer, and a positive mammography.
  • Group B: 20 women with breast cancer, and a negative mammography.
  • Group C: 950 women without breast cancer, and a positive mammography.
  • Group D: 8,950 women without breast cancer, and a negative mammography.

So the question again:  If a woman out of this group of 10,000 women has a positive mammography, what is the probability that she actually has breast cancer?

The total number of women who had positive mammographies is 80 + 950 = 1,030.  Of those 1,030 women, only 80 actually have breast cancer.

So if a woman out of the 10,000 has a positive mammography, the chances that she actually has breast cancer = 80/1030  or 0.07767 or 7.8%.

That’s the intuitive explanation.  Now let’s look at Bayes’ Rule:

P(A|B) = [P(B|A) P(A)] / P(B)

Let’s apply Bayes’ Rule to the same question above:

1% of women at age forty who participate in routine screening have breast cancer.  80% of women with breast cancer will get positive mammographies.  9.6% of women without breast cancer will also get positive mammographies.  A woman in this age group had a positive mammography in a routine screening.  What is the probability that she actually has breast cancer?

P(A|B) = the probability that the woman has breast cancer (A), given a positive mammography (B)

Here is what we know:

P(B|A) = 80% – the probability of a positive mammography (B), given that the woman has breast cancer (A)

P(A) = 1% – the probability that a woman out of the 10,000 screened actually has breast cancer

P(B) = (80+950) / 10,000 = 10.3% – the probability that a woman out of the 10,000 screened has a positive mammography

Bayes’ Rule again:

P(A|B) = [P(B|A) P(A)] / P(B)

P(A|B) = [0.80*0.01] / 0.103 = 0.008 / 0.103 = 0.07767 or 7.8%
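The same computation in code, using the law of total probability to get P(B) from the stated rates rather than from the rounded counts (the function name is mine):

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Bayes' rule: P(A|B) = P(B|A) P(A) / P(B), where
    P(B) = P(B|A) P(A) + P(B|not A) P(not A)."""
    p_b = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_b

# 1% prior, 80% sensitivity, 9.6% false positive rate -- about 7.8%,
# agreeing with the count of 80 out of 1,030 positive mammographies.
print(round(posterior(0.01, 0.80, 0.096), 4))
```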

Derivation of Bayes’ Rule:

Bayesians consider conditional probabilities to be more basic than joint probabilities.  To derive Bayes’ rule, start with the product rule, which expresses the joint probability P(A,B) in terms of a conditional probability:

P(A|B) P(B) = P(A,B)

but by symmetry you get:

P(B|A) P(A) = P(A,B)

Since both expressions equal P(A,B), setting them equal to each other and dividing both sides by P(B) gives:

P(A|B) = [P(B|A) P(A)] / P(B)

which is Bayes’ Rule.


The Problem of False Positives

In the case of the age forty women who had a positive mammogram, we saw that only about 7.8% actually had cancer.  So there were many false positives.  Out of 10,000 age forty women tested, 950 tested positive but did not have cancer.

Silver explains how, in the Era of Big Data, if you look at published scientific results, there are likely to be many false positives.  Assume that 100 out of 1,000 hypotheses are actually true.  Further assume that 80% of true scientific hypotheses are correctly deemed to be true, while 90% of false hypotheses are correctly rejected.  So now we have four groups:

  • True positives: 80 of 100 hypotheses that are true are correctly deemed true
  • False negatives: 20 of 100 hypotheses that are true are incorrectly deemed false
  • False positives: 90 of 900 hypotheses that are false are incorrectly deemed true
  • True negatives: 810 of 900 hypotheses that are false are correctly deemed false

So you can see, under these assumptions, that we’ll have 80 true hypotheses correctly identified as true, but 90 false hypotheses incorrectly identified as true.  Silver comments:

…as we know from Bayes’ theorem, when the underlying incidence of something in a population is low (breast cancer in young women; truth in a sea of data), false positives can dominate the results if we are not careful.
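The four groups above fall straight out of the assumed rates.  This sketch (the function is mine; the 10%, 80%, and 90% figures are the ones given in the example) reproduces the arithmetic:

```python
def research_outcomes(n, true_rate, power, specificity):
    """Split n tested hypotheses into the four groups in Silver's example."""
    n_true = round(n * true_rate)
    n_false = n - n_true
    true_pos = round(n_true * power)                # true, deemed true
    false_neg = n_true - true_pos                   # true, deemed false
    false_pos = round(n_false * (1 - specificity))  # false, deemed true
    true_neg = n_false - false_pos                  # false, deemed false
    return true_pos, false_neg, false_pos, true_neg

print(research_outcomes(1000, 0.10, 0.80, 0.90))  # (80, 20, 90, 810)
```

With these rates, more "discoveries" are false (90) than true (80) — the false positives dominate because true hypotheses are rare.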


Conclusion

Most of us, including scientists, are not very good at making probability estimates about future events.  But there are two pieces of good news, writes Silver:

  • First, our estimates are just a starting point.  Bayes’ theorem will allow us to improve our estimates every time we get new information.
  • Second, with practice – and trial and error – we can get much better at making probability estimates in the first place.

Silver explains the importance of testing our ideas:

Bayes’ theorem encourages us to be disciplined about how we weigh new information.  If our ideas are worthwhile, we ought to be willing to test them by establishing falsifiable hypotheses and subjecting them to a prediction.  Most of the time, we do not appreciate how noisy the data is, and so our bias is to place too much weight on the newest data point…

But we can have the opposite bias when we become too personally or professionally invested in a problem, failing to change our minds when the facts do.  If an expert is one of Tetlock’s hedgehogs, he may be too proud to change his forecast when the data is incongruous with his theory of the world.

The more often you are willing to test your ideas, the sooner you can begin to avoid these problems and learn from your mistakes… It’s more often with small, incremental, and sometimes even accidental steps that we make progress.



An equal weighted group of micro caps generally far outperforms an equal weighted (or cap-weighted) group of larger stocks over time.

This outperformance increases significantly by focusing on cheap micro caps.  Performance can be further boosted by isolating cheap microcap companies that show improving fundamentals.  We rank microcap stocks based on these and similar criteria.

There are roughly 10-20 positions in the portfolio.  The size of each position is determined by its rank.  Typically the largest position is 15-20% (at cost), while the average position is 8-10% (at cost).  Positions are held for 3 to 5 years unless a stock approaches intrinsic value sooner or an error has been discovered.

The mission of the Boole Fund is to outperform the S&P 500 Index by at least 5% per year (net of fees) over 5-year periods.  We also aim to outpace the Russell Microcap Index by at least 2% per year (net).  The Boole Fund has low fees.


If you are interested in finding out more, please e-mail me or leave a comment.

My e-mail:




Disclosures: Past performance is not a guarantee or a reliable indicator of future results. All investments contain risk and may lose value. This material is distributed for informational purposes only. Forecasts, estimates, and certain information contained herein should not be considered as investment advice or a recommendation of any particular security, strategy or investment product. Information contained herein has been obtained from sources believed to be reliable, but not guaranteed. No part of this article may be reproduced in any form, or referred to in any other publication, without express written permission of Boole Capital, LLC.