**Paper title**: Terrestrial, Habitable-Zone Exoplanet Frequency from Kepler

**Authors**: Wesley A. Traub

**First Author’s Institution:** Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA

The answer to the above question, according to a new analysis of data from NASA’s Kepler mission, may be roughly one-third. The author, Wesley Traub of Caltech, has examined data from the first 136 days of the Kepler mission and estimated the frequency of terrestrial planets in habitable zones (regions with the possibility of liquid water on the planet’s surface) around Sun-like stars. The author concludes that this frequency is (34 ± 14)%. At first glance, this value seems strikingly large: a rough, order-of-magnitude application of this frequency to the Milky Way yields *billions* of habitable, “Earth-like” planets, just in our own galaxy. Let’s take a closer look at the reasoning (and any potential biases) involved in the author’s estimate.

Kepler continuously monitors ~150,000 stars and attempts to detect transiting planets through the tiny dip that the planets cause in the observed brightness of their host stars (~1% for a Jupiter-sized planet, and closer to 0.01% for an Earth-sized one). For a great intro to the Kepler mission, check out this post, and for summaries of how we can apply statistics to the Kepler data set, check out this post. For an object to be considered a planetary candidate, the Kepler team requires at least three transit events. Traub examines data from the first 136 days of Kepler observations, and three transits must fit inside that window, so he restricts the sample to planets with periods of less than 42 days. The problem is that, under Traub’s definition of the habitable zone, a habitable-zone planet has a period of at least 228 days, far longer than any period in the current sample. Reaching the habitable zone therefore requires extrapolating the data to these longer periods, which introduces a (possibly significant) bias. Traub is explicit about this problem: “The bias incurred by extrapolation is entirely unknown, so in the present paper we merely note this uncertainty but do not attempt to make any corrections.”
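To make the period numbers concrete, here is a minimal sketch (not taken from the paper) of the two pieces of arithmetic involved: the longest period that guarantees a given number of transits inside an observing window, and Kepler's third law to turn a period into an orbital distance. The function names and the assumption of a one-solar-mass host star are illustrative choices, not Traub's.

```python
def max_period_for_n_transits(baseline_days, n_transits=3):
    """Longest period (days) that guarantees n transits inside the baseline,
    whatever the orbital phase at the start of observations."""
    return baseline_days / n_transits


def semi_major_axis_au(period_days, stellar_mass_msun=1.0):
    """Kepler's third law in solar units: a^3 = M * P^2,
    with a in AU, P in years and M in solar masses.
    The default mass of 1.0 is an assumption (a Sun-like host)."""
    period_years = period_days / 365.25
    return (stellar_mass_msun * period_years ** 2) ** (1.0 / 3.0)


if __name__ == "__main__":
    print("Period bound for 3 transits in 136 days:",
          round(max_period_for_n_transits(136), 1), "days")
    for period in (42, 228):
        print(f"P = {period} d  ->  a = {semi_major_axis_au(period):.2f} AU")
```

The simple bound comes out slightly above the 42-day cut quoted above; this post does not spell out how the exact cutoff was chosen. For a Sun-like star, 42 days corresponds to roughly 0.24 AU and 228 days to roughly 0.73 AU, which shows how far the extrapolation has to reach.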

How exactly does the author reach the frequency estimate of 34%? Traub restricts his analysis to only Sun-like host stars, defined by the following temperature ranges: K (5000 – 5499 K), G (5500 – 5999 K), and F (6000 – 6499 K). For reference, our Sun is a G star with a temperature of about 5800 K. He then reviews all planets detected around these stars, characterizing the planets by their period and radius (see this astrobite for a summary of how the distribution of planets can be determined). The planet sample is divided by radius into three main categories: terrestrial (Earth-like), ice giants (like Uranus or Neptune), and gas giants (like Jupiter). The number of planets in the sample as a function of radius is shown in the figure below.

In order to statistically describe the frequency of habitable, Earth-like planets in the Universe, a conversion must be made between the sample of planets detected by Kepler and the actual number of planets in orbit around stars monitored by Kepler, which is referred to as the planet population. This conversion is performed by noting that the probability of a transit is given by *p = R(star)/a(orbit)*, where *R(star)* is the radius of the host star and *a(orbit)* is the semi-major axis of the planet’s orbital ellipse. For planets of a given type, this probability translates to roughly 100 times more planets in orbit around stars than planets that are actually detected. The important statistic derived from the planet population is the ratio of terrestrial planets to total planets, which is found to be about 29%.
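The geometric correction described above can be sketched in a few lines. This is only an illustration of the formula *p = R(star)/a(orbit)*, not the paper's actual pipeline; the function names are hypothetical, and the example orbit of 0.465 AU is chosen purely so that the probability lands near 1%.

```python
R_SUN_AU = 0.00465  # solar radius expressed in AU (approximate)


def transit_probability(stellar_radius_rsun, semi_major_axis_au):
    """Geometric transit probability p = R_star / a, with both lengths in AU."""
    return stellar_radius_rsun * R_SUN_AU / semi_major_axis_au


def population_from_detections(n_detected, stellar_radius_rsun, semi_major_axis_au):
    """Scale a count of detected transiting planets up to the underlying
    population by dividing by the transit probability."""
    return n_detected / transit_probability(stellar_radius_rsun, semi_major_axis_au)


if __name__ == "__main__":
    # Hypothetical case: Sun-sized star, planet at 0.465 AU.
    p = transit_probability(1.0, 0.465)
    print(f"transit probability: {p:.3f}")
    print("planets per detection:", round(population_from_detections(1, 1.0, 0.465)))
```

A probability near 1% is what gives the factor of roughly 100 between detected planets and the planet population. Because *p* falls as the orbit widens, the correction grows for longer-period planets.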

In order to describe the distribution of planets with periods long enough to be habitable, Traub must extrapolate the Kepler data in his sample to longer periods. He performs this extrapolation by fitting a model for the number of planets as a function of period, where the change in the number of planets with respect to period is modeled as a power law. The above figure shows this model (solid blue line) and the extrapolation to periods greater than 42 days (dashed blue line). This model yields the fraction of total planets (terrestrials, ice giants, and gas giants) with periods corresponding to the habitable zone, and multiplying by the fraction of expected terrestrial planets (29%) yields the estimate of 34%.
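As a rough illustration of this kind of extrapolation, the sketch below fits a power law *dN/dP = C · P^β* to short-period data by a straight-line fit in log-log space, then integrates the fitted law over a longer period range. Everything numeric here is hypothetical: the synthetic data points, the values of *C* and *β*, and the 1000-day upper limit are placeholders, and the fitting method is a generic least-squares fit rather than whatever procedure the paper uses. Only the 228-day lower limit comes from the text.

```python
import math


def fit_power_law(periods, densities):
    """Least-squares straight line in log-log space:
    log(dN/dP) = log(C) + beta * log(P). Returns (C, beta)."""
    xs = [math.log(p) for p in periods]
    ys = [math.log(d) for d in densities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    c = math.exp(mean_y - beta * mean_x)
    return c, beta


def planets_in_period_range(c, beta, p_min, p_max):
    """Integrate C * P**beta from p_min to p_max (periods in days)."""
    if abs(beta + 1.0) < 1e-12:
        # Special case beta = -1: the integral is logarithmic.
        return c * math.log(p_max / p_min)
    k = beta + 1.0
    return c * (p_max ** k - p_min ** k) / k


if __name__ == "__main__":
    # Synthetic, noise-free "observations" below the 42-day cut (hypothetical values).
    true_c, true_beta = 0.02, -0.5
    periods = [5.0, 10.0, 20.0, 40.0]
    densities = [true_c * p ** true_beta for p in periods]

    c_fit, beta_fit = fit_power_law(periods, densities)
    # 228 d is the habitable-zone minimum from the text; 1000 d is a placeholder.
    n_long = planets_in_period_range(c_fit, beta_fit, 228.0, 1000.0)
    print(f"C = {c_fit:.4f}, beta = {beta_fit:.3f}, extrapolated count = {n_long:.3f}")
```

The point of the sketch is how sensitive the result is: a small change in the fitted exponent is amplified when the law is integrated over periods five or more times longer than anything in the fitted sample.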

The final estimate for the frequency of habitable exoplanets depends heavily on the extrapolation of the data to longer periods. A fundamental assumption of this paper is that all Kepler data for planets in the current sample with periods longer than 42 days are invalid. The extrapolation that does include these longer-period data is shown as curve ‘b’ in the above figure, and it differs significantly from the extrapolation used (curve ‘a’). A similar analysis of the Kepler data, which included the long-period data, found a habitable planet frequency of about 1%. Clearly, the extrapolation introduces a large uncertainty, and this uncertainty will remain until Kepler has observed long enough to provide data for planets with orbits significantly longer than 42 days. Nevertheless, this paper represents a valuable first step towards determining and understanding an extremely important quantity in planetary science. Luckily, data are continuously streaming in, and Kepler recently released its third quarter of data, which covers longer period baselines. Be sure to keep an eye out for extensions of this analysis in the future.
