I am Jubal, I love math, physics and cars!! I started this blog as I wanted to share my knowledge and foster discussion on these topics. Intermittently, the blog might digress to other topics like finance. Enjoy, more is yet to come! Have fun.
How does a Swing work
When a child swings on a swing, have you ever wondered how it is possible without touching the ground? There is nothing to push off against. A naive application of conservation of momentum might suggest that it is not even possible to start swinging!

So how does it work? The catch is twofold. The first point is gravity, of course. Conservation of momentum applies in the absence of external forces. The second point is the person on the swing does not act as a point mass. Instead the simplest model one can use is a two point mass model, where the legs are one mass element and the torso is the other mass element.
When at the apex of the backward swing, the person kicks out their legs. This launches their torso and the swing backwards while projecting the legs forwards and downwards. Here is the crux: the torso does not then decelerate by pulling back on the legs (which would simply conserve momentum). Instead, gravity decelerates the torso, storing up potential energy. This adds amplitude to the swing as desired. When the swing direction reverses at the apex, the body finally follows the legs.
This brings us to a limitation of this technique. The person cannot detach their legs, of course, so there is no benefit to making the kick too aggressive. The optimal kick velocity is such that the knees reach full extension in the time gravity takes to decelerate the torso and reverse the swing direction. Any faster and the legs get pulled back by the body (which is trying not to break apart), and the additional force is for naught.
The final point is that all this is applicable only when starting with some small non-zero swing amplitude. Thus, it’s really difficult (not sure if it is impossible) to start swinging when perfectly stationary as an initial condition. At that stage, the person might just touch and use the ground, to get started.
Godel’s (first) incompleteness theorem
I would like to describe a most amazing result in math, computer science, logic and even philosophy (kind of a stretch there but..).
Godel’s incompleteness theorem addresses the theory of natural numbers constructed from axioms. The results also apply to any equivalent theory and state that if,
- The axioms of the theory are consistent
- The axioms apply to a finite number of objects or if they (as likely) have infinite applicability, are recursively list-able/enumerable. This just means that there is an algorithmic rule to decide if a given statement in the theory is an axiom.
- The axioms give rise to a theory capable of basic integer arithmetic
..then, the list of axioms is not complete! In the precise sense that it is possible to come up with statements in the theory that cannot be proven or disproven using the axioms. Such statements are also referred to by mathematicians as being independent of the axioms.
Now we will proceed to describe the reason the theorem is true! We need two concepts first: 1) Types of infinity (to quantify all possible statements and all possible proofs) ; 2) Godel encoding (so that proofs can be dealt with as functions of input statements).
——-
Types of infinity
TBD
Godel encoding
TBD
——-
The fundamental reason is that the space of possible statements is uncountably infinite while the space of possible proofs or deciding functions in an arithmetic theory is only countably infinite. So, there simply does not exist enough information to prove/disprove or decide all possible statements that can be expressed in the language of the theory, since there are too many of them. The integer arithmetic caveat is important for resulting in a theory allowing only a countably infinite (list-able) variety of deciding functions over a countably infinite (list-able) domain, thus encoding only countably infinite information. If the domain is not over the natural numbers, rather over an uncountably infinite space, then the functions in that theory can indeed encode enough information to possibly decide all statements expressible in the language of the theory. The nuance of being expressible is important. For example, the theory of real numbers is complete but we cannot use that fact to claim that natural number theory is also complete as a consequence just because it is a subset of real numbers. There is no way to express natural numbers from real numbers in first-order logic! There is no effective way to pick out that subset in the language of the theory and axioms.
Now let us actually prove the theorem for natural numbers. Let us list all possible deciding functions fi(x) by row, where the function encodes the proof of the statement x. In the columns, let us have the values of the function by varying the input x over the integers. Since the domain is integers, the columns are list-able. It is sufficient to construct a statement which cannot be decided by the list of all possible deciding functions.
Obviously, we cannot prepare a statement beforehand and be sure it does not appear in the list. But given a list of all possible proofs/deciding functions, we can have a recipe for generating such a statement. The standard way to construct such a statement is by diagonalization. Define a property of each number n that is false when the nth element of the nth function is true (and true when it is false). In essence, the property is defined as the negation of the diagonal elements. Now every function in the list disagrees with the property's decision value in at least one column, i.e., for at least one input where the property has to be decided. So none of the functions in the list can be the exact decider or proof encoder of the so-constructed statement. Hence, any sufficiently complex theory of natural numbers is incomplete.
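The diagonal recipe is easier to see on a small finite table. The sketch below is only an illustration under my own assumptions: the 4x4 table of True/False values is made up, and a finite table obviously cannot stand in for the real infinite list. It just shows that negating the diagonal always produces a row that is missing from the table.

```python
def anti_diagonal(table):
    """Return the row that negates the diagonal of a square truth table."""
    return [not table[n][n] for n in range(len(table))]

# Hypothetical finite stand-in for the (infinite) list of deciding functions:
# row i holds the values f_i(0), f_i(1), ... as True/False.
table = [
    [True, False, True, False],
    [False, False, True, True],
    [True, True, True, False],
    [False, True, False, False],
]

new_property = anti_diagonal(table)
# Row i and the new property disagree at column i, so the property is not a row of the table.
differs_from_every_row = all(
    new_property[i] != table[i][i] for i in range(len(table))
)
print(new_property, differs_from_every_row)
```

Whatever values are placed in the table, the constructed row differs from row i in column i, which is all the argument needs.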
Noether’s theorem and Hamiltonian Formulation
This is a continuation to the previous post on Lagrange Formulation.
If the Lagrangian does not depend explicitly on time, then because the partial time derivative is zero, the total time derivative of the Lagrangian can be non-zero only because of motion (change in coordinate values; not to be confused with a coordinate transformation). We will use Dt to denote the total time derivative in order to distinguish it from the partial derivatives.
Dt (L) = dL/dx (dx/dt) + dL/dv (dv/dt) = dL/dx (v) + dL/dv (dv/dt)
But we are imposing and solving the equation of motion, dL/dx = d/dt (dL/dv)
Substituting this into the first term and using the product rule, Dt (L) = d/dt (dL/dv) (v) + dL/dv (dv/dt) = d/dt (v*dL/dv)
Rearranging, Dt (L - v*dL/dv) = 0.
This quantity in the parentheses is, up to an overall sign, the Hamiltonian (conventionally written as v*dL/dv - L), and we showed that its change is zero. In other words, it is conserved over time when the Lagrangian has no explicit time dependence. Also, the Hamiltonian itself cannot have explicit time dependence since it always has to be conserved, even when the system is not in motion. This is essentially a weak version of the famous Noether's theorem, which proves that any continuous invariance (symmetry) of the Lagrangian results in a conserved quantity. We will now provide a proof outline for Noether's theorem for coordinate transformations.
Since dL/dx = d/dt (dL/dv), when any linear combination of the LHS (over the coordinates) is zero, it follows that the associated linear combination of the parenthetical quantity on the RHS has zero time derivative (by the given equation). Hence, that quantity is invariant and conserved over time.
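As a sanity check on the conservation of L - v*dL/dv, here is a minimal numerical sketch. It assumes a specific system that is not in the post: a mass on a spring with L = M v^2/2 - K x^2/2, with made-up values for M and K, evaluated on its exact solution x(t) = cos(omega t).

```python
import math

M, K = 2.0, 8.0            # hypothetical mass and spring constant
OMEGA = math.sqrt(K / M)   # angular frequency of the exact solution

def state(t):
    """Exact solution x(t) = cos(omega t) of the equation of motion, with its velocity."""
    return math.cos(OMEGA * t), -OMEGA * math.sin(OMEGA * t)

def lagrangian(x, v):
    return 0.5 * M * v * v - 0.5 * K * x * x

def conserved_quantity(x, v):
    """L - v * dL/dv, where dL/dv = M * v for this Lagrangian."""
    return lagrangian(x, v) - v * (M * v)

times = [0.1 * n for n in range(50)]
values = [conserved_quantity(*state(t)) for t in times]
lagrangians = [lagrangian(*state(t)) for t in times]
print(min(values), max(values))            # stays put along the motion
print(min(lagrangians), max(lagrangians))  # L itself swings back and forth
```

For this example the conserved value is the negative of the total energy, which is why the usual sign convention for the Hamiltonian is the opposite one.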
Now we will address the teleological aspects of the least action principle using the Hamiltonian. Let us define a quantity, W = S(q) + Et, where S is the action (the time integral of the Lagrangian) and E is the constant energy. Note that when the Lagrangian has no explicit time dependence, the conserved Hamiltonian is the total energy of the system, and S depends only on the coordinate, as the velocities are integrated over time and because L is not directly dependent on t.
By the stationary action principle, S needs to be minimized or made stationary. From the above definition, it is equivalent to demand energy conservation and minimize W. Note that energy conservation is not a given if one simply minimizes W as both S and W are defined for physically impossible trajectories as well, as noted in the previous post. However, if S itself is directly minimized then energy conservation takes care of itself, from the symmetry of the Lagrangian seen above.
W is a potential function that is only state dependent (on the coordinate) and not on the path taken to arrive at the current point or state.
Musings on Phase and Configuration Space
Phase space communicates both the motion and configuration (location with shape) of a system (x1, x2, x1', x2'), whereas configuration space only mentions the values of the degrees of freedom of the system, thereby describing its location and configuration (x1, x2). Clearly the number of entries in configuration space will be equal to the number of degrees of freedom of the system. The number of entries in phase space will be twice that number, as it includes velocities as well.
Here is the key question: shouldn't we include accelerations as well, or for that matter any number of higher order derivatives (x, x', x'', x''', ...), to make the description of motion complete? No! It is complete in phase space and any addition would be redundant. How?
It is because everything moves in Newtonian mechanics (or the equivalent Lagrangian formulation) according to second order differential equations. For example: x'' = -bx' - kx. Hence, when we say the system is at a particular point in phase space, the immediate future motion is determined by the governing second order differential equation. In other words, there can be no intersection of trajectories in phase space. When the system arrives at a particular point, how the first derivative moves forward is dictated by the value of the second order derivative x''. Moreover, x'' will evaluate to the same value at that point as long as the governing law does not change. And how x changes along the future trajectory is dictated by x', which can be read off the current value of the point on the phase plot.
This is a good test of whether any hypothetical system is truly second order. If the phase plot contains intersections even with all the DOF accounted for (with their positions and velocities forming the axes), then it points to richer underlying (higher order) dynamics that are not captured adequately by the plot. In this case the plot is merely a lower dimensional shadow or projection of the full dynamics in a higher dimensional space.
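To make the "a point in phase space fixes the future" idea concrete, here is a small sketch for the example x'' = -bx' - kx. The values of b and k, the step size and the crude forward-Euler stepping are all my own assumptions for illustration. The only point is that the direction of motion is a function of the current (x, v) alone, so restarting from a point on a trajectory retraces the same path.

```python
B, K = 0.5, 4.0   # hypothetical damping and stiffness in x'' = -B x' - K x

def phase_velocity(x, v):
    """Direction of motion in phase space; depends only on the current point (x, v)."""
    return v, -B * v - K * x

def trajectory(x, v, dt=0.001, steps=5000):
    """Trace the phase-space path with simple forward-Euler steps."""
    points = [(x, v)]
    for _ in range(steps):
        dx, dv = phase_velocity(x, v)
        x, v = x + dx * dt, v + dv * dt
        points.append((x, v))
    return points

full = trajectory(1.0, 0.0)
# Restart from a point the first run passed through: the future path is identical.
restarted = trajectory(*full[2000], steps=3000)
print(restarted == full[2000:])
```

If the law of motion also depended on something not shown on the axes (a hidden higher derivative, say), two visits to the same plotted point could continue differently, which is exactly the intersection test described above.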
Fermat’s principle – Least Action for Light
The principle of stationary action from the previous post on Lagrangian mechanics can also be used outside pure mechanics, for example for traveling light, including when it undergoes reflection or refraction. In this case, the quantity made stationary is not built from energies but is rather the time taken to traverse the distance under consideration. The principle states that the path taken will be the one that reduces the time taken to the least possible out of all possible paths, while respecting the physical constraints of the system.
Example 1: Point light source
The time taken will be least if light travels in the shortest path between two possible points (if the velocity of light is constant throughout the path) so it implies that the travel path will be straight lines. If it takes any other path, it has not reached the end point (or any of the points enroute) in the shortest possible time. So a point light source emits straight rays in all directions outwards. However, the light remains straight only if the speed of light in the medium does not change! For example, in the Feynman lectures it is pointed out that when light traveling from a source to a person encounters cold air, it will bend to pass through another layer of warmer air faster and then finally curve back to the person’s eye level to maintain the fixed destination of the path.
Example 2: Reflection on a smooth surface
Even if the surface is curved, it will locally be flat at the point where a ray strikes (since it is smooth), so the curvature makes no difference to the law of reflection itself (only to the image when a bunch of rays strike different points on the surface). The angle between the incoming ray and the normal to the surface at the point of contact is called the incidence angle, i.
Firstly, note that after reflection the light has to continue in a straight line, for reasons similar to the previous example. Let's call the angle this reflected ray makes with the normal the angle of reflection, r. Let c be the speed of light in the medium and d be the normal distance between the tangent to the surface and the point source. Then the time to be minimized is the sum of the travel times before and after reflection, and the constraint is that the distance covered has to be constant. Since the principle holds for any point on the reflected ray, we can pick the point that is back at the same normal distance d from the surface as the source, so that the net vertical distance covered is zero and the horizontal distance covered (along the tangent to the surface) is the constraint.
T= d/c (sec(i) +sec(r)) ……(1)
tan(i)+tan(r) = constant/d = constant …(2)
Differentiating (2) w.r.t. i, we get
dr/di = -sec^2(i)/sec^2(r) … (3)
To find the extremum of (1), we differentiate w.r.t. i and set the result to zero,
sec(i)tan(i) + sec(r)tan(r) dr/di = 0
Using (3) in the above gives tan(i)/sec(i) = tan(r)/sec(r), i.e. sin(i) = sin(r), which is the law of reflection: the angles are equal!
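The same result can be checked numerically without any calculus. The sketch below is my own illustration, not part of the derivation: it assumes a made-up geometry (d = 1, horizontal separation 3), writes the travel time in terms of the horizontal hit point x = d*tan(i), and uses a plain ternary search to find the fastest path.

```python
import math

def travel_time(x, d, total, c=1.0):
    """Time from source to mirror to end point, both at height d above the mirror.

    x is the horizontal distance from the source to the point where the ray hits.
    """
    return (math.hypot(d, x) + math.hypot(d, total - x)) / c

def fastest_hit_point(d, total, iters=200):
    """Ternary search for the minimum of the (convex) travel time over x."""
    lo, hi = 0.0, total
    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if travel_time(m1, d, total) < travel_time(m2, d, total):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

d, total = 1.0, 3.0                 # hypothetical geometry
x = fastest_hit_point(d, total)
i = math.atan2(x, d)                # angle of incidence, measured from the normal
r = math.atan2(total - x, d)        # angle of reflection, measured from the normal
print(math.degrees(i), math.degrees(r))
```

The two printed angles agree to within the precision of the search, as the law of reflection demands.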
Example 3: Refraction between non-dispersive media
Since the light has different velocities c1 and c2 in the two media, the light refracts at the surface separating the media in order to maintain the least time principle. The governing equations are as below and the minimization method is similar to example 2. Note that again, without loss of generality, we have chosen a point in the second medium at the same normal distance d from the refractive surface as the point light source in the first medium.
T = t1 + t2 = d/c1*sec(i) + d/c2*sec(r)
tan(i) + tan(r) = constant (x/d), where x is the fixed horizontal separation between the two points
Setting dT/di = 0 with dr/di = -sec^2(i)/sec^2(r) from the constraint gives sin(i)/c1 = sin(r)/c2, i.e. Snell's law: sin(i)/sin(r) = c1/c2.
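Here too a brute-force check is possible. This is a rough sketch under assumed numbers (d = 1 on both sides, horizontal separation 2, speeds 1.0 and 0.75, all made up): search for the crossing point on the interface that minimizes the total travel time, then compare sin(i)/sin(r) with c1/c2.

```python
import math

def crossing_time(x, d, total, c1, c2):
    """Travel time when the ray crosses the interface at horizontal offset x."""
    return math.hypot(d, x) / c1 + math.hypot(d, total - x) / c2

def fastest_crossing(d, total, c1, c2, iters=200):
    """Ternary search for the crossing point with the least travel time."""
    lo, hi = 0.0, total
    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if crossing_time(m1, d, total, c1, c2) < crossing_time(m2, d, total, c1, c2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

d, total, c1, c2 = 1.0, 2.0, 1.0, 0.75   # hypothetical geometry and speeds
x = fastest_crossing(d, total, c1, c2)
sin_i = x / math.hypot(d, x)                     # sine of the incidence angle
sin_r = (total - x) / math.hypot(d, total - x)   # sine of the refraction angle
print(sin_i / sin_r, c1 / c2)
```

The light spends more of its horizontal travel in the faster medium, and the ratio of the sines matches the ratio of the speeds.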
The Lagrange Formulation
I would like to describe the powerful method of the Lagrange-Euler equation in classical physics, as opposed to directly applying Newton's laws. It is inherently an energy formulation but the results contain a few nuances. In the Lagrange formulation, the energy used is not the total energy, but rather the difference between the forms of energy. Let T denote the kinetic energy and V the potential energy. Then, we define L = T-V. This quantity is called the Lagrangian and it is not yet clear what we do with it. But, let us take a deep breath before we discuss a few simple but powerful observations.
- The quantity defined here is a difference of two scalar quantities. So, the Lagrangian is (a function which can be evaluated to output) a scalar quantity that can be meaningfully referred to as smaller or larger, positive or negative etc. (just like numbers we encounter in everyday life; this is part of what makes it easier than dealing with forces as in the Newtonian formulation)
- Energy is a state function and it depends on the current state of the system only, independent of the history. This holds not just for the total energy but for the forms of energy. So, it cannot directly depend on the value of the time. In fact, such a time dependence on the clock reading would violate the conservation of energy. In other words, L(x, x’, t) can be reduced to L(x(t), x'(t)). Of course, the kinetic energy and potential energy do indirectly vary with time t, as the energy depends on the position and velocity of the components of the system and the positions in turn depend on the trajectory that is traced out in time.
- The Lagrangian cannot be re-formulated using the conservation of total energy i.e, T+V = constant (E) . For example, L = 2T – E would be equivalent but we cannot claim E is constant in this case as the Lagrangian is defined not only for all possible trajectories but also all physically impossible trajectories where E is not conserved. The domain of definition needs to be this vast for solving the possible solutions finally, as we shall see.
The principle is that the system will make an effort to reduce the time averaged value of L as much as possible, according to physical constraints. This is encapsulated by defining the action, A = integral[L(x(t),x'(t)) dt], over the time interval considered. In order to minimize A, we have to lower L over time. Now, to minimize L over the entire interval is not trivial, but the principle does not need us to do that. Rather the principle is that the time-summed L tends to a local minimum or stationary point. So, all we have to do from a starting L(x,x') is to change x(t) to nearby values (which will also affect x'(t)), until the wiggling around of the shape of the curve x(t) does not reduce the summed L(x,x') anymore. And remember that we can wiggle around the curve x(t) as locally as we want, since the integral is just a sum of the contributions from every local interval. So, when we reach the condition that we cannot decrease the summed L(x,x') by local wiggling, we satisfy the condition that any infinitesimal change of x(t) results in zero corresponding change in the time-summed Lagrangian, i.e. D(L) = 0.
Let us approximate everything as a set of discrete points using a uniform time step Ts over the interval and then take the limit of the step size going to zero, in the spirit of calculus. Consider three consecutive points of x(t), to form a local picture. If we increase the middle point of x but leave the other two points unchanged, we have done a “local wiggle”. This naturally results in the slope of x increasing at the first point and then decreasing at the middle point by the same amount (compared to the slope values before the wiggle). In other words, if we change x at only a particular time step, leaving the value of x at other time steps unchanged, then x'(t) changes at the previous time step due to the altered value of x. There is an equal and opposite change in x'(t) at the current time step (assuming the forward step differentiation rule, but our conclusions will be independent of the rule used).
If we consider the value of L(x(t),x'(t)) plotted over the time axis, then this plot moves around the spot where t=t0, as we change x(t) at t=t0. Since x’ has equal and opposite changes at two time steps the integral of L (action) does not have any change in the first order resulting from x’. But in the second order, if the dependency of L on x'(t) is getting stronger/weaker with time, the equal and opposite changes in x'(t) will not have the same impact on L since these affect L at different time steps.
D(L) = change due to increase of x at current point + change due to increase of x’ at previous point – change due to decrease of x’ at current point.
Let the change in x at the current point/time-step be dx. Setting v = x' and denoting derivatives using subscripts (L_x = dL/dx, L_v = dL/dv), the change is:
D(L) = L_x(t) dx + L_v(t-Ts) dv - L_v(t) dv
Using the definition of velocity (dv = dx/Ts) we get:
D(L) = L_x(t) dx + L_v(t-Ts) * dx/Ts - L_v(t) * dx/Ts
But from the definition of the derivative,
L_v(t-Ts) - L_v(t) = - d/dt (L_v(t)) * Ts
Using this fact in the previous equation,
D(L) = L_x(t) dx - d/dt (L_v(t)) * Ts * dx/Ts = [L_x(t) - d/dt (L_v(t))] dx
But D(L) = 0, from our earlier discussion! This yields the Euler-Lagrange equation for mechanics,
L_x = d/dt (L_v)
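The "local wiggle" argument can be tried out directly on a discretized path. The following is a minimal sketch under my own assumptions: a mass on a spring with L = M v^2/2 - K x^2/2 and M = K = 1, a time step of 0.01, and forward-difference velocities as in the discussion above. It nudges one middle point of a path and measures how much the summed L changes, once for the true motion x(t) = cos(t) and once for a straight line between the same end points.

```python
import math

M, K = 1.0, 1.0   # hypothetical mass and spring constant, L = M v^2 / 2 - K x^2 / 2

def discrete_action(xs, ts):
    """Sum of L(x, v) * ts using forward-difference velocities."""
    total = 0.0
    for j in range(len(xs) - 1):
        v = (xs[j + 1] - xs[j]) / ts
        total += (0.5 * M * v * v - 0.5 * K * xs[j] ** 2) * ts
    return total

def wiggle_response(xs, ts, j, eps=1e-4):
    """Change in the action per unit change of the single point xs[j]."""
    up, down = list(xs), list(xs)
    up[j] += eps
    down[j] -= eps
    return (discrete_action(up, ts) - discrete_action(down, ts)) / (2 * eps)

ts, n = 0.01, 100
true_path = [math.cos(j * ts) for j in range(n + 1)]   # solves x'' = -x
# A straight line between the same two end points (not a solution).
straight = [1.0 + (true_path[-1] - 1.0) * j / n for j in range(n + 1)]

mid = n // 2
print(wiggle_response(true_path, ts, mid), wiggle_response(straight, ts, mid))
```

On the true path the response is tiny and shrinks further as the time step is reduced, while on the straight line it stays clearly non-zero, so that path can still be improved by wiggling.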
The power of the inverse square law
Following on from the previous post on the Pythagoras theorem, we continue asking 'why' in this series of posts. I attempt to share my feeble understanding of why electric forces and gravitational forces closely obey the inverse square law in classical physics.
I like the gravitational wave interpretation. Simply put, this pins the cause on the world being three dimensional (disregarding string theory). The power of any constant flux or flow outward from a central source (even a point source) is diluted in proportion to the area that it is spreading out over. For instance, a sound wave from a speaker suspended high up in the air would travel in all directions uniformly. At any given time, the wavefront would be a set distance away from the source. The distance itself would depend on the speed of sound. But the key point is that it would spread out over a sphere of radius equal to that distance. So the intensity of the sound energy drops in proportion to the area of a sphere whose radius is the distance from the source. Since the area of the sphere depends on the square of the radius, the wave is divided or distributed by a factor proportional to the square of that distance, leading to the inverse square law.
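In numbers, the spreading argument looks like this. The sketch assumes an idealized point source radiating equally in all directions with no absorption, and the 100 W power and the distances are made up for illustration.

```python
import math

def intensity(power, distance):
    """Power per unit area on a sphere of the given radius around a point source."""
    return power / (4 * math.pi * distance ** 2)

near = intensity(100.0, 1.0)   # hypothetical 100 W source, 1 m away
far = intensity(100.0, 2.0)    # same source, twice as far
print(near / far)
```

Doubling the distance spreads the same power over four times the area, so the intensity falls to a quarter.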
An intuitive proof of the Pythagoras theorem
There is a theorem so powerful in geometry, that it lies at the heart of not only geometry, but also coordinate geometry, trigonometry, algebra, vector physics and more! I am talking about an ancient theorem known ages ago, called in modern day by a name none other than the Pythagorean theorem.
I am going to try to explain the beauty of this theorem and the breadth of its application in this post. This brings me to a key difference between mathematics and physics. In mathematics, one may never ask why, only if. You can ask if 11 is prime and how to determine if it is prime. But you may never meaningfully pose the question .. why is 11 prime. Yet I deeply believe, that all of human endeavor into the sciences relies on understanding the why. This is why I like physics so much, it allows us to ask why. My effort here is not really to prove such a well known theorem, but rather ask why the Pythagorean theorem is true. In the process, of course, it will be a non-rigorous proof.
So you may ask, am I claiming that prime numbers or other areas of relatively abstract math are unnatural? Not at all! In other words, what I am hinting at is the fact that we cannot answer why 11 is prime is just a sign of incomplete understanding. Even abstract math is often rooted in our observations of reality, if not specifically in nature. In fact, this is often the litmus test for something being the right way to define things or pick axioms.
So, let us start with a right triangle of side lengths a, b,c. Assume c is the hypotenuse, without loss of generality. Let us consider a light source at each vertex of the triangle, casting a vertical shadow. This is called a projection in physics. Specifically here, we are using an orthogonal projection. If it helps, one may imagine the hypotenuse as a ladder and the other two sides as a wall and the ground. So, with light sources present at each vertex, both sides cast shadows on the hypotenuse and the hypotenuse casts shadows onto each side from the light source at the opposite vertex. (In our example of the ladder, the ladder casts a shadow on the wall and on the ground. It is important to note that the shadows from the ladder fall vertically as the light sources are at the foot of the ladder and the top of the ladder)
Now we switch our attention to the shadows that fall from the sides onto the hypotenuse. (Imagine a light source just behind the point where the ground meets the wall: the ground prevents part of the light from reaching the ladder and the other part of the light is blocked by the wall, so no light reaches the ladder). Then we notice that the shadows of a and b fall onto c without any overlap!!! This is the key observation. So to get the length of c, we simply need to add up the lengths of the shadows of a and b. This leads us to the question of determining the projection effect.
Note that when the shadow of c falls onto side a, it exactly covers the side a, so the shadow has length a. This leads to a length contraction. Now when the shadow of a falls back onto c, it is a second projection of the same contraction factor. A slight nuance is implicit: The projection factor depends only on the angle between two lines and not on which line is projected onto another. This is provable by symmetry, as if we extend the two lines to infinity, then both lines are indistinguishable from each other… we can reverse the construction to use the other line to cast the projection or shadow and the factor has to remain the same.
If c --> a in the first projection, then by the unitary method, 1 --> a/c and consequently, a --> a*a/c. But this is what happens in the second projection, when a casts a shadow on c! This gives us the length of the shadow cast by a. Repeating for side b, we get, c --> b ==> b --> b*b/c. Now adding up both shadows, we get c = a*a/c + b*b/c. Multiplying through by c, we get the usual form of the theorem!
So what have we learnt? We learnt that a good name for this theorem might be … the double projection theorem. The same factor by which the hypotenuse projects its shadow onto the sides is again the physical factor when the sides are projected back onto the hypotenuse. And the sides project back with no overlap, so they add up to the hypotenuse. This double projection or factor is the whole reason the exponent in the Pythagoras theorem is 2, and not some other power.
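A quick numerical check of the double projection, using the familiar 3-4-5 triangle as the example (the function name is just something I made up for this sketch):

```python
def shadow_on_hypotenuse(side, hypotenuse):
    """Project a side onto the hypotenuse using the same factor side/hypotenuse."""
    factor = side / hypotenuse   # contraction when the hypotenuse is projected onto the side
    return side * factor         # apply the same factor again on the way back

a, b, c = 3.0, 4.0, 5.0
shadows = shadow_on_hypotenuse(a, c) + shadow_on_hypotenuse(b, c)
print(shadows, c)
```

The two shadows add up to the hypotenuse (up to floating point rounding), which is the statement c = a*a/c + b*b/c.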
Now let's study the generation of triples, which are basically sets of whole numbers which can be sides of a right triangle as they fit the equation, like (3,4,5). In general the squares of the sides can be taken as any set {(u-v)^2, 4uv, (u+v)^2}, since (u-v)^2 + 4uv = (u+v)^2, provided 4uv is a perfect square. This condition is definitely possible to meet by starting with non-prime equal u,v and factorizing them. Obviously, with equal u,v the product is a square. Then, exchange some of the factors, to make them unequal and still keep the product the same, i.e., a perfect square. For example, if we start with u=v=6, then exchange factors to set u=9 and v=4, then the triple (as per the above formula) would be 13, 5 and the square root of 144, which is 12. There is a proof that this is the only way to construct triples! To be precise, the claim is that this construction covers every possible set, up to a factor, so all the co-prime ones would be covered at the very least.
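The recipe is short enough to write down as code. This is a small sketch of the construction described above; the function name and the choice to return None when 4uv is not a perfect square are my own.

```python
import math

def triple_from(u, v):
    """Return the sides (u - v, sqrt(4uv), u + v) if 4uv is a perfect square, else None."""
    root = math.isqrt(4 * u * v)
    if root * root != 4 * u * v:
        return None
    return (abs(u - v), root, u + v)

print(triple_from(9, 4))   # the example from the text
print(triple_from(8, 2))
print(triple_from(2, 3))   # 4uv = 24 is not a perfect square
```

Starting from equal u and v (like 6 and 6) gives a degenerate triangle with one side of length zero, which is why the factors need to be exchanged to make u and v unequal.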
In terms of application, we can show that any directional quantity (known as vectors) for instance, velocity, adds up in this way. If an object is moving to the left at 3 units velocity and forward at 4 units velocity, how far will the object travel in total?
Let's break it down in time: In unit time, the object moves 3 units to the left and 4 units forward. There is no impact due to the order of translation, as the distance covered is the same (regardless of whether the translation in both the left and forward directions is simultaneous or one direction is covered first followed by the other). But following the sequential steps of motion, essentially the sides of a right triangle are traced out, if we mark the object's original position and trace the path to a new position (and if the motion is simultaneous, we know that the starting position and final position must be the same as for the sequential motion). The final translation will be the shortest distance between the initial and final position, which is the straight line between the points, which in turn is the hypotenuse of the right triangle with the aforementioned sides. The hypotenuse has length 5, covered in unit time. So, the effective velocity is 5 units, equivalent to adding the impact of the two velocities by the Pythagoras theorem.
Let me also take the opportunity to explain why orthogonal directions are considered independent in physics. (Independent things/directions are called dimensions). It is because we can meaningfully specify any vector value (like velocity) for one direction separately from the other. On the other hand, if we have two directions at 60 degrees with each other, then 5 units velocity along one line is not consistent with, for example, zero velocity in the other direction. An object moving along one of the lines is by default also moving along the other direction. (An object having zero velocity in a direction is defined as: if we draw a line parallel to the direction, the closest point on the line to the object does not change). This is however possible, if the lines are orthogonal. The closest point on the orthogonal line does not change and there is no motion in the direction that the line is pointing toward.
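Both points, the combined velocity and the independence of orthogonal directions, can be put into a few lines. This is a sketch using the numbers from the text (3 and 4 units, 5 units at 60 degrees); the function name is hypothetical.

```python
import math

def component_along(speed, angle_between_degrees):
    """Speed seen along a second direction at the given angle to the motion."""
    return speed * math.cos(math.radians(angle_between_degrees))

combined = math.hypot(3.0, 4.0)        # 3 units left and 4 units forward
along_60 = component_along(5.0, 60.0)  # non-zero: the two directions are not independent
along_90 = component_along(5.0, 90.0)  # essentially zero: orthogonal directions are independent
print(combined, along_60, along_90)
```

Only at 90 degrees does motion along one line leave the closest point on the other line unchanged (the value printed is zero up to floating point rounding).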
This is how the Pythagorean theorem derives its great importance and utility. It is a formula to combine actions along multiple dimensions. It is more than just a triangle! (… in some sense, but in another sense, it is all just a triangle)
Home Value
Debate: A personal home is an investment
My take : I agree it is an investment, but not a good investment always. Let me explain.
If you are personally living in the home, you are consuming all your potential rental income. Consumption does not equal investment. If you save money on rent that you would have otherwise spent, then it is a great investment. But that is exactly the point. The catch is that you have to guarantee you would have otherwise spent that same amount of money or more on rent. There is no other class of investment where, if you invest more, your consumption automatically increases. If you move to a bigger house because it is an investment, you have not made a good investment as you have increased your own rent. In other words, the only way to properly account for this is to create a You Inc. and lease the house to yourself and pay rent to yourself. You will see the new bigger house does really well as an investment off the back of the high expenses of the tenant, who is.. you guessed it, yourself.
This brings us to the second facet of investment consideration, that is price appreciation. People get home appreciation facts confused all the time. So here goes some much needed clarification.
Home values are based on both the structure and land values. Structural values appreciate only with the residential building cost index. Land values are what truly appreciate in high demand/growth areas, like the Dallas area in the last decade. Land prices are set by supply demand market dynamics. Structure pricing can also be driven up by demand to a degree but is shielded from the sometimes irrational returns land owners enjoy, as it is capped by the cost of demolition plus new construction cost (including the time value of construction time).
I do think that land can be a great investment in some areas, but market selection success is not a given, particularly if one does not have a choice in the area where they buy their personal home due to jobs, kids, roots etc. Moreover, I do not think the structure itself is a great investment unless it is generating rental income. Something that only grows with labor and material inflation is not a great investment! (*Granted, that inflation rate is often higher than the consumer inflation measured by CPI, since electronics, manufactured goods and apparel generally bring the average CPI down)
Further nuances
Demand is affected by the overall economic climate of course, including where people want to live and what people are able to pay. Current supply is just another word for existing inventory. Increases to supply are driven by demand, of course, and affected by zoning laws, land availability for new construction etc.
Interest rates do not significantly affect nominal home prices, contrary to popular belief. The ability to borrow more and repay the same (as when borrowing less previously) is only a perceived reality. In fact, the present value of the future payments increases when the rates drop. So, even at a lower nominal payment, consumers are paying the same real value as before the rate drop. This happens with the dual accompanying forces of lower inflation adjustment to wages and lower expected market rate of return on investments. I added the word significantly as a qualifier, because the uninformed first time home-buyer might indeed borrow a larger amount at lower rates and bid a higher price on a home. But if land prices appreciate temporarily due to this activity, this will be normalized over time as the sustained lower rates will reduce nominal wealth creation and people's expectations on what is affordable will re-adjust. If anything, people with existing mortgages might re-finance at lower rates and if they ever change houses later, their expectations would be based off the reference of their new lower nominal payment. (Nominal = dollar amount face value ; Real = Spending power usually calculated as Inflation adjusted value)
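The statement that the present value of the future payments increases when rates drop is just the standard annuity formula at work. Here is a minimal sketch with made-up numbers (a fixed payment of 1,000 a month for 360 months, discounted at 6% versus 3% a year). It only illustrates the discounting arithmetic, not any claim about actual home prices.

```python
def present_value(payment, rate, periods):
    """Value today of `periods` equal payments, discounted at `rate` per period."""
    if rate == 0:
        return payment * periods
    return payment * (1 - (1 + rate) ** -periods) / rate

# Hypothetical numbers: the same 1,000 monthly payment for 360 months.
pv_high_rate = present_value(1000.0, 0.06 / 12, 360)
pv_low_rate = present_value(1000.0, 0.03 / 12, 360)
print(round(pv_high_rate), round(pv_low_rate))
```

The same stream of nominal payments is worth more today when it is discounted at the lower rate.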
These two claims (building cost index, interest rates) are also borne out by data. For example, the median price of homes sold in 2008 pre-recession was around $240k, while in 2019 pre-pandemic it was $320k, a factor of 1.33. (It has been flat across 2020.) Over the same timeframe, the building cost index for new residential construction went from 90 to 120, also a factor of 1.33. This would mean housing price increases are in line with the cost index. It also signifies that overall land in the US has not appreciated much beyond inflation. Between 1990 and 2000, the median home price went from $120k to $160k, which is also a factor of 1.33, in a much higher interest rate environment. It must be conceded that this appreciation in the last decade, in such a low rate environment, is comparatively better than the same appreciation in the 90s, when even treasury bonds were yielding high rates!
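The ratios quoted above are easy to check. The sketch below uses only the figures from the paragraph; the annualized rates are an extra step that assumes spans of 11 years (2008 to 2019) and 10 years (1990 to 2000), read off the years quoted.

```python
def growth_factor(start: float, end: float) -> float:
    """Total growth factor between two values."""
    return end / start


def annualized_rate(start: float, end: float, years: int) -> float:
    """Compound annual growth rate implied by a start value, end value and span."""
    return (end / start) ** (1 / years) - 1


# (label, start, end, years) using the figures quoted in the text
series = [
    ("median home price 2008-2019 ($k)", 240, 320, 11),
    ("building cost index 2008-2019", 90, 120, 11),
    ("median home price 1990-2000 ($k)", 120, 160, 10),
]
for label, start, end, years in series:
    factor = growth_factor(start, end)
    rate = annualized_rate(start, end, years)
    print(f"{label}: factor {factor:.2f}, about {rate:.2%} per year")
```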
As mentioned earlier, in high growth areas like SFO, the land does actually appreciate, and that is due to market forces playing out. The phrase “playing out” is important, as the future growth potential up to a certain horizon is priced into homes in these markets, similar to a growth stock (there is no free lunch). As time passes, if things keep looking up, the confidence time horizon moves forward and the prices go up even further.
In conclusion, a personal home can in some cases be an okay non-diversified investment, but try to lease a less expensive home from yourself and actually invest the savings!
The Growth Valuation Formula
What is the value of a stock (price)? It comes from the earnings it generates! Let us first consider constant earnings and then growing earnings.
Constant Earnings
P = E/r , where P is the price of the stock determined by the earnings potential E.
This formula follows from the definition of the expected rate of return ‘r’, as the capital used to purchase the share is the market price of the stock. For our purposes, we can take this as an annual rate of return.
But to be clear, let us work out the math in a different way.
The earnings E in a future year have a present value that is determined by how much you have to invest now to have a total (capital + profit) of E in the future. For earnings one year out, this amount works out to be E/(1+r): if you invested it today at a rate of return r, it would grow to E a year later. Earnings two years out are discounted twice, and so on. So, if a stock generated E every year, its value would be:
P = E (present value of year 1) + E (PV of year 2) + E (PV of year 3) + …
P = E/(1+r) + E/(1+r)^2 + E/(1+r)^3 + …
This can be rewritten to include year 0 as,
P + E = E + E/(1+r) + E/(1+r)^2 + E/(1+r)^3 + …
Using the formula for the sum of the first n terms of a geometric progression with a factor of 1/(1+r),
P + E = E * (1 - x^n)/(1 - x), where x = 1/(1+r)
Since x < 1 (as r > 0), as n becomes larger, x^n approaches zero.
Hence, P = E/(1 - x) - E. Since 1 - x = r/(1+r), we have E/(1 - x) = E + E/r, which simplifies to P = E/r
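As a sanity check on the result P = E/r, the discounted sum can simply be added up term by term over a long horizon. This is a small sketch; the values of E and r and the horizons are arbitrary illustrative choices.

```python
def pv_constant_earnings(E: float, r: float, years: int) -> float:
    """Sum of E/(1+r)^t for t = 1..years (earnings start one year from now)."""
    return sum(E / (1 + r) ** t for t in range(1, years + 1))


E, r = 10.0, 0.05  # illustrative earnings per share and annual rate of return
for years in (10, 50, 200, 1000):
    print(f"{years:>4} years: {pv_constant_earnings(E, r, years):.4f}")
print(f"closed form E/r: {E / r:.4f}")
```

The partial sums approach E/r as the horizon grows, as the x^n term in the derivation predicts.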
Growing Earnings
Let us consider earnings E at present and steady growth of E by a fixed amount E* every year… forever
Then, P = E/r + E*/r^2
Let’s discuss this formula and then derive it. Firstly, no company grows forever, so in practice we need a terminal valuation. That consideration aside, we see why growth stocks are very attractive in low-growth, low-interest-rate economies like the US at present. As r << 1, 1/r becomes large and 1/r^2 becomes really large, making the value/price increase greatly. For example, if rates drop by half, the earnings value doubles but the earnings growth value becomes four times as large!
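The doubling and quadrupling can be seen by splitting the formula into its two parts. In this sketch the function name and the inputs (E = 10, E* = 1, rates of 8% and 4%) are illustrative assumptions.

```python
def growth_valuation(E: float, E_star: float, r: float) -> tuple[float, float]:
    """Return (earnings value, growth value) = (E/r, E*/r^2)."""
    return E / r, E_star / r ** 2


E, E_star = 10.0, 1.0  # illustrative current earnings and yearly earnings growth
for r in (0.08, 0.04):
    earnings_value, growth_value = growth_valuation(E, E_star, r)
    total = earnings_value + growth_value
    print(f"r = {r:.0%}: earnings value {earnings_value:.2f}, "
          f"growth value {growth_value:.2f}, total {total:.2f}")
```

Halving r doubles the first part and quadruples the second, so the growth part takes up a larger share of the price at lower rates.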
Now to derive, we have to consider the price with the increasing earnings:
P = PV of year 1 earnings + PV of year 2 earnings + PV of year 3 earnings + …
P = E/(1+r) + (E+E*)/(1+r)^2 + (E+2E*)/(1+r)^3 + …
Let’s separate the E terms, then group the E* terms by the year in which the growth of E* was achieved, setting x = 1/(1+r):
P = E·x + E·x^2 + E·x^3 + … (constant earnings value)
+ E*·x^2 + E*·x^3 + … (year 1 growth increases earnings in all future years)
+ E*·x^3 + … (year 2 growth increases earnings in all future years, in present value)
Let’s name these terms in each line above for clarity,
P = Earnings value + Present value of growth (P.V.G) in year 1 + P.V.G in year 2 + P.V.G in year 3 + …
P.V.G (year 1) = E*·x^2 + E*·x^3 + … = E*·x [x + x^2 + …] = E*·x [1/r] (using x + x^2 + … = 1/r, from the constant earnings case)
P.V.G (year 2) = E*·x^2 [1/r]
Adding them all together, Total PVG = (E*/r) [x + x^2 + …]
Total PVG = (E*/r) [1/r] = E*/r^2
P = Earnings value + Total PVG
P = E/r + E*/r^2
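The closed form can also be checked numerically by discounting the growing earnings year by year, with year t earning E + (t-1)·E* as in the series above. This is a sketch with illustrative inputs; the long horizon stands in for "forever".

```python
def pv_growing_earnings(E: float, E_star: float, r: float, years: int) -> float:
    """Discounted sum where year t earns E + (t - 1) * E_star, for t = 1..years."""
    total = 0.0
    for t in range(1, years + 1):
        earnings_t = E + (t - 1) * E_star   # growth from each earlier year adds up
        total += earnings_t / (1 + r) ** t  # discount back to the present
    return total


E, E_star, r = 10.0, 1.0, 0.05  # illustrative inputs
numeric = pv_growing_earnings(E, E_star, r, 2000)
closed_form = E / r + E_star / r ** 2
print(f"numeric sum: {numeric:.4f}")
print(f"closed form: {closed_form:.4f}")
```

Truncating the sum at a shorter horizon gives a lower value, which is one way to get a feel for how much of the price depends on growth far in the future.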
In recent decades, value investing has underperformed growth stock investing; this is so well established that I won’t even bother to cite references. This is not necessarily always the case, but recent US market history is so skewed in this direction that it is a truth universally acknowledged that a growing company with a high share price is in need of a higher share price. (Jane Austen reference)
I read on another blog (and I agree with it) that when growth is prevalent in the marketplace, value is at a premium, and when growth is scarce (like in low GDP growth economies), growth stocks are at a premium. I must also qualify the word growth as “realized growth rate”, since market conditions are dynamic. Otherwise, the price would not appreciate over time (instead the market would price in the future value of expected growth from day 1). As it is, the performance of the company has to make a case for a growth trajectory, with some proof from its history in past quarters/years. This, accompanied by suitable market conditions and a vision from management, is usually necessary for an increased share price. In recent decades, the US has seen relatively low GDP growth. This macro condition would explain, to a degree, the great performance of many growth fund managers in the past two decades.