Abhimanyu Pallavi Sudhir
http://www.rssmix.com/
This feed was created by mixing existing feeds from various sources. RSSMix
Answer by Abhimanyu Pallavi Sudhir for Example contradicting the first postulate of special relativity
https://physics.stackexchange.com/questions/439895/example-contradicting-the-first-postulate-of-special-relativity/440017#440017
-1<p>How exactly are you defining "the physical laws"? One could certainly define physical laws to be general enough so as to work for non-inertial reference frames.</p>
<p>The point of the first postulate is that inertial reference frames <em>exist</em> -- alternatively, that there are reference frames related to one another by linear transformations.</p>Fri, 09 Nov 2018 23:27:24 GMThttps://physics.stackexchange.com/questions/439895/-/440017#440017Abhimanyu Pallavi Sudhir2018-11-09T23:27:24ZUnderstanding variable substitutions and domain splitting in integrals
https://thewindingnumber.blogspot.com/2018/10/understanding-variable-substitutions.html
0Often when I'm reading a computation of some weird integral that contains some kind of a "trick" for some variable substitution, I can't help but think "How could I have thought of that?" And even when these are introduced at school, they are usually taught as "tricks", and the strategy to decide which "trick" to use is memorised -- you see $1+x^2$? Well, that's either $\tan x$ or $\cot x$. And sure, for such simple ones, that kind of a trick might make sense. You know, you have something that really looks like a trig identity, so let's just make it one...<br /><br />But I often find that these kinds of "tricks" can be motivated and made to make sense, and I think that there usually is such a way to come up with one from mathematical insight (and I think so, because someone's had to actually come up with the tricks).<br /><br />Here's the Cauchy-Schwarz inequality for functions on [0, 1]:<br /><br />\[{\left[ {\int_0^1 {f(t)g(t)dt} } \right]^2} \le \int_0^1 {f{{(t)}^2}dt} \,\int_0^1 {g{{(t)}^2}dt} \]<br />How would we go about proving this?<br /><br />Well, perhaps you recall what the proof of the Cauchy-Schwarz inequality for ordinary vectors in $\mathbb{R}^n$ looks like. Here's a standard proof:<br /><br />\[{\left( {{x_1}{y_1} + {x_2}{y_2} + ... + {x_n}{y_n}} \right)^2} \le \left( {{x_1}^2 + {x_2}^2 + ... + {x_n}^2} \right)\left( {{y_1}^2 + {y_2}^2 + ... + {y_n}^2} \right)\]<br />\[\left( {\begin{array}{*{20}{c}}{{x_1}^2{y_1}^2 + {x_1}{y_1}{x_2}{y_2} + ... + {x_1}{y_1}{x_n}{y_n} + }\\\begin{array}{l}{x_2}{y_2}{x_1}{y_1} + {x_2}^2{y_2}^2 + ... + {x_2}{y_2}{x_n}{y_n} + \\... + \\{x_n}{y_n}{x_1}{y_1} + {x_n}{y_n}{x_2}{y_2} + ... + {x_n}^2{y_n}^2\end{array}\end{array}} \right) \le \left( {\begin{array}{*{20}{c}}{{x_1}^2{y_1}^2 + {x_1}^2{y_2}^2 + ... + {x_1}^2{y_n}^2 + }\\\begin{array}{l}{x_2}^2{y_1}^2 + {x_2}^2{y_2}^2 + ... + {x_2}^2{y_n}^2 + \\... + \\{x_n}^2{y_1}^2 + {x_n}^2{y_2}^2 + ... + {x_n}^2{y_n}^2\end{array}\end{array}} \right)\]<br /><br />And now we simply need the fact that $2{x_i}{y_i}{x_j}{y_j} \le {x_i}^2{y_j}^2 + {x_j}^2{y_i}^2$, which is of course true since squares are nonnegative.<br /><br />Why on Earth would I walk you through this inane proof, which I'd rather be flogged to death than have to write? Because you might get the idea that the same principle can be applied to functions.<br /><br />What exactly would be the analogy? Well, let's first "expand out" the product of the two integrals, like we expanded out the product of two sums -- this just means rewriting the product as a double-integral.<br /><br />\[\iint_{{[0,1]}^2}{f(s)g(s)f(t)g(t)\,ds\,dt} \leq \iint_{{[0,1]}^2} {{f{{(s)}^2}g{{(t)}^2}\,ds\,dt}}\]<br />This is essentially the same as our double summation on $[1,n]^2$ from earlier -- and like before, the diagonals of the summations are exactly identical (this idea should itself tell you when the inequality becomes an equality) -- and we'd like to prove, as before, that the inequality holds for each sum of corresponding elements across the diagonal.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://i.stack.imgur.com/WfuSV.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="320" data-original-width="800" height="128" src="https://i.stack.imgur.com/WfuSV.png" width="320" /></a></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: left;">(Why does the principal diagonal look oriented differently from that for the vectors in $\mathbb{R}^n$?) But how would you actually write down, on paper, this technique of summing up stuff across the principal diagonal? 
Well, you'll need to split your domain into two, then "reflect" one domain across the principal diagonal so the two integrals can be on the same (new triangular) domain.</div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;">So we start with:</div><div class="separator" style="clear: both; text-align: left;"><br /></div>\[\int\limits_0^1 {\int\limits_0^1 {f(s)g(s)f(t)g(t){\kern 1pt} ds{\kern 1pt} dt} } \le \int\limits_0^1 {\int\limits_0^1 {f{{(s)}^2}g{{(t)}^2}{\kern 1pt} ds{\kern 1pt} dt} } \]<br />Where we're integrating first on $s$ (let's say this is the x-axis) and then on $t$ (the y-axis). To reflect anything, we need to actually be dealing with that thing, so split the domain of $s$ (which we can do, since $t$ is still a variable) into $[0,t]$ and $[t,1]$. This is equivalent to splitting the entire domain into the two triangles (convince yourself that this is the case if you don't see it immediately).<br /><br />\[\int\limits_0^1 {\int\limits_0^t {f(s)g(s)f(t)g(t){\kern 1pt} ds{\kern 1pt} dt} } + \int\limits_0^1 {\int\limits_t^1 {f(s)g(s)f(t)g(t){\kern 1pt} ds{\kern 1pt} dt} } \]\[\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, \le \int\limits_0^1 {\int\limits_0^t {f{{(s)}^2}g{{(t)}^2}{\kern 1pt} ds{\kern 1pt} dt} } + \int\limits_0^1 {\int\limits_t^1 {f{{(s)}^2}g{{(t)}^2}{\kern 1pt} ds{\kern 1pt} dt} } \]<br />Where the split integrals represent the top-left and bottom-right triangles respectively. Now how do we "reflect" the second part-integral on each side to match the domain of the first part-integral? 
The reflection is just:<br /><br />\[s' = t\]\[t' = s\]<br />If we transform the second part-integrals under this transformation:<br /><br />\[\int\limits_0^1 {\int\limits_0^t {f(s)g(s)f(t)g(t)\,ds\,} dt} + \int\limits_0^1 {\int\limits_{s'}^1 {f(t')g(t')f(s')g(s')\,dt'\,} ds'} \]\[\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, \le \int\limits_0^1 {\int\limits_0^t {f{{(s)}^2}g{{(t)}^2}{\kern 1pt} ds{\kern 1pt} } dt} + \int\limits_0^1 {\int\limits_{s'}^1 {f{{(t')}^2}g{{(s')}^2}{\kern 1pt} dt'{\kern 1pt} } ds'} \]<br />(Don't mind the $x'$ notation for the new co-ordinates -- you should think of $x'$ as matching up with $x$) But our transformation isn't really over. The two part integrals are now integrating over the same <i>domain</i> -- the top-left triangle -- but in different ways. To see this, just consider the "way we were integrating" before the transformation and see how it transforms under our reflection:<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://i.stack.imgur.com/k3g5y.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="320" data-original-width="800" height="128" src="https://i.stack.imgur.com/k3g5y.png" width="320" /></a></div><br />... which are different parameterisations of the same region. 
So we just reparameterise the second part-integrals (shown in green) to match that of the blue integrals, leaving the integrand the same:<br /><br />\[\int\limits_0^1 {\int\limits_0^t {f(s)g(s)f(t)g(t){\kern 1pt} ds{\kern 1pt} } dt} + \int\limits_0^1 {\int\limits_0^{t'} {f(t')g(t')f(s')g(s'){\kern 1pt} ds'{\kern 1pt} } dt'} \]\[\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, \le \int\limits_0^1 {\int\limits_0^t {f{{(s)}^2}g{{(t)}^2}{\kern 1pt} ds{\kern 1pt} } dt} + \int\limits_0^1 {\int\limits_0^{t'} {f{{(t')}^2}g{{(s')}^2}{\kern 1pt} ds'{\kern 1pt} } dt'} \]<br />And then we can add the integrals:<br /><br />\[\int\limits_0^1 {\int\limits_0^t {\left[ {2f(s)g(s)f(t)g(t)} \right]\,{\kern 1pt} ds{\kern 1pt} } dt} \,\, \le \,\,\,\int\limits_0^1 {\int\limits_0^t {\left[ {f{{(s)}^2}g{{(t)}^2} + f{{(t)}^2}g{{(s)}^2}} \right]\,{\kern 1pt} ds{\kern 1pt} } dt} \]<br />Which is true as it is true locally, i.e.<br /><br />\[2f(s)g(s)f(t)g(t) \le f{(s)^2}g{(t)^2} + f{(t)^2}g{(s)^2}\]<br />Which proves our result.<br /><br /><hr /><br />What's the point of going through all of this? Well, the point is that if I'd just thrown the substitutions at you -- or worse, the reparameterisation of the region, or the splitting in the first place -- without any motivation, then it would take about 20 days before there'd be murder charges on you and a tombstone on me. The reason you make them is because you want to unify the integrands -- but this motivation comes at the <i>very beginning</i>, before you start doing any substitutions, because that's why you're doing the substitutions in the first place, <i>that's how you come up with them</i>.<br /><br /><b>Exercise: </b>Motivate the substitutions and changes in the Gaussian integral, $\int_{-\infty}^\infty e^{-x^2}dx=\sqrt{\pi}$. 
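As a quick numerical sanity check of the stated value (this is not the motivation the exercise asks for -- just a sketch, with an arbitrary cutoff at $|x|=20$, beyond which the integrand is negligible):

```python
import numpy as np

# Riemann sum for the Gaussian integral; the tails beyond |x| = 20
# contribute less than e^(-400), i.e. nothing at double precision.
dx = 1e-3
x = np.arange(-20, 20, dx)
integral = np.sum(np.exp(-x ** 2)) * dx
assert abs(integral - np.sqrt(np.pi)) < 1e-6
```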
Hint: what's the significance of the two-variable normal distribution?antiderivativescalculusdomain splittingintegrationlinear basismathematicsvariable substitutionSat, 27 Oct 2018 14:06:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-7515239367411347178Abhimanyu Pallavi Sudhir2018-10-27T14:06:00ZComment by Abhimanyu Pallavi Sudhir on Newton's Third Law and conservation of momentum
https://physics.stackexchange.com/questions/435941/newtons-third-law-and-conservation-of-momentum/436015#436015
@DvijMankad Fair point. My answer was mostly in reference to classical physics. Indeed conservation of momentum becomes the only possible (or at least best) way to phrase things in quantum mechanics.Wed, 24 Oct 2018 10:06:24 GMThttps://physics.stackexchange.com/questions/435941/newtons-third-law-and-conservation-of-momentum/436015?cid=980063#436015Abhimanyu Pallavi Sudhir2018-10-24T10:06:24ZComment by Abhimanyu Pallavi Sudhir on Mate in 0 moves
https://puzzling.stackexchange.com/questions/74086/mate-in-0-moves/74093#74093
@FabianRöling Pawns have directions.Mon, 22 Oct 2018 09:27:58 GMThttps://puzzling.stackexchange.com/questions/74086/mate-in-0-moves/74093?cid=221467#74093Abhimanyu Pallavi Sudhir2018-10-22T09:27:58ZAnswer by Abhimanyu Pallavi Sudhir for Newton's Third Law and conservation of momentum
https://physics.stackexchange.com/questions/435941/newtons-third-law-and-conservation-of-momentum/436015#436015
3<p>As far as the actual physics is concerned, it is meaningless to talk of whether conservation of momentum is "more fundamental" than Newton's third law -- you can axiomatise classical physics in any of several ways -- from Newton's laws, from conservation laws, from symmetry laws, from an action principle, whatever. You can prove the resulting theories are equivalent, in the sense that all the alternative axiomatic systems imply each other.</p>
<p>In terms of understanding, it makes sense to have multiple different frameworks in your head -- a symmetry-based framework is really good intuitively, especially once you understand Noether's theorem, while an action principle is the most powerful and also more useful when you leave the realm of classical physics. Treating Newton's laws as axioms isn't a great idea -- it's mostly just historically relevant.</p>
<p>When you learn more advanced physics, conservation of momentum <em>will</em> start "feeling" more fundamental -- this is simply because momentum is an interesting quantity to talk about.</p>Sun, 21 Oct 2018 21:20:57 GMThttps://physics.stackexchange.com/questions/435941/-/436015#436015Abhimanyu Pallavi Sudhir2018-10-21T21:20:57ZComment by Abhimanyu Pallavi Sudhir on If force is a vector, then why is pressure a scalar?
https://physics.stackexchange.com/questions/429998/if-force-is-a-vector-then-why-is-pressure-a-scalar/430000#430000
So? One could just ask "Why not define it as $\vec F = \vec{p}A$?" Your answer doesn't <i>justify</i> the definition, just states it.Sun, 21 Oct 2018 20:35:21 GMThttps://physics.stackexchange.com/questions/429998/if-force-is-a-vector-then-why-is-pressure-a-scalar/430000?cid=978845#430000Abhimanyu Pallavi Sudhir2018-10-21T20:35:21ZComment by Abhimanyu Pallavi Sudhir on If force is a vector, then why is pressure a scalar?
https://physics.stackexchange.com/questions/429998/if-force-is-a-vector-then-why-is-pressure-a-scalar/430001#430001
This answer simply lacks insight -- one could just ask "well, then what's wrong with $\vec{F}/A$? What did it ever do?" and the answer would be "Nothing, that's the point."Sun, 21 Oct 2018 20:33:53 GMThttps://physics.stackexchange.com/questions/429998/if-force-is-a-vector-then-why-is-pressure-a-scalar/430001?cid=978844#430001Abhimanyu Pallavi Sudhir2018-10-21T20:33:53ZDerive simple logical laws in a structure with not and implies
https://math.stackexchange.com/questions/2962525/derive-simple-logical-laws-in-a-structure-with-not-and-implies
4<p>We can define an abstract system with the following three axiom schemes defining <span class="math-container">$\to$</span> and <span class="math-container">$\lnot$</span>:</p>
<p>ax1. <span class="math-container">$P\to(Q\to P)$</span></p>
<p>ax2. <span class="math-container">$(\lnot Q \to \lnot P)\to(P\to Q)$</span></p>
<p>ax3. <span class="math-container">$(P\to(Q\to R))\to((P\to Q)\to(P\to R))$</span></p>
<p>And any logical expressions may be substituted for <span class="math-container">$P, Q, R$</span>. Now obviously, you can't assume anything else (not even any definition of <span class="math-container">$\lnot$</span>, etc.) -- these two objects <span class="math-container">$\to$</span> and <span class="math-container">$\lnot$</span> need not have anything at all to do with the standard implication and negation we know of, they just happen to satisfy the above properties. But with the above, we can prove basic "logical laws" like:</p>
<p><span class="math-container">$$P\to P$$</span></p>
<p>(which we can prove by applying ax3 on ax1, showing that <span class="math-container">$(P\to Q)\to(P\to P)$</span>, so we just need to construct a <span class="math-container">$Q$</span> for any <span class="math-container">$P$</span> to imply, and such a <span class="math-container">$Q$</span> is provided by ax1, <span class="math-container">$Q:= (R\to P)$</span>.)</p>
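The derivation sketched above (which implicitly uses modus ponens as the inference rule) can be replayed mechanically. A minimal sketch in Python: formulas are nested tuples, the helper names `imp`, `ax1`, `ax3`, `mp` are made up for this illustration, and $Q$ is instantiated as $P \to P$:

```python
# formulas as nested tuples: ('->', A, B) means A -> B
def imp(a, b):
    return ('->', a, b)

def ax1(P, Q):    # ax1: P -> (Q -> P)
    return imp(P, imp(Q, P))

def ax3(P, Q, R): # ax3: (P -> (Q -> R)) -> ((P -> Q) -> (P -> R))
    return imp(imp(P, imp(Q, R)), imp(imp(P, Q), imp(P, R)))

def mp(pq, p):    # modus ponens: from P -> Q and P, conclude Q
    assert pq[0] == '->' and pq[1] == p
    return pq[2]

P = 'P'
s1 = ax3(P, imp(P, P), P)  # (P -> ((P->P) -> P)) -> ((P -> (P->P)) -> (P -> P))
s2 = ax1(P, imp(P, P))     # P -> ((P->P) -> P)
s3 = mp(s1, s2)            # (P -> (P->P)) -> (P -> P)
s4 = ax1(P, P)             # P -> (P->P)
s5 = mp(s3, s4)            # P -> P
assert s5 == imp(P, P)
```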
<p>Is it possible to prove this? :--</p>
<p><span class="math-container">$$P\to \lnot\lnot P$$</span></p>
<p>The person who gave me this problem insists it is provable, although it seems to me that such a proof is impossible, as none of the axioms increase the depth of <span class="math-container">$\lnot$</span>s across the <span class="math-container">$\to$</span> (i.e. none of them have a more knotty right-hand-side than left-hand-side).</p>logicpropositional-calculusaxiomshilbert-calculusFri, 19 Oct 2018 19:51:34 GMThttps://math.stackexchange.com/q/2962525Abhimanyu Pallavi Sudhir2018-10-19T19:51:34ZDiscovering the Fourier transform
https://thewindingnumber.blogspot.com/2018/10/discovering-fourier-transform.html
0Consider a function with period 1 -- computing its Fourier series, you write it as:<br /><br />\[f(x) = \sum\limits_{n = - \infty }^\infty {{a_n}{e^{i2\pi \,\,nx}}} \]<br />Where<br /><br />\[{a_n} = \int_{-1/2}^{1/2} {f(x){e^{ - 2\pi inx}}dx} \]<br />That's all standard and trivial. But suppose you wanted to study a function with a longer period (we will tend this period to infinity) -- what would that look like? Well, consider $g(x)=f(x/L)$, which is this function we're looking for -- then we can rewrite the above identities as:<br /><br />\[g(xL) = \sum\limits_{n = - \infty }^\infty {{a_n}{e^{i2\pi {\kern 1pt} {\kern 1pt} nx}}} \Rightarrow g(x) = \sum\limits_{n = - \infty }^\infty {{a_n}{e^{i2\pi {\kern 1pt} {\kern 1pt} nx/L}}} \]<br />\[{a_n} = \int_{-1/2}^{1/2} {g(xL){e^{ - 2\pi inx}}dx} \Rightarrow {a_n} = \int_{-L/2}^{L/2} {g(x){e^{ - 2\pi inx/L}}dx/L} \]<br /><br />Where we transformed $x\to x/L$ (note that the coefficient integral is taken over a single period).<br /><br />This seems all too trivial and useless, and maybe you're looking for a little trick to turn this into something interesting. But tricks must typically also arise from some sort of insight. 
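The coefficient formula can be sanity-checked numerically. A sketch, with a hypothetical period-1 test function: sampling one period uniformly and averaging approximates the coefficient integral over a period.

```python
import numpy as np

# Hypothetical period-1 test function with known coefficients:
# f(x) = 1 + 2 cos(2πx) - sin(4πx), so a_0 = 1, a_1 = a_{-1} = 1, a_2 = i/2.
def f(x):
    return 1 + 2 * np.cos(2 * np.pi * x) - np.sin(4 * np.pi * x)

x = np.linspace(0, 1, 4096, endpoint=False)  # one full period

def a(n):
    # the coefficient integral, approximated as a mean over the period
    return np.mean(f(x) * np.exp(-2j * np.pi * n * x))

assert abs(a(0) - 1) < 1e-9
assert abs(a(1) - 1) < 1e-9
assert abs(a(2) - 0.5j) < 1e-9
```

(For a band-limited function like this, the uniform-sample mean recovers the coefficients essentially exactly, by the orthogonality of the sampled exponentials.)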
Let's assume for a moment that we didn't know anything about variable substitutions or transformations like the kind we did above (and indeed, the idea behind variable substitutions also comes from a geometric understanding of the corresponding transformation) and think about how we may re-think the Fourier transform in its context.<br /><br />Well, if the function's period is $P$, in other words it is stretched out by $P$, the same logic must be used to derive the Fourier series for the new function as for the function with period 1 -- specifically, sines and cosines with <i>longer periods </i>than $P$ don't matter (their coefficient must be zero, because otherwise you've introduced an element into the function that doesn't repeat with that period), but those with <i>shorter, divisible periods </i>matter, because they influence the value of the function within the period, perturbing it by little bits to get to the right function.<br /><br />So when dealing with our new period $L$, one would expect periods that are fractions of $L$, i.e. $L/n$, as opposed to just $1/n$. So $n/L$ is "more important" than $n$, and indeed it seems very easy to transform the summation into one in terms of this new variable, which we will still call $n$ (i.e. transform $n/L\to n$):<br /><br />\[g(x) = \sum\limits_n^{} {{a_n}{e^{i2\pi nx}}} \]<br />\[{a_n} = \frac{1}{L}\int_{-L/2}^{L/2} {g(x){e^{ - 2\pi inx}}dx} \]<br /><br />Where we labeled $a_{nL}$ as just $a_n$, because that's just a subscript, the labeling doesn't matter. Just remember that $n$ is no longer just an integer/multiple of 1, but a multiple of the fraction $1/L$.<br /><br />Now note how a non-periodic function is just a function with infinite period, i.e. $L\to\infty$. So $n$ stops being a discrete integer and starts approaching a continuous variable, which we'll call $s$, writing $a_n$ as $a(s)ds$ (why the $ds$? 
because the increment in $n$ is just $1/L$, which appears in the expression for $a_n$).<br /><br />\[g(x) = \int_{ - \infty }^\infty {ds\,\,a(s){e^{i2\pi sx}}} \]<br />\[a(s) = \int_{ - \infty }^\infty {dx\,\,g(x){e^{ - i2\pi sx}}} \]<br /><br />Which is just a pretty satisfying result.<br /><br /><hr /><br />Recall again the expressions we got for the Fourier transform and its inverse:<br /><br />\[f(t) = \int_{ - \infty }^\infty {ds\,\,\hat f(s){e^{i2\pi ts}}} \]<br />\[\hat f(s) = \int_{ - \infty }^\infty {dt{\kern 1pt} {\kern 1pt} f(t){e^{ - i2\pi st}}} \]<br />(We typically say the Fourier transform maps time-domain functions to frequency-domain ones, so we consider the latter to be the Fourier transform and the first equation to be its inverse.) Note how you can easily turn the first one into an actual Fourier transform, by transforming $s\to -s$:<br /><br />\[f(t) = \int_{ - \infty }^\infty {ds\,\,\hat f( - s){e^{ - i2\pi ts}}} \]<br />In other words:<br /><br />\[{\mathcal{F}^{ - 1}}\left\{ {f(s)} \right\} = \mathcal{F}\left\{ {f( - s)} \right\}\]<br />And of course that means ${\mathcal{F}^4} = I$, the identity operator (kind of like the derivative on complex exponentials/sine and cosine, is it not?).calculusfinite-domain fourier transformsfourierfourier seriesfourier transformsintegral transformmathematicsWed, 10 Oct 2018 21:12:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-1844418461803314162Abhimanyu Pallavi Sudhir2018-10-10T21:12:00ZQuaternion introduction: Part I
https://thewindingnumber.blogspot.com/2018/09/quaternion-introduction-part-i.html
0I generally really like the content produced at <a href="https://www.youtube.com/channel/UCYO_jab_esuFRV4b17AJtAw">3blue1brown</a>, but their <a href="https://www.youtube.com/watch?v=d4EgbgTm0Bg">recent video on quaternions</a> was just downright terrible. It entirely lacked Grant Sanderson's signature "discover it for yourself" approach, i.e. motivating the idea from the ground-up, and focused too much on an arbitrary formalism (stereographic projections aren't necessary for visualising anything).<br /><br />The right way to motivate quaternions is to start by thinking about generalising complex numbers to higher dimensions. Complex numbers are a remarkable and elegant idea -- if you don't understand why I'm saying this, you could either get off the grid and spend the rest of your life as a circus monkey, or you could read my posts "<a href="https://thewindingnumber.blogspot.com/2017/08/symmetric-matrices-null-row-space-dot-product.html">Null and row spaces, transpose and the dot product</a>" and "<a href="https://thewindingnumber.blogspot.com/2016/11/making-sense-of-eulers-formula.html">Making sense of Euler's formula</a>".<br /><br />The key idea behind complex numbers is that they are an alternate, simple representation of a specific set of linear transformations, namely: two-dimensional spirals (scaling and rotations). Note, similarly, that the real numbers can also be considered an alternate representation of e.g. scaling in one dimension.<br /><br />The natural way to generalise complex numbers to more than two dimensions may seem to be to have an imaginary unit for each possible rotation (or more precisely, each "basis rotation"). In three dimensions, the basis has three planes of rotation, and could be e.g. 
rotations in the <i>xy</i>-plane, rotations in the <i>yz</i>-plane and rotations in the <i>zx</i> plane (you may have heard of these as rotations "around" the <i>z</i>, <i>x</i> and <i>y</i> axes respectively, referring to the axes that remain invariant during the rotation -- however, as it turns out, in a greater number of dimensions $n$, the number of dimensions held invariant is $n-2$, which is only equal to 1 -- i.e. a single axis -- in 3 dimensions. e.g. in 4 dimensions, an $xy$-rotation would leave the $zw$ plane invariant.)<br /><br />So let's try out this formalism, because it seems promising. We could write, e.g. <i>i</i> for the <i>yz</i> rotation, <i>j</i> for the <i>zx</i> rotation and <i>k</i> for the <i>xy</i> rotation. Try to work out some of the algebra here for yourself. What does $ij$ equal? What does $jk$ equal? What does $i^2$ equal?<br /><br />As it turns out, none of these transformations result in anything very interesting. It would have certainly been elegant if you'd gotten nice results, like $ij=k$, or something, but you don't. One of the neat things about the complex number system is that not only do all complex numbers together, or all unit complex numbers together, form a group -- even $\{1,i,-1,-i\}$ forms a group under multiplication. But $\{1,-1,i,j,k,-i,-j,-k\}$ <i>does not</i> form a group.<br /><br />How would one solve this problem? Well, the reason $i^2$ doesn't equal minus 1 is that it only offers a reflection across the $x$-axis. The matrix representing $i^2$ is:<br /><br />$${\left[ {\begin{array}{*{20}{c}}1&0&0\\0&0&{ - 1}\\0&1&0\end{array}} \right]^2} = \left[ {\begin{array}{*{20}{c}}1&0&0\\0&{ - 1}&0\\0&0&{ - 1}\end{array}} \right]$$<br />(If you can't come up with the matrix for $i$, you should review the linear algebra series -- or the circus monkey thing.) What if you reflected across all three axes, in some order? 
You'd have:<br /><br />$${i^2}{j^2}{k^2} = \left[ {\begin{array}{*{20}{c}}1&0&0\\0&{ - 1}&0\\0&0&{ - 1}\end{array}} \right]\left[ {\begin{array}{*{20}{c}}{ - 1}&0&0\\0&1&0\\0&0&{ - 1}\end{array}} \right]\left[ {\begin{array}{*{20}{c}}{ - 1}&0&0\\0&{ - 1}&0\\0&0&1\end{array}} \right] = \left[ {\begin{array}{*{20}{c}}1&0&0\\0&1&0\\0&0&1\end{array}} \right]$$<br />In other words, ${i^2}{j^2}{k^2} = 1$. Additionally you may have observed while crunching the numbers above that ${i^2}{j^2} = {k^2}$.<br /><br />This may give you an idea*. Here's another thing that may give you an idea: the reason you had $i^2=-1$ with complex numbers was that $i$ rotated <i>all</i> the axes in the plane. By contrast, $i,j,k$ only each rotate two of the three axes in 3-dimensional space.<br /><br />*the idea being that perhaps combinations of two rotations can give us more interesting results<br /><br />Well, how do you solve this problem? How do you create a rotation that "rotates all the axes"? Seemingly, you can't. Sure, you can define a rotation that rotates all three of the x, y and z axes, but that would still leave some other axis invariant, which we call "the axis of rotation". <b>Can we define a rotation that leaves no axis invariant?</b><br /><b><br /></b> In three dimensions, the answer is no. Any rotation leaves one axis invariant, and trying to rotate this axis requires rotating it with another axis, and the resulting product rotation still leaves some, calculable axis invariant.<br /><br /><div class="twn-furtherinsight">Calculate this axis.</div><br />The key is to extend our thinking to <i>four</i> dimensions. Here, you can have pairs of rotations <i>acting simultaneously</i> on two different pairs of axes. 
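Both observations -- that $i^2j^2k^2 = 1$ and $i^2j^2 = k^2$ for the naive three-dimensional $i, j, k$, and that a simultaneous pair of rotations in four dimensions leaves no axis invariant -- are easy to check by brute force. A sketch with numpy; the matrices are just the rotations described above, nothing more:

```python
import numpy as np

# the "naive" 3D units from above: i = yz-rotation, j = zx-rotation, k = xy-rotation
i = np.array([[1., 0, 0], [0, 0, -1], [0, 1, 0]])
j = np.array([[0., 0, 1], [0, 1, 0], [-1, 0, 0]])
k = np.array([[0., -1, 0], [1, 0, 0], [0, 0, 1]])

assert np.allclose(i @ i @ (j @ j) @ (k @ k), np.eye(3))  # i²j²k² = 1
assert np.allclose(i @ i @ (j @ j), k @ k)                # i²j² = k²
assert not np.allclose(i @ j, k)                          # but ij ≠ k: no nice group

# every 3D rotation has eigenvalue 1 -- an invariant axis:
assert np.isclose(np.linalg.eigvals(k), 1).any()

# a 4D "double rotation" (90° in the t-x plane and, simultaneously, 90° in the
# y-z plane) has eigenvalues ±i only -- no axis is left invariant:
R4 = np.array([[0., -1, 0, 0],
               [1, 0, 0, 0],
               [0, 0, 0, -1],
               [0, 0, 1, 0]])
assert not np.isclose(np.linalg.eigvals(R4), 1).any()
```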
Since the two planes of rotation together span all of four-dimensional space, all four axes are transformed.<br /><br />Now, the obvious thing to do here may be to define an imaginary number for each pair of rotations in four dimensions -- there are $\left( {\begin{array}{*{20}{c}}4\\2\end{array}} \right)=6$ rotations, and $\left( {\begin{array}{*{20}{c}}6\\2\end{array}} \right) = 15$ such pairs. But this would be too many "basis rotations", and the rotations would not be independent of each other, since rotations in 4 dimensions can be described with only 6 basis rotations.<br /><br />So how could we make use of our idea of using pairs of rotations as our basis for describing rotations?<br /><br />The key is to make one of our four axes "special" -- call this axis $t$, and the other three axes $x, y, z$. Instead of considering all 15 rotation-pairs, we only consider the following three:<br /><br />$$\begin{array}{l}i = (tx,yz)\\j = (ty,\overline{xz})\\k = (tz,xy)\end{array}$$<br /><div class="twn-furtherinsight">This is not the only possible representation of the quaternions, of course. Even among complex numbers, you have two possible representations -- you could make $i$ a counter-clockwise rotation, as is conventional, or a clockwise one, i.e. there is a symmetry between $i$ and $-i$. For quaternions, it turns out there are 48 different possible representations -- prove this.</div><br />Where $tx$ represents a rotation that sends $t$ to $x$ (i.e. a counter-clockwise rotation on a plane where $t$ is the x-axis and $x$ is the y-axis) and $\overline{xz}$ represents a rotation that sends $z$ to $x$, i.e. 
the clockwise rotation on a plane where $x$ is the x-axis and $z$ is the y-axis.<br /><br />It turns out that these pairs -- called <i>quaternions</i> -- in fact allow the representation of 3-dimensional rotations, since you need only a $\left( {\begin{array}{*{20}{c}}3\\2\end{array}} \right)=3$-dimensional basis to represent rotations in 3 dimensions.<br /><br /><div class="twn-furtherinsight">Think: Are there any other dimensions that allow such a system to be defined? Can you have, e.g. "hexternions"?</div><br /><div class="twn-pitfall">Note that however tempting it may seem, there is no known natural description of special relativity in terms of quaternions. Sorry.</div><br />One may work through the algebra of these new quaternions by tracking the position of each axis through the multiplication, and as it turns out, it is indeed much more elegant than the more obvious representation detailed earlier:<br /><br />$$\begin{array}{l}ij = k,\,jk = i,\,ki = j\\{i^2} = {j^2} = {k^2} = - 1\\ijk = - 1\end{array}$$<br />In the next several articles, we will look at exactly how 3-dimensional rotations can be represented with quaternions, the relation between quaternions and the dot and cross products through the commutative and anti-commutative parts, and further extensions of the quaternions to higher dimensions.complex algebracomplex numberscross productgroup theorylinear algebramathematicsquaternionsFri, 21 Sep 2018 16:41:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-6898563123316782451Abhimanyu Pallavi Sudhir2018-09-21T16:41:00ZAnswer by Abhimanyu Pallavi Sudhir for If force is a vector, then why is pressure a scalar?
https://physics.stackexchange.com/questions/429998/if-force-is-a-vector-then-why-is-pressure-a-scalar/430008#430008
4<p>Pressure is a scalar because it does not behave as a vector -- specifically, you can't take the "components" of pressure and take their Pythagorean sum to obtain its magnitude. Instead, pressure is actually the <em>mean</em> of the components, <span class="math-container">$(P_x+P_y+P_z)/3$</span>.</p>
<p>The way to understand pressure is in terms of the stress tensor: pressure is one-third the trace of the stress tensor. Once you understand this, the question becomes equivalent to questions like "why is the dot product a scalar?" (trace of the tensor product), "why is the divergence of a vector field a scalar?" (trace of the tensor derivative), etc. </p>
<p>There is no physical significance to taking the diagonal components of a tensor and putting them in a vector -- there <em>is</em> a physical significance to adding them up, and the invariance properties of the result tell you that it is a scalar.</p>
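The invariance is easy to demonstrate numerically -- a sketch with a randomly chosen symmetric stress tensor and a random rotation (the variable names are of course just illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

A = rng.standard_normal((3, 3))
stress = (A + A.T) / 2                             # a random symmetric stress tensor
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # a random orthogonal matrix
rotated = Q @ stress @ Q.T                         # the same tensor in a rotated frame

# the individual diagonal "components" change with the frame...
assert not np.allclose(np.diag(rotated), np.diag(stress))
# ...but their sum (and hence the pressure, trace/3) does not:
assert np.isclose(np.trace(rotated), np.trace(stress))
```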
<p>See also: <a href="https://physics.stackexchange.com/questions/186045/why-do-we-need-both-dot-product-and-cross-product/419873#419873">Why do we need both dot product and cross product?</a></p>Fri, 21 Sep 2018 08:57:17 GMThttps://physics.stackexchange.com/questions/429998/-/430008#430008Abhimanyu Pallavi Sudhir2018-09-21T08:57:17ZComment by Abhimanyu Pallavi Sudhir on Can any element be a metal?
https://physics.stackexchange.com/questions/428748/can-any-element-be-a-metal
Hydrogen is special because it's the only member of its period and has a half-filled valence shell, so it can either lose or gain electrons. This is not the case with all non-metals.Fri, 14 Sep 2018 15:32:33 GMThttps://physics.stackexchange.com/questions/428748/can-any-element-be-a-metal?cid=962154Abhimanyu Pallavi Sudhir2018-09-14T15:32:33ZComment by Abhimanyu Pallavi Sudhir on How can the solutions to equations of motion be unique if it seems the same state can be arrived at through different histories?
https://physics.stackexchange.com/questions/426445/how-can-the-solutions-to-equations-of-motion-be-unique-if-it-seems-the-same-stat/426453#426453
@JohnRennie How does it not? The point is that even something as simple as a Taylor series does not let you predict the function based on just the value of the function at present. If you have an objection to the answer, phrase it precisely; please don't copy and paste standard templates.Mon, 03 Sep 2018 19:03:21 GMThttps://physics.stackexchange.com/questions/426445/how-can-the-solutions-to-equations-of-motion-be-unique-if-it-seems-the-same-stat/426453?cid=956802#426453Abhimanyu Pallavi Sudhir2018-09-03T19:03:21ZComment by Abhimanyu Pallavi Sudhir on How can the solutions to equations of motion be unique if it seems the same state can be arrived at through different histories?
https://physics.stackexchange.com/questions/426445/how-can-the-solutions-to-equations-of-motion-be-unique-if-it-seems-the-same-stat/426453#426453
@PrakharGupta You don't really need that. You can always take a cross-section of the function across the t-axis.Mon, 03 Sep 2018 19:01:16 GMThttps://physics.stackexchange.com/questions/426445/how-can-the-solutions-to-equations-of-motion-be-unique-if-it-seems-the-same-stat/426453?cid=956800#426453Abhimanyu Pallavi Sudhir2018-09-03T19:01:16ZAnswer by Abhimanyu Pallavi Sudhir for How can the solutions to equations of motion be unique if it seems the same state can be arrived at through different histories?
https://physics.stackexchange.com/questions/426445/how-can-the-solutions-to-equations-of-motion-be-unique-if-it-seems-the-same-stat/426453#426453
1<p>"The jar is empty at present" just tells you $f(0)$. You also need $f'(0)$, $f''(0)$, etc.</p>Mon, 03 Sep 2018 09:46:25 GMThttps://physics.stackexchange.com/questions/426445/-/426453#426453Abhimanyu Pallavi Sudhir2018-09-03T09:46:25Zhtaccess – Deny all access to files?
https://stackoverflow.com/questions/52136524/htaccess-deny-all-access-to-files
0<p>I am using <a href="http://www.htaccesseditor.com/en.shtml" rel="nofollow noreferrer">this</a> .htaccess editor, which tells me that the following code to "Deny all access to files" is "strongly recommended" (and apparently removing the minus from "-Indexes" allows access):</p>
<pre><code><Files ~ "^\.(htaccess|htpasswd)$">
deny from all
</Files>
Options -Indexes
order deny,allow
</code></pre>
<p>I'm new to .htaccess files, and honestly don't really know what this code does (I haven't yet set up my web server to try things out for myself), but the first line seems to suggest it only denies access to the htaccess and htpasswd files. </p>
<p><strong>Does it only deny access to the htaccess file or to literally "all files" as the description states?</strong> </p>
<p>Obviously, I don't want the latter -- in fact, I don't want to deny access to any file except if there are security concerns.</p>.htaccessSun, 02 Sep 2018 11:23:48 GMThttps://stackoverflow.com/q/52136524Abhimanyu Pallavi Sudhir2018-09-02T11:23:48ZWhy are negative temperatures hot?
https://thewindingnumber.blogspot.com/2018/08/why-are-negative-temperatures-hot.html
0You've probably heard the statement "negative temperatures are <i>hot</i>!", referring of course to negative absolute temperatures.<br /><br />But why are they hot? Well, a common explanation is that it's not really the temperature $T$ that is the fundamental quantity, but rather the statistical beta, or "coldness" $\beta=1/T$. Negative temperatures have <i>negative</i> coldness, which makes them hotter than anything with positive temperature, since even the hottest positive temperature still gives a small but positive coldness. So the fact that negative temperatures are hot comes down to the fact that $1/x$ is not decreasing everywhere, due to its discontinuity.<br /><br />But why? Why is $\beta$ the fundamental quantity? Why should we arbitrarily consider this to be our metric of hotness and coldness, and not $T$?<br /><br />This is a really interesting example to teach people to think in a positivist way in physics, and to operationalise things. What does it mean for something to be hot?<br /><br />Well, you touch it and you say "Ouch!"<br /><br />Seriously, that's all there is -- if you touch something hot, you say "Ouch!"; if you touch something cold, you say "Whee!", or something. That's the fundamental, positivist definition of hotness -- "Does it feel hot?"<br /><br />Well, why would something feel hot? <i>Because it transfers heat to you</i>. And this is our operational, positivistic definition of hotness -- if one body transfers heat to another body, it is said to be hotter than the other body.<br /><br />So we need a criterion to decide the direction of heat flow between two bodies. In the past, you've probably taken for granted that heat is transferred from the body with higher temperature to the one with lower temperature, but that's just a crappy high-school definition. What really causes heat diffusion?
Well, when there are a lot of fast-moving particles in one place and slow-moving particles in another, it turns out that a state where the particles are more uniformly spread-out is more likely to happen in the future. This is just the requirement that entropy must increase -- it's the second law of thermodynamics.<br /><br />So if we have body 1 with temperature $T_1$ and body 2 with temperature $T_2$, with heat flow $\Delta Q$ from body 1 to body 2, then the second law of thermodynamics is stated as:<br /><br />$$\Delta S_1+\Delta S_2>0$$<br />$$-\Delta Q/T_1+\Delta Q/T_2>0$$<br />$$\Delta Q\left(\frac1{T_2}-\frac1{T_1}\right)>0$$<br />In other words -- if $\Delta Q>0$, i.e. if the heat flow is really from body 1 to body 2, then we require $1/T_2>1/T_1$, and if the heat flow is from body 2 to body 1 ($\Delta Q<0$), we require $1/T_1>1/T_2$.<br /><br />And there you have it! Heat does <i>not</i> flow from the body with higher temperature to the body with lower temperature -- it flows from the body with lower $1/T$ to the body with higher $1/T$. For positive temperatures, these are the same thing -- but negative temperatures have the lowest $1/T$, and are thus hotter.<br /><br /><hr /><br />So those of you who want the U.S. to switch to Celsius, or those who report temperatures in Kelvin for no good reason except intellectual signalling... perhaps start reporting <i>statistical betas</i> in 1/Kelvins instead.<br /><br />...<br />"Hey, Alexa, is it chilly outside?"<br />"The coldness in your area is 0.00375 anti-Kelvin."<br /><br />"...I think I'll just risk freezing to death."entropymetric systemnegative temperaturesphysicsstatistical betastatistical physicsthermal physicsthermodynamicsSat, 18 Aug 2018 19:17:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-3164462372727249499Abhimanyu Pallavi Sudhir2018-08-18T19:17:00ZComment by Abhimanyu Pallavi Sudhir on Why is velocity defined as 4-vector in relativity?
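The sign argument above is easy to operationalise in code. Here's a minimal Python sketch (the function name and the sample temperatures are mine, purely illustrative) that decides the direction of heat flow by comparing coldnesses $1/T$:

```python
def heat_flows_from_1_to_2(T1, T2):
    """Heat flows from body 1 to body 2 iff body 2 is 'colder':
    beta_2 = 1/T2 exceeds beta_1 = 1/T1 (the second-law argument above)."""
    return 1 / T2 > 1 / T1

# Ordinary case: the hotter positive temperature gives up heat.
assert heat_flows_from_1_to_2(400, 300)        # 1/300 > 1/400

# A negative temperature vs any positive temperature:
# beta = 1/(-100) = -0.01 is lower than every positive beta,
# so the negative-temperature body is the hotter one.
assert heat_flows_from_1_to_2(-100, 1e9)
assert not heat_flows_from_1_to_2(1e9, -100)
```

Note how the negative-temperature case needs no special handling at all -- that's the whole point of thinking in terms of $\beta$ rather than $T$.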
https://physics.stackexchange.com/questions/423360/why-is-velocity-defined-as-4-vector-in-relativity/423364#423364
@AccidentalFourierTransform No, it is not -- like I said, you can take simple examples, like $\mu,\nu=0$, or $\mu=i,\,\nu=0$, or $\mu=0,\,\nu=i$ -- these give $m$, $mv$ and $m/v$ respectively, none of which are conserved.Sat, 18 Aug 2018 18:02:09 GMThttps://physics.stackexchange.com/questions/423360/why-is-velocity-defined-as-4-vector-in-relativity/423364?cid=949581#423364Abhimanyu Pallavi Sudhir2018-08-18T18:02:09ZAnswer by Abhimanyu Pallavi Sudhir for From the speed of light being an invariant to being the maximum possible speed
https://physics.stackexchange.com/questions/331119/from-the-speed-of-light-being-an-invariant-to-being-the-maximum-possible-speed/423423#423423
0<p>A simple thought experiment does the trick -- consider a train moving faster than light, and it has headlights (it's a glass train). According to a stationary observer (stationary in a reference frame where the train is faster than light), the train must always be in front of the light, but according to an observer hanging out of the train, the light must be in front of him, since light speed is still $c$.</p>
<p>It might not seem like this frame-dependence of the ordering of the two objects is a problem, but it is -- say, for instance, the train is moving towards a high-tech wall programmed to do the following when switched ON:
(1) if hit by a train, make the world explode;
(2) if light is incident, switch OFF.
The wall is currently switched ON. According to one observer, the world explodes, whereas according to the other, it doesn't. This is an inconsistency.</p>
<p>Why wouldn't this argument apply to <em>any</em> speed and prohibit all motion? For example, why can't the wall be programmed to switch off only a certain amount of time after light is incident? Relativity says this is okay, because time intervals can dilate and rescale between reference frames.</p>
<p>But in order to make FTL speeds okay, you need to allow time to flip direction -- this is why the real condition is "to go faster than light, you must forgo causality", or simply, "locality = causality".</p>Sat, 18 Aug 2018 12:28:48 GMThttps://physics.stackexchange.com/questions/331119/-/423423#423423Abhimanyu Pallavi Sudhir2018-08-18T12:28:48ZAnswer by Abhimanyu Pallavi Sudhir for Link between Special relativity and Newtons gravitational law
https://physics.stackexchange.com/questions/123243/link-between-special-relativity-and-newtons-gravitational-law/423379#423379
0<p>Consider three theories:</p>
<p>$$L_A=1$$
$$L_B=1+h$$
$$L_C=1+h+h^2$$</p>
<p>Theory A is a limiting case of Theory C when $h$ is small; Theory B is also a limiting case of C when $h$ is small. Does this mean A and B are the same? Of course not.</p>
<p>This is not a perfect analogy, but it shows why this sort of reasoning breaks down.</p>Sat, 18 Aug 2018 07:13:36 GMThttps://physics.stackexchange.com/questions/123243/-/423379#423379Abhimanyu Pallavi Sudhir2018-08-18T07:13:36ZComment by Abhimanyu Pallavi Sudhir on Why is velocity defined as 4-vector in relativity?
https://physics.stackexchange.com/questions/423360/why-is-velocity-defined-as-4-vector-in-relativity/423364#423364
@GauthamAP The conservation argument is easy to see -- for conservation of a tensor to hold, each component must be conserved, including, e.g. $m\,dx^0/dx^0=m$. But mass <i>isn't</i> conserved in relativity. Neither is, for example $m\,dx^1/dx^0=mv$ (without the Lorentz factor).Sat, 18 Aug 2018 06:50:11 GMThttps://physics.stackexchange.com/questions/423360/why-is-velocity-defined-as-4-vector-in-relativity/423364?cid=949398#423364Abhimanyu Pallavi Sudhir2018-08-18T06:50:11ZAnswer by Abhimanyu Pallavi Sudhir for Why is velocity defined as 4-vector in relativity?
https://physics.stackexchange.com/questions/423360/why-is-velocity-defined-as-4-vector-in-relativity/423364#423364
5<p>"It should transform like a four-vector under a Lorentz transformation" is a generalisation of several intuitions you typically have regarding how natural objects/tensors should behave in special relativity -- an obvious one is "no special status to any individual dimension", since space and time are inherently symmetric. That $dx^\mu/dx^0$ doesn't transform like a four-vector is obvious from the fact that it gives special preference to time.</p>
<p>The conventional way to define four-velocity in relativity is as $dx^\mu/ds$. Your 2-tensor idea is cute -- it is similar to the angle tensor generalised to four-dimensions -- but it doesn't satisfy the uses we have of the standard four-velocity (e.g. how would the four-momentum be defined? $m\,dx^\mu/dx^\nu$? That wouldn't be conserved.)</p>Sat, 18 Aug 2018 06:11:14 GMThttps://physics.stackexchange.com/questions/423360/why-is-velocity-defined-as-4-vector-in-relativity/423364#423364Abhimanyu Pallavi Sudhir2018-08-18T06:11:14ZAnswer by Abhimanyu Pallavi Sudhir for Sheldon Cooper Primes
https://math.stackexchange.com/questions/1024969/sheldon-cooper-primes/2877436#2877436
1<p>I created a <a href="https://www.khanacademy.org/computer-programming/run-testssheldon-cooper-primes/5927601331011584" rel="nofollow noreferrer">script</a> you can play with to test this out. Note that the answer depends on your numerical base -- among all bases I've tried, 10 seems to be the <em>only</em> base in which there's a Sheldon Cooper prime. </p>
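<p>For reference, the defining property can be checked in a few lines. A minimal Python sketch (naive trial-division primality; all helper names are mine) verifying the base-10 example 73: it is the 21st prime, the product of its digits is 21, and its mirror 37 is the 12th prime, 12 being the mirror of 21:</p>

```python
def is_prime(n):
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def prime_index(p):
    """1-based position of p among the primes (p assumed prime)."""
    count, n = 0, 2
    while True:
        if is_prime(n):
            count += 1
            if n == p:
                return count
        n += 1

def digit_product(n):
    prod = 1
    while n:
        prod *= n % 10
        n //= 10
    return prod

def reverse_digits(n):
    return int(str(n)[::-1])

p = 73
idx = prime_index(p)                                # 21st prime
assert digit_product(p) == idx                      # 7 * 3 == 21
assert is_prime(reverse_digits(p))                  # 37 is prime
assert prime_index(reverse_digits(p)) == reverse_digits(idx)  # 37 is the 12th prime
```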
<p>Base 16 seems promising, however -- it has a large number of "special emirps", and actually provides primes with the appropriate product of digits, which very few bases provide. </p>
<p>Can someone try base = 16, convbase = 2 (and perhaps other bases in multiple tabs) with a large uppercap (e.g. 10,000,000) using fastcount = false? It would take ~15 hours for an upper cap of 10 million -- or just 90 minutes for an uppercap of 1 million -- but I can't leave my laptop on for so long (the fan is malfunctioning).</p>Thu, 09 Aug 2018 16:47:08 GMThttps://math.stackexchange.com/questions/1024969/-/2877436#2877436Abhimanyu Pallavi Sudhir2018-08-09T16:47:08ZComment by Abhimanyu Pallavi Sudhir on Does Wave-Particle duality exist at high speed?
https://physics.stackexchange.com/questions/421341/does-wave-particle-duality-exist-at-high-speed
Note that the question you should be asking is "at high momentum", not at "high speed". Particles traveling at the speed of light have zero mass, so the de Broglie wavelength doesn't become zero.Mon, 06 Aug 2018 06:27:52 GMThttps://physics.stackexchange.com/questions/421341/does-wave-particle-duality-exist-at-high-speed?cid=944238Abhimanyu Pallavi Sudhir2018-08-06T06:27:52ZComment by Abhimanyu Pallavi Sudhir on Perturbation theory
https://physics.stackexchange.com/questions/75422/perturbation-theory/75428#75428
@nielsnielsen Yes, this is correct. For example, you can get this complexity as a result of Feynman diagrams that store an infinite number of interactions, like in QCD, where gluons interact with themselves through the strong force.Wed, 01 Aug 2018 17:12:19 GMThttps://physics.stackexchange.com/questions/75422/perturbation-theory/75428?cid=942490#75428Abhimanyu Pallavi Sudhir2018-08-01T17:12:19ZComment by Abhimanyu Pallavi Sudhir on How can time be a dimension?
https://physics.stackexchange.com/questions/75625/how-can-time-be-a-dimension/75628#75628
The $c^2$ is redundant, it is an artefact of using different units for space and time for no good reason (except practicality).Sun, 29 Jul 2018 16:14:24 GMThttps://physics.stackexchange.com/questions/75625/how-can-time-be-a-dimension/75628?cid=941125#75628Abhimanyu Pallavi Sudhir2018-07-29T16:14:24ZA curious infinite sum arising from an elementary geometric argument
https://thewindingnumber.blogspot.com/2018/07/a-curious-infinite-sum-arising-from.html
0A well-known elementary geometric argument for the sum of an infinite geometric progression proceeds as follows: consider a Euclidean triangle $\Delta ABC$ with angles $A=\alpha$, $B=\beta$, $C=2\beta$ and bisect $C$ to create a point $C'$ on $AB$. Then $\Delta ABC \sim \Delta ACC'$. Record the area of $\Delta C'BC$ to a counter. Repeat the same bisection with $C'$, $C''$, ad infinitum, each time adding to the counter the area of the piece of the triangle that <i>isn't</i> similar to the parent triangle and bisecting the triangle that is.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://3.bp.blogspot.com/-b9Xt5bxIYFI/W11MmoHSv6I/AAAAAAAAFBc/a9Fv2xTtFYgM7SGtqDZRlV19sVA39ZoCgCLcBGAs/s1600/tribasic.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="129" data-original-width="230" src="https://3.bp.blogspot.com/-b9Xt5bxIYFI/W11MmoHSv6I/AAAAAAAAFBc/a9Fv2xTtFYgM7SGtqDZRlV19sVA39ZoCgCLcBGAs/s1600/tribasic.png" /></a></div><br />Suppose the area of the original triangle $\Delta ABC$ is 1, and the piece $ACC'$ has area $x$ (thus each succeeding similar copy has area a fraction of $x$ of the preceding triangle). Then the total value of our counter, which approaches 1, is:<br /><br />$$(1-x)+x(1-x)+x^2(1-x)+...=1$$<br />$$1+x+x^2+...=\frac1{1-x}$$<br />Where $x$ depends on the actual angle $\beta$.<br /><br />It is interesting, however, to consider the case of a general scalene triangle $\Delta ABC$ where $C$ is not necessarily twice of $B$. Here each successive triangle wouldn't be similar to the last, thus we won't be dealing with a geometric series.<br /><br />Let the angles of $\Delta ABC$ be $A=\alpha$, $B=\beta$,$C=\pi-\alpha-\beta$. We bisect angle $C$, as before, adding to our counter the piece that contains the angle $B$. 
The remaining triangle has angles $\alpha$, $\frac{\pi-\alpha-\beta}{2}$ and $\pi-\alpha-\frac{\pi-\alpha-\beta}{2}$.<br /><br />We keep repeating the process, each time bisecting the angle that is neither $\alpha$ nor the angle formed as half the angle that was just bisected, and adding to our counter the area of the piece that does not contain the angle $A$, while splitting the piece that does.<br /><br />To keep track of the angles in each successive triangle, we define three series:<br /><br />$$\begin{gathered}<br />{\alpha _n} = \alpha\\<br />{\beta _n} = {\gamma _{n - 1}}/2\\<br />{\gamma _n} = \pi - {\alpha _n} - {\beta _n}\\<br />\end{gathered}$$<br />These are defined recursively, of course, so we calculate the explicit form by substituting $\gamma_n$ into $\beta_n$ to get a recursion within $\beta_n$ -- then with the simple initial-value conditions $\alpha_0=\alpha$, $\beta_0=\beta$, etc. we get:<br /><br />$$\begin{gathered}<br />{\alpha _n} = \alpha\\<br />{\beta _n} = \frac{{\pi - \alpha }}{3} + {\left( { - \frac{1}{2}} \right)^n}\left( {\beta - \frac{{\pi - \alpha }}{3}} \right)\\<br />{\gamma _n} = \frac{{2(\pi - \alpha )}}{3} - {\left( { - \frac{1}{2}} \right)^n}\left( {\beta - \frac{{\pi - \alpha }}{3}} \right)\\<br />\end{gathered}$$<br />The area ratio of the piece we're keeping at each stage is $\frac{{\sin {\alpha _n}}}{{\sin {\alpha _n} + \sin {\beta _n}}}$, therefore the convergence of the sum of their areas to 1 implies:<br /><br />$$\begin{gathered}<br />\frac{{\sin \alpha }}{{\sin \alpha + \sin \beta }} + \frac{{\sin \beta }}{{\sin \alpha + \sin \beta }}\frac{{\sin \alpha }}{{\sin \alpha + \sin {\beta _1}}} \hfill \\<br />\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, + \frac{{\sin \beta }}{{\sin \alpha + \sin \beta }}\frac{{\sin {\beta _1}}}{{\sin \alpha + \sin {\beta _1}}}\frac{{\sin \alpha }}{{\sin \alpha + \sin {\beta _2}}} + ...
= 1 \hfill \\ <br />\end{gathered} $$<br />Or more compactly:<br /><br />$$\sum\limits_{k = 0}^\infty {\left[ \left(1-x_k(\alpha,\beta)\right)\prod\limits_{j = 0}^{k - 1} {x_j(\alpha,\beta)} \right]} = 1$$<br />Where:<br /><br />$${x_k}(\alpha ,\beta ) = \frac{{\sin \left( {\frac{{\pi - \alpha }}{3} + {{\left( { - \frac{1}{2}} \right)}^k}\left( {\beta - \frac{{\pi - \alpha }}{3}} \right)} \right)}}{{\sin \alpha + \sin \left( {\frac{{\pi - \alpha }}{3} + {{\left( { - \frac{1}{2}} \right)}^k}\left( {\beta - \frac{{\pi - \alpha }}{3}} \right)} \right)}}$$<br />For all values of $\alpha$ and $\beta$.<br /><br /><hr /><br />Well, have we truly discovered something new? <br /><br />Turns out, no. It doesn't even matter what $x_k(\alpha,\beta)$ is, really -- the identity $\sum\limits_{k = 0}^\infty {\left[ \left(1-x_k(\alpha,\beta)\right)\prod\limits_{j = 0}^{k - 1} {x_j(\alpha,\beta)} \right]} = 1$ will always be true. Indeed, it is a telescoping sum:<br /><br />$$\begin{gathered}<br />1 - {x_0} + \hfill \\<br />\left( {1 - {x_1}} \right){x_0} + \hfill \\<br />\left( {1 - {x_2}} \right){x_0}{x_1} + \hfill \\<br />\left( {1 - {x_3}} \right){x_0}{x_1}{x_2} + \hfill \\<br />... = 1 \hfill \\ <br />\end{gathered} $$<br />All that is required is that the final term, $x_0x_1x_2x_3...x_k$ approaches 0 as $k\to\infty$ -- <a href="https://thewindingnumber.blogspot.com/2018/07/intuition-to-convergence.html">this ensures sum convergence</a>. (So I suppose I was not completely right when I said it doesn't matter what $x_k$ is -- but considering renormalisation and stuff, I kinda was.)<br /><br />This raises two interesting questions:<br /><ol><li>How would this "telescoping sum" argument work for the simple geometric series?</li><li>Can we get interesting incorrect (? 
perhaps renormalisations) sums by choosing an $x_k$ sequence whose product doesn't approach zero?</li></ol><br />Well, for the geometric series we had $\beta = (\pi - \alpha )/3$ so that ${x_k}(\alpha ,\beta ) = x(\alpha,\beta)=\frac{{\sin \beta }}{{\sin \alpha + \sin \beta }}$. Indeed, one may confirm that setting $x_0=x_1=x_2=...$ yields the product of the geometric series and $1-x$, and that happens to be telescoping. This is really just our standard proof of the series, where we multiply the sum by $x$, subtract this from the original sum, etc.<br /><br />As for the second question -- consider, for example, $x_k=k+1$. It gives you the sum $1!\cdot1+2!\cdot2+3!\cdot3+...=-1$. Of course, this is just the identity $n\cdot n!=(n+1)!-n!$, and the telescope doesn't really cancel out so you're left with $\infty!-1$.blogdivergent sumsgeometric proofgeometric seriesinfinite seriesrenormalizationtelescoping sumSun, 29 Jul 2018 06:24:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-1357005204593986817Abhimanyu Pallavi Sudhir2018-07-29T06:24:00ZProbability of immortality for a transhuman being
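If you want to convince yourself of the telescoping identity numerically, here's a minimal Python sketch (the angle values $\alpha=0.7$, $\beta=1.1$ are arbitrary illustrative choices) computing partial sums of $\sum_k \left(1-x_k\right)\prod_{j &lt; k} x_j$ for the $x_k(\alpha,\beta)$ above:

```python
from math import sin, pi

def x_k(k, alpha, beta):
    # beta_k from the explicit solution of the recursion in the post
    beta_k = (pi - alpha) / 3 + (-0.5) ** k * (beta - (pi - alpha) / 3)
    return sin(beta_k) / (sin(alpha) + sin(beta_k))

def partial_sum(n, alpha, beta):
    """First n terms of sum_k (1 - x_k) * prod_{j<k} x_j."""
    total, prod = 0.0, 1.0
    for k in range(n):
        xk = x_k(k, alpha, beta)
        total += (1 - xk) * prod
        prod *= xk
    return total

# hypothetical scalene triangle: alpha = 0.7, beta = 1.1
s = partial_sum(60, 0.7, 1.1)
assert abs(s - 1) < 1e-9   # the telescoped remainder x_0...x_59 is tiny
```

Since each $x_k$ here is bounded away from 1, the running product dies off geometrically and the partial sums converge to 1, exactly as the telescoping argument predicts.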
https://thewindingnumber.blogspot.com/2018/07/probability-of-immortality-transhuman.html
0Some species in nature, including <i>Turritopsis dohrnii</i>, "the immortal jellyfish", are <i>biologically immortal</i>. This means that they do not die for biological reasons -- however, they obviously may die due to other physical reasons, like getting smashed with a hammer. If I asked you to calculate what the probability of a biologically immortal species being truly immortal -- i.e. of it <i>never dying</i> (ever) -- would be, what would you answer?<br /><br />Well, obviously the probability is zero. Provided there is any chance at all of the jellyfish getting squashed by a hammer this year, with a sufficient amount of time you can be as certain as you want -- the probability can be as close to 1 as you want -- that the jellyfish will get squashed by a hammer.<br /><br />But what if the probability of getting smashed by a hammer in that year was <i>decreasing</i> with time? Perhaps this is not the case with jellyfish, but it certainly would be true for, e.g. a transhuman society where technological innovation continually decreases the probability of dying (to be precise, the probability of dying in the next interval of time $\Delta t$, given that you haven't already died).<br /><br />Let $p(t)\Delta t$ be the probability of our transhuman dying between $t$ and $t+\Delta t$. Then the probability of the transhuman <i>never</i> dying any time from 0 to infinity is:<br /><br />\[\begin{gathered}<br /> P = \left( {1 - p(0)\Delta t} \right)\left( {1 - p(\Delta t)\Delta t} \right)\left( {1 - p(2\Delta t)\Delta t} \right)... \\<br /> = \coprod\limits_{t = 0}^\infty {\left( {1 - p(t)dt} \right)} \\<br />\end{gathered} \]<br />Of course, we need to take the limit as $\Delta t \to 0$.<br /><br />The reason this problem is so interesting is that it introduces the idea of <i>multiplicative calculus</i>. If the product had been a sum, the solution would've been utterly, ridiculously straightforward.
But since it's not, it's only really ridiculously straightforward. The natural way (no pun intended) to convert a product (we use the symbol \(\coprod {} \) to refer to the <i>multiplicative integral</i>) into a sum (or rather an integral) is to take the logarithm:<br /><br />\[\begin{gathered}<br />\ln P = \ln \left( {1 - p(0)\Delta t} \right) + \ln \left( {1 - p(\Delta t)\Delta t} \right) + \ln \left( {1 - p(2\Delta t)\Delta t} \right)... \\<br />= \int_0^\infty {\ln \left( {1 - p(t)dt} \right)} \\<br />\end{gathered} \]<br />This may look awkward to you -- and indeed, the standard form of the multiplicative integral typically has the $dt$ differential as the exponent of the integrand, so that taking the logarithm yields the additive integral in its standard form.<br /><br />But you might remember that<br /><br />\[\ln (1 - x) = - x - \frac{{{x^2}}}{2} - \frac{{{x^3}}}{3} - ...\]<br />Or to first-order in $x$ (since the "$x$" here, $p(t)dt$, approaches 0), $\ln (1 - x) \approx - x$. Thus:<br /><br />\[\ln P = - \int_0^\infty {p(t)dt} \]<br />Or:<br /><br />\[P = {e^{ - \int_0^\infty {p(t)dt} }}\]<br />Which is pretty neat! Interestingly, this means that if the integral of $p(t)$ diverges (e.g. if $p(t)\sim1/t$), you are <i>guaranteed</i> to eventually die. So this gives mankind a manual for how fast technological progress on this issue needs to be in the transhuman age to guarantee immortality. Internalise it in your demand, fellow robot!<br /><hr/><div class="twn-furtherinsight">Here, we've calculated the probability of <em>immortality</em>. The probability of eventual <em>mortality</em> is of course 1 minus this, but could also be calculated from the get go -- try this out. You'll get $P'=1-\int_0^\infty p(t) e^{-\int_0^t p(\tau) d\tau}dt$, which you can then simplify with a variable substitution.
Perhaps this gives you some insight into variable substitutions in integrals of this sort.</div>biological immortalityimmortalitymultiplicative calculusprobabilitytranshumanismSat, 28 Jul 2018 18:05:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-1083934993852776864Abhimanyu Pallavi Sudhir2018-07-28T18:05:00ZComment by Abhimanyu Pallavi Sudhir on Equation of everything
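The closed form $P=e^{-\int_0^\infty p(t)\,dt}$ can be sanity-checked against the discretised product directly. A minimal Python sketch, using the hypothetical hazard rate $p(t)=1/(1+t)^2$ (chosen because $\int_0^\infty p(t)\,dt = 1$ exactly, so the closed form predicts $P=e^{-1}$):

```python
from math import exp

def p(t):
    # hypothetical hazard rate; its integral over [0, infinity) is exactly 1
    return 1.0 / (1.0 + t) ** 2

def survival_product(T, dt):
    """Discretised probability prod (1 - p(t) dt) of surviving from 0 to T."""
    prob, t = 1.0, 0.0
    while t < T:
        prob *= 1.0 - p(t) * dt
        t += dt
    return prob

approx = survival_product(T=1000.0, dt=0.001)
exact = exp(-1.0)  # e^{-integral of p}, with the integral equal to 1
assert abs(approx - exact) < 1e-3
```

Since the integral converges, this transhuman has a genuinely nonzero ($e^{-1}\approx 0.37$) chance of never dying -- shrinking $dt$ and extending $T$ pushes the product ever closer to the closed form.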
https://physics.stackexchange.com/questions/77663/equation-of-everything/77722#77722
What the heck is this targeted downvoting whenever I edit my old answers? How does this not count as serial downvoting?Wed, 25 Jul 2018 19:01:57 GMThttps://physics.stackexchange.com/questions/77663/equation-of-everything/77722?cid=939694#77722Abhimanyu Pallavi Sudhir2018-07-25T19:01:57ZIntuition to convergence
https://thewindingnumber.blogspot.com/2018/07/intuition-to-convergence.html
2We've all seen these kinds of sums. You start with something obviously divergent, like:<br /><br />$$S = 1 + 2 + 4 + 8 + ...$$<br />And then apply standard manipulations on it to obtain a bizarrely finite result:<br /><br />$$\begin{gathered}<br /> \Rightarrow S = 1 + 2(1 + 2 + 4 + ...) \hfill \\<br /> \Rightarrow S = 1 + 2S \hfill \\<br /> \Rightarrow S = - 1 \hfill \\<br />\end{gathered} $$<br />How exactly is this result to be interpreted? Surely the definition of an infinite sum is as a limit of a finite sum as the upper limit increases without bound -- by this definition it would seem that $S$ evidently doesn't approach $-1$, it diverges to infinity. Is there, then, something wrong with the form of our argument? And if so, why does it seem to work for so many other sums, like convergent geometric progressions?<br /><br /><hr /><br />We'll get to all that in a moment, but first, let's talk about how to fold a tie into thirds. We know how to fold a tie -- or a strip of paper or a rope or whatever -- into halves, into quarters, into any power of two. But how would one fold it into thirds? Sure, we can approximate it by trial and error, but is there a more efficient algorithm to approximate it?<br /><br />Here's one way: start with some approximation to 1/3 of the tie -- any approximation, however good or bad. Now consider the rest of the tie (~2/3) and fold it in half. Take one of these halves -- <i>this is demonstrably a better approximation to 1/3 than your original</i>. In fact, the error in this approximation is exactly half the error in the original approximation. You can keep repeating this process, and approach an arbitrarily close value to 1/3.<br /><br />Why does it work? Well, it's obvious why it works. 
More interestingly, how could one have come up with this technique from scratch?<br /><br />The key insight here is that if you had started from exactly 1/3 and performed this algorithm, defined as $x_{n+1}=\frac12(1-x_n)$, the sequence would be constant -- it would be 1/3s all the way down.<br /><br />However, this is <i>not</i> a sufficient argument. For instance, here's another sequence of which 1/3 is a fixed point: the algorithm $x_{n+1}=1-2x_n$. Here, however, if you were to start with <i>any other number but 1/3</i>, the sequence would <i>not</i> approach 1/3, but rather diverge away. While 1/3 is still a fixed point, it is an <i>unstable</i> fixed point, whereas in the previous case it was a <i>stable</i> fixed point.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://i.imgur.com/U6M2g19.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="333" data-original-width="800" height="166" src="https://i.imgur.com/U6M2g19.png" width="400" /></a></div><br />But what exactly is wrong with extending the same argument to $x_{n+1}=1-2x_n$? Well, perhaps we should state the argument precisely in the case of $x_{n+1}=\frac12(1-x_n)$. The reason we know this converges to 1/3 regardless of the initial value is that 1/3 is the <i>only</i> value which stays the same under the algorithm (i.e. is a steady-state solution). Convergence of the sequence requires that the fluctuations get smaller, i.e. that the sequence approaches a value it doesn't fluctuate around -- a steady state.<br /><br />But this reveals our central assumption -- we <i>assumed</i> that the sequence is convergent at all!
If it is convergent, then 1/3 is the only value it could converge to, because convergence means approaching a steady state, and 1/3 is the only steady state.<br /><br /><hr /><br />The same principle applies to our original problem -- an infinite series is also a sequence, a sequence of partial sums. Our mistake is really in this step:<br /><br />$$\begin{gathered}<br /> ... \hfill \\<br /> 1 + 2(1 + 2 + 4 + 8 + ...) = 1 + 2S \hfill \\<br />\end{gathered} $$<br />By declaring that this is the same $S$, we have assumed that this sum really has a value. To be even clearer, consider this (taking $n\to\infty$):<br /><br />$$\begin{gathered}<br /> S = 1 + 2 + 4 + ... + {2^n} \hfill \\<br /> S = 1 + 2(1 + 2 + 4 + ... + {2^{n - 1}}) = 1 + 2S? \hfill \\<br />\end{gathered} $$<br />In other words, we assumed that $S$ reaches a steady state, that removing the last term $2^n$ wouldn't change the value of the summation. This would've been true if we were dealing with $(1/2)^n$ instead, because then the partial sum does reach a steady state, since its "derivative", $(1/2)^n$, approaches 0.<br /><br /><b>With that said,</b> the sum $1+2+4+8+...=-1$ (and other such surprising results) <i>can</i> in fact be correct. What we've proven here is that <i>if the sum converges, it converges to -1</i>. Otherwise, it's $2^\infty -1$. If you can construct an axiomatic system in which the sum does converge, where 0 behaves like $2^\infty$ in some specific sense, then the identity would be true. Such a system does in fact exist: it's called the 2-adic system.<br /><br /><hr /><br />You know, there is a sense in which you can understand the 2-adic system. When you take partial sums of $1+2+4+8+...$, you always get sums that are "1 less than a power of 2". $1+2+4+8+16=2^5-1$, for example -- what's the significance of $2^5$? Well, it's a number which 2 divides into 5 times. What's a number that 2 divides into an infinite number of times? Well, it's zero, and $0-1=-1$.
This might sound like a ridiculous argument, and indeed it is false in our conventional algebra system, but it is the foundation of the 2-adic system.<br /><br /><div class = "twn-furtherinsight">Explain similarly why $1+3+9+27+...=-1/2$ in the 3-adic system.</div><br /><hr /><br />The understanding of convergence we gained here -- from the tie example -- was pretty fantastic. It applies to all sorts of infinite sequences -- ordinary recurrences, infinite series (via their partial sums), continued fractions, etc. The idea of stable and unstable fixed points is a general one, and a very important one. Recommended watching:<br /><br /><div class="separator" style="clear: both; text-align: center;"><iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/CfW845LNObM/0.jpg" frameborder="0" height="266" src="https://www.youtube.com/embed/CfW845LNObM?feature=player_embedded" width="320"></iframe></div>calculusconvergencedivergent sumsinfinite sequencesinfinite seriesmathematicsrenormalizationSun, 22 Jul 2018 16:26:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-7661817356581994768Abhimanyu Pallavi Sudhir2018-07-22T16:26:00ZThe validity of Newton's three laws today
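The stable/unstable distinction from the tie example is easy to see numerically. A minimal Python sketch (starting guess arbitrary) iterating the tie-folding map $x_{n+1}=\frac12(1-x_n)$ against the map $x_{n+1}=1-2x_n$, which shares the fixed point 1/3 but repels from it:

```python
def iterate(f, x0, n):
    """Apply the map f to x0, n times."""
    x = x0
    for _ in range(n):
        x = f(x)
    return x

stable = lambda x: 0.5 * (1 - x)   # tie-folding map: error halves each step
unstable = lambda x: 1 - 2 * x     # same fixed point 1/3, but error doubles

x0 = 0.25  # hypothetical initial guess, deliberately off 1/3
assert abs(iterate(stable, x0, 50) - 1/3) < 1e-9    # converges to 1/3
assert abs(iterate(unstable, x0, 50) - 1/3) > 1.0   # blows up
```

Both maps fix 1/3, yet only the first converges to it from elsewhere -- which is exactly why "1/3 is the only steady state" alone is not a proof of convergence.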
https://thewindingnumber.blogspot.com/2018/07/the-validity-of-newtons-three-laws-today.html
0It's often said that Newtonian physics is outdated and its laws are in fact incorrect. That's true, but it's intriguing to think about in what way exactly Newton's three laws have been replaced or generalised in relativity.<br /><ol><li>There are two ways to think about the first law -- the first is "inertial reference frames exist". This is unchanged in special relativity, but general relativity generalises the notion with that of geodesics. The law as it is typically stated -- "stuff moves in straight lines on spacetime unless forced" is generalised to the geodesic equation, $\frac{{{d^2}{x^\mu }}}{{d{s^2}}} = - {\Gamma ^\mu }_{\alpha \beta }\frac{{d{x^\alpha }}}{{ds}}\frac{{d{x^\beta }}}{{ds}}$.</li><li>$F=dp/dt$ is generalised to $F=dp/d\tau$ in special relativity, and is replaced by a covariant derivative in general relativity. $F=ma$ has some weirder changes.</li><li>The third law is the conservation of momentum. This is replaced in General Relativity by the statement $\nabla^\mu T_{\mu\nu}=0$ ($\nabla$ instead of $\partial$).</li></ol><ol></ol>bloggeneral relativitynewton's lawsnewtonian mechanicsrelativityspecial relativitySun, 22 Jul 2018 05:24:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-7737004677109282224Abhimanyu Pallavi Sudhir2018-07-22T05:24:00ZWhat is calculus-based physics?
https://thewindingnumber.blogspot.com/2018/07/what-is-calculus-based-physics.html
0I dislike this whole “non-calculus physics”/”calculus physics” distinction created in schools, because it degrades mathematics to some kind of a weird tool used in physics.<br /><br />Physics is just the study of the mathematical stuff we do observe — every physical system is a mathematical system on a fundamental level, which for pedagogical purposes and stuff, we often approximate with other mathematical systems (e.g. modelling stuff as rigid bodies, not considering the motion of every single particle within an extended body, neglecting gravity in particle physics, etc.). So <i>of course</i> you will find math being “used” in physics, because physics is mathematics!<br /><br />Physics uses math in the same way that mathematics uses math — like how you “use” differentiability in defining lie groups, or how you “use” calculus and linear algebra in differential geometry, or how you “use” matrices in describing linear transformations, or whatever. Neither the physics, nor the mathematics should be classified or segregated by what mathematical methods, or “math” is used in describing or defining it.<br /><br />You shouldn’t divide physics as “calculus-based” and “non-calculus” for the same reason you don’t divide it into “partial fractions-based” and “non-partial fractions”, or “elementary algebraic” and “non-elementary algebra”, or at a little higher level, “differential geometry-based” and “non-differential geometry-based”.<br /><br />Use whatever tools you have to use! 
The point of physics is to describe what we observe — aka the universe — as efficiently and conveniently as possible, not to do elementary calculus.<br /><br />There are other, more sensible ways to divide physics: experimental, theoretical and phenomenological; “mathematical physics”, which is basically physics done with as much rigor as you find in the mathematics literature, so that you ensure everything you know about physics is consistent (the physics exists) and you know what your underlying assumptions/axioms/postulates (which you must verify empirically) are; or “symmetry-based physics” and “non-symmetry-based physics”, where the good physics is symmetry-based and the bad physics isn’t — but since Einstein, all physics is symmetry-based, so this last distinction is irrelevant today.blogcalculuseducationmath as a fieldmath as a toolphysicsphysics educationscience educationSun, 22 Jul 2018 05:11:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-7484346114034678201Abhimanyu Pallavi Sudhir2018-07-22T05:11:00ZNo, Einstein is not overrated
https://thewindingnumber.blogspot.com/2018/07/no-einstein-is-not-overrated.html
0No, this is ridiculous. Einstein is not overrated, and neither is Newton — the general public gets this issue completely right, because Newton and Einstein are remembered not for their direct contributions to physics, but the effect they had on <i>how physics is done</i>.<br /><br />Einstein’s impact in this regard can be compared only to Newton, who turned physics from a field of philosophy into a field of mathematics. There were a few good physicists in Ancient Greece, like Archimedes and Apollonius — and also similar folks in India, China and the Arab civilisation — but it was disorganised, and it got destroyed by the Romans, in the case of Greece. It was due to Newton that the good folks got taken seriously, and the idiots, like Aristotle, got discarded.<br /><br />Einstein had a very similar effect on physics, by forcing people to accept logical positivism. By force I don’t mean taking a gun to people’s heads and forcing them, or taxing people to fund pro-logical positivism posters or whatever, but you can’t do relativity without accepting logical positivism.<br /><br />If I remember correctly, this is done in the very first section of his 1905 paper on special relativity (read it — it’s remarkable, even if you don’t understand physics — you can read it either as a contemporary work or a historical one, which is very rare for any paper, even Newton’s Principia), where he rejects all meaningless babbling about “is it really ____ or do we just <i>see/feel/</i>… ____?” etc. You don’t need to actually read Carnap, because philosophy is a trivial field, and positivism can be learned in three simple sentences: observers do observe. agents should act. everything else is nonsense. But if you need more convincing, just search for “the elimination of metaphysics by the logical analysis of language”, and you’ll get it.<br /><br />Perhaps his most important contribution in this regard, though, was the establishment of symmetry laws as a defining pillar of physics. 
You can divide physics into “symmetry-based” and “non-symmetry-based”, and all physics from Einstein onwards is symmetry-based in some form or another. Until Einstein, symmetry was just a cool heuristic you derived from some physical laws — since Einstein, we accept some symmetries (or generalise them, e.g. in the case of Poincaré invariance in GR) and the theory gains its elegance from this. Emmy Noether is also crucial in this aspect, for Noether’s theorem.<br /><br />Einstein is also remembered because relativity, along with quantum mechanics, put the final nail in the coffin for elementary physical intuition. This is a role similar to the one Bertrand Russell played in the demise of naive intuition and the adoption of rigor in mathematics.<br /><br />The celebration of Einstein by the general public often seems like giggling over random factoids, like E = m, time dilation or there being a supremum possible speed, but it’s really just their subconscious telling them the above.<br /><br />blogeinsteingeneral relativitylogical positivismrelativityspecial relativitySun, 22 Jul 2018 05:10:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-6092579095080943472Abhimanyu Pallavi Sudhir2018-07-22T05:10:00ZWhy are calculus and linear algebra taught early?
https://thewindingnumber.blogspot.com/2018/07/why-are-calculus-and-linear-algebra.html
0Linear algebra and function theory are related — you can construct plenty of accurate analogies here, like functions and vectors, linear transforms and integral transforms, etc. In addition, the elementary techniques of calculus allow you to talk about non-linear transformations in a pretty nice manner — e.g. the Jacobian matrix as a change-of-basis matrix for non-linear co-ordinate transformations.<br /><br />In general, calculus is just a special case and a “constructivist” kind of way of understanding the much deeper mathematical field of analysis. The calculus of variations, basic complex analysis, matrix calculus, etc. are other examples of this. It’s taught, despite its non-fundamental nature, not only because it locally linearises things with infinitesimals, allowing us to study non-linear things, e.g. in differential geometry, but also because a lot of its results are special cases of purer results in advanced mathematics. Some elementary examples: the chain rule, a special case of a change-in-basis-variables/the Jacobian matrix; the fundamental theorem of calculus and Stokes’ theorem, special cases of the generalised Stokes’ theorem in differential geometry.<br /><br />Linear algebra is taught for similar reasons — it introduces you to a lot of things in algebra, much like how calculus introduces you to a lot of things in analysis. Together, they also introduce you to a lot of things in geometry — largely because the “ideas” behind the two allow us to describe a lot of things in a linear way — completing the algebra-analysis-geometry trinity.<br /><div><br /></div>blogcalculuseducationlinear algebramathematicsmathematics educationscience educationSun, 22 Jul 2018 05:09:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-1551319580487629464Abhimanyu Pallavi Sudhir2018-07-22T05:09:00ZComment by Abhimanyu Pallavi Sudhir on Are contravariant basis vectors and basis 1-forms identical?
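The "chain rule as a special case of the Jacobian" point can be checked numerically; a minimal sketch, with arbitrary illustrative functions of my own choosing: the gradient of a composite $g \circ f$ equals the gradient of $g$ multiplied by the Jacobian of $f$.

```python
import math

# Chain rule as Jacobian multiplication: for f: R^2 -> R^2 and g: R^2 -> R,
# grad(g o f) at p  =  (grad g at f(p)) . (Jacobian of f at p).

def f(x, y):
    return (x * y, x + math.sin(y))

def g(u, v):
    return u**2 + 3 * v

def numeric_grad(func, point, h=1e-6):
    """Central-difference gradient of a scalar function at a point."""
    out = []
    for i in range(len(point)):
        plus, minus = list(point), list(point)
        plus[i] += h
        minus[i] -= h
        out.append((func(*plus) - func(*minus)) / (2 * h))
    return out

p = (0.7, 1.2)
# Jacobian of f at p: rows are gradients of each component of f
J = [numeric_grad(lambda x, y: f(x, y)[k], p) for k in range(2)]
grad_g = numeric_grad(g, f(*p))
# chain rule: matrix product of grad_g with J
chain = [sum(grad_g[k] * J[k][i] for k in range(2)) for i in range(2)]
# direct numerical gradient of the composite, for comparison
direct = numeric_grad(lambda x, y: g(*f(x, y)), p)
```

The two gradients agree to numerical precision, which is the chain rule restated as a change of basis via the Jacobian.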
https://physics.stackexchange.com/questions/418149/are-contravariant-basis-vectors-and-basis-1-forms-identical/418281#418281
What's your point? You can define any term to mean anything, I don't see how this refutes anything I said. But this wasn't exactly the point of your original question.Fri, 20 Jul 2018 14:36:36 GMThttps://physics.stackexchange.com/questions/418149/are-contravariant-basis-vectors-and-basis-1-forms-identical/418281?cid=937543#418281Abhimanyu Pallavi Sudhir2018-07-20T14:36:36ZComment by Abhimanyu Pallavi Sudhir on Are contravariant basis vectors and basis 1-forms identical?
https://physics.stackexchange.com/questions/418149/are-contravariant-basis-vectors-and-basis-1-forms-identical/418281#418281
OF COURSE they are vectors in the abstract mathematical sense, in this sense tensors are also vectors. But "vector" in the sense of relativity typically means a (1, 0) tensor.Fri, 20 Jul 2018 12:03:53 GMThttps://physics.stackexchange.com/questions/418149/are-contravariant-basis-vectors-and-basis-1-forms-identical/418281?cid=937491#418281Abhimanyu Pallavi Sudhir2018-07-20T12:03:53ZComment by Abhimanyu Pallavi Sudhir on Are contravariant basis vectors and basis 1-forms identical?
https://physics.stackexchange.com/questions/418149/are-contravariant-basis-vectors-and-basis-1-forms-identical/418281#418281
Basis vectors and basis one-forms are just the bases for vectors and one-forms respectively. It's not at all uncommon to see vectors and one-forms interact, and they indeed interact differently when the metric isn't Euclidean.Fri, 20 Jul 2018 03:41:35 GMThttps://physics.stackexchange.com/questions/418149/are-contravariant-basis-vectors-and-basis-1-forms-identical/418281?cid=937395#418281Abhimanyu Pallavi Sudhir2018-07-20T03:41:35ZComment by Abhimanyu Pallavi Sudhir on Ubuntu 17.04 Chromium Browser quietly provides full access to Google account
https://askubuntu.com/questions/915556/ubuntu-17-04-chromium-browser-quietly-provides-full-access-to-google-account
Me too. This is weird. Even if it's just the Chrome browser, I don't see why they'd need <i>full</i> access to my Google account. Windows doesn't do this.Sat, 14 Jul 2018 17:11:46 GMThttps://askubuntu.com/questions/915556/ubuntu-17-04-chromium-browser-quietly-provides-full-access-to-google-account?cid=1726608Abhimanyu Pallavi Sudhir2018-07-14T17:11:46ZComment by Abhimanyu Pallavi Sudhir on How to create folder shortcut in Ubuntu 14.04?
https://askubuntu.com/questions/486461/how-to-create-folder-shortcut-in-ubuntu-14-04/691976#691976
@jave.web Yes -- use the application menu (either at the top left of your screen or a colourful icon next to the window controls) to go to your Nautilus preferences, then under "Behavior" enable link creation.Fri, 13 Jul 2018 11:49:02 GMThttps://askubuntu.com/questions/486461/how-to-create-folder-shortcut-in-ubuntu-14-04/691976?cid=1724793#691976Abhimanyu Pallavi Sudhir2018-07-13T11:49:02ZComment by Abhimanyu Pallavi Sudhir on Minimum $e$ where $a,b,c,d,e$ are reals such that $a+b+c+d+e=8$ and $a^2+b^2+c^2+d^2+e^2=16$
https://math.stackexchange.com/questions/2841325/minimum-e-where-a-b-c-d-e-are-reals-such-that-abcde-8-and-a2b2c2
How does finding a higher maximum value amount to a sharper inequality? If anything, it's the opposite.Thu, 05 Jul 2018 09:39:34 GMThttps://math.stackexchange.com/questions/2841325/minimum-e-where-a-b-c-d-e-are-reals-such-that-abcde-8-and-a2b2c2?cid=5860055Abhimanyu Pallavi Sudhir2018-07-05T09:39:34ZComment by Abhimanyu Pallavi Sudhir on Solving the infinite radical $\sqrt{6+\sqrt{6+2\sqrt{6+3\sqrt{6+...}}}}$
https://math.stackexchange.com/questions/2839527/solving-the-infinite-radical-sqrt6-sqrt62-sqrt63-sqrt6/2839607#2839607
Yes, I know, but there you would actually have a linear function. Here it's only an approximation.Tue, 03 Jul 2018 16:33:44 GMThttps://math.stackexchange.com/questions/2839527/solving-the-infinite-radical-sqrt6-sqrt62-sqrt63-sqrt6/2839607?cid=5856107#2839607Abhimanyu Pallavi Sudhir2018-07-03T16:33:44ZComment by Abhimanyu Pallavi Sudhir on Solving the infinite radical $\sqrt{6+\sqrt{6+2\sqrt{6+3\sqrt{6+...}}}}$
https://math.stackexchange.com/questions/2839527/solving-the-infinite-radical-sqrt6-sqrt62-sqrt63-sqrt6
@user90369 I ran a <a href="https://www.khanacademy.org/computer-programming/run-tests/5147953190961152" rel="nofollow noreferrer">quick program</a> to check it out (and the value it gives is right to all the given decimal places) -- unfortunately, yours goes wrong at the third decimal place. It's definitely not $\sqrt{10}$, though.Tue, 03 Jul 2018 16:14:45 GMThttps://math.stackexchange.com/questions/2839527/solving-the-infinite-radical-sqrt6-sqrt62-sqrt63-sqrt6?cid=5856049Abhimanyu Pallavi Sudhir2018-07-03T16:14:45ZComment by Abhimanyu Pallavi Sudhir on Solving the infinite radical $\sqrt{6+\sqrt{6+2\sqrt{6+3\sqrt{6+...}}}}$
https://math.stackexchange.com/questions/2839527/solving-the-infinite-radical-sqrt6-sqrt62-sqrt63-sqrt6/2839607#2839607
I know you can do this, but this is just an approximation, and isn't particularly more useful than just cutting off the nesting at $6+100\sqrt{6}$, or something like that.Tue, 03 Jul 2018 16:05:51 GMThttps://math.stackexchange.com/questions/2839527/solving-the-infinite-radical-sqrt6-sqrt62-sqrt63-sqrt6/2839607?cid=5856009#2839607Abhimanyu Pallavi Sudhir2018-07-03T16:05:51ZComment by Abhimanyu Pallavi Sudhir on Solving the infinite radical $\sqrt{6+\sqrt{6+2\sqrt{6+3\sqrt{6+...}}}}$
https://math.stackexchange.com/questions/2839527/solving-the-infinite-radical-sqrt6-sqrt62-sqrt63-sqrt6/2839651#2839651
This is evidently incorrect -- to use that expression, you need $ax+(n+a)^2=6$, $a(x+n)+(n+a)^2=6$, $a(x+2n)+(n+a)^2=6$, etc. -- only the first of these is true when you set $a=1$ (or any non-zero number). The infinite radical you are evaluating is $\sqrt {6 + 2\sqrt {7 + 3\sqrt {8 + 4\sqrt {9...} } } }$ ... This is why the question is tougher than Ramanujan's original problem, $a$ must equal zero for the term "6" to remain constant, but that isn't possible.Tue, 03 Jul 2018 16:04:45 GMThttps://math.stackexchange.com/questions/2839527/solving-the-infinite-radical-sqrt6-sqrt62-sqrt63-sqrt6/2839651?cid=5856007#2839651Abhimanyu Pallavi Sudhir2018-07-03T16:04:45ZSolving the infinite radical $\sqrt{6+\sqrt{6+2\sqrt{6+3\sqrt{6+...}}}}$
https://math.stackexchange.com/questions/2839527/solving-the-infinite-radical-sqrt6-sqrt62-sqrt63-sqrt6
20<p>$$\sqrt{6+\sqrt{6+2\sqrt{6+3\sqrt{6+\cdots}}}}$$</p>
<p>This is a modification on the well-known Ramanujan infinite radical, $\sqrt{1+\sqrt{1+2\sqrt{1+3\sqrt{1+\cdots}}}}$, except it cannot be solved by the conventional method -- the functional equation $F(x)^2=ax+(n+a)^2+xF(x+n)$, since setting $n=1$ with $a=0$ requires having $(n+a)^2=1$, not $6$.</p>
<p>Here are some alternative methods I've tried:</p>
<ul>
<li>The functional equation we have instead for this infinite radical is $F(x)^2=6+xF(x+1)$. I've tried to solve this, but unfortunately it's easy to demonstrate that $F(x)$ cannot be a simple linear function $F(x)=ax+b$. I've tried some slightly more complicated versions -- the equation for a hyperbola, etc. -- but nothing seems to work.</li>
<li>I've tried factoring stuff out from the radical to bring it to a more tenable form. Perhaps not a satisfactorily rigorous approach, I thought of factoring out $\sqrt{6^{N/2}}$ where $N\to\infty$, which allows us to transform the radical into $6^{-N/2}\sqrt{6^{N+1}+\sqrt{6^{2N+1}+2\sqrt{6^{4N+1}+\cdots}}}$, which can be treated as having each term a power of $6^{N/2}$ in the limit. For a radical of the form $\sqrt{\alpha^2+\sqrt{\alpha^4+2\sqrt{\alpha^8+\cdots}}}$ we have the functional equation $F(x)^2=\alpha^{2^x}+xF(x+1)$, or upon letting $F(x)=\alpha^{2^x}p(x)$, you get $p(x)^2-xp(x+1)=\alpha^{-2^x}$, but I'm stuck there.</li>
<li>Similarly, I tried factoring out some arbitrary $N$ then factoring out a term from each radical inside such that the coefficients go from being $1,2,3,\cdots$ to a constant $1/N,1/N,1/N...$, transforming the radical into $N\sqrt{\frac6{N^2}+\frac1N\sqrt{\frac6{N^2}+\frac1N\sqrt{\frac{24}{N^2}+\frac1N\sqrt{\frac{864}{N^2}+\frac1N\sqrt{\frac{1990656}{N^2}+\cdots}}}}}$ where the added terms go as $k_1=6$, $k_{n+1}=\frac{n^2}6k_n^2$. But how might one proceed?</li>
<li>I considered differentiating the function $G(x)=\sqrt{x+\sqrt{x+2\sqrt{x+3\sqrt{x+\cdots}}}}$. But all I got was an equally weird differential equation:</li>
</ul>
<p>$$\frac{df}{dx}=\frac{1+\frac{1+\frac{1+\frac{{\mathinner{\mkern2mu\raise1pt\hbox{.}\mkern2mu \raise4pt\hbox{.}\mkern2mu\raise7pt\hbox{.}\mkern1mu}}}{\frac23\frac{\left(\frac{\left(f(x)^2-x\right)^2-x}{2}\right)^2-x}{3}}}{\frac22\frac{\left(f(x)^2-x\right)^2-x}{2}}}{\frac21\left(f(x)^2-x\right)}}{2f(x)}$$</p>
<p>Any ideas as to how I might proceed?/Any alternative (hopefully less tedious, but regardless) methods that might work?</p>
<hr>
<p>I created a <a href="https://www.khanacademy.org/computer-programming/run-tests/5147953190961152" rel="nofollow noreferrer">small program</a> to play with this. The exact answer (perhaps as an infinite series) <em>may</em> contain $\sqrt{6}+1/2+...$ somewhere in it, because as you increase the number replacing 6, the radical approaches $\sqrt{x}+1/2$. Of course, this term just comes from the binomial series for $\sqrt{6+\sqrt{6}}$.</p>
<p>I also got nothing on the inverse symbolic calculator.</p>functional-equationsnested-radicalsTue, 03 Jul 2018 11:34:34 GMThttps://math.stackexchange.com/q/2839527Abhimanyu Pallavi Sudhir2018-07-03T11:34:34ZComment by Abhimanyu Pallavi Sudhir on How to customize (add/remove folders/directories) the "Places" menu of Ubuntu 13.04 "Files" application?
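For reference, the numeric check mentioned in the question can be re-created in a few lines (this is my own re-creation, not the linked Khan Academy program; the truncation scheme is my choice):

```python
import math

def nested_radical(depth):
    """Evaluate sqrt(6 + 1*sqrt(6 + 2*sqrt(6 + ... + depth*sqrt(6)))),
    working from the innermost radical outwards."""
    t = math.sqrt(6)
    for k in range(depth, 0, -1):
        t = math.sqrt(6 + k * t)
    return t

# Successive truncations agree to many decimal places, so the radical
# converges numerically; the value is about 3.15, somewhat above
# sqrt(6) + 1/2 ~ 2.949.
print(nested_radical(60))
```

The error introduced at depth $d$ roughly halves at each outer level (each level multiplies it by about $k/(2F(k)) \approx 1/2$), which is why the truncations converge so quickly.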
https://askubuntu.com/questions/285313/how-to-customize-add-remove-folders-directories-the-places-menu-of-ubuntu-13/292727#292727
This works. If you also want to remove the folders from the home directory, edit user-dirs.defaults, or make a copy of it in .config and edit there (for your local user).Sat, 30 Jun 2018 09:00:13 GMThttps://askubuntu.com/questions/285313/how-to-customize-add-remove-folders-directories-the-places-menu-of-ubuntu-13/292727?cid=1716388#292727Abhimanyu Pallavi Sudhir2018-06-30T09:00:13ZComment by Abhimanyu Pallavi Sudhir on How to safely remove default folders?
https://askubuntu.com/questions/140148/how-to-safely-remove-default-folders/140964#140964
See <a href="https://askubuntu.com/questions/285313/how-to-customize-add-remove-folders-directories-the-places-menu-of-ubuntu-13">here</a> for a working solution. If you also want to remove the folders from the home directory, edit user-dirs.defaults, or make a copy of it in .config and edit there (for your local user).Mon, 25 Jun 2018 06:08:26 GMThttps://askubuntu.com/questions/140148/how-to-safely-remove-default-folders/140964?cid=1713336#140964Abhimanyu Pallavi Sudhir2018-06-25T06:08:26ZComment by Abhimanyu Pallavi Sudhir on How to safely remove default folders?
https://askubuntu.com/questions/140148/how-to-safely-remove-default-folders/140964#140964
Doesn't work -- even if you don't run the update command, it gets updated upon the next reboot. There must be a more fundamental file in which these directory names are kept.Mon, 25 Jun 2018 05:32:16 GMThttps://askubuntu.com/questions/140148/how-to-safely-remove-default-folders/140964?cid=1713326#140964Abhimanyu Pallavi Sudhir2018-06-25T05:32:16ZAnswer by Abhimanyu Pallavi Sudhir for Intuition behind Chebyshev's inequality
https://math.stackexchange.com/questions/1344734/intuition-behind-chebyshevs-inequality/2814492#2814492
0<p>First of all, putting $\mu$'s and $\sigma$'s all over is ridiculous -- like natural units in physics, let's set $\mu=0$ and $\sigma=1$. Then Chebyshev's inequality states that:</p>
<p>$$P(|X|>k)\leq1/k^2$$</p>
<p>I.e. for a distribution to have a unit standard deviation, there is a natural limit on how much of the distribution can lie some given amount $k>1$ away from the mean. This isn't particularly surprising -- as you add stuff to the distribution outside the standard-deviation range, you inevitably increase the standard deviation. In order to keep the standard deviation at 1, you need to squeeze the things inside the standard-deviation range and reduce <em>their</em> deviation, so the overall deviation stays at 1.</p>
<p>But there's got to be a limit on how much you can reduce the total deviation, right? You can't make the contribution of the things inside to the deviation <em>negative</em> -- it can only go down to zero. So what exactly is this limiting case?</p>
<p>Well, at the limiting case, you have a Dirac delta at $X=0$ and two Dirac deltas at $X=k$ and $X=-k$. What's the maximum height of these two Dirac deltas? Answer this, and you have Chebyshev's inequality.</p>
<p><a href="https://i.stack.imgur.com/xBf2P.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/xBf2P.png" alt="enter image description here"></a></p>
<p>The standard deviation of this distribution is --</p>
<p>$$\sqrt{\frac{p}2(-k)^2+\frac{p}2k^2}=k\sqrt{p}$$</p>
<p>We set this equal to 1, and we get $p=1/k^2$ as the maximum amount of stuff you can pile at $k$ and beyond even if you use the distribution with the least possible standard deviation. </p>
<p>(A full proof would consider the possibility of an asymmetric distribution with a $p$ stick at $X=k$ and a $1-p$ stick at $X=-\frac{p}{1-p}k$, but it turns out the limit on $p$ is even stricter, at $p\leq\frac1{k^2+1}$.)</p>Sun, 10 Jun 2018 11:50:04 GMThttps://math.stackexchange.com/questions/1344734/-/2814492#2814492Abhimanyu Pallavi Sudhir2018-06-10T11:50:04ZAnswer by Abhimanyu Pallavi Sudhir for units in math, cross product
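A quick numeric sanity check of the two extremal distributions described above (the (value, probability) representation and helper are mine, but the constructions are the ones in the answer):

```python
def stats(dist):
    """Mean and variance of a discrete distribution given as
    a list of (value, probability) pairs."""
    mean = sum(x * p for x, p in dist)
    var = sum(p * (x - mean)**2 for x, p in dist)
    return mean, var

k = 3.0

# Symmetric extremal case: mass p/2 at +-k, mass 1-p at 0, with p = 1/k^2.
p = 1 / k**2
sym = [(-k, p / 2), (0.0, 1 - p), (k, p / 2)]
mean, var = stats(sym)
tail = sum(q for x, q in sym if abs(x) >= k)  # equals 1/k^2: Chebyshev is tight

# Asymmetric case: mass p at k, mass 1-p at -pk/(1-p), with p = 1/(k^2+1).
p2 = 1 / (k**2 + 1)
asym = [(k, p2), (-p2 * k / (1 - p2), 1 - p2)]
mean2, var2 = stats(asym)
```

Both distributions have mean 0 and variance 1, the symmetric one puts exactly $1/k^2$ of its mass at distance $k$, and the asymmetric one shows the one-sided mass at $k$ is capped at the stricter $1/(k^2+1)$.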
https://math.stackexchange.com/questions/1449396/units-in-math-cross-product/2813670#2813670
0<p>Yes -- using a vector to write the cross product is just a 3D convention to remind you it has 3 independent components. It's more naturally represented as a bivector or as a rank-2 tensor, but there's a duality between these and vectors in 3D.</p>Sat, 09 Jun 2018 15:36:37 GMThttps://math.stackexchange.com/questions/1449396/-/2813670#2813670Abhimanyu Pallavi Sudhir2018-06-09T15:36:37ZAnswer by Abhimanyu Pallavi Sudhir for What makes the Cauchy principal value the "correct" value for a integral?
https://math.stackexchange.com/questions/2450848/what-makes-the-cauchy-principal-value-the-correct-value-for-a-integral/2806611#2806611
0<p>It isn't the "correct value" for the integral any more than the principal root is the correct value for a root or the principal logarithm is the correct value for a logarithm, or setting $C=0$ gives you the correct value of the antiderivative. There are plenty of other values the integral can take, depending on how you take the limit. See my answer to <a href="https://math.stackexchange.com/a/2805722/78451">Why can't $\int_{-1}^1\frac{dx}x$ be evaluated?</a></p>Sun, 03 Jun 2018 14:09:34 GMThttps://math.stackexchange.com/questions/2450848/-/2806611#2806611Abhimanyu Pallavi Sudhir2018-06-03T14:09:34ZAnswer by Abhimanyu Pallavi Sudhir for Why can't $\int_{-1}^1{\frac{dx}{x}}$ be evaluated?
https://math.stackexchange.com/questions/1511181/why-cant-int-11-fracdxx-be-evaluated/2805722#2805722
1<p>I don't know about you, but when I was first introduced to the antiderivative of $1/x$, I was pretty confused. It made sense that the answer was $\ln(x)+C$, but changing this to $\ln|x|+C$ seems to make no sense. It's justified as being just the addition of a constant, $i\pi$, and indeed you can verify the derivative is the same (still $1/x$), but it seems instead that you're only adding $i\pi$ to the function for $x<0$, and nothing when $x>0$. In other words, you're not actually adding a constant at all, but rather the function $i\pi\left(1-H(x)\right)$ (where $H(x)$ is a unit step at 0).</p>
<p>Indeed, this kind of thing wouldn't be OK if we were dealing with ordinary continuous functions -- if you added a constant to one point of the function, that would affect the values of every other point in the function (one way of demonstrating this is the Taylor series). But since $\ln(x)$ has a singularity at $x=0$, the derivative is not defined at that point <em>anyway</em>, so one side of the function can be independent of the other.</p>
<hr>
<p>Why is this important? Well, consider evaluating the integral</p>
<p>$$\int_{-1}^1\frac{dx}x$$</p>
<p>Now, the presence of the singularity would not <em>directly</em> make it a bad idea to use the fundamental theorem of calculus to evaluate this integral (look <a href="https://math.stackexchange.com/questions/2193723/how-to-fix-int-11-frac-dxx2-with-complex-numbers/2804637#2804637">here</a> to understand where it would directly be a bad idea) -- or at least it wouldn't necessarily be, if you choose $\ln|x|$ as the antiderivative, since then the antiderivative would come back from infinity in the same direction, so the same overall path is traversed.</p>
<p><a href="https://i.stack.imgur.com/LtUEi.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/LtUEi.png" alt="enter image description here"></a>
<em>If you don't understand this argument, go look at the link above!</em></p>
<p>But because there are multiple consistent antiderivatives, we <em>still</em> can't apply the fundamental theorem of calculus. For instance,</p>
<p>$$\begin{array}{l}\ln \left| 1 \right| - \ln \left| { - 1} \right| = 0\\\ln \left( 1 \right) - \ln \left( { - 1} \right) = - i\pi \end{array}$$</p>
<p>So instead you define a <em>principal value</em>, skirting around the singularity by taking</p>
<p>$$\int_{-1}^{-\epsilon}\frac{dx}x+\int_{\epsilon}^{1}\frac{dx}x=\ln(\epsilon)-\ln(\epsilon)=0$$</p>
<p>This is okay, because regardless of whether you use $\ln(x)$ or $\ln|x|$ as the antiderivative, the arbitrary constants cancel out on each side of the singularity. I.e. if you have a $+i\pi$ term to the left of the singularity, this exists for both $-1$ and $-\epsilon$, and thus cancels out as in any ordinary integral.</p>
<p>In this sense, $\ln\left|x\right|$ can be considered the "principal antiderivative" of $1/x$. But there's nothing particularly special about this value. One may take the limit a little differently, so you don't approach 0 at the same rate from both sides, for instance --</p>
<p>$$\int_{ - 1}^{ -\epsilon } {\frac{{dx}}{x}} + \int_{n\epsilon}^1 {\frac{{dx}}{x}} = \ln \left( \epsilon \right) - \ln \left( {n\epsilon} \right) = \ln \left( {\frac{1}{n}} \right)
$$</p>
<p>This is equivalent to cancelling out your areas in a different "order" on the graph, which allows you to leave some remainder area.</p>Sat, 02 Jun 2018 17:59:02 GMThttps://math.stackexchange.com/questions/1511181/why-cant-int-11-fracdxx-be-evaluated/2805722#2805722Abhimanyu Pallavi Sudhir2018-06-02T17:59:02ZComment by Abhimanyu Pallavi Sudhir on Verifying This Proof for Alternating Harmonic Series
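The asymmetric-limit computation above can be checked numerically; a sketch using a trapezoid rule on a geometric grid (the grid and parameters are my own choices -- a geometric grid suits $1/x$ because equal ratios contribute equal areas):

```python
import math

def integral_reciprocal(a, b, steps=4000):
    """Trapezoid-rule estimate of the integral of 1/x over [a, b], 0 < a < b,
    using a geometric grid."""
    r = (b / a) ** (1.0 / steps)
    total, x = 0.0, a
    for _ in range(steps):
        x_next = x * r
        total += (x_next - x) * (1 / x + 1 / x_next) / 2
        x = x_next
    return total

eps, n = 1e-6, 2.0
# int_{-1}^{-eps} dx/x = -int_{eps}^{1} dx/x by the symmetry of 1/x
left = -integral_reciprocal(eps, 1.0)
right = integral_reciprocal(n * eps, 1.0)
# The asymmetric limit leaves ln(1/n) instead of 0:
print(left + right, math.log(1 / n))
```

With $n=2$ the leftover area is $\ln(1/2) \approx -0.693$, confirming that the "order of cancellation" determines the answer.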
https://math.stackexchange.com/questions/2804184/verifying-this-proof-for-alternating-harmonic-series/2804190#2804190
@JoséCarlosSantos Yes, this is the binomial series for $(1+1)^n$. It becomes alternating after a finite number of terms (once $k > n$), and the absolute value is then decreasing and approaches 0, so it converges. This is fairly trivial -- "trivial" has no magical properties, but true nonetheless.Sat, 02 Jun 2018 04:50:07 GMThttps://math.stackexchange.com/questions/2804184/verifying-this-proof-for-alternating-harmonic-series/2804190?cid=5783530#2804190Abhimanyu Pallavi Sudhir2018-06-02T04:50:07ZComment by Abhimanyu Pallavi Sudhir on Verifying This Proof for Alternating Harmonic Series
https://math.stackexchange.com/questions/2804184/verifying-this-proof-for-alternating-harmonic-series/2804190#2804190
That what converges? The binomial series clearly converges for all $n$ close to 0 when the base is 2.Fri, 01 Jun 2018 18:21:14 GMThttps://math.stackexchange.com/questions/2804184/verifying-this-proof-for-alternating-harmonic-series/2804190?cid=5782501#2804190Abhimanyu Pallavi Sudhir2018-06-01T18:21:14ZComment by Abhimanyu Pallavi Sudhir on Verifying This Proof for Alternating Harmonic Series
https://math.stackexchange.com/questions/2804184/verifying-this-proof-for-alternating-harmonic-series
I agree in general, but the proof in the question is quite clearly straightforward and correct -- showing that this is true is just a question of some degree of formalisation.Fri, 01 Jun 2018 18:17:17 GMThttps://math.stackexchange.com/questions/2804184/verifying-this-proof-for-alternating-harmonic-series?cid=5782489Abhimanyu Pallavi Sudhir2018-06-01T18:17:17ZAnswer by Abhimanyu Pallavi Sudhir for How to "fix" $\int_{-1}^1 \frac {dx}{x^2}$ with complex numbers?
https://math.stackexchange.com/questions/2193723/how-to-fix-int-11-frac-dxx2-with-complex-numbers/2804637#2804637
0<p>When you first looked at an integral like $\int_{-1}^{1}dx/x^2$, your instinct was to apply the fundamental theorem of calculus, and evaluate $[-1/x]|_{-1}^1$. This answer was clearly <em>wrong</em>, but why? Why does having a singularity in between screw up the fundamental theorem of calculus?</p>
<p>Well, you might have the intuition for the fundamental theorem of calculus as having to do with, e.g. a disk expanding outwards, and the derivative of the area being the circumference -- so with some $dr$ added to the radius, $2\pi r\cdot dr$ is added to the area. And with a lot of $dr$'s getting added, the total addition to the area -- the integral of $2\pi r$ across the total length of $dr$'s that got added -- is the difference in area over the expansion. </p>
<p><a href="https://i.stack.imgur.com/YPTu4.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/YPTu4.png" alt="enter image description here"></a></p>
<p>But you might imagine a set-up where there's a singularity somewhere in the expansion, so the area suddenly blows up to infinity somewhere during the expansion, then starts back from 0 <sup><a href="https://i.stack.imgur.com/YPTu4.png" rel="nofollow noreferrer">1</a></sup>. Something's not quite right here. Our intuition just broke -- the actual amount of area that got added isn't the same as the total area change here.</p>
<p>Like in our exercise above, let's look at the antiderivative of $1/x^2$ between -1 and 1. And for comparison, we'll keep another function -- a normal, continuous function -- and its antiderivative.</p>
<p><a href="https://i.stack.imgur.com/LuZkI.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/LuZkI.png" alt="enter image description here"></a></p>
<p>And that's the fundamental problem here -- the integral is taking a different path (and if you wrapped the xy-plane around a sphere, you'd actually be able to visualise this path as going all the way to infinity then coming back from behind) from the $F(b)-F(a)$ calculation. $F(b)-F(a)$ is -2, but the path taken by the integral is in fact, $\infty-2$.</p>
<p>On the real line, there's no other path you can take (besides going to infinity and coming back) on which the integral is valid. But if you could just budge the path a bit "out" of the xy-plane, it would be valid, because you wouldn't be going to infinity -- just a really high number. An example way of doing this is to use complex numbers, since the added dimension allows you to draw the path between -1 and 1 a little out of the plane, and take the limit as this "little out" approaches 0. As an example, you can look at the integral:</p>
<p>$$\int_{-1}^1 \frac1{x^2+\epsilon}dx$$</p>
<p>(Explain why this integral may be interpreted as using the complex plane.)</p>
<p>In fact, it is quite natural to use the complex numbers as a way to "poke out" of the real line, since "integrals are done along curves, not between limits" is a central insight from complex calculus.</p>
<p><sup><a href="https://i.stack.imgur.com/YPTu4.png" rel="nofollow noreferrer">1</a></sup> obviously, to be clear on this, we'll need to introduce some concept of time/a parameterisation <em>t</em>, and differentiate with respect to it instead, and claim that there is a singularity in $r(t)$.</p>Fri, 01 Jun 2018 18:12:14 GMThttps://math.stackexchange.com/questions/2193723/how-to-fix-int-11-frac-dxx2-with-complex-numbers/2804637#2804637Abhimanyu Pallavi Sudhir2018-06-01T18:12:14ZComment by Abhimanyu Pallavi Sudhir on Verifying This Proof for Alternating Harmonic Series
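As a numeric illustration of the $\int_{-1}^1 dx/(x^2+\epsilon)$ regularisation: for each fixed $\epsilon>0$ the integral is finite, with closed form $(2/\sqrt{\epsilon})\arctan(1/\sqrt{\epsilon})$, and it blows up like $\pi/\sqrt{\epsilon}$ as $\epsilon\to 0$ -- consistent with the budged path reaching "a really high number" rather than infinity. The midpoint-rule check below is my own sketch:

```python
import math

def regularised_integral(eps, steps=100000):
    """Midpoint-rule estimate of the integral of 1/(x^2 + eps) over [-1, 1]."""
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        x = -1.0 + (i + 0.5) * h
        total += h / (x * x + eps)
    return total

# Closed form: (2/sqrt(eps)) * arctan(1/sqrt(eps)).
for eps in [1.0, 0.1, 0.01]:
    exact = 2 / math.sqrt(eps) * math.atan(1 / math.sqrt(eps))
    print(eps, regularised_integral(eps), exact)
```

Each shrinking of $\epsilon$ roughly triples the value, tracking the $\pi/\sqrt{\epsilon}$ growth.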
https://math.stackexchange.com/questions/2804184/verifying-this-proof-for-alternating-harmonic-series
Yes, this is correct, modulo some rigour. These kinds of things don't just randomly "get lucky".Fri, 01 Jun 2018 14:32:23 GMThttps://math.stackexchange.com/questions/2804184/verifying-this-proof-for-alternating-harmonic-series?cid=5781950Abhimanyu Pallavi Sudhir2018-06-01T14:32:23ZComment by Abhimanyu Pallavi Sudhir on Verifying This Proof for Alternating Harmonic Series
https://math.stackexchange.com/questions/2804184/verifying-this-proof-for-alternating-harmonic-series/2804190#2804190
This is not correct. You can use Newton's binomial theorem, and you'll get the same result. In fact, this is implied in the question, since the series on the right is an infinite series.Fri, 01 Jun 2018 14:24:44 GMThttps://math.stackexchange.com/questions/2804184/verifying-this-proof-for-alternating-harmonic-series/2804190?cid=5781926#2804190Abhimanyu Pallavi Sudhir2018-06-01T14:24:44ZComment by Abhimanyu Pallavi Sudhir on Explaining the Main Ideas of Proof before Giving Details
https://mathoverflow.net/questions/301085/explaining-the-main-ideas-of-proof-before-giving-details
Because good proofs are just a formalisation of the intuitive understanding -- rather than wasting space explaining the insights, you can just give them the proof, and an even somewhat experienced reader can re-create the details.Sun, 27 May 2018 04:28:36 GMThttps://mathoverflow.net/questions/301085/explaining-the-main-ideas-of-proof-before-giving-details?cid=750004Abhimanyu Pallavi Sudhir2018-05-27T04:28:36ZComment by Abhimanyu Pallavi Sudhir on Do eigenvalues depend on the choice of basis?
https://math.stackexchange.com/questions/2795340/do-eigenvalues-depend-on-the-choice-of-basis/2795541#2795541
How is it not, though? The answer is in "the eigenvalues are the same as in the diagonalised form, thus they must be the same as each other".Sat, 26 May 2018 05:04:51 GMThttps://math.stackexchange.com/questions/2795340/do-eigenvalues-depend-on-the-choice-of-basis/2795541?cid=5765604#2795541Abhimanyu Pallavi Sudhir2018-05-26T05:04:51ZAnswer by Abhimanyu Pallavi Sudhir for Do eigenvalues depend on the choice of basis?
https://math.stackexchange.com/questions/2795340/do-eigenvalues-depend-on-the-choice-of-basis/2795541#2795541
8<p>The whole point of eigenvalues and eigenvectors is to produce a bunch of axes that define your skewy transformation, so that your skewy transformation becomes a scaling transformation on these axes. If anything, this <em>gives you a nice basis</em> (one in which your matrix is diagonal, i.e. scaling). Your eigenvalues are clearly the same in the eigenbasis as in any other basis (they lie along the diagonal), so the eigenvalues are the same in all bases.</p>Fri, 25 May 2018 12:01:54 GMThttps://math.stackexchange.com/questions/2795340/-/2795541#2795541Abhimanyu Pallavi Sudhir2018-05-25T12:01:54ZComment by Abhimanyu Pallavi Sudhir on $2\times2$ matrices are not big enough
https://math.stackexchange.com/questions/577887/2-times2-matrices-are-not-big-enough/577924#577924
That's a fairly obvious one, though -- rotations happen in planes, and there's only one plane in two dimensions.Sat, 19 May 2018 10:10:35 GMThttps://math.stackexchange.com/questions/577887/2-times2-matrices-are-not-big-enough/577924?cid=5747913#577924Abhimanyu Pallavi Sudhir2018-05-19T10:10:35ZAnswer by Abhimanyu Pallavi Sudhir for Intuition behind speciality of symmetric matrices
https://math.stackexchange.com/questions/1788911/intuition-behind-speciality-of-symmetric-matrices/2780461#2780461
2<p>When you were first learning about null spaces in linear algebra, your guess for the null space -- assuming you had some reasonable geometric intuition into the field -- was that the null space was orthogonal to the column space. After all, that makes sense. If your singular transformation collapses/projects $\mathbb{R}^2$ into a line, then the vectors that get mapped to the origin are the ones perpendicular to the column space.</p>
<p><a href="https://i.stack.imgur.com/2tgry.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/2tgry.png" alt="Is the column space perpendicular to the row space?"></a></p>
<p>Or at least, so it seems -- in reality, though, the projection doesn't need to be so nice and orthogonal. You could, for instance, <em>rotate</em> all vectors in the space by some angle and then collapse it onto a line.</p>
<p>It turns out the null space isn't perpendicular to the column space, but in fact to the <em>row space</em> instead -- these two spaces are only identical for matrices which do not perform a rotation.</p>
<p>This is a very important observation, because it tells you something about the character of matrices -- <strong>asymmetry in a matrix is a measure of how rotation-ish it is</strong>. Specifically, an antisymmetric matrix is the result of 90-degree rotations (like imaginary numbers) and a symmetric matrix is the result of scaling and skews (like real numbers). </p>
<p>$$A = \underbrace {\frac{1}{2}(A + {A^T})}_{\scriptstyle{\rm{symmetric }}\atop\scriptstyle{\rm{part}}} + \underbrace {\frac{1}{2}(A - {A^T})}_{\scriptstyle{\rm{antisymmetric }}\atop\scriptstyle{\rm{part}}}$$</p>
<p>All matrices can be written as the sum of these two kinds -- a symmetric part and an anti-symmetric part -- much like all complex numbers can be written as the sum of a real part and an imaginary part. And this is fundamentally why symmetric matrices are "special" -- for the same reason that real numbers are special.</p>
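<p>A quick numerical illustration of this decomposition (a Python sketch, not part of the original answer): the antisymmetric part always moves a vector perpendicular to itself, like a 90-degree rotation, while the symmetric part is a pure scaling along some axes, i.e. has real eigenvalues.</p>

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))

S = (A + A.T) / 2   # symmetric part (the "real"-like piece)
K = (A - A.T) / 2   # antisymmetric part (the "imaginary"-like piece)
assert np.allclose(A, S + K)

# The antisymmetric part moves every vector perpendicular to itself:
# x . (Kx) = 0, the 90-degree-rotation-like behaviour described above.
x = rng.standard_normal(3)
print(x @ (K @ x))                 # ~0, up to floating point

# The symmetric part is a scaling along orthogonal eigenvector axes:
# its eigenvalues are real.
print(np.linalg.eigvalsh(S))
```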
<hr>
<p>Notes:</p>
<p>(1) Scaling and skews are actually essentially the same thing, which is why it makes sense to include skews in the group of things that are "essentially real numbers", even though you can't really represent skews with any complex number -- real or otherwise. Skews are just scaling across a different set of axes, called "eigenvectors" (this is also why symmetric matrices have a full set of orthogonal eigenvectors).</p>
<p>(2) My explanation of the analogy (between matrices and complex numbers) is oversimplified -- antisymmetric matrices actually represent 90 degree rotations only, and these rotations can actually be spirals, which means they do scaling too. But the analogy still holds, because this applies to imaginary numbers too (e.g. the complex number $8i$ is a rotation by 90 degrees followed by a scaling by 8). </p>
<p>(3) A more accurate way to phrase the analogy is "the <strong>antisymmetric part</strong> of the matrix operates in a sub-space orthogonal to the vector being transformed while the <strong>symmetric part</strong> operates in the direction of the vector itself, so their sum spans all possible vectors of the target space". In other words, the analogy is to the <strong>Cartesian form</strong> of complex numbers -- you get to represent transformations as linear combinations of the vector itself and vectors orthogonal to it.</p>
<p>(4) It is possible to deal with at least some matrices in a way that corresponds to the <strong>polar forms</strong> of complex numbers -- this is done by representing matrices as products of <strong>symmetric matrices and orthogonal matrices</strong>, much like $re^{i\theta}$ represents complex numbers as products of real numbers and unit complex numbers.</p>Mon, 14 May 2018 08:42:41 GMThttps://math.stackexchange.com/questions/1788911/-/2780461#2780461Abhimanyu Pallavi Sudhir2018-05-14T08:42:41ZComment by Abhimanyu Pallavi Sudhir on Why is 1 not a prime number?
https://math.stackexchange.com/questions/120/why-is-1-not-a-prime-number/1069794#1069794
I don't understand the connection either -- how does this difference matter?Mon, 14 May 2018 04:57:29 GMThttps://math.stackexchange.com/questions/120/why-is-1-not-a-prime-number/1069794?cid=5733543#1069794Abhimanyu Pallavi Sudhir2018-05-14T04:57:29ZAnswer by Abhimanyu Pallavi Sudhir for $xe^{\lambda x}$ is a solution to a simple second order differential equation only if the auxiliary equation has a repeated root, why?
https://math.stackexchange.com/questions/1195644/xe-lambda-x-is-a-solution-to-a-simple-second-order-differential-equation-on/2776859#2776859
0<p>The derivation of this solution is actually so insanely trivial, it's a complete mystery why it isn't taught in most typical introductions to these sorts of polynomial-like differential equations.</p>
<p>In this proof, I'll consider, for simplicity, the differential equation </p>
<p>$$(D-I)(D-rI)y(t)=0$$</p>
<p>And take the limit as $r\to1$. But you can repeat the same process for any arbitrary roots and let them approach each other in a limit, and you'll get the right result.</p>
<p>Most people have the right idea, that you need to take the solution for non-repeated roots, and take the limit as the roots approach each other. This is correct, but it's a mistake to take the limit of the <em>general solution</em> $c_1e^{r_1t}+c_2e^{r_2t}$, which is what most people try to do when they see this problem, and are then puzzled since it gives you a solution space of the wrong dimension.</p>
<p>This is wrong, because $c_1$ and $c_2$ are arbitrary mathematical labels, and have no reason to stay the same as the roots approach each other. You can, however, take the limit while representing the solution in terms of your initial conditions, because these can stay the same as you change the system. You can think of this as a physical system where you change the damping and other parameters to create a repeated-roots system as the initial conditions remain the same -- this is a simple process, but if you instead try to ensure $c_1$ and $c_2$ remain the same, you'll run into infinities and undefined stuff. This is exactly what happens here, <strong>there simply isn't a repeated-roots solution with the same $c_1$ and $c_2$ values, but you obviously do have a system/solution with the same initial conditions</strong>.</p>
<p>So you instead let your initial conditions be $y(0)=a$ and $y'(0)=b$, and express $c_1$ and $c_2$ in terms of them, so you have</p>
<p>$$y(t)=\frac{ra-b}{r-1}e^t-\frac{a-b}{r-1}e^{rt}$$</p>
<p>Then take the limit as $r\to1$. This is the proof itself, and it's simply algebraic manipulation and a bit of limits -- </p>
<p>$$\begin{array}{c}y(t) = \frac{{\left( {ra - b} \right){e^t} - \left( {a - b} \right){e^{rt}}}}{{r - 1}} = \frac{{\left( {ra - b} \right) - \left( {a - b} \right){e^{(r - 1)t}}}}{{r - 1}}{e^t}\\ = \frac{{(r - 1)a + \left( {a - b} \right) - \left( {a - b} \right){e^{(r - 1)t}}}}{{r - 1}}{e^t}\\ = \left[ {a + \left( {a - b} \right)\frac{{1 - {e^{(r - 1)t}}}}{{r - 1}}} \right]{e^t}\\ = \left[ {a - \left( {a - b} \right)\frac{{{e^{(r - 1)t}} - {e^{0t}}}}{{r - 1}}} \right]{e^t}\\ = \left[ {a - \left( {a - b} \right){{\left. {\frac{d}{{dx}}\left[ {{e^{xt}}} \right]} \right|}_{x = 0}}} \right]{e^t}\\ = \left[ {a - \left( {a - b} \right)t} \right]{e^t}\end{array}$$</p>
<p>Which takes exactly the form $y(t) = \left( {{c_1} + {c_2}t} \right){e^t}$ with $c_1$, $c_2$ that satisfy the initial conditions.</p>
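<p>You can also check the limit numerically rather than take the algebra on faith. Here's a small sketch (Python; not part of the original answer) comparing the distinct-roots solution for $r$ near $1$ with the limiting form:</p>

```python
import numpy as np

a, b = 2.0, 3.0          # initial conditions y(0) = a, y'(0) = b

def y_distinct(t, r):
    """Solution of (D-I)(D-rI)y = 0 with y(0)=a, y'(0)=b, for r != 1."""
    return (r*a - b)/(r - 1)*np.exp(t) - (a - b)/(r - 1)*np.exp(r*t)

def y_repeated(t):
    """The limiting r -> 1 solution (c1 + c2 t) e^t derived above."""
    return (a - (a - b)*t)*np.exp(t)

t = np.linspace(0, 2, 5)
for r in (1.1, 1.01, 1.0001):
    print(r, np.max(np.abs(y_distinct(t, r) - y_repeated(t))))
# the gap shrinks roughly in proportion to (r - 1)
```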
<p>Here's a GIF that confirms that the solution indeed <em>is</em> a limit of the general expression, and not some sudden departure from the solutions for all other roots:</p>
<p><a href="https://i.stack.imgur.com/KFA30.gif" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/KFA30.gif" alt="enter image description here"></a></p>
<p>(this isn't a particularly surprising visualisation, though -- you've seen it in diagrams showing critical damping and stuff.)</p>Fri, 11 May 2018 16:01:38 GMThttps://math.stackexchange.com/questions/1195644/-/2776859#2776859Abhimanyu Pallavi Sudhir2018-05-11T16:01:38ZMinkowski everything -- spacetime vectors, rapidity
https://thewindingnumber.blogspot.com/2018/05/minkowski-everything-four-vectors-rapidity.html
0<b>Four-vectors and energy-momentum analogies</b><br /><b><br /></b>Let's look once more at the equation<br /><br />$$E=\frac{m}{\sqrt{1-v^2}}$$<br />This looks an awful lot like the equation for time dilation. $E$ is the mass as measured by someone who sees the object moving at $v$ whereas $m$ is the mass as measured by someone who sees the object at rest, e.g. by the object itself.<br /><br />Similarly, we have the equation $p=vE$, which looks an awful lot like the equation $x=vt$. It therefore makes sense to wonder how far this analogy goes. We could start with analysing the invariant.<br /><br />Even if I measure the mass of a 1kg rock as 10kg because of my reference frame, I know that if I brought the rock to rest, I would measure it as 1kg. Much like I can tell people's biological age or look at their clocks to determine their proper time, I can look at the moving thing's mass balance and determine its proper mass $m$.<br /><br />If we just wanted $m$ in terms of the "co-ordinates" $E$ and $p$,<br /><br />$$m = E\sqrt {1 - {v^2}} = \sqrt {{E^2} - {v^2}{E^2}} = \sqrt {{E^2} - {p^2}}$$<br />$${m^2} = {E^2} - {p^2}$$<br />Or in 4 dimensions,<br /><br />$${m^2} = {E^2} - p_x^2 - p_y^2 - p_z^2$$<br />We call $m$ the "proper mass". In general, "proper" means "as measured in the rest frame" -- proper time, proper length, proper mass, whatever. This equation is also useful because unlike the previous thing, this also works when $v=1$ (i.e. for light), and reduces to $E=pc$.<br /><br />But this looks an awful lot like the spacetime interval.<br /><br />That's not all. Consider an object with mass $E$, momentum $p$ and velocity $w=p/E$ in our reference frame $O$. Now boost to a reference frame $O'$ with relative velocity $v$ to $O$. Then the velocity of the object has transformed from $w$ to $\frac{{w - v}}{{1 - wv}}$.
So<br /><br />$$\begin{array}{c}E' = \frac{m}{{\sqrt {1 - {{\left( {\frac{{w - v}}{{1 - wv}}} \right)}^2}} }}\\ = \frac{{m(1 - vw)}}{{\sqrt {{{(1 - wv)}^2} - {{(w - v)}^2}} }}\\ = \frac{{m(1 - vw)}}{{\sqrt {(1 - {w^2})(1 - {v^2})} }}\\ = \gamma (v)\left( {1 - vw} \right)\gamma (w)m\\ = \gamma \left( {1 - vw} \right)E\\ = \gamma (E - vwE)\\E' = \gamma (E - vp)\end{array}$$<br />And<br /><br />$$\begin{array}{c}p' = \frac{{m\left( {\frac{{w - v}}{{1 - wv}}} \right)}}{{\sqrt {1 - {{\left( {\frac{{w - v}}{{1 - wv}}} \right)}^2}} }}\\ = \left( {\frac{{w - v}}{{1 - wv}}} \right)E'\\ = \left( {\frac{{w - v}}{{1 - wv}}} \right)\gamma \left( {1 - wv} \right)E\\ = \gamma (wE - vE)\\p' = \gamma (p - vE)\end{array}$$<br />Or alternatively<br /><br />$$\left[ \begin{array}{l}{E'}\\{p'}\end{array} \right] = \gamma \left[ {\begin{array}{*{20}{c}}1&{ - v}\\{ - v}&1\end{array}} \right]\left[ \begin{array}{l}E\\p\end{array} \right]$$<br />In 4 dimensions,<br /><br />$$\left[ \begin{array}{l}{E'}\\{{p'}_x}\\{{p'}_y}\\{{p'}_z}\end{array} \right] = \left[ {\begin{array}{*{20}{c}}1&{ - v}&{}&{}\\{ - v}&1&{}&{}\\{}&{}&1&{}\\{}&{}&{}&1\end{array}} \right]\left[ \begin{array}{l}E\\{p_x}\\{p_y}\\{p_z}\end{array} \right]$$<br />Which is precisely the transformation for time and position.<br /><br />We call vectors that transform like this <b>spacetime vectors</b> or <b>four-vectors</b>. Four-vectors all share the same algebraic properties -- they transform in the same way, they follow vector addition, their norms and in general their dot products are invariant, etc. -- but not necessarily other properties. E.g. 
energy and momentum have conservation laws, but position and time do not.<br /><br />The norm of a spacetime vector is taken as:<br /><br />$${\left| {\left[ {\begin{array}{*{20}{c}}{{q_0}}\\{{q_1}}\\{{q_2}}\\{{q_3}}\end{array}} \right]} \right|^2} = q_0^2 - q_1^2 - q_2^2 - q_3^2$$<br />Which is distinct from the Euclidean norm, once again telling us that the geometry of spacetime is not Euclidean.<br /><br />Four-vectors are perhaps the most beautiful example of the symmetry between space and time. They essentially allow you to replace ordinary pre-relativistic vectors like momentum with vectors that also have a time component alongside three spatial components, because the world is 4-dimensional. You just need to find a quantity that behaves with the vector like time behaves with position -- i.e. you need to show the two quantities transform between each other in a Lorentz transformation sort of way.<br /><br />You end up with truly mind-boggling results -- we already saw that mass is the time-component of momentum, which explains why mass produces inertia -- an object with mass already devotes a lot of its momentum to moving forward in time, so the more the mass, the more of this momentum you need to transform into the spatial direction. This is really what is meant by the transformation law $p'=\gamma(p-vE)$ for mass $E$, generalising the Galilean $p'=p-vE$ (change $E$ to $M$ if that makes you happy). 
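As a numerical sanity check of this transformation law (a short Python sketch, not part of the original post, in units with $c=1$), one can verify that the boost preserves $E^2-p^2$ and that the boosted velocity $p'/E'$ matches the velocity-composition formula:

```python
import math

def boost(E, p, v):
    """Lorentz boost of the two-dimensional (E, p) four-vector,
    as in the 2x2 matrix above (units with c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v*v)
    return g*(E - v*p), g*(p - v*E)

m, w = 1.0, 0.6                      # proper mass, velocity in frame O
E = m / math.sqrt(1.0 - w*w)
p = w * E

v = 0.8                              # boost velocity
E2, p2 = boost(E, p, v)

print(E*E - p*p, E2*E2 - p2*p2)      # both equal m^2 = 1 -- invariant
print(p2 / E2, (w - v)/(1 - w*v))    # boosted velocity = composition formula
```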
It also explains why massless (meaning zero rest mass) things can move at the speed of light.<br /><br />Other such four-vectors include:<br /><ul><li>Four-force (time-component: $dE/dt$)</li><li>Four-current (time-component: charge density, space-component: current density)</li><li>Electromagnetic four-potential</li></ul><div>Other quantities, like the electric and magnetic fields, even though they follow similar invariants (in the electromagnetic field example $E^2-B^2$), do not combine to form four-vectors, but instead objects called "tensors", which we will eventually talk about.</div><br />Note that during this transformation (giving something momentum), both mass and momentum increase. Similarly, time dilates when you move something around. This is again because $E^2-p^2$, not $E^2+p^2$ is invariant. The latter would correspond to a circular rotation, with invariant circles, whereas the former corresponds to a skew (a "hyperbolic rotation"), with invariant hyperbolae.<br /><br /><hr /><br /><b>Rapidity and hyperbolic rotations</b><br /><b><br /></b> <br /><div style="text-align: center;"><img src="https://upload.wikimedia.org/wikipedia/commons/8/8a/HyperbolicAnimation.gif" /></div><br /><div class="twn-furtherinsight">Points $(\cos\theta,\sin\theta)$, $(1,\tan\theta)$, $(\cosh\xi,\sinh\xi)$ and $(1,\tanh\xi)$ plotted for varying $\theta$ and $\xi$. While only $\theta$ can be interpreted as an angle too, both $\theta$ and $\xi$ can be interpreted as areas.</div><br />This will be a bit of a DIY section, with some guidance.<br /><br /><b>QUESTION 1</b><br /><b><br /></b><b>(a)</b> Consider the equation $v' = \frac{{v - w}}{{1 - vw}}$. What trigonometric identity does this remind you of? Could you resolve the differences somehow? 
(Hint: $v=\tanh\xi$)<br /><br /><b>(b) </b>Prove that the Lorentz transformations can be written as<br /><br />$$\begin{array}{l}t' = t\cosh \xi - x\sinh \xi \\x' = x\cosh \xi - t\sinh \xi \end{array}$$<br /><b>(c) </b>Use the hyperbolic analog of angle-addition formulae to show that this is equivalent to the following, where $\phi=\mathrm{artanh}(x/t)$ is the rapidity of the point $(t,x)$ in the original reference frame and $s=\sqrt{t^2-x^2}$ is its invariant interval.<br /><br />$$\begin{array}{l}t' = s\cosh (\phi - \xi )\\x' = s\sinh (\phi - \xi )\end{array}$$<br /><b>(d) </b>The above result means that rapidity transforms as $\phi ' = \phi - \xi $ (which is itself nice, because it tells you that velocity at low speeds is approximately equal to rapidity by a factor of $c$) and $(t,x) = (s\cosh \phi ,s\sinh \phi )$. Relate the former to the idea of invariant hyperbolae and the interpretation of rapidity as an area (hint, hint: area swept out by a conic section... Kepler).<br /><b><br /></b><b>QUESTION 2</b><br /><b><br /></b><b>(a) </b>Results 1(b) and 1(c) are very similar to the effect of rotations on co-ordinate transformations. Here the linear transformations are skews, not rotations, which is why the formulae are different. Draw as many analogs as you can between rotations and skews in linear algebra. Refer to Article <a href="https://thewindingnumber.blogspot.in/2017/08/1103-006.html" target="_blank">1103-006</a>. Think about the rotational transformation matrix, etc.<br /><br /><b>(b) </b>Consider (a) directly in the context of special relativity. Pretending that Lorentz boosts are simply rotations (which would imply a metric signature (+,+,+,+) and treat time exactly like space), explain transformations between time and position, etc. Relate this to the actual, skew-y Lorentz transformations.
Describe how relativity would behave in this theory.<br /><br /><b>(c) </b>Write as many relativistic things as you can in the language of rapidity -- the Lorentz factor, the Doppler factor, components of a four-vector (how do $E$ and $p$ look in terms of rapidity), etc.<br /><br /><b>(d) </b>Graph the hyperbolic functions and explain why the graphs make the results in 2(b) make sense.<br /><br /><b>(e) </b>How does rapidity interpretation make certain things, like $c$ being the maximum speed, natural?<br /><br /><b>QUESTION 3</b><br /><b><br /></b><b>(a) </b>Consider once again the transformation $\phi ' = \phi - \xi $. What does this tell you about the relative rapidity $\Delta\phi$? Is this invariant, i.e. do all observers agree on what the relative rapidity between two objects is, like observers did on relative velocity in Galilean relativity?<br /><br /><b>(b) </b>Explain why it would be foolish to expect the quantity $\arctan{v}$, the Euclidean angle (as opposed to rapidity, which we may call the "Minkowskian angle"), to have any physical significance. Think about the quantity $r\arctan{v}$ where $r^2=\Delta t^2+\Delta x^2$ (no minus sign).<br /><br />It's therefore reasonable to define the dot product on spacetime as $\vec a \cdot \vec b = |\vec a||\vec b|\cosh \Delta \phi $ where $\Delta\phi$ is the relative rapidity/Minkowskian angle/difference in rapidity. This expression implies that $|\vec a|^2=\vec a\cdot\vec a$ is manifestly (i.e. obviously) Lorentz invariant, since both norms and relative rapidity are invariant.<br /><br /><b>(c) </b>Translate this out of rapidity language, i.e. into a language where rapidity is not used as a parameterisation. You should get $a_0b_0-a_1b_1$ (where 0 and 1 are the temporal and spatial components respectively) in two dimensions.<br /><br />The fact that this modified dot product is invariant under a skew is analogous to how the standard dot product is invariant under rotations ("complex skews").
Indeed, it turns out that the 4-dimensional Minkowski dot product<br /><br />$${a_0}{b_0} - {a_1}{b_1} - {a_2}{b_2} - {a_3}{b_3}$$<br />Is invariant under skews (between the time axis and some other axis) as well as spatial rotations (and all combinations thereof -- i.e. a general Lorentz transformation), as it contains both a "skew-y" part and a "standard dot product-y" part.<br /><br /><hr /><br />Some interesting things regarding 2(b):<br /><br />A circular Lorentz transformation would transform position and time something like this:<br /><br />$$\begin{array}{l}x' = \eta (x - vt)\\t' = \eta (t + vx)\end{array}$$<br />One can also talk about transforming the positive and negative sides of the axes separately.<br /><br />$$\begin{array}{l}x' = \eta (x - vt)\\t' = \eta (t + vx)\\ - x' = \eta ( - x - v( - t))\\ - t' = \eta ( - t - v( - x))\end{array}$$<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://i.stack.imgur.com/p8cEX.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="496" data-original-width="800" height="247" src="https://i.stack.imgur.com/p8cEX.png" width="400" /></a></div>Whereas with hyperbolic functions, there is no sign difference, so you only need to transform twice to return.
This is linked to you having to differentiate circular functions four times to return, as opposed to twice for hyperbolic functions, all the sign differences between trigonometric and hyperbolic identities, the whole $ie^{i\theta}$ proof of Euler's formula, etc.four-vectorshyperbolic functionsinvarianceinvariantslinear algebralorentz transformationsminkowski spacetimerapidityrelativityskewsspacetimespecial relativityFri, 11 May 2018 06:21:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-3640812652580919131Abhimanyu Pallavi Sudhir2018-05-11T06:21:00ZComment by Abhimanyu Pallavi Sudhir on Are there any statistics texts which give both intuition AND justifications for the equations/methods?
https://math.stackexchange.com/questions/274883/are-there-any-statistics-texts-which-give-both-intuition-and-justifications-for
I've recently started a <a href="https://thewindingnumber.blogspot.in/p/contents.html" rel="nofollow noreferrer">resource</a> that seems to answer some of your concerns. You may want to check the "Statistics" course -- it's still <i>very</i> incomplete, though.Sun, 06 May 2018 12:27:13 GMThttps://math.stackexchange.com/questions/274883/are-there-any-statistics-texts-which-give-both-intuition-and-justifications-for?cid=5710458Abhimanyu Pallavi Sudhir2018-05-06T12:27:13ZAnswer by Abhimanyu Pallavi Sudhir for Motivation behind standard deviation?
https://math.stackexchange.com/questions/4787/motivation-behind-standard-deviation/2769047#2769047
0<p>You can define whatever measure of deviation you want -- standard deviation is one of them, and turns out to be particularly useful. Why? Some reasons are covered in the other answers, but fundamentally, it's because <a href="https://thewindingnumber.blogspot.in/2018/02/random-variables-and-their-properties.html" rel="nofollow noreferrer">it allows the interpretation of statistical variables as vectors</a> in a space equipped with a nice Pythagorean norm (standard deviation) and dot product (covariance), which is more mathematically interesting than a space with a square-shaped norm, which is what you get with the mean deviation.</p>Sun, 06 May 2018 12:14:11 GMThttps://math.stackexchange.com/questions/4787/-/2769047#2769047Abhimanyu Pallavi Sudhir2018-05-06T12:14:11ZAnswer by Abhimanyu Pallavi Sudhir for The relationship between the complex exponential and trig functions?
https://math.stackexchange.com/questions/2580251/the-relationship-between-the-complex-exponential-and-trig-functions/2757711#2757711
0<p>The supposed mysteriousness of Euler's formula is overrated, and actually harmful if you want to get past the smoke and mirrors and actually learn the basic math.</p>
<p>Euler's formula relates exponentials to periodic functions. Although the two kinds of functions look superficially very different (exponentials diverge really quickly, periodic functions keep oscillating back and forth), any serious math student would have noted a curious relation between the two -- periodic functions arise whenever you do some negative number-ish stuff with exponentials. </p>
<p>For instance --</p>
<ul>
<li><strong>Simple harmonic motion --</strong> the differential equation $\ddot{x}=kx$ (i.e. $F=kx$ with unit mass) gives exponential motion when $k>0$ and periodic motion when $k<0$. This is just a special case of the idea that the derivatives of the trigonometric functions match up with what you'd expect from $e^{ix}$.</li>
<li><strong>Negative exponential bases --</strong> although exponential functions like $e^x$ and $a^x$ for any positive $a$ would seem to diverge nuttily at some infinity, it turns out that $(-1)^x$ is actually a periodic function, at least for integer $x$ (other negative bases give you a periodic function times a diverging function).</li>
<li><strong>Conic sections --</strong> trigonometric functions are defined on the unit circle; if you define similar functions on the unit rectangular hyperbola, you get linear combinations of exponentials.</li>
</ul>
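<p>The second bullet is easy to make concrete -- a sketch (Python; not part of the original answer) interpreting $(-1)^x$ through the principal complex logarithm:</p>

```python
import cmath

# Interpreting (-1)^x as exp(x * log(-1)) with the principal log,
# log(-1) = i*pi, so (-1)^x = cos(pi x) + i sin(pi x): periodic!
def neg_one_to_the(x):
    return cmath.exp(x * cmath.log(-1))

for x in (0.0, 0.5, 1.0, 1.5, 2.0):
    print(x, neg_one_to_the(x))
# the values repeat with period 2 -- a periodic function,
# not a diverging exponential
print(abs(neg_one_to_the(0.3) - neg_one_to_the(2.3)))   # ~0
```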
<p>There are others, based on trigonometric identities (I wrote a blog post covering some examples recently, see "<a href="https://thewindingnumber.blogspot.in/2016/11/making-sense-of-eulers-formula.html" rel="nofollow noreferrer">Making sense of Euler's formula</a>"), but the point is that this relationship is really natural, something you should expect, not some bizarre coincidence that arises from manipulating Taylor series around.</p>Sat, 28 Apr 2018 17:01:15 GMThttps://math.stackexchange.com/questions/2580251/-/2757711#2757711Abhimanyu Pallavi Sudhir2018-04-28T17:01:15ZComment by Abhimanyu Pallavi Sudhir on If $a$ is proportional to $b$ does it imply that $b$ is proportional to $a$?
https://math.stackexchange.com/questions/497714/if-a-is-proportional-to-b-does-it-imply-that-b-is-proportional-to-a/497716#497716
Personally, I use "linear" to mean $y=mx$, and the word "affine" to mean "$y=mx+c$".Sat, 28 Apr 2018 16:17:13 GMThttps://math.stackexchange.com/questions/497714/if-a-is-proportional-to-b-does-it-imply-that-b-is-proportional-to-a/497716?cid=5688448#497716Abhimanyu Pallavi Sudhir2018-04-28T16:17:13ZAnswer by Abhimanyu Pallavi Sudhir for Why is 1 not a prime number?
https://math.stackexchange.com/questions/120/why-is-1-not-a-prime-number/2757408#2757408
4<p>1 isn't a prime number for the same reason 0 isn't a basis vector.</p>
<p>Positive integers form "almost a linear algebra": the vector space <span class="math-container">$\mathbb{Z}^+$</span> over the scalar field <span class="math-container">$\mathbb{Z^+}\cup\{0\}$</span>, with:</p>
<ul>
<li>Primes as the "unit basis vectors" </li>
<li>Multiplication as "vector addition" </li>
<li>Exponentiation as "scalar multiplication" (e.g. <span class="math-container">$p^k$</span> represents the scalar <span class="math-container">$k$</span> multiplied by the vector <span class="math-container">$p$</span>)</li>
<li>1 as the vector 0</li>
<li>1 as the scalar 1</li>
<li>0 as the scalar 0</li>
</ul>
<p>One may check this obeys all the axioms of linear algebra, except the existence of negatives (of vectors).</p>
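<p>This "linear algebra" is easy to play with concretely. A small sketch (Python; the helper <code>exponent_vector</code> is a hypothetical name, not from the original answer) treating each integer as its vector of prime exponents:</p>

```python
from collections import Counter

def exponent_vector(n):
    """Factor n into its prime-exponent "coordinates" (trial division)."""
    factors, d = Counter(), 2
    while d * d <= n:
        while n % d == 0:
            factors[d] += 1
            n //= d
        d += 1
    if n > 1:
        factors[n] += 1
    return factors

# multiplication of integers = addition of exponent vectors
assert exponent_vector(12) + exponent_vector(18) == exponent_vector(216)

# co-primeness = orthogonality: no shared "basis vector" (prime)
def coprime(a, b):
    return not (set(exponent_vector(a)) & set(exponent_vector(b)))

print(coprime(8, 9), coprime(8, 10))   # True False
```

<p>Note that <code>exponent_vector(1)</code> comes out as the empty Counter -- the zero vector of this space.</p>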
<p>The reason you don't call the zero vector a basis vector is that it doesn't really add anything to the formalism if you consider "<span class="math-container">$0 + e_1 + e_2$</span>" to be the same representation as "<span class="math-container">$e_1+e_2$</span>", and if you consider it to be a different representation, you're violating the idea of each vector having a unique representation in a basis. Instead, 0 is just what you have when you haven't added anything, similarly 1 is just the empty product.</p>
<p>Note that this formalism has a lot of other interesting analogies -- for an example, co-primeness is "orthogonality". You could also extend the formalism to rationals <span class="math-container">$\mathbb{Q}$</span> over the scalar field <span class="math-container">$\mathbb{Z}$</span> -- then it would be an actual linear algebra -- although co-primeness would be more complicated (e.g. 18 would be co-prime to 3/4).</p>Sat, 28 Apr 2018 12:52:57 GMThttps://math.stackexchange.com/questions/120/why-is-1-not-a-prime-number/2757408#2757408Abhimanyu Pallavi Sudhir2018-04-28T12:52:57ZLimiting cases II: repeated roots of a differential equation
https://thewindingnumber.blogspot.com/2018/03/repeated-roots-of-differential-equations.html
0The solution to a polynomial-ish differential equation (the formal name being "linear homogenous time-invariant differential equation") with repeated roots is not completely unintuitive. While it's not immediately obvious where the solution to $(D-rI)^2y(t)=0$<br /><br />$$y=(c_1+c_2t)e^{rt}$$<br />comes from, it is pretty clear in the case $r=0$, where $D^2y(t)=0$ is solved by<br /><br />$$y=c_1+c_2t$$<br />... so it seems that the linear function comes from integrating twice, or more correctly, inverting the same differential operator twice.<br /><br />Let's try to derive our desired equation $y=(c_1+c_2t)e^{rt}$ via a limit. It doesn't seem like this would arise in the limit of an equation like $y=c_1e^{r_1t}+c_2e^{r_2t}$, but once again -- this is an arbitrary-constant-problem. Much like how we switched to definite integrals (i.e. fixed the limits/boundary conditions of the integral) before taking the limit in <a href="http://thewindingnumber.blogspot.com/2018/03/limiting-integral-of-eax.html">Part 1</a>, we must fix the initial conditions here too.<br /><br /><div class="twn-furtherinsight">For those new to this series, here's the reason we switch to an initial conditions approach/co-ordinate system:<br /><blockquote>Most people have the right idea, that you need to take the solution for non-repeated roots, and take the limit as the roots approach each other. This is correct, but it's a mistake to take the limit of the <i>general solution</i> $c_1e^{r_1t}+c_2e^{r_2t}$, which is what most people try to do when they see this problem, and are then puzzled since it gives you a solution space of the wrong dimension.<br /><br />This is wrong, because $c_1$ and $c_2$ are arbitrary mathematical labels, and have no reason to stay the same as the roots approach each other. You can, however, take the limit while representing the solution in terms of your initial conditions, because these can stay the same as you change the system. 
<br /><br />You can think of this as a physical system where you change the damping and other parameters to create a repeated-roots system as the initial conditions remain the same -- this is a simple process, but if you instead try to ensure $c_1$ and $c_2$ remain the same, you'll run into infinities and undefined stuff. <br /><br />This is exactly what happens here, <b>there simply isn't a repeated-roots solution with the same $c_1$ and $c_2$ values, but you obviously do have a system/solution with the same initial conditions.</b></blockquote>Taken from <a href="https://math.stackexchange.com/a/2776859/78451">my answer on Math Stack Exchange</a>.</div><br />We consider the differential equation<br /><br />$$(D-I)(D-rI)y(t)=0$$<br />And tend $r\to1$. The solution to the equation in general is<br /><br />$$y(t) = {c_1}{e^t} + {c_2}{e^{rt}}$$<br /> If we let $y(0) = a,\,\,y'(0) = b$, then it shouldn't be hard to show that the solution we're looking for is<br /><br />$$y(t)=\frac{ra-b}{r-1}e^t-\frac{a-b}{r-1}e^{rt}$$<br />This is where we must tend $r\to1$. Doing so is simply algebraic manipulation and a bit of limits:<br /><br />$$\begin{array}{c}y(t) = \frac{{\left( {ra - b} \right){e^t} - \left( {a - b} \right){e^{rt}}}}{{r - 1}} = \frac{{\left( {ra - b} \right) - \left( {a - b} \right){e^{(r - 1)t}}}}{{r - 1}}{e^t}\\ = \frac{{(r - 1)a + \left( {a - b} \right) - \left( {a - b} \right){e^{(r - 1)t}}}}{{r - 1}}{e^t}\\ = \left[ {a + \left( {a - b} \right)\frac{{1 - {e^{(r - 1)t}}}}{{r - 1}}} \right]{e^t}\\ = \left[ {a - \left( {a - b} \right)\frac{{{e^{(r - 1)t}} - {e^{0t}}}}{{r - 1}}} \right]{e^t}\\ = \left[ {a - \left( {a - b} \right){{\left. 
{\frac{d}{{dx}}\left[ {{e^{xt}}} \right]} \right|}_{x = 0}}} \right]{e^t}\\ = \left[ {a - \left( {a - b} \right)t} \right]{e^t}\end{array}$$<br />Which indeed takes the form<br /><br />$$y(t) = \left( {{c_1} + {c_2}t} \right){e^t}$$<br />With $c_1,\,\,c_2$ such that $y(0)=a,\,\,y'(0)=b$.<br /><br />Here's a visualisation of the limit, with varying values of $r$:<br /><br /><center><iframe frameborder="0" height="500px" src="https://www.desmos.com/calculator/yndnturnuc?embed" style="border: 1px solid #ccc;" width="500px"></iframe></center><br />And here's an <a href="https://www.desmos.com/calculator/pvwogbdtzs" target="_blank">interactive version with a slider for <i>r</i></a>.<br /><br />characteristic equationsdifferential equationshomogenous differential equationslimiting caseslinear differential equationsThu, 08 Mar 2018 12:22:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-1402369630137593999Abhimanyu Pallavi Sudhir2018-03-08T12:22:00ZLimiting cases I: the integral of e^(ax) and the finite-domain Fourier transform
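As a numerical sanity check on the repeated-roots limit derived above (a sketch of my own, not part of the original post; the function names are mine), one can evaluate the distinct-roots solution with $y(0)=a$, $y'(0)=b$ for $r$ close to $1$ and compare it against the claimed limit $(a-(a-b)t)e^t$:

```python
import math

def y_two_roots(t, r, a, b):
    # Solution of (D - I)(D - rI)y = 0 with y(0) = a, y'(0) = b, valid for r != 1
    return ((r * a - b) * math.exp(t) - (a - b) * math.exp(r * t)) / (r - 1)

def y_repeated(t, a, b):
    # The claimed r -> 1 limit: (a - (a - b)t) e^t
    return (a - (a - b) * t) * math.exp(t)

a, b, t = 2.0, 3.0, 1.5
for r in [1.1, 1.01, 1.001]:
    # The discrepancy shrinks roughly linearly in (r - 1)
    print(r, abs(y_two_roots(t, r, a, b) - y_repeated(t, a, b)))
```

Nothing deep here -- it just confirms that, once the solution is written in terms of initial conditions, nothing blows up as $r \to 1$.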
https://thewindingnumber.blogspot.com/2018/03/limiting-integral-of-eax.html
0The integral<br /><br />$$\int_{}^{} {{e^{ax}}dx} = \frac{{{e^{ax}}}}{a}+C$$<br />Unless $a=0$, in which case we're integrating $1$, and the answer is $x+C$.<br /><br />This discontinuity is jarring, and seemingly odd. If we were to just substitute $a=0$ into $\frac{{{e^{ax}}}}{a}$, we don't get an indeterminate form that could possibly turn into $x$ if you took the limit instead — you just get infinity, which is nuts.<br /><br />The key to solving this weirdness lies in the $+C$ term — because of its presence, you can't intelligently evaluate such a limit. After all, perhaps you should take $C$ to be equal to minus infinity in some way. The way to handle this is to use a definite integral. If one integrates instead between two limits — say, 0 and $x$, the the arbitrary constant disappears.<br /><br />$$\int_0^x {{e^{ax}}dx} = \frac{{{e^{ax}} - 1}}{a}$$<br />Meanwhile, integrating $1$ between 0 and $x$ just gives you $x$.<br /><br />Now, the limit $\mathop {\lim }\limits_{a \to 0} \frac{{{e^{ax}} - 1}}{a}$ is easy to take — just do a bit of L' Hopital, and you see that indeed:<br /><br />$$\mathop {\lim }\limits_{a \to 0} \frac{{{e^{ax}} - 1}}{a} = x$$<br />Like I said, this integral shows up a lot when we're dealing with complex functions. For example, the integral:<br /><br />$$\int\limits_{ - \infty }^\infty {{e^{-i\omega t}}dt} $$<br />Is zero for all values of $\omega$ except $\omega=0$, where it goes to infinity. We call this function the "Dirac delta function" $\delta(\omega)$. The integral is exactly the same as before, but <b>this time, taking the limit won't work either</b> — the limit of the integral as $\omega\to0$ is 0, <i>not</i> infinity.<br /><br />How do we understand this? 
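(A quick numerical aside of my own, not in the original post: the definite-integral version of the computation really is continuous in $a$. The helper name below is hypothetical.)

```python
import math

def definite_integral(a, x):
    # ∫_0^x e^{a s} ds, with the a = 0 case handled separately
    if a == 0:
        return x
    return (math.exp(a * x) - 1) / a

# (e^{ax} - 1)/a approaches x as a -> 0 -- no discontinuity in sight
for a in [0.1, 0.01, 0.001]:
    print(a, definite_integral(a, 2.0))
```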
Well, notice that the integral is really a Fourier transform — it's the Fourier transform of the function "1", but the same integral is also important in the Fourier transform of any function of the form ${e^{i{\omega _n}t}}$, that is —<br /><br />$$\int\limits_{ - \infty }^\infty {{e^{i({\omega _n} - \omega )t}}dt} $$<br />Just as above, the integral goes crazy when $\omega = {\omega _n}$, so the integral equals $\delta(\omega - \omega_n)$. So the limit is still 0 as you approach $\omega_n$.<br /><br />What changed in our integral that made the limit argument no longer apply? Could it be that our use of complex variables made everything weirder by introducing periodicity? Well, no — our evaluation of the limit didn't assume anything about $a$ being real. The only reason we choose periodicity here is so the improper integral doesn't diverge. Well, the other change was our <b>use of an infinite domain of integration</b>. Could this have made $F(\omega)$ discontinuous?<br /><br /><div style="text-align: center;"><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://i.stack.imgur.com/nVJ5e.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/spUNpyF58BY/0.jpg" frameborder="0" height="270" src="https://www.youtube.com/embed/spUNpyF58BY?feature=player_embedded" width="480"></iframe></a></td></tr><tr><td class="tr-caption" style="text-align: center;">A geometric interpretation of the Fourier transform</td></tr></tbody></table></div><br />Watch the video above. The idea is this: the Fourier transform is zero when the wrapped-up plot has its centre of mass at 0. 
When the domain of your integration is infinite, this is true <i>whenever</i> $\omega\neq\omega_n$, because the discrepancy between $\omega$ and $\omega_n$, however small, means the little cardioid keeps getting rotated a tiny little bit each winding, and finally gets smeared around the entire circle, so the centre of mass is at zero.<br /><br />Meanwhile when $\omega=\omega_n$, the cardioid keeps returning to the same point, so the Fourier transform goes to infinity, because a non-zero centre of mass is getting added an infinite number of times.<br /><br />On the other hand when you're only Fourier-transforming a finite piece of the function (i.e. the limits of your integral are not infinite), the cardioid doesn't get smeared all across the circle, so the value of $F(\omega)$ starts to rise even before $\omega=\omega_n$.<br /><br /><br /><div style="text-align: center;"><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://i.stack.imgur.com/nVJ5e.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="390" data-original-width="800" height="195" src="https://i.stack.imgur.com/nVJ5e.jpg" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">If the domain of the Fourier transform were infinite, the cardioid would have <br />been smeared further, winding around the circle an infinite number of times.</td></tr></tbody></table></div><br />In general, when you have an asymmetric shape forming from the wrapped-up plot, there is some number $N$ so that after $N$ windings, the asymmetric shape returns to its original position after winding around tons of places, and the resulting shape is symmetric. 
Or if $\frac{\phi}{2\pi}$ (where $\phi$ is the phase) is irrational, then you can get as close as you want to such a symmetric shape by approximating it with a sufficiently close rational number, and the actual value of $N$ would be infinite.<br /><br /><div class="twn-furtherinsight"><i></i>Calculate $N$.</div><br />However, when using a finite domain for the Fourier transform, only those winding frequencies $\omega$ for which $N$ is less than the domain of winding — <b>i.e. values where the phase difference is "sufficiently rational"</b> — allow this symmetry to form, so only these values of $\omega$ show up as zero in the finite-domain Fourier transform.<br /><br />Meanwhile, the main peak where $\omega = \omega_n$ isn't quite infinitely tall, because you're only adding up the centre of mass a finite number of times ($\omega t/2\pi$ times).<br /><br />So the finite-domain Fourier transform actually ends up looking like this:<br /><br /><div style="text-align: center;"><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://i.stack.imgur.com/s9sPp.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="390" data-original-width="800" height="195" src="https://i.stack.imgur.com/s9sPp.png" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">We've actually been considering the x-coordinate (real part) of the Fourier<br />transform of $\cos(\omega_n t)$ in these illustrations, but this is really essentially<br />the same as the Fourier transform of $e^{i\omega_n t}$, as in our calculations.</td></tr></tbody></table></div><br />Which <i>isn't </i>a discontinuous Dirac delta function! 
As the domain of the transform widens, the true peaks above become narrower and narrower, taller and taller, the wavy stuff flattens out, and the Fourier transform approaches a Dirac delta function!<br /><br />So this tells us exactly what we need — we do still need to take a limit, but we need to take a limit of what <i>function </i>$F(\omega)$ the integral approaches as the domain $(-T,T)\to(-\infty,\infty)$. And this is simple.<br /><br />$$\int_{ - T}^T {{e^{ - i\omega t}}dt} = \frac{{{e^{i\omega T}} - {e^{ - i\omega T}}}}{{i\omega }} = \frac{2}{\omega }\sin (\omega T)$$<br />It is left as an exercise to the reader to prove that this converges to the delta function $2\pi\delta(\omega)$ in the limit where $T\to\infty$.<br /><br /><div class="twn-hint">To prove the coefficient $2\pi$ on the delta function, consider the area under the curve.</div><br /><div class="twn-furtherinsight">Here's another way you could've arrived at the idea of taking a finite-limit integral: Fourier transforms are pretty common in practical settings, except they're typically done over finite domains of time, since it's kind of impractical to play signals forever. It seems unlikely you'd get some crazy Dirac delta in standard signal processing. So it seems sensible to expect that the discontinuity only arises when you integrate over all $\mathbb{R}$.</div><br /><div class="twn-exercises">Explain similar limiting cases in the following integrals:<br /><ol><li>Integral of $x^n$ as $n\to-1$</li><li>Integral of $a^x$ as $a\to1$ (hint: this isn't really different from the integral of $e^{ax}$)</li></ol></div>calculusfinite-domain fourier transformsfourier transformslimiting caseslimitsTue, 06 Mar 2018 09:02:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-6684702394913811967Abhimanyu Pallavi Sudhir2018-03-06T09:02:00ZThe central limit theorem
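Numerically, the two delta-function credentials of $\frac{2}{\omega }\sin (\omega T)$ above -- a peak of height $2T$ at $\omega = 0$, and a total area of $2\pi$ regardless of $T$ -- are easy to check with a crude Riemann sum (a sketch of my own, not from the post):

```python
import math

def finite_ft(w, T):
    # ∫_{-T}^{T} e^{-iωt} dt = 2 sin(ωT)/ω, which is real; the ω = 0 value is 2T
    if w == 0.0:
        return 2.0 * T
    return 2.0 * math.sin(w * T) / w

T = 5.0
peak = finite_ft(0.0, T)  # grows without bound as T -> ∞

# Midpoint rule on [-60, 60]; the tails oscillate while decaying, so a finite
# cutoff already pins the area down to a few parts in a thousand
dw, W = 0.001, 60.0
area = sum(finite_ft(-W + (k + 0.5) * dw, T) * dw for k in range(int(2 * W / dw)))
print(peak, area, 2 * math.pi)
```

The peak narrowing while the area stays pinned at $2\pi$ is exactly the delta-function behaviour claimed in the limit.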
https://thewindingnumber.blogspot.com/2018/02/the-central-limit-theorem.html
0<style type="text/css">.tg {border-collapse:collapse;border-spacing:0;} .tg td{font-family:Arial, sans-serif;font-size:14px;padding:15px 20px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;} .tg th{font-family:Arial, sans-serif;font-size:14px;font-weight:normal;padding:15px 20px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;} .tg .tg-yubs{background-color:#c3c3c3;vertical-align:top} .tg .tg-5794{background-color:#dddddd;vertical-align:top} .tg .tg-yw4l{vertical-align:top} </style><br />The central limit theorem is perhaps the most beautiful theorem in all statistics. By connecting the binomial distribution to the normal distribution, and otherwise, it answers the question of why the normal distribution comes up so often in statistics. In fact, by stating that means of any distribution are normally distributed, it appears to give the normal distribution its place as the king of all continuous distributions (or does it?)<br /><br />The typical motivation for the central limit theorem comes from looking at large-sample distributions -- the distributions of the sums of two variables.<br /><br />For instance, we've all seen what happens when we add two uniform distributions together:<br /><br /><center><table class="tg" style="text-align: center;"><tbody><tr> <th class="tg-yubs">Sum</th> <th class="tg-5794">0</th> <th class="tg-5794">1</th> <th class="tg-5794">2</th> <th class="tg-5794">3</th> <th class="tg-5794">4</th> <th class="tg-5794">5</th> </tr><tr> <td class="tg-5794">0</td> <td class="tg-yw4l">0</td> <td class="tg-yw4l">1</td> <td class="tg-yw4l">2</td> <td class="tg-yw4l">3</td> <td class="tg-yw4l">4</td> <td class="tg-yw4l">5</td> </tr><tr> <td class="tg-5794">1</td> <td class="tg-yw4l">1</td> <td class="tg-yw4l">2</td> <td class="tg-yw4l">3</td> <td class="tg-yw4l">4</td> <td class="tg-yw4l">5</td> <td class="tg-yw4l">6</td> </tr><tr> <td class="tg-5794">2</td> <td class="tg-yw4l">2</td> <td 
class="tg-yw4l">3</td> <td class="tg-yw4l">4</td> <td class="tg-yw4l">5</td> <td class="tg-yw4l">6</td> <td class="tg-yw4l">7</td> </tr><tr> <td class="tg-5794">3</td> <td class="tg-yw4l">3</td> <td class="tg-yw4l">4</td> <td class="tg-yw4l">5</td> <td class="tg-yw4l">6</td> <td class="tg-yw4l">7</td> <td class="tg-yw4l">8</td> </tr><tr> <td class="tg-5794">4</td> <td class="tg-yw4l">4</td> <td class="tg-yw4l">5</td> <td class="tg-yw4l">6</td> <td class="tg-yw4l">7</td> <td class="tg-yw4l">8</td> <td class="tg-yw4l">9</td> </tr><tr> <td class="tg-5794">5</td> <td class="tg-yw4l">5</td> <td class="tg-yw4l">6</td> <td class="tg-yw4l">7</td> <td class="tg-yw4l">8</td> <td class="tg-yw4l">9</td> <td class="tg-yw4l">10</td> </tr></tbody></table></center><br />If you graphed the distributions, they'd look like this:<br /><br /><center><iframe frameborder="0" height="500px" src="https://www.desmos.com/calculator/mddrggb3xg?embed" style="border: 1px solid #ccc;" width="500px"></iframe><br /><iframe frameborder="0" height="500px" src="https://www.desmos.com/calculator/xsiacskrov?embed" style="border: 1px solid #ccc;" width="500px"></iframe><br /></center><br />Which is in itself a little troubling, because of the discontinuity. The distribution is really just a graph of the number of partitions of $X_2$ into two, where each partition is between 0 and 5.<br /><br />A neat way to visualise this is to imagine a line passing across a square from one vertex to the opposite one, and track the length (divided by $\sqrt{2}$) of the line segment of intersection.<br /><br /><center><iframe frameborder="0" height="500px" src="https://www.desmos.com/calculator/mql72tvu6m?embed" style="border: 1px solid #ccc;" width="500px"></iframe><br /></center><br />What about when you add three variables? Perhaps you think you may have two parabolas intersecting at an even sharper needlepoint, instead of two straight lines/a triangle. 
And everything would get even weirder.<br /><br />Well, actually, it isn't. I encourage you to visualise this for yourself -- while the area starts out quadratically increasing (for about 1/6 of the journey), it eventually "hits" the faces of the cube and slows down in its increase.<br /><br /><center><iframe frameborder="0" height="500px" src="https://www.math3d.org/graph/9cc63e68319ca2b68f1cd175321cf961" style="border: 1px solid #ccc;" width="500px"></iframe></center><br /><div class="twn-furtherinsight">Figure out the piecewise formula for the area, i.e. for the <a href="https://en.wikipedia.org/wiki/Irwin%E2%80%93Hall_distribution" target="_blank">Irwin-Hall distribution</a>. Hint: you can use dimensional analysis to show that each "piece" is a quadratic, even though a different one each time. But which quadratic?</div><br />The distribution of a sum of three variables therefore looks like this:<br /><br /><center><iframe frameborder="0" height="500px" src="https://www.desmos.com/calculator/ob2gphmdjj?embed" style="border: 1px solid #ccc;" width="500px"></iframe><br /></center><br />As you keep increasing the dimension of the space, the function approaches a function that looks suspiciously increasingly like the normal distribution. 
This puts the normal distribution in a pretty important role as a function: it, as a function of <i>X</i>, represents the (<i>n</i> - 1)-volume of the region of intersection between a plane $x_1+x_2+...=X$ and an <i>n</i>-cube, as <i>n</i> approaches infinity (the infinite-dimensional analogue of the cube is called the Hilbert cube).<br /><br />Since the binomial distribution approaches a normal as the dimension approaches infinity (<b>at least as long as you squish the standard deviation so the space becomes more and more continuous -- this is what happens in the final version of the central limit theorem, where you use averages instead of sums</b>), it's fine to use areas and volumes to approximate "number of discrete points intersecting".<br /><br />Our goal is to find the function that this approaches, to "derive" the normal distribution from this defining axiom. You may consider several possible ways of doing so -- one that I thought of was to differentiate the volume under the plane in the cube, and take the limit to infinity. Well, you can try it, but you'll probably fail.<br /><br /><div class="twn-furtherinsight">Why can't we just use the formula for the number of partitions? Well, that would be like modelling the area as linear, quadratic, cubic, etc. -- it's only true for the first part of the distribution, after which we need to subtract off partitions with compartments larger than 5.</div><br /><hr /><br />Perhaps you noticed, when sliding the plane across the cube, the line of intersection between the plane and the bottom face of the cube is exactly the line from the previous step -- the line moving across a square. 
If we do some fancy integral over this domain, it would seem like we'd get the volume under the plane from the area under the line.<br /><br />Going back to the algebraic world, we retain this insight: <i>recurrence</i> <i>relations </i>seem to be the way to go to solve this problem.<br /><br />So let's do it that way -- but first, to get an appreciation of what's going on, let's look again at the discrete, blocky distribution of sums we have at low values of <i>n</i>.<br /><br />Like the sum of two dice-throws can be represented as a square of side 6, [0, 5]<sup>2</sup>, the sum of three dice-throws can be represented as a cube of side 6, [0, 5]<sup>3</sup>. The layers of the cube are shown below:<br /><br /><center><i>Layer </i>$X_3=0$:<br /><table class="tg"><tbody><tr> <th class="tg-yubs">X<sub>2</sub> \ X<sub>1</sub></th> <th class="tg-5794">0</th> <th class="tg-5794">1</th> <th class="tg-5794">2</th> <th class="tg-5794">3</th> <th class="tg-5794">4</th> <th class="tg-5794">5</th> </tr><tr> <td class="tg-5794">0</td> <td class="tg-yw4l">0</td> <td class="tg-yw4l">1</td> <td class="tg-yw4l">2</td> <td class="tg-yw4l">3</td> <td class="tg-yw4l">4</td> <td class="tg-yw4l">5</td> </tr><tr> <td class="tg-5794">1</td> <td class="tg-yw4l">1</td> <td class="tg-yw4l">2</td> <td class="tg-yw4l">3</td> <td class="tg-yw4l">4</td> <td class="tg-yw4l">5</td> <td class="tg-yw4l">6</td> </tr><tr> <td class="tg-5794">2</td> <td class="tg-yw4l">2</td> <td class="tg-yw4l">3</td> <td class="tg-yw4l">4</td> <td class="tg-yw4l">5</td> <td class="tg-yw4l">6</td> <td class="tg-yw4l">7</td> </tr><tr> <td class="tg-5794">3</td> <td class="tg-yw4l">3</td> <td class="tg-yw4l">4</td> <td class="tg-yw4l">5</td> <td class="tg-yw4l">6</td> <td class="tg-yw4l">7</td> <td class="tg-yw4l"><b>8</b></td> </tr><tr> <td class="tg-5794">4</td> <td class="tg-yw4l">4</td> <td class="tg-yw4l">5</td> <td class="tg-yw4l">6</td> <td class="tg-yw4l">7</td> <td class="tg-yw4l"><b>8</b></td> <td 
class="tg-yw4l">9</td> </tr><tr> <td class="tg-5794">5</td> <td class="tg-yw4l">5</td> <td class="tg-yw4l">6</td> <td class="tg-yw4l">7</td> <td class="tg-yw4l"><b>8</b></td> <td class="tg-yw4l">9</td> <td class="tg-yw4l">10</td> </tr></tbody></table><br /><i>Layer </i>$X_3=1$:<br /><table class="tg"><tbody><tr> <th class="tg-yubs">X<sub>2</sub> \ X<sub>1</sub></th> <th class="tg-5794">0</th> <th class="tg-5794">1</th> <th class="tg-5794">2</th> <th class="tg-5794">3</th> <th class="tg-5794">4</th> <th class="tg-5794">5</th> </tr><tr> <td class="tg-5794">0</td> <td class="tg-yw4l">1</td> <td class="tg-yw4l">2</td> <td class="tg-yw4l">3</td> <td class="tg-yw4l">4</td> <td class="tg-yw4l">5</td> <td class="tg-yw4l">6</td> </tr><tr> <td class="tg-5794">1</td> <td class="tg-yw4l">2</td> <td class="tg-yw4l">3</td> <td class="tg-yw4l">4</td> <td class="tg-yw4l">5</td> <td class="tg-yw4l">6</td> <td class="tg-yw4l">7</td> </tr><tr> <td class="tg-5794">2</td> <td class="tg-yw4l">3</td> <td class="tg-yw4l">4</td> <td class="tg-yw4l">5</td> <td class="tg-yw4l">6</td> <td class="tg-yw4l">7</td> <td class="tg-yw4l"><b>8</b></td> </tr><tr> <td class="tg-5794">3</td> <td class="tg-yw4l">4</td> <td class="tg-yw4l">5</td> <td class="tg-yw4l">6</td> <td class="tg-yw4l">7</td> <td class="tg-yw4l"><b>8</b></td> <td class="tg-yw4l">9</td> </tr><tr> <td class="tg-5794">4</td> <td class="tg-yw4l">5</td> <td class="tg-yw4l">6</td> <td class="tg-yw4l">7</td> <td class="tg-yw4l"><b>8</b></td> <td class="tg-yw4l">9</td> <td class="tg-yw4l">10</td> </tr><tr> <td class="tg-5794">5</td> <td class="tg-yw4l">6</td> <td class="tg-yw4l">7</td> <td class="tg-yw4l"><b>8</b></td> <td class="tg-yw4l">9</td> <td class="tg-yw4l">10</td> <td class="tg-yw4l">11</td> </tr></tbody></table><br /><i>Layer </i>$X_3=2$:<br /><table class="tg"><tbody><tr> <th class="tg-yubs">X<sub>2</sub> \ X<sub>1</sub></th> <th class="tg-5794">0</th> <th class="tg-5794">1</th> <th class="tg-5794">2</th> <th 
class="tg-5794">3</th> <th class="tg-5794">4</th> <th class="tg-5794">5</th> </tr><tr> <td class="tg-5794">0</td> <td class="tg-yw4l">2</td> <td class="tg-yw4l">3</td> <td class="tg-yw4l">4</td> <td class="tg-yw4l">5</td> <td class="tg-yw4l">6</td> <td class="tg-yw4l">7</td> </tr><tr> <td class="tg-5794">1</td> <td class="tg-yw4l">3</td> <td class="tg-yw4l">4</td> <td class="tg-yw4l">5</td> <td class="tg-yw4l">6</td> <td class="tg-yw4l">7</td> <td class="tg-yw4l"><b>8</b></td> </tr><tr> <td class="tg-5794">2</td> <td class="tg-yw4l">4</td> <td class="tg-yw4l">5</td> <td class="tg-yw4l">6</td> <td class="tg-yw4l">7</td> <td class="tg-yw4l"><b>8</b></td> <td class="tg-yw4l">9</td> </tr><tr> <td class="tg-5794">3</td> <td class="tg-yw4l">5</td> <td class="tg-yw4l">6</td> <td class="tg-yw4l">7</td> <td class="tg-yw4l"><b>8</b></td> <td class="tg-yw4l">9</td> <td class="tg-yw4l">10</td> </tr><tr> <td class="tg-5794">4</td> <td class="tg-yw4l">6</td> <td class="tg-yw4l">7</td> <td class="tg-yw4l"><b>8</b></td> <td class="tg-yw4l">9</td> <td class="tg-yw4l">10</td> <td class="tg-yw4l">11</td> </tr><tr> <td class="tg-5794">5</td> <td class="tg-yw4l">7</td> <td class="tg-yw4l"><b>8</b></td> <td class="tg-yw4l">9</td> <td class="tg-yw4l">10</td> <td class="tg-yw4l">11</td> <td class="tg-yw4l">12</td> </tr></tbody></table><br /><i>Layer </i>$X_3=3$:<br /><table class="tg"><tbody><tr><th class="tg-yubs">X<sub>2</sub> \ X<sub>1</sub></th><th class="tg-5794">0</th><th class="tg-5794">1</th><th class="tg-5794">2</th><th class="tg-5794">3</th><th class="tg-5794">4</th><th class="tg-5794">5</th></tr><tr><td class="tg-5794">0</td><td class="tg-yw4l">3</td><td class="tg-yw4l">4</td><td class="tg-yw4l">5</td><td class="tg-yw4l">6</td><td class="tg-yw4l">7</td><td class="tg-yw4l"><b>8</b></td></tr><tr><td class="tg-5794">1</td><td class="tg-yw4l">4</td><td class="tg-yw4l">5</td><td class="tg-yw4l">6</td><td class="tg-yw4l">7</td><td class="tg-yw4l"><b>8</b></td><td 
class="tg-yw4l">9</td></tr><tr><td class="tg-5794">2</td><td class="tg-yw4l">5</td><td class="tg-yw4l">6</td><td class="tg-yw4l">7</td><td class="tg-yw4l"><b>8</b></td><td class="tg-yw4l">9</td><td class="tg-yw4l">10</td></tr><tr><td class="tg-5794">3</td><td class="tg-yw4l">6</td><td class="tg-yw4l">7</td><td class="tg-yw4l"><b>8</b></td><td class="tg-yw4l">9</td><td class="tg-yw4l">10</td><td class="tg-yw4l">11</td></tr><tr><td class="tg-5794">4</td><td class="tg-yw4l">7</td><td class="tg-yw4l"><b>8</b></td><td class="tg-yw4l">9</td><td class="tg-yw4l">10</td><td class="tg-yw4l">11</td><td class="tg-yw4l">12</td></tr><tr><td class="tg-5794">5</td><td class="tg-yw4l"><b>8</b></td><td class="tg-yw4l">9</td><td class="tg-yw4l">10</td><td class="tg-yw4l">11</td><td class="tg-yw4l">12</td><td class="tg-yw4l">13</td></tr></tbody></table><br /><i>Layer </i>$X_3=4$:<br /><table class="tg"><tbody><tr><th class="tg-yubs">X<sub>2</sub> \ X<sub>1</sub></th><th class="tg-5794">0</th><th class="tg-5794">1</th><th class="tg-5794">2</th><th class="tg-5794">3</th><th class="tg-5794">4</th><th class="tg-5794">5</th></tr><tr><td class="tg-5794">0</td><td class="tg-yw4l">4</td><td class="tg-yw4l">5</td><td class="tg-yw4l">6</td><td class="tg-yw4l">7</td><td class="tg-yw4l"><b>8</b></td><td class="tg-yw4l">9</td></tr><tr><td class="tg-5794">1</td><td class="tg-yw4l">5</td><td class="tg-yw4l">6</td><td class="tg-yw4l">7</td><td class="tg-yw4l"><b>8</b></td><td class="tg-yw4l">9</td><td class="tg-yw4l">10</td></tr><tr><td class="tg-5794">2</td><td class="tg-yw4l">6</td><td class="tg-yw4l">7</td><td class="tg-yw4l"><b>8</b></td><td class="tg-yw4l">9</td><td class="tg-yw4l">10</td><td class="tg-yw4l">11</td></tr><tr><td class="tg-5794">3</td><td class="tg-yw4l">7</td><td class="tg-yw4l"><b>8</b></td><td class="tg-yw4l">9</td><td class="tg-yw4l">10</td><td class="tg-yw4l">11</td><td class="tg-yw4l">12</td></tr><tr><td class="tg-5794">4</td><td class="tg-yw4l"><b>8</b></td><td 
class="tg-yw4l">9</td><td class="tg-yw4l">10</td><td class="tg-yw4l">11</td><td class="tg-yw4l">12</td><td class="tg-yw4l">13</td></tr><tr><td class="tg-5794">5</td><td class="tg-yw4l">9</td><td class="tg-yw4l">10</td><td class="tg-yw4l">11</td><td class="tg-yw4l">12</td><td class="tg-yw4l">13</td><td class="tg-yw4l">14</td></tr></tbody></table><br /><i>Layer </i>$X_3=5$:<br /><table class="tg"><tbody><tr><th class="tg-yubs">X<sub>2</sub> \ X<sub>1</sub></th><th class="tg-5794">0</th><th class="tg-5794">1</th><th class="tg-5794">2</th><th class="tg-5794">3</th><th class="tg-5794">4</th><th class="tg-5794">5</th></tr><tr><td class="tg-5794">0</td><td class="tg-yw4l">5</td><td class="tg-yw4l">6</td><td class="tg-yw4l">7</td><td class="tg-yw4l"><b>8</b></td><td class="tg-yw4l">9</td><td class="tg-yw4l">10</td></tr><tr><td class="tg-5794">1</td><td class="tg-yw4l">6</td><td class="tg-yw4l">7</td><td class="tg-yw4l"><b>8</b></td><td class="tg-yw4l">9</td><td class="tg-yw4l">10</td><td class="tg-yw4l">11</td></tr><tr><td class="tg-5794">2</td><td class="tg-yw4l">7</td><td class="tg-yw4l"><b>8</b></td><td class="tg-yw4l">9</td><td class="tg-yw4l">10</td><td class="tg-yw4l">11</td><td class="tg-yw4l">12</td></tr><tr><td class="tg-5794">3</td><td class="tg-yw4l"><b>8</b></td><td class="tg-yw4l">9</td><td class="tg-yw4l">10</td><td class="tg-yw4l">11</td><td class="tg-yw4l">12</td><td class="tg-yw4l">13</td></tr><tr><td class="tg-5794">4</td><td class="tg-yw4l">9</td><td class="tg-yw4l">10</td><td class="tg-yw4l">11</td><td class="tg-yw4l">12</td><td class="tg-yw4l">13</td><td class="tg-yw4l">14</td></tr><tr><td class="tg-5794">5</td><td class="tg-yw4l">10</td><td class="tg-yw4l">11</td><td class="tg-yw4l">12</td><td class="tg-yw4l">13</td><td class="tg-yw4l">14</td><td class="tg-yw4l">15</td></tr></tbody></table></center><br />If you were to track the presence of any individual number through this cube, you'd find that they do, in fact, form a plane. 
For example, we've bolded the "8"s that appear through the cube. The number of them is the height of the distribution at $S_3=8$.<br /><br />So what is the number of '8's in the cube? Well, it's the number of '8's in the $X_3=0$ layer, plus the number of '8's in the $X_3=1$ layer, and so on. Now, this is interesting -- it seems like the number of '8's in the $X_3 = 1$ layer equals the number of '7's in the $X_3 = 0$ layer. Similarly, the number of '8's in the $X_3 = 2$ layer equals the number of '6's in the $X_3 = 0$ layer.<br /><br /><div class="twn-furtherinsight">Think about why this is so. The reason will be important when we extend this to continuous distributions.</div><br />In general, if we let the frequency of a sum $S$ of <i>n</i> variables with a uniform discrete distribution on $[0, L]$ be $f_{S,n}$, then:<br /><br />$$f_{S,n}=f_{S,n-1}+f_{S-1,n-1}+...+f_{S-L,n-1}$$<br />Which is the recursion we were looking for.<br /><br /><div class="twn-furtherinsight">Perhaps this identity may remind you of some sort of a transformation of the hockey-stick identity on the Pascal triangle. In fact, if you let $L=1$, you will get exactly the Pascal triangle -- or a right-angled version thereof -- if you plot a table of $f_{S,n}$ along $S$ and $n$ with some initial conditions. 
<b>Exploration of this is left as an exercise to the reader -- this is very important</b>, it is where the intuition behind the link to binomial distributions comes from.<br /><br />We can generalise this to a non-uniform base distribution -- just weight each frequency $f_{S-K,n-1}$ by the frequency $f_{K,1}$ of getting the number $K$ you need to bring the sum up from $S-K$ to $S$,<br /><br />$$f_{S,n}=f_{0,1}f_{S,n-1}+f_{1,1}f_{S-1,n-1}+...+f_{L,1}f_{S-L,n-1}$$<br />In fact, in this form, you don't even need to restrict the domain of the summation to vary between $S$ and $S-L$ -- you can just write:</div><br />$${f_{S,n}} = \sum\limits_{k = - \infty }^\infty {{f_{k,1}}{f_{S - k,n - 1}}} $$<br />It's just that for $k$ outside the range of $0$ and $L$, $f_{k,1}$ is zero for the distribution we've been studying (a uniform distribution on the interval $[0,L]$). For more general distributions, this is not necessarily true, and the equation above is the general result.<br /><br /><div class="twn-furtherinsight">Why? Link this to the bracketed exploration two steps before this one.</div><br />This result we have now, if you think about it, has really been pretty obvious all the while. All we're doing is summing up the probabilities of all possible combinations of $S_{n-1}$ and $S_1$. This result applies generally, to all possible discrete distributions -- and we will see the analog for continuous distributions in a moment.<br /><br />We've been talking about frequencies all this while, but replacing $f$ with probabilities makes no change to the relation.<br /><br /><div class="twn-furtherinsight">Why? Think about this one a bit geometrically -- in terms of the plane-cube thing.</div><br />To extend the relation to continuous distributions, however, we need to talk in terms of probability <i>densities</i>. 
In doing so, we write the probabilities as the product of a probability density and a differential, replacing $k$ with a continuous variable and the summation with an integral.<br /><br />$${p_n}(S)\,dt = \int_{ - \infty }^\infty {{p_1}(t){p_{n - 1}}(S - t)\,d{t^2}} $$<br /><br /><div class="twn-furtherinsight">Why does it make sense to have a $dt$ differential on the left-hand side? Well, because $t$ is simply the variable on the axis on which a specific value of $S$ is marked -- the probability density is still obtained by dividing the probability by $dt$.</div><br />Dividing both sides by $dt$,<br /><br />$${p_n}(S)\, = \int_{ - \infty }^\infty {{p_1}(t){p_{n - 1}}(S - t)\,dt} $$<br />Now this is a very interesting result -- one can see that this is simply the <b>convolution</b> function:<br /><br />$${p_n}(S) = {p_1}(S) * {p_{n - 1}}(S)$$<br />Well, how do we evaluate a convolution? Well, we take a Laplace transform, of course! So we get:<br /><br />$$\begin{gathered}\mathcal{L}\left[ {{p_n}(S)} \right] = \mathcal{L}\left[ {{p_1}(S) * {p_{n - 1}}(S)} \right] \hfill \\<br />{\mathcal{P}_n}(\Omega) = {\mathcal{P}_1}(\Omega){\mathcal{P}_{n - 1}}(\Omega) \hfill \\<br />\end{gathered} $$<br /><br />Or trivially solving the recurrence relation,<br /><br />$${\mathcal{P}_n}(\Omega) = {\mathcal{P}_1}{(\Omega)^n}$$<br /><br />Our challenge is this: <b>does the Laplace transform of any probability density function, when taken to the $n$<sup>th</sup> power, always approach the Laplace transform of some given function as $n\to\infty$? </b>This function, ${\mathcal{P}_1}{(\Omega)^n}$, turns out to be the normal distribution (or rather its Laplace transform).<br /><br /><hr /><br />Well, how do we evaluate ${\mathcal{P}_1}{(\Omega)^n}$? At first glance, it seems impossible that this always converges to the same function -- after all, ${\mathcal{P}_1}{(\Omega)}$ could be any function, right?<br /><br />Not really. 
Think about the restrictive properties of a generating function/moment-generating function/"Laplace-transform of a probability distribution" -- a lot of them have to do with its value and the value of its derivatives at zero. This fact strongly suggests an approach involving a Taylor series.<br /><br />Suppose we Taylor expand ${\mathcal{P}_1}{(\Omega)}$ as follows:<br /><br />$${\mathcal{P}_1}(\Omega ) = \mathcal{P}_1^{(0)}(0) + \mathcal{P}_1^{(1)}(0)\Omega + \mathcal{P}_1^{(2)}(0)\frac{{{\Omega ^2}}}{2} + ...$$<br /><div class="twn-pitfall">This is NOT the generating function of a discrete probability distribution. The coefficients do not represent any probabilities -- they are simply derivatives of the generating function evaluated at zero.</div><br />Now, by the properties of generating functions (compare each one to a property of generating functions of discrete variables -- except those involve 1 instead of 0, because they don't do ${e^s}$ in their definition),<br /><ul><li>${\mathcal{P}_1}(0) = 1$ </li><li>$\mathcal{P}_1^{(1)}(0) = \mu $</li><li>$\mathcal{P}_1^{(2)}(0) = \sigma^2$</li></ul><br />There is something familiar about taking the limit $n\to\infty$ of an expression like<br /><br />$${\mathcal{P}_n}(\Omega ) = {\left( {1 + \mu \Omega + \frac{1}{2}{\sigma ^2}{\Omega ^2} + ...} \right)^n}$$<br />It might remind you of the old limit ${e^x} = \lim\limits_{n \to \infty}{\left( {1 + \frac{x}{n}} \right)^n}$. If only we could find a way to get the term on the inside to be "something" (some $x$) divided by $n$.<br /><br />And well -- it turns out, there is a way. Remember -- when we're talking about the summed distribution, the mean and variance are $n\mu$ and $n\sigma^2$. 
We will represent these as $\mu_n$ and $\sigma_n^2$, so<br /><br />$${\mathcal{P}_n}(\Omega ) = {\left( {1 + \frac{{{\mu _n}\Omega }}{n} + \frac{{\sigma _n^2{\Omega ^2}/2}}{n} + ...} \right)^n}$$<br />When you take the limit as $n\to\infty$, this approaches<br /><br />$${\mathcal{P}_n}(\Omega ) = {e^{{\mu _n}\Omega + \frac{1}{2}\sigma _n^2{\Omega ^2}}}$$<br />Which is precisely the moment-generating function of a normal distribution with mean ${{\mu _n}}=n\mu$ and variance ${\sigma _n^2}=n\sigma^2$. The distribution of the mean follows.<br /><br /><hr /><br />There may seem to be something off with our proof. We chose to "cut off" our Taylor series at the $\Omega^2$ term for no apparent reason -- if we had extended the series to include the $\Omega^3$ term, we'd have gotten an additional term representing the "<b>third moment</b>" of the distribution (the zeroth moment is 1, the first moment is the mean and the second moment is the variance).<br /><br />Indeed, we would've. And the distribution of the mean indeed does approach a skewed normal distribution with a skew that can be calculated from the skew of the original distribution (much like the mean and variance are calculated from those of the original distribution). The skew would decrease rapidly, of course, even faster than the standard deviation decreases as $n$ increases.<br /><br />Similarly, if we'd stopped the series at $\Omega^4$, we'd get a "kurtosis" (the fourth moment), which would decrease even faster.<br /><br />So which distribution does the mean actually approach?<br /><br /><b>All of them.</b><br /><br />Think about the epsilon-delta definition of the limit. You can always define some amount of closeness and you'll be able to get an $n$ large enough to ensure that the distribution of the mean is close enough to any one of these distributions. 
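This convergence is easy to watch numerically: convolve a distribution with itself many times and compare the result against the normal density with mean $n\mu$ and variance $n\sigma^2$. A quick sketch (the base distribution here is an arbitrary choice, not one from the text):

```python
import math

def convolve(p, q):
    # r[S] = sum over k of p[k] * q[S - k]
    r = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            r[i + j] += pi * qj
    return r

# A deliberately lopsided distribution on {0, 1, 2}
p1 = [0.5, 0.2, 0.3]
mu = sum(k * pk for k, pk in enumerate(p1))               # mean of one copy
var = sum(k * k * pk for k, pk in enumerate(p1)) - mu**2  # variance of one copy

# Distribution of the sum of n copies = n-fold convolution
n = 200
pn = p1
for _ in range(n - 1):
    pn = convolve(pn, p1)

def normal_pdf(x, m, v):
    return math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)

# Probability of the most likely total vs. the normal density there
S = round(n * mu)
print(pn[S], normal_pdf(S, n * mu, n * var))  # agree to within a few percent
```

The same experiment with the $\Omega^3$ term in mind shows the residual skew shrinking as $n$ grows, which is the point made just above.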
The thing with functions is, they can approach a number of different functions at once, because all those functions also approach the same thing.<br /><br />We <i>can</i> take the skew into account if we want to -- it's just that for big-enough values of $n$, this skew is really small. For even bigger values of $n$, the standard deviation is really small, too. Indeed, we can even say the distribution approaches a Dirac delta function at the mean (called a <b>degenerate distribution</b>).<br /><br /><div class="twn-furtherinsight">It seems that all functions not just can, but necessarily <em>do</em> approach a number of different functions at once -- i.e. you can always find multiple functions converging to the same thing. Are there any conditions on the function for this to be true? Think about the domain and co-domain.</div><br /><div class="twn-analogies">We've used the phrases "Laplace transform", "generating function", "bilateral Laplace transform" and "moment generating function" interchangeably, but there are subtle differences in the way they're defined (even though they're all kinda isomorphic).<br /><ul><li>A <b>generating function</b>, while typically defined as something for discrete/integer-valued random variables, can be extended to continuous distributions pretty easily. It's equivalent to a moment-generating function if you write $z^X = e^{\Omega X}$ via a variable substitution.</li><li>When talking about probability distributions, we always take the <b>bilateral Laplace transform</b>, not the standard <b>Laplace transform</b>, because the integrals are often easier to evaluate (for instance, you can always integrate a normal distribution with non-zero mean from $-\infty$ to $\infty$ with the standard $\sqrt{\pi}$ business, but you can't do that from $0$ to $\infty$ -- if you change the variables, the lower limit changes).</li><li><b>Moment generating functions </b>involve using ${e^{\Omega X}}$ instead of ${e^{ - \Omega X}}$ as the integrand. 
Thus a transformation of $\Omega\to-\Omega$ transforms between a moment generating function and a bilateral Laplace transform.</li><li>The <b>characteristic function</b> is a Wick rotation of the moment generating function, obtained via an integrand of ${e^{i\Omega X}}$.</li><li>The <b>Fourier transform</b> is similarly related to the characteristic function; it takes an integrand of ${e^{-i\Omega X}}$.</li></ul><br />Here's a convenient table to remember them by:<br />$$E\left( {{e^{\Omega X}}} \right) = G\left( {{e^\Omega }} \right) = MG(\Omega ) = \mathcal{L}_b( - \Omega ) = \varphi ( - i\Omega ) = \mathcal{F}(i\Omega )$$<br /></div>central limit theorem, generating functions, integral transform, laplace transform, mathematics, moment generating functions, statistics Thu, 22 Feb 2018 07:00:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-2881717743458457722Abhimanyu Pallavi Sudhir2018-02-22T07:00:00ZHow are skewness, kurtosis etc. distributed?
https://math.stackexchange.com/questions/2661409/how-are-skewness-kurtosis-etc-distributed
3<p>The mean of independent identically distributed random variables, if the moments exist, is normally distributed (to second order). The variance of independent identically distributed random variables is chi-square distributed.</p>
<p>Degenerate, normal, chi-squared... what comes next? How are the skewness, kurtosis and other higher moments distributed?</p>probability-distributions, random-variables, moment-generating-functions Thu, 22 Feb 2018 06:50:04 GMThttps://math.stackexchange.com/q/2661409Abhimanyu Pallavi Sudhir2018-02-22T06:50:04ZRandom variables as vectors
https://thewindingnumber.blogspot.com/2018/02/random-variables-and-their-properties.html
0<div class="separator" style="clear: both; text-align: center;"><iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/IEiX_Aj5tVQ/0.jpg" frameborder="0" height="266" src="https://www.youtube.com/embed/IEiX_Aj5tVQ?feature=player_embedded" width="320"></iframe></div><br />Even if you don't immediately see the problem with the seated character's reasoning in the above clip, you probably do realise that Muhammad Li <i>isn't</i> the most common name in the world. Indeed, it seems that while the seated character is quite humorous, he doesn't have a very good grasp of basic statistics.<br /><br />The key mistake made in his reasoning is that the first name and the last name are <i>not independent variables</i> -- a person with <i>Muhammad</i> as his first name is much more likely to have <i>Haafiz</i> as his last name than <i>Li</i>, even though <i>Li</i> may be more common among humans as a whole. In fact, the most common name in the world -- where the 2-variable name plot has its multivariable global maximum -- is <i>Zhang Wei</i>.<br /><br />This raises an essential issue in statistics -- the variables "first name" and "last name" vary together, or <i>covary</i> -- as the first name varies on a spectrum, perhaps from "Muhammad" to "Ping", the last name varies together, perhaps from "Hamid" to "Li". One may then assign numbers to each first name and each last name, and perform all sorts of statistical analyses on them.<br /><br />But this ordering -- or the assignment of numbers -- seems to be dependent on some sort of reasoning based on prior knowledge. There are always plenty of other ways you can arrange the values of the variables so they correlate just as well, or even better. In this case, our reasoning was that both name and surname have a common determinant, e.g. place of origin, or religion. 
Without this reasoning, the arrangement seems arbitrary, or random -- which is why we call the specific numerical variable associated with the variable a <i>random variable</i>.<br /><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://i.stack.imgur.com/Fvxxt.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="519" data-original-width="800" height="257" src="https://i.stack.imgur.com/Fvxxt.png" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Example of non-linear transformation/rearrangement on some data.</td></tr></tbody></table>This also explains why linear correlations are of particular importance -- changing between arrangements, or random variables, is simply some transformation in the variable. As we saw, doing so would change the observed correlation. However, with linear transformations, the correlation will not change -- if all data points perfectly fit a line (any line), the magnitude of the correlation <i>is</i> going to be 1.<br /><br />This importance of linearity -- and the fact that maximum correlation is achieved when data points fit on a <i>line</i> -- is suggestive.<br /><br />Well, if they all fit on a line, it means the two variables -- random variables -- $X$ and $Y$ satisfy some relation $Y = mX + c$. Well, if it were $Y = mX$, then it would be clear where we're going -- it means if you put all the values of $X$ and the corresponding values of $Y$ (i.e. the x-coordinates and y-coordinates of each data point) into two $N$-dimensional vectors (where $N$ is the number of data points), then the two vectors would be multiples of each other, $\vec Y = m\vec X$.<br /><br />So how do we get rid of the $+c$ term and make the whole thing linear, instead of affine? 
Obviously, we can transform the variables in some way, including a translation. Rather than arbitrarily choosing the translation, though (remember, translating either $X$ or $Y$ can make the thing pass through the origin), we translate them <i>both</i>, so the mean of the data points lies on the origin.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://i.stack.imgur.com/K5CQ6.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="519" data-original-width="800" height="257" src="https://i.stack.imgur.com/K5CQ6.png" width="400" /></a></div><br />In addition, we scale the data points by $1/\sqrt{N}$ so the sizes of the vectors aren't influenced by the number of data points (this is the same reason we divide by $\sqrt{N}$ in stuff like standard deviation formulae).<br /><br />Then the vectors:<br /><br />$$\vec X = \frac1{\sqrt{N}}\left[ \begin{gathered}<br />{X_1} - \bar X \\<br />{X_2} - \bar X \\<br />\vdots\\<br />{X_N} - \bar X \\<br />\end{gathered}\right]$$<br />and<br /><br />$$\vec Y = \frac1{\sqrt{N}}\left[ \begin{gathered}<br />{Y_1} - \bar Y \\<br />{Y_2} - \bar Y \\<br />\vdots \\<br />{Y_N} - \bar Y \\<br />\end{gathered} \right]$$<br />are colinear... if the linear correlation is perfect.<br /><br />Well, what if it's not? Well, clearly, the vectors $\vec X$ and $\vec Y$ represent the deviation of each data point from the mean. Calculating their norms would give us the standard deviation in $X$ and $Y$ respectively.<br /><br />$$\begin{gathered}<br />\text{Var}\,(X) = {\left| {\vec X} \right|^2} \hfill \\<br />\text{Var}\,(Y) = {\left| {\vec Y} \right|^2} \hfill \\<br />\end{gathered} $$<br />Similarly, calculating their dot product tells us how much the two vectors go together -- or, how much $X$ and $Y$ vary together, or <i>covary</i>. 
It tells us their <i>covariance</i>.<br /><br />$$\text{Cov}\,(X,Y) = \vec X \cdot \vec Y$$<br />Note, however: this doesn't really give us the measure of <i>colinearity</i>, much like the dot product doesn't tell us the measure of colinearity. The dot product tells us the measure of how much the two vectors go together -- it's not just the "together" part that matters, but also the "go". The more each vector "goes" (in its own direction), the more the dot product.<br /><br />In a sense, the dot product measures "co-going". Similarly, the covariance of two variables depends not only on how correlated they are, but also on how much each variable varies.<br /><br />To measure correlation, we need $\cos\theta$, i.e.<br /><br />$${\text{Corr}}\,(X,Y) = \frac{{{\text{Cov}}\,(X,Y)}}{{\sqrt {{\text{Var}}\,(X){\text{Var}}\,{\text{(}}Y{\text{)}}} }}$$<br /><hr /><br />This geometric understanding of random variables is extremely useful. For instance, you may have wondered about this oddity about the variance of a sum of variables -- the variance of the sum of independent variables goes like this:<br /><br />$${\text{Var}}\left( {{X_1} + {X_2}} \right) = {\text{Var}}\left( {{X_1}} \right) + {\text{Var}}\left( {{X_2}} \right)$$<br />But the variance of the sum of the same variable goes like this:<br /><br />$${\text{Var}}\left( {2{X_1}} \right) = 4{\text{Var}}\left( {{X_1}} \right)$$<br />Why? Well, when you're talking about independent variables, you're talking about <strong>orthogonal vectors</strong>. When you're talking about the same variable -- or any two perfectly correlated variables -- you're talking about <strong>parallel vectors</strong>. Variance is just norm-squared, so in the former case, we apply Pythagoras's theorem, which tells us the norm-squared adds up. In the latter case, scaling a vector by 2 scales up its norm by 2 and thus its norm-squared by 4. <br /><br />Well, what about for the cases in between? What's the generalised result? 
Well, it's the cosine rule, of course!<br /><br />$${\text{Var}}\left( {{X_1} + {X_2}} \right) = {\text{Var}}\left( {{X_1}} \right) + {\text{Var}}\left( {{X_2}} \right) + 2{\text{Cov}}\left( {{X_1},{X_2}} \right)$$<br /><div class="twn-furtherinsight">Why is it $+ 2{\text{Cov}}\left( {{X_1},{X_2}} \right)$ and not $- 2{\text{Cov}}\left( {{X_1},{X_2}} \right)$? Try to derive the formula above geometrically to find out.</div><br /><div class="twn-furtherinsight">How would this result generalise to variances of the form ${\text{Var}}\left(X_1+X_2+X_3\right)$ for instance?</div><br /><hr /><br />Here's something to ponder about: what if we had more than two variables? Then we'd have more than two vectors. Can we still measure their linear correlation? What about planar correlation? <br /><br />To answer this, you may first want to consider natural generalisations of cosines to three vectors that arise from answering the previous Further Insight Prompt ("How would this result generalise to variances of...?"). Trust me, the result will be worth it!<br /><br />Another exercise: with only the intuition and analogies we've developed here, discover the equations of the least-squares approximation/line-of-best-fit with:<br /><ul><li>vertical offsets</li><li>horizontal offsets</li><li>if you can, perpendicular offsets</li></ul>And finally, a simple one: interpret the formula $\text{Cov}(X,Y)=E(XY)-E(X)E(Y)$ with the intuition we already have.<br /><br /><hr/><br />Here's an interesting application of the idea of variables being dependent or independent -- you've probably heard of Aristotle's "the truth lies in the middle of the two extremes" nonsense. Ignoring for a moment the fact that this statement completely, utterly lacks anything remotely resembling something called <em>meaning</em>, and the fact that the true answer to the question "How thoroughly should Aristotle be brutally flogged to death for being retarded?" 
is pretty extreme, let's do some intuitive hand-wavy analysis of the statement.<br /><br />My first reaction -- temporarily suppressing rationality and homicidal feelings towards the crook who defrauded Ancient Greece and helped plunge Europe into the dark ages -- is that this want-meaning statement is completely untrue. In fact, the truth typically lies <em>at</em> the extremes. Back in the 1850s in the U.S., the "middle-of-the-road" position was to ship the slaves off to Africa. This wasn't the truth, the truth was "liberate the slaves". The reason that the truth tends to lie at the extremes, I realised, is that the same principles that hold true to justify one prescription, still remain true when justifying another prescription. So the correct prescription on <em>all</em> issues tend towards the same principles, and the correct ideology results from applying the same principles consistently -- the sum of which therefore doesn't cancel out, but instead adds up to an extreme correct ideology.<br /><br />But hey -- while this is true in the context of abstract political ideologies, it's not true in other contexts. For example, "how much is the environment worth?" Clearly, neither extreme -- "chop down every tree on Earth to give some kid an iPhone" or "let everyone in the world die a gruesome death to save one tree" -- is the right prescription here. The environment has some finite, non-zero economic value. Why is the golden mean so wrong in the context of "extreme intellectual consistency in politics is good" and yet right in the context of "there are optimal balances/allocations in the economy"?<br /><br />The reason there is a finite value for the environment is that the more "environment" you have, the less valuable the next unit of "environment" becomes, since there's less use for it -- and people not dying gruesome deaths (or getting iPhones) becomes a better use of resources. 
In other words, the next unit of environment and the current unit of environment are <em>not independent variables</em> -- one variable affects the other. On the other hand, the separate political prescriptions are independent variables, so stuff adds up and doesn't cancel out.<br /><br />The analogy is far from a perfect one (the variables aren't random variables in the first place), but it's interesting to think about. correlation, covariance, intuition, mathematics, random variables, statistics, vectors Sat, 10 Feb 2018 04:48:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-8155535396454463676Abhimanyu Pallavi Sudhir2018-02-10T04:48:00ZIntroduction to tensors and index notation
https://thewindingnumber.blogspot.com/2017/10/introduction-to-tensors-and-index.html
0When you learned linear algebra, you learned that under a passive transformation $B$, a scalar $c$ remained $c$, vector $v$ transformed as $B^{-1}v$ and a matrix transformed as $B^{-1}AB$. That was all nice and simple, until you learned about quadratic forms. Matrices can be used to represent quadratic forms too, in the form<br /><br />$$v^TAv=c$$<br />Now under the passive transformation $B$, $c\to c$, $v\to B^{-1}v$ and $v^T\to v^T\left(B^T\right)^{-1}$. Let $A'$ be the matrix such that $A\to A'$. Then<br /><br />$${v^T}{\left( {{B^T}} \right)^{ - 1}}A'{B^{ - 1}}v = c = {v^T}Av$$<br />As this must be true for all vectors $v$, this means ${\left( {{B^T}} \right)^{ - 1}}A'{B^{ - 1}} = A$. Hence<br /><br />$$A' = {B^T}AB$$<br />This is rather bizarre. Why would the <i>same object</i> -- a matrix -- transform differently based on how it's used?<br /><br />The answer is that these are really two different objects that just correspond to the same matrix in a particular co-ordinate representation. The first, the object that transformed as $B^{-1}AB$, is an object that maps vectors to other vectors. The second is an object that maps two vectors to a scalar.<br /><br /><b>These objects we're talking about are tensors.</b> A tensor representing a quadratic form is not the same as a tensor representing a standard vector transformation, because they only have the same representation (i.e. the same matrix) in a specific co-ordinate basis. Change your basis, and voila! The representation has transformed away, into something entirely different.<br /><br />There's a convenient notation used to distinguish between these kinds of tensors, called index notation. 
Representing vectors as ${v^i}$ for index $i$ that runs between 1 and the dimension of the vector space, we write<br /><br />$$A_i^j{v^i} = {w^j}$$<br />For the vanilla linear transformation tensor -- $j$ can range over different values, if we're dealing with non-square matrices, but this is not why we use a different index -- we use a different index because $i$ and $j$ can independently take on distinct values. Meanwhile,<br /><br />$$\sum\limits_{i,j}^{} {{A_{ij}}{v^i}{v^j}} = c$$<br />A few observations to define our notation:<br /><ol><li>Note how we really just treat $v^i$, etc. as the $i$th component of the vector $v$, as the notation suggests. This is very useful, because it means we don't need to remember the meanings of fancy new products, etc. -- just write stuff down in terms of components. This is also why order no longer matters in this notation -- the fancy rules regarding matrix multiplication are now irrelevant, our multiplication is all scalar, and the rules are embedded into the way we calculate these products.</li><li>An index, if repeated once on top and once at the bottom anywhere throughout the expression, ends up cancelling out. This is the point of choosing stuff to go on top and stuff to go below.</li><li>If you remove the summation signs, things look a lot more like the expressions with vectors directly (i.e. not component-wise).</li></ol><br />(1) cannot be emphasised enough -- when we do this product, ${v^i}{w_j} = A_j^i$, what we're really doing is multiplying two vectors to get a rank-2 tensor. When we multiply $v_iw^i=c$, we're multiplying a covector by a vector, and get a rank-0 tensor (a scalar). The row vector/column vector notation and multiplication rules are just notation that helps us yield the same result -- we represent the first as a column vector multiplied by a row vector, and the second as a row vector multiplied by a column vector. 
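These component-wise contractions map directly onto `numpy.einsum`, which implements precisely the summation described here (a sketch, assuming NumPy is available; the arrays are arbitrary examples):

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
v = np.array([1.0, 2.0])
w = np.array([5.0, 6.0])

# A_i^j v^i = w^j : the repeated index i is summed, the free index j survives
print(np.einsum('ij,i->j', A, v))      # w^j has components 7 and 10

# v_i w^i : every index repeated, so the result is a rank-0 tensor (a scalar)
print(np.einsum('i,i->', v, w))        # 17.0

# v^i w_j : no repeated index at all -- the product is a rank-2 tensor
print(np.einsum('i,j->ij', v, w))      # 2x2 outer product

# A_ij v^i v^j = c : the quadratic form
print(np.einsum('i,ij,j->', v, A, v))  # 27.0
```

Since `einsum` sees only bare component arrays, upper-versus-lower placement is bookkeeping carried by the notation rather than by the computer -- consistent with observation (1), all the multiplication is scalar multiplication of components.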
Note that this does not really correspond to the positioning of the indices -- $v_iw^j$ also gives you a rank 2 tensor, since you can swap around the order of multiplication in tensor notation -- this is because here we're really operating with the scalar components of $v$, $w$ and $A$, and scalar multiplication commutes.<br /><br /><div class="twn-furtherinsight">If we were to use standard matrix form and notation to denote $A_j^i$, would $j$ denote which column you're in or which row you're in?</div><br />A demonstration for (3) is the dot product between vectors $v^i$ and $w^i$, $\sum\limits_i {{v_i}{w^i}} $, where writing $i$ as a subscript represents a covector (typically represented as a row vector). This certainly looks a lot nicer just written as ${{v_i}{w^i}}$ -- like you're just multiplying the vectors together.<br /><br />This -- omitting the summation sign when you have repeated indices -- half of them on top and the other half at the bottom -- is called the <b>Einstein summation convention</b>.<br /><br />An important piece of terminology to mention here -- you can see that the summation convention introduces two different kinds of indices, unsummed and summed -- the first is called a "free index", because you can vary the index within some range (typically 1 to the dimension of the space, which is what it will mean throughout this article set unless stated otherwise, but sometimes the equation might hold only for a small range of the index), and the second is called a dummy index (because it gets summed over anyway and holds no relevance to the result).<br /><br /><b>Question 1</b><br /><br />Represent the following in tensor index notation, with or without the summation convention.<br /><ol><li>$\vec a + \vec b$</li><li>$\vec v \cdot \vec w$</li><li>$|v{|^2}$</li><li>$AB=C$ </li><li>$\vec{v}=v^1\hat\imath+v^2\hat\jmath+v^3\hat{k}$</li><li>$B = {A^T}$</li><li>$\mathrm{tr}A$</li><li>The pointwise product of two vectors, e.g. 
$\left[ {\begin{array}{*{20}{c}}a\\b\end{array}} \right]\wp \left[ {\begin{array}{*{20}{c}}c\\d\end{array}} \right] = \left[ \begin{array}{l}ac\\bd\end{array} \right]$</li><li>${v^T}Qv = q$</li></ol><br />Feel free to define your own tensors if necessary to solve any of these problems.<br /><br /><b>Question 2</b><br /><br />The fact that the components of a vector and its corresponding covector are identical, i.e. that ${v_i} = {v^i}$, has been a feature of Euclidean geometry, which is the geometry we've studied so far. The point of defining things in this way is that the value of ${w_i}{v^i}$, the Euclidean dot product, is then invariant under rotations, which are a very important kind of linear transformation.<br /><br />However in relativity, Lorentz transformations, which are combinations of skews between the <i>t</i> and spatial axes and rotations of the spatial axes, are the important kinds of transformations. This is OK, because <a href="https://thewindingnumber.blogspot.in/2017/08/symmetric-matrices-null-row-space-dot-product.html">rotations are really just complex skews</a>. The invariant under this Lorentz transformation is also called a dot product, but defined slightly differently:<br /><br />$$\left[ \begin{array}{l}{t}\\{x}\\{y}\\{z}\end{array} \right] \cdot \left[ \begin{array}{l}{E}\\{p_x}\\{p_y}\\{p_z}\end{array} \right] = - {E}{t} + {p_x}{x} + {p_y}{y} + {p_z}{z}$$<br /><br />Therefore we define covectors in a way that negates their zeroth (called "time-like" -- e.g. $t$) component. I.e. ${v_0} = - {v^0}$. 
For instance, if the vector in question is<br /><br />$$\left[ \begin{array}{l}t\\x\\y\\z\end{array} \right]$$<br />Then the covector is<br /><br />$$\left[ \begin{array}{l} - t\\x\\y\\z\end{array} \right]$$<br />These are called the contravariant and covariant components of a vector respectively.<br /><br />The dot product is then calculated normally as ${v_i}{w^i}$, and is invariant under Lorentz transformations like the Euclidean dot product is invariant under spatial rotations. Similarly, the norm (called the Minkowski norm) is calculated as $(v_iv^i)^{1/2}$.<br /><br />But what if we wished, for some reason, to calculate a Euclidean norm or Euclidean dot product? How would we represent that in index notation?<br /><br />(<a href="https://thewindingnumber.blogspot.in/2017/10/relativistic-dynamics-spacetime-vectors.html">More on dot products on Minkowski geometry</a>)<br /><br /><b>Answers to Question 1</b><br /><ol><li>${a_i} + {b_i}$</li><li>${a_i}{b^i}$</li><li>$a_ia^i$</li><li>$C_j^i = A_k^iB_j^k$</li><li>${v^i} = {v^j}\delta _j^i$</li><li>$B^i_j=A^j_i$ (or alternatively $B_{ij}=A_{ji}$, etc.)</li><li>$A_i^i$</li><li>$z_k = \wp_{ijk}x^iy^j$ where $\wp_{ijk}$ is a rank-3 tensor which is 0 unless $i=j=k$, in which case it's 1.</li><li>${v^i}{Q_{ij}}{v^j} = q$</li></ol><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://upload.wikimedia.org/wikipedia/commons/thumb/7/71/Epsilontensor.svg/500px-Epsilontensor.svg.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="250" data-original-width="500" height="160" src="https://upload.wikimedia.org/wikipedia/commons/thumb/7/71/Epsilontensor.svg/500px-Epsilontensor.svg.png" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Another rank-3 tensor, the Levi-Civita symbol. 
Let's not call it a<br />"3-dimensional tensor", since that just means the indices all<br />range from 1 to 3 (or any other three integer values)</td></tr></tbody></table><br /><b>Answer to Question 2</b><br /><br />Euclidean dot product: $\eta _j^i{v_i}{w^j}$<br />Euclidean norm: $\eta _j^i{v_i}{v^j}$<br /><br />Where<br /><br />$$\eta _i^j = \left[ {\begin{array}{*{20}{c}}{ - 1} & 0 & 0 & 0\\0 & 1 & 0 & 0\\0 & 0 & 1 & 0\\0 & 0 & 0 & 1\end{array}} \right]$$<br /><br /><div class="twn-furtherinsight">We really want its inverse in the above two formulae, but they happen to be equal in the basis we're using, where $c=1$.</div><br />...is the Minkowski metric tensor, the Minkowski analog of the Kronecker delta, and contains the dot products of the basis vectors as its components.<br /><br />For this reason, we actually call dot products, cross products, pointwise products, etc. <i>tensors themselves</i>. For instance, the Euclidean dot product is $\delta_{ij}$, the Minkowski dot product is $\eta_{ij}$, the pointwise product we mentioned earlier is a rank 3 tensor $\wp_{ijk}$, and as we will see, the cross product is also a rank 3 tensor $\epsilon_{ijk}$. In fact, it is conventional to define different dot products based on what transformations are important, so that the dot product is invariant under those transformations. If rotations are important, use the circular, Euclidean dot product. If skews are important for one dimension and rotations for the other three, as in relativity, use the hyperbolic, Minkowski dot product.<br /><br /><b>Relabeling of indices</b><br /><br />In solving things with standard summation-y notation, you might've often noticed it to be useful to group certain terms together. 
For instance, if you have<br /><br />$$\sum\limits_{i = 1}^n {x_i^2} + \sum\limits_{j = 1}^n {y_j^2} = \sum\limits_{k = 1}^n {2{x_k}{y_k}} $$<br />It might be useful to rewrite this as<br /><br />$$\sum\limits_{i = 1}^n {(x_i^2 + y_i^2 - 2{x_i}{y_i})} = 0$$<br />What we did here, implicitly, was change the indices $j$ and $k$ to $i$. This is possible, because the summed indices vary between the same limits. In Einstein notation, the first sum would have been<br /><br />$${x_i}{x^i} + {y_j}{y^j} = 2{x_k}{y^k}$$<br />And the relabelling was ${x_i}{x^i} + {y_i}{y^i} = 2{x_i}{y^i}$. We will do this all the time, so get used to it.<br /><br />Even when the ranges of the indices are not the same, you can add or subtract a few terms to make the indices the same. E.g. if in ${x_i}{x^i} + {y_j}{y^j} = 2{x_k}{y^k}$, $k$ ranges between 1 and 3 while $i$ and $j$ range between 0 and 3, then we can write<br /><br />$${x_i}{x^i} + {y_j}{y^j} = 2{x_m}{y^m} - 2{x_0}{y^0}$$<br />And then relabel, where $m$ ranges between 0 and 3.<br />einstein summation convention, index notation, tensor algebra, tensor calculus, tensors Mon, 16 Oct 2017 04:15:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-3742174555661933308Abhimanyu Pallavi Sudhir2017-10-16T04:15:00ZRelativistic dynamics
https://thewindingnumber.blogspot.com/2017/10/relativistic-dynamics.html
0There are numerous "proofs" you will find online of mass-energy equivalence and other dynamic equations in relativity, most of which are wrong ("let's accelerate an object near the speed of light"), circular ("let's prove momentum from energy and vice versa"), simplistic and insufficiently motivated ("it's just empirical"), or just plain inelegant (some really weird collision).<br /><br />The motivation for studying relativistic dynamics comes from thinking about conservation of the standard forms of energy and momentum with our new relativistic dynamics. It is easy to demonstrate that $mv$ cannot be conserved in all inertial frames of reference in special relativity. Consider two balls of equal mass colliding inelastically with equal speed $v$ in opposite directions, $+v$ and $-v$. They smash into each other and remain stationary.<br /><br />Now boost into one of the balls' frames, say the one moving at $+v$. Now the velocity of the other ball is $-2v/(1+v^2)$, so the total initial momentum is $-2mv/(1+v^2)$. But after the collision, we see the thing moving at a velocity of $-v$ (we know this because it was 0 in the original frame), which means the final total momentum is $-2mv$, so momentum is not conserved.<br /><br />But we don't like this! If this expression isn't conserved, we can't use it so nicely in calculations and stuff. We want to define momentum in a way that it is conserved. Similar arguments can be used to show that $mv^2/2$ is not conserved, either.<br /><br />You may try to derive a conserved expression via similar arguments as the symmetry-based arguments we use in non-relativistic mechanics, swapping Galilean symmetry with Lorentz symmetry where appropriate. The resulting functional equations would be ludicrously complicated, though, and we'd much rather use a different symmetry argument.<br /><br />We've made several arguments so far based on known properties of light, and it would make sense to assume other, quantum mechanical properties of light as well. 
Two such properties are:<br /><br />$$\begin{array}{l}p = hf/c\\E = hf\end{array}$$<br />This means that we know the behavior of $p$ and $E$ at low velocities, as well as at velocities close to the speed of light. Surely, we're smart enough to fill in the stuff in between?<br /><br />Consider the following set-up: a stationary mass <i>m</i> lets out two equal flashes of light in opposite directions, each with energy = momentum (since $c=1$) <i>E</i>/2. We then analyse the same set-up from a boosted reference frame with velocity $v$. This involves a Doppler shift in the frequency of each light beam.<br /><br />We'll consider this set-up in the following three examples:<br /><br /><b>(a) <i>v</i> is small, momentum conservation</b><br /><b><br /></b>We first consider the case where <i>v</i> is small enough to allow the usage of non-relativistic mechanics. Formally, this means taking the limit as $v\to0$.<br /><br />Then the Doppler shift factor $\sqrt{\frac{1+v}{1-v}}$ approaches $1+v$ and $\sqrt{\frac{1-v}{1+v}}$ approaches $1-v$. Both energy and momentum are scaled by the same factor since they're proportional to frequency. Now you know why we choose momentum conservation instead of energy conservation -- to first order, the total energy of the light is unchanged (the factors $1+v$ and $1-v$ average out to 1), so energy conservation tells us nothing new at low velocities.<br /><br />The reason we consider low velocities is that we know the formula for momentum must reduce to the Newtonian $p=mv$, i.e. the initial momentum of the system was $-mv$ (the mass moves at $-v$ in the boosted frame). The blueshifted flash travels in the same direction as the mass, so the total momentum of the two flashes of light is $((1-v)E/2-(1+v)E/2)=-vE$. Since momentum must be conserved, this means the momentum of the mass itself is no longer $-mv$ -- it must change by $+vE$ to compensate. But its velocity is constant, and still low, so this means some of the mass must have been converted into the energy of the photons. Specifically,<br /><br />$$-m_fv-(-m_iv)=vE$$<br />Giving us the celebrated equation<br /><br />$$E=m$$<br />Where $m$ is the amount of mass that was converted into energy. 
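This low-velocity bookkeeping is easy to check numerically -- a quick sketch (my own check, not from the original post; natural units $c=1$), confirming that the Doppler factor reduces to $1+v$ at first order, that the momentum imbalance between the two flashes has magnitude $vE$, and that their total energy is unchanged to first order:

```python
import math

def doppler(v):
    """Relativistic Doppler factor sqrt((1+v)/(1-v)) for an approaching source (c = 1)."""
    return math.sqrt((1 + v) / (1 - v))

v, E = 1e-4, 1.0

# First-order behaviour: sqrt((1+v)/(1-v)) = 1 + v + O(v^2)
assert abs(doppler(v) - (1 + v)) < v**2

# Net momentum of the two flashes: exactly gamma*v*E, which is v*E + O(v^3)
net_p = (doppler(v) - 1 / doppler(v)) * E / 2
assert abs(abs(net_p) - v * E) < v**2

# Total energy of the flashes: gamma*E, i.e. E + O(v^2) -- conserved to first order
total_E = (doppler(v) + 1 / doppler(v)) * E / 2
assert abs(total_E - E) < v**2
print("checks pass")
```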
You could, of course, write this in inelegant ways such as $E=c^2m$ or even $E=mc^2$.<br /><br /><div class="twn-pitfall">this change in mass is not linked to the whole "relativistic mass" thing we'll be doing later. This decrease in mass is absolute: mass is not conserved. It is also seen in the rest frame, and is required to produce that bit of energy. It's only the derivation that requires boosting into another reference frame, to ensure conservation in all reference frames.</div><br /><div class="twn-furtherinsight">On a related note, note that <i>conserved</i> and <i>invariant</i> are by no means the same thing, or even related. A quantity is <i>conserved</i> if its total for the whole system doesn't change with time. It is <i>invariant</i> if it is the same from all reference frames. The difference isn't even subtle -- proper mass is an invariant in special relativity (but, as we just saw, not conserved), while energy and momentum are conserved (but not invariant).</div><br /><div class="twn-furtherinsight">Something to think about: why doesn't our argument work in non-relativistic mechanics? I mean, we even assumed that <i>v</i> is small. Try to perform the same arguments without relativity -- you will see that since there is no relativistic Doppler shift, the result will be that a unit of mass is worth an infinite amount of energy (which is what you get in the limit $c\to\infty$), useless anyway.</div><br /><b>(b) v is not small, energy conservation</b><br /><b><br /></b>We said the decrease in mass exists in all reference frames. 
If we found what exactly the decrease in mass $\Delta m$ is in each reference frame, then we'd be able to see how mass transforms under a Lorentz transformation.<br /><br />In the rest frame, energy $E$ is released, therefore by energy conservation the energy (or equivalently, the mass) of the object decreases by $E$.<br /><br />In the moving frame, one of the beams transforms as $\sqrt {\frac{{1 + v}}{{1 - v}}} \frac{E}{2}$ while the other transforms as $\sqrt {\frac{{1 - v}}{{1 + v}}} \frac{E}{2}$. So the total energy released (i.e. the energy loss of the object) is:<br /><br />$$\left( {\sqrt {\frac{{1 - v}}{{1 + v}}} + \sqrt {\frac{{1 + v}}{{1 - v}}} } \right)\frac{E}{2} = \gamma E$$<br />So the mass has transformed as $\gamma m$ under a Lorentz boost of significant velocity.<br /><br />We call this mass the "relativistic mass" $M$, and distinguish it from the rest mass $m$.<br /><br />Then the following are immediately true:<br /><ul><li>$E = m$ is only true when an object is at rest. In general, $E = \gamma m$. We may call $E_0=m$ the rest energy.</li><li>$E=M$</li><li>$M=\gamma m$</li><li>The increase in mass is essentially the kinetic energy. One may Taylor-expand (or use Newton's binomial series on) $m/\sqrt{1-v^2}$ to see that the terms start as $m+\frac12mv^2+\frac38mv^4+...$, and the higher-order terms vanish at low speeds. Therefore the relativistic kinetic energy is in general $M-m=(\gamma-1)m$ (or $(\gamma-1)mc^2$ in conventional units).</li></ul><br />In general, we will denote the relativistic mass as $E$ and the rest mass as $m$ unless otherwise stated.<br /><br />It is a fad among modern relativity textbooks to claim that the phrase "relativistic mass" is a misnomer, or even a mere mnemonic to help kids understand relativity, and to simply call it the energy, reserving the word "mass" to mean the rest mass. 
However, this obscures some of the best analogies between spacetime and momentum-energy, as we will soon see -- for instance, the relativistic mass is actually analogous to the co-ordinate time and the rest mass to the proper time/spacetime interval.<br /><br />Therefore, we will use the word "mass" to refer to the relativistic mass $E$ and "proper mass" and "momentum-energy interval" to refer to the rest mass $m$. This is a convention in our course only.<br /><br /><b>(c) v is not small, momentum conservation</b><br /><b><br /></b>We may do a similar analysis to the one above with momentum to arrive at the expression for relativistic momentum.<br /><br />The total/net momentum of the light beams in the boosted frame is<br /><br />$$\left( {\sqrt {\frac{{1 + v}}{{1 - v}}} - \sqrt {\frac{{1 - v}}{{1 + v}}} } \right)\frac{E}{2} = \gamma vE$$<br />(Note that $E$ represents the total rest energy of the light beams here, as in the set-up above.)<br /><br />Therefore, by the same bookkeeping as in part (a), $p=\gamma mv$, or $p=vE$.<br /><br />You may use this to derive the relativistic expression for the force $F=dp/dt$, but it's simply computation from this point, so I'll just direct you to <a href="https://en.wikipedia.org/w/index.php?title=Relativistic_mechanics&oldid=800422825#Force" target="_blank">Wikipedia</a>. Come up with an expression for a general directional inertia (simple).<br /><br /><div class="twn-pitfall">some people are surprised by the relation $p=vE$, or even remember it wrongly as $E=vp$ because of the seeming resemblance to $E=pc$ at the speed of light (this confusion comes from people not getting the hang of $c=1$ natural units). But it's really nothing new. $E$ is simply the mass. We know momentum equals mass times velocity. 
This is not new.</div><br />Continued in <a href="https://thewindingnumber.blogspot.in/2018/05/minkowski-everything-four-vectors-rapidity.html">Minkowski everything -- spacetime vectors, rapidity</a>.relativistic dynamicsrelativityspecial relativityFri, 13 Oct 2017 07:29:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-2916412151588498190Abhimanyu Pallavi Sudhir2017-10-13T07:29:00ZMinkowski everything -- invariants
https://thewindingnumber.blogspot.com/2017/10/minkowski-everything-invariants.html
0Some philosophers often say silly things like "truth is relative" or worse, "relativity implies that truth is relative".<br /><br />Even before relativity, there would be people who gave obviously insincere explanations of this axiomatically incorrect statement -- e.g. "the number 6 viewed from the opposite direction looks like the number 9, therefore truth is relative" or "some people like doughnuts, some people don't, therefore truth is relative". The answer to these kinds of arguments is "someone who sees the number as 6 agrees the other guy sees it as 9, and vice versa", "someone who likes donuts agrees the other person doesn't". The statement <i>donuts are good</i> is not meaningful, except in terms of the donut-liker's neurobiology -- it's equivalent to saying "when you put a donut in his mouth, dopamine is released in his brain". All observers agree that this is the case with him, it's just that dopamine isn't released in the donut-disliker's brain. These statements of absolute truth <i>are</i> absolute.<br /><br />Perhaps this gives too much credit to these nonsensical arguments, but the response is similar with relativity. If your parents were bored of raising two children so decided to send your twin brother to Trappist-1 at close to the speed of light, then you would be 80 years old when he returns as a newborn baby. But you do see him as a newborn baby, not an old man, and if you could understand his unintelligible babbling, you would hear that he sees you as an old man on the verge of death, not a kid his age he can play with.<br /><br />So biological age is an invariant. Even though you see him as having lived 80 years, you also think that his clock moved a lot slower, which is why he's still an infant.<br /><br />But there's nothing special about human biology or biological clocks. 
Even if the newborn took a clock with him, the time recorded on that clock is an invariant -- all observers agree on what it is.<br /><br />Let's try to extract this biological time -- we will call this the "proper time" from the co-ordinate measurements of any arbitrary observer.<br /><br />We have:<br /><br />$$\Delta t = \frac{{\Delta t'}}{{\sqrt {1 - {v^2}} }}$$<br />We write ${\Delta t'}$ as ${\Delta \tau }$, the general proper time according to the moving observer himself.<br /><br />$$\begin{array}{l}\Delta \tau = \Delta t\sqrt {1 - {v^2}} \\\Delta \tau = \sqrt {\Delta {t^2} - {v^2}\Delta {t^2}} \\\Delta \tau = \sqrt {\Delta {t^2} - \Delta {x^2}} \\\Delta {\tau ^2} = \Delta {t^2} - \Delta {x^2}\end{array}$$<br />One may check that this result is always invariant by Lorentz-transforming $t$ and $x$ and showing $t'^2-x'^2=t^2-x^2$. In a general orthonormal co-ordinate system of spatial co-ordinates (i.e. we don't necessarily take $x$ to be the direction of motion), we may write:<br /><br />$$\Delta {\tau ^2} = \Delta {t^2} - \Delta {x^2} - \Delta {y^2} - \Delta {z^2}$$<br />Note the resemblance to the Euclidean norm/Pythagorean theorem! If only the minus signs were pluses, this would be the Euclidean norm. This norm is called the Minkowski norm, and the proper time $\Delta\tau$ (or sometimes $\Delta s=c\Delta\tau$, which is the same thing when we set $c=1$) is called the spacetime interval.<br /><br />This equation summarises the non-dynamical results of special relativity, and can be treated as an alternative axiomatic foundation for the theory (the "Minkowskian formulation", as opposed to the Einsteinian one we've been discussing so far) -- it's the Pythagorean theorem on spacetime. Unlike in Galilean relativity, where time and space are individually invariant, in special and general relativity, <i>spacetime</i> is invariant -- time and space simply transform between each other leaving the norm of $(\Delta t,\Delta x,\Delta y,\Delta z)$ invariant. 
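The invariance claimed here is easy to check numerically. A quick sketch (my own check, not part of the original post), boosting a displacement by several velocities and confirming that $\Delta t'^2 - \Delta x'^2 = \Delta t^2 - \Delta x^2$ with $c = 1$:

```python
import math

def boost(t, x, v):
    """Lorentz boost of a spacetime displacement (t, x) by velocity v, with c = 1."""
    gamma = 1 / math.sqrt(1 - v * v)
    return gamma * (t - v * x), gamma * (x - v * t)

t, x = 5.0, 3.0
interval = t * t - x * x                 # 16.0

for v in (0.1, 0.5, 0.9, -0.99):
    tp, xp = boost(t, x, v)
    # the co-ordinates change, but t'^2 - x'^2 stays fixed
    assert abs((tp * tp - xp * xp) - interval) < 1e-9
print("invariant:", interval)
```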
This is indeed a rotation ("skew") of this vector, but in Minkowski spacetime, rotations are across hyperboloids, called <b>invariant hyperboloids </b>(or in 2D, hyperbolae), not spheres (or circles). Changing the observer changes the spacetime vector (called four-position), but doesn't take it off this invariant hyperbola.<br /><br />Indeed, this means that Minkowski spacetime doesn't have Euclidean geometry -- instead, it has a geometry related to "hyperbolic geometry", which cannot be embedded in Euclidean space (i.e. we have no way to visualise it).<br /><br />Here's another possible motivation for studying invariants:<br /><blockquote>Lorentz boosts are essentially rotations in the t-x plane (hyperbolic rotations, actually, or <em>skews</em>, but stick with the analogy for now), so it's often useful to get an intuitive feel for them in special relativity by comparing boosts to rotations on some other plane, like the x-y plane. So let's do that.<br /><br />Consider if you were measuring the y-length of a stick on the x-y plane -- clearly, this depends on your frame of reference. A co-ordinate system in which the stick lies on the y-axis clearly gives you the maximum value of this y-length, a co-ordinate system in which it lies on the x-axis clearly gives you a value of 0.<br /><br /><a href="https://i.stack.imgur.com/kp8b2.png"><img border="0" data-original-height="571" data-original-width="792" height="230" src="https://i.stack.imgur.com/kp8b2.png" width="320" /></a><br /><br />So the specific co-ordinate dimensions $(x, y)$ of the stick depend on your reference frame. But we can also be interested in the <em>real</em> lengths of sticks, because this is invariant in all reference frames. 
This can be calculated easily using the Pythagorean theorem:<br /><br />$$\psi=\sqrt{x^2+y^2}$$<br />(Note that the invariance is not the only thing that is important, but also that it allows you to define a polar co-ordinate system where $x=\psi\cos\theta$, $y=\psi\sin\theta$.)<br /><br />If you accept that it can be useful to know the dimensions of objects on their own axes, it's clear that the same principle applies on the t-x plane. Here, the "rotations" are skews, the trigonometry is hyperbolic trigonometry, the Pythagoras theorem is $\tau=\sqrt{t^2-x^2}$ and instead of the proper time being the highest point of a circle it is the lowest point of a hyperbola.<br /><br />But the same principles still apply -- if you see someone blast a toddler off into outer space at a high speed then return, you might measure the toddler as having taken a hundred years to return, but you and the toddler both agree (assuming he isn't dead yet from starvation) that he's only aged a year. This biological time, or proper time, is an invariant.</blockquote>(From my answer on Physics Stackexchange to <a href="https://physics.stackexchange.com/questions/171562/why-invariance-is-important/410663#410663">Why is invariance important?</a>)<br /><br />A related fact is an intuitive explanation for the speed of light being the maximum achievable speed -- all observers have a fixed speed ($ds/d\tau$) through spacetime, which is the speed of light -- this is essentially a tautology. A stationary object has no speed through space, so $dx^2+dy^2+dz^2=0$ and it moves at $c$ through time ("co-ordinate time" $t$ -- as opposed to proper time), i.e. $d(ct)/d\tau=c$. On the other hand, when an object moves at the speed of light, its clock has stopped -- we see $d\tau/dt=0$, so all of its speed through spacetime is speed through space. The velocity cannot exceed the speed of light, because the object simply doesn't have that much speed -- it doesn't have any more speed to take from its time-speed. 
Another way of saying this is that an invariant hyperboloid never crosses the light cone.<br /><br />It's important to keep in mind that in our argument above, time, position and velocity are always with respect to some other observer (again, this is also implied by the Minkowskian formulation, as $dx$, $dt$ etc. are in the frame of some observer). So the point is really that "no observer can see an object going faster than light, because to keep the speed through spacetime fixed, the Lorentz transformation would have to map the proper time to an imaginary number ($\Delta \tau^2 < 0$)".<br /><br />We will see later that there are other quantities that transform between each other like time and space. Then we will see that the four-position is just another vector among a class of vectors called four-vectors.<br /><br />(Note of caution: often, $\Delta s^2$ instead of $\Delta s$ is called the spacetime interval. When you hear the phrase "negative spacetime interval", this is typically what is being referred to.)<br /><br />(Note: Because both $\Delta s^2$ and $-\Delta s^2$ are invariants, sometimes $- {c^2}d{t^2} + d{x^2} + d{y^2} + d{z^2}$ is called the spacetime interval instead. This choice is called the "metric signature" and is denoted by $(+---)$ and $(-+++)$ respectively. The first is also called the <i>particle physics</i> <i>convention</i>, the <i>quantum field theory convention, </i>the <i>West coast convention</i>, the <i>time-like convention</i> and the <i>mostly-minus convention</i>. The second is also called the <i>cosmology convention, </i>the <i>general relativity convention</i>, the <i>East Coast convention</i>, the <i>space-like convention</i> and the <i>mostly-plus convention</i>. 
However, $\Delta\tau^2$ is always defined via the time-like convention, as it is the proper time.)<br /><br /><div class="twn-pitfall">You might be tempted to say that Minkowski spacetime is simply 4-dimensional Euclidean spacetime with one of the dimensions being $ict$ instead of $ct$. However, this doesn't actually make Minkowski spacetime Euclidean -- for instance, Minkowski spacetime allows distinct points in spacetime to have a zero spacetime interval between them, something not possible with a Euclidean distance function. After all, the norm of a complex number $t + ix$ is still $\sqrt{t^2+x^2}$, not $\sqrt{t^2-x^2}$.</div><br /><div class="twn-pitfall">You might be tempted to rewrite the equation as $d{t^2} = d{\tau ^2} + d{x^2} + d{y^2} + d{z^2}$. But since $d{t^2}$ is not an invariant, this obscures the true geometry of Minkowski spacetime, which is hyperbolic, not Euclidean. Similarly, equations like $m^2 = E^2-p^2$ (where $m$, $E$ and $p$ are the proper mass, relativistic mass and momentum respectively -- we will later derive this) should not be written as $E^2=m^2+p^2$.</div><br /><div class="twn-pitfall">You might recall some equations in physics that seem to exhibit the same kind of symmetry between space and time as the spacetime interval -- $-c^2t^2$ and $x^2$ showing a symmetry. An example is the wave equation for light, $\frac{1}{c^2}\frac{\partial^2u}{\partial t^2}-\frac{\partial^2u}{\partial x^2}=0$. This is actually the reason why Maxwell's equations are already Lorentz invariant, and indeed, we will see that this symmetry will be our criterion for Lorentz invariance.</div><br />(Technical note: Formally speaking, Minkowski spacetime doesn't actually have hyperbolic geometry itself. 
What it does have are sub-manifolds with a hyperbolic geometry.)<br /><br />We may divide spacetime intervals into three categories: space-like (outside the light cone), light-like (on the light cone) and time-like (inside the light cone), corresponding to the cases $\Delta s^2<0$, $\Delta s^2=0$ and $\Delta s^2>0$ respectively (in the cosmology convention, the correspondence is exactly reversed). The fact that you cannot influence space-like separated events, i.e. cannot travel faster than light, is the same as saying "you cannot traverse an imaginary proper time".<br /><br />Saying the speed of light is fixed for all observers is equivalent to saying that the statement $\Delta s^2=0$ is invariant, since $\Delta s= \sqrt{c^2\Delta t^2-\Delta x^2}$ and, for light, $x=ct$. We now know that $\Delta s^2=n$ is invariant for all $n$, not just 0.<br /><br /><div style="text-align: center;"><iframe frameborder="0" height="500px" src="https://www.desmos.com/calculator/eucqo5qhjr?embed" style="border: 1px solid #ccc;" width="500px"></iframe><br /></div><br />The image above shows some invariant hyperbolae plotted -- $\Delta s^2=-3$, $\Delta s^2=-2$, $\Delta s^2=-1$, $\Delta s^2=0$, $\Delta s^2=1$, $\Delta s^2=2$, $\Delta s^2=3$. Note how the hyperbolae never cross the light cone -- implying the existence of an absolute future, an absolute past, an absolute left and an absolute right.einsteininvariantsminkowski spacetimephysicsrelativityspacetimespecial relativityFri, 06 Oct 2017 06:57:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-3846291875962431070Abhimanyu Pallavi Sudhir2017-10-06T06:57:00ZBTJC script (extended)
https://thewindingnumber.blogspot.com/2017/09/btjc-script-extended.html
0Most people, when they hear the word "relativity", either run away screaming<br />(run away screaming animation)<br />Or refuse to believe its results.<br />(What if I drive a bike on a train that's 30mi/h from the speed of light? What then?! Einstein was WRONG!)<br />(Speed limit: 30mi/h sign)<br />The fundamental problem people have with relativity is that it doesn't match their intuition<br />(You're telling me running makes me fatter? (eat potato))<br />Perhaps the most dearly-held intuition that relativity throws away is the invariance of distances and durations.<br />(Throw a metre rule and a clock in the trash can.)<br />No matter how fast you move in standard Newtonian mechanics, you don't see things getting shorter -- or worse, clocks slowing down.<br />(Jog past a clock. "Fastest I can move.")<br />(Switch to notepad and paper. Draw and label spacetime.)<br />The best way to understand relativity intuitively is to recognise that we don't care about space and time any longer. We care about *spacetime*. Instead of space intervals and time intervals remaining individually invariant regardless of the observer, the *spacetime interval* is. In special relativity, the spacetime interval is this:<br />(Delta s = sqrt(c^2 Delta t^2 - Delta x^2) -- c is the speed of light)<br />If an event<br />(display definition: ... 
a point in spacetime)<br />occurred, say, 140 years ago, 6 thousand kilometres away from me.<br />(Draw x-axis on map from Ulm, Germany to Bangalore, mark: Albert Einstein is born, me)<br />If an observer moving at a speed of half the speed of light were to make these measurements,<br />(Lorentz transformations)<br />he would measure this as having happened about 160 years ago, roughly 765 trillion kilometres away in the opposite direction -- because he sees Germany rushing past him at a rapid speed, which means the event itself happened in Germany when Germany was hundreds of trillions of kilometres away from him.<br />(Show globe receding from me, Germany at center.)<br />Taking the speed of light as about 9.5 trillion kilometres a year, both observers give the exact same value of the spacetime interval.<br />(Show Lorentz calculations -- (9.5 trillion)^2 etc.)<br /><br />(Show face)<br />Let's take a moment and analyse exactly how the Lorentz transformations work.<br />(Show spacetime diagram, Lorentz boost)<br />If you're familiar with linear algebra, you would know what a co-ordinate transformation is. The vector space -- in this case, spacetime -- remains the same, but the way in which you record co-ordinates changes. The fact that the vector space really remains the same is why a co-ordinate transformation is also called a passive transformation.<br /><br />The line "spacetime remains unchanged" might remind you of something.<br /><br />(show principles of relativity)<br /><br />Perhaps it reminds you of the principle of relativity. The laws of physics remain the same in all reference frames. It is this insight on which Minkowski diagrams are built.<br /><br />What relativity does is represent moving reference frames as co-ordinate transformations. 
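A quick numerical sketch of the example above (my own check, not part of the script; working in years and light-years so that $c = 1$, and treating the 6000 km as spatially negligible):

```python
import math

v = 0.5                          # boost velocity, as a fraction of c
t, x = -140.0, 0.0               # event: 140 years ago; 6000 km ~ 1e-9 light-years, call it 0

gamma = 1 / math.sqrt(1 - v * v)
t_prime = gamma * (t - v * x)    # time co-ordinate in the moving frame
x_prime = gamma * (x - v * t)    # position co-ordinate in the moving frame

print(round(t_prime, 1))         # -161.7 years: "about 160 years ago"
print(round(x_prime, 1))         # 80.8 light-years, in the opposite direction
# both frames agree on the spacetime interval:
assert abs((t_prime**2 - x_prime**2) - (t**2 - x**2)) < 1e-9
```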
When an observer observes an event, his observations are fundamentally linked to the co-ordinates of that event in his reference frame.<br />In<br />(List Four-momentum, four-etc.)<br />general, what an observer actually measures of any event or thing is linked to the components of vectors called "four-vectors", which are vectors in spacetime. Taking the norm of these four-vectors results in things like the spacetime interval, and the rest mass, which are invariant in all reference frames. So what co-ordinate transformations actually do is rotate these four-vectors around, changing their direction while preserving their norm.<br /><br />(Flash for 10 seconds: the kind of rotation we're talking about here is called a "hyperbolic rotation", rather than a standard circular rotation. This has to do with the way the norm is taken with regards to time co-ordinates -- e.g. Delta t^2 - ... as opposed to Delta t^2 + ...)<br /><br />(Flash for 5 seconds: vector sliding across invariant hyperbola, animation)<br /><br />A lot of the weirdness of special relativity actually comes from the fact that the x-axis is transformed in the Lorentz transformation. Galilean relativity agrees that the t-axis is transformed --<br />(draw Galilean transformation, label "Galileo")<br /><br />this is essentially just the distance-time graph.<br />(rotate paper)<br /><br />Special relativity, however, introduces a symmetry between space and time. It requires that the x-axis is also shifted, and what's more? By the same angle.<br /><br />To understand why this is so, consider a light ray coming out of the origin. Light has a fixed speed, so it's going to have some known slope.<br /><br />(Mark axes in seconds and light seconds.)<br /><br />Now suppose you did a Galilean transformation on this reference frame. 
The new co-ordinate system will look like this.<br />(Draw.)<br /><br />But if you read the speed of light in this reference frame, it appears *lower* than the speed of light in the original reference frame.<br /><br />(Draw.)<br /><br />In fact, if you Galilean transformed it enough, it would appear that light is barely moving at all.<br /><br />(Draw.)<br /><br />This violates the second postulate of special relativity -- that light moves at the same speed in all reference frames. So to adjust for this, it is necessary that the x-axis be transformed by the same angle as the t-axis, towards it. This is absolutely bizarre, and wonderful, because the x-axis essentially represents the present, and this tells us that two observers will disagree on what "the present" is.<br /><br />As a sidenote, if you know a bit of linear algebra, you should be able to tell that the invariance of the speed of light is essentially equivalent to saying that vectors pointed along the spacetime path, or "worldline", of the light ray are eigenvectors of the Lorentz transformation. The eigenvectors of the Galilean transformations, meanwhile, lie on the x-axis, which is what the path of a particle moving at infinite speed would be. This is why non-relativistic mechanics arises in the limit of special relativity where the speed of light is infinite.<br /><br />(Stand next to whiteboard. Linear transformation definitions.)<br /><br />Here's something to ponder about: we've been considering only *linear co-ordinate transformations* so far. You'll often see linear transformations defined by something like this set of equations up here, but a simpler -- and rather useful -- way to put it is: all straight lines remain straight lines under the transformation, and the origin remains fixed.<br /><br />Why do we assume the Lorentz transformations must be linear? 
Well, we choose the origin such that it's a point that remains fixed -- and this is possible, because as long as the two reference frames are not stationary with respect to each other, their worldlines must intersect at a point, since they're non-parallel lines, and this point of intersection is what we call the origin.<br /><br />As for lines remaining lines,<br /><br />(Line evolves into tyrannosaurus rex)<br /><br />a straight line in spacetime represents either the worldline of an inertial frame of reference -- or its corresponding x-axis. If this line became curved under the co-ordinate transformation of another inertial frame of reference, it means the inertial frame of reference has become non-inertial, i.e. the moving observer thinks the object is accelerating, while the stationary observer does not. This contradicts the principle of relativity, which requires that absolute inertial frames of reference exist, i.e. all observers agree on what frames of reference are inertial -- absolute acceleration exists, unlike absolute velocity.<br /><br />But what if... what if we wanted to extend our formalism to accelerating frames of reference? What if we wanted to find a way to be able to say "the laws of physics are applicable to absolutely all frames of reference"? Then we would have non-linear transformations corresponding to accelerating reference frames.<br /><br />Non-linear transformations... meaning curved co-ordinates.<br /><br />(Draw curved axes.)<br /><br />Or, as we call it, curved spacetime.<br /><br />OK. So how do we actually do this? How do we pretend that accelerating frames of reference are perfectly OK to deal with, when clearly, they give rise to fictitious forces? When clearly, one can do an experiment to detect acceleration?<br /><br />(Falling out of chair//falling in train)<br /><br />The key lies with apples falling on your head, and elevators falling... 
hopefully not on your head.<br /><br />Say you're in an elevator in gravity-free space, accelerating in the direction towards your head at 10 m/s^2. Do you really know you're in an accelerating elevator, and not just moving down at a constant velocity back on Earth, with the normal force you feel at your feet simply a result of gravity?<br /><br />Well, if you wait for a non-zero period of time, then yes. The gravitational field back on Earth changes with height, and you would be able to feel a stronger normal force as you approached the Earth, unlike with constant acceleration. But instantaneously, in other words if you only made an observation over an infinitesimally short period of time, you would not. You don't know how the acceleration is going to change -- you can only feel acceleration, not a change in it. And you cannot distinguish between this acceleration and gravity.<br /><br />(Start listing position and its derivatives, cross them out and circle acceleration)<br /><br />This is quite a remarkable insight: you cannot feel -- i.e. measure or observe -- your "absolute position" or your "absolute velocity", and over an infinitesimal unit of time -- or as we will call it, locally -- you cannot feel the derivatives of acceleration either. The only thing you can feel is your acceleration, and this is locally indistinguishable from gravity.<br /><br />This is called the equivalence principle, and is one of the postulates of general relativity.<br />(acceleration = gravity)<br />(Pretty creative terminology, huh?)<br /><br />The other postulate of general relativity is that special relativity holds locally. Another way you hear this is "spacetime is locally flat". In other words, if you zoom in close enough to the curvy axis, it starts looking flat.<br /><br />Where else do you hear about curves becoming lines when you zoom in real close?... 
Something about tangents perhaps?...<br /><br />Well, it's calculus.<br /><br />(Show zoom-in to curve with dy, dx labelled and "instantaneous slope "derivative"" formula written.)<br /><br />In fact, this feature of calculus -- or perhaps I should say, this defining feature of calculus -- is why it's so important. It's why it's taught in high school, and why it's such a useful tool in mathematics and physics.<br /><br />Calculus is fundamentally about doing curvy things with straight things, as long as the curvy things look like lines when you zoom in close enough. The properties of lines<br />(Show length, area under line, slope formulae etc.)<br />are pretty clear, and if curves are locally lines, we can use these straightforward properties to describe them locally, on the level of infinitesimals, or differentials.<br />(Mark tiny dx, ds's on curve)<br />An *integral* is just a sum of things that are defined at this infinitesimal level.<br />(Integral of ds, integral of dA, etc.)<br />In fact, calculus is what allows us to *define* ideas like length and area for curved things.<br /><br />So calculus is fundamentally important in general relativity, specifically: doing calculus on curved surfaces, because spacetime is a curved surface. The metric we talked about earlier, for example,<br />(start writing general form of metric)<br />is now rewritten with differentials, because it's easier to talk on the scale of differentials, where things are flat, as opposed to a finite non-zero scale, where things are curvy and weird.<br />The mathematics required for this -- the calculus of curved surfaces -- is called "differential geometry". 
Perhaps the best terminology ever created by mathematicians, because that's exactly what it's about -- studying the geometry of differentials, and integrating that around the surface or some part of it to figure out stuff about the manifold.<br />(Show integral of ds)<br /><br />Remember four-vectors?<br />(start flipping page back to four-vectors page)<br />Those trusty things that just underwent hyperbolic rotations under co-ordinate transformations? They still exist in general relativity, of course, but they are now defined at every point on a manifold, and are infinitesimal in nature.<br />(draw tangents at a point on spacetime)<br />The four-position, for instance, is replaced by the integral of a series of infinitesimal vectors defined at every point, tangent to the manifold.<br />(This is what we mean when we say "tangent space" or "tangent bundle")<br />It's important to talk about the norm of these quantities. We know the norm must be invariant for all observers, and we like that about it. But the old definition of the norm doesn't seem to work.<br />(Evaluate dt^2 - dx^2...)<br />This is not the invariant.<br />(Underline dt^2-....)<br />*This* is.<br />What we do in general relativity is incorporate the infinitesimal spacetime interval into the definition of the norm itself.<br />(Definition)<br />This term right<br />(underline)<br />over here is called the metric tensor, and its components are the weights on the co-ordinates in the spacetime interval.<br />(Expand out an example metric, correspond with spacetime interval)<br />As for what a tensor is...
for now, just think of it as a matrix, in this case a 4-by-4 matrix.<br />The norm can also be generalised to the dot product using the metric tensor, and in fact, the components of the metric tensor are the dot products of unit vectors on the manifold.<br />(Label dot products on Minkowski tensor -- label as "Minkowski tensor (metric tensor for special relativity)")<br />The metric tensor is so important because it completely describes the geometry of your spacetime. It tells you exactly what distances look like on your curved manifold at an infinitesimal level, and you can do some integration to find out more global properties of your manifold.<br />To be fair, not absolutely everything about your manifold is encoded in the metric tensor. For instance, the metric tensor is the same for a flat<br />(plonk flat sheet)<br />sheet of paper and a cylinder.<br />(plonk cylinder)<br />This is because locally, the distances between two points don't<br />(mark two points and roll them up, measure with ruler)<br />change when you roll a flat piece of paper into a cylinder. In other words, the distance between two points really close to each other really doesn't change, because you haven't stretched or torn the paper in any way, like you'd need to do if you wanted to turn the paper into a sphere, for instance.<br />(paper crumpling frustration)<br />The whole tensor formalism is quite useful in general relativity, just like the four-vector formalism. Some four-vectors, like the energy-momentum vector, are actually replaced with tensors.<br />(Show tensor -- label "energy-momentum tensor")<br />This specific tensor is quite important in general relativity, and shows up in the Einstein Field Equations.<br />(Gmunu=8piG/c^4 Tmunu -- write on slip of paper)<br />The left-hand side is called the Einstein tensor; it is a 4-by-4 matrix, and is an expression in the metric tensor. The right-hand side has information about the energy, momentum, pressure and shear stress in the region.
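To make the norm-with-a-metric idea concrete, here is a minimal numeric sketch (my own illustration, not part of the script; units where c = 1, and the sample four-vector and boost speed are arbitrary). It computes the squared norm $g_{\mu\nu}v^\mu v^\nu$ using the Minkowski tensor and checks that a Lorentz boost leaves it unchanged:

```python
import math

# Minkowski tensor (the metric tensor of special relativity),
# signature (+, -, -, -), in units where c = 1.
ETA = [[1, 0, 0, 0],
       [0, -1, 0, 0],
       [0, 0, -1, 0],
       [0, 0, 0, -1]]

def norm2(v, g=ETA):
    """Squared norm g_{mu nu} v^mu v^nu of a four-vector v = (t, x, y, z)."""
    return sum(g[i][j] * v[i] * v[j] for i in range(4) for j in range(4))

def boost_x(v, beta):
    """Lorentz boost of the four-vector v along the x-axis at speed beta."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    t, x, y, z = v
    return [gamma * (t - beta * x), gamma * (x - beta * t), y, z]

dv = [2.0, 1.0, 0.5, 0.0]          # a sample displacement four-vector
assert abs(norm2(dv) - norm2(boost_x(dv, 0.6))) < 1e-9  # norm is invariant
```

The same `norm2` works for any metric: swapping `ETA` for a curved-space metric evaluated at a point is exactly the generalisation described here.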
Although it appears simple, the Einstein tensor is a fairly complicated expression if one were to express it purely in terms of the metric tensor, and the index<br />(circle indices)<br />notation hides that the equation is actually a set of 10 distinct equations.<br />The Einstein Field Equations, alongside<br />(start moving geodesic slip there)<br />the geodesic equation, sum up the two important results of General Relativity. The geodesic equation is a purely geometric result from differential geometry, and essentially says "The geometry of spacetime tells<br />(Cross out with arrow, SPACE MOVES MATTER)<br />matter how to move". The Einstein Field Equations, meanwhile, say "Matter tells<br />(MATTER CURVES SPACE)<br />spacetime how to curve."<br />I'd like to end this video with a little bit of insight into how General Relativity relates to a lot of modern fundamental physics.<br />An important tensor that shows up a lot in General Relativity is the Riemann curvature tensor,<br />(R_abcd)<br />which is a 4-by-4-by-4-by-4 matrix that describes everything about the curvature -- the intrinsic curvature<br />(flat sheet = cylinder)<br />-- of a manifold. Here's the interesting thing about the Riemann curvature tensor. In four dimensions, its components can be split into two separate tensors,<br />(write R ab and C abcd)<br />the Ricci curvature tensor and the Weyl tensor.<br />(flash the labels on Rab and Cabcd)<br />In terms of gravity, the Ricci curvature tensor is zero wherever there is no matter, because the Einstein Field Equations tell us that the Einstein tensor is zero whenever the energy-momentum tensor is zero<br />(flip back to EFE page and highlight equation)<br />and<br />(flip back)<br />the Einstein tensor is defined such that whenever it is zero, the Ricci curvature tensor is also zero.<br />But we know there can be gravity at a point even if there is no matter at the point.
For instance,<br />(draw Sun, point)<br />there is gravity in a vacuum near the sun, simply because the sun is nearby, even though there is no matter at that point. This is the effect of the<br />(draw Rab all over sun, Cabcd all over vacuum)<br />Weyl tensor, which describes gravity at a distance. Here's the catch: there is no Weyl tensor in fewer than four dimensions. Which means that if we lived in flatland,<br />(draw plane with stickmen on it with ? marks over their heads)<br />with two dimensions of space and one dimension of time, gravity would be a contact force, if General Relativity applied there.<br />This is a remarkable observation, and it's natural to ask: if non-contact gravity appears only in four dimensions, what kind of new force would arise in five dimensions, four of space and one of time?<br />(...)<br />The<br />(Start drawing 5D metric tensor)<br />answer is surprising: electromagnetism.<br />This insight is known today<br />(drawn images of Kaluza and Klein)<br />as Kaluza-Klein theory, where electromagnetism is the result of us living in a five-dimensional world, except that one of these dimensions, instead of<br />(draw infinite axis, arrows, infinities labelled)<br />being allowed to run free on an infinitely long axis, is a really<br />(draw circle, 0 and 2pi)<br />small circle.<br />This ninety-year-old idea might sound like just another neat idea from the past, but it's right at the frontier of physics<br />(Write SUGRA, string theory, M-theory, draw galaxies),<br />with even higher dimensions<br />(10)<br />allowing us to integrate the strong and weak nuclear forces in a theory known as supergravity, which arises as the non-quantum approximation of string theory.<br /><br />(Show face)<br />I hope you enjoyed this video. To learn more -- like if you<br />(Scroll through nothing can move...
section from top of page)<br />want to see a neat thought experiment illustrating why you cannot move faster than light, why<br />(Scroll through light cones and causality page)<br />the relativity of simultaneity does not violate causality as long as nothing moves faster than light,<br />(Show face)<br />or what exactly *are* tensors and how they differ from matrices --<br />(Scroll through homepage)<br />I have written several explanatory articles for the<br />(click on SR)<br />Special Relativity course on the subject at<br />(zoomed video of typing in the URL)<br />thewindingnumber.blogspot.com.<br /><br />Tags: blog, Breakthrough Junior Challenge, general relativity, miscellaneous, relativity, special relativity | Wed, 27 Sep 2017 12:37:00 GMT | Abhimanyu Pallavi Sudhir<br /><br />The correct resolution of the twin paradox
https://thewindingnumber.blogspot.com/2017/09/the-correct-resolution-of-twin-paradox.html
(Note: in this article, we will use the phrase "seeing" to mean "considers simultaneous to". E.g. "We see Betelgeuse explode today" doesn't literally mean that we can observe Betelgeuse going supernova today, but rather that we will observe it ~600 years from now, and thus calculate that Betelgeuse exploded today, i.e. at a time simultaneous with the present.)<br /><br />The time dilation equation, $\Delta t = \gamma \Delta \tau$, often looks utterly wrong.<br /><br />"If Observer $A$ observes $A'$'s clocks as being slower and $A'$ observes $A$'s as being slower, who's right? Surely, we cannot have $\Delta t = \gamma^2 \Delta t$!"<br /><br />You should be able to answer this paradox based on your understanding of the relativity of simultaneity.<br /><br />This is in fact <i>exactly</i> why we started out with a discussion of simultaneity being relative -- Observer $A$ is seeing a different point on the worldline of $A'$, and Observer $A'$ is seeing a different point on the worldline of $A$.<br /><br />Again, a spacetime diagram is instructive:<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://i.imgur.com/Q3w0PrB.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="629" data-original-width="800" height="313" src="https://i.imgur.com/Q3w0PrB.png" width="400" /></a></div><br />The point that Observer A sees on A''s worldline is not looking back at him at all -- it's seeing his <i>past</i>. And at the point on A''s worldline that A thinks is simultaneous to himself, A' is thinking of a different point on A's worldline as being simultaneous to <i>him</i>self.<br /><br />OK. But what if $A'$ returns to meet $A$? What if, for example, $A$ and $A'$ are two twins, and $A'$ goes to Betelgeuse at a non-zero speed and comes back? Who's older?<br /><br />The key is the change in velocity when $A'$ turns around.
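Before unpacking the turnaround, the asymmetry itself is easy to exhibit numerically. A minimal sketch (the speed and trip duration are illustrative values of my own choosing, in units where c = 1), applying $\Delta\tau = \Delta t \sqrt{1 - v^2}$ to each leg of the trip:

```python
import math

# Proper time aged by each twin, in units where c = 1.
# The traveling twin moves at speed v for coordinate time T/2 out and T/2 back;
# each leg contributes (T/2) * sqrt(1 - v^2) of proper time.
v = 0.8                    # traveling twin's speed (illustrative)
T = 10.0                   # round-trip duration in the stay-at-home frame
tau_home = T
tau_traveler = 2 * (T / 2) * math.sqrt(1 - v * v)
assert abs(tau_traveler - 6.0) < 1e-12   # the traveler ages only 6 of the 10 units
assert tau_traveler < tau_home           # ...so the stay-at-home twin is older
```

Both twins agree on these elapsed times once they reunite; the question is how the traveler accounts for the difference.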
It's not that special relativity no longer applies (that is a common myth), but rather that the axis of simultaneity changes here.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://i.imgur.com/IkA8tlP.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="602" data-original-width="800" height="300" src="https://i.imgur.com/IkA8tlP.png" width="400" /></a></div><br />In other words, $A'$ sees $A$'s clock suddenly speed up rapidly during the time that he turns around. If the turnaround is instantaneous, he sees $A$'s clock suddenly skip ahead a few years and then continue at a dilated rate. So $A'$ agrees: $A$ <i>is</i> older -- he rapidly aged at a certain point during $A'$'s journey, compensating for his otherwise apparently slow aging. When $A'$ comes back, he <i>does</i> come to a futuristic world where time travelers are culled at sight.<br /><br />This is the resolution to the twin paradox. The key point is that while observers may disagree on the simultaneity of spatially separated points, they cannot disagree on the simultaneity of points that coincide (i.e. when the twins meet).<br /><br /><div class="twn-furtherinsight">Think: What if $A'$ were going in a circle to return to his original starting point? Or worse -- what if the universe itself had a spherical geometry, such that $A'$ comes back to the same spot without being acted on by a force?</div><br /> In the latter case mentioned above, you truly need general relativity, because we made spacetime spherical. But what if we use a cylindrical spacetime instead? (It turns out cylinders do not have "intrinsic curvature", because distance functions are not affected when a cylinder is made from a flat sheet.) I.e.
where only one spatial dimension is chosen as curved?<br /><br />It turns out that the first postulate of special relativity (you can't do an experiment to determine your absolute velocity) gets broken in this context (because the radius of the cylinder gets Lorentz contracted when you're moving with respect to it, and you can measure this by sending light signals across the cylinder).<br /><br /><div class="twn-furtherinsight">To answer the question in an arbitrary geometry, including a curved one, we have to compare the "spacetime interval" traversed by each twin. We will define this in a coming article, and show that it is invariant. In general relativity, it is simply the calculation of this interval that changes a bit.</div><br />There are some other, less trivial paradoxes in special relativity. An example is <a href="https://physics.stackexchange.com/questions/197393/whats-the-name-for-the-relativistic-paradox-with-the-train-car-travelling-over" target="_blank">Rindler's grid paradox</a> -- whose resolution requires realising that rigid objects do not exist in special relativity.<br /><br />Tags: einstein, relativity, spacetime, special relativity, special relativity with acceleration, twin paradox, twins paradox | Wed, 20 Sep 2017 10:48:00 GMT | Abhimanyu Pallavi Sudhir<br /><br />Lorentz transforms lives
https://thewindingnumber.blogspot.com/2017/09/lorentz-transforms-lives.html
<b>Duration</b><br /><br />In your years as an infant reading up stuff on Wikipedia, you might've seen formulae such as<br /><br />$$\Delta t = \frac{{t'}}{{\sqrt {1 - {v^2}} }}$$<br />or simply $t = \gamma t'$. From our knowledge of the Lorentz transformations, we certainly know that the scale on the time axis <i>changes</i>. It would be interesting to find out exactly how this might be observed in real life -- I mean, we know how <i>time</i> as a co-ordinate transforms, but how does <i>duration</i> -- the interval between two points in time -- transform?<br /><br />You might be tempted to do calculations like ${t'_1} - {t'_2} = \gamma \left( {{t_1} - vx} \right) - \gamma \left( {{t_2} - vx} \right)$, much like people are tempted to sign up for "get rich quick" scams. Doing so would be reckless and stupid.<br /><br />What we need to do is first precisely formulate what we're looking for. We ask:<br /><br /><i>Suppose there is a clock moving at a constant velocity v relative to me. In my time, how long does it take for the moving clock to tick by 1 second? Assume that we synchronised our clocks in the beginning, i.e. the moving clock and my own clock showed exactly the same time at t = 0, when our positions coincided.</i><br /><br />Let's draw a spacetime diagram.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://i.stack.imgur.com/fbNSY.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="632" data-original-width="800" height="315" src="https://i.stack.imgur.com/fbNSY.png" width="400" /></a></div><br />Point <i>A</i> represents the event "moving clock ticks the one second mark". Since lines parallel to the <i>x</i>-axis link points that we (i.e.
the stationary observer) consider simultaneous, we draw a horizontal line connecting Point A and the <i>t</i>-axis (remember, we want to find out what tick of our clock is simultaneous, according to us, with 1 second elapsing on the moving clock). Mark this point of intersection <i>B</i>. Then we are interested in finding the duration <i>OB</i>, which we call $t$, in terms of <i>OA</i>, which we call $t'$.<br /><br />Well, from the Lorentz transformations we know that $t' = \gamma \left( {t - vs} \right)$. We also know, geometrically, that $s = vt$, so we may write $t' = \gamma t \left( {1 - v^2} \right)$, i.e. $t' = t\sqrt{1 - v^2}$, or $t = \gamma t'$.<br /><br />In general, for the duration between two events (where stuff might not pass through the origin at the right time), we may say $\Delta t = \gamma \Delta t'$. This phenomenon is called time dilation.<br /><br /><b>Distance</b><br /><br />We do the same sort of calculation for distances, first operationalising what we mean:<br /><br /><i>If I hold out a ruler to measure the length of a metre-stick (i.e. something that is 1 metre long in its own reference frame) moving at speed v relative to me, what would be the length I measure?</i><br /><br />Once again, we draw a spacetime diagram.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://i.stack.imgur.com/wNuz1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="607" data-original-width="800" height="302" src="https://i.stack.imgur.com/wNuz1.png" width="400" /></a></div><br />This is a little trickier -- when measuring the length of an object, we do so by measuring the two ends of the object simultaneously (or rather, what is simultaneous according to us). However, what is simultaneous for us is not what is simultaneous for the rod.
While the rod's reference frame holds <i>O</i> and <i>L</i> as simultaneous, we actually choose another point on the worldline -- <i>K</i> -- as simultaneous with <i>O</i>, because it lies on the <i>x</i>-axis.<br /><br />Then:<br /><br />$$x' = \gamma\left(x + SK - vh\right) = \gamma\left(x + vh - vh\right) = \gamma x$$<br />Hence $x = x'/\gamma$, i.e. length/distance in the direction of motion is <b>contracted</b> under a Lorentz transformation.<br /><br />Back when I was an infant, I was confused about why it was that time got dilated (<i>multiplied</i> by $\gamma$), while length got contracted (divided by $\gamma$). Well, now you know -- the two phenomena aren't temporal-spatial analogs of each other at all! Length contraction is a result of measuring the two ends of a distance simultaneously.<br /><br /><b>Speed</b><br /><br />We have been interested, since the beginning of this series, in finding out how velocities and speeds transform under a Lorentz transformation. Once again, we formulate our question precisely as follows (if you've done DIDYMEUS, you should understand how this forces us to accept logical positivism):<br /><br /><i>Suppose O' is moving at velocity v with respect to O. In O', the velocity of object K is w. What is the velocity of K in O?</i><br /><br />Once again, we draw a spacetime diagram.<br /><br /><div style="text-align: center;"><a href="https://i.stack.imgur.com/yoE4p.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="632" data-original-width="800" height="315" src="https://i.stack.imgur.com/yoE4p.png" width="400" /></a></div><br />So given $x'/t'$, how would we find $x/t$?<br /><br />Well, here's an idea: we know the Lorentz transformation associated with the velocity $w$.
So we just use simple matrix multiplication to find the compound transformation, and figure out what velocity is associated with this transformation.<br /><br />In other words, we write $L(v)L(w)$ as the co-ordinate system of $K$ with respect to $O$. Performing the matrix product,<br /><br />$$\begin{array}{c}\gamma (v)\left[ {\begin{array}{*{20}{c}}1&v\\v&1\end{array}} \right]\gamma (w)\left[ {\begin{array}{*{20}{c}}1&w\\w&1\end{array}} \right] = \frac{1}{{\sqrt {\left( {1 - {v^2}} \right)\left( {1 - {w^2}} \right)} }}\left[ {\begin{array}{*{20}{c}}{1 + vw}&{v + w}\\{v + w}&{1 + vw}\end{array}} \right]\\ = \frac{{1 + vw}}{{\sqrt {\left( {1 - {v^2}} \right)\left( {1 - {w^2}} \right)} }}\left[ {\begin{array}{*{20}{c}}1&{\frac{{v + w}}{{1 + vw}}}\\{\frac{{v + w}}{{1 + vw}}}&1\end{array}} \right]\\ = \frac{1}{{\sqrt {\frac{{{v^2}{w^2} + 1 - \left( {{v^2} + {w^2}} \right)}}{{{v^2}{w^2} + 1 + 2vw}}} }}\left[ {\begin{array}{*{20}{c}}1&{\frac{{v + w}}{{1 + vw}}}\\{\frac{{v + w}}{{1 + vw}}}&1\end{array}} \right]\\ = \frac{1}{{\sqrt {\frac{{{v^2}{w^2} + 1 + 2vw - {{\left( {v + w} \right)}^2}}}{{{v^2}{w^2} + 1 + 2vw}}} }}\left[ {\begin{array}{*{20}{c}}1&{\frac{{v + w}}{{1 + vw}}}\\{\frac{{v + w}}{{1 + vw}}}&1\end{array}} \right]\\ = \frac{1}{{\sqrt {1 - {{\left( {\frac{{v + w}}{{1 + vw}}} \right)}^2}} }}\left[ {\begin{array}{*{20}{c}}1&{\frac{{v + w}}{{1 + vw}}}\\{\frac{{v + w}}{{1 + vw}}}&1\end{array}} \right]\\ = \gamma \left( {\frac{{v + w}}{{1 + vw}}} \right)\left[ {\begin{array}{*{20}{c}}1&{\frac{{v + w}}{{1 + vw}}}\\{\frac{{v + w}}{{1 + vw}}}&1\end{array}} \right]\\ = L\left( {\frac{{v + w}}{{1 + vw}}} \right)\end{array}$$<br />Interestingly, this product is commutative. 
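This matrix identity can also be checked numerically. Here is a quick sketch (my own, in units where c = 1; the sample velocities are arbitrary) multiplying the two boost matrices and comparing against the boost for the added velocity:

```python
import math

def gamma(v):
    return 1.0 / math.sqrt(1.0 - v * v)

def L(v):
    """The 2x2 Lorentz boost matrix gamma(v) * [[1, v], [v, 1]] used above."""
    g = gamma(v)
    return [[g, g * v], [g * v, g]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

v, w = 0.5, 0.7
composed = matmul(L(v), L(w))
added = L((v + w) / (1 + v * w))     # the velocity addition formula
assert all(abs(composed[i][j] - added[i][j]) < 1e-9
           for i in range(2) for j in range(2))
```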
We may thus write:<br /><br />$$L\left( v \right)L\left( w \right) = L\left( {\frac{{v + w}}{{1 + vw}}} \right)$$<br />The reason this is a useful form to write the velocity addition formula is that it conveys the precise positivist sense in which velocity is transformed: as it is observed in the Lorentz transformation of things associated with it.<br /><br />One may let one of the velocities be $c$ and confirm that $c$ is the same in all reference frames.<br /><br /><div class="twn-furtherinsight">What happens when the Lorentz boost is in the direction perpendicular to the direction of motion? Well, distance is not contracted, but time is still dilated, and the velocity is reduced by a factor of $1/\gamma(v)$, where $v$ is the velocity of the observer. This ensures, and you can verify, that the resultant velocity in the new frame doesn't exceed $c$ even by the Pythagorean sum.</div><br /><b>Relativistic doppler shift</b><br /><br />This is a surprisingly important lemma for our future derivation of the equation $E=mc^2$, so make sure you're clear with it. Also, it tells you that speeding through a red light might cause it to turn into gamma radiation, if you go fast enough.<br /><br />We're interested in finding out how the frequency (i.e. colour) of light changes with respect to a moving observer, accounting for all relativistic effects. Frequency is just the inverse of the time period, which is the time interval between two wavefronts.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://i.stack.imgur.com/BwIbD.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="632" data-original-width="800" height="315" src="https://i.stack.imgur.com/BwIbD.png" width="400" /></a></div><br />The red vertical line is the worldline of the source, the blue line is the worldline of a moving observer and the black vertical line is of course the worldline of the observer we consider stationary.
The purple lines are the wavefronts emitted by the source. Suppose one wavefront hits the worldlines of both the stationary and moving observers at the origin. Another wavefront hits somewhat later.<br /><br />We first find the co-ordinates of the point of intersection between the blue worldline and the worldline of the second wavefront in the stationary co-ordinate system. We simply find the equations of the lines, $x = vt$ and $t = T - x$, and set them equal, so that:<br /><br />$$\begin{array}{l}t = T - vt\\(1 + v)t = T\\t = \frac{1}{{1 + v}}T\\x = \frac{v}{{1 + v}}T\end{array}$$<br />Now we may easily calculate the co-ordinate $t'$, which is the same as $T'$:<br /><br />$$T' = t' = \gamma \left( {\frac{1}{{1 + v}}T - vx} \right) = \gamma T\left( {\frac{1}{{1 + v}} - \frac{{{v^2}}}{{1 + v}}} \right) = \gamma T\left( {1 - v} \right) = \sqrt {\frac{{1 - v}}{{1 + v}}} T$$<br />Then<br /><br />$$f' = \sqrt {\frac{{1 + v}}{{1 - v}}} f$$<br />This is an important result! It means that even though a photon has the same speed however fast you chase it, you do see it getting less and less energetic.<br /><br />Sometimes you will see the inverse coefficient $\sqrt{\frac{1-v}{1+v}}$ -- this is for an observer moving away from the source.<br /><br /><div class="twn-furtherinsight">How fast would you need to go for a red light to become gamma radiation? Well, it means that $\sqrt {\frac{{1 + v}}{{1 - v}}} = f'/f = 10^{19}/(4\times10^{14}) = 2.5\times10^4$, i.e. $(1+v)/(1-v) = 6.25\times10^8$. Solving for $v$, one sees that it must be within 1 m/s of the speed of light.</div><br /><br />Tags: einstein, lorentz transformations, physics, relativity, special relativity | Sun, 17 Sep 2017 07:03:00 GMT | Abhimanyu Pallavi Sudhir<br /><br />Light cones and causality
https://thewindingnumber.blogspot.com/2017/09/light-cones-and-causality.html
Before we do anything further, I'd like to prove a point I made earlier about causal events.<br /><br />Recall that we earlier showed that the simultaneity of two events depends on the observer. Similarly, we know that the order of two events depends on the observer.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://i.stack.imgur.com/EkNyg.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="516" data-original-width="636" height="323" src="https://i.stack.imgur.com/EkNyg.png" width="400" /></a></div>In the above diagram, we have chosen two reference frames, Lorentz boosted with respect to each other, where the event <i>A</i> is the origin of both. Now while the black reference frame (call it $O$) measures B's time as being after A's time, the red reference frame ($O'$) measures B as occurring earlier. The figure below illustrates this phenomenon:<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://i.stack.imgur.com/IZgBk.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="519" data-original-width="702" height="295" src="https://i.stack.imgur.com/IZgBk.png" width="400" /></a></div><div class="separator" style="clear: both; text-align: center;"><br /></div>OK.<br /><br />Now notice how $x'$ is the set of all points that $O'$ measures as simultaneous to $A$, and $x$ is the set of all points $O$ measures as simultaneous to $A$.
What this means is that <i>if and only if the event B is between the x-axis and the x'-axis will the two observers disagree on which happened first</i>.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://i.stack.imgur.com/4bJBO.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="410" data-original-width="394" height="400" src="https://i.stack.imgur.com/4bJBO.png" width="383" /></a></div><br />Why is this significant? Well, no matter how fast $O'$ moves, it can never move faster than light, so $x'$ can never cross the blue boundary. If $O'$ moves in the opposite direction, $x'$ will go below the x-axis, but can once again not cross the blue boundary below. Therefore, the set of all events for which observers may disagree on whether they happened before or after A (the origin) is shaded in purple below:<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://i.stack.imgur.com/svN6q.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="410" data-original-width="394" height="400" src="https://i.stack.imgur.com/svN6q.png" width="383" /></a></div><div class="twn-furtherinsight">Think: What about the events on the blue lines? Are those included or excluded?</div><br />These are <b>all the events outside the <i>light cone</i> of the event A</b>. The light cone is the set of events not shaded in the diagram above. The part with positive <i>t</i> is called the "future light cone", and the part with negative <i>t</i> is called the "past light cone" of the event.<br /><br />Why is this significant? Well, the past light cone is the set of all events that could have possibly caused (i.e.
influenced) the event at <i>A</i>, and the future light cone is the set of all events that A could possibly cause (influence).<br /><br />In other words, the light cone is the set of all events causally connected to A! You should be able to explain why by now -- the borders of the light cone are paths of light beams sent away from or coming towards A. For A to cause or influence another event, it must be able to send some message, object or other form of information to that event. Since nothing can move faster than light, this message cannot move faster than light, so A cannot influence anything outside its future light cone. An analogous argument can easily be made for why nothing outside the past light cone can have influenced or caused A.<br /><br />The result is extremely significant! All observers agree on the order of two events A and B if A and B are in each other's light cones (if A is in B's future light cone, B is in A's past light cone, and vice versa). So the issue we mentioned earlier is solved.<br /><br />(Note: in 2+1 dimensions, the light cone would be an actual 3-dimensional cone, and in 3+1 dimensions, the light cone would be a 4-dimensional cone. If we could actually see a light cone in three spatial dimensions, it would be a sphere expanding from the event at the speed of light.)<br /><br />---<br /><br />We can also call the events in the future light cone the "absolute future" of $A$, and the events in the past light cone the "absolute past" of $A$. Similarly, one may call events outside the light cone the "absolute left" and "absolute right" of $A$ -- there are no observers whose worldlines pass through $A$ such that these events are to their left but to the right of $A$, or vice versa.<br /><br />This is a pretty obvious consequence of the fact that nothing can move faster than light. Indeed, we will often find that conclusions regarding time are obvious in the context of space, or vice versa.
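The whole classification reduces to the sign of $\Delta t^2 - \Delta x^2$, and can be sketched in a few lines (a hypothetical helper of my own, in units where c = 1, with one spatial dimension as in the diagrams):

```python
# Classify event B = (dt, dx) relative to an event A at the origin.
def separation(dt, dx):
    interval = dt * dt - dx * dx
    if interval > 0:
        return "timelike"    # inside A's light cone: all observers agree on order
    if interval < 0:
        return "spacelike"   # outside: observers can disagree on which came first
    return "lightlike"       # on the cone: connected only by a light signal

assert separation(2.0, 1.0) == "timelike"
assert separation(1.0, 2.0) == "spacelike"
assert separation(1.0, 1.0) == "lightlike"
```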
This fact, and the symmetry between space and time, might help us "guess" many conclusions in special relativity from what we already know.<br /><br />---<br /><br />Note that our proof is equivalent to requiring that no physical reference frame can move faster than the speed of light, i.e. that the x' axis cannot cross into the light cone.<br /><br />In fact, it is equivalent to our earlier proof that nothing can move faster than light, where events A and B are "light hitting the hi-tech wall" and "train hitting the hi-tech wall". There is a causal link between the two, since what happens when the train hits the hi-tech wall is affected by whether or not the light hits the hi-tech wall.<br /><br />This gives us an important insight regarding the thought experiment: we implicitly assumed that <i>time cannot run backwards</i>. This principle is known as <b>causality</b>, and we have just demonstrated that it is equivalent to the statement that things cannot move faster than light (i.e. <b>locality</b>).<br /><br />Tags: causality, einstein, light cones, physics, relativity, special relativity | Sat, 09 Sep 2017 07:00:00 GMT | Abhimanyu Pallavi Sudhir<br /><br />Introduction to special relativity
https://thewindingnumber.blogspot.com/2017/09/introduction-to-special-relativity.html
Often, one wonders why some major paradigm shift took so long to occur. We ponder this in the context of political economy, for instance -- with regards to the Neolithic Revolution (the invention of agriculture, circa 9000 BC) and the Industrial Revolution (which one may trace as the ultimate conclusion of a series of events that began with the end of feudalism in Europe in the 1400s).<br /><br />We also often ponder this in the context of scientific achievements. Why, for instance, did it take till Einstein for an insight as key as relativity to be discovered? Why was it not discovered by Archimedes, or by some Rashtrakuta or Vijayanagara mathematician, or at least in the time of Newton?<br /><br />While this question is sometimes tricky to answer, the answer is very clear in the context of relativity.<br /><br />Special Relativity was developed as a resolution to the failure of Galilean Relativity to accommodate the predictions of Maxwell's electromagnetism. It turns out that while Maxwell's electromagnetism was fine (it is "Lorentz invariant"), mechanics itself needed to be fixed. A key insight from relativity with regards to electromagnetism is, in fact, that <b>magnetism is a relativistic effect</b>. Magnetism is what you get when electricity undergoes a Lorentz transformation, i.e. when the charge starts moving.
It is just like the effect of velocity on mass, for instance, or distance, or duration.<br /><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://imgs.xkcd.com/comics/fundamental_forces.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="282" data-original-width="740" height="200" src="https://imgs.xkcd.com/comics/fundamental_forces.png" width="525" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">(Source: <a href="https://xkcd.com/1489/">XKCD - 1489</a>)</td></tr></tbody></table>The precise contradiction was as follows: Maxwell gives an absolute value for the speed of light, but Galileo says that no absolute speed exists -- it depends on the reference frame. For instance, if a train (travelling at <i>v</i> with respect to the ground) blares light at speed <i>c</i> (in its own reference frame), then according to the ground's reference frame, the light should be travelling at <i>c</i> + <i>v</i>.<br /><br /><div class="twn-pitfall">Light is not important! Even though the initial result that spurred relativity came from the theory of electromagnetism and light, relativity itself produces the same predictions for anything that travels at the same speed that light does -- any massless particle, in general. I.e. the curious case of light in relativity is not a result of its particle-physics-y properties, but its kinematic properties.</div><br />This prediction (by Galilean relativity) is fundamentally a result of the nature of the <i>Galilean transformation</i> (and this is the transformation that Einstein sought to change).
This is the transformation that tells you how to transform co-ordinates (or really anything) between inertial reference frames with the same origin.<br /><br />Suppose Observer $O'$ is moving at speed $v_{O'}$ with respect to Observer $O$. Now consider the time and position of some event $P$ to be $(t, x)$ in reference frame $O$. If we're dealing with four dimensions, then $x$ and $v$ are, of course, three-vectors. Then what's the position of event $P$ according to $O'$? Well, at time $t$, $O'$ would be $v_{O'}t$ to the right of $O$, hence the position of $P$ would be measured as $(t,x-v_{O'}t)$.<br /><br />This is the Galilean transformation:<br /><br />$$G(w):\,\,\left[ {\begin{array}{*{20}{c}}t\\x\end{array}} \right] \to \left[ {\begin{array}{*{20}{c}}t\\{x - wt}\end{array}} \right]$$<br />One may also write this as the matrix:<br /><br />$$G(w) = \left[ {\begin{array}{*{20}{c}}1&0\\{ - w}&1\end{array}} \right]$$<br />Perhaps the asymmetry of the matrix bothers you. It bothers me too. And fortunately for us, it bothered Einstein too, and he actually did something about it, rather than rant about it on a blog. In fact, the asymmetry of this matrix corresponds directly to the asymmetry of time and space in Galilean relativity. This is pretty clear from the form of the Galilean transformation, and should also be obvious from your knowledge of linear algebra (if it's not, you should go and read the first few chapters of the linear algebra course). As we will see, the symmetry between space and time will arise quite neatly from our postulates.<br /><br />One may plot this transformation on a <i>spacetime diagram</i>.
Below is a spacetime diagram viewed from the perspective of $O$, where the transformed reference frame $O'$ is shown as well.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://i.stack.imgur.com/KPToQ.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="481" data-original-width="492" height="312" src="https://i.stack.imgur.com/KPToQ.png" width="320" /></a></div><br />A spacetime diagram is essentially a displacement-time graph, where the displacement function is considered a transformation of the <i>t</i>-axis. We make the following observations:<br /><ul><li>The <i>t</i>' curve is the worldline of observer $O'$, i.e. the path taken by $O'$ in spacetime. The $x$-axis is not transformed (this is the asymmetry we were talking about earlier).</li><li>The <i>x</i>-axis is essentially the set of all events in spacetime such that $t=0$, i.e. events that are simultaneous with "the present". Surely, these points must be the same to all observers, since whether $t=0$ or $t=1$, or whatever value <i>t</i> holds, is independent of the observer? </li><li>Within the reference frame $O$, $O'$'s reference frame seems squished up. But since no reference frame is special, within $O'$'s reference frame, $O'$ will look normal, and $O$ will look transformed, specifically by the velocity $-v$ (so the axis <i>t</i> is tilted from the "normal" $t'$ by the same angle in the opposite direction). This is just the inverse transformation.</li></ul><br />So those were the Galilean transformations, which we know are incorrect (differentiate $x' = x-wt$ so you get $\dot{x}'=\dot{x}-w$ -- we know from Maxwell that this is incorrect with regards to light).
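To make this failure concrete, here is a minimal numerical sketch (Python with numpy -- an illustration, not part of the original derivation) that applies the matrix $G(w)$ to two events on a light ray's worldline and recovers the Galilean prediction $\dot{x}'=\dot{x}-w$:

```python
import numpy as np

def galilean(w):
    # G(w) acting on column vectors (t, x), as in the matrix above.
    return np.array([[1.0, 0.0],
                     [-w,  1.0]])

c = 1.0   # natural units: speed of light is 1
w = 0.5   # relative velocity of the boosted frame O'

# Two events on a light ray's worldline x = c*t, transformed into frame O'.
e0 = galilean(w) @ np.array([0.0, 0.0])
e1 = galilean(w) @ np.array([2.0, 2.0 * c])

# Speed of the same light ray as measured in O': dx'/dt'.
speed_in_Oprime = (e1[1] - e0[1]) / (e1[0] - e0[0])
print(speed_in_Oprime)  # 0.5, i.e. c - w rather than c
```

The transformed speed comes out as $c - w$, which is exactly the prediction that Maxwell's theory contradicts.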
Before we derive the correct transformations (called the "Lorentz transformations"), we'll first take a detour to prove some significant results in special relativity, which will also give us an idea of how powerfully predictive our two axioms are.<br /><br />(A note on notation: we will use units of distance and time such that the value of <i>c</i> is unity. For instance, lightseconds and seconds, etc. This is useful, because it eliminates <i>c</i> from our formulae and helps expose the symmetry between space and time.)<br /><br /><b>1. Nothing can travel faster than light</b><br /><br />We consider a thought experiment, where an object $O'$ travelling at speed <i>v</i> (in reference frame $O$) releases light in the positive <i>x</i>-direction. The speed of this light is, of course, <i>c</i> in all reference frames. According to $O$ (i.e. an observer in $O$), the speed of this light is $c$, but the speed of light relative to the object is $c-v$.<br /><br />That's okay. But now consider if $v>c$. Then $c-v$ is negative, i.e. $O$ observes the light ray being emitted at some speed in the <i>other direction </i>with respect to the object, i.e. it sees $O'$ farting out the light ray, rather than vomiting it up.
Whereas according to $O'$, he is stationary, and the velocity of light is still $c$ in the positive <i>x</i>-direction.<br /><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://upload.wikimedia.org/wikipedia/commons/6/64/Tachyon04s.gif" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="159" data-original-width="571" height="89" src="https://upload.wikimedia.org/wikipedia/commons/6/64/Tachyon04s.gif" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Simulation of how one would observe a "tachyon" -- a hypothetical particle that can move faster than light.</td></tr></tbody></table>Why is this inconsistency problematic? Well, suppose there is some hi-tech wall somewhere further down the positive x-direction, which functions in the following way:<br /><ul><li>If light is shone on it, the hi-tech wall stops working.</li><li>If the object collides into the wall while it is working, it sets off blaring alarms and sends planes flying into buildings so everyone knows about the event.</li></ul><br />According to Observer $O$, the object collides with the wall <i>first, </i>before the wall stops working. His children die in a plane crash and he ends up drunk and homeless.<br /><br />According to Observer $O'$, however, he can never catch up with the light ray, and he bangs into a dysfunctional wall, and nothing happens. 
He turns around and waves at $O$, who is sober.<br /><br />We have ended up at a logical contradiction, and our only solution is to say that such an object that travels faster than light, $O'$, <i>does not exist</i>.<br /><br /><div class="twn-furtherinsight">When I narrate this proof to people, they are quick to ask "If we're talking about the response of the wall, shouldn't we only care about the wall's reference frame?" Well, no, because that privileges a reference frame. The laws of physics -- including the laws of the wall -- are valid in all reference frames, and an external observer shouldn't see the wall giving a response inconsistent with the laws of the wall. I mean, it's <i>possible </i>to have a wall that functions in the way described above in all reference frames.</div><br />This is a rather surprising result. If you're travelling at .99999999c, can't you just supply a bit of energy to go 4m/s faster? As we will see, it turns out the laws of dynamics also change in special relativity, and this "bit" of energy is <i>infinite</i>.<br /><br />"Aha!" you say, "Maybe you can never choose an inertial reference frame travelling faster than light with respect to your reference frame, but what if you choose a reference frame travelling at 0.6c, then another reference frame travelling at 0.6c with respect to that reference frame. Then wouldn't this third reference frame be travelling at 1.2c with respect to us?" Again, it turns out that even the velocity addition formula is changed in special relativity.<br /><br /><div class="twn-pitfall">When we say "velocity addition formula", we mean "co-ordinate transformation between reference frames in relative motion with each other", i.e. if an observer on the train moving at speed <i>v</i> relative to the ground measures the speed of something to be <i>w</i>, then what's the speed of that thing wrt the ground?
That's the velocity addition we're talking about.<br /><br />We're <i>not</i> talking about velocity addition within the same reference frame. If we see two light beams shining towards each other, we <i>do</i> see the space closing at speed 2<i>c</i>.</div><br /><b>2. Relativity of simultaneity</b><br /><br />We know from linear algebra that a linear transformation in $\mathbb{R}^n$ can be described fully by the images of $n$ linearly independent vectors. These images, written next to each other, form the matrix of the transformation in the basis comprised of these vectors. One such set of linearly independent vectors is the standard basis.<br /><br />We're working towards an expression for the Lorentz transformation. To find out how the unit vectors in the <i>t</i>, <i>x</i>, <i>y</i> and <i>z</i> axes transform, it is sufficient to find out how the axes themselves transform, and what the scale is on these transformed axes (e.g. the identity transformation and a scaling of two both leave the axes unchanged, but the scales on the transformed axes are different).<br /><br /><div class="twn-furtherinsight">For readers with a reasonable knowledge of linear algebra: we know two sets of eigenvectors of the Lorentz transformation. The fact that these are not eigenvectors of the Galilean transformation is equivalent to the problem of the Galilean transformation not respecting the invariance of the speed of light. The problem of special relativity is therefore equivalent to finding a matrix with these eigenvectors, with eigenvalues that respect some symmetry properties we will see later.</div><br />We consider a general 1+1-dimensional spacetime diagram in the reference frame of $O$. Obviously, the <i>t</i>-axis is the worldline of our observer/the origin of our observer.<br /><br />What, exactly, is the <i>x</i>-axis? Well, any <i>P-</i>axis is essentially the set of points such that all co-ordinates except <i>P</i> are 0.
The <i>x</i>-axis is the set of points such that <i>t</i> = 0.<br /><br />In other words, the <i>x</i>-axis is what the observer regards as <i>the present</i>. If the x-axis were transformed in any way, then it would mean that the idea of what's the present and what's the past <i>also depends on the observer</i>. In general, any line parallel to the <i>x</i>-axis is a line of simultaneity (i.e. events that occur at the same point in time), and if the <i>x</i>-axis is transformed, the conception of simultaneity depends on the observer.<br /><br />So it makes sense to study simultaneity in our quest to find the Lorentz transformation.<br /><br />The relativity of simultaneity can be illustrated with the following thought experiment: suppose we have two sources of light, $S_1$ and $S_2$, which (in the reference frame of Observer O) release a pulse of light at the same instant $t=0$. How does Observer O know this? Well, he is situated at the midpoint of the two sources, and knows the distance between him and each source to be <i>s</i>, so when he sees the two pulses simultaneously at $t=s/c$, he knows that each pulse was released $s/c$ time earlier, i.e. at $t=0$.<br /><br />Now consider another observer $O'$, moving parallel to the light coming from $S_1$ at speed $v$. It happens to be that at the instant the two pulses collide at the origin of $O$, $O'$ also crosses this point.<br /><br />It is important to note that he observes the collision of the two pulses at the same instant as $O$ does. This occurs at a single event (a single point in spacetime), and all observers must agree on what happens at this event (this is from the principle of relativity).<br /><br />However, what $O'$ disagrees with $O$ on is on the simultaneity of the release of the light pulses itself. Observer $O'$ considers himself to be an ordinary, stationary observer. 
He has seen the point of intersection running away from the light emanating from $S_2$ and towards the light emanating from $S_1$. Therefore light -- whose speed is $c$ -- emanating from $S_2$ has to catch up with the intersection point, the distance closing at a speed of $c-v$, while light emanating from $S_1$ meets the intersection point with the distance closing at a speed of $c+v$.<br /><br />So for them to meet at the intersection point the same distance away from each source, $S_2$ must have released its pulse earlier than $S_1$. How much earlier? Well, let's not go there too fast -- we still don't know if distances themselves change in the reference frame, i.e. what the scale on the transformed <i>x</i>-axis is.<br /><br /><div class="twn-pitfall">Being simultaneous with an event doesn't mean "seeing" it. To see something, light (or anything else) must travel from that event to the observer's worldline. For instance, if we were to see Betelgeuse go supernova today, we are not simultaneous with the event, which we calculate to have happened 600 years ago.</div><br />You might wonder, then: what if the two events are causally connected? I.e. what if there is a link that ensures $S_2$ turns on in response to $S_1$ turning on? Well, it turns out that in such a case, the order of the two events <i>is</i> in fact preserved. We will see why later -- the reason has to do with the connection between causality and light cones.<br /><br /><b>3. Transformation of the <i>x</i>-axis</b><br /><br />Let's think about how one would actually determine some event to be simultaneous to us right now. Well, obviously one must observe the event, for which we must detect the light coming from that event. Suppose we just observed the light we get from that event right now, at $t=0$. Could we say that gives us an event simultaneous with the present? Well, of course not. We know that the light traveled some distance to get here, so we're observing an event some time into the past.
We would figure out how much into the past by determining the distance of the event from us. How would we do this? Well, we would reflect light off the object and see how long it takes to return.<br /><br />So suppose we use this method to determine which event is simultaneous to us. Releasing a light ray <i>now</i> would be too late -- we would get an event in the future in the reflected ray. Instead, we should have released a light ray <i>d</i>/<i>c</i> seconds ago, and if the light ray returns <i>d</i>/<i>c</i> seconds into the future, then the object is <i>d</i> away from us, and the reflected ray shows the event simultaneous to us right now.<br /><br />For instance, if we shot a light ray at Betelgeuse in 1400, then the reflection we get in 2600 will be an image of how Betelgeuse looks today, in the year 2000 (on the scale of the universe, 17 years is no big deal -- for example, it is clearly an insufficient age to learn the difference between 2000 and 2017), because the star is 600 lightyears away.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://i.stack.imgur.com/RTPWq.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="502" data-original-width="615" height="261" src="https://i.stack.imgur.com/RTPWq.png" width="320" /></a></div><br />(Note that the slope of a light ray on a spacetime diagram is always <i>1/c</i> in some direction -- since we're using natural units, this is just a slope of 1.)<br /><br />So here we have a general property -- and in fact a defining property -- of the <i>x</i>-axis: <b>it is the set of all points such that if you sent a ray $a$ seconds ago to bounce off the point, it will return to us $a$ seconds later.</b><br /><br />Why is this useful?
Well, if this reference frame were viewed in some other observer's reference frame, it would still be true (by the principle of relativity).<br /><br />What do we mean?<br /><br />Label the axes of this co-ordinate system as $t'$ and $x'$:<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://i.stack.imgur.com/3A2Vv.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="521" data-original-width="655" height="254" src="https://i.stack.imgur.com/3A2Vv.png" width="320" /></a></div><br />Then how would the points of spacetime in this reference frame map into another reference frame? Well, perhaps something like this:<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://i.stack.imgur.com/3XY0n.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="512" data-original-width="645" height="254" src="https://i.stack.imgur.com/3XY0n.png" width="320" /></a></div><br />What do we want to know about this diagram? Well, the direction of the <i>x</i>' axis relative to the <i>x</i>-axis, i.e. the angle between them.<br /><br />What do we know about this diagram?<br /><ul><li>The slopes of the blue lines (the paths of the light rays) are of magnitude one (because the speed of light is the same in this reference frame, too).</li><li>AO = OD.</li><li>The angle between the <i>t</i> and <i>t'</i> axes, which is simply a function of the velocity (you should be able to calculate this angle with respect to the velocity by now -- remember, it's just a distance-time graph).</li></ul>How would you calculate angle BOC?<br /><br />Note: this is simply a geometric problem at this point. I encourage you to try it out on your own.<br /><br /><b>SPOILERS AHEAD.</b><br /><br />Well, if you look at the diagram hard enough, you might have noticed that ABD is a right-angled triangle with right angle B. Additionally, AO = OD.
Well, any triangle can be inscribed in a circle, and in the case of a right-angled triangle, AD becomes the diameter. Thus AO = OD = OB is the radius.<br /><br />Then ODB is an isosceles triangle, and angle ODB = angle OBD. Meanwhile angles OCE and OEC are both $\pi/4$, thus angles OED and OCB are equal as well (they are $3\pi/4$). Triangle OBC is thus congruent to triangle ODE (since two angles and a side are equal), and angle BOC = angle DOE. Since angle DOE = angle FOA, this means angle BOC = angle FOA.<br /><br />This conclusion is tremendously significant: the <i>x</i>-axis is rotated by precisely the same angle as the <i>t</i>-axis is, towards each other. This creates a brilliant symmetry between space and time in relativity. We are also very close to our final expression for the Lorentz transformation.<br /><br /><div class="twn-furtherinsight">By the way, this proof also illustrates the beauty of natural units: by choosing a system of units such that the slope of light's path is one, the angle ABD became a right angle, and we were able to exploit the property of an angle subtended by the diameter being 90 degrees.</div><br /><div class="twn-furtherinsight">Think about what happens in a reference frame moving at the speed of light. The axes then coincide.</div><br /><div class="twn-furtherinsight">To be fair, we already expected this. Since the speed of light is constant, the null vectors (vectors pointing along the path of light in spacetime, i.e. along the diagonals) are eigenvectors of the Lorentz transformation. The only way for this to be true when you have a linear transformation is for the <i>x</i>-axis to be tilted inwards by the same angle.</div><br /><b>4. Scale on the transformed axes</b><br /><br />We now know how the axes transform, and must determine the scale on each axis.<br /><br />First of all, we may assume that the Lorentz transformation is linear. Why?
Well, a linear transformation is one which ensures that all straight lines remain straight lines, and the origin remains fixed. The origin remains fixed in the Lorentz transformation by definition (since the observer is at the same spot -- translations are not considered), and lines must not turn into curves, since curves represent non-inertial reference frames and an inertial reference frame must be seen as inertial in all reference frames.<br /><br />So what do our unit vectors look like? Well, we know the image of the <i>x</i>-unit vector is a multiple of the vector $\left[ {\begin{array}{*{20}{c}}<br /> 1 \\<br /> v<br />\end{array}} \right]$, where we're of course using natural units. The <i>t</i>-unit vector, meanwhile, is a multiple of the vector $\left[ {\begin{array}{*{20}{c}}<br /> v \\<br /> 1<br />\end{array}} \right]$.<br /><br />So the transformation matrix, which is itself a function of $v$, takes the form<br /><br />$$L(v)=\left[ {\begin{array}{*{20}{c}}<br /> \alpha &{\beta v} \\<br /> {\alpha v}&\beta <br />\end{array}} \right]$$<br />for some constants $\alpha$ and $\beta$. Note that this is the transformation matrix which maps the original co-ordinate system to the new one -- the actual Lorentz transformation is a co-ordinate transformation, and thus the inverse of this matrix.<br /><br />How would we find the values of $\alpha$ and $\beta$? Well, one way would be to consider the product $L(v)L(-v)$. Since you are simply boosting by a velocity of $v$ then boosting back by $-v$, this product must equal the identity matrix $I$.
We impose this condition:<br /><br />$$\begin{gathered}<br /> \left[ {\begin{array}{*{20}{c}}<br /> 1&0 \\<br /> 0&1<br />\end{array}} \right] = \left[ {\begin{array}{*{20}{c}}<br /> \alpha &{\beta v} \\<br /> {\alpha v}&\beta <br />\end{array}} \right]\left[ {\begin{array}{*{20}{c}}<br /> \alpha &{ - \beta v} \\<br /> { - \alpha v}&\beta <br />\end{array}} \right] = \left[ {\begin{array}{*{20}{c}}<br /> {{\alpha ^2} - \alpha \beta {v^2}}&{{\beta ^2}v - \alpha \beta v} \\<br /> {{\alpha ^2}v - \alpha \beta v}&{{\beta ^2} - \alpha \beta {v^2}}<br />\end{array}} \right] \hfill \\<br /> {\alpha ^2}v - \alpha \beta v = 0 = {\beta ^2}v - \alpha \beta v \Rightarrow {\alpha ^2} = \alpha \beta = {\beta ^2} \Rightarrow \alpha = \beta \hfill \\<br /> {\alpha ^2} - \alpha \beta {v^2} = 1 = {\beta ^2} - \alpha \beta {v^2} \Rightarrow {\alpha ^2} = 1 + \alpha \beta {v^2} = {\beta ^2} \Rightarrow {\alpha ^2} = 1 + {\alpha ^2}{v^2} \hfill \\<br /> \Rightarrow \alpha = \beta = \frac{1}{{\sqrt {1 - {v^2}} }} \hfill \\<br />\end{gathered} $$<br />We call this coefficient the "Lorentz factor", and denote it by $\gamma$. 
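As a quick check of this derivation (a Python sketch with numpy -- an illustration, not part of the original post), we can build $L(v)$ with $\alpha = \beta = \gamma$ and confirm numerically that boosting by $v$ and then by $-v$ really is the identity:

```python
import numpy as np

def gamma(v):
    # Lorentz factor in natural units (c = 1).
    return 1.0 / np.sqrt(1.0 - v**2)

def L(v):
    # Transformation matrix derived above, with alpha = beta = gamma.
    return gamma(v) * np.array([[1.0, v],
                                [v,   1.0]])

v = 0.6
print(gamma(v))                              # 1.25 for v = 0.6
print(np.allclose(L(v) @ L(-v), np.eye(2)))  # True: the boosts cancel
```

Note that $\gamma \to \infty$ as $v \to 1$, which is the matrix-level reason nothing can be boosted up to the speed of light.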
From linear algebra, we know then that the co-ordinates of any point can then be transformed into the reference frame $O'$ as follows:<br /><br />$$\begin{gathered}<br /> \left[ {\begin{array}{*{20}{c}}<br /> {x'} \\<br /> {t'}<br />\end{array}} \right] = {L^{ - 1}}\left[ {\begin{array}{*{20}{c}}<br /> x \\<br /> t<br />\end{array}} \right] = {\gamma ^{ - 1}}{\left[ {\begin{array}{*{20}{c}}<br /> 1&v \\<br /> v&1<br />\end{array}} \right]^{ - 1}}\left[ {\begin{array}{*{20}{c}}<br /> x \\<br /> t<br />\end{array}} \right] \\<br /> = \sqrt {1 - {v^2}} \cdot \frac{1}{{1 - {v^2}}}\left[ {\begin{array}{*{20}{c}}<br /> 1&{ - v} \\<br /> { - v}&1<br />\end{array}} \right]\left[ {\begin{array}{*{20}{c}}<br /> x \\<br /> t<br />\end{array}} \right] \\<br /> = \frac{1}{{\sqrt {1 - {v^2}} }}\left[ {\begin{array}{*{20}{c}}<br /> 1&{ - v} \\<br /> { - v}&1<br />\end{array}} \right]\left[ {\begin{array}{*{20}{c}}<br /> x \\<br /> t<br />\end{array}} \right] \\<br /> = \gamma \left[ {\begin{array}{*{20}{c}}<br /> 1&{ - v} \\<br /> { - v}&1<br />\end{array}} \right]\left[ {\begin{array}{*{20}{c}}<br /> x \\<br /> t<br />\end{array}} \right] \\<br />\end{gathered} $$<br />We may write this without matrices as:<br /><br />$$\begin{gathered}<br /> x' = \gamma \left( {x - vt} \right) \\<br /> t' = \gamma \left( {t - vx} \right) \\<br />\end{gathered} $$<br />Which updates the Galilean transformation discussed previously, which was $x'=x-vt,\ \ t' = t$.<br /><br />How does this look without natural units? 
Well, first of all,<br /><br />$$\gamma = \frac{1}{{\sqrt {1 - \frac{{{v^2}}}{{{c^2}}}} }}$$<br />And<br /><br />$$\begin{gathered}<br /> x' = \gamma \left( {x - \frac{v}{c}ct} \right) \hfill \\<br /> ct' = \gamma \left( {ct - \frac{v}{c}x} \right) \hfill \\<br />\end{gathered} $$<br />You can see why we prefer to set $c=1$, but this is also instructive -- it presents a symmetry between $x$ and $ct$, and $v/c$ is the important "ratio factor" between these dimensions.<br /><br /><div class="twn-pitfall">The transformations we've been calling "Lorentz transformations" are actually Lorentz <i>boosts</i>. Lorentz transformations are a broader set of transformations which includes boosts as well as spatial rotations -- essentially all linear transformations under which special relativity is invariant. An even broader set, called the Poincaré transformations, is the set of all <i>affine</i> transformations under which special relativity is invariant, i.e. it includes translations. As we will learn, General Relativity is only invariant under Lorentz transformations, not translations.</div><br /><div class="twn-furtherinsight">We imposed the condition $L(v)L(-v)=L(0)$. Do you think one may impose, in general, that $L(v)L(w)=L(v+w)$? Why or why not? ... Answer is "no", because the velocity addition formula is <i>not</i>, in general, $v+w$.</div><br /><b>5. Zero orthogonal action of the Lorentz transformation</b><br /><br />Something we haven't considered so far is how a Lorentz boost treats spatial directions orthogonal to the boost direction. We've been considering a Lorentz boost in the <i>x</i>-direction -- what happens to the <i>y</i>- and <i>z</i>- coordinates under this boost?<br /><br />Well, it turns out, the answer is nothing. The explanation for this is pretty simple: attach a paintbrush to a train and let it paint the walls of the tunnel as the train drives through. Now send another train in the opposite direction and attach a paintbrush to it at the same height.
Neither paintbrush can be "higher" than the other -- the paintbrushes must overlap in all reference frames.einsteinmathematicsphysicsrelativityspecial relativitysymmetryFri, 08 Sep 2017 17:10:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-2708899645807757616Abhimanyu Pallavi Sudhir2017-09-08T17:10:00ZTranspose, symmetric matrices, null and row spaces, dot products
https://thewindingnumber.blogspot.com/2017/08/symmetric-matrices-null-row-space-dot-product.html
0This is an exciting article, so pay close attention.<br /><br />In the last article, we introduced the column space, and we are led to wonder about the vectors that are mapped to the zero vector (let's call the set of these vectors the "null space"). Based on some of the transformations we've seen, we might wonder if the null space is essentially the subspace perpendicular to the column space, i.e. the set of all vectors perpendicular to every vector in the column space (we call this the "orthogonal complement" of the column space).<br /><br />However, this is demonstrably wrong. Suppose one rotates the co-ordinate system before collapsing it onto a lower dimension. The "collapsing" matrix, the singular matrix, may be $\left[ {\begin{array}{*{20}{c}}1&0\\0&0\end{array}} \right]$, and the rotation matrix may be $\left[ {\begin{array}{*{20}{c}}0&{ - 1}\\1&0\end{array}} \right]$, then the composite transformation is $\left[ {\begin{array}{*{20}{c}}1&0\\0&0\end{array}} \right]\left[ {\begin{array}{*{20}{c}}0&{ - 1}\\1&0\end{array}} \right] = \left[ {\begin{array}{*{20}{c}}0&{ - 1}\\0&0\end{array}} \right]$<br /><br />In general in $\mathbb{R}^2$, if the "simple collapse" transformation looks like this:<br /><br />$$\left[ {\begin{array}{*{20}{c}}{{{\cos }^2}\phi }&{\cos \phi \sin \phi }\\{\cos \phi \sin \phi }&{{{\sin }^2}\phi }\end{array}} \right]$$<br />(Show that the non-rotating collapses can be written in this way), and the rotation matrix (counter-clockwise by $\theta$) looks like this:<br /><br />$$\left[ {\begin{array}{*{20}{c}}{\cos \theta }&{ - \sin \theta }\\{\sin \theta }&{\cos \theta }\end{array}} \right]$$<br />Then their product<br /><br />$$\left[ {\begin{array}{*{20}{c}}{{{\cos }^2}\phi }&{\cos \phi \sin \phi }\\{\cos \phi \sin \phi }&{{{\sin }^2}\phi }\end{array}} \right]\left[ {\begin{array}{*{20}{c}}{\cos \theta }&{ - \sin \theta }\\{\sin \theta }&{\cos \theta }\end{array}} \right] \\= \left[ {\begin{array}{*{20}{c}}{\cos \phi \cos (\phi - \theta )}&{\cos \phi \sin (\phi -
\theta )}\\{\sin \phi \cos (\phi - \theta )}&{\sin \phi \sin (\phi - \theta )}\end{array}} \right]$$<br />Which is also a singular matrix with the same rank and column space as the original, but its null space is a rotated version of the original, and therefore no longer perpendicular to the column space.<br /><br />We make the following observations:<br /><ol><li>The rotation has introduced an "asymmetry" in the components of the matrix -- the matrix is no longer symmetric across the principal diagonal, the rows and columns are no longer the same. In our original terminology before we introduced matrices, the "contribution of <i>y</i> to <i>x</i>" is no longer the same as "the contribution of <i>x</i> to <i>y</i>". </li><li>The span of the column vectors is not dependent on $\theta$ (proving again that the column space is not the space orthogonal to the null space), but the span of the <i>row</i> vectors is -- we'll call this the <i>row space</i>. The row space used to be the same as the column space, but is now rotated clockwise by an angle of $\theta$. Since the space is rotated counter-clockwise by $\theta$ before being collapsed, this means the null space is $\theta$ clockwise to what its position would have been without the rotation. </li></ol>Hence, the null space is perpendicular to the row space, at least if the pre-collapse transformation is a rotation. It would be a very educated guess to say that this applies generally, that the null space of a transformation is always perpendicular to the row space.<br /><br />Indeed, this is the case, and is pretty easy to show:<br /><br />$$\left[ {\begin{array}{*{20}{c}}{{r_1}}\\ \vdots \\{{r_n}}\end{array}} \right]\vec x = 0\,\,\, \Rightarrow \,\,\,{r_i} \cdot \vec x = 0\,\,\,\,\,\,\forall i \in [1,n] \cap \mathbb{Z}$$<br />More interestingly, however, we made a pretty important observation about the symmetry of a matrix: <b>asymmetry in the matrix seems to be a measure of how "rotation-ish" a matrix is</b>.
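Here is a small numerical sketch of the rotate-then-collapse example (Python with numpy; the angles are arbitrary choices for illustration, not from the original text): the null-space direction is orthogonal to every row of the composite matrix, but not to its column space once $\theta \neq 0$:

```python
import numpy as np

phi, theta = 0.7, 0.4   # arbitrary collapse and rotation angles

collapse = np.array([[np.cos(phi)**2,          np.cos(phi)*np.sin(phi)],
                     [np.cos(phi)*np.sin(phi), np.sin(phi)**2]])
rotation = np.array([[np.cos(theta), -np.sin(theta)],   # counter-clockwise by theta
                     [np.sin(theta),  np.cos(theta)]])
M = collapse @ rotation   # rotate first, then collapse

# The rows of M all point along (cos(phi - theta), sin(phi - theta)), so the
# null space is the line perpendicular to that direction:
null_vec = np.array([-np.sin(phi - theta), np.cos(phi - theta)])
col_dir  = np.array([np.cos(phi), np.sin(phi)])   # column space direction

print(np.allclose(M @ null_vec, 0))   # True: null space is orthogonal to the rows
print(np.dot(null_vec, col_dir))      # sin(theta) != 0: NOT orthogonal to columns
```

The second dot product works out to exactly $\sin\theta$, which vanishes only when there is no pre-collapse rotation -- matching the observation above.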
The reason this makes sense is that while the values on the principal diagonal talk about how much each component is <i>scaled</i> in the resulting image, the off-diagonal elements talk about how much contribution there is to one component of a vector from the component in another direction, something that happens during rotations.<br /><br />If the contribution from <i>x</i> to <i>y</i> is exactly the same as the contribution from <i>y</i> to <i>x</i>, then the <i>xy</i>-plane isn't really being rotated, but the basis vectors are rather being pulled closer to/apart from each other. On the other hand, when they are different, the "orientation" of the axes changes, and an actual rotation is induced.<br /><br />The most "rotation-ish" effect is created when the values of the matrix are the negatives of their reflections across the principal diagonal, because it means the basis vectors are being rotated by the same angle in the same direction.<br /><br />Quick exercise: plot how such a transformation looks in $\mathbb{R}^2$. You'll notice that the rotation isn't as "perfect" as you might've hoped -- the vectors change length, etc., and it's the orthogonal displacements of the basis vectors that are equal, not the angles they are rotated through.<br /><br />These changes in length are a result of scaling, i.e. the values along the principal diagonal.<br /><br /><div class="twn-pitfall">Well, not exactly, because the values along the principal diagonal scale the part of the basis vectors in their original direction, not the overall length of the basis vectors. 
This is why (i) the resulting matrix not only eliminates the scaling, but also ensures the rotation is by right angles, (ii) the resulting matrix is not actually just a bunch of pure rotations, as its rotations are still scaled, and (iii) why some otherwise-antisymmetric matrices <i>with </i>a principal diagonal, like $\left[ {\begin{array}{*{20}{c}}{\cos \theta }&{ - \sin \theta } \\ {\sin \theta }&{\cos \theta }\end{array}} \right]$, can still be rotations by some angle other than a right angle. Highly recommended reading: my answer to "<a href="https://math.stackexchange.com/a/2780461/78451">Intuition behind the Speciality of Symmetric Matrices</a>".</div><br />Therefore, it is useful to extract from the matrix the purely anti-symmetric part, with a zero principal diagonal:<br /><br />$$\left[ {\begin{array}{*{20}{c}}0&{{b_{12}}}& \ldots &{{b_{1n}}}\\{ - {b_{12}}}&0& \ldots &{{b_{2n}}}\\ \vdots & \vdots & \ddots & \vdots \\{ - {b_{1n}}}&{ - {b_{2n}}}& \ldots &0\end{array}} \right]$$<br />This matrix is called a <i>skew-symmetric matrix</i> or an <i>anti-symmetric matrix</i>, and in $\mathbb{R}^n$ is essentially a combination of scaled rotations around different axes.<br /><br />Meanwhile, one may define a <i>symmetric matrix</i> in the following way:<br /><br />$$\left[ {\begin{array}{*{20}{c}}{{b_{11}}}&{{b_{12}}}& \ldots &{{b_{1n}}}\\{{b_{12}}}&{{b_{22}}}& \ldots &{{b_{2n}}}\\ \vdots & \vdots & \ddots & \vdots \\{{b_{1n}}}&{{b_{2n}}}& \ldots &{{b_{nn}}}\end{array}} \right]$$<br />These matrices essentially scale and skew vectors -- the principal diagonal components do the scaling in each direction, and the other components do the skewing, i.e. tilting the basis vectors towards each other.<br /><br /><div class="twn-furtherinsight">One may wonder if it would have been better to simply deal with diagonal matrices instead of symmetric ones. However, as we will see, it turns out that any matrix can be written as the sum of a symmetric matrix and an anti-symmetric matrix. 
It is also worth noting that we will eventually see that a <b>symmetric matrix is a generalisation of a scaling matrix</b>, where the scaling is done in some arbitrary directions that need not be the basis vectors. These vectors are called "<b>eigenvectors</b>".</div><br />Looking at the definitions of symmetric and antisymmetric matrices, it is clear that any matrix may be written as the sum of a symmetric matrix and an antisymmetric one. Specifically, one may write:<br /><br />$$A = \underbrace {\frac{1}{2}(A + {A^T})}_{\scriptstyle{\rm{symmetric }}\atop\scriptstyle{\rm{part}}} + \underbrace {\frac{1}{2}(A - {A^T})}_{\scriptstyle{\rm{antisymmetric }}\atop\scriptstyle{\rm{part}}}$$<br />where $A^T$ is the "transpose" of $A$, the matrix formed by flipping $A$'s entries across the principal diagonal. E.g. the row space of $A$ is the column space of $A^T$.<br /><br /><div class="twn-analogies">Does this all remind you of something?<br /><ul><li>Any entity can be written as the sum of two "parts" of a specific nature, which look curiously like the exponential forms of the cosine and sine functions</li><li>These two parts represent scaling and scaled $\pi/2$ rotation respectively.</li></ul>It should remind you of the <b>Cartesian form of a complex number</b>.<br /><br /><b>A symmetric matrix is "like" a real number, and an anti-symmetric matrix is "like" an imaginary number. </b> A complex number can be thought of as an object to be transformed (like a vector) as well as a transformation itself (like a matrix). 
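The decomposition $A = \frac{1}{2}(A + A^T) + \frac{1}{2}(A - A^T)$ can be verified mechanically on a random matrix; a quick sketch in plain Python (helper names mine):

```python
import random

random.seed(0)
n = 3
A  = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
At = [[A[j][i] for j in range(n)] for i in range(n)]  # the transpose of A

sym  = [[(A[i][j] + At[i][j]) / 2 for j in range(n)] for i in range(n)]
anti = [[(A[i][j] - At[i][j]) / 2 for j in range(n)] for i in range(n)]

for i in range(n):
    for j in range(n):
        assert sym[i][j] == sym[j][i]                         # symmetric part
        assert anti[i][j] == -anti[j][i]                      # antisymmetric part
        assert abs(sym[i][j] + anti[i][j] - A[i][j]) < 1e-15  # they sum to A
    assert anti[i][i] == 0  # zero principal diagonal, as claimed
```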
<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://upload.wikimedia.org/wikipedia/commons/thumb/d/d2/90-Degree_Rotations_in_the_Complex_Plane.svg/2000px-90-Degree_Rotations_in_the_Complex_Plane.svg.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="800" data-original-width="800" height="320" src="https://upload.wikimedia.org/wikipedia/commons/thumb/d/d2/90-Degree_Rotations_in_the_Complex_Plane.svg/2000px-90-Degree_Rotations_in_the_Complex_Plane.svg.png" width="320" /></a></div><br />However, it is important to highlight the differences: (i) while a real number only scales things in their own direction (like a multiple of $I$), a symmetric matrix can skew things, which is equivalent to scaling across a different set of axes. Therefore while complex numbers can encode spiral transformations, matrices encode <i>all</i> linear transformations. (ii) complex numbers transform on the complex plane, which is a two dimensional plane like $\mathbb{R}^2$, while linear transformations can operate in any number of dimensions, and not necessarily even just in $\mathbb{R}^n$. An example of a consequence of this would be that an anti-symmetric matrix can encode a series of right-angle rotations with corresponding scalings in dimensions greater than three, but an imaginary number can only correspond to a rotation around an axis pointing out of the plane.<br /><br />It is fair to say that linear algebra generalises complex numbers with matrices. For some examples of correspondence between specific complex numbers, see <a href="https://thewindingnumber.blogspot.in/2016/11/introduction-to-symmetry.html">Introduction to symmetry</a>, section "symmetry on the complex plane". 
We will formalise this whole idea of matrices being "like real numbers" and "like imaginary numbers" when we do eigenvalues and eigenbases -- in fact, antisymmetric matrices have imaginary eigenvalues.<br /><br /><div class="twn-furtherinsight">The complex conjugate, similarly, is generalised to the transpose. Explain how.</div></div><br /><div style="text-align: center !important;"><a href="https://i.stack.imgur.com/nVJ5e.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/czIYvs9zj2g/0.jpg" frameborder="0" height="252" src="https://www.youtube.com/embed/czIYvs9zj2g?feature=player_embedded" width="448"></iframe></a></div><br /><div class="twn-furtherinsight">Watch the above Khan Academy video.<br />Come up with an intuitive explanation for why the row space solution is the shortest.</div>dot productdualityeigenvalueslinear algebralinear transformationsmathematicssymmetric matrixtransposeThu, 17 Aug 2017 19:17:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-1567387654570018003Abhimanyu Pallavi Sudhir2017-08-17T19:17:00ZComment by Abhimanyu Pallavi Sudhir on reference for higher spin - not gravitational nor stringy
https://mathoverflow.net/questions/195125/reference-for-higher-spin-not-gravitational-nor-stringy
On <a href="http://www.physicsoverflow.org/27048/reference-for-higher-spin-not-gravitational-nor-stringy?show=27499#a27499" rel="nofollow noreferrer">PhysicsOverflow</a>, there is a link to <a href="http://inspirehep.net/record/265411" rel="nofollow noreferrer">this paper</a> for the same question.Sun, 01 Mar 2015 02:25:25 GMThttps://mathoverflow.net/questions/195125/reference-for-higher-spin-not-gravitational-nor-stringy?cid=493513Abhimanyu Pallavi Sudhir2015-03-01T02:25:25ZComment by Abhimanyu Pallavi Sudhir on Classical and Quantum Chern-Simons Theory
https://mathoverflow.net/questions/159695/classical-and-quantum-chern-simons-theory
This has received an answer on PhysicsOverflow if you're still interested: <a href="http://www.physicsoverflow.org/22251/classical-and-quantum-chern-simons-theory#c22256" rel="nofollow noreferrer">Classical and Quantum Chern-Simons Theory</a>Thu, 14 Aug 2014 13:14:02 GMThttps://mathoverflow.net/questions/159695/classical-and-quantum-chern-simons-theory?cid=447277Abhimanyu Pallavi Sudhir2014-08-14T13:14:02ZComment by Abhimanyu Pallavi Sudhir on What is convolution intuitively?
https://mathoverflow.net/questions/5892/what-is-convolution-intuitively
<a href="http://en.wikipedia.org/wiki/File:Convolution_of_spiky_function_with_box2.gif" rel="nofollow noreferrer">Wikipedia</a>Fri, 17 Jan 2014 16:20:39 GMThttps://mathoverflow.net/questions/5892/what-is-convolution-intuitively?cid=396721Abhimanyu Pallavi Sudhir2014-01-17T16:20:39ZComment by Abhimanyu Pallavi Sudhir on Embedding of F(4) in OSp(8|4)?
https://mathoverflow.net/questions/111110/embedding-of-f4-in-osp84
Cross-posted to: <a href="http://physics.stackexchange.com/q/41155/23119">physics.stackexchange.com/q/41155/23119</a>Mon, 23 Dec 2013 04:35:50 GMThttps://mathoverflow.net/questions/111110/embedding-of-f4-in-osp84?cid=391443Abhimanyu Pallavi Sudhir2013-12-23T04:35:50ZComment by Abhimanyu Pallavi Sudhir on How to compare Unicode characters that "look alike"?
https://stackoverflow.com/questions/20674577/how-to-compare-unicode-characters-that-look-alike
I compared every single pixel of it, and it looks the same.Thu, 19 Dec 2013 09:26:53 GMThttps://stackoverflow.com/questions/20674577/how-to-compare-unicode-characters-that-look-alike?cid=30963612Abhimanyu Pallavi Sudhir2013-12-19T09:26:53ZComment by Abhimanyu Pallavi Sudhir on What is the definition of picture changing operation?
https://mathoverflow.net/questions/152295/what-is-the-definition-of-picture-changing-operation
Related: <a href="http://physics.stackexchange.com/q/12595/23119">physics.stackexchange.com/q/12595/23119</a>Thu, 19 Dec 2013 07:26:36 GMThttps://mathoverflow.net/questions/152295/what-is-the-definition-of-picture-changing-operation?cid=390438Abhimanyu Pallavi Sudhir2013-12-19T07:26:36ZComment by Abhimanyu Pallavi Sudhir on The Fuchsian monodromy problem
https://mathoverflow.net/questions/146099/the-fuchsian-monodromy-problem/148462#148462
@KetilTveiten: Ah, thanks.Tue, 10 Dec 2013 12:08:10 GMThttps://mathoverflow.net/questions/146099/the-fuchsian-monodromy-problem/148462?cid=388122#148462Abhimanyu Pallavi Sudhir2013-12-10T12:08:10ZComment by Abhimanyu Pallavi Sudhir on Understanding the intermediate field method for the $\phi^4$ interaction
https://mathoverflow.net/questions/149564/understanding-the-intermediate-field-method-for-the-phi4-interaction
@DanielSoltész: Nope, high-level questions generally get largely ignored there these days.Tue, 26 Nov 2013 14:40:20 GMThttps://mathoverflow.net/questions/149564/understanding-the-intermediate-field-method-for-the-phi4-interaction?cid=384774Abhimanyu Pallavi Sudhir2013-11-26T14:40:20ZComment by Abhimanyu Pallavi Sudhir on Intuition behind the ricci flow
https://mathoverflow.net/questions/143144/intuition-behind-the-ricci-flow/143146#143146
I was about to post the same thing, I think this is very illustrative.Tue, 19 Nov 2013 16:05:08 GMThttps://mathoverflow.net/questions/143144/intuition-behind-the-ricci-flow/143146?cid=383288#143146Abhimanyu Pallavi Sudhir2013-11-19T16:05:08ZComment by Abhimanyu Pallavi Sudhir on What is the relationship between complex time singularities and UV fixed points?
https://mathoverflow.net/questions/134939/what-is-the-relationship-between-complex-time-singularities-and-uv-fixed-points
This actually got twice the number of views here than on Physics.SE.Sun, 10 Nov 2013 14:50:44 GMThttps://mathoverflow.net/questions/134939/what-is-the-relationship-between-complex-time-singularities-and-uv-fixed-points?cid=381229Abhimanyu Pallavi Sudhir2013-11-10T14:50:44ZAnswer by Abhimanyu Pallavi Sudhir for The Fuchsian monodromy problem
https://mathoverflow.net/questions/146099/the-fuchsian-monodromy-problem/148462#148462
1<p>Equation 6.2 is just the Liouville action, the action principle for the <em>Liouville field</em>, which is well-known from the familiar conformal gauge. </p>
<p>$$S_L=\frac{c}{96\pi}\int_\mathcal{M}\left(\dot\varphi^2-\frac{16\varphi}{\left(1-\lvert t\rvert^2\right)^2}\right)\mathrm{d}^2t$$ </p>
<p>... along with some trivial facts about partition functions. </p>
<p>You could of course think of it as the $Z_\mathcal{M}$'s (partition functions) of the metrics being related by the $S_L$'s in the same way that the metrics are related by the Liouville field. </p>
Sun, 10 Nov 2013 06:53:28 GMThttps://mathoverflow.net/questions/146099/-/148462#148462Abhimanyu Pallavi Sudhir2013-11-10T06:53:28ZComment by Abhimanyu Pallavi Sudhir on Modular Arithmetic in LaTeX
https://mathoverflow.net/questions/18813/modular-arithmetic-in-latex
Haha, I thought this question was about typesetting a paper in $\LaTeX$Fri, 08 Nov 2013 11:34:52 GMThttps://mathoverflow.net/questions/18813/modular-arithmetic-in-latex?cid=379817Abhimanyu Pallavi Sudhir2013-11-08T11:34:52ZAnswer by Abhimanyu Pallavi Sudhir for String theory "computation" for math undergrad audience
https://mathoverflow.net/questions/47770/string-theory-computation-for-math-undergrad-audience/147307#147307
2<p>Derive the Casimir Energy in Bosonic String Theory. </p>
<p>You start with the $\hat L_0$ operator and strip off the non-vacuum part $\displaystyle\frac{\alpha_0^2}{2}+\sum_{n=1}^\infty\alpha_{-n}\cdot\alpha_n$, then you use a Ramanujan sum to do $\zeta$-function renormalisation, from which you find that the vacuum energy, denoted $\varepsilon_0$, is </p>
<p>$$\varepsilon_0=-\frac{d-2}{24}$$ </p>
<p>However, the most interesting part comes when you go around <a href="https://mathoverflow.net/a/140354/36148">deriving</a> the critical dimension of Bosonic String Theory. </p>
<p>After which, the expression surprisingly simplifies to $-1$. </p>
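The regularisation step can be made concrete numerically: smoothing the divergent sum $1+2+3+\ldots$ with an exponential regulator leaves a finite part of exactly $\zeta(-1) = -1/12$, which is what produces $\varepsilon_0 = -\frac{d-2}{24}$. A minimal sketch, assuming this particular cutoff scheme (others give the same finite part):

```python
import math

eps = 1e-2  # regulator; the divergence reappears as 1/eps^2
s = sum(n * math.exp(-eps * n) for n in range(1, 10000))

# subtract the divergent piece; the remainder is the "renormalised" 1+2+3+...
finite_part = s - 1 / eps ** 2
assert abs(finite_part - (-1 / 12)) < 1e-3  # zeta(-1) = -1/12

# vacuum energy: eps_0 = (d-2)/2 * zeta(-1) = -(d-2)/24, which is -1 at d = 26
d = 26
eps0 = (d - 2) / 2 * (-1 / 12)
assert abs(eps0 - (-1)) < 1e-12
```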
<p>For a more detailed derivation of the above stuff, see <a href="http://arxiv.org/pdf/hep-th/0207142v1.pdf" rel="nofollow noreferrer">these</a> lecture notes (Section 4, Equations 4.5-4.10). </p>Fri, 08 Nov 2013 04:33:41 GMThttps://mathoverflow.net/questions/47770/-/147307#147307Abhimanyu Pallavi Sudhir2013-11-08T04:33:41ZComment by Abhimanyu Pallavi Sudhir on Book on mathematical "rigorous" String Theory?
https://mathoverflow.net/questions/71909/book-on-mathematical-rigorous-string-theory/71998#71998
I don't think that BBS falls into the category of "mathematically rigorous". It's a very good, intuitive book.Fri, 08 Nov 2013 04:17:49 GMThttps://mathoverflow.net/questions/71909/book-on-mathematical-rigorous-string-theory/71998?cid=379753#71998Abhimanyu Pallavi Sudhir2013-11-08T04:17:49ZComment by Abhimanyu Pallavi Sudhir on About the massless supermultiplets in $2+1$ dimensional supersymmetry
https://mathoverflow.net/questions/103392/about-the-massless-supermultiplets-in-21-dimensional-supersymmetry
@S.Carnahan: The OP has voluntarily deleted it, which is weird... I have flagged this as unclear what you're asking.Wed, 06 Nov 2013 16:49:00 GMThttps://mathoverflow.net/questions/103392/about-the-massless-supermultiplets-in-21-dimensional-supersymmetry?cid=379331Abhimanyu Pallavi Sudhir2013-11-06T16:49:00ZAnswer by Abhimanyu Pallavi Sudhir for Does $SO(32) \sim_T E_8 \times E_8$ relate to some group theoretical fact?
https://mathoverflow.net/questions/57529/does-so32-sim-t-e-8-times-e-8-relate-to-some-group-theoretical-fact/147129#147129
5<p>The answer to this question can be found in Lubos Motl's answer to <a href="https://physics.stackexchange.com/q/65092/23119">this question of mine on Physics.SE</a>. </p>
<p>The key here lies in the weight lattices of the bosonic representations of these gauge groups.</p>
<p>As I understand it, the weight lattice of $E(8)$ is $\Gamma^8$, whereas the weight lattice of $\frac{\operatorname{Spin}\left(32\right)}{\mathbb{Z}_2}$ is $\Gamma^{16}$. The first fact means that the weight lattice of $E(8)\times E(8)$ is $\Gamma^{8}\oplus\Gamma^8$. </p>
<p>Now, there is an identity: $\Gamma^{8}\oplus\Gamma^8\oplus\Gamma^{1,1}=\Gamma^{16}\oplus\Gamma^{1,1}$, and it is <em>this very identity</em> which allows the T-duality mentioned in the original post. </p>
<p>So, the answer to your question is "<strong>Yes</strong>", there <em>is</em> a group-theoretical fact, and that is that $ \Gamma^{8}\oplus\Gamma^8\oplus\Gamma^{1,1}= \Gamma^{16}\oplus\Gamma^{1,1} $. </p>Wed, 06 Nov 2013 16:46:03 GMThttps://mathoverflow.net/questions/57529/does-so32-sim-t-e-8-times-e-8-relate-to-some-group-theoretical-fact/147129#147129Abhimanyu Pallavi Sudhir2013-11-06T16:46:03ZComment by Abhimanyu Pallavi Sudhir on Count of binary matrices that avoids a certain sub-matrix
https://mathoverflow.net/questions/30362/count-of-binary-matrices-that-avoids-a-certain-sub-matrix/30371#30371
@quid: Ok, I forgot about that.Tue, 29 Oct 2013 12:03:27 GMThttps://mathoverflow.net/questions/30362/count-of-binary-matrices-that-avoids-a-certain-sub-matrix/30371?cid=376986#30371Abhimanyu Pallavi Sudhir2013-10-29T12:03:27ZComment by Abhimanyu Pallavi Sudhir on Can the equation of motion with friction be written as Euler-Lagrange equation, and does it have a quantum version?
https://mathoverflow.net/questions/146042/can-the-equation-of-motion-with-friction-be-written-as-euler-lagrange-equation
Uh, how is this <i>Research-level</i>?Mon, 28 Oct 2013 11:14:41 GMThttps://mathoverflow.net/questions/146042/can-the-equation-of-motion-with-friction-be-written-as-euler-lagrange-equation?cid=376669Abhimanyu Pallavi Sudhir2013-10-28T11:14:41ZAnswer by Abhimanyu Pallavi Sudhir for Strange factor multiplying the fermionic part in the NS mass-squared operator?
https://math.stackexchange.com/questions/541351/strange-factor-multiplying-the-fermionic-part-in-the-ns-mass-squared-operator/542655#542655
1<p>If one solves the field equations for the bosonic field with the Neumann/Dirichlet/closed-string boundary conditions, one can see that the mode expansion is something like: </p>
<p>$$X^\mu=...+i\sqrt{2\alpha'} \sum_{n\neq0} \frac{\alpha_n^\mu}{n}\exp\left(in\sigma^0\right)\cos\left(n\sigma^1\right)$$ </p>
<p>On the other hand, the fermionic field mode expansion goes like: </p>
<p>$$\psi^\mu_\pm = \frac1{\sqrt2}\sum_{r\in\mathbb Z \ \mathrm{or} \ \mathbb{Z}+\frac12 }b_r^\mu \exp\left(-ir\sigma^\pm\right) $$ </p>
<p>Notice that there is a missing factor of $\frac1r$ in the second equation. </p>
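This mode-number bookkeeping is captured by the number operator's commutator with a creation mode, which can be checked for a single fermionic mode with explicit $2\times2$ matrices -- a toy sketch (the construction is mine, not from the original answer):

```python
def matmul(A, B):
    # 2x2 matrix product
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

r = 1.5  # a half-integer NS-sector mode number, chosen for illustration
b  = [[0, 1], [0, 0]]  # annihilates the mode
bd = [[0, 0], [1, 0]]  # creates the mode (plays the role of b_{-r})
N  = [[r * x for x in row] for row in matmul(bd, b)]  # level operator r * b'b

NB = matmul(N, bd)
BN = matmul(bd, N)
comm = [[NB[i][j] - BN[i][j] for j in range(2)] for i in range(2)]

# [N, b_{-r}] = r b_{-r}: creating the mode raises the level by r, not by 1
for i in range(2):
    for j in range(2):
        assert comm[i][j] == r * bd[i][j]
```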
<p>$[N,b_{-r}^\mu]=rb_{-r}^\mu$ and the conclusion follows. </p>Mon, 28 Oct 2013 10:30:48 GMThttps://math.stackexchange.com/questions/541351/-/542655#542655Abhimanyu Pallavi Sudhir2013-10-28T10:30:48ZAnswer by Abhimanyu Pallavi Sudhir for A closed form for $\sum_{k = 1}^{\infty} k^{-k}$?
https://math.stackexchange.com/questions/518960/a-closed-form-for-sum-k-1-infty-k-k/518979#518979
0<p>There isn't a known closed form in terms of elementary functions. I've seen it being called the "Sophomore's dream function" $\mathrm{Sphd}(x) = \sum_{k = 1}^{\infty} k^{-k-x}$ at least once, due to its significance in the <a href="http://enwp.org/Sophomore's_dream" rel="nofollow noreferrer">Sophomore's Dream</a>.</p>Tue, 08 Oct 2013 14:07:44 GMThttps://math.stackexchange.com/questions/518960/a-closed-form-for-sum-k-1-infty-k-k/518979#518979Abhimanyu Pallavi Sudhir2013-10-08T14:07:44ZAnswer by Abhimanyu Pallavi Sudhir for Surprising identities / equations
https://math.stackexchange.com/questions/505367/surprising-identities-equations/507802#507802
50<p><a href="http://enwp.org/Sophomore%27s_dream">$$\sum_{k=1}^{\infty}k^{-k}=\int_0^1x^{-x}\mbox{ d}x=\mathrm{Sophomore's}\mbox{ } \mathrm{dream}$$</a> </p>Sat, 28 Sep 2013 12:36:00 GMThttps://math.stackexchange.com/questions/505367/-/507802#507802Abhimanyu Pallavi Sudhir2013-09-28T12:36:00ZAnswer by Abhimanyu Pallavi Sudhir for Why does bosonic string theory require 26 spacetime dimensions?
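The two sides of the Sophomore's dream identity agree numerically to many digits; a quick check in plain Python, using a midpoint rule for the integral:

```python
N = 100_000

# left side: the series converges extremely fast
series = sum(k ** -k for k in range(1, 30))

# right side: midpoint rule for the integral of x^(-x) on (0, 1)
integral = sum(x ** -x for x in ((i + 0.5) / N for i in range(N))) / N

assert abs(series - 1.2912859971) < 1e-9  # the Sophomore's dream constant
assert abs(integral - series) < 1e-6      # the two sides agree
```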
https://mathoverflow.net/questions/99643/why-does-bosonic-string-theory-require-26-spacetime-dimensions/140354#140354
5
<p><em>Note that here, the $\hat L_n$ are operators on the state given by sums of dot products of the mode operators, e.g. $\hat L_0=\frac{1}{2}\sum_{n=-\infty}^\infty\hat\alpha_{-n}\cdot\hat\alpha_n$.</em> </p>
<p>Also note that the Virasoro algebra is the central extension of the Witt (conformal) algebra; that explains why a $D$ appears below -- it plays the role of the central charge. </p>
<p>I'll expand on Chris Gerig's answer. </p>
<p>Not only do we need $D=26$, we also need the normal ordering constant $a=1$. The normal ordering constant $a$ is the eigenvalue of $\hat L_0$ on physical states. </p>
<p>We want to promote the time-like states to spurious, zero-norm states, right? So, we impose the (level 1) spurious state conditions on the state as follows (the $|\chi\rangle$ are the states that the spurious state $|\Phi\rangle$ is built on; at level 1, $|\Phi\rangle = \hat L_{-1}|\chi_1\rangle$): </p>
<p>$$ \begin{gathered}
0 = {{\hat L}_1}\left| \Phi \right\rangle \\
{\text{ }} = {{\hat L}_1}{{\hat L}_{ - 1}}\left| {{\chi _1}} \right\rangle \\
{\text{ }} = \left[ {{{\hat L}_1},{{\hat L}_{ - 1}}} \right]\left| {{\chi _1}} \right\rangle + {{\hat L}_{ - 1}}{{\hat L}_1}\left| {{\chi _1}} \right\rangle \\
{\text{ }} = \left[ {{{\hat L}_1},{{\hat L}_{ - 1}}} \right]\left| {{\chi _1}} \right\rangle \\
{\text{ }} = 2{{\hat L}_0}\left| {{\chi _1}} \right\rangle \\
{\text{ }} = 2\left( {a - 1} \right)\left| {{\chi _1}} \right\rangle \\
\end{gathered} $$</p>
<p>That means that $a=1$. </p>
<p>Now, a level 2 spurious state can be written as: </p>
<p>$$ {\left| \Phi \right\rangle = \left( {{{\hat L}_{ - 2}} + k{{\hat L}_{ - 1}}{{\hat L}_{ - 1}}} \right)\left| {{\chi _2}} \right\rangle }$$ </p>
<p>Imposing the $\hat L_1$ spurious state condition fixes $k$ (we use $[\hat L_1, \hat L_{-2}] = 3\hat L_{-1}$, $\hat L_0\hat L_{-1} = \hat L_{-1}(\hat L_0 + 1)$, and $\hat L_0|\chi_2\rangle = -|\chi_2\rangle$, which follows from $a = 1$): </p>
<p>$$\begin{gathered}
\left[ {{{\hat L}_1},{{\hat L}_{ - 2}} + k{{\hat L}_{ - 1}}{{\hat L}_{ - 1}}} \right] = 3{{\hat L}_{ - 1}} + 2k{{\hat L}_0}{{\hat L}_{ - 1}} + 2k{{\hat L}_{ - 1}}{{\hat L}_0} = \left( {3 - 2k} \right){{\hat L}_{ - 1}} + 4k{{\hat L}_0}{{\hat L}_{ - 1}} \\
0 = {{\hat L}_1}\left| \Phi \right\rangle = \left( {\left( {3 - 2k} \right){{\hat L}_{ - 1}} + 4k{{\hat L}_0}{{\hat L}_{ - 1}}} \right)\left| {{\chi _2}} \right\rangle \\
{\text{ }} = \left( {\left( {3 - 2k} \right){{\hat L}_{ - 1}} + 4k{{\hat L}_{ - 1}}\left( {{{\hat L}_0} + 1} \right)} \right)\left| {{\chi _2}} \right\rangle \\
{\text{ }} = \left( {3 - 2k} \right){{\hat L}_{ - 1}}\left| {{\chi _2}} \right\rangle \\
k = \frac{3}{2} \\
\end{gathered} $$ </p>
<p>So, then, </p>
<p>$$ \begin{gathered}
{{\hat L}_2}\left| \Phi \right\rangle = 0 \\
{{\hat L}_2}\left( {{{\hat L}_{ - 2}} + \frac{3}{2}{{\hat L}_{ - 1}}{{\hat L}_{ - 1}}} \right)\left| {{\chi _2}} \right\rangle = 0 \\
\left[ {{{\hat L}_2},{{\hat L}_{ - 2}} + \frac{3}{2}{{\hat L}_{ - 1}}{{\hat L}_{ - 1}}} \right]\left| {{\chi _2}} \right\rangle + \left( {{{\hat L}_{ - 2}} + \frac{3}{2}{{\hat L}_{ - 1}}{{\hat L}_{ - 1}}} \right){{\hat L}_2}\left| {{\chi _2}} \right\rangle = 0 \\
\left[ {{{\hat L}_2},{{\hat L}_{ - 2}} + \frac{3}{2}{{\hat L}_{ - 1}}{{\hat L}_{ - 1}}} \right]\left| {{\chi _2}} \right\rangle = 0 \\
\left( {13{{\hat L}_0} + 9{{\hat L}_{ - 1}}{{\hat L}_{ + 1}} + \frac{D}{2}} \right)\left| {{\chi _2}} \right\rangle = 0 \\
\text{Since $\hat L_0|\chi_2\rangle = -|\chi_2\rangle$ and $\hat L_{+1}|\chi_2\rangle=0$, we have} \\
\frac{D}{2} = 13 \\
D = 26 \\
\end{gathered} $$ </p>
<p>And then, finally, Q.E.D. </p>
<p>So, this was done essentially to remove the negative-norm ghost states, using the canonical / Gupta-Bleuler formalism. </p>
<p>It's also possible to use, say, light-cone gauge (LCG) quantisation. However, in other quantisation methods, the conformal anomaly is manifest in other forms; e.g., in LCG quantisation, it is manifest as a failure of Lorentz symmetry. A good overview of this method can be found in <strong>Kaku</strong>, <em>Strings, Conformal Fields, and M-theory</em> (it's the only part of the book that I liked, actually; the rest of the book is too rigorous, without much physical intuition). </p>Sun, 25 Aug 2013 09:40:17 GMThttps://mathoverflow.net/questions/99643/why-does-bosonic-string-theory-require-26-spacetime-dimensions/140354#140354Abhimanyu Pallavi Sudhir2013-08-25T09:40:17ZAnswer by Abhimanyu Pallavi Sudhir for Why is there a deep mysterious relation between string theory and number theory, elliptic curves, $E_8$ and the Monster group?
https://physics.stackexchange.com/questions/4748/why-is-there-a-deep-mysterious-relation-between-string-theory-and-number-theory/71301#71301
7<p>I'll answer the relation between string theory and $E(8)$ -- a common appearance of $E(8)$ in string theory is in the gauge group of <a href="http://en.wikipedia.org/wiki/Type_HE_theory" rel="nofollow noreferrer">Type HE string theory</a> $E(8)\times E(8)$ (see <a href="https://physics.stackexchange.com/questions/68242/why-do-the-mismatched-16-dimensions-have-to-be-compactified-on-an-even-lattice">here</a> for an explanation why). But it's interesting physically because the standard model gauge group embeds into it:</p>
<p>$$SU(3)\times SU(2)\times U(1)\subset SU(5)\subset SO(10)\subset E(6)\subset E(7)\subset E(8)$$ </p>
<p>Indeed, the ones in between are GUT subgroups, and $E(8)$ happens to be the "largest" of the exceptional Lie groups.</p>
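A quick sanity check on this chain is that the dimensions of the groups strictly increase along it, using the standard formulas $\dim SU(n) = n^2 - 1$ and $\dim SO(n) = \frac{n(n-1)}{2}$ (the exceptional dimensions 78, 133, 248 are fixed numbers):

```python
def dim_su(n):
    return n * n - 1         # dim SU(n)

def dim_so(n):
    return n * (n - 1) // 2  # dim SO(n)

# the chain SU(3)xSU(2)xU(1) < SU(5) < SO(10) < E6 < E7 < E8
dims = [
    dim_su(3) + dim_su(2) + 1,  # standard model gauge group (U(1) has dim 1)
    dim_su(5),                  # 24
    dim_so(10),                 # 45
    78,                         # E6
    133,                        # E7
    248,                        # E8
]

assert dims == [12, 24, 45, 78, 133, 248]
assert all(a < b for a, b in zip(dims, dims[1:]))  # each can contain the previous
```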
<p><a href="http://en.wikipedia.org/wiki/Monstrous_moonshine#Borcherds.27_proof" rel="nofollow noreferrer">Wikipedia</a> has some things to say about the connections to monstrous moonshine; I'm not familiar with it. See <a href="https://physics.stackexchange.com/questions/5207/number-of-dimensions-in-string-theory-and-possible-link-with-number-theory?lq=1#comment13658_5207">[1]</a>, <a href="https://physics.stackexchange.com/questions/5207/number-of-dimensions-in-string-theory-and-possible-link-with-number-theory?lq=1#comment13659_5207">[2]</a> re: the connections to number theory. Another example is how "1+2+3+4=10" demonstrates a 10-dimensional theory's ability to explain the four fundamental forces -- EM is the curvature of the $U(1)$ bundle, the weak force is the curvature of the $SU(2)$ bundle, the strong is the curvature of the $SU(3)$ bundle and gravity is the curvature of spacetime.</p>
<p>[Archiving Ron Maimon's comment here in case it gets deleted --]</p>
<blockquote>
<p>There is another point, that E(8) <s>is</s> has embedded E6xSU(3), and on a Calabi Yau, the SU(3) is the holonomy, so you can easily and naturally break the E8 to E6. This idea appears in Candelas Horowitz Strominger Witten in 1985, right after Heterotic strings and it is still the easiest way to get the MSSM. The biggest obstacle is to get rid of the MS part--- you need a SUSY breaking at high energy that won't wreck the CC or produce a runaway Higgs mass, since it seems right now there is no low-energy SUSY. </p>
</blockquote>Tue, 16 Jul 2013 18:46:52 GMThttps://physics.stackexchange.com/questions/4748/-/71301#71301Abhimanyu Pallavi Sudhir2013-07-16T18:46:52ZAnswer by Abhimanyu Pallavi Sudhir for Revolution of Earth
https://physics.stackexchange.com/questions/70150/revolution-of-earth/70153#70153
1<p>No, all motion is not relative in this classical sense. You can determine whether you are accelerating by noting fictitious forces -- in this case centrifugal forces, which would not arise if the Earth were inertial and the Sun orbiting it. The classic way to do this is a Foucault pendulum.</p>Sat, 06 Jul 2013 06:50:52 GMThttps://physics.stackexchange.com/questions/70150/-/70153#70153Abhimanyu Pallavi Sudhir2013-07-06T06:50:52ZAnswer by Abhimanyu Pallavi Sudhir for Do electrons have a radius when they behave like a particle?
https://physics.stackexchange.com/questions/70067/do-electrons-have-a-radius-when-they-behave-like-a-particle/70077#70077
2<p>No, particles have zero spatial extent in standard quantum mechanics. In fact, they are the limiting cases of waves, which do have a spatial extent. While sometimes we assign a "classical radius" to particles, these are for specific practical purposes relating to a specific physical system.</p>Fri, 05 Jul 2013 08:04:48 GMThttps://physics.stackexchange.com/questions/70067/-/70077#70077Abhimanyu Pallavi Sudhir2013-07-05T08:04:48ZAnswer by Abhimanyu Pallavi Sudhir for The General Relativity from String Theory Point of View
https://physics.stackexchange.com/questions/70060/the-general-relativity-from-string-theory-point-of-view/70061#70061
4<blockquote>
<p><strong>UPDATE: I have written a more complete answer here: <a href="https://physics.stackexchange.com/questions/44782/how-do-the-einsteins-equations-come-out-of-string-theory/71370#71370">How do Einstein's field equations come out of string theory?</a></strong></p>
</blockquote>
<p>The effective gravitational terms of the spacetime action, which can be derived from the Polyakov action (gravitons are bosons), are --</p>
<p>$$S_{G}=\lambda\int\left(R+\ell_s^2R_{\mu\nu\rho\sigma}R^{\mu\nu\rho\sigma}\right)\mbox{ d}^D x$$</p>
<p>where we neglect terms of order $\ell_s^4$ and higher, $\ell_s$ being the string length. To lowest order --</p>
<p>$$S_{EH}=\lambda\int R\mbox{ d}^D x$$</p>
<p>which is the $D$-dimensional Einstein-Hilbert action. </p>
<hr>
<p>The vacuum EFE may also be derived directly from setting the beta functional, which measures the breaking of conformal invariance, to zero:
$$\beta^G_{\mu\nu} = \ell_s^2 R_{\mu\nu}+\frac{\ell_s^4}{2}R_{\mu\lambda\rho\sigma}{R_\nu}^{\lambda\rho\sigma}+\ldots = 0$$</p>
<p>which, to lowest order in $\ell_s$, gives --</p>
<p>$$R_{\mu\nu}=0$$</p>Fri, 05 Jul 2013 05:17:09 GMThttps://physics.stackexchange.com/questions/70060/-/70061#70061Abhimanyu Pallavi Sudhir2013-07-05T05:17:09ZAnswer by Abhimanyu Pallavi Sudhir for Do the laws of physics evolve?
https://physics.stackexchange.com/questions/10078/do-the-laws-of-physics-evolve/69964#69964
3<p>If the laws of physics "evolved", then the law governing this evolution would be your new law of physics, provided it is positivistically meaningful (i.e. it isn't last-Thursdayism) and we have <a href="https://physics.stackexchange.com/questions/413886/why-do-scientists-think-that-all-the-laws-of-physics-that-apply-in-our-galaxy-ap/414249#414249">enough evidence to say it is probable</a>. </p>
<p>Note regarding your claim about biology and geology -- the laws of biology and geology do not evolve, much like the laws of physics (including those of biology and geology) don't evolve. Biological and geological <em>structures</em> evolve, much like physical structures (including biological and geological ones) evolve. I don't know how you conflate the two.</p>
<p>There are some hypotheses which claim an evolving set of values for certain physical constants -- (they're probably wrong, but fun to think about)</p>
<p><strong>Dirac's large numbers hypothesis</strong></p>
<p>Some numerological coincidences like $\frac{r_H}{r_e} \approx 10^{42} \approx \frac {R_U}{r_e}$, $r_e = \frac {e^2}{4 \pi \epsilon_0 m_e c^2}$, $r_H = \frac {e^2}{4 \pi \epsilon_0 m_H c^2}$, $m_H c^2 = \frac {Gm_e^2}{r_e}$ are used to claim that the values of constants change over time, since some of these constants (like $R_U$, the radius of the universe, and anything with a subscript $H$, referring to a hypothetical particle whose radius is that of the universe) clearly vary. Dirac also hypothesised that these coincidences could be explained with a varying gravitational constant, $G = \left(\frac{c^3}{M_U}\right)t$ (which is odd, because you expect a symmetry between space and time).</p>
<p><strong>Brans-Dicke theory</strong></p>
<p>This modifies GR by replacing $1/G$ with a scalar field $\phi$ determined by the field equation $\Box\phi = \frac{8\pi}{3+2\omega}T$ for some coupling constant $\omega$.</p>Thu, 04 Jul 2013 05:23:38 GMThttps://physics.stackexchange.com/questions/10078/-/69964#69964Abhimanyu Pallavi Sudhir2013-07-04T05:23:38ZAnswer by Abhimanyu Pallavi Sudhir for Why does $\oint\mathbf{E}\cdot{d}\boldsymbol\ell=0$ imply $\nabla\times\mathbf{E}=\mathbf{0}$?
https://math.stackexchange.com/questions/434291/why-does-oint-mathbfe-cdotd-boldsymbol-ell-0-imply-nabla-times-mathbfe/434296#434296
6<p>$$\oint_C\mathbf{E}\cdot{d}\boldsymbol\ell=0$$
By Stokes' theorem, applied to any surface $S$ with boundary $C$,
$$\iint_S\left(\nabla\times\mathbf E\right)\cdot \hat{\mathbf{n}}\mbox{ d}S=0$$
For this to hold for every such surface,
$$\left(\nabla\times\mathbf{E}\right)\cdot \hat{\mathbf{n}}=0$$
Since this must hold for the normal vector of any surface, the curl must be $\mathbf{0}$ (consider shifting your coordinate system so that $\hat{\mathbf{n}}$ points in the $z$-direction if you want this to be clearer), and thus,
$$\nabla\times\mathbf{E}=0$$
as required. Q.E.D.</p>Tue, 02 Jul 2013 04:25:38 GMThttps://math.stackexchange.com/questions/434291/why-does-oint-mathbfe-cdotd-boldsymbol-ell-0-imply-nabla-times-mathbfe/434296#434296Abhimanyu Pallavi Sudhir2013-07-02T04:25:38ZAnswer by Abhimanyu Pallavi Sudhir for Is there a way to fill Tank 2 from Tank 1 through Gravity alone?
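<p>(A numerical sanity check of the answer above -- my own illustration, not part of the original. The test potential $\Phi = x^2 + yz$ and the loop are arbitrary choices; any conservative field and closed loop would do.)</p>

```python
import math

def circulation(n=20000):
    """Line integral of E . dl around the unit circle at z = 1 for the
    conservative field E = -grad(Phi), Phi = x^2 + y*z, i.e. E = (-2x, -z, -y)."""
    total = 0.0
    for k in range(n):
        t0 = 2.0 * math.pi * k / n
        t1 = 2.0 * math.pi * (k + 1) / n
        p0 = (math.cos(t0), math.sin(t0), 1.0)
        p1 = (math.cos(t1), math.sin(t1), 1.0)
        mx, my, mz = ((a + b) / 2.0 for a, b in zip(p0, p1))
        ex, ey, ez = -2.0 * mx, -mz, -my   # E evaluated at the midpoint
        total += ex * (p1[0] - p0[0]) + ey * (p1[1] - p0[1]) + ez * (p1[2] - p0[2])
    return total

print(abs(circulation()))   # effectively zero (roundoff only)
```

<p>Because $\mathbf{E}$ is linear here, the midpoint rule is exact on each straight segment, so the sum telescopes to $\Phi(\text{end})-\Phi(\text{start}) = 0$ up to roundoff.</p>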
https://physics.stackexchange.com/questions/69622/is-there-a-way-to-fill-tank-2-from-tank-1-through-gravity-alone/69627#69627
2<ol>
<li><p>The main variables you're missing are those relating to energy losses to friction, such as the total surface area of the path. Why not use a straight pipe rather than a winding one?</p></li>
<li><p>Of course the "head" is important -- it is the potential difference driving the motion. If it were zero between Tank 1 and any point in the path, there would be no flow.</p></li>
<li><p>Larger pipes are better.</p></li>
</ol>Sun, 30 Jun 2013 14:20:00 GMThttps://physics.stackexchange.com/questions/69622/is-there-a-way-to-fill-tank-2-from-tank-1-through-gravity-alone/69627#69627Abhimanyu Pallavi Sudhir2013-06-30T14:20:00ZAnswer by Abhimanyu Pallavi Sudhir for Did Hilbert publish general relativity field equation before Einstein?
https://physics.stackexchange.com/questions/56892/did-hilbert-publish-general-relativity-field-equation-before-einstein/69603#69603
3<p>There is no giant controversy about who discovered what first, and we don't know anything about who wrote what -- maybe Einstein broke his arm and forced someone else to write for him, I don't know. What we <em>can</em> talk about is their discovery of the relevant laws or the insights leading to the relevant laws.</p>
<p>The difference between Einstein and Hilbert is in their choice of axiom. This is akin to Einstein vs. Minkowski on the formulation of special relativity, for instance -- there is no doubt that Einstein first wrote his field equation, you can see this in "On the foundation of the General theory of relativity", but Hilbert first discovered the Einstein-Hilbert action, which is the more elegant formulation of the theory, because it directly states the link between curvature and gravity in a simple way, in terms of the latter's Lagrangian.</p>Sun, 30 Jun 2013 11:07:03 GMThttps://physics.stackexchange.com/questions/56892/-/69603#69603Abhimanyu Pallavi Sudhir2013-06-30T11:07:03ZAnswer by Abhimanyu Pallavi Sudhir for Curiosity episode with Stephen Hawking. The Big-Bang
https://physics.stackexchange.com/questions/47967/curiosity-episode-with-stephen-hawking-the-big-bang/69589#69589
1<p>It's not standard gravitational potential energy that cancels out the positive energy of matter; this can be confirmed fairly easily by e.g. considering a two-mass system in freefall. </p>
<p>It's rather just a vacuous (and useless, since it defeats the point of defining energy) definition of $-G_{\mu\nu}/8\pi$ as a sort of "negative gravitational energy" which cancels out $G_{\mu\nu}$. This is pointless, and is not the right answer to the question "How did the big bang create stuff?".</p>
<p>See also <a href="https://physics.stackexchange.com/a/2844">https://physics.stackexchange.com/a/2844</a>.</p>Sun, 30 Jun 2013 07:44:11 GMThttps://physics.stackexchange.com/questions/47967/-/69589#69589Abhimanyu Pallavi Sudhir2013-06-30T07:44:11ZAnswer by Abhimanyu Pallavi Sudhir for Is Newton's universal gravitational constant the inverse of permittivity of mass in vacuum?
https://physics.stackexchange.com/questions/69541/is-newtons-universal-gravitational-constant-the-inverse-of-permittivity-of-mass/69544#69544
9<p>Yes, and this formulation is the accepted one in <a href="http://en.wikipedia.org/wiki/Gravitomagnetism" rel="nofollow noreferrer">gravitoelectromagnetism</a>, an approximation to general relativity. Just to give you a quick introduction to the theory, gravitoelectromagnetism separates the "gravity due to mass" ($T^{00}$, accounting for Newtonian gravity) and the "gravity due to momentum" ($T^{0i}=T^{i0}$) into two separate forces, "gravitoelectricity" and "gravitomagnetism", and unifies the two much like Maxwell's electromagnetism, so that Newton's gravity only accounts for the "gravitoelectricity".</p>
<p>The results of the analogy are pretty splendid -- in Newtonian gravity, you have Poisson's equation for the "gravitoelectric field", $\nabla\cdot\vec G_E=4\pi G\rho_m$. Compare this to Gauss's law for electricity, $\nabla\cdot\vec E=\frac{\rho_q}{\epsilon_q}$ -- it is thus natural to set:</p>
<p>$$\epsilon_m=\frac{1}{4\pi G}$$</p>
<p>So Poisson's law for Newton's gravity becomes Gauss's law for gravitoelectricity:</p>
<p>$$\nabla\cdot\vec G_E=\frac{\rho_m}{\epsilon_m}$$</p>
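<p>To get a feel for the numbers, this "mass permittivity" can be evaluated directly (a quick sketch with approximate SI values; the variable names are my own):</p>

```python
import math

G = 6.674e-11   # Newton's constant, m^3 kg^-1 s^-2 (approximate)
c = 2.998e8     # speed of light, m/s (approximate)

eps_m = 1.0 / (4.0 * math.pi * G)   # "mass permittivity" epsilon_m
mu_m = 4.0 * math.pi * G / c**2     # the analogous "mass permeability"

print(eps_m)                 # ~1.2e9 in SI units
print(eps_m * mu_m * c**2)   # = 1, mirroring eps_0 * mu_0 = 1/c^2
```

<p>The product $\epsilon_m \mu_m = 1/c^2$ mirrors $\epsilon_0 \mu_0 = 1/c^2$ in electromagnetism, which is what makes the analogy below work.</p>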
<p>And there's the Ampere-Maxwell equation --
$$\nabla \times {\vec B_q} = \mu_q {\vec{J_q}} + \frac{1}{{{c^2}}}\frac{{\partial {\vec{E_q}}}}{{\partial t}}$$</p>
<p>In gravitoelectromagnetism,</p>
<p>$$\nabla \times {{\vec G_B}} = { - \mu_m {\vec{J}_m} + \frac{1}{{{c^2}}}\frac{{\partial {{\vec{G}}_{E}}}}{{\partial t}}}$$</p>
<p>(Sometimes people add a factor of 4 multiplying the right-hand side of the gravitoelectromagnetic Ampere-Maxwell equation -- an alternative, however, is to use the form given above and define the gravitoelectromagnetic Lorentz force as $\vec F_g = m(\vec G_E+4\vec v\times\vec G_B)$.)</p>
https://physics.stackexchange.com/questions/69201/is-it-possible-to-accelerate-a-mass-indefinitely-using-gravitational-field/69204#69204
2<p><strong>(The below answer from five years ago is <em>wrong</em>, I will write a corrected answer soon.)</strong> </p>
<p>No. It is not possible. The misconception you have is because you are thinking of the gravitational field as being constant. However, this is not true. Ignoring air resistance for the time being, consider a projectile launched off the earth. Now, if it is launched at the escape velocity from the earth, then it will take an infinite amount of time to reach 0 speed, and by then it will be infinitely far away.</p>
<p>Now, consider the opposite situation. If the projectile is dropped from infinitely far away, it will take an infinite time to fall, but when it falls, its velocity would be the escape velocity, i.e. it would be exactly
$$v=\sqrt{\frac{2GM}{r}}$$
Now, we know that this escape velocity cannot be greater than the speed of light. It is the speed of light only when $$r=r_s=\frac{2GM}{c_0^2}$$
Now, consider the situation where the object has some initial velocity towards the planet (not due to its gravity) when it is infinitely far away. Naive Newtonian mechanics would tell you that the final velocity, $v=v_0+\sqrt{\frac{2GM}{r}}$, could be greater than $c_0$. However, special relativity is required at such velocities. Relativistic velocity addition tells you that the actual final velocity would be:
$$v=\frac{v_0+\sqrt{\frac{2GM}{r}}}{1+\frac{v_0}{c_0^2}\sqrt{\frac{2GM}{r}}}\leq c_0$$
As required.</p>Wed, 26 Jun 2013 05:27:50 GMThttps://physics.stackexchange.com/questions/69201/-/69204#69204Abhimanyu Pallavi Sudhir2013-06-26T05:27:50ZAnswer by Abhimanyu Pallavi Sudhir for Does the curvature of spacetime theory assume gravity?
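<p>(For reference -- and with the caveat above that the whole answer is disclaimed -- the standard relativistic formula for adding collinear speeds can be sketched numerically, in units where $c_0 = 1$; this is my own illustration:)</p>

```python
def add_velocities(u, v):
    """Relativistic addition of collinear speeds, in units where c = 1."""
    return (u + v) / (1.0 + u * v)

# However close the inputs get to 1 (= c), the result never exceeds 1:
for u in (0.5, 0.9, 0.999):
    assert add_velocities(u, 0.999) < 1.0

print(add_velocities(0.5, 0.5))   # 0.8, not 1.0
```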
https://physics.stackexchange.com/questions/7781/does-the-curvature-of-spacetime-theory-assume-gravity/69092#69092
1<p>No. While the curvature of spacetime -- or even Newtonian gravity, for that matter -- indeed can be modeled as a "potential well", the tendency of matter to lower this potential is an axiom of general relativity, and is not gravity. </p>
<p>The mathematics of general relativity can be derived from four important physical axioms -- (1) the Einstein-Hilbert action, or "gravity is the curvature of spacetime", or equivalently the Einstein-Field Equation, "matter curves spacetime" -- see <a href="https://physics.stackexchange.com/questions/3009/how-exactly-does-curved-space-time-describe-the-force-of-gravity/68707#68707">my answer here</a> for a derivation of the EFE from the action, (2) the geodesic equation, or "the geometry of spacetime moves matter", (3) Newtonian gravity is effective at low energies and (4) special relativity. So while it is true that general relativity assumes some law on whose basis matter moves (the geodesic equation), this law is not "gravity".</p>Tue, 25 Jun 2013 05:58:21 GMThttps://physics.stackexchange.com/questions/7781/-/69092#69092Abhimanyu Pallavi Sudhir2013-06-25T05:58:21ZAnswer by Abhimanyu Pallavi Sudhir for Gravity is an intrinsic property of every atoms?
https://physics.stackexchange.com/questions/68998/gravity-is-an-intrinsic-property-of-every-atoms/69017#69017
2<p>Atoms are overrated among laymen -- gravity is a property of all matter, regardless of the particles it is composed of. For example, light, dark matter, government bureaucrats and other exotic forms of matter all exhibit gravity.</p>
<p>That answers your first question -- I have no idea what the others are supposed to mean, what their words mean and how they are connected.</p>Mon, 24 Jun 2013 09:40:18 GMThttps://physics.stackexchange.com/questions/68998/-/69017#69017Abhimanyu Pallavi Sudhir2013-06-24T09:40:18ZAnswer by Abhimanyu Pallavi Sudhir for Proving that $\iint_S (\nabla \times F) \cdot \hat{n} dS =0$
https://math.stackexchange.com/questions/427019/proving-that-iint-s-nabla-times-f-cdot-hatn-ds-0/427033#427033
1<p>Yes. Correct. Here's an alternative proof:</p>
<p>Choose $C$ to be a band across the surface, like an equator. So now you have two surfaces. So divide the surface integral into two:
$$\iint\limits_{S_1}\left(\nabla\times\vec f\right)\cdot \hat n\mbox{ d}S+\iint\limits_{S_2}\left(\nabla\times\vec f\right)\cdot \hat n\mbox{ d}S$$</p>
<p>Since the two surfaces have opposite orientations, the two integrals cancel out (alternatively, you may use Stokes' theorem to convert them to line integrals with opposite directions).
<img src="https://i.stack.imgur.com/5s3V1.png" alt="enter image description here"></p>Sat, 22 Jun 2013 17:28:32 GMThttps://math.stackexchange.com/questions/427019/proving-that-iint-s-nabla-times-f-cdot-hatn-ds-0/427033#427033Abhimanyu Pallavi Sudhir2013-06-22T17:28:32ZAnswer by Abhimanyu Pallavi Sudhir for How exactly does curved space-time describe the force of gravity?
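<p>Equivalently, by the divergence theorem the flux of $\nabla\times\vec f$ through a closed surface is the volume integral of $\nabla\cdot(\nabla\times\vec f)$, which vanishes identically. A symbolic check (my own sketch; the test field is an arbitrary choice):</p>

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
# An arbitrary smooth test field, chosen only for illustration:
F = (x**2 * y, sp.sin(y) * z, sp.exp(x) + y * z)

def curl(F):
    Fx, Fy, Fz = F
    return (sp.diff(Fz, y) - sp.diff(Fy, z),
            sp.diff(Fx, z) - sp.diff(Fz, x),
            sp.diff(Fy, x) - sp.diff(Fx, y))

def div(F):
    return sp.diff(F[0], x) + sp.diff(F[1], y) + sp.diff(F[2], z)

print(sp.simplify(div(curl(F))))   # 0: div of curl vanishes identically
```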
https://physics.stackexchange.com/questions/3009/how-exactly-does-curved-space-time-describe-the-force-of-gravity/68707#68707
3<p>It is straightforward to see how the <em>geometry</em> of spacetime describes the force of gravity -- you just need to understand the geodesic equation, which in general relativity describes the paths of things subject to gravity and nothing else. This is the "spacetime affects matter" side of the theory.</p>
<p>To understand why curvature in particular, as a property of the geometry, is important, you need to understand the "matter affects spacetime" side of general relativity. The postulate is that the Gravitational Lagrangian of the theory is equal to the scalar curvature -- this is called the "Einstein-Hilbert Action" --</p>
<p>$$S=\int{\left( {\lambda R + {{\mathcal{L}}_M}} \right)\sqrt { - g}\, d^4x}$$</p>
<p>You set the variation in the action to zero, as with any classical theory, and solve for the equations of motion. The conventional way to do this goes something like this --</p>
<p>$$\int{\left( {\frac{{\delta \left( {\left( {{{\mathcal{L}}_M} + \lambda R} \right)\sqrt { - g} } \right)}}{{\delta {g_{\mu \nu }}}}} \right)\delta {g_{\mu \nu }}\,d^4x} = 0$$
$$ \sqrt { - g} \frac{{\delta {{\mathcal{L}}_M}}}{{\delta {g_{\mu \nu }}}} + \lambda \sqrt { - g} \frac{{\delta R}}{{\delta {g_{\mu \nu }}}} + \left( {{{\mathcal{L}}_M} + \lambda R} \right)\frac{{\delta \sqrt { - g} }}{{\delta {g_{\mu \nu }}}} = 0 $$
$$ \frac{{\delta R}}{{\delta {g_{\mu \nu }}}} + \frac{R}{{\sqrt { - g} }}\frac{{\delta \sqrt { - g} }}{{\delta {g_{\mu \nu }}}} = - \frac{1}{\lambda }\left( {\frac{1}{{\sqrt { - g} }}{{\mathcal{L}}_M}\frac{{\delta \sqrt { - g} }}{{\delta {g_{\mu \nu }}}} + \frac{{\delta {{\mathcal{L}}_M}}}{{\delta {g_{\mu \nu }}}}} \right)$$</p>
<p>$$ {R_{\mu \nu }} - \frac{1}{2}R{g_{\mu \nu }} = \frac{1}{{2\lambda }}{T_{\mu \nu }}$$</p>
<p>To fix the value of $\kappa=1/{2\lambda}$, we impose Newtonian gravity at low energies, for which we only consider the time-time component, which Newtonian gravity describes (I'll use $C$ for the gravitational constant, reserving $G$ for the trace of the Einstein tensor) -- </p>
<p>$$\begin{gathered}
{G_{00}} = \kappa c^4\rho \\
{R_{00}} = {G_{00}} - \frac{1}{2}Gg_{00} \\
\Rightarrow {R_{00}} \approx \kappa \left( {c^4\rho - \frac{1}{2}\frac{1}{{c^2}}c^4\rho c^2} \right) \approx \frac{1}{2}\kappa c^4\rho \\
\end{gathered} $$</p>
<p>Imposing Poisson's law from Newtonian gravity with $\partial^2\Phi$ approximating $\Gamma _{00,\alpha }^\alpha $,</p>
<p>$$ 4\pi C\rho \approx {\nabla ^2}\Phi \approx \Gamma _{00,\alpha }^\alpha \approx {R_{00}} \approx \frac{\kappa }{2}c^4\rho \\
\Rightarrow \kappa = \frac{{8\pi C}}{{c^4}} \\
$$</p>
<p>(The fact that this is possible is fantastic -- it means that simply postulating that spacetime is curved in a certain sense produces a force that agrees with our observations regarding gravity at low energies.) Writing the gravitational constant as $G$ again, this gives us the Einstein field equation,</p>
<p>$${G_{\mu \nu }} = \frac{{8\pi G}}{{c^4}}{T_{\mu \nu }}$$</p>
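<p>(A side note on magnitudes, my own addition: evaluating the coupling $\kappa$ numerically shows just how weakly matter curves spacetime.)</p>

```python
import math

G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2 (approximate)
c = 2.998e8     # speed of light, m/s (approximate)

kappa = 8 * math.pi * G / c**4   # the coupling in G_uv = kappa * T_uv
print(kappa)                     # ~2e-43 in SI units: spacetime is very stiff
```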
https://physics.stackexchange.com/questions/66948/string-theory-why-not-use-n-dimensional-blocks-objects-branes/68699#68699
3<p>There are, actually. Dilaton already covered the reason through T-duality, so I will discuss the requirement of $p$-branes imposed by Ramond-Ramond potentials. </p>
<p>The worldsheet of a string can couple to a Neveu-Schwarz B-field:
$$q\int_{}^{} {{{h^{ab}}}\frac{{\partial {X^\mu }}}{{\partial {\xi ^a}}}\frac{{\partial {X^\nu }}}{{\partial {\xi ^b}}}B_{\mu \nu }\sqrt { - \det {h_{ab}}} {{\text{d}}^2}\xi } $$</p>
<p>($q$ is the electric charge) The worldsheet of a string can couple to graviton field (spacetime metric):
$$m\int_{}^{} {{{h^{ab}}}\frac{{\partial {X^\mu }}}{{\partial {\xi ^a}}}\frac{{\partial {X^\nu }}}{{\partial {\xi ^b}}}g_{\mu \nu }\sqrt { - \det {h_{ab}}} {{\text{d}}^2}\xi } $$</p>
<p>You can change the "$m$" to any form you like, in terms of the tension/Regge Slope parameter/string length etc.</p>
<p>For a dilaton field,
$${q }\ell _P^2\int_{}^{} {\Phi R\sqrt { - \det {h_{\alpha \beta }}} {\text{ }}{{\text{d}}^2}\xi } $$
Ignore conformal invariance for the time being.</p>
<p>But what about the Ramond-Ramond potentials? All is fine with the Ramond-Ramond fields, but the Ramond-Ramond potentials $C_k$ are associated with the Ramond-Ramond fields $A_{k+1}$, and it is clear that they can't couple similarly to the worldsheet. But they can couple to a higher-dimensional worldvolume --
$${q_{{\text{RR}}}}\int_{}^{} {C_{{\mu _1}...{\mu _p}}^{p + 1}\frac{{\partial {x^{{\mu _1}}}}}{{\partial {\xi ^{{a_1}}}}}...\frac{{\partial {x^{{\mu _p}}}}}{{\partial {\xi ^{{a_p}}}}}{h^{{a_0}...{a_p}}}\sqrt { - \det {h^{{a_0}...{a_p}}}} {{\text{d}}^{p + 1}}\xi } $$</p>
<p>Which requires membranes and other higher-dimensional objects. It's interesting to note that while 10-dimensional string theories permit all sorts of these branes, M-theory only permits 2 and 5 dimensional branes.</p>Fri, 21 Jun 2013 10:46:35 GMThttps://physics.stackexchange.com/questions/66948/-/68699#68699Abhimanyu Pallavi Sudhir2013-06-21T10:46:35ZAnswer by Abhimanyu Pallavi Sudhir for What happens with the force of gravity when the distance between two objects is 0?
https://physics.stackexchange.com/questions/68519/what-happens-with-the-force-of-gravity-when-the-distance-between-two-objects-is/68522#68522
2<p>If the distance to a (point-sized) object actually were zero, indeed you'd have a situation of a singularity. But in your example -- being at the centre of the Earth -- the force is actually zero, since equal forces are acting at you from all directions. In general, the inverse-square law is applicable to point particles, and needs to be integrated over all points in the Earth placing a gravitational force on the object to get the resultant force.</p>
<p>$${\vec F_{{\rm{res}}}} = Gm\iiint_{|\vec x|<\,1}{\frac{{\left( {\vec x - \vec r} \right)dm}}{{{{\left| {\vec x - \vec r} \right|}^3}}}}
$$</p>
<p>Evaluating this in the special cases: where $|\vec r|>1$, you can use the inverse square law replacing the sphere by its center and using its total mass; where $|\vec r|<1$, you can use Newton's shell theorem and discount the part of the Earth above your head.</p>
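<p>Both regimes can be checked numerically. Here is a minimal sketch (my own, with $G = 1$ and a unit-mass spherical shell of radius 1, discretized into rings) showing "point mass outside, zero force inside" for a shell:</p>

```python
import math

def shell_field(r, n=1000):
    """z-component of the gravitational field (G = 1) at (0, 0, r) due to
    a unit-mass spherical shell of radius 1, summed over rings."""
    g = 0.0
    for i in range(n):
        theta = math.pi * (i + 0.5) / n            # midpoint rule in theta
        m = 0.5 * math.sin(theta) * (math.pi / n)  # mass of this ring
        z, s = math.cos(theta), math.sin(theta)
        d2 = s * s + (r - z) ** 2                  # squared distance to the ring
        g += m * (z - r) / d2 ** 1.5               # transverse parts cancel
    return g

print(shell_field(2.0))   # close to -1/4: like a point mass at the centre
print(shell_field(0.5))   # close to 0: no net force inside the shell
```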
<p>Regarding your new questions -- if you're at the midpoint between equal masses, you're stationary; if you're on the perpendicular bisector, you'll be drawn towards the midpoint because that's how vectors add up; if you're buried inside one of the two masses, the only forces acting on you will be from the sphere of mud under you and the other mass far away (the other mass isn't part of the shell above you).</p>
<p><strong>Gallery</strong></p>
<p><img src="https://i.stack.imgur.com/tkhpw.png" alt="enter image description here"></p>
<p>Forces being zero inside a spherical shell, since the smaller masses are closer to compensate for their mass, and the larger masses are further away.</p>
<p><img src="https://i.stack.imgur.com/h0Zhy.png" alt="enter image description here"></p>
<p>Newton's shell theorem -- the blue guy only feels gravity from the mud-sphere under his feet.</p>
<p><img src="https://i.stack.imgur.com/IbQeN.png" alt="enter image description here"></p>
<p>Three-body problem where one of the guys is a useless test mass who lives on the perpendicular bisector of the two bodies.</p>
<p><img src="https://i.stack.imgur.com/SOTgP.png" alt="enter image description here"></p>
<p>The orange horse inside the black mass feels gravity from both the mud-sphere under it and from the red mass.</p>Wed, 19 Jun 2013 05:38:52 GMThttps://physics.stackexchange.com/questions/68519/-/68522#68522Abhimanyu Pallavi Sudhir2013-06-19T05:38:52ZAnswer by Abhimanyu Pallavi Sudhir for Is velocity of light constant?
https://physics.stackexchange.com/questions/66856/is-velocity-of-light-constant/68513#68513
1<p>There are two questions here -- is the velocity of light <em>constant</em>, and is it <em>invariant</em>?</p>
<p>The direction/velocity of light changes whenever it interacts with something. This includes gravitational deflection, since things have to change direction in curved spacetime in one sense or another. The velocity isn't constant.</p>
<p>Is it invariant under Lorentz boosts in perpendicular directions? <em>No.</em> The speed is invariant, but the velocity isn't. This should be fairly clear, but you can prove it with brute force --</p>
<p>We need to apply a boost to light's four-velocity, but the four-velocity of light is actually infinite -- it's (infinity, infinity, 0, 0), except the infinities satisfy a certain relation in the sense of being related through a limit. So we consider an object traveling at speed $w$ in the $x$-direction, boost $v$ in the $y$-direction and let $w\to c$. The four-velocity transforms under this boost as:</p>
<p>$$\left[ {\begin{array}{*{20}{c}}{\gamma (w)}\\{w\gamma (w)}\\0\\0\end{array}} \right] \to \left[ {\begin{array}{*{20}{c}}{\gamma (v)\gamma (w)}\\{w\gamma (w)}\\{ - v\gamma (v)\gamma (w)}\\0\end{array}} \right]$$</p>
<p>The conventional 3-velocity can be extracted here by considering $dx/dt$, $dy/dt$:</p>
<p>$$\frac{{dx}}{{dt}} = \frac{{dx/d\tau }}{{dt/d\tau }} = \frac{{w\gamma (w)}}{{\gamma (v)\gamma (w)}} = \frac{w}{{\gamma (v)}}$$
$$\frac{{dy}}{{dt}} = \frac{{dy/d\tau }}{{dt/d\tau }} = \frac{{ - v\gamma (v)\gamma (w)}}{{\gamma (v)\gamma (w)}} = - v$$</p>
<p>Taking the limit as $w\to 1$, you get a 3-velocity of $(1/\gamma(v),-v, 0)$ -- one may confirm that this is not equivalent to the original three-velocity that was $(1,0,0)$, but nonetheless has the same magnitude (speed is invariant).</p>Wed, 19 Jun 2013 04:17:58 GMThttps://physics.stackexchange.com/questions/66856/-/68513#68513Abhimanyu Pallavi Sudhir2013-06-19T04:17:58ZAnswer by Abhimanyu Pallavi Sudhir for Momentum of a particle?
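<p>The limiting computation can be replicated numerically (a sketch in units where $c = 1$, using the boosted four-velocity written above):</p>

```python
import math

def boosted_3velocity(w, v):
    """3-velocity of an object moving at speed w along x, as seen after a
    boost of speed v along y (units c = 1), via the four-velocity."""
    gw = 1.0 / math.sqrt(1.0 - w * w)
    gv = 1.0 / math.sqrt(1.0 - v * v)
    ut, ux, uy = gv * gw, w * gw, -v * gv * gw   # boosted four-velocity (t, x, y)
    return ux / ut, uy / ut                      # dx/dt, dy/dt

vx, vy = boosted_3velocity(0.999999, 0.6)        # w close to c = 1
speed = math.hypot(vx, vy)
print(vx, vy)   # the direction has changed: (w/gamma(v), -v)
print(speed)    # but the magnitude tends to 1 (= c) as w -> 1
```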
https://physics.stackexchange.com/questions/68403/momentum-of-a-particle/68427#68427
2<p>The point of defining momentum is to have a conserved vector quantity relating to motion -- the formal definition of this comes from Noether's theorem, where momentum is the conserved charge resulting from translational invariance.</p>
<p>It's often conventional in mechanics to refer to momentum as an "amount of motion" or "how much a mass moves", but this is a rather vague statement, since there's no reason the same description can't be made of kinetic energy, for example.</p>Tue, 18 Jun 2013 06:28:42 GMThttps://physics.stackexchange.com/questions/68403/-/68427#68427Abhimanyu Pallavi Sudhir2013-06-18T06:28:42ZAnswer by Abhimanyu Pallavi Sudhir for Capacitors' working in a circuit
https://physics.stackexchange.com/questions/68387/capacitors-working-in-a-circuit/68426#68426
2<p>The answer is just "yes, obviously, the voltage is zero". The answer below is unnecessarily computational, but I'm keeping it in case someone likes that.</p>
<hr>
<p><strong>Archived answer</strong></p>
<p>I'll assume you are talking about a circuit with a capacitor and resistor inside. Then, let $Q$ be the charge, $t$ be the time, $C$ be the capacitance, $R$ be the resistance, $T=RC$ be the time constant, and $V$ be the electromotive force. You must know of the differential equation:
$$\frac{{{\text{d}}Q}}{{{\text{d}}t}} = \frac{{ - Q + CV}}{T}$$</p>
<p>Separating variables, integrating (with $Q(0)=0$) and differentiating,
$$\frac{{{\text{d}}Q}}{{{\text{d}}t}} = \frac{V}{R}\exp \left( { - \frac{t}{T}} \right)$$</p>
<p>Since the current is the rate of change in the charge with respect to time, we can rewrite this equation in the following form:</p>
<p>$$I = \frac{V}{R}\exp \left( { - \frac{t}{T}} \right)$$</p>
<p>So, the potential of the battery being equal to the potential of the capacitor simply means that $V=0$, so
$$I = \frac{0}{R}\exp \left( { - \frac{t}{T}} \right)=0$$</p>
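<p>The exponential solution can be cross-checked by integrating the differential equation above directly (a sketch; the component values are my own illustrative choices):</p>

```python
import math

# Illustrative values (my own choices, not from the answer):
V, R, C = 10.0, 1000.0, 1e-3   # volts, ohms, farads
T = R * C                      # time constant, seconds

def charge(t, dt=1e-4):
    """Integrate dQ/dt = (-Q + C*V)/T from Q(0) = 0 with Euler steps."""
    q, s = 0.0, 0.0
    while s < t:
        q += dt * (-q + C * V) / T
        s += dt
    return q

exact = C * V * (1 - math.exp(-3.0))   # analytic Q at t = 3T
print(charge(3 * T), exact)            # the two agree closely
```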
<p>So yes, although it will take an infinite amount of time to reach this point.</p>Tue, 18 Jun 2013 06:20:12 GMThttps://physics.stackexchange.com/questions/68387/-/68426#68426Abhimanyu Pallavi Sudhir2013-06-18T06:20:12ZAnswer by Abhimanyu Pallavi Sudhir for Measuring extra-dimensions
https://physics.stackexchange.com/questions/22542/measuring-extra-dimensions/68414#68414
4<p>The standard way to measure compactified dimensions is to test some inverse-square law (e.g. Newton's, electromagnetic, diffusion) at short distance scales and see if it breaks down and starts approaching some other (higher-power) inverse-power law.</p>
<p>In fact, the inverse-square law has only been verified down to a scale of 0.1mm -- here's a recent experimental paper doing this: <a href="http://arxiv.org/abs/hep-ph/0011014v1" rel="nofollow noreferrer">[1]</a>.</p>
<p>(Yes, you can measure time in metres, by multiplying by the speed of light. This is where "lightseconds" and other such measurements of distance come from. An example motivation for treating this as the unit of the time dimension is from the Minkowski metric, $ds^2=c^2dt^2-dx^2-dy^2-dz^2$, where $ct$ is a dimension analogous to the spatial ones.)</p>Tue, 18 Jun 2013 04:16:35 GMThttps://physics.stackexchange.com/questions/22542/-/68414#68414Abhimanyu Pallavi Sudhir2013-06-18T04:16:35ZAnswer by Abhimanyu Pallavi Sudhir for A change in the gravitational law
https://physics.stackexchange.com/questions/41109/a-change-in-the-gravitational-law/68326#68326
5<p>Such a change requires a 4+1-dimensional spacetime instead of a 3+1-dimensional one -- this would have several serious implications --</p>
<ol>
<li><p>The Riemann curvature tensor gains new "parts" with interesting physical implications with each new spacetime dimension -- 1-dimensional manifolds have no curvature in this sense, 2-dimensional manifolds have a scalar curvature, 3-dimensional manifolds gain the full Ricci tensor, and 4-dimensional manifolds get components corresponding to a new Weyl tensor. 5-dimensional geometry gets even more components, and general relativity in this spacetime is capable of explaining electromagnetism too, so electromagnetism (along with the radion field) starts behaving as a part of gravity.</p></li>
<li><p>Apparently a 5-dimensional spacetime is unstable, according to wikipedia's "privileged character of 3+1-dimensional spacetime"<a href="http://en.wikipedia.org/wiki/Spacetime#Privileged_character_of_3.2B1_spacetime" rel="nofollow noreferrer">[1]</a> (now a transclusion of <a href="https://en.wikipedia.org/wiki/Anthropic_principle#Dimensions_of_spacetime" rel="nofollow noreferrer">[2]</a>).</p></li>
<li><p>The string theory landscape would be a bit smaller, since there are fewer dimensions to compactify.</p></li>
<li><p>The Ricci curvature in a vacuum on an Einstein Manifold would no longer be exactly $\Lambda g_{ab}$. There will be a coefficient of 2/3.</p></li>
<li><p>The magnetic field, among other things "cross product-ish", could not be written as a vector, unlike the electric field. This is because it would have 6 components whereas there are only 4 spatial dimensions. So, perhaps humans would become familiar with exterior algebras earlier than us who live in 3+1 dimensions. Either that, or we would be trying to find out how magnetism works. Or we would just die out, for all the other reasons.</p></li>
<li><p>In string theory (see e.g. <a href="http://arxiv.org/abs/hep-th/0207249v1" rel="nofollow noreferrer">[3]</a>), gravitational constants in successively higher dimensions are calculated as $G_{n+1}=l_sG_n$, where $l_s$ is the string length (the units must be different in order to accomodate the extra factor of $r$ in Newton's gravitational law). For distance scales greater than the string length, this causes gravity to be much weaker than in our number of dimensions, but stronger for length scales shorter than the string length. It's interesting how gravity's long-range ability peaks at 4 dimensions (it is a contact force below 4 dimensions).</p></li>
</ol>
<p>See also some recent tests of the inverse square law at short length scales (to check for compactification) -- <a href="http://arxiv.org/abs/hep-ph/0011014" rel="nofollow noreferrer">[4]</a>.</p>
https://physics.stackexchange.com/questions/46118/mass-of-a-superstring-between-two-branes/68240#68240
2<p>It's similar -- </p>
<p>$${m^2} = \left( {N - a} \right) + {\left( {\frac{y}{{2\pi }}} \right)^2}$$</p>
<p>The important difference is that the number operator and normal ordering constant change for a superstring, and vary by sector.</p>Sun, 16 Jun 2013 11:12:27 GMThttps://physics.stackexchange.com/questions/46118/-/68240#68240Abhimanyu Pallavi Sudhir2013-06-16T11:12:27ZAnswer by Abhimanyu Pallavi Sudhir for Multiplying vectors (answered own question)
https://math.stackexchange.com/questions/414475/multiplying-vectors-answered-own-question/414476#414476
1<p>The dot product and cross product both appear as components of the tensor product of two vectors, $v^\mu w^\nu$, which gives a rank-2 tensor. The dot product is the contraction/trace $v^\mu w_\mu$, which is useful due to its invariance properties, and the cross product appears as the tensor $v^\mu w^\nu - v^\nu w^\mu$ (which in three dimensions admits an ugly misrepresentation in terms of axes of rotation, letting this tensor be written as a vector even though it doesn't transform as one).</p>
<p>An alternative formulation of this is in the geometric algebra notation, where the cross product is replaced by the "wedge product", the dot product is still the inner product, and the "geometric product" is the sum of the two. </p>
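<p>These relationships are easy to verify numerically (a sketch using numpy; the sample vectors are arbitrary):</p>

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])   # arbitrary sample vectors
w = np.array([4.0, 5.0, 6.0])

outer = np.outer(v, w)          # the rank-2 tensor v^mu w^nu
dot = np.trace(outer)           # its trace is the dot product
antisym = outer - outer.T       # the antisymmetric part v^mu w^nu - v^nu w^mu

# In three dimensions the 3 independent components of the antisymmetric
# part repackage as the cross product vector:
cross_from_tensor = np.array([antisym[1, 2], antisym[2, 0], antisym[0, 1]])

print(dot, np.dot(v, w))                  # both 32.0
print(cross_from_tensor, np.cross(v, w))  # both [-3.  6. -3.]
```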
<hr>
<p>ARCHIVED ANSWER (Jun 2013) follows, I no longer endorse the below contents --</p>
<ol>
<li><p>Dot product (Scalar Product)</p>
<p>The dot product, you could say, very hand-wavily measures both the overall size of 2 vectors and how parallel they are.</p>
<p>The dot product is related to the magnitudes and angles of the two vectors by:
$$\vec a\cdot\vec b=||\vec a||\mbox{ }||\vec b||\cos\theta$$</p>
<p>So, if the two vectors are orthogonal, their dot product is 0. If they are parallel, their dot product is the product of the two magnitudes. The latter case always happens for scalars. So, in this sense, the dot product actually is a generalisation of the ordinary product of scalars in $\mathbb C$</p>
<p>Of course, it is better to use the dot product when measuring orthogonality, only. In fact, often, orthogonality (and not perpendicularity) is defined in terms of the dot product being equal to 0.</p>
<p>Also, note that for complex vectors,
$$\Re(\vec a\cdot\vec b)=||\vec a||\mbox{ }||\vec b||\cos\theta$$</p>
<p>Generally, the dot product is calculated by:
$${\mathbf{a}}\cdot{\mathbf{b}} = \sum {{a_i}\overline {{b_i}} }$$ </p></li>
<li><p>Cross Product (Vector Product)</p>
<p>The cross product of two vectors in $\mathbb R^3$ is a vector orthogonal to these two vectors and has a magnitude of
$$||\vec a\times\vec b||=||\vec a||\mbox{ }||\vec b||\sin\theta$$</p>
<p>It can be calculated as:
$${\mathbf{a}} \times {\mathbf{b}} = \left| {\begin{array}{ccccccccccccccc}
{{{\hat e}_1}}&{{{\hat e}_2}}&{{{\hat e}_3}} \\
\leftarrow &{{{\mathbf{a}}^T}}& \to \\
\leftarrow &{{{\mathbf{b}}^T}}& \to
\end{array}} \right|$$</p>
<p>(not really a determinant -- just a mnemonic, etc. etc.)</p>
<p>Thus, the magnitude of the cross product describes the two vectors' "orthogonal-ness" and their overall "size"; it is 0 whenever the two vectors are parallel. Of course, these definitions become considerably more complicated in more than 3 dimensions, where one has to use:
$$\vec a\times\vec b=/(\vec a\wedge\vec b)$$</p>
<p>Here, $\wedge$ is the exterior product and $/$ is a duality between the cross products and the exterior (wedge $\wedge$) products. I once showed that the following generalisation is possible:
$$/\left( {{{{\mathbf{\hat e}}}_m} \wedge {{{\mathbf{\hat e}}}_n}} \right) = {\left( { - 1} \right)^{m + n + 1}}\mathop \bigwedge \limits_{k \ne m,n}^{} {{{\mathbf{\hat e}}}_k}$$</p>
<p>...in any dimension...</p></li>
<li><p>Exterior Product (Wedge Product)</p>
<p>The Exterior Product of 2 vectors is the bivector spanned by them.</p>
<p>Of course, there are many more products, such as the tensor product (the outer product is the special case for vectors, and the Kronecker product the one for matrices), the natural product, the Clifford product, etc. Actually, the natural product was defined by me in <a href="http://ccsenet.org/journal/index.php/jmr/article/view/18102" rel="nofollow noreferrer">http://ccsenet.org/journal/index.php/jmr/article/view/18102</a> in the hope of obtaining a geometric interpretation of matrices, though it works only for singular matrices.</p></li>
</ol>Sat, 08 Jun 2013 05:30:30 GMThttps://math.stackexchange.com/questions/414475/-/414476#414476Abhimanyu Pallavi Sudhir2013-06-08T05:30:30ZMultiplying vectors (answered own question)
https://math.stackexchange.com/questions/414475/multiplying-vectors-answered-own-question
2<p>I recently realised that asking and answering your own question is allowed here, so here is a question I've seen commonly on many sites:</p>
<p>"How does one multiply two vectors?"</p>
<p>This is a very open-ended (but basic) question, but here's my answer to it (below in the answers section).</p>linear-algebramatricestensor-productsexterior-algebracross-productSat, 08 Jun 2013 05:30:30 GMThttps://math.stackexchange.com/q/414475Abhimanyu Pallavi Sudhir2013-06-08T05:30:30ZAnswer by Abhimanyu Pallavi Sudhir for How is it that angular velocities are vectors, while rotations aren't?
https://physics.stackexchange.com/questions/286/how-is-it-that-angular-velocities-are-vectors-while-rotations-arent/65738#65738
6<p>You are mixing up different things. A rotation transformation is a transformation of vectors in a linear space -- such a transformation doesn't need an angular velocity or anything of the sort, and it doesn't even need to have anything to do with a mechanical rotation.</p>
<p>The angular velocity is the rate of a physical rotation, measured as $\vec\omega=d\vec\theta/dt$, where $\vec\theta$ is <em>also</em> a vector, the rotational analog of displacement.</p>
<p>In any case, the $\vec\theta$ is not the same as the matrix of rotation. The latter is a <em>function</em> of $\vec\theta$, but a matrix can be used to represent a lot more things than just a rotation. Note that a rotation can still be modelled as a time-dependent matrix itself, like $\vec{x}(t)=A(t)\vec{x}(0)$, but the matrix is still not the same as the angle of rotation.</p>
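To make the distinction concrete, here is a minimal NumPy sketch (the function name `rotation_z` and the sample numbers are mine, for illustration): the angle is just a parameter, and the matrix built from it is what actually acts on vectors, as in $\vec{x}(t)=A(t)\vec{x}(0)$.

```python
import numpy as np

def rotation_z(theta):
    """Rotation matrix about the z-axis -- a *function* of the angle theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

omega = 2.0  # angular speed (rad/s) about the z-axis
x0 = np.array([1.0, 0.0, 0.0])

# x(t) = A(t) x(0): the time-dependent matrix, not the angle, acts on vectors
t = 0.5
x_t = rotation_z(omega * t) @ x0
assert np.allclose(x_t, [np.cos(1.0), np.sin(1.0), 0.0])
```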
<hr>
<p>Note: I've been a bit sneaky in claiming that $\vec\theta$ is a "vector" -- it's really not, although it happens to have 3 components in 3 dimensions so it's conventional to write the "xy" component as the "z" component, "xz" as the "y" component, "yz" as "x", but in general it's best to think of angles as (2, 0) tensors $\theta^{\mu\nu}$. Interestingly, the rotation transformation is a (1, 1) tensor $A^{\mu}{}_{\nu}$.</p>Fri, 24 May 2013 12:20:55 GMThttps://physics.stackexchange.com/questions/286/-/65738#65738Abhimanyu Pallavi Sudhir2013-05-24T12:20:55ZAnswer by Abhimanyu Pallavi Sudhir for Can someone please explain magnetic vs electric fields?
https://physics.stackexchange.com/questions/53916/can-someone-please-explain-magnetic-vs-electric-fields/65091#65091
3<p>The electric and magnetic fields arise as Lorentz duals of each other, mixing and transforming into each other under Lorentz boosts. The full picture of the field comes from the electromagnetic field tensor</p>
<p>$$F_{\mu\nu} = \begin{bmatrix}
0 & E_x/c & E_y/c & E_z/c \\
-E_x/c & 0 & -B_z & B_y \\
-E_y/c & B_z & 0 & -B_x \\
-E_z/c & -B_y & B_x & 0
\end{bmatrix}$$</p>
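As a quick numerical sanity check, the tensor can be assembled from sample field components (with $c=1$ for simplicity; the numbers are arbitrary) and its antisymmetry verified:

```python
import numpy as np

# Sample field components E = (Ex, Ey, Ez), B = (Bx, By, Bz), with c = 1
E = np.array([1.0, 2.0, 3.0])
B = np.array([0.5, -1.0, 2.0])

F = np.array([
    [0.0,    E[0],  E[1],  E[2]],
    [-E[0],  0.0,  -B[2],  B[1]],
    [-E[1],  B[2],  0.0,  -B[0]],
    [-E[2], -B[1],  B[0],  0.0],
])

# F is antisymmetric: its 6 independent components are exactly
# the 3 components of E plus the 3 components of B.
assert np.allclose(F, -F.T)
```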
<p>This tensor satisfies simple identities (see <a href="https://en.wikipedia.org/wiki/Electromagnetic_tensor#Significance" rel="nofollow noreferrer">[1]</a>) equivalent to Maxwell's equations. The electric and magnetic fields are different components of this tensor, placed in positions analogous to, e.g., the momentum and shear stress in the 4d stress tensor.</p>Sun, 19 May 2013 05:01:31 GMThttps://physics.stackexchange.com/questions/53916/-/65091#65091Abhimanyu Pallavi Sudhir2013-05-19T05:01:31Z