Abhimanyu Pallavi Sudhir
http://www.rssmix.com/
This feed was created by mixing existing feeds from various sources.RSSMixComment by Abhimanyu Pallavi Sudhir on Example of a non-zero infinitely differentiable function $f:\mathbb R \to \mathbb R$ that is $0$ on an open interval
https://math.stackexchange.com/questions/2984894/example-of-a-non-zero-infinitely-differentiable-function-f-mathbb-r-to-mathb/2984899#2984899
@ClementC. I don't understand your point about bump functions -- surely the $f(x)f(1-x)$ construction works just as well in both cases.Wed, 22 May 2019 21:02:49 GMThttps://math.stackexchange.com/questions/2984894/example-of-a-non-zero-infinitely-differentiable-function-f-mathbb-r-to-mathb/2984899?cid=6656896#2984899Abhimanyu Pallavi Sudhir2019-05-22T21:02:49ZComment by Abhimanyu Pallavi Sudhir on Example of a non-zero infinitely differentiable function $f:\mathbb R \to \mathbb R$ that is $0$ on an open interval
https://math.stackexchange.com/questions/2984894/example-of-a-non-zero-infinitely-differentiable-function-f-mathbb-r-to-mathb/2984899#2984899
Any particular reason to choose $e^{-1/x^2}$ instead of $e^{-1/x}$? Asking because I see the former example everywhere even while the latter works.Wed, 22 May 2019 20:45:18 GMThttps://math.stackexchange.com/questions/2984894/example-of-a-non-zero-infinitely-differentiable-function-f-mathbb-r-to-mathb/2984899?cid=6656859#2984899Abhimanyu Pallavi Sudhir2019-05-22T20:45:18ZAnswer by Abhimanyu Pallavi Sudhir for Are there other kinds of bump functions than $e^\frac1{x^2-1}$?
https://math.stackexchange.com/questions/101480/are-there-other-kinds-of-bump-functions-than-e-frac1x2-1/3236066#3236066
<p>Here's how you can generate as many different kinds of bump functions as you want, for whatever definition of "kind" you may have:</p>
<ol>
<li>Start with any function <span class="math-container">$f(x)$</span> that <strong>grows faster than all polynomials</strong>, i.e. <span class="math-container">$\forall N, \ \lim_{x\to\infty}\frac{x^N}{f(x)}=0$</span>. Example: <span class="math-container">$e^x$</span>.</li>
<li>Then consider the function <span class="math-container">$g(x)=\frac1{f(1/x)}$</span>. This is a function that is flatter than all polynomials near zero, i.e. <span class="math-container">$\forall N,\ \lim_{x\to0}\frac{g(x)}{x^N}=0$</span>. This is a <strong>smooth non-analytic</strong> function. For our example, we get <span class="math-container">$e^{-1/x}$</span>.</li>
<li>Consider the function <span class="math-container">$h(x)=g(1+x)g(1-x)$</span>. This, after zeroing out stuff outside the interval <span class="math-container">$(-1,1)$</span>, is a <strong>bump function</strong>. For our example, <span class="math-container">$e^{2/(x^2-1)}$</span>.</li>
<li>Scale and transform to your liking.</li>
</ol>
<p>Just do this with different "kinds" of growth functions <span class="math-container">$f$</span>, and you'll get different "kinds" of bump functions <span class="math-container">$h$</span>. So here are some functions I could generate with this method -- try to guess which functions they're from:</p>
<p><span class="math-container">$$\begin{array}{l}
h(x) = {e^{2/({x^2} - 1)}} \\
h(x) = (1 + x)^{1/(1 + x)}(1 - x)^{1/(1 - x)} \\
h(x) = \frac1{\frac1{1 + x}!\frac1{1-x}!} \\
h(x)=e^{-[\ln^2(1+x)+\ln^2(1-x)]}
\end{array}$$</span></p>
<p>And the more rapidly your <span class="math-container">$f(x)$</span> grows, the nicer your bump function <span class="math-container">$h(x)$</span> looks.</p>
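The three-step recipe above is easy to mechanise. Here is a minimal Python sketch (the helper names are mine, not part of the answer) that builds $h$ from any sufficiently fast-growing $f$:

```python
import math

def make_bump(f):
    """Given f growing faster than every polynomial, build the bump
    h(x) = g(1 + x) * g(1 - x), where g(x) = 1/f(1/x) for x > 0
    and g(x) = 0 otherwise, then zero h out outside (-1, 1)."""
    def g(x):
        return 0.0 if x <= 0 else 1.0 / f(1.0 / x)

    def h(x):
        return 0.0 if abs(x) >= 1 else g(1 + x) * g(1 - x)

    return h

h = make_bump(math.exp)   # f(x) = e^x from step 1
print(h(0.0))             # e^{-2} ~ 0.1353, matching e^{2/(x^2-1)} at x = 0
print(h(1.0), h(-2.0))    # 0.0 outside the support
```

Swapping in a different growth function, e.g. `lambda x: x**x`, reproduces the other examples in the list above.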
<hr>
<p>Here's a Desmos applet to try this with different functions <span class="math-container">$f$</span>: <a href="https://www.desmos.com/calculator/ccf2goi9bj" rel="nofollow noreferrer"><strong>desmos.com/calculator/ccf2goi9bj</strong></a>. </p>
<p>If you're interested in smooth non-analytic functions, have a look at my post <a href="https://thewindingnumber.blogspot.com/2019/05/whats-with-e-1x-on-smooth-non-analytic.html" rel="nofollow noreferrer"><strong>What's with e^(-1/x)? On smooth non-analytic functions: part I</strong></a>.</p>Wed, 22 May 2019 18:36:36 GMThttps://math.stackexchange.com/questions/101480/are-there-other-kinds-of-bump-functions-than-e-frac1x2-1/3236066#3236066Abhimanyu Pallavi Sudhir2019-05-22T18:36:36ZComment by Abhimanyu Pallavi Sudhir on 'Obvious' theorems that are actually false
https://math.stackexchange.com/questions/820686/obvious-theorems-that-are-actually-false/820713#820713
This is the first result in this thread I actually find unintuitive. What the heck?Wed, 22 May 2019 17:44:21 GMThttps://math.stackexchange.com/questions/820686/obvious-theorems-that-are-actually-false/820713?cid=6656425#820713Abhimanyu Pallavi Sudhir2019-05-22T17:44:21ZAnswer by Abhimanyu Pallavi Sudhir for Why does Taylor’s series “work”?
https://physics.stackexchange.com/questions/480163/why-does-taylor-s-series-work/481556#481556
<p>Adding to <a href="https://physics.stackexchange.com/a/480187/">Sympathiser's answer</a> -- one can see why the existence of functions like <span class="math-container">$e^{-1/x}$</span> is not surprising by rephrasing them as "<strong>functions that approach zero near zero faster than any polynomial</strong>". This is not fundamentally more surprising than e.g. functions that grow faster than every polynomial -- in fact, for any function <span class="math-container">$f(x)$</span> that grows faster than every polynomial, the function <span class="math-container">$\frac1{f(1/x)}$</span> approaches zero near zero faster than any polynomial.</p>
<p>So for rapidly growing <span class="math-container">$f(x)=e^x$</span>, one gets the corresponding smooth non-analytic <span class="math-container">$e^{-1/x}$</span>. For <span class="math-container">$x^x$</span>, one gets <span class="math-container">$x^{1/x}$</span>. For <span class="math-container">$x!$</span>, one gets <span class="math-container">$\frac{1}{(1/x)!}$</span>, and so on.</p>
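A quick numeric illustration (my own sketch, not part of the answer): for fixed $N$, the ratio $e^{-1/x}/x^N$ collapses to zero once $x$ is small compared with $1/N$:

```python
import math

def ratio(x, N):
    """e^(-1/x) / x^N -- tends to 0 as x -> 0+ for every fixed N."""
    return math.exp(-1.0 / x) / x**N

# Sample well below x = 1/N, where the decay has set in:
for x in [0.05, 0.02, 0.01, 0.005]:
    print(x, ratio(x, N=10))
```

(The ratio actually peaks at $x = 1/N$ before plunging, so the decay is only visible for $x \ll 1/N$.)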
<p>See my post <a href="https://thewindingnumber.blogspot.com/2019/05/whats-with-e-1x-on-smooth-non-analytic.html" rel="nofollow noreferrer"><strong>What's with e^(-1/x)? On smooth non-analytic functions: part I</strong></a> for a fuller explanation.</p>Wed, 22 May 2019 00:11:33 GMThttps://physics.stackexchange.com/questions/480163/-/481556#481556Abhimanyu Pallavi Sudhir2019-05-22T00:11:33ZComment by Abhimanyu Pallavi Sudhir on Why does Taylor’s series “work”?
https://physics.stackexchange.com/questions/480163/why-does-taylor-s-series-work/480187#480187
Great answer -- do you have a reference for the terms "Cauchy point" and "Pringsheim point"? I wasn't able to find anything.Wed, 22 May 2019 00:01:17 GMThttps://physics.stackexchange.com/questions/480163/why-does-taylor-s-series-work/480187?cid=1081559#480187Abhimanyu Pallavi Sudhir2019-05-22T00:01:17ZWhat's with e^(-1/x)? On smooth non-analytic functions: part I
https://thewindingnumber.blogspot.com/2019/05/whats-with-e-1x-on-smooth-non-analytic.html
0When you first learned about the Taylor series, your intuition probably went something like this: you have $f(x)$, the derivative at this point tells you how $f$ changes from $x$ to $x+dx$ (which tells you $f(x+dx)$), the second derivative tells you how $f'$ changes from $x$ to $x+dx$, which recursively tells you $f(x+2\ dx)$, the third derivative tells you $f(x+3\ dx)$, and so on -- so if you have an <i>infinite </i>number of derivatives, you know how <i>each</i> derivative changes, so you should be able to predict the <i>full global behaviour of the function</i>, assuming it is infinitely differentiable (smooth) throughout.<br /><br />Everything is nice and dandy in this picture. But then you come across two disastrous, life-changing facts that make you cry for those good old days:<br /><ol><li><b>Taylor series have <i>radii of convergence</i> -- </b>If I can predict the behaviour of a function up until a certain point, why can't I predict it a bit afterwards? It makes sense if the function becomes rough at that point, like if it jumps to infinity, but even functions like $1/(1+x^2)$ have this problem. Sure, we've heard the explanation involving complex numbers, but why should we care about the complex singularities (here's a question: do we care about quaternion singularities?)? </li><li><b>Weird crap --</b> Like $e^{-1/x}$. Here, the Taylor series <i>does</i> converge, but it converges to the wrong thing -- in this case, to zero. These are <b>smooth non-analytic functions</b> or <b>defective functions</b>.</li></ol><div>In this article, we'll address the <b>weird crap -- </b>$e^{-1/x}$ (or "$e^{-1/x}$ for $x>0$, 0 for $x= 0$" if you want to be annoyingly formal about it) will be the example we'll use throughout, so if you haven't already seen this, go plot it on Desmos and get a feel for how it looks near the origin.<br /><br /></div><hr /><br /><div>The thing to realise about $e^{-1/x}$ is that the Taylor series -- $0 + 0x + 0x^2 + ...$ -- <i>isn't wrong</i>. 
The truncated Taylor series of degree $n$ is the <i>best polynomial approximation </i>for the function near zero, and none of the logic here fails for $e^{-1/x}$. There is honestly no other polynomial that better approximates the shape of the function as $x\to 0$.<br /><div><br /></div><div>If you think about it this way, it isn't too surprising that such a function exists -- what we have is a function that <b>goes to zero</b> as $x\to 0$ <b>faster than any polynomial</b> does. I.e. a function $g(x)$ such that</div><div>$$\forall n, \lim\limits_{x\to0}\frac{g(x)}{x^n}=0$$</div><div>This is not fundamentally any weirder than a function that escapes to infinity faster than all polynomials. In fact, such functions are quite directly connected. Given a function $f(x)$ satisfying:</div><div>$$\forall n, \lim\limits_{x\to\infty} \frac{x^n}{f(x)} = 0$$</div><div>We can make the substitution $x\leftrightarrow 1/x$ to get</div><div>$$\forall n, \lim\limits_{x\to0} \frac{1}{x^n f(1/x)} = 0$$</div><div>So $\frac1{f(1/x)}$ is a valid $g(x)$. Indeed, we can generate plenty of the standard smooth non-analytic functions this way: $f(x)=e^x$ gives $g(x)=e^{-1/x}$, $f(x)=x^x$ gives $g(x)=x^{1/x}$, $f(x)=x!$ gives $g(x)=\frac1{(1/x)!}$ etc.<br /><br /></div><div><hr /><br /><div></div></div><div>To better study what exactly is going on here, consider Taylor expanding $e^{-1/x}$ around some point other than 0, or equivalently, expanding $e^{-1/(x+\varepsilon)}$ around 0. 
One can see that:</div></div><div>$$\begin{array}{*{20}{c}}{f(0) = {e^{ - 1/\varepsilon }}}\\{f'(0) = \frac{1}{{{\varepsilon ^2}}}{e^{ - 1/\varepsilon }}}\\{f''(0) = \frac{{ - 2\varepsilon + 1}}{{{\varepsilon ^4}}}{e^{ - 1/\varepsilon }}}\\{f'''(0) = \frac{{6{\varepsilon ^2} - 6\varepsilon + 1}}{{{\varepsilon ^6}}}{e^{ - 1/\varepsilon }}}\\ \vdots \end{array}$$</div><div>Or ignoring higher-order terms for our purposes,</div><div>$$f^{(N)}(0)\approx(1/\varepsilon)^{2N}e^{-1/\varepsilon}$$</div><div>Each derivative $\frac{e^{-1/\varepsilon}}{\varepsilon^{2N}}\to0$ as $\varepsilon\to0$, but they each approach zero <i>slower</i> than the previous derivative, and somehow that is enough to give the sequence of derivatives the "kick" that they need in the domino effect that follows -- from somewhere at $N=\infty$ (putting it non-rigorously) -- to make the function grow as $x$ leaves zero, even though all the derivatives were zero at $x=0$.</div><div><br /></div><div><hr /><br /></div><div><i>But</i> we can still make it work -- by letting $N$, the upper limit of the summation approach $\infty$ <i>first</i>, before $\varepsilon\to 0$. 
In other words, instead of directly computing the derivatives $f^{(n)}(0)$, we consider the terms</div><div>$$\begin{array}{*{20}{c}}{f_\varepsilon^{(0)} = f(0)}\\{{{f}_\varepsilon^{(1)} }(0) = \frac{{f(\varepsilon ) - f(0)}}{\varepsilon }}\\{{{f}_\varepsilon^{(2)} }(0) = \frac{{f(2\varepsilon ) - 2f(\varepsilon ) + f(0)}}{{{\varepsilon ^2}}}}\\{{{f}_\varepsilon^{(3)} }(0) = \frac{{f(3\varepsilon ) - 3f(2\varepsilon ) + 3f(\varepsilon ) - f(0)}}{{{\varepsilon ^3}}}}\\ \vdots \end{array}$$</div><div>And write the generalised <b>Hille-Taylor series</b> as:</div><div>$$f(x) = \mathop {\lim }\limits_{\varepsilon \to 0} \sum\limits_{n = 0}^\infty {\frac{{{x^n}}}{{n!}}f_\varepsilon ^{(n)}(0)} $$</div><div>Then $N\to\infty$ before $\varepsilon\to0$ so you "reach" $N\to\infty$ first (or rather, you get large $n$th derivatives for increasing $n$) before $\varepsilon$ gets to 0.</div><div><br /></div><div>Another way of thinking about it is that the "local determines global" stuff makes sense to predict the value of the function at $N\varepsilon$, countable $N$, but it's a stretch to talk about uncountably many $\varepsilon$s away, which is what a finite neighbourhood is. But with these difference operators in the Hille-Taylor series, one ensures that each neighbourhood is a finite multiple of $h$ away at any point, so the differences determine $f$.<br /><br /><hr /></div><div><b>Very simple (but fun to plot on Desmos) exercise: </b>use $e^{-1/x}$ or another defective function to construct a "<b>bump function</b>", i.e. a smooth function that is 0 outside $(0, 1)$, but takes non-zero values everywhere in that range.<br /><br />Similarly, construct a "<b>transition function</b>", i.e. 
a smooth function that is 0 for $x\le0$, 1 for $x\ge1$.<br /><br />If you're done, play around with this (but no peeking): <a href="https://www.desmos.com/calculator/ccf2goi9bj"><b>desmos.com/calculator/ccf2goi9bj</b></a></div>analysisanalytic functionscalculusconvergencemathematicssmooth functionstaylor seriesTue, 21 May 2019 23:57:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-8831094740684334216Abhimanyu Pallavi Sudhir2019-05-21T23:57:00ZComment by Abhimanyu Pallavi Sudhir on Why do my books introduce the equation $\nabla \cdot \mathbf{E}=\frac{\rho}{\epsilon_0}$ without showing partial derivatives of $\mathbf{E}$ exist?
https://physics.stackexchange.com/questions/481414/why-do-my-books-introduce-the-equation-nabla-cdot-mathbfe-frac-rho-eps
My guess would be that in a mathematically rigorous formulation of electromagnetism, smoothness (perhaps except at countably many points) would be an axiom.Tue, 21 May 2019 10:04:52 GMThttps://physics.stackexchange.com/questions/481414/why-do-my-books-introduce-the-equation-nabla-cdot-mathbfe-frac-rho-eps?cid=1081278Abhimanyu Pallavi Sudhir2019-05-21T10:04:52ZComment by Abhimanyu Pallavi Sudhir on Can someone please explain magnetic vs electric fields?
https://physics.stackexchange.com/questions/53916/can-someone-please-explain-magnetic-vs-electric-fields/65091#65091
@zwep. No. One can see e.g. that this tensor is antisymmetric while the stress energy tensor is symmetric. I can't tell why you think they are related at all.Mon, 20 May 2019 23:07:57 GMThttps://physics.stackexchange.com/questions/53916/can-someone-please-explain-magnetic-vs-electric-fields/65091?cid=1081135#65091Abhimanyu Pallavi Sudhir2019-05-20T23:07:57ZData formats of inputs to arrange function in dplyr
https://stackoverflow.com/questions/56158114/data-formats-of-inputs-to-arrange-function-in-dplyr
<p>Given a table <code>monkeys</code> with column <code>brain_size</code>, one can write something like <strong><code>arrange(monkeys, brain_size)</code></strong>. </p>
<p>I don't understand how this makes sense -- <strong><code>brain_size</code> isn't a declared variable</strong> (if I refer to it, I get an error). It's just the name of a column -- shouldn't you rather have <code>arrange(monkeys, 'brain_size')</code>? <strong><em>Isn't</em> the column name just a string?</strong></p>
<p>Another related weirdness -- </p>
<pre><code>arrange(monkeys, desc(brain_size))
</code></pre>
<p>Once again, what exactly is the <strong><code>desc</code> function</strong>? How can it take <code>brain_size</code> as an input? Shouldn't you have something like <code>arrange(monkeys, 'brain_size', desc = true)</code>?</p>
<p>Am I missing something? Perhaps <code>brain_size</code> is a variable in some way but can only be accessed when you're unambiguously "inside" <code>monkeys</code>.</p>rfunctiontypesdplyrWed, 15 May 2019 21:51:42 GMThttps://stackoverflow.com/q/56158114Abhimanyu Pallavi Sudhir2019-05-15T21:51:42ZThe Cauchy Riemann Equations: what do they really mean?
https://thewindingnumber.blogspot.com/2019/05/what-do-cauchy-riemann-equations-really.html
<b>Question: <a href="https://math.stackexchange.com/a/3197879/78451">Geometrical Interpretation of Cauchy Riemann equations?</a></b><br /><br />One might think that being differentiable on $\mathbb{R}^2$ is sufficient for differentiability on $\mathbb{C}$. But the Jacobian of an arbitrary such function doesn't have a natural complex number representation.<br /><br />$$<br />\left[ {\begin{array}{*{20}{c}}<br />{\partial u/\partial x} & {\partial u/\partial y} \\<br />{\partial v/\partial x} & {\partial v/\partial y}<br />\end{array}} \right]<br />$$<br />Another way of putting this is that no complex-valued derivative (see below for an example) you can define for an arbitrary function fully captures the local behaviour of the function that is represented by the Jacobian.<br /><br />$$<br />\frac{df}{dz} = \left(\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} \right) + i\left(\frac{\partial v}{\partial x}-\frac{\partial u}{\partial y}\right)<br />$$<br />The idea is that we should be able to define a complex-valued derivative "purely" for the value $z$, without considering directions, i.e. we want to consider $\mathbb{C}$ one-dimensional in some sense (the sense being "as a vector space"). More precisely, the derivative in some direction in $\mathbb{C}$ should determine the derivative in all other directions in a natural manner -- whereas on $\mathbb{R}^2$, the derivatives in <i>two</i> directions (i.e. the gradient) determine the directional derivatives in all directions. <br /><br />If you think about it, this is quite a reasonable idea -- it's analogous to how not every linear transformation on $\mathbb{R}^2$ is a linear transformation on $\mathbb{C}$ -- only spiral transformations are.<br /><br />$$<br />\left[ {\begin{array}{*{20}{c}}<br />{a} & {-b} \\<br />{b} & {a}<br />\end{array}} \right]<br />$$<br />How would we generalise differentiability to an arbitrary manifold? 
Here's an idea: <b>a function is differentiable if it is locally a linear transformation</b>. So on $\mathbb{R}^2$, any Jacobian matrix is a linear transformation. But on $\mathbb{C}$, only Jacobians of the above form are linear transformations -- i.e. the only linear transformation on $\mathbb{C}$ is <b>multiplication by a complex number</b>, i.e. a spiral/amplitwist. So a complex differentiable function is one that is locally an amplitwist (geometrically), which can be stated in terms of the components of the Jacobian as:<br /><br />$$<br />\begin{align}<br />\frac{\partial u}{\partial x} & = \frac{\partial v}{\partial y} \\<br />\frac{\partial u}{\partial y} & = - \frac{\partial v}{\partial x} \\<br />\end{align}<br />$$<br />This is precisely why you shouldn't (and can't) view complex differentiability as some basic first-degree smoothness -- there is a much richer structure to these functions, and it's better to think of them via the transformations they have on grids.calculuscauchy-riemanncomplex analysisjacobianlinear algebraSun, 12 May 2019 23:35:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-3734514288105692758Abhimanyu Pallavi Sudhir2019-05-12T23:35:00ZThe Lie Bracket
https://thewindingnumber.blogspot.com/2019/05/an-easy-way-to-see-closure-under-lie.html
(If you're just here for the easy way to see closure, skip ahead to <a href="#closure">Closure under the Lie Bracket</a>)<br /><br />In the <a href="https://thewindingnumber.blogspot.com/2019/04/introduction-to-lie-groups.html">previous article</a>, I introduced Lie Groups and Lie Algebras by talking about Lie Algebras as a parameterisation for the Lie Group -- we said that the elements of the Lie Group could be written as exponentials of these parameters (not uniquely, sure, but they can be written in this way). Some things to note here:<br /><ul><li>What we've called "Lie Groups" refers only to <b><i>connected</i> Lie Groups</b>, as motivation. In general, the theory of Lie groups considers <b>any group that is also a manifold </b>-- for instance, the non-zero real numbers are also a Lie Group (even though their Lie Algebra is identical to that of the positive real numbers -- can you see why?). We will henceforth use this more general definition.</li><li>It's not really true that any Lie group can be parameterised in this fashion by writing each element as an exponential of a Lie Algebra element -- even for connected groups. This shouldn't be surprising -- given a term of the form $\exp X$ and a term $\exp Y$, their product $\exp X\exp Y$ is in the group by closure, but it isn't necessarily equivalent to $\exp(X+Y)$ on a non-Abelian group (could it be the exponential of something else? We'll find out later).</li><li>A <i>parameterisation</i> of this form is not the same as a <i>co-ordinate system</i>.</li></ul>The last point is what we will concentrate on in this article.<br /><br />What is a co-ordinate system on a manifold? Well, the key point is that any element of the manifold can be decomposed in terms of its components along the co-ordinates. 
On a Lie Group, this means that there should exist a "basis" for the Lie Group $\exp(X_1),\ldots\exp(X_n)$ corresponding to the basis $X_1,\ldots X_n$ for the Lie Algebra vector space such that every element of the Lie Group can be written as products of powers of these elements, and any rearrangement of the terms in the product should leave it invariant (i.e. the elements should commute with each other).<br /><br /><div class="twn-pitfall">Note that it <em>is</em> possible to decompose elements of a connected Lie Group as a product of <em>some</em> exponentials, but this is different from there being specifically $n$ elements that one can write any Lie group element as products of.</div><br />But clearly, this can only be possible if the group is <i>Abelian</i>, commutative. This is a special case of the more general fact that only a <b>holonomic basis</b> gives rise to a co-ordinate system on a manifold. The idea is -- a closed loop should produce no overall group action. If you <b>flow</b> $\varepsilon$ in the $X$ direction, then flow $\varepsilon$ in the $Y$ direction, then flow $\varepsilon$ back in the $X$ direction and flow $\varepsilon$ back in the $Y$ direction, you should end up back where you started. If you don't, then the resulting difference is the infinitesimal "<b>group commutator</b>" of the Lie Group:<br /><br />$$e^{\varepsilon X}e^{\varepsilon Y}e^{-\varepsilon X}e^{-\varepsilon Y}$$<br />One can check via a Taylor expansion that this is equal, to second order, to:<br /><br />$$1+\varepsilon^2(XY-YX)$$<br />The first thing to note about this is that the $\varepsilon^1$ term is zero -- this may seem like a surprising coincidence, but perhaps it isn't that surprising (I mean, there's nothing else it <i>could</i> be, right? 
If the commutator were $1+\varepsilon z$ to first order, $\exp z$ would be equal to 1, and so it would give no characterisation at all of the amount of non-commutativity of the flows $X$ and $Y$) -- it's analogous to vector calculus, where the <b>curl</b> of a vector field is proportional to $\varepsilon^2$ (i.e. a line integral along the curve is proportional to its area, so you divide it by this area in the definition of curl, etc.).<br /><br />The second-order term, $XY-YX$, is more interesting. This may seem weird because so far, we've been considering the Lie algebra purely as a <b>vector space</b>, with addition and scalar multiplication being the only things going on. But clearly, this cannot be the entire picture, or a connected Lie group would be characterised entirely by the dimension of its Lie algebra. This operation -- the <b>Lie Bracket</b> or <b>Lie Algebra commutator</b> represented by $[X,Y]$ -- as we will see, gives some additional structure to the Lie Algebra, and in fact characterises it (we'll see what this means).<br /><br />So far, we've obtained no motivation for why this operation $XY-YX$ is actually of any significance. Sure, it appeared in our second-order approximation for the group commutator, but is the group commutator we defined really so great? Surely there could be other ways one could measure the non-commutativity of a group. And the $\varepsilon^2$ business is <i>weird</i>. Things that arise proportional to $\varepsilon$ live in the tangent space, in the Lie Algebra. 
Where does $[X,Y]$ even live?<br /><br />Two facts will convince us that the Lie Bracket is indeed the "right" measure of non-commutativity of a Lie Algebra:<br /><br /><ul><li><b>The Lie Algebra is closed under the Lie Bracket -- </b>we will see that in fact, $[X,Y]$ lives <i>in the lie algebra</i>, so it is in fact a binary operation on the Lie Algebra, and really does add structure to the Lie Algebra.</li><li><b>It characterises the entire Lie Algebra -- </b>not only is it <i>part</i> of the structure of the Lie Algebra, it characterises the entire structure of the Lie Algebra. What this means is that defining the Lie Bracket on the vector space allows a full characterisation of the part of the group connected to the identity (the "connected part" of the group), so we can say that any Lie Algebras with the same dimension and Lie Bracket are isomorphic.</li></ul><div><br /></div><hr /><br /><a id="closure" name="closure"><b>Closure under the Lie Bracket</b></a><br /><br />If you're like me, you might've thought of several analogous situations to our $1+\varepsilon^2(XY-YX)$ expression -- e.g. in (complex) analysis, at a point where the derivative of a function is zero, the function is characterised by its <i>second</i> derivative (consult Needham's <i>Complex Analysis</i>, p. 205-207 for an explanation). Another example is -- if the first derivative of a function is zero, the second derivative satisfies the product rule (this is actually directly related, in a way we won't go into now).<br /><br />Here's an idea you <i>might</i> think of: as we discussed earlier, the infinitesimal group commutator is $e^{\varepsilon X}e^{\varepsilon Y}e^{-\varepsilon X}e^{-\varepsilon Y}= 1+\varepsilon^2 (XY - YX) + O(\varepsilon^3)\in G$. But for a moment let $\varepsilon$ not be infinitesimal. 
So $\varepsilon (XY - YX) + O(\varepsilon^2)\in \mathfrak{g}$, the Lie Algebra corresponding to Lie Group $G$, so by scaling $XY-YX+O(\varepsilon)\in\mathfrak{g}$ and by connectedness of the vector space $XY-YX\in\mathfrak{g}$.<br /><br />But this argument is <b>incorrect</b> -- this becomes obvious if you try to formally write it down -- In general, $1+\varepsilon T\in G$ does <b>not</b> imply $T\in\mathfrak{g}$ for non-infinitesimal $\varepsilon$. It's close to an element in $\mathfrak{g}$ (for small $\varepsilon$), but how close? You might get the feeling that it is "sufficiently close", in that the limit $\varepsilon\to0$ of the sequence $\left(c_\varepsilon(X,Y)-1\right)/\varepsilon^2$ (where $c_\varepsilon(X,Y)$ is the group commutator) indeed ends up in the Lie Algebra.<br /><br />To make this feeling formal, consider instead the curve parameterised differently as $\gamma(\varepsilon)=e^{\sqrt\varepsilon X}e^{\sqrt\varepsilon Y}e^{-\sqrt\varepsilon X}e^{-\sqrt\varepsilon Y}$. Then $\gamma'(0)=XY-YX$, and we're done.<br /><br /><div class = "twn-furtherinsight">think about the Taylor expansion here of this new curve for a while</div>holonomic co-ordinateslie algebralie bracketlie groupslie theorymathematicsMon, 06 May 2019 21:21:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-2049429296050963446Abhimanyu Pallavi Sudhir2019-05-06T21:21:00ZComment by Abhimanyu Pallavi Sudhir on How can I solve a chessboard puzzle of size 100X100?
https://math.stackexchange.com/questions/3213369/how-can-i-solve-a-chessboard-puzzle-of-size-100x100
Do rectangles with slanted (but still perpendicular) sides count as rectangles?Sat, 04 May 2019 14:05:06 GMThttps://math.stackexchange.com/questions/3213369/how-can-i-solve-a-chessboard-puzzle-of-size-100x100?cid=6611626Abhimanyu Pallavi Sudhir2019-05-04T14:05:06ZComment by Abhimanyu Pallavi Sudhir on How can I solve a chessboard puzzle of size 100X100?
https://math.stackexchange.com/questions/3213369/how-can-i-solve-a-chessboard-puzzle-of-size-100x100
Are you asking for the number of ways this can be done?Sat, 04 May 2019 14:01:39 GMThttps://math.stackexchange.com/questions/3213369/how-can-i-solve-a-chessboard-puzzle-of-size-100x100?cid=6611618Abhimanyu Pallavi Sudhir2019-05-04T14:01:39ZComment by Abhimanyu Pallavi Sudhir on What is wrong with this reasoning?
https://physics.stackexchange.com/questions/476864/what-is-wrong-with-this-reasoning/476869#476869
Looks OK to me. Reviewers, please look at the question before recommending deletion -- the question asks "doesn't $d^2r/dt^2=0$ imply acceleration is 0?" and this answer clarifies that $d^2r/dt^2$ isn't the acceleration, since $r$ is not a vector.Tue, 30 Apr 2019 05:44:54 GMThttps://physics.stackexchange.com/questions/476864/what-is-wrong-with-this-reasoning/476869?cid=1070930#476869Abhimanyu Pallavi Sudhir2019-04-30T05:44:54ZComment by Abhimanyu Pallavi Sudhir on Is a theory the same as a hypothesis?
https://physics.stackexchange.com/questions/266089/is-a-theory-the-same-as-a-hypothesis
A theory is a mathematical system, a hypothesis the empirical statement that the theory applies in our universe.Mon, 29 Apr 2019 10:34:00 GMThttps://physics.stackexchange.com/questions/266089/is-a-theory-the-same-as-a-hypothesis?cid=1070571Abhimanyu Pallavi Sudhir2019-04-29T10:34:00ZComment by Abhimanyu Pallavi Sudhir on Why we never use the product between vectors like between elements of direct groups product?
https://math.stackexchange.com/questions/3200431/why-we-never-use-the-product-between-vectors-like-between-elements-of-direct-gro
The formal answer is "you can define it if you want". The informal answer is "that's just useless, because it doesn't have any sensible invariances, etc." (so for instance it doesn't mean anything geometrically)Wed, 24 Apr 2019 11:39:06 GMThttps://math.stackexchange.com/questions/3200431/why-we-never-use-the-product-between-vectors-like-between-elements-of-direct-gro?cid=6585541Abhimanyu Pallavi Sudhir2019-04-24T11:39:06ZAnswer by Abhimanyu Pallavi Sudhir for Geometrical Interpretation of Cauchy Riemann equations?
https://math.stackexchange.com/questions/1026134/geometrical-interpretation-of-cauchy-riemann-equations/3197879#3197879
<p>One might think that being differentiable on <span class="math-container">$\mathbb{R}^2$</span> is sufficient for differentiability on <span class="math-container">$\mathbb{C}$</span>. But the Jacobian of an arbitrary such function doesn't have a natural complex number representation.</p>
<p><span class="math-container">$$
\left[ {\begin{array}{*{20}{c}}
{\partial u/\partial x} & {\partial u/\partial y} \\
{\partial v/\partial x} & {\partial v/\partial y}
\end{array}} \right]
$$</span></p>
<p>Another way of putting this is that no complex-valued derivative (see below for an example) you can define for an arbitrary function fully captures the local behaviour of the function that is represented by the Jacobian.</p>
<p><span class="math-container">$$
\frac{df}{dz} = \left(\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} \right) + i\left(\frac{\partial v}{\partial x}-\frac{\partial u}{\partial y}\right)
$$</span></p>
<p>The idea is that we should be able to define a complex-valued derivative "purely" for the value <span class="math-container">$z$</span>, without considering directions, i.e. we want to consider <span class="math-container">$\mathbb{C}$</span> one-dimensional in some sense (the sense being "as a vector space"). More precisely, the derivative in some direction in <span class="math-container">$\mathbb{C}$</span> should determine the derivative in all other directions in a natural manner -- whereas on <span class="math-container">$\mathbb{R}^2$</span>, the derivatives in <em>two</em> directions (i.e. the gradient) determine the directional derivatives in all directions. </p>
<p>If you think about it, this is quite a reasonable idea -- it's analogous to how not every linear transformation on <span class="math-container">$\mathbb{R}^2$</span> is a linear transformation on <span class="math-container">$\mathbb{C}$</span> -- only spiral transformations are.</p>
<p><span class="math-container">$$
\left[ {\begin{array}{*{20}{c}}
{a} & {-b} \\
{b} & {a}
\end{array}} \right]
$$</span></p>
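<p>One can check numerically that matrices of this form are exactly multiplication by complex numbers -- a minimal Python sketch (the helper names are my own):</p>

```python
def apply_matrix(a, b, x, y):
    """Apply the matrix [[a, -b], [b, a]] to the column vector (x, y)."""
    return (a * x - b * y, b * x + a * y)

def apply_complex(a, b, x, y):
    """Multiply x + iy by a + ib, returned as a (real, imag) pair."""
    w = complex(a, b) * complex(x, y)
    return (w.real, w.imag)

# The matrix [[a, -b], [b, a]] acts on (x, y) exactly as the
# complex number a + ib acts on x + iy.
for (a, b, x, y) in [(2.0, 3.0, 1.0, 0.0), (2.0, 3.0, 0.0, 1.0), (-1.5, 0.5, 4.0, -2.0)]:
    assert apply_matrix(a, b, x, y) == apply_complex(a, b, x, y)
```

<p>A linear map whose matrix is <em>not</em> of this form disagrees with every complex multiplication on some vector -- which is exactly the point above.</p>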
<p>How would we generalise differentiability to an arbitrary manifold? Here's an idea: <strong>a function is differentiable if it is locally a linear transformation</strong>. So on <span class="math-container">$\mathbb{R}^2$</span>, any Jacobian matrix is a linear transformation. But on <span class="math-container">$\mathbb{C}$</span>, only Jacobians of the above form are linear transformations -- i.e. the only linear transformation on <span class="math-container">$\mathbb{C}$</span> is <strong>multiplication by a complex number</strong>, i.e. a spiral/amplitwist. So a complex differentiable function is one that is locally an amplitwist (geometrically), which can be stated in terms of the components of the Jacobian as:</p>
<p><span class="math-container">$$
\begin{align}
\frac{\partial u}{\partial x} & = \frac{\partial v}{\partial y} \\
\frac{\partial u}{\partial y} & = - \frac{\partial v}{\partial x} \\
\end{align}
$$</span></p>
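<p>These equations are easy to test numerically. Here is a minimal Python sketch (the function names are my own) that estimates the Jacobian entries by central differences and checks the Cauchy-Riemann equations:</p>

```python
h = 1e-6

def jacobian(f, x, y):
    """Numerically estimate u_x, u_y, v_x, v_y at x + iy, where
    f maps a complex number z to u(z) + i v(z)."""
    fx = (f(complex(x + h, y)) - f(complex(x - h, y))) / (2 * h)
    fy = (f(complex(x, y + h)) - f(complex(x, y - h))) / (2 * h)
    return fx.real, fy.real, fx.imag, fy.imag

def satisfies_cauchy_riemann(f, x, y, tol=1e-5):
    ux, uy, vx, vy = jacobian(f, x, y)
    return abs(ux - vy) < tol and abs(uy + vx) < tol

# z -> z^2 is holomorphic, so its Jacobian is an amplitwist;
# z -> conj(z) is perfectly smooth on R^2 but fails the equations.
assert satisfies_cauchy_riemann(lambda z: z * z, 1.3, -0.7)
assert not satisfies_cauchy_riemann(lambda z: z.conjugate(), 1.3, -0.7)
```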
<p>This is precisely why you shouldn't (and can't) view complex differentiability as some basic first-degree smoothness -- there is a much richer structure to these functions, and it's better to think of them via the transformations they have on grids.</p>Tue, 23 Apr 2019 05:55:35 GMThttps://math.stackexchange.com/questions/1026134/-/3197879#3197879Abhimanyu Pallavi Sudhir2019-04-23T05:55:35ZTrace, Laplacian, the Heat equation, divergence theorem
https://thewindingnumber.blogspot.com/2019/04/trace-laplacian-heat-equation.html
0The aim of this article is to help build an intuition for the trace of a matrix, "the sum of the elements on the diagonal" -- the basic idea is that the trace is an "average" of some sort, an average of the action of an operator or a quadratic form. We'll make this idea clearer with an example from classical physics: the heat equation.<br /><br /><hr /><br />Consider an $n$-dimensional space with some temperature distribution $T(\vec{x},t)$. We wish to set up a differential equation for this function.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://4.bp.blogspot.com/-WDr_mgo-qEg/XL2NKs0J8iI/AAAAAAAAFfc/RQlvQLKSZDklWwkAOd7jkVaq-XXcwCzcACLcBGAs/s1600/lawofcooling.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="177" data-original-width="361" height="156" src="https://4.bp.blogspot.com/-WDr_mgo-qEg/XL2NKs0J8iI/AAAAAAAAFfc/RQlvQLKSZDklWwkAOd7jkVaq-XXcwCzcACLcBGAs/s320/lawofcooling.png" width="320" /></a></div>In the case that $n = 1$, this differential equation is exceedingly easy to write down, considering the difference $(T(x+dx)-T(x))-(T(x)-T(x-dx))$ as the double-derivative upon division by $dx^2$. More rigorously, what we're doing here is applying a <b>localised version of the fundamental theorem of calculus</b>. I.e. 
we're writing down:<br /><br />$$\begin{align}<br />\lim_{\Delta x \to 0} \frac{1}{\Delta x}(T'(x + \Delta x) - T'(x)) &= \lim_{\Delta x \to 0} \frac{1}{{\Delta x}}\int_x^{x+\Delta x} {T''(s)\,ds} \\<br />& = T''(x)<br />\end{align}<br />$$<br />More generally, we may consider the $n$-dimensional case.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-HZE4Or8E8tU/XL2YeooEtuI/AAAAAAAAFfw/akx8XXDjp5clCqNjy4LzSmkFV3Fk0-KzACLcBGAs/s1600/laplace.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="522" data-original-width="605" height="276" src="https://1.bp.blogspot.com/-HZE4Or8E8tU/XL2YeooEtuI/AAAAAAAAFfw/akx8XXDjp5clCqNjy4LzSmkFV3Fk0-KzACLcBGAs/s320/laplace.png" width="320" /></a></div>Analogously to before, one may try to look at temperature flows in each direction -- here, we have an <i>integral</i>, done on the boundary of an infinitesimal region $V$ (this symbol will also represent the volume of the region):<br /><br />$$ \frac{{\partial T}}{{\partial t}} = \lim_{V \to 0} \frac{\alpha }{V}\int_{\partial V} {\hat u\,dS \cdot \vec \nabla T} $$<br />At this point, one may apply the divergence theorem, converting this to:<br /><br />$$\frac{{\partial T}}{{\partial t}} = \mathop {\lim }\limits_{V \to 0} \frac{\alpha }{V}\int\limits_V {\vec \nabla \cdot \vec \nabla T\;dV} = \alpha{\left| {\vec \nabla } \right|^2}T$$<br />In this sense, the divergence theorem is analogous to the fundamental theorem of calculus for manifolds with boundaries that are more than one-dimensional (see the bottom of the page for a link to a formalisation/an abstraction based on this analogy). But there are more ways to intuitively understand this.
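The localised-FTC computation above is easy to check numerically -- a minimal Python sketch (the function name is my own), verifying that the symmetric second difference of $\sin x$ converges to $-\sin x$:

```python
import math

def second_difference(T, x, dx):
    """((T(x+dx) - T(x)) - (T(x) - T(x-dx))) / dx^2."""
    return ((T(x + dx) - T(x)) - (T(x) - T(x - dx))) / dx ** 2

# For T(x) = sin(x) we have T''(x) = -sin(x), and the symmetric
# difference quotient converges to it as dx -> 0.
x0 = 0.8
assert abs(second_difference(math.sin, x0, 1e-4) - (-math.sin(x0))) < 1e-6
```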
Note how the Laplacian is the trace of the Hessian matrix (note: we use $\vec{\nabla}^2$ to refer to the Hessian and $\left|\vec\nabla\right|^2$ to refer to the Laplacian):<br /><br />$${\left| {\vec \nabla } \right|^2}T = {\mathop{\rm tr}} \left({\vec{\nabla} ^2}T\right)$$<br />The trace of a matrix is fundamentally linked to some notion of <i>averaging</i> -- the simplest interpretation of this is that it is $n$ times the mean of the eigenvalues. But more relevant to our situation, it can be shown that the trace of an $n \times n$ matrix is $n$ times the expected value of the quadratic form defined by the matrix on the unit sphere -- or on a general sphere $S$:<br /><br />$${\mathop{\rm tr}} A = \frac{n}{S}\int_S {\frac{{\Delta {x^T}A\,\Delta x}}{{\Delta {x^T}\Delta x}}\,dS} $$<br />One may check that taking the limit as $\Delta x \to 0$, substituting $\nabla^2$ for the operator and writing ${\overrightarrow \nabla ^2}f\,d\vec x = \overrightarrow \nabla f$, one gets the original "average of directional derivatives" expression.<br /><br /><div class = "twn-furtherinsight">Can you interpret the other coefficients of the characteristic polynomial in terms of statistical ideas?</div><br /><hr /><br /><div><b>Further reading:</b></div><div><ul><li>Using the "infinitesimal region" idea to define divergence, curl and Laplacian rigorously: <a href="https://www.khanacademy.org/math/multivariable-calculus/greens-theorem-and-stokes-theorem/formal-definitions-of-divergence-and-curl/a/formal-definition-of-divergence-in-two-dimensions">Khan Academy</a></li><li>An abstraction based on the "analogy" between FTC, Divergence Theorem, Kelvin-Stokes Theorem, etc. 
<a href="https://en.wikipedia.org/wiki/Stokes%27_theorem">Stokes' theorem (Wikipedia)</a></li></ul></div>calculusdivergence theoremheat equationlaplacianlinear algebraphysicsstatisticsstokes theoremtraceMon, 22 Apr 2019 11:57:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-6647801447602057527Abhimanyu Pallavi Sudhir2019-04-22T11:57:00ZComment by Abhimanyu Pallavi Sudhir on What is an intuitive explanation to why elimination retains solution to the original system of equations?
https://math.stackexchange.com/questions/1516892/what-is-an-intuitive-explanation-to-why-elimination-retains-solution-to-the-orig/1516896#1516896
@Pinocchio It does. Invertible matrices are automorphisms of a vector space. Multiplying by a singular matrix causes you to lose information, because you lose a dimension.Sun, 21 Apr 2019 12:05:07 GMThttps://math.stackexchange.com/questions/1516892/what-is-an-intuitive-explanation-to-why-elimination-retains-solution-to-the-orig/1516896?cid=6576812#1516896Abhimanyu Pallavi Sudhir2019-04-21T12:05:07ZSVD, polar decomposition, normal matrices; a re-look at transposes and FTLA
https://thewindingnumber.blogspot.com/2019/04/svd-polar-decomposition-normal-matrices.html
0Back in <b><a href="https://thewindingnumber.blogspot.com/2017/08/symmetric-matrices-null-row-space-dot-product.html">Null, row spaces, transpose, fundamental theorem of algebra</a></b>, we first introduced some hand-wavy intuition for the transpose and the orthogonality of the row space and the null space (and the following fundamental theorem of linear algebra). Here, we solidify this intuition a bit more clearly.<br /><br />Consider the "symmetric collapse" discussed in the above article. Our study of the transformation relied specifically on looking at it in a specific basis -- <b>an <i>orthogonal </i>basis</b> -- comprised of the column space and the null space. In this basis, the transformation is a scaling on both axes. In the more general case of an asymmetric collapse -- in which we rotated our space before collapsing, we looked at a basis formed by the row space and the null space -- the basis got <b>rotated and scaled</b> into the new basis, that was the column space and an arbitrary other vector (that could be perpendicular to the column space).<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://3.bp.blogspot.com/-aYNoceHiRtc/XLtjjyOB04I/AAAAAAAAFeQ/LzZjyHchbq0ABijtIg5WFdvCc9SwvVU0ACLcBGAs/s1600/collapse.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="520" data-original-width="615" height="270" src="https://3.bp.blogspot.com/-aYNoceHiRtc/XLtjjyOB04I/AAAAAAAAFeQ/LzZjyHchbq0ABijtIg5WFdvCc9SwvVU0ACLcBGAs/s320/collapse.png" width="320" /></a></div>A sensible question to ask is if any transformation can be written in this fashion -- as a transformation of an orthogonal basis into another orthogonal basis. Analogous to how an <b>eigenvalue decomposition of a matrix writes it as scalings in some basis</b>, we're looking to represent the matrix as a <b>spiral (i.e. a scaling combined with a rotation) in some basis</b>. 
But let's stick with the first formulation of our question -- for any linear transformation $A: \mathbb{R}^n \to \mathbb{R}^m$, can we find an orthogonal basis on $\mathbb{R}^n$ that is mapped to an orthogonal basis of $\mathbb{R}^m$?<br /><br />One could, e.g. consider the images under $A$ of each rotated orthonormal basis of $\mathbb{R}^2$ (i.e. look at the function $AU(\theta)\vec{e}_1 \cdot AU(\theta)\vec {e}_2$ for varying $\theta$ where $U(\theta)$ is the rotation matrix by angle $\theta$) and apply the intermediate value theorem, etc. And such a proof could in principle be extended to $\mathbb{R}^n$.<br /><br />(See <a href="http://www.ams.org/publicoutreach/feature-column/fcarc-svd">here</a> for a thorough explanation.)<br /><br />Here's another, more insightful way you might come to prove this -- we've been visualising linear transformations so far by looking at the image of the basis vectors, but another way to visualise these transformations is by looking at the <b>image of the unit circle</b> under the transformation.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://upload.wikimedia.org/wikipedia/commons/e/e9/Singular_value_decomposition.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="330" data-original-width="400" height="264" src="https://upload.wikimedia.org/wikipedia/commons/e/e9/Singular_value_decomposition.gif" width="320" /></a></div><br /><div class="twn-furtherinsight">Why does this make sense? Well, one can find an ellipse passing through any two vectors centered at the origin. Elaborate on this argument. Is the resulting ellipse unique? (Hint: no, unless you mark points on the circumference)</div><br />Specifically, consider the <b>axes of the image ellipse </b>$\sigma_1 u_1$, $\sigma_2 u_2$ where $u_1$, $u_2$ are unit vectors. 
In the original unit circle, any pair of orthogonal vectors on the circle can be axes, so consider the <b>pre-image of the axes</b> of the image ellipse $v_1$, $v_2$. So we have:<br />$$ A v_1 = \sigma_1 u_1\\<br />A v_2 = \sigma_2 u_2 $$<br />Or in general:<br />$$AV = U\Sigma\\<br />A = U\Sigma V^*$$<br />Where $\Sigma$ is diagonal with nonnegative real entries, while $U$ and $V$ are orthogonal/unitary. This is called the <b>Singular-Value Decomposition (SVD)</b> of $A$. <br /><br />In a sense, one can view this as an alternative to the eigen-decomposition. In the eigendecomposition, one looks for a <i>single</i> basis in which the transformation is a scaling. In the singular value decomposition, one looks at scaling and then "re-interpreting" in another basis, but requires that the bases be orthogonal, and that the diagonal matrix be nonnegative and real-valued.<br /><br />The entries of $\Sigma$ are called the <b>singular values </b>of $A$, and the columns of $U$ and $V$ are the <b>left-singular vectors</b> and the <b>right-singular vectors</b> of $A$ respectively.<br /><br />(<b>Exercise: </b>You know that $\Sigma$ is the scaling of the orthogonal basis, i.e. of the right-singular basis. Convince yourself that the rotation of the basis is given by $UV^*$.)<br /><br />This gives us a much better intuition for the transpose. The SVD of $A^*$ is clearly:<br /><br />$$A^* = V\Sigma U^*$$<br />I.e. the transpose has precisely the opposite rotational effect as $A$ and the same scaling. This is as opposed to the inverse matrix, which has both the opposite rotational and scaling effect as the matrix. For a <b>rotation matrix </b>(or generally an orthogonal matrix $A^*A=I$), the transpose equals the inverse, analogous to how the conjugate equals the inverse for a unit complex number $\bar{z}z=1$. A Hermitian matrix, $A=A^*$, by contrast, is one which is irrotational, $UV^*=I$, i.e. 
for which the SVD equals the eigendecomposition.<br /><br /><div class="twn-furtherinsight">Use the SVD to get some intuition for transpose identities like $(AB)^*=B^*A^*$</div><br /><hr /><br />It's instructive to consider the SVD in the case of our original motivating example -- an asymmetric matrix representing a collapse of $\mathbb{R}^2$ into a line. What are the singular bases of this transformation? Well, it maps the orthogonal basis formed by <b>row space and the null space</b> into the orthogonal basis formed by the <b>column space and the left-nullspace</b> (i.e. the orthogonal complement of the column space).<br /><br /><div class="twn-furtherinsight">Think about its transpose.</div><br />Arranging the singular values from largest to smallest, we then have the following relation between the SVD and the items in the fundamental theorem of linear algebra, where $n$ is the dimension of the domain, $m$ is the dimension of the codomain, and $r$ is the dimension of the image/column space:<br /><ul><li>The last $n - r$ <b>singular values are zeroes</b>. </li><li>The first $r$ <b>singular values are positive</b>.</li><li>The last $n - r$ <b>right-singular vectors</b> span the <b>null space</b>.</li><li>The first $r$ <b>right-singular vectors</b> span the <b>row space</b>.</li><li>The last $m-r$ <b>left-singular vectors </b>span the <b>left-null space</b>.</li><li>The first $r$ <b>left-singular vectors</b> span the <b>column space</b>.</li></ul>Note that the terms <b>kernel</b>, <b>coimage</b>, <b>cokernel</b> and <b>image</b> are also used for the <b>null space</b>, <b>row space</b>, <b>left-null space</b> and <b>column space</b>, sometimes in a more general setting.<br /><br />This is the full form of the <b>Fundamental Theorem of Linear Algebra</b>.<br /><br /><div class="twn-furtherinsight">Spend some time thinking about the SVD of non-square matrices, relating them to square collapse matrices. 
Think about their transposes.</div><br /><div class="twn-exercises">Show that the right-singular vectors $V$ of a matrix $A$ are given by the eigenvectors of $A^*A$ (hint: start by considering the two-dimensional case, relating the right-singular vectors to a maximisation/minimisation problem, and extend the idea to more dimensions).<br /><br />From this, it is clear that the left-singular vectors $U$ (which are the right-singular vectors of $A^*$) are given by the eigenvectors of $AA^*$ and the singular values are the square roots of the eigenvalues of $A^*A$. Well, some of them are (which ones?).</div><br />We observed earlier that the rotational effect of the matrix $A = U \Sigma V^*$ can be given by $UV^*$. The scaling effect is given by $\Sigma$ on the basis of $V$. Hence we can write:<br /><br />$$A = (UV^*)(V\Sigma V^*)$$<br />Letting $W=UV^*$ and $R = V\Sigma V^*$, this gives us a representation of $A$ as:<br /><br />$$A=WR$$<br />Where $W$ is orthogonal and $R$ is positive-semidefinite. This is known as the <b>right-polar decomposition</b> of $A$. Analogously, one may consider the <b>left-polar decomposition</b>:<br /><br />$$A = (U\Sigma U^*)(UV^*) = R'W$$<br /><div class="twn-furtherinsight">Interpret the above decomposition like we did the right decomposition. Note how $R' = WRW^*$, and how a right-polar decomposition leads to a left-polar decomposition of the transpose, and vice versa.</div><br /><div class="twn-furtherinsight">When is the polar decomposition unique? Compare the situation to the polar decomposition of complex numbers.</div><br />Here's a question: when is $R=R'$? One way of putting it is that $WRW^* = R$, i.e. $R$ commutes with $W$. This is a bit difficult to work with. Instead, one may show that the matrices $R$ and $R'$ can be given by $(A^*A)^{1/2}$ and $(AA^*)^{1/2}$ respectively (prove it!) -- noting that a principal square root can uniquely be defined for positive semidefinite matrices. 
So $R=R'\Leftrightarrow A^*A=AA^*$. These are known as <b>normal matrices</b>. Below is an exercise that provides more intuition into the behaviour of normal matrices.<br /><br /><hr /><br />What does it mean for a matrix to commute with its transpose? Earlier, when discussing commuting matrices, we referred to it as "matrices that do not disturb each other" -- specifically, they preserve each other's (generalised) eigenspaces. In the case of commuting with its transpose, it's easy to show that this means having the <i>same</i> eigenvectors (prove this!).<br /><br />Here's a fact about the eigenvectors of the Hermitian transpose: the eigenvector of $A$ corresponding to the eigenvalue $\lambda$ is orthogonal to all eigenvectors of $A^*$ corresponding to any eigenvalue other than $\lambda^*$ (prove this!).<br /><br />From these two facts, it follows that the following are equivalent (write down the proofs clearly!):<br /><br /><ul><li>$A$ is <b>normal</b>.</li><li>$A$ commutes with $A^*$.</li><li>$R$ commutes with $W$.</li><li>$A$ is <b>unitarily diagonalisable</b>.</li></ul><div>This is known as the <b>spectral theorem.</b></div><hr /><br />(Anyone ever thought about how weird the word "normal" is? Sometimes, it means "perpendicular", sometimes -- as in "orthonormal" and "normalisation" -- it means "unit length", because "norm". What does it mean in this context? It's probably just referring to its eigenvectors being normal/orthogonal, but I like to think it's referring to the fact that the two alternative Hermitian-valued "norms" of the matrix, $A^*A$ and $AA^*$, are equal, so the matrix has a single "norm".)<br /><div><br /></div>Fundamentally, normal matrices are "analogous" to complex numbers, or they "generalise" complex numbers, in the sense that each of its eigenvalues acts as a complex number, transforming, acting as spirals on each of the orthogonal eigenvectors within its own copy of $\mathbb{C}$. 
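The criterion $A^*A = AA^*$ is easy to test numerically. Here is a minimal pure-Python sketch (for real $2\times 2$ matrices; the helper names are my own) contrasting a spiral, which acts like a single complex number, with a shear, which does not:

```python
import math

def mmul(A, B):
    """Product of two 2x2 matrices given as ((a, b), (c, d))."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def transpose(A):
    return ((A[0][0], A[1][0]), (A[0][1], A[1][1]))

def is_normal(A, tol=1e-12):
    """A real matrix is normal iff it commutes with its transpose."""
    P, Q = mmul(transpose(A), A), mmul(A, transpose(A))
    return all(abs(P[i][j] - Q[i][j]) < tol for i in range(2) for j in range(2))

# A spiral (rotation composed with scaling) is normal; a shear is not.
t, r = 0.6, 2.0
spiral = ((r * math.cos(t), -r * math.sin(t)), (r * math.sin(t), r * math.cos(t)))
shear = ((1.0, 1.0), (0.0, 1.0))
assert is_normal(spiral)
assert not is_normal(shear)
```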
One may construct the following table of analogies between normal matrices and complex numbers:<br /><br /><div class="twn-analogies"><style type="text/css">.tg {border-collapse:collapse;border-spacing:0;margin:0px auto;} .tg td{font-family:Arial, sans-serif;font-size:14px;padding:10px 18px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:black;} .tg th{font-family:Arial, sans-serif;font-size:14px;font-weight:normal;padding:10px 18px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:black;} .tg .tg-c3ow{border-color:inherit;text-align:center;vertical-align:top} .tg .tg-uys7{border-color:inherit;text-align:center} .tg .tg-l04w{font-weight:bold;background-color:#efefef;border-color:inherit;text-align:center} </style><br /><table class="tg"><tbody><tr> <th class="tg-l04w">Complex numbers</th> <th class="tg-l04w">Normal matrices</th> </tr><tr> <td class="tg-uys7">Zero (sorta)</td> <td class="tg-uys7">Singular</td> </tr><tr> <td class="tg-uys7">Non-zero</td> <td class="tg-uys7">Invertible</td> </tr><tr> <td class="tg-uys7">Real</td> <td class="tg-uys7">Hermitian</td> </tr><tr> <td class="tg-uys7">Positive real</td> <td class="tg-uys7">Positive-definite</td> </tr><tr> <td class="tg-c3ow">Nonnegative real</td> <td class="tg-c3ow">Positive-semidefinite</td> </tr><tr> <td class="tg-uys7">Imaginary</td> <td class="tg-uys7">Anti-Hermitian</td> </tr><tr> <td class="tg-c3ow">Unit</td> <td class="tg-c3ow">Unitary</td> </tr><tr> <td class="tg-c3ow">Conjugate</td> <td class="tg-c3ow">Hermitian transpose</td> </tr><tr> <td class="tg-c3ow">Norm-squared</td> <td class="tg-c3ow">Gram matrix $A^*A$</td> </tr><tr> <td class="tg-c3ow">Magnitude</td> <td class="tg-c3ow">$(A^*A)^{1/2}=R=V\Sigma V^*$</td> </tr><tr> <td class="tg-c3ow">Argument</td> <td class="tg-c3ow">$AR^{-1}=W=UV^*$</td> </tr><tr> <td class="tg-c3ow">Real Part</td> <td class="tg-c3ow">$\frac12(A+A^*)$</td> </tr><tr> <td class="tg-c3ow">Imaginary Part times 
$i$</td> <td class="tg-c3ow">$\frac12(A-A^*)$</td> </tr></tbody></table></div><br /><b>Exercise:</b> an <em>EP matrix</em> or range-Hermitian matrix is a weakened version of a Hermitian matrix -- the row space of the matrix equals the column space. Although this was a bit hard to understand in <a href="https://thewindingnumber.blogspot.com/2017/08/symmetric-matrices-null-row-space-dot-product.html">the first article</a> and was only briefly mentioned towards the end, we now have the intuition to comprehend them. Explain why a matrix is range-Hermitian if and only if it is unitarily similar to a matrix of the form of a block matrix:<br /><br />$$\left[ {\begin{array}{*{20}{c}}C&0\\0&0\end{array}} \right]$$<br />Where $C$ is a non-singular square matrix and the zeroes are zero block matrices. This decomposition is called the <b>core-nilpotent decomposition</b>. Hence, show that being range-Hermitian is a weakened form of being normal.analogiesfundamental theorem of linear algebralinear algebranormal matrixpolar decompositionsingular value decompositionsvdtransposeSun, 21 Apr 2019 09:28:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-7301194014626662521Abhimanyu Pallavi Sudhir2019-04-21T09:28:00ZAnswer by Abhimanyu Pallavi Sudhir for Computing the Lie bracket on the Lie group $GL(n, \mathbb{R})$
https://math.stackexchange.com/questions/1884253/computing-the-lie-bracket-on-the-lie-group-gln-mathbbr/3193887#3193887
1<p>I think the sensible way to get an intuition for this is to just look at the Taylor expansion of the group commutator:</p>
<p><span class="math-container">$$e^{\varepsilon x} e^{\varepsilon y} e^{-\varepsilon x} e^{-\varepsilon y}$$</span></p>
<p>Which to second order is <span class="math-container">$1+\varepsilon^2(xy-yx)$</span>. Presumably you know how to prove that the second derivative of the above expression is equivalent to the derivative-of-the-adjoint definition.</p>Fri, 19 Apr 2019 18:26:17 GMThttps://math.stackexchange.com/questions/1884253/-/3193887#3193887Abhimanyu Pallavi Sudhir2019-04-19T18:26:17ZAnswer by Abhimanyu Pallavi Sudhir for Determinant-like expression for non-square matrices
https://math.stackexchange.com/questions/903028/determinant-like-expression-for-non-square-matrices/3191959#3191959
0<p>See <a href="https://arxiv.org/abs/1904.08097" rel="nofollow noreferrer">1904.08097</a> for a review I authored of generalised determinant functions of tall matrices, and their properties -- this should provide a self-contained introduction to three different generalised determinants. </p>
<p>The function mentioned by Joonas Ilmavirta is the square of the "determinant-like function" that I first wrote about in 2013, albeit with an erroneous factor of <span class="math-container">$\sqrt{|m-n|!}$</span> at the front, which is corrected in the above review. It is also the norm-squared of the vector determinant, and the product of the singular values of the matrix.</p>
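<p>As a small illustration of the last point, here is a Python sketch (the function name is my own, and it only handles two-column matrices) of the quantity $\sqrt{\det(A^*A)}$, which equals the product of the singular values:</p>

```python
import math

def det_like(A):
    """sqrt(det(A^T A)) for a tall real matrix A with two columns
    (given as a list of rows) -- the product of the singular values of A."""
    g00 = sum(row[0] * row[0] for row in A)
    g01 = sum(row[0] * row[1] for row in A)
    g11 = sum(row[1] * row[1] for row in A)
    return math.sqrt(g00 * g11 - g01 * g01)

# Columns orthogonal with lengths 3 and 2: the singular values are
# 3 and 2, so the determinant-like function gives 6.
A = [[3.0, 0.0], [0.0, 2.0], [0.0, 0.0]]
assert abs(det_like(A) - 6.0) < 1e-12
```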
<p>If you want a non-trivial determinant for "wide matrices", i.e. flattenings, you will need to be a bit creative in the definition of the determinant, such as by defining it as the scaling of <span class="math-container">$m$</span>-volumes where <span class="math-container">$m$</span> is the dimension of the flattened space.</p>Thu, 18 Apr 2019 03:38:38 GMThttps://math.stackexchange.com/questions/903028/-/3191959#3191959Abhimanyu Pallavi Sudhir2019-04-18T03:38:38ZComment by Abhimanyu Pallavi Sudhir on Did the new image of black hole confirm the general theory of relativity? (M87)
https://physics.stackexchange.com/questions/472323/did-the-new-image-of-black-hole-confirm-the-general-theory-of-relativity-m87/472339#472339
@sgf Sure, if your theory is deterministic, it can be falsified. But we know that the universe is probabilistic, and that any "deterministic theory" is really a model, and models can't really be falsified.Mon, 15 Apr 2019 18:54:19 GMThttps://physics.stackexchange.com/questions/472323/did-the-new-image-of-black-hole-confirm-the-general-theory-of-relativity-m87/472339?cid=1063204#472339Abhimanyu Pallavi Sudhir2019-04-15T18:54:19ZComment by Abhimanyu Pallavi Sudhir on Did the new image of black hole confirm the general theory of relativity? (M87)
https://physics.stackexchange.com/questions/472323/did-the-new-image-of-black-hole-confirm-the-general-theory-of-relativity-m87/472339#472339
@sgf This is a meaningless question -- truth is absolute, but any physical statements about truth that are rationally inferred are fundamentally statistical. The statement "all inference is fundamentally statistical" is not a physical statement, so the question is moot.Mon, 15 Apr 2019 12:21:51 GMThttps://physics.stackexchange.com/questions/472323/did-the-new-image-of-black-hole-confirm-the-general-theory-of-relativity-m87/472339?cid=1062973#472339Abhimanyu Pallavi Sudhir2019-04-15T12:21:51ZComment by Abhimanyu Pallavi Sudhir on Is a full rank square matrix necessarily a positive definite matrix?
https://math.stackexchange.com/questions/3188374/is-a-full-rank-square-matrix-necessarily-a-positive-definite-matrix
Of course not. But a Gram matrix is a matrix of the form $M^H M$, so it is always positive-semidefinite, and if it is non-singular, it is positive-definite.Mon, 15 Apr 2019 07:17:03 GMThttps://math.stackexchange.com/questions/3188374/is-a-full-rank-square-matrix-necessarily-a-positive-definite-matrix?cid=6561670Abhimanyu Pallavi Sudhir2019-04-15T07:17:03ZComment by Abhimanyu Pallavi Sudhir on Did the new image of black hole confirm the general theory of relativity? (M87)
https://physics.stackexchange.com/questions/472323/did-the-new-image-of-black-hole-confirm-the-general-theory-of-relativity-m87/472339#472339
There's nothing wrong with the word "confirm" -- "falsification" is just a confirmation of the negation, and falsification is not the nature of science, Bayesian inference (or <a href="https://en.wikipedia.org/wiki/Abductive_reasoning" rel="nofollow noreferrer">abductive reasoning</a>) is. Data cannot completely prove or completely disprove a theory, it only affects the Bayesian confidence in a theory -- it's just that negative data tends to affect statistical confidence more than positive data as a result of the law of multiple explanations.Sat, 13 Apr 2019 08:35:58 GMThttps://physics.stackexchange.com/questions/472323/did-the-new-image-of-black-hole-confirm-the-general-theory-of-relativity-m87/472339?cid=1061963#472339Abhimanyu Pallavi Sudhir2019-04-13T08:35:58ZAnswer by Abhimanyu Pallavi Sudhir for Intuitive explanation of a positive semidefinite matrix
https://math.stackexchange.com/questions/9758/intuitive-explanation-of-a-positive-semidefinite-matrix/3181937#3181937
1<p>Positive-definite matrices are matrices that are <strong>congruent to the identity matrix</strong>, i.e. that can be written as <span class="math-container">$P^HP$</span> for invertible <span class="math-container">$P$</span> (for some reason, a lot of authors define congruence as <span class="math-container">$N=P^TMP$</span>, but here we go by the Hermitian definition <span class="math-container">$N=P^HMP$</span>). </p>
<p>One reason this is useful is that if two forms <span class="math-container">$M$</span> and <span class="math-container">$N$</span> are congruent, their corresponding "generalised unitary groups" <span class="math-container">$\{A^HMA=M\}$</span> and <span class="math-container">$\{B^HNB=N\}$</span> are isomorphic (via conjugation by <span class="math-container">$P$</span>). So positive-definite matrices (as well as negative-definite matrices, because <span class="math-container">$-I$</span> is preserved by the unitary group as well) define a dot product whose geometry is isomorphic to Euclidean geometry.</p>
<p>Similarly, a <strong>positive semidefinite matrix</strong> defines a geometry that Euclidean geometry is <em>homeomorphic</em> to -- to put it slightly imprecisely, such a geometry has all the symmetries of Euclidean geometry, and perhaps then some.</p>
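<p>For a concrete check of the congruence picture, here is a minimal pure-Python sketch (helper names are my own) contrasting the Gram forms $P^HP$ of an invertible and a singular real matrix:</p>

```python
def quad_form(M, x):
    """x^T M x for a 2x2 matrix M (list of rows) and a pair x."""
    return (x[0] * (M[0][0] * x[0] + M[0][1] * x[1])
            + x[1] * (M[1][0] * x[0] + M[1][1] * x[1]))

def gram(P):
    """P^T P for a real 2x2 matrix P."""
    return [[sum(P[k][i] * P[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Invertible P: the form x . (P^T P) x = |Px|^2 is positive-definite.
# Singular P: the form vanishes on the kernel, so it is only semidefinite.
P_invertible = [[2.0, 1.0], [0.0, 1.0]]
P_singular = [[1.0, 1.0], [2.0, 2.0]]  # rank 1; kernel spanned by (1, -1)
samples = [(1.0, 0.0), (0.0, 1.0), (1.0, -1.0), (-2.0, 3.0)]
assert all(quad_form(gram(P_invertible), x) > 0 for x in samples)
assert quad_form(gram(P_singular), (1.0, -1.0)) == 0.0
assert all(quad_form(gram(P_singular), x) >= 0 for x in samples)
```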
<p>See a fuller treatment <strong><a href="https://thewindingnumber.blogspot.com/2019/04/geometry-positive-definiteness-and.html" rel="nofollow noreferrer">here</a></strong>.</p>Wed, 10 Apr 2019 06:48:08 GMThttps://math.stackexchange.com/questions/9758/-/3181937#3181937Abhimanyu Pallavi Sudhir2019-04-10T06:48:08ZGeometry, positive definiteness, and Sylvester's law of inertia
https://thewindingnumber.blogspot.com/2019/04/geometry-positive-definiteness-and.html
0Something I found absurdly dissatisfying when first studying linear algebra was the idea of a <em>positive-definite matrix</em> (or a nonnegative-definite one). Here are some explanations of the idea I found online:<br /><ul><li><b>it's a generalisation of a positive real number </b>-- correct, but why? In what sense? Sure, you could say "all the eigenvalues are positive real numbers in an orthogonal eigenbasis" -- but why is this really important? It just doesn't feel complete. This will pretty much be our second motivation, but with more backstory.</li><li><b>it keeps vectors "roughly" near where they started -- </b>this is by far the <i>worst</i> explanation I've found of the idea -- the condition that $x\cdot Ax > 0$ means that any vector remains within a right angle from where it started. Not only is this terrible because it seems absurdly arbitrary (to choose $\pi/2$ as our special angle), but it also fails to make clear why we're interested only in positive-definite <i>symmetric</i> matrices, analogous to positive real numbers. On the reals, certainly, there are plenty of transformations that achieve this with real vectors (rotations of less than $\pi/2$, for instance), but we don't care about them. <i>The most serious problem</i> with this explanation, though, is that it tries to reinterpret the condition $x^TAx>0$ in terms of the conventional Euclidean dot product, while the <em>whole point</em> is to look at generalised dot products, with bilinear forms other than the Euclidean one.</li></ul>In this article, I'll motivate the idea of a positive-definite matrix by considering "generalised geometries" and generalised inner products -- through a series of exercises (well, I'll try to keep them exercises, but maybe I won't be able to stop myself from answering them myself).<br /><br /><hr /><br /><b>"Definite geometries"</b><br /><br /><ol><li>First, we come up with a <b>definition of geometry </b>(no pun intended). 
Much of the linear algebra we've dealt with -- specifically the dot product -- was with Euclidean geometry in mind, and it's interesting to think about what kind of linear algebra we would come up with if we considered other sorts of geometries.</li><ol type="a"><li>The first image in your head when hearing of geometry is that of a <i>space</i>, or <i>manifold -- </i>perhaps $\mathbb{R}^n$. But a space is just a set of points. Most geometric properties you've dealt with concern properties like <i>length</i> and <i>angles</i> and <i>shapes</i>. These properties don't depend on e.g. where you place an object on the manifold, i.e. translations -- as well as some other transformations. <b>Can you characterise all the linear transformations under which geometric properties (like the ones we mentioned) are invariant?</b> -- i.e. the symmetries of Euclidean geometry.</li><li>These transformations are known as "rigid transformations" and form a group (prove it if you want, but come on -- they're <i>symmetries</i>, of course they form a group). Can you identify this group (discard translations if you prefer)?</li><li>So Euclidean geometry can be defined as <b>the symmetries of $\mathbb{R}^n$ under the group $O(n)$ acting on it</b>. It is then natural to generally define a geometry as <b>the symmetries of a manifold under a group acting on it</b>. </li></ol><li>Now that we have generalised our definition of a geometry, let's specialise to a specific sort of geometry somewhat "analogous" to the traditional orthogonal-group (i.e. Euclidean) geometry. We will let our space be $\mathbb{R}^n$ or $\mathbb{C}^n$ but experiment with our group. 
By <i>analogous</i>, we mean not that the geometries are identical, but at least that the same notions -- like lengths and angles and shapes -- can be defined for them, that the ideas in the geometry aren't completely foreign to us.</li><ol type="a"><li>Much like the Orthogonal group can be defined by the invariance of the identity bilinear form $\mathrm{diag}(1,1,1,1)$ under a "<b>bilinear form similarity transformation</b>", more commonly known as a <b>congruence</b> (i.e. $A^T I A = I$), we can consider groups that are defined by some general $A^T M A = M$. </li><li>Obviously, not all matrix groups can be written this way -- for example, any subgroup of $O(n)$ cannot be. But groups of this form define in some sense geometries not very different from Euclidean geometry -- why? Because the form preserved by the Orthogonal group -- the identity form -- is the <i>dot product</i> on Euclidean geometry. <b>Preservation of the identity form is equivalent to the preservation of the Euclidean dot product</b> -- prove this -- which also means lengths and angles are preserved. </li><li>As any dot product is necessarily a bilinear form, it can be represented by a bilinear form $M$ called the <b>metric</b> as $v^T M w$, and its preservation is equivalent to the preservation of $M$ under bilinear form conjugation, i.e. $A^T M A = M$ -- prove this! (the proofs are absurdly trivial).</li><li>Examples of such groups include: the "<b>indefinite orthogonal group</b>", which you may know as the <b>Lorentz group</b> $O(1,3)$ from special relativity, the group of linear transformations that preserve the bilinear form $\mathrm{diag}(-1,1,1,1)$ (called the Minkowski metric). 
Indeed, Minkowskian geometry has notions of length (the spacetime interval) and angles (some combination of rapidity and angles).</li></ol><li>Next, we are interested in thinking about when two geometries are "basically the same".</li><ol type="a"><li>Try to write down some simple two-dimensional geometries -- consider e.g. $M = \left[ {\begin{array}{*{20}{c}}0&{ - 1}\\1&0\end{array}} \right]$. <b>Study some of its properties. </b>(this is "symplectic geometry", by the way) Do you think this is the "same" as Euclidean geometry?</li><li>Think about what kind of properties you looked at to establish the answer as "No". Use them to come up with a simple and snappy definition of two geometries being the same, or isomorphic. </li><li>You should have come to the conclusion -- <b>two geometries on the same manifold are isomorphic iff their groups are isomorphic</b> (If you read this before figuring out the answer for yourself, bleach your brain and try again.) (For legal reasons, that's a joke.)</li><li>Let's study some examples of such an isomorphism. The trivial case is where the groups are equal, e.g. if $M = kN$ or $M = N^T$ (prove these). What about some <b>non-trivial isomorphism</b>? Here's an idea: groups defined by <b>congruent metrics</b> are isomorphic. I.e. if $M=P^TNP$ for some change of basis matrix $P$, the groups $\{A^TMA=M\}$ and $\{A^TNA=N\}$ are isomorphic. Prove this (again, the proof is trivial) -- you will see that the isomorphism is a similarity relation $A \leftrightarrow P^{-1}AP$.</li><li>(Is this an <i>iff</i> statement? If two bilinear form-preserving groups are isomorphic, is there a way to write them as $\{A^HMA=M\}$, $\{A^HNA=N\}$ such that $M$ and $N$ are congruent? I'm not sure. It would suffice to prove that all isomorphisms of a matrix group are similarity transformations. 
Perhaps this is implied by <a href="https://groupprops.subwiki.org/wiki/Isomorphic_iff_potentially_conjugate"><b>Isomorphic iff potentially conjugate</b></a>, but how do we know the conjugacy isn't in some weird group that $GL_{\mathbb{R}}(n)$ is homomorphic to?)</li><li>One can also consider the case of <b>non-invertible</b> $P$ -- do we have an isomorphism between $M$-preservers and $N$-preservers if $M=P^TNP$ for non-invertible $P$? No? <b>What about a homomorphism</b>? In which direction?</li></ol><li>So which geometries are isomorphic to Euclidean geometry? What matrices are congruent to the identity form?</li><ol type="a"><li>No prizes for saying "metric tensors of the form $P^TP$ (or more generally $P^H P$)" for invertible $P$. These matrices -- those that are congruent to the identity matrix -- are called <b>positive-definite </b>matrices. Along the lines of part f above, one can also consider the case of non-invertible $P$ -- these are called <b>positive-semidefinite</b> or <b>nonnegative-definite</b> matrices, and Euclidean geometry is homomorphic to positive-semidefinite geometries.</li><li><div class="twn-pitfall">To be fair, these <em>aren't</em> the only geometries isomorphic to Euclidean geometry -- remember the trivial isomorphisms? So for instance, negative-definite geometries are also isomorphic to Euclidean geometry.</div></li><li>A neat way to visualise these "isomorphisms and homomorphisms of geometries" is by looking at the contours of the geometries, i.e. the set $x^TMx = C$. Positive definite (and negative definite) matrices correspond to <b>elliptical contours</b> (while positive semidefinite matrices correspond to the degenerate cases -- <b>a degenerate ellipse has all the symmetries of a non-degenerate one</b>, but not vice versa), which can easily be stretched into a Euclidean circle. 
Other matrices, on the other hand, may have hyperbolic contours which cannot be similarly deformed into a Euclidean circle.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://2.bp.blogspot.com/-lp2lNn8m3bQ/XK2tBLfnMzI/AAAAAAAAFdc/y6RJTagcuak0TbswEPhYhvW5kEvSk3A0gCLcBGAs/s1600/ellipses.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="414" data-original-width="772" height="213" src="https://2.bp.blogspot.com/-lp2lNn8m3bQ/XK2tBLfnMzI/AAAAAAAAFdc/y6RJTagcuak0TbswEPhYhvW5kEvSk3A0gCLcBGAs/s400/ellipses.png" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">any symmetry of the ellipse/circle can be mapped homomorphically to a symmetry of the straight lines, but not vice versa.</td></tr></tbody></table><div class="separator" style="clear: both; text-align: center;"></div></li><li>Recall that the equation of an ellipse takes the form $\mathop \sum \limits_{i = 1}^n {a_i}{x_i}^2 = C$ for positive $a_i$. So our interpretation above is equivalent to stating that a matrix of the form $P^T P$ or $P^H P$ has <b>positive</b> (for invertible $P$) or <b>non-negative</b> (degenerate ellipse) <b>real eigenvalues</b>(this should be pretty easy to prove).</li><li>More generally, any two congruent matrices have the same numbers of positive, negative and zero eigenvalues (called the positive, <b>negative and zero indices of inertia</b> respectively). This is known as <b>Sylvester's law of inertia</b> (prove it!), and shows that all real-eigenvalued matrices are <b>congruent to a diagonal matrix</b> with some number of 1's, -1's and 0's (and arrangement doesn't matter) -- see also the <a href="https://en.wikipedia.org/wiki/Metric_signature"><b>metric signature</b></a>. 
This gives us a condition to tell if <i>any two</i> matrices are congruent, or any two form-preserving geometries are isomorphic/homomorphic.</li></ol><li>Is it really true, though -- that any geometry with elliptical contours is isomorphic to Euclidean geometry? Come up with a counter-example (and how did you come up with it?)</li><ol type="a"><li>You might consider, e.g. $M=\left[\begin{array}{*{20}{c}}1&{ - 1}\\1&1\end{array}\right]$ -- this produces <i>exactly the same contours</i> as Euclidean geometry -- same unit circle, same everything. But it's not symmetric, and <b>all positive-definite matrices are symmetric/Hermitian </b>(proof is trivial). In fact, <b>any $M$ produces the same contours as $\frac12 (M+M^T)$ -- its "symmetric part" </b>(why?). What's going on? Does the norm (quadratic form) <i>not</i> completely define the dot product (bilinear form)?</li><li>If you think about this for a while, you might get an idea of what's going on -- the <b>symmetric/Hermitian part of a matrix defines the contours on the <i>real part</i> of the vector space</b>, but the antisymmetric/anti-Hermitian part begins to matter in $\mathbb{C}^n$. <b>The contours of the quadratic form in $\mathbb{C}^n$ completely determine the dot product</b>, i.e. if $v^HMv=v^HNv$ for all complex vectors $v$, then $v^HMw=v^HNw$ for all complex vectors $v$ and $w$, i.e. $M=N$. The proof is trivial.</li></ol><li>Next, let's consider some properties and alternate characterisations of positive-definite matrices.</li><ol type="a"><li>From the ellipse depiction, it's reasonable to wonder if a matrix is positive-definite if the norm it induces is positive for all non-zero real vectors, i.e. $v^TMv>0$ (certainly the forward implication -- only if -- is clear, from the $P^TP$ factorisation). As it turns out, though, there are other matrices -- such as a rotation by less than $\pi/2$, that also satisfy this condition. 
It is certainly clear that the condition $v^TMv>0$ <i>combined with</i> $M$ being symmetric implies a matrix is positive-definite -- by completing the square on a symmetric bilinear form -- but any matrix $M$ for which $(M+M^T)/2$ is positive-definite also satisfies $v^TMv>0$ by the same argument (if this seems anything but completely obvious, think about the corresponding quadratic expressions).</li><li>The reason for this annoyance is that in $\mathbb{R}^n$, you have matrices that are pure rotations, with no real eigenvectors, so the non-positiveness of your eigenvalues doesn't have to matter. On the other hand, if you extend our domain to $\mathbb{C}^n$ -- i.e. $v^HMv>0$ for all complex vectors $v$ -- all the eigenvalues are "accounted for". Find a way to write this idea down precisely.</li><li>Here's another way to think about it: if $v^HMv\in\mathbb{R}$ for all complex vectors $v$, the matrix $M$ is Hermitian. The proof is basically just algebraic simplification, considering the imaginary part of $(v+iw)^HM(v+iw)$, which is $v^HMw-w^HMv$ -- which is the "symplectic part" of the general complex dot product. See <a href="https://math.stackexchange.com/a/2843380/"><b>this math stackexchange answer</b></a> for a fuller explanation.</li></ol></ol>To be completely honest, I'm being disingenuous in claiming that this is "the" motivation for positive-definite matrices. 
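Several of the claims in the exercises above -- that $M=P^TP$ for invertible $P$ has positive real eigenvalues, and Sylvester's law of inertia -- are easy to sanity-check numerically. A minimal sketch using NumPy; the particular matrices and the random seed are arbitrary illustrations, not part of the exercises:

```python
import numpy as np

rng = np.random.default_rng(0)

def inertia(A, tol=1e-8):
    """Counts of (positive, negative, zero) eigenvalues of a symmetric A."""
    e = np.linalg.eigvalsh(A)
    return (int(np.sum(e > tol)), int(np.sum(e < -tol)),
            int(np.sum(np.abs(e) <= tol)))

# A matrix congruent to the identity, M = P^T P for (generically) invertible P,
# is positive-definite: all its eigenvalues are positive reals.
P = rng.normal(size=(4, 4))
M = P.T @ P
assert inertia(M) == (4, 0, 0)

# Sylvester's law of inertia: a congruence Q^T N Q with invertible Q
# preserves the counts of positive, negative and zero eigenvalues.
N = np.diag([1.0, 1.0, -1.0, 0.0])
Q = rng.normal(size=(4, 4))
assert inertia(Q.T @ N @ Q) == inertia(N) == (2, 1, 1)
```

Naturally, a numerical spot-check is no substitute for the proofs the exercises ask for.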
There is a completely independent motivation arising from optimisation in multivariable calculus (the second-derivative/Hessian test), a completely independent motivation arising from systems of differential equations, a completely independent motivation arising from covariance matrices, and so on.<br /><br />Perhaps a way of thinking of this is that a positive-definite matrix is simply a normal matrix with positive eigenvalues, and much of this article is really a justification for why positive-definite metric tensors are (un-)interesting.euclidean geometrygeometryhermitian matrixinner productpositive definitenessquadratic formssylvester's law of inertiasymmetric matrixTue, 09 Apr 2019 18:12:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-6647999266945160927Abhimanyu Pallavi Sudhir2019-04-09T18:12:00ZComment by Abhimanyu Pallavi Sudhir on A question about the remainder of a Taylor Polynomial.
https://math.stackexchange.com/questions/3178319/a-question-about-the-remainder-of-a-taylor-polynomial
Your argument is just -- "Why do we do X?" "To find Y." "But finding Y is unnecessary". What's necessary isn't the point, the question asked "why would you find the maximum of this expression?", and the answer is "for example, to put an upper bound on the error".Mon, 08 Apr 2019 04:15:58 GMThttps://math.stackexchange.com/questions/3178319/a-question-about-the-remainder-of-a-taylor-polynomial?cid=6543925Abhimanyu Pallavi Sudhir2019-04-08T04:15:58ZComment by Abhimanyu Pallavi Sudhir on A question about the remainder of a Taylor Polynomial.
https://math.stackexchange.com/questions/3178319/a-question-about-the-remainder-of-a-taylor-polynomial
@Martin You're getting fixated on semantics. The maximum value of the expression in the question is the upper bound on the error. I'm not even sure what we're arguing about.Sun, 07 Apr 2019 19:30:24 GMThttps://math.stackexchange.com/questions/3178319/a-question-about-the-remainder-of-a-taylor-polynomial?cid=6543087Abhimanyu Pallavi Sudhir2019-04-07T19:30:24ZComment by Abhimanyu Pallavi Sudhir on A question about the remainder of a Taylor Polynomial.
https://math.stackexchange.com/questions/3178319/a-question-about-the-remainder-of-a-taylor-polynomial
@Martin I don't think you understood the OP's question (3). He's specifically asking why you would try to find the maximum value of the error. The answer is precisely "to find the upper bound".Sun, 07 Apr 2019 19:10:32 GMThttps://math.stackexchange.com/questions/3178319/a-question-about-the-remainder-of-a-taylor-polynomial?cid=6543047Abhimanyu Pallavi Sudhir2019-04-07T19:10:32ZComment by Abhimanyu Pallavi Sudhir on A question about the remainder of a Taylor Polynomial.
https://math.stackexchange.com/questions/3178319/a-question-about-the-remainder-of-a-taylor-polynomial
@Martin What? I'm responding to (3) -- to find an upper bound, you find the $x^*$ that maximises the expression you find for the error. This is what the OP is referring to.Sun, 07 Apr 2019 15:44:48 GMThttps://math.stackexchange.com/questions/3178319/a-question-about-the-remainder-of-a-taylor-polynomial?cid=6542566Abhimanyu Pallavi Sudhir2019-04-07T15:44:48ZComment by Abhimanyu Pallavi Sudhir on A question about the remainder of a Taylor Polynomial.
https://math.stackexchange.com/questions/3178319/a-question-about-the-remainder-of-a-taylor-polynomial
3) Because that's the maximum error. You're trying to put an upper bound on the error. That's the point of everything else.Sun, 07 Apr 2019 15:22:56 GMThttps://math.stackexchange.com/questions/3178319/a-question-about-the-remainder-of-a-taylor-polynomial?cid=6542507Abhimanyu Pallavi Sudhir2019-04-07T15:22:56ZAnswer by Abhimanyu Pallavi Sudhir for Can non-linear transformations be represented as Transformation Matrices?
https://math.stackexchange.com/questions/450/can-non-linear-transformations-be-represented-as-transformation-matrices/3177854#3177854
0<p>The point of transformation matrices is that the images of the <span class="math-container">$n$</span> basis vectors are sufficient to determine the action of the entire transformation -- this is true for linear transformations, but not for an arbitrary transformation.</p>
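As a concrete toy illustration (the matrix and the nonlinear map below are hypothetical examples, not from the question):

```python
import numpy as np

# The i-th column of a transformation matrix is the image of the i-th
# basis vector; linearity makes those images determine everything.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])

# The image of any vector is the same linear combination of the columns.
x = 5.0 * e1 - 2.0 * e2
assert np.allclose(A @ x, 5.0 * (A @ e1) - 2.0 * (A @ e2))

# For a nonlinear map, the images of the basis vectors say nothing
# about the rest of the domain.
f = lambda v: v ** 2          # squares each component
assert not np.allclose(f(x), 5.0 * f(e1) - 2.0 * f(e2))
```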
<p>However, nonlinear transformations (the smooth ones, anyway) can be locally approximated as linear transformations. With a bit of calculus, you get the "Jacobian matrix", which acts on the tangent vector space at every point on a manifold. This is a generalisation of transformation matrices in the sense that a linear transformation's Jacobian is equal to its matrix representation, i.e. in the same sense that the derivative generalises the slope (which completely determines a linear function <span class="math-container">$y=mx$</span>).</p>Sun, 07 Apr 2019 06:48:31 GMThttps://math.stackexchange.com/questions/450/-/3177854#3177854Abhimanyu Pallavi Sudhir2019-04-07T06:48:31ZAnswer by Abhimanyu Pallavi Sudhir for Why does $A^TA=I, \det A=1$ mean $A$ is a rotation matrix?
https://math.stackexchange.com/questions/68119/why-does-ata-i-det-a-1-mean-a-is-a-rotation-matrix/3177807#3177807
2<p>You could just write out the components to confirm that this is so -- a much more interesting way to understand things, however, is to write down the condition as:</p>
<p><span class="math-container">$$A^TIA=I$$</span></p>
<p>The idea is that the matrix <span class="math-container">$A$</span> <em>preserves the identity quadratic form</em> -- note that <span class="math-container">$I$</span> is a quadratic form here and not a linear transformation, as this is the transformation law for quadratic forms (<span class="math-container">$A^TMA$</span> instead of <span class="math-container">$A^{-1}MA$</span>).</p>
<p>The hyperconic section corresponding to the identity quadratic form is the unit sphere -- thus the orthogonal transformations are all those that preserve the unit sphere. Another way of putting this is that <span class="math-container">$(Ax)^TI(Ay)=x^TA^TIAy=x^TIy$</span>, i.e. the Euclidean dot product <span class="math-container">$I$</span> is preserved by <span class="math-container">$A$</span>. This is equivalent to preserving the unit sphere, because the unit sphere is determined by the dot product on the given space.</p>
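A quick numerical check of this equivalence -- a sketch in NumPy, where the rotation angle and the vectors are arbitrary examples:

```python
import numpy as np

theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # a rotation

# Q preserves the identity form: Q^T I Q = I ...
assert np.allclose(Q.T @ Q, np.eye(2))

# ... which is the same as preserving every Euclidean dot product,
# and hence the unit sphere (|Qx| = |x|).
x, y = np.array([1.0, 2.0]), np.array([-3.0, 0.5])
assert np.isclose((Q @ x) @ (Q @ y), x @ y)
assert np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x))
```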
<p>What sort of transformations preserve the unit sphere? </p>
<hr>
<p>The reason this is a good way of understanding things is that there are plenty of other "dot products" you can define. One elementary one from physics is the Minkowski dot product in special relativity, <span class="math-container">$\mathrm{diag}(-1,1,1,1)$</span> -- the corresponding quadric surface is a hyperboloid, and the transformations that preserve it, forming the Lorentz group, are boosts (skews between time and a spatial dimension), spatial rotations and reflections.</p>
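For instance (a sketch with an arbitrarily chosen rapidity): a boost mixing $t$ and $x$ is not orthogonal, yet it preserves the Minkowski form, just as rotations preserve the identity form.

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])   # Minkowski metric diag(-1,1,1,1)

phi = 0.5                              # rapidity of the boost (illustrative)
L = np.eye(4)
L[0, 0] = L[1, 1] = np.cosh(phi)       # boost mixing the t and x directions
L[0, 1] = L[1, 0] = np.sinh(phi)

# The boost is not orthogonal ...
assert not np.allclose(L.T @ L, np.eye(4))
# ... but it does preserve the Minkowski form: L^T eta L = eta.
assert np.allclose(L.T @ eta @ L, eta)
```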
<hr>
<p>As for discriminating between rotations and reflections, suppose we define rotations in a completely geometric way -- for a matrix to be a rotation, all its eigenvalues are either 1 or in pairs of unit complex conjugates. </p>
<p>What do the eigenvalues of orthogonal matrices look like? For each eigenvalue, you need <span class="math-container">$\overline{\lambda}\lambda=1$</span>, i.e. all the eigenvalues are unit complex numbers. If a complex eigenvalue isn't paired with a corresponding conjugate, you will not get a real-valued transformation on <span class="math-container">$\mathbb{R}^n$</span>. Meanwhile if an eigenvalue of -1 isn't paired with another -1 -- i.e. if there are an odd number of reflections -- you get a reflection. The orthogonal (or rather unitary) transformations that do not behave this way are precisely the rotations.</p>
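These eigenvalue facts can be spot-checked numerically (an illustrative sketch in NumPy; the angle and the particular reflection are arbitrary):

```python
import numpy as np

theta = 1.1
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # rotation
F = np.array([[1.0, 0.0],
              [0.0, -1.0]])                       # reflection across the x-axis

# Every eigenvalue of an orthogonal matrix is a unit complex number.
for A in (R, F):
    assert np.allclose(np.abs(np.linalg.eigvals(A)), 1.0)

# The rotation's eigenvalues form a conjugate pair e^{±iθ}; the
# reflection has an unpaired -1, which the determinant exposes.
assert np.isclose(np.linalg.det(R), 1.0)
assert np.isclose(np.linalg.det(F), -1.0)
```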
<p>The similarity between unpaired unit complex eigenvalues and unpaired -1's is interesting, by the way -- when thinking about reflections, you might have gotten the idea that reflections are <span class="math-container">$\pi$</span>-angle rotations in a higher-dimensional space -- like the vector was rotated through a higher-dimensional space and then landed on its reflection -- like it was a discrete snapshot of a process as smooth as any rotation. </p>
<p>Well, now you know what this higher-dimensional space is -- precisely <span class="math-container">$\mathbb{C}^n$</span>. And the determinant of a unitary matrix also takes a continuous spectrum -- the entire unit circle. In this sense (among other senses) complex linear algebra is more "complete" than real linear algebra.</p>Sun, 07 Apr 2019 05:55:12 GMThttps://math.stackexchange.com/questions/68119/why-does-ata-i-det-a-1-mean-a-is-a-rotation-matrix/3177807#3177807Abhimanyu Pallavi Sudhir2019-04-07T05:55:12ZAnswer by Abhimanyu Pallavi Sudhir for Reasoning about Lie theory and the Exponential Map
https://math.stackexchange.com/questions/19575/reasoning-about-lie-theory-and-the-exponential-map/3177348#3177348
0<p>The identity element <em>does</em> have significance, in the sense that it is the only natural way to think of the elements of the Lie Algebra as infinitesimal generators.</p>
<p>As I explain <a href="https://thewindingnumber.blogspot.com/2019/04/introduction-to-lie-groups.html" rel="nofollow noreferrer">here</a>, the idea is that with elements of the form <span class="math-container">$1+\varepsilon\vec\theta$</span>, elements of the group are generated as </p>
<p><span class="math-container">$$g(\vec\theta)=(1+\varepsilon\vec\theta)^{1/\varepsilon}=\exp\vec\theta$$</span></p>
<p>This map only exists when elements close to the identity are taken, as every element other than the identity is itself a generator (thus elements of the group can simply be generated via real-powers, not infinitesimally).</p>
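The generation formula can be checked numerically for a matrix group -- below, a sketch for the rotation group, where $(1+X/n)^n$ applied to the generator of $\mathfrak{so}(2)$ converges to the rotation by <span class="math-container">$\theta$</span> (the angle and $n$ are arbitrary choices):

```python
import numpy as np

theta = 0.9
X = np.array([[0.0, -theta],
              [theta, 0.0]])            # element of the Lie algebra so(2)

# (1 + X/n)^n for large n approaches the group element exp(X) ...
n = 1_000_000
approx = np.linalg.matrix_power(np.eye(2) + X / n, n)

# ... which for so(2) is the rotation by theta.
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
assert np.allclose(approx, rotation, atol=1e-5)
```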
<p><img src="https://i.stack.imgur.com/0AC5rm.png" width="500" /></p>Sat, 06 Apr 2019 19:21:58 GMThttps://math.stackexchange.com/questions/19575/-/3177348#3177348Abhimanyu Pallavi Sudhir2019-04-06T19:21:58ZIntroduction to Lie groups
https://thewindingnumber.blogspot.com/2019/04/introduction-to-lie-groups.html
0When you first learned about cyclic groups, the picture in your head was that of the unit circle (complex numbers with norm one). Sure, the unit circle isn't actually a cyclic group, but it really <i>feels</i> like one. When I <a href="https://thewindingnumber.blogspot.com/2018/12/intuition-analogies-and-abstraction.html">first motivate group theory</a>, I even base the motivation on the close similarities between the circle group and the modular addition group $\mathbb{Z}/p\mathbb{Z}$. Indeed, the circle group is just the group of real numbers mod $2\pi$.<br /><br />The solution to this problem can be seen from the quickest proof that the unit circle isn't cyclic -- the fact that it isn't countable (while the integers are). Well, what if we <b>discard the centrality of the integers to our definition of a cyclic group and admit real powers on groups</b>?<br /><br />Ok, but how? It's easy to construct integer powers on an arbitrary group -- in terms of repeated application of the group operation (which defines natural powers) and inverses. But the real numbers are a wholly different beast -- they require a nice and connected <b>"smooth" structure, a geometry</b> on the group. We can certainly visualise this geometry on the unit circle or the positive real numbers (which is also "real-power cyclic"), but it's interesting to think about how one might introduce such a geometry on other groups (groups that admit such a geometry are called <b>Lie groups</b>).<br /><br />Well, if you think for a while, you might get the idea of defining a group via a real-number <i>parameterisation</i> $\mathbb{R}\to G$. The unit circle can be parameterised as $g(\theta)=\exp i\theta$, the positive real numbers can be parameterised as $g(\xi)=\exp\xi$, etc. This parameterisation would then give $\exp rt =(\exp t)^r$ for real powers $r$ of elements in the group.<br /><br />But here's the thing -- we could have introduced any sort of ugly and terrible parameterisation for our group. 
We knew how the parameterisation <i>should</i> look for the unit circle, but we could just as well have created something definitely not smooth -- like mapping $\pi i$ to $-1$ and mapping $(\pi + \varepsilon)i$ to $i$ (sorry, you can't use too much dramatic hyperbole on the unit circle... fine, let's map it to 30 gazillion, which isn't on the unit circle, but whatever), and the real-power would look ridiculous, not at all what we want, and we may not even have a "real-power cyclic" structure.<br /><br />What exactly do we <i>want</i> from our parameterisation?<br /><br /><hr /><br />Let's think about what a generator looks like with real powers on the unit circle. Well, really any non-identity element $e^{i\theta_0}$ can generate the group (take it to the power of $\theta/\theta_0$), but if we want to emulate the case of the integers under addition $\{...a^{-2},a^{-1},1,a,a^2,a^3,...\}$, we'd like to call the element really close to 1 the generator. Well, there's no element that's really close to 1, so we're talking about some kind of an infinitesimal thing. This is called an <b>infinitesimal generator</b> of the Lie group.<br /><br />In the first-order approximation, such an element would be of the form $1+i\varepsilon$. By making $\varepsilon$ sufficiently small, the element will be sufficiently close to being "on the unit circle", with an arc length of $\varepsilon$ away from the identity, and the real power $r$ of the element will have an arc length of $r\varepsilon$ away from the identity. So to generate the element with parameter $\theta$, we need to take $1+i\varepsilon$ to a real power of $r=\theta/\varepsilon$. I.e.<br /><br />$$g(\theta) = \lim\limits_{\varepsilon\to 0} (1+i\varepsilon)^{\theta/\varepsilon}=\lim\limits_{r\to\infty} \left(1+\frac{i\theta}{r}\right)^{r}=\exp i\theta$$<br />If you were studying calculus for the first time, this is really solid intuition for Euler's formula. 
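That limit is easy to watch converge numerically (plain Python complex arithmetic; the angle is an arbitrary example):

```python
import cmath

theta = 2.0

def euler_limit(theta, n):
    # (1 + i*theta/n)^n, the finite-n approximation to exp(i*theta)
    return (1 + 1j * theta / n) ** n

# The approximation error shrinks roughly like 1/n.
errors = [abs(euler_limit(theta, n) - cmath.exp(1j * theta))
          for n in (10, 1000, 100000)]
assert errors[0] > errors[1] > errors[2]
assert errors[2] < 1e-4
```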
You can also go in the other direction and say it's solid intuition for the compound-interest limit.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-tNdLgOkHWM0/XKj6m_2NmYI/AAAAAAAAFcs/9G_0nKihIV0aobMY4vaPBvAaXB54X9w-ACLcBGAs/s1600/llie.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="708" data-original-width="931" height="303" src="https://1.bp.blogspot.com/-tNdLgOkHWM0/XKj6m_2NmYI/AAAAAAAAFcs/9G_0nKihIV0aobMY4vaPBvAaXB54X9w-ACLcBGAs/s400/llie.png" width="400" /></a></div><br />$$\lim\limits_{\varepsilon\to 0} (1+\varepsilon\theta t)^{1/\varepsilon} = \exp(t\theta)$$<br />But here, we can view it in a more general light, and say this is the definition of the <b>exponential map </b>to a Lie group. What exactly is it a map <i>from</i>? I.e. what is the parameterising space? Well, as you can see, it maps an element $i\theta$ to the group parameterised by $\theta$ -- what is $i\theta$? It is<br /><br />$$\lim\limits_{\varepsilon \to 0} \frac{{(1 + i\varepsilon \theta ) - 1}}{\varepsilon }$$<br />I.e. these are the elements that span the <i>tangent line</i> to the group at 1. In general, one may have more dimensions to this group, i.e. more parameters to put in the smooth parameterisation -- in this case we have:<br /><br />$$ g(\theta ) = \lim\limits_{\varepsilon \to 0} {\left( {1 + \varepsilon ({t_1}{\theta _1} + \ldots {t_n}{\theta _n})} \right)^{1/\varepsilon }} = \exp \vec \theta $$<br />where $\vec\theta \in V$, which is a <i>vector space</i> with basis $\langle t_1 \dots t_n \rangle$ -- the tangent space to the group at the identity. This vector space is called the <b>Lie algebra</b> of the Lie group.<br /><br />Take a moment to appreciate the significance of this -- smoothness tells us (sorta) that a function or structure can be determined by the values of all its derivatives at a point. 
But when you add the group structure -- when you require an exponential structure for the parameterisation, i.e. <b>(1)</b> $g(\theta_1+\theta_2) = g(\theta_1)g(\theta_2)$; <b>(2)</b> $g(r\theta)=g(\theta)^r$; <b>(3)</b> $g(0)=1$ -- just the <i>first</i> derivative, the tangent plane, determines the entire parameterisation. This is precisely analogous to how a smooth function known to have an exponential structure $e^{tx}$ can be determined from its first derivative alone. The structure of a Lie group is "fundamentally exponential".<br /><br /><hr /><br />Here's another way to see how the additivity-multiplicativity condition allows the first derivative to determine the entire parameterisation. The Taylor series of the parameterisation is given by:<br /><br />$$g(\theta)=\sum\limits_{k=0}^\infty \frac{g^{(k)}(0)}{k!}\theta^k$$<br />Meanwhile the exponential map is:<br /><br />$$\exp \left(\theta g'(0)\right) =\sum\limits_{k=0}^\infty \frac{\left(\theta g'(0)\right)^k}{k!}$$<br />So a sufficient condition for the two to be equal is:<br /><br />$$g^{(k)}(0)=g'(0)^k$$<br />This is something that is true for exponential functions, of course, but what's the condition for it to be true in general? Writing both sides in limit form and using the Binomial theorem on the right,<br /><br />$$\frac{1}{h^k}\sum\limits_{j = 0}^k \binom{k}{j}(-1)^j g\left((k - j)h\right) = \frac{1}{h^k}\sum\limits_{j = 0}^k \binom{k}{j}(-1)^j g(h)^{k-j}\,g(0)^j$$<br />which is true since $g((k-j)h) = g(h)^{k-j}$ and $g(0)=1$.<br /><br /><hr /><br />(something to note: the "official" word for <em>real-power cyclic</em> is "one-parameter group" or "one-dimensional Lie group". Higher dimensional groups have more generators, i.e. 
more dimensions)<br /><br />Show, from the $(1+X/r)^r$ definition of the exponential map, that it can be given by the standard Taylor expansion:<br /><br />$$\exp X = 1 + X + \frac{X^2}{2!} + \ldots $$<br />You can't really assume the Binomial theorem (as it is only true on commutative rings, and the ring of $n$-dimensional matrices -- which is the ring that we embed Lie groups and their Lie algebras in -- isn't commutative), but perhaps a weaker result holds? What kind of elements still commute on general rings?differential geometryexponential maplie algebralie groupslie theorymathematicsSat, 06 Apr 2019 16:29:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-1590002619267762711Abhimanyu Pallavi Sudhir2019-04-06T16:29:00ZComment by Abhimanyu Pallavi Sudhir on Rindler Metric and Minkowski metric
https://physics.stackexchange.com/questions/440772/rindler-metric-and-minkowski-metric/440779#440779
How are you defining the Riemann curvature? Under the standard definition, invariance under parallel transport is definitionally equivalent to zero Riemann curvature, not something that needs to be shown. More importantly, it seems this answer just shifts/generalises the work to proving that any manifold with zero Riemann curvature is locally Minkowski. While true, this is pretty much exactly what the asker seemed to want a proof of.Sat, 06 Apr 2019 06:23:19 GMThttps://physics.stackexchange.com/questions/440772/rindler-metric-and-minkowski-metric/440779?cid=1058597#440779Abhimanyu Pallavi Sudhir2019-04-06T06:23:19ZAnswer by Abhimanyu Pallavi Sudhir for Binomial product expansion
https://math.stackexchange.com/questions/1331401/binomial-product-expansion/3172053#3172053
0<p>It is not a generalisation of the Binomial theorem because the exponent of <span class="math-container">$c$</span> isn't really handled -- they just took it outside. If you were to expand out the right-hand-side, you would have a generalisation of the Binomial theorem.</p>Tue, 02 Apr 2019 16:08:25 GMThttps://math.stackexchange.com/questions/1331401/-/3172053#3172053Abhimanyu Pallavi Sudhir2019-04-02T16:08:25ZComment by Abhimanyu Pallavi Sudhir on Why isn't acceleration always zero whenever velocity is zero, such as the moment a ball bounces off a wall?
https://physics.stackexchange.com/questions/469488/why-isnt-acceleration-always-zero-whenever-velocity-is-zero-such-as-the-moment
The derivative of the zero <i>function</i> is zero, not the derivative of a function whenever it is zero.Sat, 30 Mar 2019 15:35:24 GMThttps://physics.stackexchange.com/questions/469488/why-isnt-acceleration-always-zero-whenever-velocity-is-zero-such-as-the-moment?cid=1055601Abhimanyu Pallavi Sudhir2019-03-30T15:35:24ZComment by Abhimanyu Pallavi Sudhir on Galilei Invariance and Newton Third Law
https://physics.stackexchange.com/questions/469471/galilei-invariance-and-newton-third-law
Newton's third law is conservation of momentum, which is equivalent to translational invariance by Noether's theorem -- so yes, this follows from your homogeneity assumptions.Sat, 30 Mar 2019 05:07:01 GMThttps://physics.stackexchange.com/questions/469471/galilei-invariance-and-newton-third-law?cid=1055426Abhimanyu Pallavi Sudhir2019-03-30T05:07:01ZAnswer by Abhimanyu Pallavi Sudhir for Intuition for the exponential of a matrix
https://math.stackexchange.com/questions/1213264/intuition-for-the-exponential-of-a-matrix/3165551#3165551
1<p>When I first learned about cyclic groups, the picture that I always had in my head was of the unit circle in the complex plane -- imagine my shock when I realised it wasn't a cyclic group at all! But I really <em>wanted</em> it to be cyclic, because it shared some really interesting properties with cyclic groups (see my post <em><a href="https://thewindingnumber.blogspot.com/2018/12/intuition-analogies-and-abstraction.html" rel="nofollow noreferrer">Intuition, analogies and abstraction</a></em>).</p>
<p>The solution to the problem can be seen directly from the quickest proof that the unit circle isn't cyclic -- the fact that it isn't countable (while the integers are). So here's an idea: let's admit <em>real powers on groups</em>!</p>
<p>OK, but how? We know the construction of integer powers on an arbitrary group, and we know how real powers work on the unit circle or the real line (which is also real-power cyclic*, by the way) -- conventionally <span class="math-container">$x^r=\exp(r\log x)$</span>, with <span class="math-container">$\exp$</span> given by its power series expansion.</p>
<p>But sticking just to our intuition for now, it would seem like the natural way to define a real power is to introduce a real-number parameterisation to our group -- for example, the circle group can be parameterised by <span class="math-container">$\theta$</span> and each element of the group is given by some <span class="math-container">$g(\theta)$</span>. Then real powers would look like <span class="math-container">$g(\theta)^r=g(r\theta)$</span>. In the case of a one-parameter group, we also have <span class="math-container">$g(\theta_1+\theta_2)=g(\theta_1)g(\theta_2)$</span>, but don't get too attached to this.</p>
<p>If you think about it, we've now just given some <em>additional structure</em> to our group -- a geometric structure in addition to the group structure.</p>
<p>But frankly, introducing a parameterisation in this way is a bit hand-wavy. We knew what parameterisation to introduce for the circle group because we already have a picture of its geometry in our heads, but in principle, we could've introduced really any kind of ridiculous parameterisation and given it a really ugly structure and an ugly real-power. What we need is a sensible, systematic way to introduce this parameterisation -- i.e. to think about what this parameter space really <em>is</em>.</p>
<p>The answer to the question comes from Euler's formula, which relates addition on the imaginary line to multiplication on the unit circle. </p>
<p><span class="math-container">$$\exp(i\theta)=g(\theta)$$</span></p>
<p>What significance does the imaginary line have to the unit circle? Well, something interesting is that the tangent to the unit circle at 1 is parallel to the imaginary line, i.e. all its elements are of the form <span class="math-container">$1+it$</span>. So an idea for the parameterisation is that the parameter space is the tangent space at the identity of the group -- this is the Lie algebra of the group.</p>
<p>(You still need to prove that this actually works in general -- this has to do with proving that all derivatives of the exponential map at the identity can be recovered as <span class="math-container">$g^{(k)}(0)=(g'(0))^k$</span> -- this is a property of exponential functions of the form <span class="math-container">$g(t)=e^{bt}$</span>, and is part of the "exponential structure" of the Lie Algebra/Lie Group correspondence.)</p>
<p>This is not too bad! It's not completely absurd to think about the "vicinity of the identity" of at least matrix groups, so it's not absurd to think about tangent spaces to these groups. This is where you see arguments like <span class="math-container">$(1+\varepsilon t)^T(1+\varepsilon t)=1+\varepsilon(t+t^T)$</span> implying the tangent space to the orthogonal group is the algebra of antisymmetric matrices, etc. -- if you have some notion of perturbing an element in your group, you can construct a Lie algebra parameterisation of it.</p>
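<p>(For matrix groups this correspondence is easy to play with directly. Here's a quick SymPy sketch -- my own illustration, not part of the argument above: exponentiate an antisymmetric matrix, i.e. a tangent vector at the identity, and check that it lands in the orthogonal group.)</p>

```python
import sympy as sp

t = sp.Symbol('t', real=True)

# An antisymmetric matrix: a tangent vector at the identity of the
# rotation group SO(2)
a = sp.Matrix([[0, -t], [t, 0]])

# The exponential map carries it into the group...
g = sp.simplify(a.exp())

# ...and the result is indeed orthogonal: g^T g = I
assert sp.simplify(g.T * g) == sp.eye(2)
```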
<hr>
<p>*To the best of my knowledge, "real-power cyclic" is not a real word -- the conventional term is "one-parameter Lie group".</p>
<p>See my post <a href="https://thewindingnumber.blogspot.com/2019/04/introduction-to-lie-groups.html" rel="nofollow noreferrer">Introduction to Lie groups</a> for a more complete treatment.</p>Thu, 28 Mar 2019 06:29:55 GMThttps://math.stackexchange.com/questions/1213264/-/3165551#3165551Abhimanyu Pallavi Sudhir2019-03-28T06:29:55ZInvariant and generalised eigenspaces; Jordan normal form
https://thewindingnumber.blogspot.com/2019/03/invariant-and-generalised-eigenspaces.html
0We defined "eigenvectors" -- or really "eigenlines" -- in order to understand the behaviour of linear transformations as scalings across certain axes (which may be complex, and the scalings may be complex too). But once we think of eigenlines simply as 1-dimensional spaces that a transformation leaves invariant (the fancy phrase here is: "the 1-dimensional subspaces on which it is an <i>endomorphism</i>"), it is natural to wonder about higher-dimensional invariant spaces -- subspaces on which some transformation $A$ acts as an endomorphism.<br /><br />The problem is that any transformation has all sorts of useless invariant subspaces -- for instance any transformation $A:F^n\to F^m$ (where $m \le n$) has the entirety of $F^n$ as an invariant subspace (for any $F$), and rotations -- although fully described by their eigenvectors and eigenvalues -- have a bunch of real and complex planes as unnecessary invariant subspaces. And if $A$ has an eigenvalue with geometric multiplicity $>1$, there are an infinite number of useless invariant subspaces.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-2XqH5nEblpk/XJuRC6SnkdI/AAAAAAAAFaw/eW2lmNYa3hg0xZV_Cc8XXWZNJ3SPlaZjACLcBGAs/s1600/img-jM6pgi.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="524" data-original-width="856" height="195" src="https://1.bp.blogspot.com/-2XqH5nEblpk/XJuRC6SnkdI/AAAAAAAAFaw/eW2lmNYa3hg0xZV_Cc8XXWZNJ3SPlaZjACLcBGAs/s320/img-jM6pgi.jpg" width="320" /></a></div><br />Specifically, if the goal is to find useful representations of <b>defective matrices</b> (non-diagonalisable), invariant subspaces seem completely useless -- they certainly have no hope of giving us any sort of unique representation. Perhaps more to the point, our "eigenlines" have <b>corresponding eigenvalues</b> that tell us <i>how</i> the transformation behaves within an eigenline. 
Our invariant subspaces currently have nothing of the sort -- the transformation can have <i>any sort of behaviour</i> on the invariant subspace -- rotation, skewing/scaling, shearing, skewering -- and we'd have no idea. We need a convenient way to <b>write down the behaviour of the transformation on an invariant subspace</b>.<br /><br />Here's something we can start to think about: ordinary eigenvectors satisfy $(A-\lambda I)v=0$, which gives us a one-dimensional solution space. In analogy with solutions to linear differential equations (linear homogeneous if you use the conventional terminology, but I reserve "affine" for linear non-homogeneous), an equation like<br /><br />$$(A-\lambda_1 I)(A-\lambda_2 I)v=0$$<br />(where ${\lambda _1},{\lambda _2}$ are both eigenvalues of $A$) would have a 2-dimensional solution space, etc.<br /><br /><div class="twn-pitfall">Note that when $\lambda_1, \lambda_2$ are not eigenvalues, we <em>don't</em> have a 2-dimensional solution space (what does the solution space look like then?). Why does it work with differential equations for any $\lambda_1, \lambda_2$? (hint: what do the eigenvalues look like?)</div><br />It's sensible to ask: are these solution spaces the same as our invariant subspaces? I.e. is every member of a $k$-dimensional invariant subspace a solution to an equation of the form<br /><br />$$(A - {\lambda _1}I)...(A - {\lambda _k}I)v = 0$$<br />for eigenvalues $\lambda_1,...\lambda_k$?<br /><br />The answer is <i>yes</i>. I encourage you to try and prove it for yourself -- it is instructive to first consider special cases: (i) rotation in a plane, where indeed $(A-iI)(A+iI)=A^2+1=0$ is the minimal polynomial of $A$ (ii) more generally, $F^n$ is an invariant subspace for all isomorphisms, and indeed for all $v$ in this subspace (i.e. 
$\forall v \in F^n$), $p(A)v=0$ where $p$ is the characteristic polynomial of $A$ <b>by the Cayley-Hamilton theorem</b>.<br /><br />The key to proving that every invariant subspace is given by solutions to an equation of the form $(A - {\lambda _1}I)...(A - {\lambda _k}I)v = 0$ (and vice versa) lies in recognising that on any $k$-dimensional invariant subspace, $A$ acts as an endomorphism, and therefore the Cayley-Hamilton theorem applies to it, with a degree-$k$ characteristic polynomial.<br /><br /><div class="twn-furtherinsight">I encourage you to spend some time thinking about this -- try relating it to differential equations. Come up with another proof of the statement -- an inductive one. See if this results in a better intuition for the Cayley-Hamilton theorem.</div><br /><div class="twn-furtherinsight">Is it true that there are $2^n$ invariant subspaces of any transformation on an $n$-dimensional linear space? What about the identity transformation?</div><br /><hr /><br />Now, let's discard the invariant subspaces we don't want. We already know how to handle cases with distinct eigenvalues -- i.e. we have distinct eigenvalues in $\lambda_1...\lambda_k$ -- we just get an eigenvector for each eigenvalue. So we're really just concerned with subspaces of the form ${(A - \lambda I)^k}v = 0$. This is analogous to linear differential equations with repeated roots being weirder than ones with distinct roots.<br /><br />Note that we still know how to handle ${(A - \lambda I)^k}v = 0$-like equations when the algebraic multiplicity is accounted for by geometric multiplicity -- when this is the case, you can reduce the power from $k$ (by subtracting from it the geometric multiplicity). 
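A quick SymPy check of the dimensions involved (the matrix here is my own illustrative example, not anything from above):

```python
import sympy as sp

# A defective matrix: eigenvalue 2 with algebraic multiplicity 2
# but geometric multiplicity 1
A = sp.Matrix([[2, 1], [0, 2]])

# (A - 2I)v = 0 has a 1-dimensional solution space (one eigenline)...
assert len((A - 2 * sp.eye(2)).nullspace()) == 1

# ...but (A - 2I)^2 v = 0 has a 2-dimensional one
assert len(((A - 2 * sp.eye(2)) ** 2).nullspace()) == 2
```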
This nuance doesn't exist with differential equations, because distinct eigenvectors have distinct eigenvalues.<br /><br /><b>Vectors satisfying such an equation are called generalised eigenvectors</b> of order $k$, where $k$ is the minimum value for which they satisfy the equation, and the invariant subspaces formed by generalised eigenvectors of the same eigenvalue are called <b>generalised eigenspaces</b>. The dimension of the generalised eigenspace always equals the algebraic multiplicity, unlike the eigenspace, whose dimension equals the geometric multiplicity. <br /><br /><div class="twn-furtherinsight">Check this, and that $k$ is the difference between algebraic and geometric multiplicity.</div><br />What kind of transformations precisely do generalised eigenvectors with degree greater than 1 correspond to? Clearly, skews and rotations are out of the question. But some insight can be gained from looking at the nature of skews and rotations on a plane.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://2.bp.blogspot.com/-TnN1ZoPfEog/XJvBG5uw8AI/AAAAAAAAFbQ/WDUumo2IU4wsp8YSSpr3JuXINgl-scXvACLcBGAs/s1600/shearskewrot2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1380" data-original-width="1020" height="640" src="https://2.bp.blogspot.com/-TnN1ZoPfEog/XJvBG5uw8AI/AAAAAAAAFbQ/WDUumo2IU4wsp8YSSpr3JuXINgl-scXvACLcBGAs/s640/shearskewrot2.png" width="472" /></a></div><br />In two dimensions, a characteristic polynomial with a positive discriminant yields a skew along some axis, a negative discriminant yields a rotation, and the case we're interested in -- the presence of repeated roots -- corresponds to the point "between" skews and rotations (speaking hand-wavily), shears.<br /><br />In a more general setting, if one has ${(A - \lambda I)^k}v_k = 0$ for a generalised eigenvector $v_k$ of degree $k$, then one can extract generalised eigenvectors of each degree lower:<br 
/><br />$$\begin{array}{l}{(A - \lambda I)^i}{(A - \lambda I)^{k - i}}{v_k} = 0\\ \Rightarrow {v_i} = {(A - \lambda I)^{k - i}}{v_k}\end{array}$$<br />Implying that the generalised eigenvectors with the same eigenvalue (can) form a basis for the corresponding generalised eigenspace.<br /><br />$$\begin{array}{*{20}{r}}{(A - \lambda I){v_1} = 0 \Leftrightarrow A{v_1} = \lambda {v_1}}\\{{{(A - \lambda I)}^2}{v_2} = 0 \Leftarrow (A - \lambda I){v_2} = {c_1}{v_1} \Leftrightarrow A{v_2} = {c_1}{v_1} + \lambda {v_2}}\\{{{(A - \lambda I)}^3}{v_3} = 0 \Leftarrow (A - \lambda I){v_3} = {c_2}{v_2} \Leftrightarrow A{v_3} = {c_2}{v_2} + \lambda {v_3}}\\{ \vdots }\\{{{(A - \lambda I)}^k}{v_k} = 0 \Leftarrow (A - \lambda I){v_k} = {c_{k - 1}}{v_{k - 1}} \Leftrightarrow A{v_k} = {c_{k - 1}}{v_{k - 1}} + \lambda {v_k}}\end{array}$$<br />This gives us a very clear picture of how these general "sheary" transformations look in the basis of the generalised eigenvectors -- there is a shear in each plane $\left\langle {{v_{k - 1}},{v_k}} \right\rangle$. Suitable scalings could of course be chosen to make all the $c_i=1$.<br /><br />It's hard to overstate the significance of this -- what we've just found is that <i>any</i> defective matrix can be decomposed into shears on each of its generalised eigenspaces. This <i>completely classifies</i> the diagonalisability of matrices. If it's defective, it's a shear.<br /><br /><div class="twn-furtherinsight">Draw out some of these transformations in three or more dimensions.</div><br /><div class="twn-furtherinsight">Notice the directions of the implication signs -- can we make them double-sided? What if we have some geometric multiplicity? How would our shears look then? How many dimensions do you need to visualise this?</div><br /><div class="twn-furtherinsight">Think about why this characterisation of defective matrices makes sense. 
What effect does adding a 1 to the subdiagonal have on the determinant? Why? (hint: area of a parallelogram) What about the other coefficients of the characteristic polynomial? (hint: think of these in terms of traces of some matrix). So these matrices are precisely those which have the same characteristic polynomial as a diagonalisable matrix without actually being similar to one.</div><br /><hr /><br />Clearly, the generalised eigenspaces of a transformation are pairwise disjoint (i.e. intersect only at the origin). Since all eigenvalues are being considered, together they span all of $F^n$. Thus the union of their bases forms a basis for $F^n$. This gives us a representation of the transformation $A$ in this basis.<br /><br />From the last section, it is clear that the effective transformation on a $k$-dimensional generalised eigenspace with eigenvalue $\lambda$ (called a "<b>Jordan block</b>") is given by:<br /><br />$$\left[ {\begin{array}{*{20}{c}}\lambda &0&0& \cdots &0\\1&\lambda &0& \cdots &0\\0&1&\lambda & \cdots &0\\ \vdots & \vdots & \vdots & \ddots & \vdots \\0&0&0& \cdots &\lambda \end{array}} \right]$$<br />The <b>Jordan normal form</b> of a transformation is then the matrix formed by putting all the Jordan blocks along the diagonal -- i.e. the representation of $A$ in its generalised eigenbasis.<br /><br /><div class="twn-furtherinsight">Some writers define the Jordan normal form with the 1's on the <em>super</em>diagonal. It should be clear to you from the work we've done that this form is obtained by taking the basis vectors in reverse order (i.e. changing the basis to $\langle e_n,...,e_1 \rangle$). If this isn't clear to you, go back and work through the previous section once more. 
Loop.</div><br />(Stay tuned for the <a href="https://thewindingnumber.blogspot.com/2019/02/all-matrices-can-be-diagonalised.html">next article</a> to see how we can -- instead of defining generalised eigenspaces whose dimension is defined by the algebraic, rather than geometric multiplicity -- "force" algebraic multiplicity to equal geometric multiplicity, i.e. diagonalise any matrix -- with a ring extension.)diagonalisationeigenvaluesgeneralised eigenspacesgeneralised eigenvectorsinvariant subspacejordan normal formlinear algebraWed, 27 Mar 2019 18:15:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-7402349937565513701Abhimanyu Pallavi Sudhir2019-03-27T18:15:00ZComment by Abhimanyu Pallavi Sudhir on What's the generalisation of the quotient rule for higher derivatives?
https://math.stackexchange.com/questions/5357/whats-the-generalisation-of-the-quotient-rule-for-higher-derivatives/3128733#3128733
Ah, wait -- it seems it can be proven from the expression in giuseppe's answer.Sun, 10 Mar 2019 15:32:07 GMThttps://math.stackexchange.com/questions/5357/whats-the-generalisation-of-the-quotient-rule-for-higher-derivatives/3128733?cid=6473740#3128733Abhimanyu Pallavi Sudhir2019-03-10T15:32:07ZComment by Abhimanyu Pallavi Sudhir on What's the generalisation of the quotient rule for higher derivatives?
https://math.stackexchange.com/questions/5357/whats-the-generalisation-of-the-quotient-rule-for-higher-derivatives/3128733#3128733
Did you find it?Mon, 04 Mar 2019 22:00:03 GMThttps://math.stackexchange.com/questions/5357/whats-the-generalisation-of-the-quotient-rule-for-higher-derivatives/3128733?cid=6459523#3128733Abhimanyu Pallavi Sudhir2019-03-04T22:00:03ZAnswer by Abhimanyu Pallavi Sudhir for What's the generalisation of the quotient rule for higher derivatives?
https://math.stackexchange.com/questions/5357/whats-the-generalisation-of-the-quotient-rule-for-higher-derivatives/3131947#3131947
1<p>I'm checking @Mohammad Al Jamal's formula with SymPy, and I can verify it's true (barring a missing <span class="math-container">$(-1)^k$</span> term) for up to <span class="math-container">$n = 16$</span>, at least (it gets really slow after that).</p>
<pre>
import sympy as sp
k = sp.Symbol('k'); x = sp.Symbol('x'); f = sp.Function('f'); g = sp.Function('g')
n = 0
while True:
    fgn = sp.diff(f(x) / g(x), x, n)
    guess = sp.summation((-1) ** k * sp.binomial(n + 1, k + 1) \
        * sp.diff(f(x) * (g(x)) ** k, x, n) / (g(x) ** (k + 1)), (k, 0, n))
    print("{} for n = {}".format(sp.expand(guess - fgn) == 0, n))
    n += 1  # runs indefinitely -- interrupt when satisfied
</pre>
<p>This is quite surprising to me -- I didn't expect there to be such a simple and straightforward expression for <span class="math-container">$(f(x)/g(x))^{(n)}$</span>, and haven't seen his formula anywhere before. I tried some inductive proofs, but I haven't succeeded in proving it yet.</p>Sat, 02 Mar 2019 00:15:06 GMThttps://math.stackexchange.com/questions/5357/-/3131947#3131947Abhimanyu Pallavi Sudhir2019-03-02T00:15:06ZComment by Abhimanyu Pallavi Sudhir on Why didn't Lorentz conclude that no object can go faster than light?
https://physics.stackexchange.com/questions/461833/why-didnt-lorentz-conclude-that-no-object-can-go-faster-than-light
I don't think this should be migrated -- there's a serious question here about how relativity -- the <i>postulates</i> of relativity -- actually leads to the conclusion that no object can go faster than light. The explanation is not historical, I think.Wed, 20 Feb 2019 21:53:41 GMThttps://physics.stackexchange.com/questions/461833/why-didnt-lorentz-conclude-that-no-object-can-go-faster-than-light?cid=1037208Abhimanyu Pallavi Sudhir2019-02-20T21:53:41ZComment by Abhimanyu Pallavi Sudhir on Why didn't Lorentz conclude that no object can go faster than light?
https://physics.stackexchange.com/questions/461833/why-didnt-lorentz-conclude-that-no-object-can-go-faster-than-light/461863#461863
@JdeBP Funny! What search term did you use to uncover that?Wed, 20 Feb 2019 18:03:26 GMThttps://physics.stackexchange.com/questions/461833/why-didnt-lorentz-conclude-that-no-object-can-go-faster-than-light/461863?cid=1037125#461863Abhimanyu Pallavi Sudhir2019-02-20T18:03:26ZAnswer by Abhimanyu Pallavi Sudhir for Why didn't Lorentz conclude that no object can go faster than light?
https://physics.stackexchange.com/questions/461833/why-didnt-lorentz-conclude-that-no-object-can-go-faster-than-light/461863#461863
12<p>Because typically if you find an expression that seems to break down at some value of <span class="math-container">$v$</span>, you would conclude that the expression simply loses its validity for that value of <span class="math-container">$v$</span>, not that the value isn't attainable. Presumably this was the conclusion of Lorentz and others.</p>
<p>The reason Einstein concluded otherwise is that special relativity gives a physical argument for "superluminal speeds are equivalent to time running backwards" -- the argument is "does a superluminal ship hit the iceberg before or after its headlight does?" </p>
<p>This depends on the observer, and because the headlight would melt the iceberg, the consequences of each observation are noticeably different. The only possible conclusions are "superluminal ships don't exist", "time runs backwards for superluminal observers", or "iceberg-melting headlights don't exist".</p>Wed, 20 Feb 2019 10:43:02 GMThttps://physics.stackexchange.com/questions/461833/-/461863#461863Abhimanyu Pallavi Sudhir2019-02-20T10:43:02ZAll matrices can be diagonalised over R[X]/(X^n)
https://thewindingnumber.blogspot.com/2019/02/all-matrices-can-be-diagonalised.html
0This post follows from my answer to the math stackexchange question <a href="https://math.stackexchange.com/questions/472915/what-kind-of-matrices-are-non-diagonalizable/3097881" style="font-weight: bold;">What kind of matrices are non-diagonalisable?</a><br /><br /><hr />Non-diagonalisable 2 by 2 matrices can be diagonalised over the <a href="https://en.wikipedia.org/wiki/Dual_number">dual numbers</a> -- and the "weird cases" like the Galilean transformation are not fundamentally different from the nilpotent matrices.<br /><br />The intuition here is that the Galilean transformation is sort of a "boundary case" between real-diagonalisability (skews) and complex-diagonalisability (rotations) (which you can sort of think in terms of discriminants). In the case of the Galilean transformation $\left[\begin{array}{*{20}{c}}{1}&{v}\\{0}&{1}\end{array}\right]$, it's a small perturbation away from being diagonalisable, i.e. it sort of has "repeated eigenvectors" (you can visualise this with <a href="https://shadanan.github.io/MatVis/">MatVis</a>). So one may imagine that the two eigenvectors are only an "epsilon" away, where $\varepsilon$ is the unit dual satisfying $\varepsilon^2=0$ (called the "soul"). Indeed, its characteristic polynomial is:<br /><br />$$(\lambda-1)^2=0$$<br />Whose solutions among the dual numbers are $\lambda=1+k\varepsilon$ for real $k$. So one may "diagonalise" the Galilean transformation over the dual numbers as e.g.:<br /><br />$$\left[\begin{array}{*{20}{c}}{1}&{0}\\{0}&{1+v\varepsilon}\end{array}\right]$$<br />Granted this is not unique, this is formed from the change-of-basis matrix $\left[\begin{array}{*{20}{c}}{1}&{1}\\{0}&{\epsilon}\end{array}\right]$, but any vector of the form $(1,k\varepsilon)$ is a valid eigenvector. You could, if you like, consider this a canonical or "principal value" of the diagonalisation, and in general each diagonalisation corresponds to a limit you can take of real/complex-diagonalisable transformations. 
Another way of thinking about this is that there is an entire eigenspace spanned by $(1,0)$ and $(1,\varepsilon)$ in that little gap of multiplicity. In this sense, the geometric multiplicity is forced to be equal to the algebraic multiplicity*.<br /><br />Then a nilpotent matrix with characteristic polynomial $\lambda^2=0$ has solutions $\lambda=k\varepsilon$, and is simply diagonalised as:<br /><br />$$\left[\begin{array}{*{20}{c}}{0}&{0}\\{0}&{\varepsilon}\end{array}\right]$$<br />(Think about this.) Indeed, the resulting matrix has minimal polynomial $\lambda^2=0$, and the eigenvectors are as before.<br /><br /><hr /><br />What about higher dimensional matrices? Consider:<br /><br />$$\left[ {\begin{array}{*{20}{c}}0&v&0\\0&0&w\\0&0&0\end{array}} \right]$$<br />This is a nilpotent matrix $A$ satisfying $A^3=0$ (but not $A^2=0$). The characteristic polynomial is $\lambda^3=0$. Although $\varepsilon$ might seem like a sensible choice, it doesn't really do the trick -- if you try a diagonalisation of the form $\mathrm{diag}(0,v\varepsilon,w\varepsilon)$, it has minimal polynomial $A^2=0$, which is wrong. Indeed, you won't be able to find three linearly independent eigenvectors to diagonalise the matrix this way -- they'll all take the form $(a+b\varepsilon,0,0)$.<br /><br />Instead, you need to consider a generalisation of the dual numbers, sometimes called (in computing mathematics and non-standard analysis) the "hyperdual numbers", with the soul satisfying $\epsilon^n=0$. Then the diagonalisation takes for instance the form:<br /><br />$$\left[ {\begin{array}{*{20}{c}}0&0&0\\0&{v\epsilon}&0\\0&0&{w\epsilon}\end{array}} \right]$$<br /><hr /><br />*Over the reals and complexes, when one defines algebraic multiplicity (as "the multiplicity of the corresponding factor in the characteristic polynomial"), there is a single eigenvalue corresponding to that factor. 
This is of course no longer true over the hyperdual numbers, because they are not a field, and $ab=0$ no longer implies "$a=0$ or $b=0$".<br /><br />In general, if you want to prove things about these numbers, the way to formalise them is by constructing them as the quotient $\mathbb{R}[X]/(X^n)$, so you actually have something clear to work with.<br /><br />(Perhaps relevant: <a href="https://math.stackexchange.com/questions/46078/grassmann-numbers-as-eigenvalues-of-nilpotent-operators">Grassmann numbers as eigenvalues of nilpotent operators</a> -- the hyperdual numbers are not the same as the Grassmann numbers, and the algebra of the Grassmann numbers is definitely different from that of nilpotent and shear matrices, but go see if you can make sense of it.)<br /><br />Something important to note is that the diagonalisation is not of the form $D=P^{-1}AP$, as the eigenvector matrices are not invertible. However, it is still true that $PD=AP$ -- nonetheless, this limitation prevents this formalism from being any good for e.g. dealing with polynomial-ish differential equations with repeated roots, as far as I can see. The infinitesimal-perturbation/"take a limit" approach we talked about in <a href="https://thewindingnumber.blogspot.com/2018/03/repeated-roots-of-differential-equations.html"><b>Limiting Cases II: repeated roots of a differential equation</b></a> is still the right approach for that.diagonalisationeigenvaluesgrassmann numbersjordan normal formlinear algebramatricesWed, 06 Feb 2019 17:25:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-7617724931922457464Abhimanyu Pallavi Sudhir2019-02-06T17:25:00ZAnswer by Abhimanyu Pallavi Sudhir for What kind of matrices are non-diagonalizable?
https://math.stackexchange.com/questions/472915/what-kind-of-matrices-are-non-diagonalizable/3097881#3097881
3<p><strong>Edit:</strong> The algebra I speak of here is <em>not</em> actually the Grassmann numbers at all -- they are <span class="math-container">$\mathbb{R}[X]/(X^n)$</span>, whose generators <em>don't</em> satisfy the anticommutativity relation even though they satisfy all the nilpotency relations. The dual-number stuff for 2 by 2 is still correct, just ignore my use of the word "Grassmann".</p>
<hr>
<p>Non-diagonalisable 2 by 2 matrices can be diagonalised over the <a href="https://en.wikipedia.org/wiki/Dual_number" rel="nofollow noreferrer">dual numbers</a> -- and the "weird cases" like the Galilean transformation are not fundamentally different from the nilpotent matrices.</p>
<p>The intuition here is that the Galilean transformation is sort of a "boundary case" between real-diagonalisability (skews) and complex-diagonalisability (rotations) (which you can sort of think in terms of discriminants). In the case of the Galilean transformation <span class="math-container">$\left[\begin{array}{*{20}{c}}{1}&{v}\\{0}&{1}\end{array}\right]$</span>, it's a small perturbation away from being diagonalisable, i.e. it sort of has "repeated eigenvectors" (you can visualise this with <a href="https://shadanan.github.io/MatVis/" rel="nofollow noreferrer">MatVis</a>). So one may imagine that the two eigenvectors are only an "epsilon" away, where <span class="math-container">$\varepsilon$</span> is the unit dual satisfying <span class="math-container">$\varepsilon^2=0$</span> (called the "soul"). Indeed, its characteristic polynomial is:</p>
<p><span class="math-container">$$(\lambda-1)^2=0$$</span></p>
<p>Whose solutions among the dual numbers are <span class="math-container">$\lambda=1+k\varepsilon$</span> for real <span class="math-container">$k$</span>. So one may "diagonalise" the Galilean transformation over the dual numbers as e.g.:</p>
<p><span class="math-container">$$\left[\begin{array}{*{20}{c}}{1}&{0}\\{0}&{1+v\varepsilon}\end{array}\right]$$</span></p>
<p>Granted, this is not unique -- this one is formed from the change-of-basis matrix <span class="math-container">$\left[\begin{array}{*{20}{c}}{1}&{1}\\{0}&{\varepsilon}\end{array}\right]$</span>, but any vector of the form <span class="math-container">$(1,k\varepsilon)$</span> is a valid eigenvector. You could, if you like, consider this a canonical or "principal value" of the diagonalisation, and in general each diagonalisation corresponds to a limit you can take of real/complex-diagonalisable transformations. Another way of thinking about this is that there is an entire eigenspace spanned by <span class="math-container">$(1,0)$</span> and <span class="math-container">$(1,\varepsilon)$</span> in that little gap of multiplicity. In this sense, the geometric multiplicity is forced to be equal to the algebraic multiplicity*.</p>
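<p>(Since the dual numbers are just polynomials in <span class="math-container">$\varepsilon$</span> reduced modulo <span class="math-container">$\varepsilon^2$</span>, this claim is easy to check mechanically -- a SymPy sketch of my own, not part of the original argument:)</p>

```python
import sympy as sp

v, eps = sp.symbols('v epsilon')

A = sp.Matrix([[1, v], [0, 1]])            # the Galilean transformation
P = sp.Matrix([[1, 1], [0, eps]])          # change-of-basis matrix
D = sp.Matrix([[1, 0], [0, 1 + v * eps]])  # claimed diagonalisation

# Work modulo eps^2 = 0: every entry of AP - PD should vanish
residue = (A * P - P * D).applyfunc(lambda e: sp.expand(e).subs(eps**2, 0))
assert residue == sp.zeros(2, 2)
```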
<p>Then a nilpotent matrix with characteristic polynomial <span class="math-container">$\lambda^2=0$</span> has solutions <span class="math-container">$\lambda=k\varepsilon$</span>, and is simply diagonalised as:</p>
<p><span class="math-container">$$\left[\begin{array}{*{20}{c}}{0}&{0}\\{0}&{\varepsilon}\end{array}\right]$$</span></p>
<p>(Think about this.) Indeed, the resulting matrix has minimal polynomial <span class="math-container">$\lambda^2=0$</span>, and the eigenvectors are as before.</p>
<hr>
<p>What about higher dimensional matrices? Consider:</p>
<p><span class="math-container">$$\left[ {\begin{array}{*{20}{c}}0&v&0\\0&0&w\\0&0&0\end{array}} \right]$$</span></p>
<p>This is a nilpotent matrix <span class="math-container">$A$</span> satisfying <span class="math-container">$A^3=0$</span> (but not <span class="math-container">$A^2=0$</span>). The characteristic polynomial is <span class="math-container">$\lambda^3=0$</span>. Although <span class="math-container">$\varepsilon$</span> might seem like a sensible choice, it doesn't really do the trick -- if you try a diagonalisation of the form <span class="math-container">$\mathrm{diag}(0,v\varepsilon,w\varepsilon)$</span>, it has minimal polynomial <span class="math-container">$A^2=0$</span>, which is wrong. Indeed, you won't be able to find three linearly independent eigenvectors to diagonalise the matrix this way -- they'll all take the form <span class="math-container">$(a+b\varepsilon,0,0)$</span>.</p>
<p>Instead, you need to consider a generalisation of the dual numbers, sometimes called (in computing mathematics and non-standard analysis) the "hyperdual numbers", with the soul satisfying <span class="math-container">$\varepsilon^n=0$</span>. Then the diagonalisation takes for instance the form:</p>
<p><span class="math-container">$$\left[ {\begin{array}{*{20}{c}}0&0&0\\0&{v\varepsilon}&0\\0&0&{w\varepsilon}\end{array}} \right]$$</span></p>
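<p>(A sanity check of my own, not in the original answer: working modulo <span class="math-container">$\varepsilon^3=0$</span>, this diagonal matrix has the same minimal polynomial as the nilpotent matrix above -- its square survives, but its cube vanishes.)</p>

```python
import sympy as sp

v, w, eps = sp.symbols('v w epsilon')

D = sp.diag(0, v * eps, w * eps)

def mod_eps3(M):
    # reduce each entry modulo eps^3 = 0
    return M.applyfunc(lambda e: sp.rem(sp.expand(e), eps**3, eps))

# D^2 does not vanish, so the minimal polynomial is not lambda^2...
assert mod_eps3(D**2) != sp.zeros(3, 3)

# ...but D^3 does, matching A^3 = 0
assert mod_eps3(D**3) == sp.zeros(3, 3)
```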
<hr>
<p>*Over the reals and complexes, when one defines algebraic multiplicity (as "the multiplicity of the corresponding factor in the characteristic polynomial"), there is a single eigenvalue corresponding to that factor. This is of course no longer true over the Grassmann numbers, because they are not a field, and <span class="math-container">$ab=0$</span> no longer implies "<span class="math-container">$a=0$</span> or <span class="math-container">$b=0$</span>".</p>
<p>In general, if you want to prove things about these numbers, the way to formalise them is by constructing them as the quotient <span class="math-container">$\mathbb{R}[X]/(X^n)$</span>, so you actually have something clear to work with.</p>
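<p>(Concretely, computing in <span class="math-container">$\mathbb{R}[X]/(X^n)$</span> just means taking remainders modulo <span class="math-container">$X^n$</span>; a small sketch of my own showing the zero divisors mentioned above:)</p>

```python
import sympy as sp

X = sp.Symbol('X')
n = 3  # working in R[X]/(X^3)

def quotient(p):
    # the image of p in R[X]/(X^n): the remainder on division by X^n
    return sp.rem(sp.expand(p), X**n, X)

a, b = X, X**2               # both nonzero in the quotient...
assert quotient(a) != 0 and quotient(b) != 0
assert quotient(a * b) == 0  # ...yet their product is zero
```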
<p>(Perhaps relevant: <a href="https://math.stackexchange.com/questions/46078/grassmann-numbers-as-eigenvalues-of-nilpotent-operators">Grassmann numbers as eigenvalues of nilpotent operators?</a> -- discussing the fact that the Grassmann numbers are not a field).</p>
<p>You might wonder if this sort of approach can be applicable to LTI differential equations with repeated roots -- after all, their characteristic matrices are exactly of this Grassmann form. As pointed out in the comments, however, this diagonalisation is still not via an invertible change-of-basis matrix: it is only of the form <span class="math-container">$PD=AP$</span>, not <span class="math-container">$D=P^{-1}AP$</span>. I don't see any way to bypass this. See my posts <a href="https://thewindingnumber.blogspot.com/2019/02/all-matrices-can-be-diagonalised.html" rel="nofollow noreferrer">All matrices can be diagonalised</a> (a re-post of this answer) and <a href="https://thewindingnumber.blogspot.com/2018/03/repeated-roots-of-differential-equations.html" rel="nofollow noreferrer">Repeated roots of differential equations</a> for ideas, I guess.</p>Sat, 02 Feb 2019 22:17:56 GMThttps://math.stackexchange.com/questions/472915/-/3097881#3097881Abhimanyu Pallavi Sudhir2019-02-02T22:17:56ZComment by Abhimanyu Pallavi Sudhir on Does the derivation of the Lorentz transformation depend on space having at least two spatial dimensions?
https://physics.stackexchange.com/questions/324076/does-the-derivation-of-the-lorentz-transformation-depend-on-space-having-at-leas
See my answer <a href="https://physics.stackexchange.com/questions/455712/relativity-from-a-basic-assumption/455753#455753">here</a>. In general, I don't think considering more than two spacetime dimensions in special relativity is really necessary or useful for someone learning it for the first time.Sat, 26 Jan 2019 08:23:04 GMThttps://physics.stackexchange.com/questions/324076/does-the-derivation-of-the-lorentz-transformation-depend-on-space-having-at-leas?cid=1025310Abhimanyu Pallavi Sudhir2019-01-26T08:23:04ZAnswer by Abhimanyu Pallavi Sudhir for Relativity from a basic assumption
https://physics.stackexchange.com/questions/455712/relativity-from-a-basic-assumption/455753#455753
1<p>I will give <em>a</em> derivation of the Lorentz boosts requiring (what at least seem to be) minimal assumptions, and we will look at what assumptions we used, and see if some of them can be derived from each other, etc. Note that by "the Lorentz transformations", I mean the Lorentz transformation of spacetime position -- Lorentz transformations of other four-tuples (i.e. proving that they are Lorentz vectors) would require other assumptions, of course. I've given a fuller explanation of the derivation <a href="https://thewindingnumber.blogspot.com/2017/09/introduction-to-special-relativity.html" rel="nofollow noreferrer">here</a>.</p>
<p><strong>(a)</strong> The first important fact you need to prove anything about the Lorentz transformations is that they are linear. Linearity is logically equivalent to the following conditions (under the transformation):</p>
<ul>
<li><strong>all straight lines remain straight lines</strong> -- the physical interpretation of this is that if an object's velocity is constant in one inertial reference frame, it is constant in all inertial reference frames. This follows from the <em>principle of relativity</em>.</li>
<li><strong>the origin remains fixed</strong> -- this is true by definition of the transformations we are considering -- boosts passing through the same origin.</li>
</ul>
<p>With this, we know that we can use a matrix to write down the Lorentz transformations. Which matrix?</p>
<p><strong>(b)</strong> The tilt/angle of the <span class="math-container">$t'$</span>, <span class="math-container">$x'$</span> axes with respect to the <span class="math-container">$t$</span>, <span class="math-container">$x$</span> axes. The tilt of the <span class="math-container">$t'$</span> axis follows from the definition of velocity as the gradient of the worldline. To prove the tilt of the <span class="math-container">$x'$</span> axis is equal to this tilt, we need to first define the <span class="math-container">$x'$</span> axis within the unprimed co-ordinate system. </p>
<p>This is possible by considering invariant features under a boost, i.e. from the principle of relativity -- the obvious invariant is as follows: if a light ray you emitted <span class="math-container">$a$</span> seconds in the past reflects off some object and returns to you <span class="math-container">$a$</span> seconds in the future, then the object was on your x-axis at time 0.</p>
<p><a href="https://i.stack.imgur.com/zC7TS.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/zC7TS.png" alt="enter image description here"></a></p>
<p>By the principle of relativity, this should apply in the primed reference frame as well. By the invariance of the speed of light, the slope of the light ray is the same in the primed reference frame. Now figuring out the angle of tilt of the <span class="math-container">$x'$</span> axis becomes an exercise in geometry.</p>
<p><a href="https://i.stack.imgur.com/QvRjN.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/QvRjN.png" alt="enter image description here"></a></p>
<p>And it's easy to prove, by drawing an appropriate circle, that the two tilts are equal.</p>
<p><strong>(c)</strong> We now know the lines the column vectors of our matrix land on -- they are multiples of <span class="math-container">$(1, v)$</span> and <span class="math-container">$(v, 1)$</span>, but which vector on that line exactly? In other words, what's the scale on the axes? This requires one extra assumption: if you boost into the frame with velocity <span class="math-container">$v$</span>, then boost <span class="math-container">$-v$</span> back, that's equivalent to not boosting at all, i.e. <span class="math-container">$L(v)L(-v)=I$</span>. Then it's just computation:</p>
<p><span class="math-container">\begin{gathered}
\left[ {\begin{array}{*{20}{c}}
1&0 \\
0&1
\end{array}} \right] = \left[ {\begin{array}{*{20}{c}}
\alpha &{\beta v} \\
{\alpha v}&\beta
\end{array}} \right]\left[ {\begin{array}{*{20}{c}}
\alpha &{ - \beta v} \\
{ - \alpha v}&\beta
\end{array}} \right] = \left[ {\begin{array}{*{20}{c}}
{{\alpha ^2} - \alpha \beta {v^2}}&{{\beta ^2}v - \alpha \beta v} \\
{{\alpha ^2}v - \alpha \beta v}&{{\beta ^2} - \alpha \beta {v^2}}
\end{array}} \right] \hfill \\
{\alpha ^2}v - \alpha \beta v = 0 = {\beta ^2}v - \alpha \beta v \Rightarrow {\alpha ^2} = \alpha \beta = {\beta ^2} \Rightarrow \alpha = \beta \hfill \\
{\alpha ^2} - \alpha \beta {v^2} = 1 = {\beta ^2} - \alpha \beta {v^2} \Rightarrow {\alpha ^2} = 1 + \alpha \beta {v^2} = {\beta ^2} \Rightarrow {\alpha ^2} = 1 + {\alpha ^2}{v^2} \hfill \\
\Rightarrow \alpha = \beta = \frac{1}{{\sqrt {1 - {v^2}} }} \hfill \\
\end{gathered}</span></p>
<p>Then the change of basis matrix is simply the inverse of this matrix, which is:</p>
<p><span class="math-container">$$\Lambda=\gamma \left[ {\begin{array}{*{20}{c}}
1&-v \\
-v&1
\end{array}} \right]$$</span></p>
<p>Or:</p>
<p><span class="math-container">\begin{gathered}
x' = \gamma \left( {x - vt} \right) \\
t' = \gamma \left( {t - vx} \right) \\
\end{gathered}</span></p>
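<p>The algebra above is easy to check symbolically -- a sketch with sympy (my addition; <code>gamma</code> is the $1/\sqrt{1-v^2}$ found above):</p>

```python
import sympy as sp

v = sp.symbols('v', real=True)
gamma = 1 / sp.sqrt(1 - v**2)
L = sp.Matrix([[gamma, gamma * v], [gamma * v, gamma]])  # alpha = beta = gamma

# L(v) L(-v) = I, the condition used to fix the scale:
assert (L * L.subs(v, -v) - sp.eye(2)).applyfunc(sp.simplify) == sp.zeros(2, 2)

# and the inverse is indeed the boost matrix quoted above:
Lam = gamma * sp.Matrix([[1, -v], [-v, 1]])
assert (L.inv() - Lam).applyfunc(sp.simplify) == sp.zeros(2, 2)
```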
<p><strong>(d)</strong> There's still one final step, however -- we need to verify that <span class="math-container">$y$</span> and <span class="math-container">$z$</span> aren't transformed under the Lorentz boost. To prove this, consider two twins with paintbrushes running towards each other, painting the wall at waist level. If the orthogonal axes were transformed in any way, each twin would see his paint-streak as above the other's; that the paint-streaks' relative positioning can't differ can be seen, e.g., by supposing that the two paints cause an explosion when mixed. The fact that the presence of an explosion (or any boolean quantity) is invariant under Lorentz transformations is a consequence of the principle of relativity.</p>
<hr>
<p>We used three physical assumptions:</p>
<ul>
<li>The principle of relativity</li>
<li>The invariance of the speed of light</li>
<li><span class="math-container">$L(v)L(-v)=L(0)$</span>, or "if I see you moving at <span class="math-container">$v$</span>, you see me moving at <span class="math-container">$-v$</span>"</li>
</ul>
<p>The first two are the assumptions you wanted. As far as I can see, the last assumption can't really be proven from the other two -- it requires some sort of symmetry principle. But that's okay.</p>Mon, 21 Jan 2019 22:48:47 GMThttps://physics.stackexchange.com/questions/455712/-/455753#455753Abhimanyu Pallavi Sudhir2019-01-21T22:48:47ZPi and collisions (the 3blue1brown problem)
https://thewindingnumber.blogspot.com/2019/01/pi-and-collisions-3blue1brown-problem.html
0Unless you've been living under a rock, you've probably heard of this problem -- perhaps from 3blue1brown (<a href="https://www.youtube.com/watch?v=HEfHFsfGXjs">link</a>) -- we have a wall (i.e. a thing with infinite mass), and two rocks of mass $m$ and some large multiple $Nm$. The smaller mass $m$ starts out stationary, while $Nm$ has some velocity $w$ in the direction towards the wall. The collisions are elastic. The question is to count the number of collisions there are as $N\to\infty$ (i.e. it approaches the rock you were living under) -- as Grant Sanderson (mysteriously, via some fancy thing he calls "digits of a number") tells us in the video, it approaches $\pi\sqrt{N}$.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://4.bp.blogspot.com/-AgsqTm57nsM/XEOkNnITTmI/AAAAAAAAFVI/xWyKhdOI0WkGqi_SPGHT6hEF64FuwaQwgCLcBGAs/s1600/mnm.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="476" data-original-width="731" height="208" src="https://4.bp.blogspot.com/-AgsqTm57nsM/XEOkNnITTmI/AAAAAAAAFVI/xWyKhdOI0WkGqi_SPGHT6hEF64FuwaQwgCLcBGAs/s320/mnm.png" width="320" /></a></div>If you haven't watched the linked problem video (the solution isn't revealed in it), you should -- the animations are great. I'll assume in this answer you have a full understanding of what the problem is. Not a tall order.<br /><br />I could actually give you a picture proof right now -- the solution is that amazing -- but I won't. Let's build our insight up to it, so when you see it, you are ready.<br /><br />The moment I think of $\pi$, I think of circles. Well, where are the circles here (besides my shoddy drawings of the two balls)? Here's another thing to think about: how do you solve for any result of a collision? You consider <b>conservation of momentum and energy</b>, of course. Aha, that should click in your mind -- conservation of energy is sort of like a circle, it's an ellipse! 
The condition:<br /><br />$$\frac12mv_m^2+\frac12Nmv_{Nm}^2=\frac12Nmw^2$$<br />Is the equation for an ellipse. And conservation of momentum is the equation for a line. But there are <b>two sorts of collisions</b> that can occur in this system: collisions between $m$ and $Nm$, and collisions between $m$ and $\infty$. The former conserves momentum, the latter does not (I mean, the law isn't violated, but the momentum of just the two balls together isn't conserved -- it is <b>transferred to the wall</b>). Respectively under the two collisions, we have:<br /><br />$$mv_m+Nmv_{Nm}=C\\<br />Nmv_{Nm}=C$$<br />Now, the <i>idea</i> we have is that when we impose conservation of energy and momentum, we are solving for the <b>intersection points</b> of the ellipse and the line -- one intersection point is the pre-collision configuration of velocities, and the other is the one after collision. So the idea we have in our mind is that of a bunch of lines, each corresponding to a different momentum value (because that is not conserved) intersecting a single ellipse, and we want to <b>count the number of intersections</b>.<br /><br /><div style="text-align: center;"><iframe frameborder="0" height="500px" src="https://www.desmos.com/calculator/hjsowcabsk?embed" style="border: 1px solid #ccc;" width="500px"></iframe><br /></div><br />The key idea here is that the collisions, as they occur <b>in real time</b>, correspond to the bouncing of an object off the ellipse as it moves across the lines -- first across the slanted blue line, bouncing off the ellipse, then across the red line, then bouncing off the ellipse onto the green line, then the other blue line, and then its last collision.<br /><br />However, to say this with confidence, we need to be sure that we can really <b>map every collision onto an intersection in the velocity space</b> above -- we need the following lemma:<br /><br /><hr /><b>Lemma: </b>Among the inter-collision periods, any given configuration of velocities
occurs no more than once, i.e. the velocity space configurations are unique (so there is a bijection (well, the surjection is obvious) between the intersections and the collisions).<br /><br /><b>Proof:</b><br /><i>Case 1: </i>the number of collisions is finite (why do I make these my cases? Because the first argument I thought of requires this assumption, so let's consider the other possibility separately). If a given velocity configuration occurs again, then the system will have to repeat itself (so there will be an infinite number of collisions). Why?<br /><br />Because the number of collisions from a point in time onwards depends only on the configuration of velocities at that point in time (think about why this is true -- specifically, it does not depend on the distance between the masses, or between the masses and the wall -- is this true when you have more than two masses? But don't we actually have three masses here, counting the wall? So it matters if the masses are mobile. Why? What do mobile things do differently that immobile things don't).<br /><br /><i>Case 2:</i> we don't even need the finiteness assumption. We know the velocity (not speed) of $Nm$ is non-decreasing (with sign convention of right being positive). Suppose it stabilises at 0. Then both balls have stabilised at 0, and the collisions must be finitely many. Suppose it doesn't. Then the velocity is continually changing as it hits the smaller ball (if their velocities become equal -- "worldlines become parallel" -- then there must have been only finitely many collisions), so each configuration is unique.<br /><hr /><br />Ok, so we need to find the number of intersections between the ellipse with radii $w$ and $w\sqrt{N}$, and the lines, with slope $-1/N$. Truth be told, I spent a lot of time staring at the diagram at this point, with no idea how to proceed.
And so should you.<br /><br /><div class="twn-furtherinsight">One thing <i>is</i> easy to see here, which is that the answer is <i>something</i> times $\sqrt{N}$, i.e. it goes to infinity at the rate $\sqrt{N}$ as $N$ goes to infinity. Why is this easy to see?</div><br />Here's how I found the realisation: there are two ways to find $\pi$ in something -- by looking at circles' areas and lengths, or by looking at angles. I had exhausted everything I could looking at lengths (and again, you should too), so let's think about <b>angles</b>. Let's read the angles between the lines.<br /><br />Well, first, let's scale the diagram so the ellipse becomes a circle -- let's make it a unit circle. This is good, because now the slope of the lines becomes $-1/\sqrt{N}$, which means there's only one crazy diverging infinite term to bother about -- rather than an infinite ellipse going crazy and an infinite number of infinite line segments each going infinitely crazy.<br /><br /><div class="twn-furtherinsight">Why is such a scaling okay? Because the number of intersections is invariant under a scaling. There are other things, like distances, that aren't invariant (which is why finding the perimeter of an ellipse is quite hard). What things are invariant? Why?</div><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://2.bp.blogspot.com/-LTg-u2t2jSE/XEPCL2xkM0I/AAAAAAAAFVU/S_qXY6cWgmgE-aRyYuiCYPLNecD9kGCNwCLcBGAs/s1600/thing.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="476" data-original-width="731" height="416" src="https://2.bp.blogspot.com/-LTg-u2t2jSE/XEPCL2xkM0I/AAAAAAAAFVU/S_qXY6cWgmgE-aRyYuiCYPLNecD9kGCNwCLcBGAs/s640/thing.png" width="640" /></a></div>Each of these angles is clearly $\arctan(1/\sqrt{N})$. The key is to think about the sum of these angles. 
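(Before the geometric punchline, the claimed count is easy to check by brute force. The following simulation is my sketch, not part of the original post -- it just applies the standard elastic-collision formulas directly, with the wall on the left and the big ball coming in from the right.)

```python
from math import pi, sqrt, atan

def count_collisions(N):
    """Count all collisions: small ball (mass 1) starts at rest between the
    wall and the big ball (mass N), which moves left at speed 1."""
    m, M = 1.0, float(N)
    v1, v2 = 0.0, -1.0   # small ball, big ball (rightward positive)
    count = 0
    while True:
        if v2 < v1:      # balls approaching: elastic ball-ball collision
            v1, v2 = (((m - M) * v1 + 2 * M * v2) / (m + M),
                      ((M - m) * v2 + 2 * m * v1) / (m + M))
            count += 1
        elif v1 < 0:     # small ball heading for the wall: bounce
            v1 = -v1
            count += 1
        else:            # both moving right, big ball at least as fast: done
            return count

assert count_collisions(1) == 3
assert count_collisions(100) == 31
assert count_collisions(100**2) == 314   # the digits of pi appearing
assert count_collisions(100) == int(pi / atan(1 / sqrt(100)))
```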
You might see the trick -- I do quite like it when a funny geometric argument comes out in some bizarre space, a velocity space here, in a physics proof -- and it's "angles in the same segment are equal".<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://4.bp.blogspot.com/-zfl85jQx6iE/XEPDnSWGYAI/AAAAAAAAFVg/8N2057eFufQFF17KHZ4RAdBZLv8BFqAKACLcBGAs/s1600/ellipsething_blue.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="476" data-original-width="731" height="416" src="https://4.bp.blogspot.com/-zfl85jQx6iE/XEPDnSWGYAI/AAAAAAAAFVg/8N2057eFufQFF17KHZ4RAdBZLv8BFqAKACLcBGAs/s640/ellipsething_blue.png" width="640" /></a></div>So the sum of those angles approaches the angle subtended by that big chord. What is that big chord? I wonder if it has a name. Well, as $N$ approaches infinity, the chord approaches the <b>diameter</b>, and that angle approaches $\pi/2$. There's another $\pi/2$ from the angles on the other side, so the total of the angles at all intersections is $\pi$.<br /><br />(Great, another fact from elementary high-school geometry.)<br /><br />So the total number of angles/intersections/collisions is $\frac{\pi}{\arctan(1/\sqrt{N})}$, which as $N\to\infty$, is the same thing as $\pi\sqrt{N}$.<br /><br />Which is great. It's just <i>great</i>.3blue1brownbilliardsblogmechanicsnewtonian mechanicsphysicspisurprising piSun, 20 Jan 2019 00:49:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-7916544740241670088Abhimanyu Pallavi Sudhir2019-01-20T00:49:00ZComment by Abhimanyu Pallavi Sudhir on Varying constants in special relativity
https://physics.stackexchange.com/questions/455159/varying-constants-in-special-relativity/455176#455176
I didn't complain about your profile pic.Fri, 18 Jan 2019 23:45:08 GMThttps://physics.stackexchange.com/questions/455159/varying-constants-in-special-relativity/455176?cid=1021643#455176Abhimanyu Pallavi Sudhir2019-01-18T23:45:08ZCovectors, conjugates, and the metric tensor
https://thewindingnumber.blogspot.com/2019/01/covectors-conjugates-and-why-gradient.html
0The fact -- as is often introduced in an introductory general relativity or tensor calculus course -- that the gradient is a <i>covector</i> seems rather bizarre to someone who's always seen the gradient as the "steepest ascent vector". Surely, the direction of steepest ascent is, you know, a direction -- an arrow. And what even is a covector, anyway?<br /><div><br /><div><div>Let's think about differentiating with respect to vectors. The idea we have is that $\frac{\partial f}{\partial \vec x}$ needs to contain all the information -- each of the $\frac{\partial f}{\partial x_i}$. And analogously for derivatives with respect to tensors. You might think we could just create an array with the same dimensions containing each derivative -- much like the gradient, Hessian, etc. that we're used to -- i.e.</div></div></div><br />$$\nabla f=\left[ {{\partial ^i}f} \right]$$<br />$$\nabla^2f = \left[ {{\partial ^i}{\partial ^j}f} \right]$$<br />(I'm using $\nabla^2$ for the Hessian -- and will do so in the rest of the article -- even though it's more widely used for its trace, the Laplacian, which would better be written $|\nabla|^2$) etc. But you might get the sense that this feels just fundamentally wrong -- like you're giving the "division by tensor" object the structure of the same tensor, but you should somehow be giving it an "inverse" structure. <br /><br />We want to construct a situation to see that the idea above -- of making the gradient ("derivative with respect to a vector") and Hessian ("derivative with respect to a rank-2 tensor") a vector and a rank-2 tensor -- doesn't work. We know such a situation can arise when we have multiplication between the gradient and a vector, or the Hessian and a rank-2 tensor. For instance, for linear $f$:<br /><br />$$f(\vec{x})-f(0)=\vec{x}\cdot\nabla f$$<br />But this is <i>wrong</i> -- for any non-Euclidean manifold.
For instance, if the metric tensor is something like $\rm{diag}(-1,1)$, this dot product gives:<br /><br />$$f(\vec{x})-f(0)= - x\frac{{\partial f}}{{\partial x}} + y\frac{{\partial f}}{{\partial y}}$$<br />Which is just wrong. So instead, the gradient is a <i>covector</i>, which we represent in Einstein notation using subscripts instead of superscripts:<br /><br />$$f(\vec x) - f(0) = {x^i}{\partial _i}f$$<br />(As you can see, I omitted Einstein notation when I was writing the wrong equations -- seeing repeated indices on the same vertical alignment is physically painful.) If we want the <i>vector</i> gradient -- for direction of steepest ascent or whatever -- you need to multiply by the metric tensor.<br /><br /><div class="twn-furtherinsight">This also motivates the picture of seeing covectors as parallel surfaces whose normals are their vector versions -- in Euclidean geometry, it doesn't make a difference, but on a general setting, this normality is a bit weird. Think about this.</div><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://upload.wikimedia.org/wikipedia/commons/thumb/e/e9/1-form_linear_functional.svg/400px-1-form_linear_functional.svg.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="217" data-original-width="400" height="173" src="https://upload.wikimedia.org/wikipedia/commons/thumb/e/e9/1-form_linear_functional.svg/400px-1-form_linear_functional.svg.png" width="320" /></a></div><br />But I haven't really given a motivation for the metric tensor or how it comes up here -- for this, read on.<br /><br /><hr /><br /><div>Let's talk about something completely different -- let's think about the derivative of functions from $\mathbb{C}\to\mathbb{R}$, $df/dz$. 
I don't know about you, but I like the complex numbers, and prefer them to $\mathbb{R}^2$, because pretty much anything I write with the complex numbers is well-defined, and easily so -- so I don't need to worry about whether $df/d \vec{x}$ makes any sense or not. Well, we can write:</div><div><br /></div><div>$$\frac{df}{dz}=\frac{\partial f}{\partial x}\frac{\partial x}{\partial z}+\frac{\partial f}{\partial y}\frac{\partial y}{\partial z}\\\Rightarrow \frac{df}{dz}=\frac12\left(\frac{\partial f}{\partial x}-i\frac{\partial f}{\partial y}\right)$$</div><div>This $df/dz$ above (the $\frac12$ comes from $\partial x/\partial z=\frac12$, $\partial y/\partial z=-\frac{i}{2}$) is exactly the analog of the gradient for real-valued functions defined on the complex plane -- analogous to scalar multivariable functions.<br /><br /></div><div class="twn-furtherinsight">What's the expression for the complex derivative of a complex function? Compute it -- it may look a bit different from the analogous tensor derivative -- think of traces and commutators.</div><br /><div class="twn-pitfall"><b>Note:</b> In actual complex calculus, complex differentiability is defined in a more restrictive way -- specifically one needs to satisfy the <a href = "https://en.wikipedia.org/wiki/Cauchy%E2%80%93Riemann_equations">Cauchy-Riemann equations</a>, which makes the structure of complex functions fundamentally more special than that of multivariable functions, stuff like $dx/dz$ is even undefined, and the stuff we've written above isn't really relevant in complex analysis. It is, however, the "Wirtinger derivative".</div><br />Something interesting happened here, though -- we got a negative sign on the imaginary component of the derivative.
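Concretely, for $f=|z|^2=z\bar z$ the Wirtinger derivative comes out as the <em>conjugate</em> $\bar z$ -- a quick sympy check (my sketch, not part of the original post; it uses the conventional Wirtinger factor of $1/2$):

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
f = x**2 + y**2   # f = |z|^2, a real-valued function of z = x + iy

# Wirtinger derivative: d/dz = (1/2)(d/dx - i d/dy)
df_dz = sp.Rational(1, 2) * (sp.diff(f, x) - sp.I * sp.diff(f, y))

# f = z * conj(z), so df/dz should be conj(z) = x - iy: the conjugation signature
assert sp.simplify(df_dz - (x - sp.I * y)) == 0
```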
The derivative got conjugated, or something -- and the reason this occurred is that $i^2=-1$ (so $1/i=-i$), and this leaves some sort of signature in our derivative.<br /><div><br /></div>Now let's (<i>non-rigorous alert!</i>) think about how an analogous argument may be written for vectors.<br /><br />$$\frac{{df}}{{d\vec x}} = \frac{{\partial f}}{{\partial x}}\frac{{\partial x}}{{\partial \vec x}} + \frac{{\partial f}}{{\partial y}}\frac{{\partial y}}{{\partial \vec x}}$$<br />What really is $\frac{{\partial x}}{{\partial \vec x}}$, though? We know that $\frac{\partial \vec x}{\partial x}=\vec{e_x}$. But what's the "inverse" of a vector? What does that even mean?<br /><br />So we want to define some sort of a product, or multiplication, with vectors -- we want to define a thing that when multiplied by a vector gives a scalar. It sounds like we're talking about a dot product -- but the dot product lacks an important property we need to have division, it's not injective. I.e. $\vec{a}\cdot\vec{b}=c$ for fixed $\vec{a}$ and $c$ defines a whole plane of vectors $\vec{b}$, not a unique one. But if we added an additional component to our product, the cross product (or in more than three dimensions, the wedge product), then the "dot product and cross product combined" <i>is</i> injective.<br /><br /><div class="twn-furtherinsight">This combination, of course, is the tensor product. Specifically, when we're talking about something like $1/\vec{e_x}$, we want a thing whose tensor product with $\vec{e_x}$ has trace (dot product) 1 and commutator (wedge/cross product) 0, i.e. $\mathrm{tr}(\vec{e_x}'\vec{e_x})=1$ and $(\vec{e_x}'\vec{e_x})-(\vec{e_x}'\vec{e_x})^T=0$.</div><br />If all you've ever done in your life is Euclidean geometry, you'd probably think the answer to this question is $\vec{e_x}$ itself -- indeed, its dot product with $\vec{e_x}$ is 1 and its cross product with $\vec{e_x}$ is 0. But if you've ever done relativity and dealt with -- forget curved manifolds! 
-- the Minkowski manifold, you know that this is not necessarily true -- it depends on the metric tensor.<br /><br />Could we <i>define</i> a vector in a general co-ordinate system that is the inverse of $\vec{e_x}$? Yes, we can. But let's not do that (yet*) -- it just seems like there should be something more natural, or elegant, like we had with complex numbers.<br /><br />So we define a space of "covectors", as "scalars divided by vectors" (informally speaking), call their basis $\tilde{e^i}$ which have the required dot and cross products. In Euclidean space -- and only in Euclidean space, these look exactly the same as vectors, and have exactly the same components. I like to call the conjugation here "metric conjugation", and the gradient is naturally a covector.<br /><br />*As for the question of writing the gradient as a vector instead, this follows naturally using the metric tensor -- as an exercise, show, by considering the required vector corresponding to the covector $\tilde{e^x}$ (i.e. that has the right dot and cross products with $\vec{e_x}$) that the vector gradient can be given as the product of the inverse metric tensor and the covector gradient:<br /><br />$${\partial ^\mu }f = {g^{\mu \nu }}{\partial _\nu }f$$<br />(Do this exercise! It is <em>the</em> motivation for the metric tensor, and why it determines your co-ordinate system!)<br /><br /><hr /><br /><div class="twn-furtherinsight">I've been talking about the covector $\tilde{e^x}$ as being equal to the quotient "$1/\vec{e_x}$" but as I mentioned, this isn't really accurate -- the "1" in the quotient is a (1,1) tensor with trace 1 and commutator 0. Think about this tensor. Can you find this tensor in Clifford algebra? Maybe not. Can you find it as a linear transformation? Yes? Find it. And can you think of the covector alternatively as a quotient of a bivector and a trivector? 
Will you get $(e_y\wedge e_z)/(e_x\wedge e_y\wedge e_z)$?</div>complex analysiscomplex conjugatecovectorsgradienttensor algebratensor calculustensorsFri, 18 Jan 2019 23:27:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-1813565310719679343Abhimanyu Pallavi Sudhir2019-01-18T23:27:00ZAnswer by Abhimanyu Pallavi Sudhir for Varying constants in special relativity
https://physics.stackexchange.com/questions/455159/varying-constants-in-special-relativity/455176#455176
1<blockquote>
<p>(presumably) everything has mass, there is no such thing as a perfect inertial frame of reference</p>
</blockquote>
<p>This isn't right. "There isn't generally a perfectly flat co-ordinate system" does not imply everything has mass, and being an inertial reference frame has nothing to do with the associated observer's mass (in fact, the Lorentz transformation associated with a photon's "co-ordinate system" is singular, so there isn't really a co-ordinate system/reference frame associated with it).</p>
<p>I guess your concern is with the fact that photons are affected by spacetime curvature -- this is true, but a central point of general relativity is that this doesn't imply anything about the mass.</p>Fri, 18 Jan 2019 20:30:09 GMThttps://physics.stackexchange.com/questions/455159/-/455176#455176Abhimanyu Pallavi Sudhir2019-01-18T20:30:09ZAnswer by Abhimanyu Pallavi Sudhir for How to understand Cantor's diagonalization method in proving the uncountability of the real numbers?
https://math.stackexchange.com/questions/2855987/how-to-understand-cantors-diagonalization-method-in-proving-the-uncountability/3064377#3064377
0<p>Here's a perhaps more fathomable way to phrase what everyone has already said: even if you were to include <span class="math-container">$\infty$</span> as an integer, it would be just one integer. On the other hand, counting 4142... and 1088... separately means you're adding a much larger number of infinities to your set. </p>
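<p>(For contrast, here is the diagonal construction from the question title in miniature -- my sketch with a toy listing, not part of the original answer:)</p>

```python
def diagonal(seqs, n):
    """First n digits of a sequence that differs from seqs(k) at digit k."""
    return [(seqs(k)[k] + 1) % 10 for k in range(n)]

def listing(k):
    """A toy countable listing: the k-th sequence is the digits of k, zero-padded."""
    digits = [int(c) for c in str(k)]
    return digits + [0] * 1000

d = diagonal(listing, 5)
assert all(d[k] != listing(k)[k] for k in range(5))  # missed by every listed sequence
```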
<p>How many numbers exactly? There are ten choices for each digit and an infinite number of digits, so indeed you're adding <span class="math-container">$10^{\aleph_0}=2^{\aleph_0}$</span> numbers to your integers, which is precisely the cardinality of the reals.</p>Sun, 06 Jan 2019 20:49:32 GMThttps://math.stackexchange.com/questions/2855987/-/3064377#3064377Abhimanyu Pallavi Sudhir2019-01-06T20:49:32ZAnswer by Abhimanyu Pallavi Sudhir for What is really curved, spacetime, or simply the coordinate lines?
https://physics.stackexchange.com/questions/290906/what-is-really-curved-spacetime-or-simply-the-coordinate-lines/452416#452416
0<p>Curved co-ordinates on flat spacetime correspond to accelerating observers, not gravity. </p>
<p>The first physical insight of general relativity is that when you have gravity, you have <em>no</em> globally inertial frames -- contrast this with flat space, where you can always construct a linear co-ordinate system. The second physical insight is that you do have locally inertial frames, specifically the freefalling ones -- this is the "equivalence principle" -- so the manifold you use to model spacetime must necessarily have local flatness. Consequently, (pseudo-)Riemannian manifolds become the right way to model spacetime in general relativity.</p>
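<p>The "curved co-ordinates on flat space" picture can be checked directly -- a sketch with sympy (my addition, not from the original answer), using polar co-ordinates on the flat plane as the simplest stand-in for an accelerating observer's co-ordinates: the Christoffel symbols are nonzero, yet every component of the Riemann tensor vanishes.</p>

```python
import sympy as sp

r, th = sp.symbols('r theta', positive=True)
coords = [r, th]
g = sp.Matrix([[1, 0], [0, r**2]])   # flat plane, polar co-ordinates
g_inv = g.inv()

# Christoffel symbols Gamma^a_{bc} = (1/2) g^{ad} (d_b g_{dc} + d_c g_{db} - d_d g_{bc}):
def christoffel(a, b, c):
    return sp.Rational(1, 2) * sum(
        g_inv[a, d] * (sp.diff(g[d, c], coords[b])
                       + sp.diff(g[d, b], coords[c])
                       - sp.diff(g[b, c], coords[d]))
        for d in range(2))

assert christoffel(0, 1, 1) == -r                    # Gamma^r_{theta theta}: nonzero
assert sp.simplify(christoffel(1, 0, 1) - 1/r) == 0  # Gamma^theta_{r theta} = 1/r

# Riemann tensor R^a_{bcd} = d_c Gamma^a_{db} - d_d Gamma^a_{cb} + quadratic terms:
def riemann(a, b, c, d):
    expr = (sp.diff(christoffel(a, d, b), coords[c])
            - sp.diff(christoffel(a, c, b), coords[d])
            + sum(christoffel(a, c, e) * christoffel(e, d, b)
                  - christoffel(a, d, e) * christoffel(e, c, b) for e in range(2)))
    return sp.simplify(expr)

# every component vanishes -- the space is flat, only the co-ordinates are "curved":
assert all(riemann(a, b, c, d) == 0
           for a in range(2) for b in range(2) for c in range(2) for d in range(2))
```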
<p>This is why Christoffel symbols exist for accelerating observers on flat spacetime too -- they're first-order in the derivatives of the metric, and so can be eliminated by transforming into a flat co-ordinate system where the metric is constant (this is okay because the Christoffel symbols aren't tensors). The Riemann curvature tensor, on the other hand, is second-order in the derivatives of the metric and cannot be eliminated by a co-ordinate transformation.</p>Sun, 06 Jan 2019 12:13:03 GMThttps://physics.stackexchange.com/questions/290906/-/452416#452416Abhimanyu Pallavi Sudhir2019-01-06T12:13:03ZAnswer by Abhimanyu Pallavi Sudhir for Relative velocity greater than speed of light
https://physics.stackexchange.com/questions/452078/relative-velocity-greater-than-speed-of-light/452100#452100
0<p>Velocity is definitionally the same as "relative velocity". This is the point of the first postulate of relativity.</p>Fri, 04 Jan 2019 15:11:44 GMThttps://physics.stackexchange.com/questions/452078/-/452100#452100Abhimanyu Pallavi Sudhir2019-01-04T15:11:44ZWhat is the region of convergence of $x_n=\left(\frac{x_{n-1}}{n}\right)^2-a$, where $a$ is a constant?
https://math.stackexchange.com/questions/3051551/what-is-the-region-of-convergence-of-x-n-left-fracx-n-1n-right2-a-w
5<p>The following recurrence relation came up in some research I was working on:</p>
<p><span class="math-container">$$x_n=\left(\frac{x_{n-1}}{n}\right)^2-a$$</span></p>
<p>Or equivalently the map:</p>
<p><span class="math-container">$$z\mapsto\frac{z^2}{n^2}-a$$</span></p>
<p>Where <span class="math-container">$n$</span> is the iteration number. Specifically, I'm interested in the size of the convergence region across the real line. Some stuff I know about this map:</p>
<ul>
<li>For <span class="math-container">$a = 1$</span>, it's easy: the region on the real line is <span class="math-container">$[-3,3]$</span>.</li>
</ul>
<p>I do have an infinite radical expansion for the size of the convergence region on the real line (see <a href="https://math.stackexchange.com/questions/2839527/solving-the-infinite-radical-sqrt6-sqrt62-sqrt63-sqrt6">Solving the infinite radical <span class="math-container">$\sqrt{6+\sqrt{6+2\sqrt{6+3\sqrt{6+...}}}}$</span></a>): </p>
<p><span class="math-container">$$\sqrt{a+2\sqrt{a+3\sqrt{a+...}}}$$</span></p>
<p>That's why it's easy for <span class="math-container">$a=1$</span> -- it's just the Ramanujan radical, and equals 3. It's also easy for <span class="math-container">$a=0$</span> -- it's <span class="math-container">$\exp\left(-\mathrm{PolyLog}^{(1,0)}(0,1/2)\right)$</span> as per Wolfram Alpha.</p>
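<p>For what it's worth, the boundary is easy to probe numerically. Here's a rough sketch (the helper names, the iteration cutoff and the escape bound are all my own ad-hoc choices) that iterates the map and bisects for the edge of the region on the positive real line:</p>

```python
def converges(x0, a, n_max=300, bound=1e8):
    """Iterate x_n = (x_{n-1} / n)**2 - a and report whether the orbit stays bounded."""
    x = x0
    for n in range(1, n_max + 1):
        if abs(x) > bound:
            return False
        t = x / n
        x = t * t - a  # note x_n >= -a for n >= 1, so escape is only to +infinity
    return abs(x) <= bound

def real_boundary(a, hi=100.0):
    """Bisect on [0, hi] for the edge of the convergence region on the real line."""
    lo = 0.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if converges(mid, a):
            lo = mid
        else:
            hi = mid
    return lo
```

<p>For <span class="math-container">$a=1$</span> this lands on <span class="math-container">$3$</span>, matching the radical above; the region is symmetric about <span class="math-container">$0$</span> since only <span class="math-container">$x_0^2$</span> enters at the first step.</p>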
<p>Has anyone seen this map before? Here's the region of convergence on the complex plane, plotted numerically (for <span class="math-container">$a=6$</span>):</p>
<p><a href="https://i.stack.imgur.com/sbrn8.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/sbrn8.png" alt="enter image description here"></a></p>sequences-and-seriesconvergencerecurrence-relationsMon, 24 Dec 2018 19:02:01 GMThttps://math.stackexchange.com/q/3051551Abhimanyu Pallavi Sudhir2018-12-24T19:02:01ZAnswer by Abhimanyu Pallavi Sudhir for Does spacetime position not form a four-vector?
https://physics.stackexchange.com/questions/192886/does-spacetime-position-not-form-a-four-vector/450137#450137
0<p>Right -- vectors in general relativity live in some tangent space. This is the point of differential geometry, and of calculus in general -- you approximate non-linear things, which are <em>not</em> vector spaces (like curvy manifolds) with linear things (like their tangent spaces), which are vector spaces. This is exactly the motivation for defining the basis vectors as <span class="math-container">$\partial_\mu$</span>, as you describe.</p>Mon, 24 Dec 2018 07:04:21 GMThttps://physics.stackexchange.com/questions/192886/-/450137#450137Abhimanyu Pallavi Sudhir2018-12-24T07:04:21ZIntuition, analogies and abstraction
https://thewindingnumber.blogspot.com/2018/12/intuition-analogies-and-abstraction.html
0$$-1=\sqrt{-1}\sqrt{-1}=\sqrt{(-1)(-1)}=\sqrt{1}=1$$<br />I bet you've seen the fake "proof" above that minus one and one are equal. And the standard explanation as to why it's wrong is that the statement $\sqrt{ab}=\sqrt{a}\sqrt{b}$ only applies when $\sqrt{a}$ and $\sqrt{b}$ are real, or something like that (maybe only one of them needs to be real -- something like that -- who cares?).<br /><br />But if you're like me, that isn't a very satisfactory proof. <i>Why</i> does the identity not hold for complex numbers? For that matter, why does it hold for real numbers? Well, that is a good question, and one way of answering it would be to try and prove the identity for real numbers, and see what properties of the real numbers (or of the real square root, in particular) you use. And if this article were being filed under "MAR1104: Introduction to formal mathematics", that's how I might explain things -- but that doesn't give us too much insight -- not about square roots and complex numbers, anyway.<br /><br />Let's think about what $\sqrt{ab}=\sqrt{a}\sqrt{b}$ means.<br /><br /><div class="twn-furtherinsight">What does the square root of a real number mean, anyway? It's some property related to multiplying a real number by itself. What does multiplication mean? What does a real number mean? The picture I have in my head of the real numbers is of a line. But what exactly is this line? -- the real numbers are just a set. Why did you put them on this line in this specific way? In doing so, you gave the real numbers a <i>structure</i>, a specific type of structure called an "order", defined by the operation $<$.<br /><br />But there are other ways to think about/structure the real numbers. One way is to think of real numbers as (one-dimensional) scalings. You can scale things like mass, and volume, using real numbers, representing the scalings as real numbers. Scaling a mass by 2 is equivalent to multiplication by 2. 
So this gives the real numbers a multiplicative structure, defined by the operation $\times$ (or whatever notation -- or lack thereof -- you prefer). And the "real line" then just represents the image of "1" under all scalings.<br /><br />So the way to think about square roots is to think of numbers as linear transformations called scalings, and think about the scaling that when done twice, gives you the number you're taking the square root of. So what's $\sqrt{-1}$? What's $-1$? $-1$, multiplicative, is a reflection. What's its square root? Try to think of a (linear!) transformation that when done twice gives you a reflection. It can't be done in one dimension. And can you think of another such transformation? Can you prove these are the only two? Are you sure -- what about if you add a dimension?<br /><br />So the natural way to think about square roots of numbers that may or may not be complex, is with so-called "Argand diagrams", on the complex plane, the image of "1" under all complex numbers multiplicative.</div><br /><center><iframe frameborder="0" height="500px" src="https://www.desmos.com/calculator/lufx5iszcs?embed" style="border: 1px solid #ccc;" width="500px"></iframe></center>Click "edit graph" to play with <em>a</em> and <em>b</em>!<br /><br />To simplify things, consider only unit complex numbers (this is okay, because all complex numbers can be written as a real multiple of a unit complex number and a real number). The product of complex numbers $a$ and $b$ involves rotating by $a$, then rotating by $b$. The square roots of $a$ and $b$ involve going halfway around the circle as $a$ and $b$, and the square root of $ab$ goes halfway around the circle as $ab$.<br /><br />So it seems like the identity should hold, doesn't it? $\sqrt{ab}$ goes half as much as $a$ and $b$ put together -- this seems to be exactly what $\sqrt{a}\sqrt{b}$ does -- go around half as much as $a$, then half as much as $b$. 
Isn't $\frac{\theta+\phi}2=\frac{\theta}2+\frac{\phi}2$?<br /><br />The problem is that $\sqrt{ab}$ doesn't really go $\frac{\theta+\phi}2$ around the circle, if $\theta+\phi$ is greater than $2\pi$. You can see this in the diagram courtesy of Desmos above -- $ab$ has gone a <i>full circle</i>, and its square root is defined to halve the <i>argument</i> of $ab$, but the argument isn't $\arg (ab)=\arg (a) + \arg (b)$, rather:<br /><br />$$\arg (ab) \equiv \arg (a) + \arg (b) \pmod{2\pi}$$<br />But <i>halving</i> is not an operation that the $\bmod$ equivalence relation respects -- not in general, anyway. It is <i>not</i> true that<br /><br />$$\arg (ab)/2 \equiv (\arg (a) + \arg (b))/2 \pmod{2\pi}$$<br />Instead:<br /><br />$$\arg (ab)/2 \equiv (\arg (a) + \arg (b))/2 \pmod{\pi}$$<br />Let's recall from basic number theory -- on integers, the general result regarding multiplication on mods. If $a\equiv b\pmod{m}$, then $na\equiv nb \pmod{nm}$, certainly, and also $na\equiv nb \pmod{m}$ if $n$ is an integer*. But $1/2$ <i>isn't</i> an integer, which is why only the former result is relevant.<br /><br /><div class="twn-furtherinsight">This is also why $(ab)^2=a^2b^2$ <i>does</i> hold for complex numbers.</div><br />*when $n$ isn't an integer, we need $na$, $nb$ to be integers for the statement to even be <i>well-defined</i> in standard number theory, and then you have a result for division on mods involving $\gcd(d,m)$, etc. This isn't a concern for us here because we're dealing with divisibility over the reals -- if you want to be formal, a real number is divisible by another real number if the former can be written as an integer multiple of the latter.<br /><br />So there you have it -- I just demonstrated a very fundamental analogy between two seemingly incredibly unrelated ideas: complex numbers and modular arithmetic -- square roots of complex numbers don't multiply naturally, because <i>mod</i> doesn't respect division. 
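Concretely, you can watch this happen with Python's <code>cmath</code>, whose square root halves the <i>principal</i> argument (a quick sketch; the particular numbers are just a convenient example):

```python
import cmath

# arg(a) + arg(b) = 3*pi/2, which wraps past pi when reduced to the principal range
a = b = cmath.exp(3j * cmath.pi / 4)
lhs = cmath.sqrt(a * b)              # halves the principal argument of a*b
rhs = cmath.sqrt(a) * cmath.sqrt(b)  # halves each argument separately

# no wrap-around: arg(c) + arg(d) = pi/2, and the identity holds fine
c = d = cmath.exp(1j * cmath.pi / 4)
```

The two answers differ by exactly a half-turn -- the same factor of $-1$ as in the fake "proof" at the top.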
It's almost as if somehow, somewhere, somehow magically, <i>exactly the same kind of math was used to derive results, to prove things, about these unrelated objects</i>.<br /><br />As if they're just two instances of the same thing.<br /><br />I wonder what that thing could be.<br /><br /><hr /><br />Let's talk about something completely unrelated (no, genuinely -- completely unrelated -- I won't tell you this is an instance of the "same thing" too). Let's talk about logical operators, specifically: do $\forall$ and $\exists$ commute? I.e. is $\forall t, \exists s, P(s,t)$ equivalent to $\exists s, \forall t, P(s,t)$?<br /><br />You just need to read the statements aloud to realise they don't. To use a classical example, "all men have wives" and "there is a woman who is the wife of all men" are two very different statements (okay, in this case both statements are false, so they're equivalent in that sense, so you get my point).<br /><br />But let's think more deeply about why they don't commute. What do $\forall t, \exists s, P(s,t)$ and $\exists s, \forall t, P(s,t)$ mean, anyway? $\forall$ and $\exists$ are just infinite $\land$ and $\lor$ statements, i.e. $\forall t$ is just an $\land$ statement ranging over all possible values that $t$ can take and $\exists s$ is just an $\lor$ statement ranging over all possible values $s$ can take.<br /><br />So $\forall t, \exists s, P_{st}$ just means (letting $s$ and $t$ be natural numbers for simplicity, but they don't have to):<br /><br />$$({P_{11}} \lor {P_{21}} \lor ...) \land ({P_{12}} \lor {P_{22}} \lor ...) \land ...$$<br />And $\exists s, \forall t, P(s,t)$ means:<br /><br />$$({P_{11}} \land {P_{12}} \land ...) \lor ({P_{21}} \land {P_{22}} \land ...) \lor ...$$<br />This is a bit complicated, so let's instead look at the simpler case where you have only 2 by 2 statements -- i.e. 
just construct the analogy between $\forall,\exists$ and actual $\land,\lor$ statements.<br /><br />So the question is if:<br /><br />$$({P_{11}} \lor {P_{21}}) \land ({P_{12}} \lor {P_{22}}) \Leftrightarrow ({P_{11}} \land {P_{12}}) \lor ({P_{21}} \land {P_{22}})$$<br />This is interesting. Maybe you see where this is going. Let me just do a notation change -- I'll use "$\times$" for "$\land$", "$+$" for "$\lor$", "$=$" for "$\Leftrightarrow$" and some new letters for the propositions. Under this new notation, where $\times$ is invisible as always, we're asking if:<br /><br />$$(a + b)(c + d) = ac + bd$$<br /><br />Aha! This is Freshman's dream, isn't it? And we know it's not true -- it's a dream, after all, don't be delusional -- and we know <i>why</i> it's not true too.<br /><br />But wait -- we aren't talking about elementary algebra here. I just gave you some silly notation and made it <i>look</i> like Freshman's dream. But here's the thing: the proof (or algebraic proof -- a counter-example is also a proof, but that isn't so interesting... not here, anyway) that these propositions aren't equivalent is <i>exactly</i> the same as in algebra. We expand out the brackets (because we know that $\land$ distributes over $\lor$ -- we also know that $\lor$ distributes over $\land$, incidentally, something that is <i>not</i> true in standard algebra) and point out that there are extra terms, and point out that these extra terms change the value of the expression (they aren't zero).<br /><br />So there's some kind of relationship between the boolean algebra and an elementary algebra. A lot of proofs that can be done in one of these algebras can be written almost identically in the other. Not <i>all</i> these proofs, mind you -- then the algebras would just be isomorphic to each other -- but some of them can. 
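(Since the 2-by-2 case is finite, you can also just check all of this by brute force -- a quick sketch, with the grid encoding my own:)

```python
from itertools import product

# a 2-by-2 grid of truth values P[s][t]
grids = [((a, b), (c, d)) for a, b, c, d in product([False, True], repeat=4)]

def forall_exists(P):
    # (P11 or P21) and (P12 or P22): every t has some s with P[s][t]
    return (P[0][0] or P[1][0]) and (P[0][1] or P[1][1])

def exists_forall(P):
    # (P11 and P12) or (P21 and P22): some s works for every t
    return (P[0][0] and P[0][1]) or (P[1][0] and P[1][1])

# "exists-forall" implies "forall-exists" on every grid, but not conversely:
witnesses = [P for P in grids if forall_exists(P) and not exists_forall(P)]
```

The witnesses turn out to be exactly the two "diagonal" grids.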
Maybe a lot of important ones can.<br /><br /><div class="twn-pitfall">An abstraction that produces such proofs simultaneously for both elementary algebra and boolean algebra may be more complicated than you think -- there's no real sense in which a statement is "always zero" in boolean algebra. Take for instance, distributivity of $\lor$ over $\land$ -- $a+bc=(a+b)(a+c)$. This is not true in elementary algebra, because the extra term $ab+ac$ is not always equal to zero ($a^2\ne a$ is not really an example, because $a^2=a$ for $a\in\{0,1\}$ -- but $a(b+c)=0$ is not true for all $a,b,c\in\{0,1\}$). It's just that it leaves the value of the existing terms unchanged in this specific instance.</div><br /><hr /><br />I've just illustrated two examples here -- the first one is a type of group, by the way, but you've probably seen dozens of other such "connections between different areas of mathematics" yourself. I've made these sorts of analogies fundamental to a lot of the articles I've written here (I think). You might've just thought of them as interesting insights, but in reality, abstract mathematics/abstract algebra -- or really just mathematics in general -- is all about these analogies.<br /><br />In a sense, mathematics is largely about abstraction. I mean, that's not what mathematics fundamentally <i>is</i> -- fundamentally, math is just logic -- but it's how mathematics largely functions. Whenever one talks of axioms, you could think of them as fundamental defining ideas of mathematical objects, and you can also think of them as "interfaces" between mathematics and reality (see my <a href="https://thewindingnumber.blogspot.com/2017/01/introduction-to-linear-transformations.html">introduction to linear transformations</a>). 
There are a massive number of different physical phenomena that we can study, and rather than prove everything from scratch for each one of them, it is much better -- and more insightful in terms of understanding the connections between things -- to show that they satisfy a certain set of axioms that apply to a whole range of things, and then deduce that all the logical consequences of these axioms -- all theorems -- are satisfied by the objects.<br /><br />If we can do that with physical phenomena, we can sure as well do it with mathematical phenomena too -- instead of proving something from scratch for every new mathematical object, we prove that it is a group, or a ring, or a field, or a module, or an algebra, or a topology, or a geometry of some sort, by verifying it matches the axioms -- and then use all the abstract knowledge we have about these things and deduce they must necessarily apply to our new object, because they are logical consequences of our axioms.<br /><br />Abstract mathematics is, in this sense, all about generalising things by finding the "smallest set of axioms" the thing requires.<br /><br />(Well, not really -- the most general statement is "true", and everything else is just a logical deduction from this statement. So in that sense mathematics is all about finding special cases. But in order to know what to take a special case of, and what special case that "what" is of "true", you need to generalise.)<br /><br /><div class="twn-exercises">List some weird analogies you've seen before in math. Something about divisibility sound familiar?</div>abstract algebraabstract mathematicsalgebrasboolean algebracomplex numbersgroup theorygroupslinear algebramathematicsSat, 08 Dec 2018 14:35:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-4671784076766893802Abhimanyu Pallavi Sudhir2018-12-08T14:35:00ZAnswer by Abhimanyu Pallavi Sudhir for What is an event in Special Relativity?
https://physics.stackexchange.com/questions/389488/what-is-an-event-in-special-relativity/444892#444892
1<p>It is perfectly reasonable to say that an event is a point in spacetime and that spacetime is a collection of events -- it is not "circular" as you claim in the comments. This is just the physics version of "a vector is an element of a vector space" and "a vector space is a set of vectors". You have axioms in math, and you have axioms in physics. The only difference is that in math, the objects are abstract, but in physics, they have a physical interpretation.</p>Mon, 03 Dec 2018 16:51:40 GMThttps://physics.stackexchange.com/questions/389488/-/444892#444892Abhimanyu Pallavi Sudhir2018-12-03T16:51:40ZPhysical dimensions in math
https://math.stackexchange.com/questions/3018937/physical-dimensions-in-math
6<p>I was interested in formalising the idea of physical dimensions with an algebraic structure containing "all physical quantities of any type". You'd need:</p>
<ul>
<li>Scalar multiplication over the reals (so you can get "2 kg" from "2 * kg")</li>
<li>Addition <em>within</em> the same dimension (so you can have "kg + kg = 2 kg")</li>
<li>Multiplication of any two elements (so you can have "J = N m = N * m")</li>
<li>Inverses (so you can have "m/s = m * s^(-1)")</li>
</ul>
<p>A <em>tensor algebra</em> could formalise this system -- but then you'd get all sorts of objects like "1 kg + 1 m", which make no sense.</p>
<p>A <em>group</em> would make sense -- with sub-groups like "mass measurements", "time measurements", "real numbers", "units" -- but then you can't have zero. Plus, I'd like to have some notion of units or "unit vectors"/"unit tensors".</p>
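<p>For comparison, here's how a programmer might sketch the structure I'm after (the class name and the three-exponent convention are mine, just for illustration) -- a value tagged with a tuple of dimension exponents, where addition is only defined within a grade and multiplication adds grades:</p>

```python
class Quantity:
    """A value tagged with integer dimension exponents (mass, length, time)."""

    def __init__(self, value, dims):
        self.value = value
        self.dims = tuple(dims)

    def __add__(self, other):
        # addition only within the same dimension: "kg + kg = 2 kg"
        if self.dims != other.dims:
            raise TypeError("cannot add quantities of different dimensions")
        return Quantity(self.value + other.value, self.dims)

    def __mul__(self, other):
        if isinstance(other, (int, float)):  # scalar multiplication over the reals
            return Quantity(self.value * other, self.dims)
        # multiplication of any two elements: exponents add
        return Quantity(self.value * other.value,
                        tuple(a + b for a, b in zip(self.dims, other.dims)))

    __rmul__ = __mul__

    def inv(self):
        # inverses: "m/s = m * s^(-1)"
        return Quantity(1.0 / self.value, tuple(-d for d in self.dims))

kg = Quantity(1.0, (1, 0, 0))
m = Quantity(1.0, (0, 1, 0))
s = Quantity(1.0, (0, 0, 1))
speed = m * s.inv()
```

<p>This is the "graded" flavour I'd like the mathematical formalisation to capture: addition only within a grade, multiplication between grades.</p>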
<p>What's a good way to formalise this?</p>abstract-algebragroup-theoryphysicstensor-productsdimensional-analysisThu, 29 Nov 2018 17:44:24 GMThttps://math.stackexchange.com/q/3018937Abhimanyu Pallavi Sudhir2018-11-29T17:44:24ZIntuition for Diaconescu's theorem
https://math.stackexchange.com/questions/3015860/intuition-for-diaconescus-theorem
1<p><a href="https://en.wikipedia.org/wiki/Diaconescu%27s_theorem" rel="nofollow noreferrer">Diaconescu's theorem</a> proves that the axiom of choice implies the law of the excluded middle.</p>
<p>While I can follow the proof in the above wikipedia article, it just seems like a cheap trick, so to speak, rather than something deep going on. </p>
<p>Specifically, I'm trying to think of an intuitive way of understanding the contrapositive (haha -- but this is okay, because we can prove <span class="math-container">$(p\to q)\to(\neg q \to \neg p)$</span> without the law of the excluded middle, just not the converse) of the statement in the context of type theory. </p>
<p>Here, the law of the excluded middle essentially states "all propositions are either empty or the universe" -- so the contrapositive of Diaconescu's theorem says "if there is a proposition that is neither empty nor the universe, you can't always have a choice function". </p>
<p>This seems to me to be a promising route to understanding the theorem, but I can't finish my train of thought -- do "middle" propositions not permit a choice function? Does that make sense? Is there another way to understand the proof more intuitively?</p>logicaxiom-of-choicetype-theoryTue, 27 Nov 2018 14:51:30 GMThttps://math.stackexchange.com/q/3015860Abhimanyu Pallavi Sudhir2018-11-27T14:51:30ZUnderstanding polynomial-ish differential equations
https://thewindingnumber.blogspot.com/2018/11/understanding-polynomial-ish.html
0This is a rather simple idea, perhaps not one you really had too many problems understanding to begin with. Given you know that $e^{\lambda x}$ solves first-order polynomial differential equations, it's not too much of a stretch to imagine it solves higher-order polynomial differential equations too. But let's talk about this anyway.<br /><br />So suppose you have a differential equation like:<br /><br />$$y''-3y'+2y=0$$<br />A more interesting way of writing this would be:<br /><br />$$(D-1)(D-2)y=0$$<br /><div class="twn-furtherinsight">The fact that you can do such a factoring is a consequence of the fact that polynomials in $D$ form a <i>commutative ring</i>. The idea behind rings and fields and other such objects is to look for a bunch of properties that a familiar set -- like the integers or the real numbers -- satisfies, then drill those properties down to the basic axioms that imply them, to generalise them to objects other than the integers or real numbers. Differentiation operators are a great example of such a ring.</div><br />Now, your first instinct may be to look at the factorisation and claim that $(D-1)y=0$ or $(D-2)y=0$. But this isn't right -- you assumed, here, incorrectly, that $(D-1)^{-1}$ and $(D-2)^{-1}$ existed (and that when applied on 0, they give you 0). This is not right, though -- we know there are in fact multiple functions that give 0 when you take $(D-1)$ of them. Which functions, specifically? The functions that are in the null space of $D-1$, i.e. the functions which satisfy:<br /><br />$$(D-1)f=0$$<br />And 0 isn't the only such function. Ok, I've been giving you silly tautologies for about three lines now, but the point I'm making is that when you take the inverse operator of $(D-1)$ of both sides, what you really get is:<br /><br />$$(D-2)y=(D-1)^{-1}0=ce^{x}$$<br />For arbitrary $c$.<br /><br /><div class="twn-furtherinsight">The way to think about this kind of a $c$ is that you don't really have an <i>equal to</i> relation, i.e. 
an equation, you have an <i>equivalence</i> relation -- the "=" sign there is really abuse of notation. And you're saying that $(D-2)y$ belongs in an equivalence class where all elements are of the form $ce^{x}$ (and your quotient group's "representative element" can be $e^x$). The same applies, for example, for _____ in calculus -- fill in the blank. Well, fill it in.</div><br />Anyway, what you now have is a first-order differential equation (or really differential <i>equivalence</i>) in $y$.<br /><br />$$(D-2)y=c_1e^x$$<br />But it isn't homogeneous. I don't really know how to motivate a solution for a non-homogeneous differential equation, really -- all I can say is that because the right-hand-side is an exponential, we just <i>know</i> that we can get some hints as to what $(D-2)^{-1}(ce^x)$ is by applying $(D-2)(ce^x)$ -- and if the right-hand-side <i>isn't</i> an exponential, then you can make it a sum or integral of exponentials, which is what Laplace and Fourier transforms are all about.<br /><br />In any case, performing $(D-2)$ on $c_1e^x$ gives us $-c_1e^x$ -- so, flipping the sign, we immediately get a particular solution, $-c_1e^x$ -- and all other solutions can be formed by adding linear combinations of the elements of the null space, i.e. solutions to the homogeneous equation $(D-2)y=0$. These elements we know to take the form $c_2e^{2x}$.<br /><br />$$y=(D-2)^{-1}c_1e^x=-c_1e^x+c_2e^{2x}$$<br />Or transforming arbitrary constants,<br /><br />$$y=c_1e^x+c_2e^{2x}$$<br /><br /><div class="twn-exercises">Use this method to find a general form for the solution to $(D-\alpha_1)(D-\alpha_2)...(D-\alpha_n)y=0$. 
Formalise our method with induction, and prove this general form with induction.<br /></div>differential equationsequivalence relationshomogenous differential equationsinvertibilitylinear algebralinear differential equationsmathematical inductionring theorySun, 25 Nov 2018 22:54:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-8728609759867298079Abhimanyu Pallavi Sudhir2018-11-25T22:54:00ZAnswer by Abhimanyu Pallavi Sudhir for Differential equation Physical Example.
https://math.stackexchange.com/questions/1340028/differential-equation-physical-example/3010314#3010314
0<p>Asking "what does a differential equation really mean?" is the wrong question, if by "really" you mean "give me an exact physics equivalence". Things in math describe <em>multiple things</em> in physics; they "mean" multiple things at once -- this is the point of math, and the idea is that you can use one mathematical theory to describe a bunch of different things.</p>
<p>Sometimes, the "multiple things at once" are explored within the math itself -- an example is the real numbers. What are the real numbers? Well, they're a set -- that's an unstructured, useless idea in itself. What does it "really mean"? Well, that question is answered by the variety of algebraic objects we can define out of this set -- you can define an additive group with the real numbers, then the real numbers are one-dimensional translations, you can define a multiplicative group, then they're scalings, you can define a ring or a field or one of those other things, and then they're in some sense objects on their own accord. And then there are a bunch of functions that map to the reals, so they also function as all sorts of things, like measures on sets and distances on metric spaces.</p>
<p>This kind of thing is, I think, where people get the idea that everything in math needs to have an exact physical equivalence, but there aren't really analogous structures defined for differential equations. </p>
<p>If you really want an answer to your question, the best I can give is "in general, differential equations are just recursive relations that describe the behaviour of continuous objects" -- it's the closest thing you can get to "induction on a continuum" or "recursion on the continuum", and they're just analogous to difference equations/recurrence relations on discrete sets. So if you know an initial state and you know the differential behaviour of an object -- as you often do in physics -- you're going to be using a differential equation. But they don't get more specific than that.</p>Fri, 23 Nov 2018 12:37:13 GMThttps://math.stackexchange.com/questions/1340028/-/3010314#3010314Abhimanyu Pallavi Sudhir2018-11-23T12:37:13ZAnswer by Abhimanyu Pallavi Sudhir for Why is the scalar product of two four-vectors Lorentz-invariant?
https://physics.stackexchange.com/questions/442119/why-is-the-scalar-product-of-two-four-vectors-lorentz-invariant/442164#442164
2<p>Here's the way to think about this -- why is the standard Euclidean dot product, <span class="math-container">$\sum x_iy_i$</span>, interesting? Well, it is interesting primarily from the perspective of rotations, due to the fact that rotations leave dot products invariant. The reason this is so is that this dot product can be written as <span class="math-container">$|x||y|\cos\Delta\theta$</span>, and rotations leave magnitudes and relative angles invariant.</p>
<p>Is the standard Euclidean norm <span class="math-container">$|x|$</span> invariant under Lorentz transformations? Of course not -- for instance, <span class="math-container">$\Delta t^2+\Delta x^2$</span> is clearly not invariant, but <span class="math-container">$\Delta t^2-\Delta x^2$</span> is. Similarly, <span class="math-container">$E^2+p^2$</span> is not invariant, but <span class="math-container">$E^2-p^2$</span> is. The reason this is the case is that Lorentz boosts are fundamentally skew transformations, which means the invariant locus is a hyperbola, not a circle. So you have <span class="math-container">$\cosh^2 \xi - \sinh^2 \xi = 1$</span>, and <span class="math-container">$x_0^2-x_1^2$</span> is the right way to think of the norm on Minkowski space.</p>
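<p>(A quick numerical illustration, in units with <span class="math-container">$c=1$</span> and with a helper name of my own choosing -- boost a displacement and compare the two quadratic forms:)</p>

```python
import math

def boost(t, x, xi):
    """Lorentz boost of rapidity xi acting on (t, x), in units with c = 1."""
    ch, sh = math.cosh(xi), math.sinh(xi)
    return (ch * t - sh * x, -sh * t + ch * x)

t, x = 2.0, 1.0
t2, x2 = boost(t, x, 0.7)
# t^2 - x^2 survives the boost; t^2 + x^2 does not
```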
<p>Similarly, Lorentz boosts change the rapidity <span class="math-container">$\xi$</span> by a simple displacement, so <span class="math-container">$\Delta \xi$</span> is invariant. From this point, it's a simple exercise to show that </p>
<p><span class="math-container">$$|x||y|\cosh\Delta\xi=x_0y_0-x_1y_1$$</span></p>
<p>(as for the remaining dimensions -- remember that the standard Euclidean dot product is still relevant in <em>space</em>, so you just need to write <span class="math-container">$x_0y_0-x\cdot y=x_0y_0-x_1y_1-x_2y_2-x_3y_3$</span>.)</p>Tue, 20 Nov 2018 15:59:18 GMThttps://physics.stackexchange.com/questions/442119/-/442164#442164Abhimanyu Pallavi Sudhir2018-11-20T15:59:18ZEdge colourings of an icosahedron
https://math.stackexchange.com/questions/3006270/edge-colourings-of-an-icosahedron
3<p>I'm referring to <a href="http://kskedlaya.org/putnam-archive/2017.pdf" rel="nofollow noreferrer">problem A6 of the 2017 Putnam competition</a> -- the question is "<em>How many ways exist to colour the labelled edges of an icosahedron such that every face has two edges of the same colour and one edge of another colour, where the colours are either <strong>red, white or blue</strong>?</em>".</p>
<p>My solution is as follows: consider the planar representation of the icosahedron:
<a href="https://i.stack.imgur.com/KslAY.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/KslAY.png" alt="enter image description here"></a></p>
<p>Note that:</p>
<ul>
<li><p>There are <strong>18 ways to colour the edges of a triangle</strong> such that it has two edges of the same colour and one edge of another colour (3 choices for the colour that appears twice, 2 choices for the colour that appears once, and 3 choices for the arrangement).</p></li>
<li><p><strong>Given</strong> the colouring on any <strong>one edge</strong> of a triangle, there are <strong>6 ways to colour the remaining edges</strong> in a way that satisfies the condition (WLOG suppose the given edge is white -- then either one other edge is white (and the other edge is red or blue), which has 4 possibilities, or both the other edges have the same colour, red or blue, which is possible in 2 ways).</p></li>
<li><p>Given the colouring on <strong>two edges</strong> of a triangle, there are <strong>2 ways</strong> to colour the <strong>remaining edge</strong> in a way that satisfies the condition. WLOG, the given edges are coloured either "R R" or "R B". If it's "R R", the 2 ways to choose the other edge are "W" and "B" -- if it's "R B", the 2 ways to choose the other edge are "R" and "B".</p></li>
</ul>
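<p>(These three counts are small enough to check by brute force -- a quick sketch, helper names mine:)</p>

```python
from itertools import product

COLOURS = "RWB"

def valid(tri):
    # exactly two distinct colours among the three edges:
    # two edges of one colour, one edge of another
    return len(set(tri)) == 2

n_free      = sum(valid(t) for t in product(COLOURS, repeat=3))
n_one_fixed = sum(valid(("W",) + t) for t in product(COLOURS, repeat=2))
n_two_same  = sum(valid(("R", "R", c)) for c in COLOURS)
n_two_diff  = sum(valid(("R", "B", c)) for c in COLOURS)
```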
<p>So there are 18 ways to choose the colouring on the <strong>central triangle</strong> (the base case) of the planar representation, and you can write the number of ways to colour <strong>each successive "containing triangle"</strong> as <span class="math-container">$6^32^3$</span> multiplied by the count for the smaller triangle it contains, so the number of ways to colour the entire icosahedron should be:</p>
<p><span class="math-container">$$18(6^32^3)^3=2^{19}3^{11}$$</span></p>
<p>Unfortunately, the <a href="http://kskedlaya.org/putnam-archive/2017s.pdf" rel="nofollow noreferrer">official solution (p. 5)</a> presents an answer of <span class="math-container">$2^{20}3^{10}$</span> -- I'm off by a factor of <span class="math-container">$2/3$</span>! What's going on? What did I do wrong?</p>combinatoricscoloringplanar-graphsTue, 20 Nov 2018 12:47:20 GMThttps://math.stackexchange.com/q/3006270Abhimanyu Pallavi Sudhir2018-11-20T12:47:20ZUnderstanding variable substitutions and domain splitting in integrals
https://thewindingnumber.blogspot.com/2018/10/understanding-variable-substitutions.html
0Often when I'm reading a computation of some weird integral that contains some kind of a "trick" variable substitution, I can't help but think "How could I have thought of that?" And even when these are introduced at school, they are usually taught as "tricks", and the strategy to decide which "trick" to use is memorised -- you see $1+x^2$? Well, that's either $\tan\theta$ or $\cot\theta$. And sure, for such simple ones, that kind of a trick might make sense. You know, you have something that really looks like a trig identity, so let's just make it one...<br /><br />But I often find that these kinds of "tricks" can be motivated and made to make sense, and I think that there usually is such a way to come up with one from mathematical insight (and I think so, because someone's had to actually come up with the tricks).<br /><br />Here's the Cauchy-Schwarz inequality for functions on [0, 1]:<br /><br />\[{\left[ {\int_0^1 {f(t)g(t)dt} } \right]^2} \le \int_0^1 {f{{(t)}^2}dt} \,\int_0^1 {g{{(t)}^2}dt} \]<br />How would we go about proving this?<br /><br />Well, perhaps you recall what the proof of the Cauchy-Schwarz inequality for ordinary vectors in $\mathbb{R}^n$ looks like. Here's a standard proof:<br /><br />\[{\left( {{x_1}{y_1} + {x_2}{y_2} + ... + {x_n}{y_n}} \right)^2} \le \left( {{x_1}^2 + {x_2}^2 + ... + {x_n}^2} \right)\left( {{y_1}^2 + {y_2}^2 + ... + {y_n}^2} \right)\]<br />\[\left( {\begin{array}{*{20}{c}}{{x_1}^2{y_1}^2 + {x_1}{y_1}{x_2}{y_2} + ... + {x_1}{y_1}{x_n}{y_n} + }\\\begin{array}{l}{x_2}{y_2}{x_1}{y_1} + {x_2}^2{y_2}^2 + ... + {x_2}{y_2}{x_n}{y_n} + \\... + \\{x_n}{y_n}{x_1}{y_1} + {x_n}{y_n}{x_2}{y_2} + ... + {x_n}^2{y_n}^2\end{array}\end{array}} \right) \le \left( {\begin{array}{*{20}{c}}{{x_1}^2{y_1}^2 + {x_1}^2{y_2}^2 + ... + {x_1}^2{y_n}^2 + }\\\begin{array}{l}{x_2}^2{y_1}^2 + {x_2}^2{y_2}^2 + ... + {x_2}^2{y_n}^2 + \\... + \\{x_n}^2{y_1}^2 + {x_n}^2{y_2}^2 + ... 
+ {x_n}^2{y_n}^2\end{array}\end{array}} \right)\]<br /><br />And now we simply need the fact that $2{x_i}{y_i}{x_j}{y_j} \le {x_i}^2{y_j}^2 + {x_j}^2{y_i}^2$, which is of course true since squares are nonnegative.<br /><br />Why on Earth would I walk you through this inane proof, which I'd rather be flogged to death than have to write? Because you might get the idea that the same principle can be applied to functions.<br /><br />What exactly would be the analogy? Well, let's first "expand out" the product of the two integrals, like we expanded out the product of two sums -- this just means rewriting the product as a double-integral.<br /><br />\[\iint_{{[0,1]}^2}{f(s)g(s)f(t)g(t)\,ds\,dt} \leq \iint_{{[0,1]}^2} {{f{{(s)}^2}g{{(t)}^2}\,ds\,dt}}\]<br />This is essentially the same as our double summation on $[1,n]^2$ from earlier -- and like before, the diagonals of the summations are exactly identical (this idea should itself tell you when the inequality becomes an equality) -- and we'd like to prove, as before, that the inequality holds for each sum of corresponding elements across the diagonal.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://i.stack.imgur.com/WfuSV.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="320" data-original-width="800" height="128" src="https://i.stack.imgur.com/WfuSV.png" width="320" /></a></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: left;">(Why does the principal diagonal look oriented differently from that for the vectors in $\mathbb{R}^n$?) But how would you actually write down, on paper, this technique of summing up stuff across the principal diagonal? 
Well, you'll need to split your domain into two, then "reflect" one domain across the principal diagonal so the two integrals can be on the same (new triangular) domain.</div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;">So we start with:</div><div class="separator" style="clear: both; text-align: left;"><br /></div>\[\int\limits_0^1 {\int\limits_0^1 {f(s)g(s)f(t)g(t){\kern 1pt} ds{\kern 1pt} dt} } \le \int\limits_0^1 {\int\limits_0^1 {f{{(s)}^2}g{{(t)}^2}{\kern 1pt} ds{\kern 1pt} dt} } \]<br />Where we're integrating first on $s$ (let's say this is the x-axis) and then on $t$ (the y-axis). To reflect anything, we need to actually be dealing with that thing, so split the domain of $s$ (which we can do, since $t$ is still a variable) into $[0,t]$ and $[t,1]$. This is equivalent to splitting the entire domain into the two triangles (convince yourself that this is the case if you don't see it immediately).<br /><br />\[\int\limits_0^1 {\int\limits_0^t {f(s)g(s)f(t)g(t){\kern 1pt} ds{\kern 1pt} dt} } + \int\limits_0^1 {\int\limits_t^1 {f(s)g(s)f(t)g(t){\kern 1pt} ds{\kern 1pt} dt} } \]\[\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, \le \int\limits_0^1 {\int\limits_0^t {f{{(s)}^2}g{{(t)}^2}{\kern 1pt} ds{\kern 1pt} dt} } + \int\limits_0^1 {\int\limits_t^1 {f{{(s)}^2}g{{(t)}^2}{\kern 1pt} ds{\kern 1pt} dt} } \]<br />Where the split integrals represent the top-left and bottom-right squares respectively. Now how do we "reflect" the second part-integral on each side to match the domain of the first-part integral? 
The reflection is just:<br /><br />\[s' = t\]\[t' = s\]<br />If we transform the second part-integrals under this transformation:<br /><br />\[\int\limits_0^1 {\int\limits_0^t {f(s)g(s)f(t)g(t)\,ds\,} dt} + \int\limits_0^1 {\int\limits_{s'}^1 {f(t')g(t')f(s')g(s')\,dt'\,} ds'} \]\[\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, \le \int\limits_0^1 {\int\limits_0^t {f{{(s)}^2}g{{(t)}^2}{\kern 1pt} ds{\kern 1pt} } dt} + \int\limits_0^1 {\int\limits_{s'}^1 {f{{(t')}^2}g{{(s')}^2}{\kern 1pt} dt'{\kern 1pt} } ds'} \]<br />(Don't mind the $x'$ notation for the new co-ordinates -- you should think of $x'$ as matching up with $x$) But our transformation isn't really over. The two part integrals are now integrating over the same <i>domain</i> -- the top-left triangle -- but in different ways. To see this, just consider the "way we were integrating" before the transformation and see how it transforms under our reflection:<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://i.stack.imgur.com/k3g5y.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="320" data-original-width="800" height="128" src="https://i.stack.imgur.com/k3g5y.png" width="320" /></a></div><br />... which are different parameterisations of the same region. 
So we just reparameterise the second part-integrals (shown in green) to match that of the blue integrals, leaving the integrand the same:<br /><br />\[\int\limits_0^1 {\int\limits_0^t {f(s)g(s)f(t)g(t){\kern 1pt} ds{\kern 1pt} } dt} + \int\limits_0^1 {\int\limits_0^{t'} {f(t')g(t')f(s')g(s'){\kern 1pt} ds'{\kern 1pt} } dt'} \]\[\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, \le \int\limits_0^1 {\int\limits_0^t {f{{(s)}^2}g{{(t)}^2}{\kern 1pt} ds{\kern 1pt} } dt} + \int\limits_0^1 {\int\limits_0^{t'} {f{{(t')}^2}g{{(s')}^2}{\kern 1pt} ds'{\kern 1pt} } dt'} \]<br />And then we can add the integrals:<br /><br />\[\int\limits_0^1 {\int\limits_0^t {\left[ {2f(s)g(s)f(t)g(t)} \right]\,{\kern 1pt} ds{\kern 1pt} } dt} \,\, \le \,\,\,\int\limits_0^1 {\int\limits_0^t {\left[ {f{{(s)}^2}g{{(t)}^2} + f{{(t)}^2}g{{(s)}^2}} \right]\,{\kern 1pt} ds{\kern 1pt} } dt} \]<br />Which is true as it is true locally, i.e.<br /><br />\[2f(s)g(s)f(t)g(t) \le f{(s)^2}g{(t)^2} + f{(t)^2}g{(s)^2}\]<br />Which proves our result.<br /><br /><hr /><br />What's the point of going through all of this? Well, the point is that if I'd just thrown the substitutions at you -- or worse, the reparameterisation of the region, or the splitting in the first place -- without any motivation, then it would take about 20 days before there'd be murder charges on you and a tombstone on me. The reason you make them is because you want to unify the integrands -- but this motivation comes at the <i>very beginning</i>, before you start doing any substitutions, because that's why you're doing the substitutions in the first place, <i>that's how you come up with them</i>.<br /><br /><b>Exercise: </b>Motivate the substitutions and changes in the Gaussian integral, $\int_{-\infty}^\infty e^{-x^2}dx=\sqrt{\pi}$. 
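(An aside: the integral inequality proved above is easy to sanity-check numerically. A quick sketch -- the test functions and grid size are arbitrary choices of mine, not anything from the post:)

```python
import math

# Midpoint-rule sanity check of [int_0^1 f g dt]^2 <= (int_0^1 f^2)(int_0^1 g^2).
N = 1000
ts = [(k + 0.5) / N for k in range(N)]
f = [math.exp(t) for t in ts]
g = [math.sin(3 * t) + 2 for t in ts]

def integral(vals):
    return sum(vals) / N  # Riemann sum over [0, 1]

lhs = integral([a * b for a, b in zip(f, g)]) ** 2
rhs = integral([a * a for a in f]) * integral([b * b for b in g])
assert lhs <= rhs  # Cauchy-Schwarz

# Equality exactly when f and g are proportional (the diagonal
# terms of the two sides coincide), e.g. g = 2f:
g_prop = [2 * a for a in f]
lhs_eq = integral([a * b for a, b in zip(f, g_prop)]) ** 2
rhs_eq = integral([a * a for a in f]) * integral([b * b for b in g_prop])
assert abs(lhs_eq - rhs_eq) <= 1e-9 * rhs_eq
```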
Hint : what's the significance of the two-variable normal distribution?<br /><br />Another exercise: consider the integral $\int_\gamma \frac{f(z)}zdz$ ($\gamma$ is a circle) with the substitution $z=re^{i\theta}$ -- what substitution is this? Understand this geometrically with thin triangles and averaging on circles or whatever.antiderivativescalculusdomain splittingintegrationlinear basismathematicsvariable substitutionSat, 27 Oct 2018 14:06:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-7515239367411347178Abhimanyu Pallavi Sudhir2018-10-27T14:06:00ZComment by Abhimanyu Pallavi Sudhir on Mate in 0 moves
https://puzzling.stackexchange.com/questions/74086/mate-in-0-moves/74093#74093
@FabianRöling Pawns have directions.Mon, 22 Oct 2018 09:27:58 GMThttps://puzzling.stackexchange.com/questions/74086/mate-in-0-moves/74093?cid=221467#74093Abhimanyu Pallavi Sudhir2018-10-22T09:27:58ZAnswer by Abhimanyu Pallavi Sudhir for Newton's Third Law and conservation of momentum
https://physics.stackexchange.com/questions/435941/newtons-third-law-and-conservation-of-momentum/436015#436015
3<p>As far as the actual physics is concerned, it is meaningless to talk of whether conservation of momentum is "more fundamental" than Newton's third law -- you can axiomatise classical physics in any of these ways -- from Newton's laws, from conservation laws, from symmetry laws, from an action principle, whatever. You can prove the resulting theories are equivalent, in the sense that all the alternative axiomatic systems imply each other.</p>
<p>In terms of understanding, it makes sense to have multiple different frameworks in your head -- a symmetry-based framework is really good intuitively, especially once you understand Noether's theorem, while an action principle is the most powerful and also more useful when you leave the realm of classical physics. Treating Newton's laws as axioms isn't a great idea -- it's mostly just historically relevant.</p>
<p>When you learn more advanced physics, conservation of momentum <em>will</em> start "feeling" more fundamental -- this is simply because momentum is an interesting quantity to talk about.</p>Sun, 21 Oct 2018 21:20:57 GMThttps://physics.stackexchange.com/questions/435941/-/436015#436015Abhimanyu Pallavi Sudhir2018-10-21T21:20:57ZDerive $P \to \neg \neg P$ in a structure with not and implies
https://math.stackexchange.com/questions/2962525/derive-p-to-neg-neg-p-in-a-structure-with-not-and-implies
5<p>We can define an abstract system with the following three axiom schemes that define <span class="math-container">$\to$</span> and <span class="math-container">$\lnot$</span> as follows:</p>
<p>ax1. <span class="math-container">$P\to(Q\to P)$</span></p>
<p>ax2. <span class="math-container">$(\lnot Q \to \lnot P)\to(P\to Q)$</span></p>
<p>ax3. <span class="math-container">$(P\to(Q\to R))\to((P\to Q)\to(P\to R))$</span></p>
<p>And any logical expressions may be substituted for <span class="math-container">$P, Q, R$</span>. Now obviously, you can't assume anything else (not even any definition of <span class="math-container">$\lnot$</span>, etc.) -- these two objects <span class="math-container">$\to$</span> and <span class="math-container">$\lnot$</span> need not have anything at all to do with the standard implication and negation we know of; they just happen to satisfy the above properties. But with the above, we can prove basic "logical laws" like:</p>
<p><span class="math-container">$$P\to P$$</span></p>
<p>(which we can prove by applying ax3 to ax1, showing that <span class="math-container">$(P\to Q)\to(P\to P)$</span>, so we just need to construct a <span class="math-container">$Q$</span> that any <span class="math-container">$P$</span> implies, and such a <span class="math-container">$Q$</span> is provided by ax1: <span class="math-container">$Q:= (R\to P)$</span>.)</p>
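(A semantic aside, not part of the original question: under the usual truth-table reading of <span class="math-container">$\to$</span> and <span class="math-container">$\lnot$</span>, every instance of ax1-ax3 -- and of <span class="math-container">$P\to P$</span> -- is a classical tautology. A throwaway brute-force check:)

```python
from itertools import product

# Truth-table check (semantics only -- this is not a Hilbert-style
# derivation): every instance of ax1, ax2, ax3 and of P -> P is a
# classical tautology under the usual reading of -> and not.
def imp(a, b):
    return (not a) or b

for P, Q, R in product([False, True], repeat=3):
    assert imp(P, imp(Q, P))                              # ax1
    assert imp(imp(not Q, not P), imp(P, Q))              # ax2
    assert imp(imp(P, imp(Q, R)),
               imp(imp(P, Q), imp(P, R)))                 # ax3
    assert imp(P, P)                                      # P -> P
```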
<p>Is it possible to prove this:</p>
<p><span class="math-container">$$P\to \lnot\lnot P$$</span></p>
<p>The person who gave me this problem insists it is provable, although it seems to me that such a proof is impossible, as none of the axioms increase the depth of <span class="math-container">$\lnot$</span>s across the <span class="math-container">$\to$</span> (i.e. none of them have a more knotty right-hand-side than left-hand-side).</p>logicpropositional-calculusaxiomshilbert-calculusFri, 19 Oct 2018 19:51:34 GMThttps://math.stackexchange.com/q/2962525Abhimanyu Pallavi Sudhir2018-10-19T19:51:34ZDiscovering the Fourier transform
https://thewindingnumber.blogspot.com/2018/10/discovering-fourier-transform.html
0Consider a function with period 1 -- computing its Fourier series, you write it as:<br /><br />\[f(x) = \sum\limits_{n = - \infty }^\infty {{a_n}{e^{i2\pi \,\,nx}}} \]<br />Where<br /><br />\[{a_n} = \int_{-1/2}^{1/2} {f(x){e^{ - 2\pi inx}}dx} \]<br />That's all standard and trivial. But suppose you wanted to study a function with a larger period (we will tend this period to infinity) -- what would that look like? Well, consider $g(x)=f(x/L)$, which is this function we're looking for -- then we can rewrite the above identities as:<br /><br />\[g(xL) = \sum\limits_{n = - \infty }^\infty {{a_n}{e^{i2\pi {\kern 1pt} {\kern 1pt} nx}}} \Rightarrow g(x) = \sum\limits_{n = - \infty }^\infty {{a_n}{e^{i2\pi {\kern 1pt} {\kern 1pt} nx/L}}} \]<br />\[{a_n} = \int_{-1/2}^{1/2} {g(xL){e^{ - 2\pi inx}}dx} \Rightarrow {a_n} = \int_{-L/2}^{L/2} {g(x){e^{ - 2\pi inx/L}}dx/L} \]<br /><br />Where we transformed $x\to x/L$.<br /><br />This seems all too trivial and useless, and maybe you're looking for a little trick to turn this into something interesting. But tricks must typically also arise from some sort of insight. 
Let's assume for a moment that we didn't know anything about variable substitutions or transformations like the kind we did above (and indeed, the idea behind variable substitutions also comes from a geometric understanding of the corresponding transformation) and think about how we may re-think the Fourier transform in its context.<br /><br />Well, if the function's period is $P$, in other words it is stretched out by $P$, the same logic must be used to derive the Fourier series for the new function as for the function with period 1 -- specifically, sines and cosines with <i>longer periods </i>than $P$ don't matter (their coefficient must be zero, because otherwise you've introduced an element into the function that doesn't repeat with that period), but those with <i>shorter, divisible periods </i>matter, because they influence the value of the function within the period, perturbing it by little bits to get to the right function.<br /><br />So when dealing with our new period $L$, one would expect periods that are fractions of $L$, i.e. $L/n$, as opposed to just $1/n$. So $n/L$ is "more important" than $n$, and indeed it seems very easy to transform the summation into one in terms of this new variable, which we will still call $n$ (i.e. transform $n/L\to n$):<br /><br />\[g(x) = \sum\limits_n^{} {{a_n}{e^{i2\pi nx}}} \]<br />\[{a_n} = \frac{1}{L}\int_{-L/2}^{L/2} {g(x){e^{ - 2\pi inx}}dx} \]<br /><br />Where we labeled $a_{nL}$ as just $a_n$, because that's just a subscript, the labeling doesn't matter. Just remember that $n$ is no longer just an integer/multiple of 1, but a multiple of the fraction $1/L$.<br /><br />Now note how a non-periodic function is just a function with infinite period, i.e. $L\to\infty$. So $n$ stops being a discrete variable and starts approaching a continuous variable, which we'll call $s$, writing $a_n$ as $a(s)ds$ (why the $ds$? 
because the increment in $n$ is just $1/L$, which appears in the expression for $a_n$).<br /><br />\[g(x) = \int_{ - \infty }^\infty {ds\,\,a(s){e^{i2\pi sx}}} \]<br />\[a(s) = \int_{ - \infty }^\infty {dx\,\,g(x){e^{ - i2\pi sx}}} \]<br /><br />Which is just a pretty satisfying result.<br /><br /><hr /><br />Recall again the expressions we got for the Fourier transform and its inverse:<br /><br />\[f(t) = \int_{ - \infty }^\infty {ds\,\,\hat f(s){e^{i2\pi ts}}} \]<br />\[\hat f(s) = \int_{ - \infty }^\infty {dt{\kern 1pt} {\kern 1pt} f(t){e^{ - i2\pi st}}} \]<br />(We typically say the Fourier transform maps time-domain functions to frequency-domain ones, so we consider the latter to be the Fourier transform and the first equation to be its inverse.) Note how you can easily turn the first one into an actual Fourier transform, by transforming $s\to -s$:<br /><br />\[f(t) = \int_{ - \infty }^\infty {ds\,\,\hat f( - s){e^{ - i2\pi ts}}} \]<br />In other words:<br /><br />\[{\mathcal{F}^{ - 1}}\left\{ {f(s)} \right\} = \mathcal{F}\left\{ {f( - s)} \right\}\]<br />And of course that means ${\mathcal{F}^4} = I$, the identity operator (kind of like the derivative on complex exponentials/sine and cosine, is it not?).calculusfinite-domain fourier transformsfourierfourier seriesfourier transformsintegral transformmathematicsWed, 10 Oct 2018 21:12:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-1844418461803314162Abhimanyu Pallavi Sudhir2018-10-10T21:12:00ZQuaternion introduction: Part I
https://thewindingnumber.blogspot.com/2018/09/quaternion-introduction-part-i.html
0I generally really like the content produced at <a href="https://www.youtube.com/channel/UCYO_jab_esuFRV4b17AJtAw">3blue1brown</a>, but their <a href="https://www.youtube.com/watch?v=d4EgbgTm0Bg">recent video on quaternions</a> was just downright terrible. It entirely lacked Grant Sanderson's signature "discover it for yourself" approach, i.e. motivating the idea from the ground-up, and focused too much on an arbitrary formalism (stereographic projections aren't necessary for visualising anything).<br /><br />The right way to motivate quaternions is to start by thinking about generalising complex numbers to higher dimensions. Complex numbers are a remarkable and elegant idea -- if you don't understand why I'm saying this, you could either get off the grid and spend the rest of your life as a circus monkey, or you could read my posts "<a href="https://thewindingnumber.blogspot.com/2017/08/symmetric-matrices-null-row-space-dot-product.html">Null and row spaces, transpose and the dot product</a>" and "<a href="https://thewindingnumber.blogspot.com/2016/11/making-sense-of-eulers-formula.html">Making sense of Euler's formula</a>".<br /><br />The key idea behind complex numbers is that they are an alternate, simple representation of a specific set of linear transformations, namely: two-dimensional spirals (scaling and rotations). Note, similarly, that the real numbers can also be considered an alternate representation of e.g. scaling in one dimension.<br /><br />The natural way to generalise complex numbers to more than two dimensions may seem to be to have an imaginary unit for each possible rotation (or more precisely, each "basis rotation"). In three dimensions, the basis has three planes of rotation, and could be e.g. 
rotations in the <i>xy</i>-plane, rotations in the <i>yz</i>-plane and rotations in the <i>zx</i> plane (you may have heard these as rotations "around" the <i>z</i>, <i>x</i> and <i>y</i> axes respectively, referring to the axes that remain invariant during the rotation -- however, as it turns out, in a greater number of dimensions $n$, the number of dimensions held invariant is $n-2$, which is only equal to 1 -- i.e. a single axis -- in 3 dimensions. e.g. in 4 dimensions, an $xy$-rotation would leave the $zw$ plane invariant.)<br /><br />So let's try out this formalism, because it seems promising. We could write, e.g. <i>i</i> for the <i>yz</i> rotation, <i>j</i> for the <i>zx</i> rotation and <i>k</i> for the <i>xy</i> rotation. Try to work out some of the algebra here for yourself. What does $ij$ equal? What does $jk$ equal? What does $i^2$ equal?<br /><br />As it turns out, none of these transformations result in anything very interesting. It would have certainly been elegant if you'd gotten nice results, like $ij=k$, or something, but you don't. One of the neat things about the complex number system is that not only do all complex numbers together, or all unit complex numbers together, form a group -- even $\{1,i,-1,-i\}$ forms a group under multiplication. But $\{1,-1,i,j,k,-i,-j,-k\}$ <i>do not</i> form a group.<br /><br />How would one solve this problem? Well, the reason $i^2$ doesn't equal minus 1 is that it only offers a reflection across the $x$-axis. The matrix representing $i^2$ is:<br /><br />$${\left[ {\begin{array}{*{20}{c}}1&0&0\\0&0&{ - 1}\\0&1&0\end{array}} \right]^2} = \left[ {\begin{array}{*{20}{c}}1&0&0\\0&{ - 1}&0\\0&0&{ - 1}\end{array}} \right]$$<br />(If you can't come up with the matrix for $i$, you should review the linear algebra series -- or the circus monkey thing.) What if you reflected across all three axes, in some order? 
You'd have:<br /><br />$${i^2}{j^2}{k^2} = \left[ {\begin{array}{*{20}{c}}1&0&0\\0&{ - 1}&0\\0&0&{ - 1}\end{array}} \right]\left[ {\begin{array}{*{20}{c}}{ - 1}&0&0\\0&1&0\\0&0&{ - 1}\end{array}} \right]\left[ {\begin{array}{*{20}{c}}{ - 1}&0&0\\0&{ - 1}&0\\0&0&1\end{array}} \right] = \left[ {\begin{array}{*{20}{c}}1&0&0\\0&1&0\\0&0&1\end{array}} \right]$$<br />In other words, ${i^2}{j^2}{k^2} = 1$. Additionally you may have observed while crunching the numbers above that ${i^2}{j^2} = {k^2}$.<br /><br />This may give you an idea*. Here's another thing that may give you an idea: the reason you had $i^2=-1$ with complex numbers was that $i$ rotated <i>all</i> the axes in the plane. By contrast, $i,j,k$ only each rotate two of the three axes in 3-dimensional space.<br /><br />*the idea being that perhaps combinations of two rotations can give us more interesting results<br /><br />Well, how do you solve this problem? How do you create a rotation that "rotates all the axes"? Seemingly, you can't. Sure, you can define a rotation that rotates all three of the x, y and z axes, but that would still leave some other axis invariant, which we call "the axis of rotation". <b>Can we define a rotation that leaves no axis invariant?</b><br /><b><br /></b> In three dimensions, the answer is no. Any rotation leaves one axis invariant, and trying to rotate this axis requires rotating it with another axis, and the resulting product rotation still leaves some, calculable axis invariant.<br /><br /><div class="twn-furtherinsight">Calculate this axis.</div><br />The key is to extend our thinking to <i>four</i> dimensions. Here, you can have pairs of rotations <i>acting simultaneously</i> on two different pairs of axes. 
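These matrix claims are easy to verify mechanically. A sketch, using the post's matrix for $i$ together with one natural orientation choice for $j$ and $k$ (the orientations of $j$ and $k$ are my assumption; the conclusions don't depend on them):

```python
# Verify the 3D computations above: with i, j, k quarter-turn
# rotations in the yz-, zx- and xy-planes, i^2 = diag(1,-1,-1)
# (not -1), i^2 j^2 k^2 = identity, i^2 j^2 = k^2, yet ij != +-k,
# so nice relations like ij = k fail for this 3D attempt.

def matmul(A, B):
    return [[sum(A[r][m] * B[m][c] for m in range(3)) for c in range(3)]
            for r in range(3)]

i = [[1, 0, 0], [0, 0, -1], [0, 1, 0]]   # sends y -> z (yz-plane)
j = [[0, 0, 1], [0, 1, 0], [-1, 0, 0]]   # sends z -> x (zx-plane)
k = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]   # sends x -> y (xy-plane)
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

i2, j2, k2 = matmul(i, i), matmul(j, j), matmul(k, k)
assert i2 == [[1, 0, 0], [0, -1, 0], [0, 0, -1]]  # a reflection, not -1
assert matmul(matmul(i2, j2), k2) == I3           # i^2 j^2 k^2 = 1
assert matmul(i2, j2) == k2                       # i^2 j^2 = k^2

neg = lambda M: [[-x for x in row] for row in M]
assert matmul(i, j) not in (k, neg(k))            # ij != +-k
```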
Since there are only four axes in four dimensions, all four axes are transformed.<br /><br />Now, the obvious thing to do here may be to define an imaginary number for each pair of rotations in four dimensions -- there are $\left( {\begin{array}{*{20}{c}}4\\2\end{array}} \right)=6$ rotations, and $\left( {\begin{array}{*{20}{c}}6\\2\end{array}} \right) = 15$ such pairs. But this would be too many "basis rotations", and the rotations would not be independent of each other, since rotations in 4 dimensions can be described with only 6 basis rotations.<br /><br />So how could we make use of our idea of using pairs of rotations as our basis for describing rotations?<br /><br />The key is to make one of our four axes "special" -- call this axis $t$, and the other three axes $x, y, z$. Instead of considering all 15 rotation-pairs, we only consider the following three:<br /><br />$$\begin{array}{l}i = (tx,yz)\\j = (ty,\overline{xz})\\k = (tz,xy)\end{array}$$<br /><div class="twn-furtherinsight">This is not the only possible representation of the quaternions, of course. Even among complex numbers, you have two possible representations -- you could make $i$ a counter-clockwise rotation, as is conventional, or a clockwise one, i.e. there is a symmetry between $i$ and $-i$. For quaternions, it turns out there are 48 different possible representations -- prove this.</div><br />Where $tx$ represents a rotation that sends $t$ to $x$ (i.e. a counter-clockwise rotation on a plane where $t$ is the x-axis and $x$ is the y-axis) and $\overline{xz}$ represents a rotation that sends $z$ to $x$, i.e. 
the clockwise rotation on a plane where $x$ is the x-axis and $z$ is the y-axis.<br /><br />It turns out that these pairs -- called <i>quaternions</i> -- in fact allow the representation of 3-dimensional rotations, since you need only a $\left( {\begin{array}{*{20}{c}}3\\2\end{array}} \right)=3$-dimensional basis to represent rotations in 3 dimensions.<br /><br /><div class="twn-furtherinsight">Think: Are there any other dimensions that allow such a system to be defined? Can you have, e.g. "hexternions"?</div><br /><div class="twn-pitfall">Note that however tempting it may seem, there is no known natural description of special relativity in terms of quaternions. Sorry.</div><br />One may work through the algebra of these new quaternions by tracking the position of each axis through the multiplication, and as it turns out, it is indeed much more elegant than the more obvious representation detailed earlier:<br /><br />$$\begin{array}{l}ij = k,jk = i,ki = j\\{i^2} = {j^2} = {k^2} = - 1\\ijk = - 1\end{array}$$<br />In the next several articles, we will look at exactly how 3 dimensional rotations can be represented with quaternions, the relation between quaternions and the dot and cross products through the commutative and anti-commutative parts, and further extensions of the quaternions to higher dimensions.complex algebracomplex numberscross productgroup theorylinear algebramathematicsquaternionsFri, 21 Sep 2018 16:41:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-6898563123316782451Abhimanyu Pallavi Sudhir2018-09-21T16:41:00ZAnswer by Abhimanyu Pallavi Sudhir for If force is a vector, then why is pressure a scalar?
https://physics.stackexchange.com/questions/429998/if-force-is-a-vector-then-why-is-pressure-a-scalar/430008#430008
4<p>Pressure is a scalar because it does not behave as a vector -- specifically, you can't take the "components" of pressure and combine them in a Pythagorean sum to obtain its magnitude. Instead, pressure is the <em>average</em> of the components, <span class="math-container">$(P_x+P_y+P_z)/3$</span>.</p>
<p>The way to understand pressure is in terms of the stress tensor, and pressure is one-third the trace of the stress tensor (up to sign convention). Once you understand this, the question becomes equivalent to questions like "why is the dot product a scalar?" (trace of the tensor product), "why is the divergence of a vector field a scalar?" (trace of the tensor derivative), etc.</p>
<p>There is no physical significance to taking the diagonal components of a tensor and putting them in a vector -- there <em>is</em> a physical significance to adding them up, and the invariance properties of the result tells you that it is a scalar.</p>
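To make the invariance concrete: under a rotation of coordinates the stress tensor transforms as $\sigma' = R\sigma R^T$; its trace is unchanged, while the individual diagonal entries are not. A small sketch -- the particular tensor and rotation are arbitrary made-up values:

```python
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[r][m] * B[m][c] for m in range(n)) for c in range(n)]
            for r in range(n)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def trace(A):
    return sum(A[d][d] for d in range(len(A)))

# An arbitrary symmetric stress tensor (made-up numbers):
sigma = [[5.0, 1.0, 0.0],
         [1.0, 3.0, 2.0],
         [0.0, 2.0, 4.0]]

# Rotate the coordinate frame by 30 degrees about z:
c, s = math.cos(math.pi / 6), math.sin(math.pi / 6)
R = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
sigma_rot = matmul(matmul(R, sigma), transpose(R))

# The trace (hence the pressure, a third of it) is frame-invariant...
assert abs(trace(sigma_rot) - trace(sigma)) < 1e-12
# ...but the diagonal entries themselves are frame-dependent:
assert abs(sigma_rot[0][0] - sigma[0][0]) > 0.1
```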
<p>See also: <a href="https://physics.stackexchange.com/questions/186045/why-do-we-need-both-dot-product-and-cross-product/419873#419873">Why do we need both dot product and cross product?</a></p>Fri, 21 Sep 2018 08:57:17 GMThttps://physics.stackexchange.com/questions/429998/-/430008#430008Abhimanyu Pallavi Sudhir2018-09-21T08:57:17ZAnswer by Abhimanyu Pallavi Sudhir for How can the solutions to equations of motion be unique if it seems the same state can be arrived at through different histories?
https://physics.stackexchange.com/questions/426445/how-can-the-solutions-to-equations-of-motion-be-unique-if-it-seems-the-same-stat/426453#426453
1<p>"The jar is empty at present" just tells you $f(0)$. You also need $f'(0)$, $f''(0)$, etc.</p>Mon, 03 Sep 2018 09:46:25 GMThttps://physics.stackexchange.com/questions/426445/-/426453#426453Abhimanyu Pallavi Sudhir2018-09-03T09:46:25Zhtaccess – Deny all access to files?
https://stackoverflow.com/questions/52136524/htaccess-deny-all-access-to-files
0<p>I am using <a href="http://www.htaccesseditor.com/en.shtml" rel="nofollow noreferrer">this</a> .htaccess editor, which tells me that the following code to "Deny all access to files" is "strongly recommended" (and apparently removing the minus from "-Indexes" allows access):</p>
<pre><code><Files ~ "^\.(htaccess|htpasswd)$">
deny from all
</Files>
Options -Indexes
order deny,allow
</code></pre>
<p>I'm new to .htaccess files, and honestly don't really know what this code does (I haven't yet set up my web server to try things out for myself), but the first line seems to suggest it only denies access to the htaccess and htpasswd files. </p>
<p><strong>Does it only deny access to the htaccess file or to literally "all files" as the description states?</strong> </p>
<p>Obviously, I don't want the latter -- in fact, I don't want to deny access to any file except if there are security concerns.</p>.htaccessSun, 02 Sep 2018 11:23:48 GMThttps://stackoverflow.com/q/52136524Abhimanyu Pallavi Sudhir2018-09-02T11:23:48ZWhy are negative temperatures hot?
https://thewindingnumber.blogspot.com/2018/08/why-are-negative-temperatures-hot.html
0You've probably heard the statement "negative temperatures are <i>hot</i>!", referring of course to negative absolute temperatures.<br /><br />But why are they hot? Well, a common explanation is that it's not really the temperature $T$ that is the fundamental quantity, but rather the statistical beta, or "coldness" $\beta=1/T$. So negative temperatures have <i>negative</i> coldness, which is hotter than any positive temperature, since even the hottest positive temperature is only going to give you a small, but positive coldness. So the fact that negative temperatures are hot is a result of the fact that $1/x$ is not really decreasing everywhere, due to its discontinuity.<br /><br />But why? Why is $\beta$ the fundamental quantity? Why should we arbitrarily consider this to be our metric of hotness and coldness, and not $T$?<br /><br />This is a really interesting example to teach people to think in a positivist way in physics, and to operationalise things. What does it mean for something to be hot?<br /><br />Well, you touch it and you say "Ouch!"<br /><br />Seriously, that's all there is -- if you touch something hot, you say "Ouch!", if you touch something cold, you say "Whee!", or something. That's the fundamental, positivist definition of hotness -- "Does it feel hot?"<br /><br />Well, why would something feel hot? <i>Because it transfers heat to you</i>. And this is our operational, positivistic definition of hotness -- if one body transfers heat to another body, it is said to be hotter than the other body.<br /><br />So we need to find out a criterion to decide the direction of heat flow between two bodies. In the past, you've probably taken for granted that heat is transferred from a body with higher temperature to that with lower temperature, but that's just a crappy high school definition. What really causes heat diffusion? 
Well, when there are a lot of fast-moving particles in one place and slow-moving particles in another, it turns out that a state where the particles are more uniformly spread-out is more likely to happen in future. This is just the requirement that entropy must increase -- it's the second law of thermodynamics.<br /><br />So if we have body 1 with temperature $T_1$ and body 2 with temperature $T_2$, with heat flow of $\Delta Q$ from body 1 to body 2, then the second law of thermodynamics is stated as:<br /><br />$$\Delta S_1+\Delta S_2>0$$<br />$$-\Delta Q/T_1+\Delta Q/T_2>0$$<br />$$\Delta Q\left(\frac1{T_2}-\frac1{T_1}\right)>0$$<br />In other words -- if $\Delta Q>0$, i.e. if the heat flow is really from body 1 to body 2, then we require $1/T_2>1/T_1$, and if the heat flow is from body 2 to body 1 ($\Delta Q<0$), we require $1/T_1>1/T_2$.<br /><br />And there you have it! Heat does <i>not</i> flow from the body with higher temperature to the body with lower temperature -- it flows from the body with lower $1/T$ to the body with higher $1/T$. For positive temperatures, these are the same thing -- but negative temperatures have the lowest $1/T$, and are thus hotter.<br /><br /><hr /><br />So those of you who want the U.S. to switch to Celsius, or those who report temperatures in Kelvin for no good reason except intellectual signalling... perhaps start reporting <i>statistical betas</i> in 1/Kelvins instead.<br /><br />...<br />"Hey, Alexa, is it chilly outside?"<br />"The coldness in your area is 0.00375 anti-Kelvin."<br /><br />"...I think I'll just risk freezing to death."entropymetric systemnegative temperaturesphysicsstatistical betastatistical physicsthermal physicsthermodynamicsSat, 18 Aug 2018 19:17:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-3164462372727249499Abhimanyu Pallavi Sudhir2018-08-18T19:17:00ZAnswer by Abhimanyu Pallavi Sudhir for From the speed of light being an invariant to being the maximum possible speed
https://physics.stackexchange.com/questions/331119/from-the-speed-of-light-being-an-invariant-to-being-the-maximum-possible-speed/423423#423423
0<p>A simple thought experiment does the trick -- consider a train moving faster than light, and it has headlights (it's a glass train). According to a stationary observer (stationary in a reference frame where the train is faster than light), the train must always be in front of the light, but according to an observer hanging out of the train, the light must be in front of him, since light speed is still $c$.</p>
<p>It might not seem like this frame-dependence of the order of the two objects is a problem, but it is -- say, for instance, the train is moving towards a high-tech wall which is programmed to do this when switched ON:
(1) if hit by a train, make the world explode
(2) if light is incident, switch OFF.
The wall is currently switched ON. According to one observer, the world explodes, whereas according to another, it doesn't. This is an inconsistency.</p>
<p>Why wouldn't this argument apply to <em>any</em> speed and prohibit all motion? For example, why can't the wall be programmed to switch OFF a certain amount of time after light is incident? Relativity says this is okay, because time intervals can dilate and rescale between reference frames.</p>
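The order flip can be read off the Lorentz transformation directly -- a quick numeric sketch of my own, in natural units with $c = 1$ (`lorentz_dt` is a made-up helper name):

```python
import math

def lorentz_dt(dt, dx, v):
    # time separation of two events seen from a frame moving at speed v (c = 1)
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (dt - v * dx)

# spacelike pair: a signal moving at 2c covers dx = 2 in dt = 1
dt, dx = 1.0, 2.0
print(lorentz_dt(dt, dx, 0.1))   # positive: same temporal order
print(lorentz_dt(dt, dx, 0.9))   # negative: order reversed
```

Any observer speed $v > c^2/u = 0.5$ reverses the order of the two events, which is exactly what breaks the wall scenario.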
<p>But in order to make FTL speeds okay, you need to allow time to flip direction -- this is why the real condition is "to go faster than light, you must forgo causality", or simply, "locality = causality".</p>Sat, 18 Aug 2018 12:28:48 GMThttps://physics.stackexchange.com/questions/331119/-/423423#423423Abhimanyu Pallavi Sudhir2018-08-18T12:28:48ZAnswer by Abhimanyu Pallavi Sudhir for Link between Special relativity and Newtons gravitational law
https://physics.stackexchange.com/questions/123243/link-between-special-relativity-and-newtons-gravitational-law/423379#423379
0<p>Consider three theories:</p>
<p>$$L_A=1$$
$$L_B=1+h$$
$$L_C=1+h+h^2$$</p>
<p>Theory A is a special case of Theory C when $h$ is small, and Theory B is also a special case of C when $h$ is small -- does this mean A and B are the same?</p>
<p>This is not a perfect analogy, but an example as to why this sort of reasoning breaks down.</p>Sat, 18 Aug 2018 07:13:36 GMThttps://physics.stackexchange.com/questions/123243/-/423379#423379Abhimanyu Pallavi Sudhir2018-08-18T07:13:36ZAnswer by Abhimanyu Pallavi Sudhir for Why is velocity defined as 4-vector in relativity?
https://physics.stackexchange.com/questions/423360/why-is-velocity-defined-as-4-vector-in-relativity/423364#423364
5<p>"It should transform like a four-vector under a Lorentz transformation" is a generalisation of several intuitions you typically have regarding how natural objects/tensors should behave in special relativity -- an obvious one is "no special status to any individual dimension, since space and time are inherently symmetric". That $dx^\mu/dx^0$ doesn't transform like a four-vector is obvious from the fact that it gives special preference to time.</p>
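For concreteness, here is a quick numeric check (my own sketch, in 1+1 dimensions with $c = 1$) that $dx^\mu/ds$ keeps its Minkowski norm under a boost -- the defining behaviour of a four-vector:

```python
import math

def boost(v):
    # 1+1D Lorentz boost matrix, c = 1
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, -g * v], [-g * v, g]]

def minkowski_norm(u):
    # signature (+, -)
    return u[0] ** 2 - u[1] ** 2

# four-velocity of a particle moving at speed w: u = (gamma_w, gamma_w * w)
w = 0.6
gw = 1.0 / math.sqrt(1.0 - w * w)
u = [gw, gw * w]

L = boost(0.3)
u2 = [L[0][0] * u[0] + L[0][1] * u[1], L[1][0] * u[0] + L[1][1] * u[1]]
print(minkowski_norm(u), minkowski_norm(u2))  # both 1: the norm is invariant
```

The ordinary velocity $dx^1/dx^0$, by contrast, transforms by the non-linear velocity-addition rule rather than by the boost matrix.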
<p>The conventional way to define four-velocity in relativity is as $dx^\mu/ds$. Your 2-tensor idea is cute -- it is similar to the angle tensor generalised to four-dimensions -- but it doesn't satisfy the uses we have of the standard four-velocity (e.g. how would the four-momentum be defined? $m\,dx^\mu/dx^\nu$? That wouldn't be conserved.)</p>Sat, 18 Aug 2018 06:11:14 GMThttps://physics.stackexchange.com/questions/423360/why-is-velocity-defined-as-4-vector-in-relativity/423364#423364Abhimanyu Pallavi Sudhir2018-08-18T06:11:14ZAnswer by Abhimanyu Pallavi Sudhir for Sheldon Cooper Primes
https://math.stackexchange.com/questions/1024969/sheldon-cooper-primes/2877436#2877436
1<p>I created a <a href="https://www.khanacademy.org/computer-programming/run-testssheldon-cooper-primes/5927601331011584" rel="nofollow noreferrer">script</a> you can play with here to test this out. Note that the answer depends on your numerical base -- among all bases I've tried, 10 seems to be the <em>only</em> base in which there's a Sheldon Cooper prime.</p>
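For those who don't want to run the linked script (which is JavaScript), here is a rough stdlib-only Python equivalent of the test; the helper names are my own:

```python
def primes_up_to(limit):
    """Simple sieve of Eratosthenes."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = [False] * len(sieve[i * i :: i])
    return [i for i, is_p in enumerate(sieve) if is_p]

def digit_product(n, base=10):
    prod = 1
    while n:
        prod *= n % base
        n //= base
    return prod

def reverse_digits(n, base=10):
    rev = 0
    while n:
        rev = rev * base + n % base
        n //= base
    return rev

def sheldon_primes(limit, base=10):
    """Primes p (the k-th prime, 1-indexed) whose digit product equals k,
    and whose digit-reversal is the reverse(k)-th prime."""
    ps = primes_up_to(limit)
    index = {p: k for k, p in enumerate(ps, start=1)}
    return [
        p
        for p, k in index.items()
        if digit_product(p, base) == k
        and index.get(reverse_digits(p, base)) == reverse_digits(k, base)
    ]

print(sheldon_primes(10_000))  # 73, the 21st prime: 7*3 = 21, and 37 is the 12th
```

In base 10 this flags 73; changing `base` reproduces the multi-base experiments described above.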
<p>Base 16 seems promising, however -- it has a large number of "special emirps", and actually provides primes with the appropriate product of digits, which very few bases provide. </p>
<p>Can someone try base = 16, convbase = 2 (and perhaps other bases in multiple tabs) with a large uppercap (e.g. 10,000,000) using fastcount = false? It would take ~15 hours for an upper cap of 10 million -- or just 90 minutes for an uppercap of 1 million -- but I can't leave my laptop on for so long (the fan is malfunctioning).</p>Thu, 09 Aug 2018 16:47:08 GMThttps://math.stackexchange.com/questions/1024969/-/2877436#2877436Abhimanyu Pallavi Sudhir2018-08-09T16:47:08ZA curious infinite sum arising from an elementary geometric argument
https://thewindingnumber.blogspot.com/2018/07/a-curious-infinite-sum-arising-from.html
0A well-known elementary geometric argument for the sum of an infinite geometric progression proceeds as follows: consider a Euclidean triangle $\Delta ABC$ with angles $A=\alpha$, $B=\beta$, $C=2\beta$ and bisect $C$ to create a point $C'$ on $AB$. Then $\Delta ABC \sim \Delta ACC'$. Record the area of $\Delta C'BC$ to a counter. Repeat the same bisection with $C'$, $C''$, ad infinitum, each time adding to the counter the area of the piece of the triangle that <i>isn't</i> similar to the parent triangle and bisecting the triangle that is.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://3.bp.blogspot.com/-b9Xt5bxIYFI/W11MmoHSv6I/AAAAAAAAFBc/a9Fv2xTtFYgM7SGtqDZRlV19sVA39ZoCgCLcBGAs/s1600/tribasic.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="129" data-original-width="230" src="https://3.bp.blogspot.com/-b9Xt5bxIYFI/W11MmoHSv6I/AAAAAAAAFBc/a9Fv2xTtFYgM7SGtqDZRlV19sVA39ZoCgCLcBGAs/s1600/tribasic.png" /></a></div><br />Suppose the area of the original triangle $\Delta ABC$ is 1, and the piece $ACC'$ has area $x$ (thus each succeeding similar copy has area a fraction of $x$ of the preceding triangle). Then the total value of our counter, which approaches 1, is:<br /><br />$$(1-x)+x(1-x)+x^2(1-x)+...=1$$<br />$$1+x+x^2+...=\frac1{1-x}$$<br />Where $x$ depends on the actual angle $\beta$.<br /><br />It is interesting, however, to consider the case of a general scalene triangle $\Delta ABC$ where $C$ is not necessarily twice of $B$. Here each successive triangle wouldn't be similar to the last, thus we won't be dealing with a geometric series.<br /><br />Let the angles of $\Delta ABC$ be $A=\alpha$, $B=\beta$,$C=\pi-\alpha-\beta$. We bisect angle $C$, as before, adding to our counter the piece that contains the angle $B$. 
The remaining triangle has angles $\alpha$, $\frac{\pi-\alpha-\beta}{2}$ and $\pi-\alpha-\frac{\pi-\alpha-\beta}{2}$.<br /><br />We keep repeating the process, each time bisecting the angle that is neither $\alpha$ nor the angle formed as half the angle that was just bisected, and adding to our counter the area of the piece that does not contain the angle $A$, while splitting the piece that does.<br /><br />To keep track of the angles in each successive triangle, we define three series:<br /><br />$$\begin{gathered}<br />{\alpha _n} = \alpha\\<br />{\beta _n} = {\gamma _{n - 1}}/2\\<br />{\gamma _n} = \pi - {\alpha _n} - {\beta _n}\\<br />\end{gathered}$$<br />These are defined recursively, of course, so we calculate the explicit form by substituting $\gamma_n$ into $\beta_n$ to get a recursion within $\beta_n$ -- then with the simple initial-value conditions $\alpha_0=\alpha$, $\beta_0=\beta$, etc. we get:<br /><br />$$\begin{gathered}<br />{\alpha _n} = \alpha\\<br />{\beta _n} = \frac{{\pi - \alpha }}{3} + {\left( { - \frac{1}{2}} \right)^n}\left( {\beta - \frac{{\pi - \alpha }}{3}} \right)\\<br />{\gamma _n} = \frac{{2(\pi - \alpha )}}{3} - {\left( { - \frac{1}{2}} \right)^n}\left( {\beta - \frac{{\pi - \alpha }}{3}} \right)\\<br />\end{gathered}$$<br />The area ratio of the piece we're keeping at each stage is $\frac{{\sin {\alpha _n}}}{{\sin {\alpha _n} + \sin {\beta _n}}}$, therefore the convergence of the sum of their areas to 1 implies:<br /><br />$$\begin{gathered}<br />\frac{{\sin \alpha }}{{\sin \alpha + \sin \beta }} + \frac{{\sin \beta }}{{\sin \alpha + \sin \beta }}\frac{{\sin \alpha }}{{\sin \alpha + \sin {\beta _1}}} \hfill \\<br />\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, + \frac{{\sin \beta }}{{\sin \alpha + \sin \beta }}\frac{{\sin {\beta _1}}}{{\sin \alpha + \sin {\beta _1}}}\frac{{\sin \alpha }}{{\sin \alpha + \sin {\beta _2}}} + ... 
= 1 \hfill \\ <br />\end{gathered} $$<br />Or more compactly:<br /><br />$$\sum\limits_{k = 0}^\infty {\left[ \left(1-x_k(\alpha,\beta)\right)\prod\limits_{j = 0}^{k - 1} {x_j(\alpha,\beta)} \right]} = 1$$<br />Where:<br /><br />$${x_k}(\alpha ,\beta ) = \frac{{\sin \left( {\frac{{\pi - \alpha }}{3} + {{\left( { - \frac{1}{2}} \right)}^k}\left( {\beta - \frac{{\pi - \alpha }}{3}} \right)} \right)}}{{\sin \alpha + \sin \left( {\frac{{\pi - \alpha }}{3} + {{\left( { - \frac{1}{2}} \right)}^k}\left( {\beta - \frac{{\pi - \alpha }}{3}} \right)} \right)}}$$<br />For all values of $\alpha$ and $\beta$.<br /><br /><hr /><br />Well, have we truly discovered something new? <br /><br />Turns out, no. It doesn't even matter what $x_k(\alpha,\beta)$ is, really -- the identity $\sum\limits_{k = 0}^\infty {\left[ \left(1-x_k(\alpha,\beta)\right)\prod\limits_{j = 0}^{k - 1} {x_j(\alpha,\beta)} \right]} = 1$ will always be true. Indeed, it is a telescoping sum:<br /><br />$$\begin{gathered}<br />1 - {x_0} + \hfill \\<br />\left( {1 - {x_1}} \right){x_0} + \hfill \\<br />\left( {1 - {x_2}} \right){x_0}{x_1} + \hfill \\<br />\left( {1 - {x_3}} \right){x_0}{x_1}{x_2} + \hfill \\<br />... = 1 \hfill \\ <br />\end{gathered} $$<br />All that is required is that the final term, $x_0x_1x_2x_3...x_k$ approaches 0 as $k\to\infty$ -- <a href="https://thewindingnumber.blogspot.com/2018/07/intuition-to-convergence.html">this ensures sum convergence</a>. (So I suppose I was not completely right when I said it doesn't matter what $x_k$ is -- but considering renormalisation and stuff, I kinda was.)<br /><br />This raises two interesting questions:<br /><ol><li>How would this "telescoping sum" argument work for the simple geometric series?</li><li>Can we get interesting incorrect (? 
perhaps renormalisations) sums by choosing an $x_k$ sequence whose product doesn't approach zero?</li></ol><br />Well, for the geometric series we had $\beta = (\pi - \alpha )/3$ so that ${x_k}(\alpha ,\beta ) = x(\alpha,\beta)=\frac{{\sin \beta }}{{\sin \alpha + \sin \beta }}$. Indeed, one may confirm that setting $x_0=x_1=x_2=...$ yields the product of the geometric series and $1-x$, and that happens to be telescoping. This is really just our standard proof of the series, where we multiply the sum by $x$, subtract this from the original sum, etc.<br /><br />As for the second question -- consider, for example, $x_k=k+1$. It gives you the sum $1!\cdot1+2!\cdot2+3!\cdot3+...=-1$. Of course, this is just the identity $n\cdot n!=(n+1)!-n!$, and the telescope doesn't really cancel out so you're left with $\infty!-1$.blogdivergent sumsgeometric proofgeometric seriesinfinite seriesrenormalizationtelescoping sumSun, 29 Jul 2018 06:24:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-1357005204593986817Abhimanyu Pallavi Sudhir2018-07-29T06:24:00ZProbability of immortality for a transhuman being
https://thewindingnumber.blogspot.com/2018/07/probability-of-immortality-transhuman.html
0Some species in nature, including the <i>Turritopsis dohrnii</i> ("the immortal jellyfish"), are <i>biologically immortal</i>. This means that they do not die due to biological reasons -- however, they obviously may die due to other physical reasons, like getting smashed with a hammer. If I asked you to calculate the probability of a biologically immortal species being truly immortal -- i.e. of it <i>never dying</i> (ever) -- what would you answer?<br /><br />Well, obviously the probability is zero. Provided there is any chance at all of the jellyfish getting squashed by a hammer this year, with a sufficient amount of time you can be as certain as you want -- the probability can be as close to 1 as you want -- that the jellyfish will get squashed by a hammer.<br /><br />But what if the probability of getting smashed by a hammer in that year was <i>decreasing</i> with time? Perhaps this is not the case with jellyfish, but it certainly would be true for, e.g. a transhuman society where technological innovation continually decreases the probability of dying (to be precise, the probability density of dying in the next interval of time $\Delta t$ given that you haven't already died).<br /><br />Let $p(t)\Delta t$ be the probability of our transhuman dying between $t$ and $t+\Delta t$. Then the probability of the transhuman <i>never</i> dying any time from 0 to infinity is:<br /><br />\[\begin{gathered}<br /> P = \left( {1 - p(0)\Delta t} \right)\left( {1 - p(\Delta t)\Delta t} \right)\left( {1 - p(2\Delta t)\Delta t} \right)... \\<br /> = \coprod\limits_{t = 0}^\infty {\left( {1 - p(t)dt} \right)} \\<br />\end{gathered} \]<br />Of course, we need to take the limit as $\Delta t \to 0$.<br /><br />The reason this problem is so interesting is because it introduces the idea of <i>multiplicative calculus</i>. If the product had been a sum, the solution would've been utterly, ridiculously straightforward. 
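Before evaluating the product analytically, it can be sanity-checked numerically -- a quick Python sketch of my own, with an assumed hazard rate $p(t)=1/(1+t)^2$, for which $\int_0^\infty p(t)\,dt = 1$:

```python
import math

def survival_product(p, T, dt):
    """Discretised product of (1 - p(t) dt) over [0, T)."""
    prob = 1.0
    t = 0.0
    while t < T:
        prob *= 1.0 - p(t) * dt
        t += dt
    return prob

# assumed hazard rate with a convergent integral: int_0^inf dt/(1+t)^2 = 1
p = lambda t: 1.0 / (1.0 + t) ** 2

approx = survival_product(p, T=1000.0, dt=0.001)
exact = math.exp(-1.0)  # the closed form derived below, e^{-int p dt}
print(approx, exact)
```

The discretised product lands close to $e^{-1}\approx 0.368$, matching the closed form derived below.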
But since it's not, it's only really ridiculously straightforward. The natural way (no pun intended) to convert a product (we use the symbol \(\coprod {} \) to refer to the <i>multiplicative integral</i>) into a sum (or rather an integral) is to take the logarithm:<br /><br />\[\begin{gathered}<br />\ln P = \ln \left( {1 - p(0)\Delta t} \right) + \ln \left( {1 - p(\Delta t)\Delta t} \right) + \ln \left( {1 - p(2\Delta t)\Delta t} \right)... \\<br />= \int_0^\infty {\ln \left( {1 - p(t)dt} \right)} \\<br />\end{gathered} \]<br />This may look awkward to you -- and indeed, the standard form of the multiplicative integral typically has the $dt$ differential as the exponent of the integrand so as to obtain after taking the logarithm the additive integral in its standard form.<br /><br />But you might remember that<br /><br />\[\ln (1 - x) = - x - \frac{{{x^2}}}{2} - \frac{{{x^3}}}{3} - ...\]<br />Or to first-order in $x$ (since the "x" here, $p(t)dt$, approaches 0), $\ln (1 - x) \approx - x$. Thus:<br /><br />\[\ln P = - \int_0^\infty {p(t)dt} \]<br />Or:<br /><br />\[P = {e^{ - \int_0^\infty {p(t)dt} }}\]<br />Which is pretty neat! Interestingly, this means that if the integral of $p(t)$ diverges (e.g. if $p(t)\sim1/t$), you are <i>guaranteed</i> to eventually die. So this gives mankind a manual of how fast technological progress on this issue needs to be in the transhuman age to leave a nonzero probability of immortality. Internalise it in your demand, fellow robot!<br /><hr/><div class="twn-furtherinsight">Here, we've calculated the probability of <em>immortality</em>. The probability of eventual <em>mortality</em> is of course 1 minus this, but could also be calculated from the get-go -- try this out. You'll get $P'=\int_0^\infty p(t) e^{-\int_0^t p(\tau) d\tau}dt$, which you can then simplify with a variable substitution. 
Perhaps this gives you some insight into variable substitutions in integrals of this sort.</div>biological immortalityimmortalitymultiplicative calculusprobabilitytranshumanismSat, 28 Jul 2018 18:05:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-1083934993852776864Abhimanyu Pallavi Sudhir2018-07-28T18:05:00ZIntuition to convergence
https://thewindingnumber.blogspot.com/2018/07/intuition-to-convergence.html
2We've all seen these kinds of sums. You start with something obviously divergent, like:<br /><br />$$S = 1 + 2 + 4 + 8 + ...$$<br />And then apply standard manipulations on it to obtain a bizarrely finite result:<br /><br />$$\begin{gathered}<br /> \Rightarrow S = 1 + 2(1 + 2 + 4 + ...) \hfill \\<br /> \Rightarrow S = 1 + 2S \hfill \\<br /> \Rightarrow S = - 1 \hfill \\<br />\end{gathered} $$<br />How exactly is this result to be interpreted? Surely the definition of an infinite sum is as a limit of a finite sum as the upper limit increases without bound -- by this definition it would seem that $S$ evidently doesn't approach $-1$, it diverges to infinity. Is there, then, something wrong with the form of our argument? And if so, why does it seem to work for so many other sums, like convergent geometric progressions?<br /><br /><hr /><br />We'll get to all that in a moment, but first, let's talk about how to fold a tie into thirds. We know how to fold a tie -- or a strip of paper or a rope or whatever -- into halves, into quarters, into any power of two. But how would one fold it into thirds? Sure, we can approximate it by trial and error, but is there a more efficient algorithm to approximate it?<br /><br />Here's one way: start with some approximation to 1/3 of the tie -- any approximation, however good or bad. Now consider the rest of the tie (~2/3) and fold it in half. Take one of these halves -- <i>this is demonstrably a better approximation to 1/3 than your original</i>. In fact, the error in this approximation is exactly half the error in the original approximation. You can keep repeating this process, getting arbitrarily close to 1/3.<br /><br />Why does it work? Well, it's obvious why it works. 
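The folding algorithm is easy to simulate -- a quick sketch (`fold_step` is my own name for one fold):

```python
def fold_step(x):
    # fold the remaining (1 - x) portion of the tie in half and keep one half
    return 0.5 * (1.0 - x)

x = 0.9  # any starting guess, however bad
for _ in range(50):
    x = fold_step(x)
print(x)  # converges to 1/3; the error halves (and flips sign) each step
```

Starting from the deliberately bad guess 0.9, fifty folds pin down 1/3 to machine precision.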
More interestingly, how could one have come up with this technique from scratch?<br /><br />The key insight here is that if you had started from exactly 1/3 and performed this algorithm, defined as $x_{n+1}=\frac12(1-x_n)$, the sequence would be constant -- it would be 1/3s all the way down.<br /><br />However, this is <i>not</i> a sufficient argument. For instance, here's another sequence of which 1/3 is a fixed point: the algorithm $x_{n+1}=1-2x_n$. However here, if you were to start with <i>any other number but 1/3, the sequence would not approach 1/3</i>, but rather diverge away. While 1/3 is still a fixed point, this is an <i>unstable</i> fixed point, while in the previous case it was a <i>stable</i> fixed point.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://i.imgur.com/U6M2g19.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="333" data-original-width="800" height="166" src="https://i.imgur.com/U6M2g19.png" width="400" /></a></div><br />But what exactly is wrong with extending the same argument to $x_{n+1}=1-2x_n$? Well, perhaps we should state the argument precisely in the case of $x_{n+1}=\frac12(1-x_n)$. The reason we know this converges to 1/3 regardless of the initial value is that 1/3 is the <i>only</i> value which stays the same in the algorithm (i.e. is a steady-state solution). Convergence of the sequence requires that the fluctuations get smaller, i.e. the sequence approaches a value that it doesn't fluctuate around; it approaches a steady state.<br /><br />But this reveals our central assumption -- we <i>assumed</i> that the sequence is convergent at all! 
If it is convergent, then 1/3 is the only value it could converge to, because convergence means approaching a steady state, and 1/3 is the only steady state.<br /><br /><hr /><br />The same principle applies to our original problem -- an infinite series is also a sequence, a sequence of partial sums. Our mistake is really in this step:<br /><br />$$\begin{gathered}<br /> ... \hfill \\<br /> 1 + 2(1 + 2 + 4 + 8 + ...) = 1 + 2S \hfill \\<br />\end{gathered} $$<br />By declaring that this is the same $S$, we have assumed that this sum really has a value. To be even clearer, consider this (taking $n\to\infty$):<br /><br />$$\begin{gathered}<br /> S = 1 + 2 + 4 + ... + {2^n} \hfill \\<br /> S = 1 + 2(1 + 2 + 4 + ... + {2^{n - 1}}) = 1 + 2S? \hfill \\<br />\end{gathered} $$<br />In other words, we assumed that $S$ reaches a steady state, that removing the last term $2^n$ wouldn't change the value of the summation. This would've been true if we were dealing with $(1/2)^n$ instead, because then the partial sum does reach a steady state, since its "derivative", $(1/2)^n$, approaches 0.<br /><br /><b>With that said,</b> the sum $1+2+4+8+...=-1$ (and other such surprising results) <i>can</i> in fact be correct. What we've proven here is that <i>if the sum converges, it converges to -1</i>. Otherwise, it's $2^\infty -1$. If you can construct an axiomatic system in which the sum does converge, where 0 behaves like $2^\infty$ in some specific sense, then the identity would be true. Such a system does in fact exist: it's called the 2-adic system.<br /><br /><hr /><br />You know, there is a sense in which you can understand the 2-adic system. When you take partial sums of $1+2+4+8+...$, you always get sums that are "1 less than a power of 2". $1+2+4+8+16=2^5-1$, for example -- what's the significance of $2^5$? Well, it's a number which 2 divides into 5 times. What's a number that 2 divides into an infinite number of times? Well, it's zero, and $0-1=-1$. 
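This can be checked mechanically: the partial sums are $S_n = 2^n - 1$, and $S_n + 1 = 2^n$ is divisible by 2 exactly $n$ times, so $S_n$ gets 2-adically closer and closer to $-1$. A quick sketch:

```python
def two_adic_val(n):
    # exponent of the largest power of 2 dividing n (n > 0)
    v = 0
    while n % 2 == 0:
        n //= 2
        v += 1
    return v

for n in [1, 5, 10, 20]:
    s = 2 ** n - 1   # partial sum 1 + 2 + 4 + ... + 2^(n-1)
    # 2-adic distance |s - (-1)|_2 = 2^(-v), where v = val(s + 1) = n
    print(n, two_adic_val(s + 1))
```

The 2-adic distance $|S_n - (-1)|_2 = 2^{-n}$ shrinks to zero, which is exactly the sense in which the sum converges to $-1$.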
This might sound like a ridiculous argument, and indeed it is false in our conventional algebra system, but it is the foundation of the 2-adic system.<br /><br /><div class = "twn-furtherinsight">Explain similarly why $1+3+9+27+...=-1/2$ in the 3-adic system.</div><br /><hr /><br />The understanding of convergence we gained here -- from the tie example -- was pretty fantastic. It applies to all sorts of infinite sequences -- ordinary recurrences, (such as in the form of) infinite series, continued fractions, etc. The idea of stable and unstable fixed points is a general one, and a very important one. Recommended watching:<br /><br /><div class="separator" style="clear: both; text-align: center;"><iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/CfW845LNObM/0.jpg" frameborder="0" height="266" src="https://www.youtube.com/embed/CfW845LNObM?feature=player_embedded" width="320"></iframe></div>calculusconvergencedivergent sumsinfinite sequencesinfinite seriesmathematicsrenormalizationSun, 22 Jul 2018 16:26:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-7661817356581994768Abhimanyu Pallavi Sudhir2018-07-22T16:26:00ZThe validity of Newton's three laws today
https://thewindingnumber.blogspot.com/2018/07/the-validity-of-newtons-three-laws-today.html
0It's often said that Newtonian physics is outdated and its laws are in fact incorrect. That's true, but it's intriguing to think about in what way exactly Newton's three laws have been replaced or generalised in relativity.<br /><ol><li>There are two ways to think about the first law -- the first is "inertial reference frames exist". This is unchanged in special relativity, but general relativity generalises the notion with that of geodesics. The law as it is typically stated -- "stuff moves in straight lines on spacetime unless forced" is generalised to the geodesic equation, $\frac{{{d^2}{x^\mu }}}{{d{s^2}}} = - {\Gamma ^\mu }_{\alpha \beta }\frac{{d{x^\alpha }}}{{ds}}\frac{{d{x^\beta }}}{{ds}}$.</li><li>$F=dp/dt$ is generalised to $F=dp/d\tau$ in special relativity, and is replaced by a covariant derivative in general relativity. $F=ma$ has some weirder changes.</li><li>The third law is the conservation of momentum. This is replaced in General Relativity by the statement $\nabla^\mu T_{\mu\nu}=0$ ($\nabla$ instead of $\partial$).</li></ol><ol></ol>bloggeneral relativitynewton's lawsnewtonian mechanicsrelativityspecial relativitySun, 22 Jul 2018 05:24:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-7737004677109282224Abhimanyu Pallavi Sudhir2018-07-22T05:24:00ZWhat is calculus-based physics?
https://thewindingnumber.blogspot.com/2018/07/what-is-calculus-based-physics.html
0I dislike this whole “non-calculus physics”/”calculus physics” distinction created in schools, because it degrades mathematics to some kind of a weird tool used in physics.<br /><br />Physics is just the study of the mathematical stuff we do observe — every physical system is a mathematical system on a fundamental level, which for pedagogical purposes and stuff, we often approximate with other mathematical systems (e.g. modelling stuff as rigid bodies, not considering the motion of every single particle within an extended body, neglecting gravity in particle physics, etc.). So <i>of course</i> you will find math being “used” in physics, because physics is mathematics!<br /><br />Physics uses math in the same way that mathematics uses math — like how you “use” differentiability in defining Lie groups, or how you “use” calculus and linear algebra in differential geometry, or how you “use” matrices in describing linear transformations, or whatever. Neither the physics nor the mathematics should be classified or segregated by what mathematical methods, or “math”, is used in describing or defining it.<br /><br />You shouldn’t divide physics as “calculus-based” and “non-calculus” for the same reason you don’t divide it into “partial fractions-based” and “non-partial fractions”, or “elementary-algebra-based” and “non-elementary-algebra-based”, or at a little higher level, “differential geometry-based” and “non-differential geometry-based”.<br /><br />Use whatever tools you have to use! 
The point of physics is to describe what we observe — aka the universe — as efficiently and conveniently as possible, not to do elementary calculus.<br /><br />There are other, more sensible ways to divide physics — experimental, theoretical and phenomenology — “mathematical physics”, which is basically physics done with as much rigor as you find in the mathematics literature, so you ensure everything you know about physics is consistent and stuff (the physics exists), you know what your underlying assumptions/axioms/postulates (that you must verify empirically) are, etc. — you could define it as “symmetry-based physics” and “non-symmetry based physics”, where the good physics is symmetry-based and the bad physics isn’t, but since Einstein, all physics is symmetry-based, so this is irrelevant today.blogcalculuseducationmath as a fieldmath as a toolphysicsphysics educationscience educationSun, 22 Jul 2018 05:11:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-7484346114034678201Abhimanyu Pallavi Sudhir2018-07-22T05:11:00ZNo, Einstein is not overrated
https://thewindingnumber.blogspot.com/2018/07/no-einstein-is-not-overrated.html
0No, this is ridiculous. Einstein is not overrated, and neither is Newton — the general public gets this issue completely right, because Newton and Einstein are remembered not for their direct contributions to physics, but the effect they had on <i>how physics is done</i>.<br /><br />Einstein’s impact in this regard can be compared only to Newton, who turned physics from a field of philosophy into a field of mathematics. There were a few good physicists in Ancient Greece, like Archimedes and Apollonius — and also similar folks in India, China and the Arab civilisation — but it was disorganised, and it got destroyed by the Romans, in the case of Greece. It was due to Newton that the good folks got taken seriously, and the idiots, like Aristotle, got discarded.<br /><br />Einstein had a very similar effect on physics, by forcing people to accept logical positivism. By force I don’t mean taking a gun to people’s heads and forcing them, or taxing people to fund pro-logical positivism posters or whatever, but you can’t do relativity without accepting logical positivism.<br /><br />If I remember correctly, this is done in the very first section of his 1905 paper on special relativity (read it — it’s remarkable, even if you don’t understand physics — you can read it either as a contemporary work or a historical one, which is very rare for any paper, even Newton’s Principia), where he rejects all meaningless babbling about “is it really ____ or do we just <i>see/feel/</i>… ____?” etc. You don’t need to actually read Carnap, because philosophy is a trivial field, and positivism can be learned in three simple sentences: observers do observe. agents should act. everything else is nonsense. But if you need more convincing, just search for “the elimination of metaphysics by the logical analysis of language”, and you’ll get it.<br /><br />Perhaps his most important contribution in this regard, though, was the establishment of symmetry laws as a defining pillar of physics. 
You can divide physics as “symmetry-based” and “non-symmetry-based”, and all physics since Einstein onwards is symmetry-based in some form or another. Until Einstein, symmetry was just a cool heuristic you derived from some physical laws — since Einstein, we accept some symmetries (or generalise them, e.g. in the case of Poincaré invariance in GR) and the theory gains its elegance from this. Emmy Noether is also crucial in this aspect, for Noether’s theorem.<br /><br />Einstein is also remembered because relativity, along with quantum mechanics, put the final nail in the coffin for elementary physical intuition. This is a role similar to the one Bertrand Russell played in the demise of naive intuition and the adoption of rigor in mathematics.<br /><br />The celebration of Einstein by the general public often seems like giggling over random factoids, like E = m, time dilation or there being a supremum possible speed, but it’s really just their subconscious telling them the above.<br /><br />blogeinsteingeneral relativitylogical positivismrelativityspecial relativitySun, 22 Jul 2018 05:10:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-6092579095080943472Abhimanyu Pallavi Sudhir2018-07-22T05:10:00ZWhy are calculus and linear algebra taught early?
https://thewindingnumber.blogspot.com/2018/07/why-are-calculus-and-linear-algebra.html
0Linear algebra and function theory are related — you can construct plenty of accurate analogies here, like functions and vectors, linear transforms and integral transforms, etc. In addition, the elementary techniques of calculus allow you to talk about non-linear transformations in a pretty nice manner — e.g. the Jacobian matrix as a change-of-basis matrix for non-linear co-ordinate transformations.<br /><br />In general, calculus is just a special case and a “constructivist” kind of way of understanding the much deeper mathematical field of analysis. The calculus of variations, basic complex analysis, matrix calculus, etc. are other examples of this. It’s taught, despite its non-fundamental nature, not only because it locally linearises things with infinitesimals, allowing us to study non-linear things, e.g. in differential geometry, but also because a lot of its results are special cases of purer results in advanced mathematics. Some elementary examples: the chain rule, a special case of a change-in-basis-variables/the Jacobian matrix; the fundamental theorem of calculus and Stokes’ theorem, special cases of the generalised Stokes’ theorem in differential geometry.<br /><br />Linear algebra is taught for similar reasons — it introduces you to a lot of things in algebra, much like how calculus introduces you to a lot of things in analysis. Together, they also introduce you to a lot of things in geometry — largely because the “ideas” behind the two allow us to describe a lot of things in a linear way — completing the algebra-analysis-geometry trinity.<br /><div><br /></div>blogcalculuseducationlinear algebramathematicsmathematics educationscience educationSun, 22 Jul 2018 05:09:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-1551319580487629464Abhimanyu Pallavi Sudhir2018-07-22T05:09:00ZComment by Abhimanyu Pallavi Sudhir on Ubuntu 17.04 Chromium Browser quietly provides full access to Google account
https://askubuntu.com/questions/915556/ubuntu-17-04-chromium-browser-quietly-provides-full-access-to-google-account
Me too. This is weird. Even if it's just the Chrome browser, I don't see why they'd need <i>full</i> access to my Google account. Windows doesn't do this.Sat, 14 Jul 2018 17:11:46 GMThttps://askubuntu.com/questions/915556/ubuntu-17-04-chromium-browser-quietly-provides-full-access-to-google-account?cid=1726608Abhimanyu Pallavi Sudhir2018-07-14T17:11:46ZComment by Abhimanyu Pallavi Sudhir on How to create folder shortcut in Ubuntu 14.04?
https://askubuntu.com/questions/486461/how-to-create-folder-shortcut-in-ubuntu-14-04/691976#691976
@jave.web Yes -- use the application menu (either at the top left of your screen or a colourful icon next to the window controls) to go to your Nautilus preferences, then under "Behavior" enable link creation.Fri, 13 Jul 2018 11:49:02 GMThttps://askubuntu.com/questions/486461/how-to-create-folder-shortcut-in-ubuntu-14-04/691976?cid=1724793#691976Abhimanyu Pallavi Sudhir2018-07-13T11:49:02ZSolving the infinite radical $\sqrt{6+\sqrt{6+2\sqrt{6+3\sqrt{6+...}}}}$
https://math.stackexchange.com/questions/2839527/solving-the-infinite-radical-sqrt6-sqrt62-sqrt63-sqrt6
20<p><span class="math-container">$$\sqrt{6+\sqrt{6+2\sqrt{6+3\sqrt{6+\cdots}}}}$$</span></p>
<p>This is a modification on the well-known Ramanujan infinite radical, <span class="math-container">$\sqrt{1+\sqrt{1+2\sqrt{1+3\sqrt{1+\cdots}}}}$</span>, except it cannot be solved by the conventional method -- the functional equation <span class="math-container">$F(x)^2=ax+(n+a)^2+xF(x+n)$</span>, since setting <span class="math-container">$n=1$</span> with <span class="math-container">$a=0$</span> requires having <span class="math-container">$(n+a)^2=1$</span>, not <span class="math-container">$6$</span>.</p>
<p>Here are some alternative methods I've tried:</p>
<ul>
<li>The functional equation we have instead for this infinite radical is <span class="math-container">$F(x)^2=6+xF(x+1)$</span>. I've tried to solve this, but unfortunately it's easy to demonstrate that <span class="math-container">$F(x)$</span> cannot be a simple linear function <span class="math-container">$F(x)=ax+b$</span>. I've tried some slightly more complicated versions -- the equation for a hyperbola, etc. -- but nothing seems to work.</li>
<li>I've tried factoring stuff out from the radical to bring it to a more tenable form. In a perhaps not satisfactorily rigorous approach, I thought of factoring out <span class="math-container">$\sqrt{6^{N/2}}$</span> where <span class="math-container">$N\to\infty$</span>, which allows us to transform the radical into <span class="math-container">$6^{-N/2}\sqrt{6^{N+1}+\sqrt{6^{2N+1}+2\sqrt{6^{4N+1}+\cdots}}}$</span>, which in the limit can be treated as having each term a power of <span class="math-container">$6^{N/2}$</span>. For a radical of the form <span class="math-container">$\sqrt{\alpha^2+\sqrt{\alpha^4+2\sqrt{\alpha^8+\cdots}}}$</span> we have the functional equation <span class="math-container">$F(x)^2=\alpha^{2^x}+xF(x+1)$</span>, or upon letting <span class="math-container">$F(x)=\alpha^{2^x}p(x)$</span>, you get <span class="math-container">$p(x)^2-xp(x+1)=\alpha^{-2^x}$</span>, but I'm stuck there.</li>
<li>Similarly, I tried factoring out some arbitrary <span class="math-container">$N$</span> then factoring out a term from each radical inside such that the coefficients go from being <span class="math-container">$1,2,3,\cdots$</span> to a constant <span class="math-container">$1/N,1/N,1/N...$</span>, transforming the radical into <span class="math-container">$N\sqrt{\frac6{N^2}+\frac1N\sqrt{\frac6{N^2}+\frac1N\sqrt{\frac{24}{N^2}+\frac1N\sqrt{\frac{864}{N^2}+\frac1N\sqrt{\frac{1990656}{N^2}+\cdots}}}}}$</span> where the added terms go as <span class="math-container">$k_1=6$</span>, <span class="math-container">$k_{n+1}=\frac{n^2}6k_n^2$</span>. But how might one proceed?</li>
<li>I considered differentiating the function <span class="math-container">$G(x)=\sqrt{x+\sqrt{x+2\sqrt{x+3\sqrt{x+\cdots}}}}$</span>. But all I got was an equally weird differential equation:</li>
</ul>
<p><span class="math-container">$$\frac{df}{dx}=\frac{1+\frac{1+\frac{1+\frac{{\mathinner{\mkern2mu\raise1pt\hbox{.}\mkern2mu \raise4pt\hbox{.}\mkern2mu\raise7pt\hbox{.}\mkern1mu}}}{\frac23\frac{\left(\frac{\left(f(x)^2-x\right)^2-x}{2}\right)^2-x}{3}}}{\frac22\frac{\left(f(x)^2-x\right)^2-x}{2}}}{\frac21\left(f(x)^2-x\right)}}{2f(x)}$$</span></p>
<p>Any ideas as to how I might proceed?/Any alternative (hopefully less tedious, but regardless) methods that might work?</p>
<hr>
<p>I created a <a href="https://www.khanacademy.org/computer-programming/run-tests/5147953190961152" rel="nofollow noreferrer">small program</a> to play with this. The exact answer (perhaps as an infinite series) <em>may</em> contain <span class="math-container">$\sqrt{6}+1/2+...$</span> somewhere in it, because as you increase the number <span class="math-container">$x$</span> replacing the <span class="math-container">$6$</span>, the radical approaches <span class="math-container">$\sqrt{x}+1/2$</span>. Of course, this term just comes from the binomial series for <span class="math-container">$\sqrt{6+\sqrt{6}}$</span>.</p>
<p>I also got nothing on the inverse symbolic calculator.</p>
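In case the linked program is unavailable: here's a minimal Python sketch (my own, not the Khan Academy code) that evaluates the radical by truncating at a finite depth and unwinding from the inside out:

```python
import math

def radical(a, depth):
    """Truncated sqrt(a + 1*sqrt(a + 2*sqrt(a + 3*sqrt(a + ...)))),
    built from the innermost level outward."""
    v = 0.0
    for k in range(depth, 0, -1):  # coefficients depth, depth-1, ..., 2, 1
        v = math.sqrt(a + k * v)
    return v

# sanity check: for a = 1 the functional equation F(x)^2 = 1 + x*F(x+1)
# is solved by F(x) = x + 1, so that radical equals F(1) = 2
print(radical(1, 80))  # ≈ 2.0
print(radical(6, 80))  # the value in question
```

The truncation error roughly halves with each extra level, so depth 80 is already far beyond double precision.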
<hr>
<p>Here's another possible approach: one may consider the sequence of polynomials:</p>
<p><span class="math-container">$$P_1:x^2-6=x$$</span>
<span class="math-container">$$P_2:\left(\frac{x^2-6}2\right)^2-6=x$$</span>
<span class="math-container">$$P_3:\left(\frac{\left(\frac{x^2-6}2\right)^2-6}3\right)^2-6=x$$</span></p>
<p>These polynomials are formed by taking successive truncations of the infinite radical. As <span class="math-container">$n\to\infty$</span>, the root of <span class="math-container">$P_n$</span> tends to the root of some limiting function, whose power series expansion can perhaps be calculated in this form. But what is that power series expansion?</p>
<p>Note that the polynomial gets very complicated very quickly. E.g. here's <span class="math-container">$P_5$</span>:</p>
<p><span class="math-container">$$\frac{x^{32}}{2751882854400}-\frac{x^{30}}{28665446400}+\frac{43x^{28}}{28665446400}-\frac{91x^{26}}{2388787200}+\frac{121x^{24}}{191102976}-\frac{53x^{22}}{7372800}+\frac{11167x^{20}}{199065600}-\frac{4817x^{18}}{16588800}+\frac{57659x^{16}}{66355200}-\frac{x^{14}}{1382400}-\frac{9491x^{12}}{1382400}+\frac{367x^{10}}{12800}-\frac{2443x^8}{46080}+\frac{179x^6}{9600}+\frac{2233x^4}{9600}-\frac{71x^2}{160}-x-\frac{33359}{6400}=0$$</span></p>
<p>See <a href="https://math.stackexchange.com/questions/3051551/what-is-the-region-of-convergence-of-x-n-left-fracx-n-1n-right2-a-w">What is the region of convergence of <span class="math-container">$x_n=\left(\frac{x_{n-1}}{n}\right)^2-a$</span>, where <span class="math-container">$a$</span> is a constant?</a></p>functional-equationsnested-radicalsTue, 03 Jul 2018 11:34:34 GMThttps://math.stackexchange.com/q/2839527Abhimanyu Pallavi Sudhir2018-07-03T11:34:34ZComment by Abhimanyu Pallavi Sudhir on How to customize (add/remove folders/directories) the "Places" menu of Ubuntu 13.04 "Files" application?
https://askubuntu.com/questions/285313/how-to-customize-add-remove-folders-directories-the-places-menu-of-ubuntu-13/292727#292727
This works. If you also want to remove the folders from the home directory, edit user-dirs.defaults, or make a copy of it in .config and edit there (for your local user).Sat, 30 Jun 2018 09:00:13 GMThttps://askubuntu.com/questions/285313/how-to-customize-add-remove-folders-directories-the-places-menu-of-ubuntu-13/292727?cid=1716388#292727Abhimanyu Pallavi Sudhir2018-06-30T09:00:13ZComment by Abhimanyu Pallavi Sudhir on How to safely remove default folders?
https://askubuntu.com/questions/140148/how-to-safely-remove-default-folders/140964#140964
See <a href="https://askubuntu.com/questions/285313/how-to-customize-add-remove-folders-directories-the-places-menu-of-ubuntu-13">here</a> for a working solution. If you also want to remove the folders from the home directory, edit user-dirs.defaults, or make a copy of it in .config and edit there (for your local user).Mon, 25 Jun 2018 06:08:26 GMThttps://askubuntu.com/questions/140148/how-to-safely-remove-default-folders/140964?cid=1713336#140964Abhimanyu Pallavi Sudhir2018-06-25T06:08:26ZComment by Abhimanyu Pallavi Sudhir on How to safely remove default folders?
https://askubuntu.com/questions/140148/how-to-safely-remove-default-folders/140964#140964
Doesn't work -- even if you don't run the update command, it gets updated upon the next reboot. There must be a more fundamental file in which these directory names are kept.Mon, 25 Jun 2018 05:32:16 GMThttps://askubuntu.com/questions/140148/how-to-safely-remove-default-folders/140964?cid=1713326#140964Abhimanyu Pallavi Sudhir2018-06-25T05:32:16ZAnswer by Abhimanyu Pallavi Sudhir for Intuition behind Chebyshev's inequality
https://math.stackexchange.com/questions/1344734/intuition-behind-chebyshevs-inequality/2814492#2814492
0<p>First of all, putting $\mu$'s and $\sigma$'s all over is ridiculous -- like natural units in physics, let's set $\mu=0$ and $\sigma=1$. Then Chebyshev's inequality states that:</p>
<p>$$P(|X|>k)\leq1/k^2$$</p>
<p>I.e. for a distribution with unit standard deviation, there is a natural limit on how much of the distribution can lie more than some given amount $k>1$ away from the mean. This isn't particularly surprising -- as you add stuff to the distribution outside the standard-deviation range, you inevitably increase the standard deviation. In order to keep the standard deviation at 1, you need to squeeze the things inside the range and reduce <em>their</em> deviation, so the overall deviation stays at 1.</p>
<p>But there's got to be a limit on how much you can reduce the total deviation, right? You can't make the contribution of the things inside to the deviation <em>negative</em> -- it can only go down to zero. So what exactly is this limiting case?</p>
<p>Well, at the limiting case, you have a Dirac delta at $X=0$ and two Dirac deltas at $X=k$ and $X=-k$. What's the maximum height of these two Dirac deltas? Answer this, and you have Chebyshev's inequality.</p>
<p><a href="https://i.stack.imgur.com/xBf2P.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/xBf2P.png" alt="enter image description here"></a></p>
<p>The standard deviation of this distribution is --</p>
<p>$$\sqrt{\frac{p}2(-k)^2+\frac{p}2k^2}=k\sqrt{p}$$</p>
<p>We set this equal to 1, and we get $p=1/k^2$ as the maximum amount of stuff you can pile at $k$ and beyond even if you use the distribution with the least possible standard deviation. </p>
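A quick numeric sanity check of this limiting distribution (my own sketch, not part of the original argument) -- mass $p/2$ at $\pm k$, mass $1-p$ at $0$, with $p=1/k^2$ forced by unit variance:

```python
def extremal_tail_mass(k):
    """Mass p/2 at -k and +k, mass 1-p at 0; unit standard deviation
    forces k*sqrt(p) = 1, i.e. p = 1/k**2 -- the Chebyshev bound."""
    p = 1.0 / k**2
    xs = [-k, 0.0, k]
    ws = [p / 2, 1 - p, p / 2]
    mean = sum(w * x for w, x in zip(ws, xs))
    var = sum(w * x**2 for w, x in zip(ws, xs)) - mean**2
    return p, mean, var

p, mean, var = extremal_tail_mass(2.0)
print(p, mean, var)  # 0.25 0.0 1.0
```

So for $k=2$, at most a quarter of the distribution can sit at distance $2$ or more -- exactly $1/k^2$.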
<p>(A full proof would consider the possibility of an asymmetric distribution with a $p$ stick at $X=k$ and a $1-p$ stick at $X=-\frac{p}{1-p}k$, but it turns out the limit on $p$ is even stricter, at $p\leq\frac1{k^2+1}$.)</p>Sun, 10 Jun 2018 11:50:04 GMThttps://math.stackexchange.com/questions/1344734/-/2814492#2814492Abhimanyu Pallavi Sudhir2018-06-10T11:50:04ZAnswer by Abhimanyu Pallavi Sudhir for units in math, cross product
https://math.stackexchange.com/questions/1449396/units-in-math-cross-product/2813670#2813670
0<p>Yes -- using a vector to write the cross product is just a 3D convention to remind you it has 3 independent components. It's more naturally represented as a bivector or as a rank-2 tensor, but there's a duality between these and vectors in 3D.</p>
<p><strong>Scale invariance/tensors in physics</strong></p>
<p>The formal way of expressing your concern -- that the cross product has different "units" from other vectors in your space -- is that the cross product <em>doesn't behave like a vector under scaling</em>. Under scaling, a vector is supposed to scale as <span class="math-container">$v\to\lambda v$</span>, because lengths scale that way, and areas scale as <span class="math-container">$t\to\lambda^2t$</span>. The cross product scales as an area, so it makes a sucky vector.</p>
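To make the scaling claim concrete, a small NumPy check (an illustrative sketch, not part of the original answer):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([-2.0, 0.5, 4.0])
lam = 3.0

# a genuine vector scales linearly under x -> lam*x,
# but the cross product picks up lam**2 -- it scales like an area
assert np.allclose(np.cross(lam * a, lam * b), lam**2 * np.cross(a, b))
```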
<p>This is sort of related to the definition of tensors in physics (specifically in relativity) -- a tensor is an object that transforms as a tensor under specific transformations, specifically Lorentz transformations (skews in the t-x/t-y/t-z planes and rotations in the other three, x-y/y-z/z-x). So scalars are invariant under Lorentz transformations, vectors transform as <span class="math-container">$\Lambda_\mu^{\bar\mu}v^\mu$</span>, rank-2 tensors transform as <span class="math-container">$\Lambda_\mu^{\bar\mu}\Lambda_\nu^{\bar\nu}t^{\mu\nu}$</span>, etc. So it's not enough to have the right number of components -- because this can be gamed with symmetries, like it is for the cross product (which has nine components, but only three independent ones) -- you need to transform in the right way.</p>
<p>It's a bit more complicated with scaling, because different scalars transform differently (lengths scale like vectors, areas scale like rank-2 tensors), but the idea is the same -- the cross product is not a vector in the physics sense, although it is in the math sense. But nobody in math cares about the cross product anyway.</p>Sat, 09 Jun 2018 15:36:37 GMThttps://math.stackexchange.com/questions/1449396/-/2813670#2813670Abhimanyu Pallavi Sudhir2018-06-09T15:36:37ZAnswer by Abhimanyu Pallavi Sudhir for What makes the Cauchy principal value the "correct" value for a integral?
https://math.stackexchange.com/questions/2450848/what-makes-the-cauchy-principal-value-the-correct-value-for-a-integral/2806611#2806611
1<p>It isn't the "correct value" for the integral any more than the principal root is the correct value for a root or the principal logarithm is the correct value for a logarithm, or setting $C=0$ gives you the correct value of the antiderivative. There are plenty of other values the integral can take, depending on how you take the limit. See my answer to <a href="https://math.stackexchange.com/a/2805722/78451">Why can't $\int_{-1}^1\frac{dx}x$ be evaluated?</a></p>Sun, 03 Jun 2018 14:09:34 GMThttps://math.stackexchange.com/questions/2450848/-/2806611#2806611Abhimanyu Pallavi Sudhir2018-06-03T14:09:34ZAnswer by Abhimanyu Pallavi Sudhir for Why can't $\int_{-1}^1{\frac{dx}{x}}$ be evaluated?
https://math.stackexchange.com/questions/1511181/why-cant-int-11-fracdxx-be-evaluated/2805722#2805722
1<p>I don't know about you, but when I was first introduced to the antiderivative of $1/x$, I was pretty confused. It made sense that the answer was $\ln(x)+C$, but changing this to $\ln|x|+C$ seems to make no sense. It's justified as being just the addition of a constant, $i\pi$, and indeed you can verify the derivative is the same (still $1/x$), but it seems instead that you're only adding $i\pi$ to the function for $x<0$, and nothing when $x>0$. In other words, you're not actually adding a constant at all, but rather the function $i\pi\left(1-H(x)\right)$ (where $H(x)$ is a unit step at 0).</p>
<p>Indeed, this kind of thing wouldn't be OK if we were dealing with ordinary continuous functions -- if you added a constant to one point of the function, that would affect the values of every other point in the function (one way of demonstrating this is the Taylor series). But since $\ln(x)$ has a singularity at $x=0$, the derivative is not defined at that point <em>anyway</em>, so one side of the function can be independent of the other.</p>
<hr>
<p>Why is this important? Well, consider evaluating the integral</p>
<p>$$\int_{-1}^1\frac{dx}x$$</p>
<p>Now, the presence of the singularity would not <em>directly</em> make it a bad idea to use the fundamental theorem of calculus to evaluate this integral (look <a href="https://math.stackexchange.com/questions/2193723/how-to-fix-int-11-frac-dxx2-with-complex-numbers/2804637#2804637">here</a> to understand where it would directly be a bad idea) -- or at least it wouldn't necessarily be, if you choose $\ln|x|$ as the antiderivative, since then the antiderivative would come back from infinity in the same direction, so the same overall path is traversed.</p>
<p><a href="https://i.stack.imgur.com/LtUEi.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/LtUEi.png" alt="enter image description here"></a>
<em>If you don't understand this argument, go look at the link above!</em></p>
<p>But because there are multiple consistent antiderivatives, we <em>still</em> can't apply the fundamental theorem of calculus. For instance,</p>
<p>$$\begin{array}{l}\ln \left| 1 \right| - \ln \left| { - 1} \right| = 0\\\ln \left( 1 \right) - \ln \left( { - 1} \right) = - i\pi \end{array}$$</p>
<p>So instead you define a <em>principal value</em>, skirting around the singularity by taking</p>
<p>$$\int_{-1}^{-\epsilon}\frac{dx}x+\int_{\epsilon}^{1}\frac{dx}x=\ln(\epsilon)-\ln(\epsilon)=0$$</p>
<p>This is okay, because regardless of whether you use $\ln(x)$ or $\ln|x|$ as the antiderivative, the arbitrary constants cancel out on each side of the singularity. I.e. if you have a $+i\pi$ term to the left of the singularity, this exists for both $-1$ and $-\epsilon$, and thus cancels out as in any ordinary integral.</p>
<p>In this sense, $\ln\left|x\right|$ can be considered the "principal antiderivative" of $1/x$. But there's nothing particularly special about this value. One may take the limit a little differently, so you don't approach 0 at the same rate from both sides, for instance --</p>
<p>$$\int_{ - 1}^{ -\epsilon } {\frac{{dx}}{x}} + \int_{n\epsilon}^1 {\frac{{dx}}{x}} = \ln \left( \epsilon \right) - \ln \left( {n\epsilon} \right) = \ln \left( {\frac{1}{n}} \right)
$$</p>
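This lopsided limit is easy to check numerically (my own sketch, with a naive midpoint rule; note the answer is exactly $\ln(1/n)$ for every $\epsilon>0$, since the $\ln\epsilon$ terms cancel):

```python
import math

def int_recip(a, b, steps=200000):
    """Midpoint rule for the integral of 1/x over [a, b], both on one side of 0."""
    h = (b - a) / steps
    return sum(1.0 / (a + (i + 0.5) * h) for i in range(steps)) * h

def lopsided_pv(n, eps=1e-4):
    # integral from -1 to -eps of dx/x, plus integral from n*eps to 1 of dx/x
    return int_recip(-1.0, -eps) + int_recip(n * eps, 1.0)

print(lopsided_pv(1.0))  # ≈ 0, the usual principal value
print(lopsided_pv(2.0))  # ≈ ln(1/2) ≈ -0.693
```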
<p>This is equivalent to cancelling out your areas in a different "order" on the graph, which allows you to leave some remainder area.</p>Sat, 02 Jun 2018 17:59:02 GMThttps://math.stackexchange.com/questions/1511181/why-cant-int-11-fracdxx-be-evaluated/2805722#2805722Abhimanyu Pallavi Sudhir2018-06-02T17:59:02ZAnswer by Abhimanyu Pallavi Sudhir for How to "fix" $\int_{-1}^1 \frac {dx}{x^2}$ with complex numbers?
https://math.stackexchange.com/questions/2193723/how-to-fix-int-11-frac-dxx2-with-complex-numbers/2804637#2804637
0<p>When you first looked at an integral like $\int_{-1}^{1}dx/x^2$, your instinct was to apply the fundamental theorem of calculus, and evaluate $[-1/x]|_{-1}^1$. This answer was clearly <em>wrong</em>, but why? Why does having a singularity in between screw up the fundamental theorem of calculus?</p>
<p>Well, you might have the intuition for the fundamental theorem of calculus as having to do with, e.g. a disk expanding outwards, and the derivative of the area being the circumference -- so with some $dr$ added to the radius, $2\pi r\cdot dr$ is added to the area. And with a lot of $dr$'s getting added, the total addition to the circumference -- the integral of $2\pi r$ across the total length of $dr$'s that got added -- is the difference in area over the expansion. </p>
<p><a href="https://i.stack.imgur.com/YPTu4.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/YPTu4.png" alt="enter image description here"></a></p>
<p>But you might imagine a set-up where there's a singularity somewhere in the expansion, so the area suddenly blows up to infinity somewhere during the expansion, then starts back from 0 <sup><a href="https://i.stack.imgur.com/YPTu4.png" rel="nofollow noreferrer">1</a></sup>. Something's not quite right here. Our intuition just broke -- the actual amount of area that got added isn't the same as the total area change here.</p>
<p>Like in our exercise above, let's look at the antiderivative of $1/x^2$ between -1 and 1. And for comparison, we'll keep another function -- a normal, continuous function -- and its antiderivative.</p>
<p><a href="https://i.stack.imgur.com/LuZkI.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/LuZkI.png" alt="enter image description here"></a></p>
<p>And that's the fundamental problem here -- the integral is taking a different path (and if you wrapped the xy-plane around a sphere, you'd actually be able to visualise this path as going all the way to infinity then coming back from behind) from the $F(b)-F(a)$ calculation. $F(b)-F(a)$ is -2, but the path taken by the integral is in fact, $\infty-2$.</p>
<p>On the real line, there's no other path you can take (besides going to infinity and coming back) on which the integral is valid. But if you could just budge the path a bit "out" of the xy-plane, it would be valid, because you wouldn't be going to infinity -- just a really high number. An example way of doing this is to use complex numbers, since the added dimension allows you to draw the path between -1 and 1 a little out of the plane, and take the limit as this "little out" approaches 0. As an example, you can look at the integral:</p>
<p>$$\int_{-1}^1 \frac1{x^2+\epsilon}dx$$</p>
<p>(Explain why this integral may be interpreted as using the complex plane.)</p>
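Numerically (my own sketch): this regularised integral has the closed form $\frac{2}{\sqrt{\epsilon}}\arctan\frac{1}{\sqrt{\epsilon}}$, which blows up like $\pi/\sqrt{\epsilon}$ as $\epsilon\to 0$ -- consistent with the "actual" path passing through infinity:

```python
import math

def regularised(eps, steps=100000):
    """Midpoint rule for the integral of 1/(x^2 + eps) over [-1, 1]."""
    h = 2.0 / steps
    return sum(h / ((-1.0 + (i + 0.5) * h) ** 2 + eps) for i in range(steps))

for eps in (1e-1, 1e-2, 1e-3):
    closed_form = 2.0 / math.sqrt(eps) * math.atan(1.0 / math.sqrt(eps))
    print(eps, regularised(eps), closed_form)  # the two columns agree
```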
<p>In fact, it is quite natural to use the complex numbers as a way to "poke out" of the real line, since "integrals are done along curves, not between limits" is a central insight from complex calculus.</p>
<p><sup><a href="https://i.stack.imgur.com/YPTu4.png" rel="nofollow noreferrer">1</a></sup> obviously, to be clear on this, we'll need to introduce some concept of time/a parameterisation <em>t</em>, and differentiate with respect to it instead, and claim that there is a singularity in $r(t)$.</p>Fri, 01 Jun 2018 18:12:14 GMThttps://math.stackexchange.com/questions/2193723/how-to-fix-int-11-frac-dxx2-with-complex-numbers/2804637#2804637Abhimanyu Pallavi Sudhir2018-06-01T18:12:14ZComment by Abhimanyu Pallavi Sudhir on Explaining the Main Ideas of Proof before Giving Details
https://mathoverflow.net/questions/301085/explaining-the-main-ideas-of-proof-before-giving-details
Because good proofs are just a formalisation of the intuitive understanding -- rather than wasting space explaining the insights, you can just give them the proof, and an even somewhat experienced reader can re-create the details.Sun, 27 May 2018 04:28:36 GMThttps://mathoverflow.net/questions/301085/explaining-the-main-ideas-of-proof-before-giving-details?cid=750004Abhimanyu Pallavi Sudhir2018-05-27T04:28:36ZAnswer by Abhimanyu Pallavi Sudhir for Intuition behind speciality of symmetric matrices
https://math.stackexchange.com/questions/1788911/intuition-behind-speciality-of-symmetric-matrices/2780461#2780461
3<p>When you were first learning about null spaces in linear algebra, your guess for the null space -- assuming you had some reasonable geometric intuition into the field -- was that the null space was orthogonal to the column space. After all, that makes sense. If your singular transformation collapses/projects <span class="math-container">$\mathbb{R}^2$</span> into a line, then the vectors that get mapped to the origin are the ones perpendicular to the column space.</p>
<p><a href="https://i.stack.imgur.com/2tgry.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/2tgry.png" alt="Is the column space perpendicular to the row space?"></a></p>
<p>Or at least, so it seems -- in reality, though, the projection doesn't need to be so nice and orthogonal. You could, for instance, <em>rotate</em> all vectors in the space by some angle and then collapse it onto a line.</p>
<p>It turns out the null space isn't perpendicular to the column space, but in fact to the <em>row space</em> instead -- these two spaces are only identical for matrices which do not perform a rotation.</p>
<p>This is a very important observation, because it tells you something about the character of matrices -- <strong>asymmetry in a matrix is a measure of how rotation-ish it is</strong>. Specifically, an antisymmetric matrix is the result of 90-degree rotations (like imaginary numbers) and a symmetric matrix is the result of scaling and skews (like real numbers). </p>
<p><span class="math-container">$$A = \underbrace {\frac{1}{2}(A + {A^T})}_{\scriptstyle{\rm{symmetric }}\atop\scriptstyle{\rm{part}}} + \underbrace {\frac{1}{2}(A - {A^T})}_{\scriptstyle{\rm{antisymmetric }}\atop\scriptstyle{\rm{part}}}$$</span></p>
<p>All matrices can be written as the sum of these two kinds -- a symmetric part and an anti-symmetric part -- much like all complex numbers can be written as the sum of a real part and an imaginary part. And this is fundamentally why symmetric matrices are "special" -- for the same reason that real numbers are special.</p>
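The decomposition is a one-liner to verify numerically (an illustrative sketch; the matrix $A$ is arbitrary):

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [3.0, -1.0, 4.0],
              [0.5, 2.0, 1.0]])

sym = (A + A.T) / 2      # the "real part": scalings and skews
antisym = (A - A.T) / 2  # the "imaginary part": 90-degree rotations

assert np.allclose(sym + antisym, A)
assert np.allclose(sym, sym.T) and np.allclose(antisym, -antisym.T)

# the antisymmetric part moves every vector orthogonally to itself,
# just as multiplying by i turns a complex number by 90 degrees
v = np.array([1.0, -2.0, 0.5])
print(v @ (antisym @ v))  # 0.0
```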
<hr>
<p>Notes:</p>
<p>(1) Scaling and skews are actually essentially the same thing, which is why it makes sense to include skews in the group of things that are "essentially real numbers", even though you can't really represent skews with any complex number -- real or otherwise. Skews are just scaling across a different set of axes, called "eigenvectors" (this is also why symmetric matrices have eigenvectors).</p>
<p>(2) My explanation of the analogy (between matrices and complex numbers) is oversimplified -- antisymmetric matrices actually represent <strong>90 degree rotations</strong> only, and these rotations can actually be spirals, which means they do scaling too. But the analogy still holds, because this applies to imaginary numbers too (e.g. the complex number <span class="math-container">$8i$</span> is a rotation by 90 degrees followed by a scaling by 8). </p>
<p>(3) A more accurate way to phrase the analogy is "the <strong>antisymmetric part</strong> of the matrix operates in a sub-space orthogonal to the vector being transformed while the <strong>symmetric part</strong> operates in the direction of the vector itself, so their sum spans all possible vectors of the target space". In other words, the analogy is to the <strong>Cartesian form</strong> of complex numbers -- you get to represent transformations as linear combinations of the vector itself and vectors orthogonal to it.</p>
<p>(4) It is possible to deal with at least some matrices in a way that corresponds to the <strong>polar forms</strong> of complex numbers -- this is done by representing matrices as products of <strong>symmetric matrices and orthogonal matrices</strong>, much like <span class="math-container">$re^{i\theta}$</span> represents complex numbers as products of real numbers and unit complex numbers.</p>Mon, 14 May 2018 08:42:41 GMThttps://math.stackexchange.com/questions/1788911/-/2780461#2780461Abhimanyu Pallavi Sudhir2018-05-14T08:42:41ZMinkowski everything -- spacetime vectors, rapidity
https://thewindingnumber.blogspot.com/2018/05/minkowski-everything-four-vectors-rapidity.html
0<b>Four-vectors and energy-momentum analogies</b><br /><b><br /></b>Let's look once more at the equation<br /><br />$$E=\frac{m}{\sqrt{1-v^2}}$$<br />This looks an awful lot like the equation for time dilation. $E$ is the mass as measured by someone who sees the object moving at $v$ whereas $m$ is the mass as measured by someone who sees the object at rest, e.g. by the object itself.<br /><br />Similarly, we have the equation $p=vE$, which looks an awful lot like the equation $x=vt$. It therefore makes sense to wonder how far this analogy goes. We could start with analysing the invariant.<br /><br />Even if I measure the mass of a 1kg rock as 10kg because of my reference frame, I know that if I brought the rock to rest, I would measure it as 1kg. Much like I can tell people's biological age or look at their clocks to determine their proper time, I can look at the moving thing's mass balance and determine its proper mass $m$.<br /><br />If we just wanted $m$ in terms of the "co-ordinates" $E$ and $p$,<br /><br />$$m = E\sqrt {1 - {v^2}} = \sqrt {{E^2} - {v^2}{E^2}} = \sqrt {{E^2} - {p^2}}$$<br />$${m^2} = {E^2} - {p^2}$$<br />Or in 4 dimensions,<br /><br />$${m^2} = {E^2} - p_x^2 - p_y^2 - p_z^2$$<br />We call $m$ the "proper mass". In general, "proper" means "as measured in the rest frame" -- proper time, proper length, proper mass, whatever. This equation is also useful because unlike the previous thing, this also works when $v=1$ (i.e. for light), where it reduces to $E=p$ (i.e. $E=pc$ in conventional units).<br /><br />But this looks an awful lot like the spacetime interval.<br /><br />That's not all. Consider an object with mass $E$, momentum $p$ and velocity $w=p/E$ in our reference frame $O$. Now boost to a reference frame $O'$ with relative velocity $v$ to $O$. Then the velocity of the object has transformed from $w$ to $\frac{{w - v}}{{1 - wv}}$. 
So<br /><br />$$\begin{array}{c}E' = \frac{m}{{\sqrt {1 - {{\left( {\frac{{w - v}}{{1 - wv}}} \right)}^2}} }}\\ = \frac{{m(1 - vw)}}{{\sqrt {{{(1 - wv)}^2} - {{(w - v)}^2}} }}\\ = \frac{{m(1 - vw)}}{{\sqrt {(1 - {w^2})(1 - {v^2})} }}\\ = \gamma (v)\left( {1 - vw} \right)\gamma (w)m\\ = \gamma \left( {1 - vw} \right)E\\ = \gamma (E - vwE)\\E' = \gamma (E - vp)\end{array}$$<br />And<br /><br />$$\begin{array}{c}p' = \frac{{m\left( {\frac{{w - v}}{{1 - wv}}} \right)}}{{\sqrt {1 - {{\left( {\frac{{w - v}}{{1 - wv}}} \right)}^2}} }}\\ = \left( {\frac{{w - v}}{{1 - wv}}} \right)E'\\ = \left( {\frac{{w - v}}{{1 - wv}}} \right)\gamma \left( {1 - wv} \right)E\\ = \gamma (wE - vE)\\p' = \gamma (p - vE)\end{array}$$<br />Or alternatively<br /><br />$$\left[ \begin{array}{l}{E'}\\{p'}\end{array} \right] = \gamma \left[ {\begin{array}{*{20}{c}}1&{ - v}\\{ - v}&1\end{array}} \right]\left[ \begin{array}{l}E\\p\end{array} \right]$$<br />In 4 dimensions,<br /><br />$$\left[ \begin{array}{l}{E'}\\{{p'}_x}\\{{p'}_y}\\{{p'}_z}\end{array} \right] = \left[ {\begin{array}{*{20}{c}}\gamma &{ - \gamma v}&{}&{}\\{ - \gamma v}&\gamma &{}&{}\\{}&{}&1&{}\\{}&{}&{}&1\end{array}} \right]\left[ \begin{array}{l}E\\{p_x}\\{p_y}\\{p_z}\end{array} \right]$$<br />Which is precisely the transformation for time and position.<br /><br />We call vectors that transform like this <b>spacetime vectors</b> or <b>four-vectors</b>. Four-vectors all share the same algebraic properties -- they transform in the same way, they follow vector addition, their norms and in general their dot products are invariant, etc. -- but not necessarily other properties. E.g. 
energy and momentum have conservation laws, but position and time do not.<br /><br />The norm of a spacetime vector is taken as:<br /><br />$${\left| {\left[ {\begin{array}{*{20}{c}}{{q_0}}\\{{q_1}}\\{{q_2}}\\{{q_3}}\end{array}} \right]} \right|^2} = q_0^2 - q_1^2 - q_2^2 - q_3^2$$<br />Which is distinct from the Euclidean norm, once again telling us that the geometry of spacetime is not Euclidean.<br /><br />Four-vectors are perhaps the most beautiful example of the symmetry between space and time. They essentially allow you to replace ordinary pre-relativistic vectors like momentum with vectors that also have a time component alongside three spatial components, because the world is 4-dimensional. You just need to find a quantity that behaves with the vector like time behaves with position -- i.e. you need to show the two quantities transform between each other in a Lorentz transformation sort of way.<br /><br />You end up with truly mind-boggling results -- we already saw that mass is the time-component of momentum, which explains why mass produces inertia -- an object with mass already devotes a lot of its momentum to moving forward in time, so the more the mass, the more of this momentum you need to transform into the spatial direction. This is really what is meant by the transformation law $p'=\gamma(p-vE)$ for mass $E$, generalising the Galilean $p'=p-vE$ (change $E$ to $M$ if that makes you happy). 
It also explains why massless (meaning zero rest mass) things can move at the speed of light.<br /><br />Other such four-vectors include:<br /><ul><li>Four-force (time-component: $dE/dt$)</li><li>Four-current (time-component: charge density, space-component: current density)</li><li>Electromagnetic four-potential</li></ul><div>Other quantities, like the electric and magnetic fields, even though they follow similar invariants (in the electromagnetic field example $E^2-B^2$), do not combine to form four-vectors, but instead objects called "tensors", which we will eventually talk about.</div><br />Note that during this transformation (giving something momentum), both mass and momentum increase. Similarly, time dilates when you move something around. This is again because $E^2-p^2$, not $E^2+p^2$ is invariant. The latter would correspond to a circular rotation, with invariant circles, whereas the former corresponds to a skew (a "hyperbolic rotation"), with invariant hyperbolae.<br /><br /><hr /><br /><b>Rapidity and hyperbolic rotations</b><br /><b><br /></b> <br /><div style="text-align: center;"><img src="https://upload.wikimedia.org/wikipedia/commons/8/8a/HyperbolicAnimation.gif" /></div><br /><div class="twn-furtherinsight">Points $(\cos\theta,\sin\theta)$, $(1,\tan\theta)$, $(\cosh\xi,\sinh\xi)$ and $(1,\tanh\xi)$ plotted for varying $\theta$ and $\xi$. While only $\theta$ can be interpreted as an angle too, both $\theta$ and $\xi$ can be interpreted as areas.</div><br />This will be a bit of a DIY section, with some guidance.<br /><br /><b>QUESTION 1</b><br /><b><br /></b><b>(a)</b> Consider the equation $v' = \frac{{v - w}}{{1 - vw}}$. What trigonometric identity does this remind you of? Could you resolve the differences somehow? 
(Hint: $v=\tanh\xi$)<br /><br /><b>(b) </b>Prove that the Lorentz transformations can be written as<br /><br />$$\begin{array}{l}t' = t\cosh \xi - x\sinh \xi \\x' = x\cosh \xi - t\sinh \xi \end{array}$$<br /><b>(c) </b>Use the hyperbolic analog of the angle-addition formulae to show that this is equivalent to the following, where $\phi=\mathrm{artanh}(x/t)$ is the rapidity of the point $(t,x)$ in the original reference frame and $s=\sqrt{t^2-x^2}$:<br /><br />$$\begin{array}{l}t' = s\cosh (\phi - \xi )\\x' = s\sinh (\phi - \xi )\end{array}$$<br /><b>(d) </b>The above result means that rapidity transforms as $\phi ' = \phi - \xi $ (which is itself nice, because it tells you that velocity at low speeds is approximately equal to rapidity, up to a factor of $c$) and $(t,x) = (s\cosh \phi ,s\sinh \phi )$. Relate the former to the idea of invariant hyperbolae and the interpretation of rapidity as an area (hint, hint: area swept out by a conic section... Kepler).<br /><b><br /></b><b>QUESTION 2</b><b><br /></b><b>(a) </b>Results 1(b) and 1(c) are very similar to the effect of rotations on co-ordinate transformations. Here the linear transformations are skews, not rotations, which is why the formulae are different. Draw as many analogs as you can between rotations and skews in linear algebra. Refer to Article <a href="https://thewindingnumber.blogspot.in/2017/08/1103-006.html" target="_blank">1103-006</a>. Think about the rotational transformation matrix, etc.<br /><br /><b>(b) </b>Consider (a) directly in the context of special relativity. Pretending that Lorentz boosts are simply rotations (which would imply a metric signature (+,+,+,+) and treat time exactly like space), explain transformations between time and position, etc. Relate this to the actual, skew-y Lorentz transformations. 
Describe how relativity would behave in this theory.<br /><br /><b>(c) </b>Write as many relativistic things as you can in the language of rapidity -- the Lorentz factor, the Doppler factor, components of a four-vector (how do $E$ and $p$ look in terms of rapidity), etc.<br /><br /><b>(d) </b>Graph the hyperbolic functions and explain why the graphs make the results in 2(b) make sense.<br /><br /><b>(e) </b>How does the rapidity interpretation make certain things, like $c$ being the maximum speed, natural?<br /><br /><b>QUESTION 3</b><b><br /></b><b>(a) </b>Consider once again the transformation $\phi ' = \phi - \xi $. What does this tell you about the relative rapidity $\Delta\phi$? Is this invariant, i.e. do all observers agree on what the relative rapidity between two objects is, as observers did for relative velocity in Galilean relativity?<br /><br /><b>(b) </b>Explain why it would be foolish to expect the quantity $\arctan{v}$, the Euclidean angle (as opposed to rapidity, which we may call the "Minkowskian angle"), to have any physical significance. Think about the quantity $r\arctan{v}$ where $r^2=\Delta t^2+\Delta x^2$ (no minus sign).<br /><br />It's therefore reasonable to define the dot product on spacetime as $\vec a \cdot \vec b = |\vec a||\vec b|\cosh \Delta \phi $ where $\Delta\phi$ is the relative rapidity/Minkowskian angle/difference in rapidity. This expression, which satisfies $|\vec a|^2=\vec a\cdot\vec a$, is manifestly (i.e. obviously) Lorentz invariant, since both norms and relative rapidity are invariant.<br /><br /><b>(c) </b>Translate this out of rapidity language, i.e. into a language where rapidity is not used as a parameterisation. You should get $a_0b_0-a_1b_1$ (where 0 and 1 are the temporal and spatial components respectively) in two dimensions.<br /><br />The fact that this modified dot product is invariant under a skew is analogous to how the standard dot product is invariant under rotations ("complex skews"). 
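A quick numerical sanity check of this invariance (a Python sketch, my own addition; the boost is the hyperbolic rotation from 1(b)):

```python
import math

def boost(t, x, xi):
    """Hyperbolic rotation (Lorentz boost) of rapidity xi, as in 1(b)."""
    return (t * math.cosh(xi) - x * math.sinh(xi),
            x * math.cosh(xi) - t * math.sinh(xi))

def minkowski_dot(a, b):
    """Two-dimensional Minkowski dot product a0*b0 - a1*b1."""
    return a[0] * b[0] - a[1] * b[1]

a, b, xi = (2.0, 0.5), (1.5, -0.3), 0.7
a2, b2 = boost(*a, xi), boost(*b, xi)
# The two values agree up to rounding: cosh^2(xi) - sinh^2(xi) = 1
# kills all the cross terms.
print(minkowski_dot(a, b), minkowski_dot(a2, b2))
```

The analogous check with $a_0b_0+a_1b_1$ fails, of course -- the ordinary Euclidean dot product is not preserved by skews.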
Indeed, it turns out that the 4-dimensional Minkowski dot product<br /><br />$${a_0}{b_0} - {a_1}{b_1} - {a_2}{b_2} - {a_3}{b_3}$$<br />is invariant under skews (between the time axis and some other axis) as well as spatial rotations (and all combinations thereof -- i.e. a general Lorentz transformation), as it contains both a "skew-y" part and a "standard dot product-y" part.<br /><br /><hr /><br />Some interesting things regarding 2(b):<br /><br />A circular Lorentz transformation would transform position and time something like this:<br /><br />$$\begin{array}{l}x' = \eta (x - vt)\\t' = \eta (t + vx)\end{array}$$<br />One can also talk about transforming the positive and negative sides of the axes separately.<br /><br />$$\begin{array}{l}x' = \eta (x - vt)\\t' = \eta (t + vx)\\ - x' = \eta ( - x - v( - t))\\ - t' = \eta ( - t - v( - x))\end{array}$$<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://i.stack.imgur.com/p8cEX.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="496" data-original-width="800" height="247" src="https://i.stack.imgur.com/p8cEX.png" width="400" /></a></div>Whereas with hyperbolic functions, there is no sign difference, so you only need to transform twice to return. This is linked to you having to differentiate circular functions four times to return, as opposed to twice for hyperbolic functions, all the sign differences between trigonometric and hyperbolic identities, the whole $ie^{i\theta}$ proof of Euler's formula, etc.four-vectors, hyperbolic functions, invariance, invariants, linear algebra, lorentz transformations, minkowski spacetime, rapidity, relativity, skews, spacetime, special relativityFri, 11 May 2018 06:21:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-3640812652580919131Abhimanyu Pallavi Sudhir2018-05-11T06:21:00ZAnswer by Abhimanyu Pallavi Sudhir for Why is 1 not a prime number?
https://math.stackexchange.com/questions/120/why-is-1-not-a-prime-number/2757408#2757408
<p>1 isn't a prime number for the same reason 0 isn't a basis vector.</p>
<p>Positive integers form "almost a linear algebra": the vector space <span class="math-container">$\mathbb{Z}_{>0}$</span> over the scalar field <span class="math-container">$\mathbb{Z}_{\ge0}$</span> (okay, this is not really a field -- it's a semiring, do what you want with it, but the idea is the same) with:</p>
<ul>
<li>Primes as the "unit basis vectors" </li>
<li>Multiplication as "vector addition" </li>
<li>Exponentiation as "scalar multiplication" (e.g. <span class="math-container">$p^k$</span> represents the scalar <span class="math-container">$k$</span> multiplied by the vector <span class="math-container">$p$</span>)</li>
<li>1 as the vector 0</li>
<li>1 as the scalar 1</li>
<li>0 as the scalar 0</li>
</ul>
<p>One may check this obeys all the axioms of linear algebra, except the existence of negatives (of vectors).</p>
<p>The reason you don't call the zero vector a basis vector is that it doesn't really add anything to the formalism if you consider "<span class="math-container">$0 + e_1 + e_2$</span>" to be the same representation as "<span class="math-container">$e_1+e_2$</span>", and if you consider it to be a different representation, you're violating the idea of each vector having a unique representation in a basis. Instead, 0 is just what you have when you haven't added anything, similarly 1 is just the empty product.</p>
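To make the analogy concrete, here is a small Python sketch (my own illustration, not part of the formalism above) of the "exponent vector" representation; note how 1 factorizes to the empty vector:

```python
from collections import Counter

def exponent_vector(n):
    """Factorize n > 0 into a Counter {prime: exponent} -- its 'coordinates'."""
    vec, p = Counter(), 2
    while p * p <= n:
        while n % p == 0:
            vec[p] += 1
            n //= p
        p += 1
    if n > 1:
        vec[n] += 1
    return vec  # 1 maps to the empty Counter: the zero vector

# Multiplication of integers is addition of exponent vectors:
assert exponent_vector(12) + exponent_vector(15) == exponent_vector(180)

# Co-primeness is orthogonality: the supports (nonzero coordinates) are disjoint.
def orthogonal(m, n):
    return not (exponent_vector(m).keys() & exponent_vector(n).keys())

print(orthogonal(18, 35), orthogonal(18, 12))  # prints: True False
```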
<p>Note that this formalism has a lot of other interesting analogies -- for example, co-primeness is "orthogonality". You could also extend the formalism to the (positive) rationals <span class="math-container">$\mathbb{Q}$</span> over the scalar field <span class="math-container">$\mathbb{Z}$</span> -- then it would satisfy the existence of negatives -- although co-primeness would be more complicated (e.g. 18 would be co-prime to 3/4).</p>Sat, 28 Apr 2018 12:52:57 GMThttps://math.stackexchange.com/questions/120/why-is-1-not-a-prime-number/2757408#2757408Abhimanyu Pallavi Sudhir2018-04-28T12:52:57ZComment by Abhimanyu Pallavi Sudhir on reference for higher spin - not gravitational nor stringy
https://mathoverflow.net/questions/195125/reference-for-higher-spin-not-gravitational-nor-stringy
On <a href="http://www.physicsoverflow.org/27048/reference-for-higher-spin-not-gravitational-nor-stringy?show=27499#a27499" rel="nofollow noreferrer">PhysicsOverflow</a>, there is a link to <a href="http://inspirehep.net/record/265411" rel="nofollow noreferrer">this paper</a> for the same question.Sun, 01 Mar 2015 02:25:25 GMThttps://mathoverflow.net/questions/195125/reference-for-higher-spin-not-gravitational-nor-stringy?cid=493513Abhimanyu Pallavi Sudhir2015-03-01T02:25:25ZComment by Abhimanyu Pallavi Sudhir on Classical and Quantum Chern-Simons Theory
https://mathoverflow.net/questions/159695/classical-and-quantum-chern-simons-theory
This has received an answer on PhysicsOverflow if you're still interested: <a href="http://www.physicsoverflow.org/22251/classical-and-quantum-chern-simons-theory#c22256" rel="nofollow noreferrer">Classical and Quantum Chern-Simons Theory</a>Thu, 14 Aug 2014 13:14:02 GMThttps://mathoverflow.net/questions/159695/classical-and-quantum-chern-simons-theory?cid=447277Abhimanyu Pallavi Sudhir2014-08-14T13:14:02ZComment by Abhimanyu Pallavi Sudhir on What is convolution intuitively?
https://mathoverflow.net/questions/5892/what-is-convolution-intuitively
<a href="http://en.wikipedia.org/wiki/File:Convolution_of_spiky_function_with_box2.gif" rel="nofollow noreferrer">Wikipedia</a>Fri, 17 Jan 2014 16:20:39 GMThttps://mathoverflow.net/questions/5892/what-is-convolution-intuitively?cid=396721Abhimanyu Pallavi Sudhir2014-01-17T16:20:39ZComment by Abhimanyu Pallavi Sudhir on Embedding of F(4) in OSp(8|4)?
https://mathoverflow.net/questions/111110/embedding-of-f4-in-osp84
Cross-posted to: <a href="http://physics.stackexchange.com/q/41155/23119">physics.stackexchange.com/q/41155/23119</a>Mon, 23 Dec 2013 04:35:50 GMThttps://mathoverflow.net/questions/111110/embedding-of-f4-in-osp84?cid=391443Abhimanyu Pallavi Sudhir2013-12-23T04:35:50ZComment by Abhimanyu Pallavi Sudhir on How to compare Unicode characters that "look alike"?
https://stackoverflow.com/questions/20674577/how-to-compare-unicode-characters-that-look-alike
I compared every single pixel of it, and it looks the same.Thu, 19 Dec 2013 09:26:53 GMThttps://stackoverflow.com/questions/20674577/how-to-compare-unicode-characters-that-look-alike?cid=30963612Abhimanyu Pallavi Sudhir2013-12-19T09:26:53ZComment by Abhimanyu Pallavi Sudhir on What is the definition of picture changing operation?
https://mathoverflow.net/questions/152295/what-is-the-definition-of-picture-changing-operation
Related: <a href="http://physics.stackexchange.com/q/12595/23119">physics.stackexchange.com/q/12595/23119</a>Thu, 19 Dec 2013 07:26:36 GMThttps://mathoverflow.net/questions/152295/what-is-the-definition-of-picture-changing-operation?cid=390438Abhimanyu Pallavi Sudhir2013-12-19T07:26:36ZComment by Abhimanyu Pallavi Sudhir on Understanding the intermediate field method for the $\phi^4$ interaction
https://mathoverflow.net/questions/149564/understanding-the-intermediate-field-method-for-the-phi4-interaction
@DanielSoltész: Nope, high-level questions generally get largely ignored there these days.Tue, 26 Nov 2013 14:40:20 GMThttps://mathoverflow.net/questions/149564/understanding-the-intermediate-field-method-for-the-phi4-interaction?cid=384774Abhimanyu Pallavi Sudhir2013-11-26T14:40:20ZComment by Abhimanyu Pallavi Sudhir on Intuition behind the ricci flow
https://mathoverflow.net/questions/143144/intuition-behind-the-ricci-flow/143146#143146
I was about to post the same thing, I think this is very illustrative.Tue, 19 Nov 2013 16:05:08 GMThttps://mathoverflow.net/questions/143144/intuition-behind-the-ricci-flow/143146?cid=383288#143146Abhimanyu Pallavi Sudhir2013-11-19T16:05:08ZComment by Abhimanyu Pallavi Sudhir on What is the relationship between complex time singularities and UV fixed points?
https://mathoverflow.net/questions/134939/what-is-the-relationship-between-complex-time-singularities-and-uv-fixed-points
This actually got twice the number of views here than on Physics.SE.Sun, 10 Nov 2013 14:50:44 GMThttps://mathoverflow.net/questions/134939/what-is-the-relationship-between-complex-time-singularities-and-uv-fixed-points?cid=381229Abhimanyu Pallavi Sudhir2013-11-10T14:50:44ZAnswer by Abhimanyu Pallavi Sudhir for The Fuchsian monodromy problem
https://mathoverflow.net/questions/146099/the-fuchsian-monodromy-problem/148462#148462
<p>Equation 6.2 is just the Liouville Action, the action principle for the <em>Liouville Field</em>, which is well-known from the familiar conformal gauge. </p>
<p>$$S_L=\frac{c}{96\pi}\int_\mathcal{M}\left(\dot\varphi^2-\frac{16\varphi}{\left(1-\lvert t\rvert^2\right)^2}\right)\mathrm{d}^2t$$ </p>
<p>... along with some trivial facts about partition functions. </p>
<p>You could of course think of it as the $Z_\mathcal{M}$'s (partition functions) of the metrics being related by the $S_L$'s in the same way that the metrics are related by the Liouville field. </p>
<p>And yes, I don't know how to spell "Lioivulle" properly. </p>Sun, 10 Nov 2013 06:53:28 GMThttps://mathoverflow.net/questions/146099/-/148462#148462Abhimanyu Pallavi Sudhir2013-11-10T06:53:28ZComment by Abhimanyu Pallavi Sudhir on Modular Arithmetic in LaTeX
https://mathoverflow.net/questions/18813/modular-arithmetic-in-latex
Haha, I thought this question was about typsetting a paper in $\LaTeX$Fri, 08 Nov 2013 11:34:52 GMThttps://mathoverflow.net/questions/18813/modular-arithmetic-in-latex?cid=379817Abhimanyu Pallavi Sudhir2013-11-08T11:34:52ZAnswer by Abhimanyu Pallavi Sudhir for String theory "computation" for math undergrad audience
https://mathoverflow.net/questions/47770/string-theory-computation-for-math-undergrad-audience/147307#147307
<p>Derive the Casimir Energy in Bosonic String Theory. </p>
<p>You start with the $\hat L_0$ operator and get rid of the non-vacuum $\displaystyle\frac{\alpha_0^2}{2}+\sum_{n=1}^\infty\alpha_{-n}\cdot\alpha_n$, then you use Ramanujan summation to do $\zeta$-function renormalisation, from which you find that the vacuum energy $\varepsilon_0$ is </p>
<p>$$\varepsilon_0=-\frac{d-2}{24}$$ </p>
<p>However, the most interesting part comes when you go around <a href="https://mathoverflow.net/a/140354/36148">deriving</a> the critical dimension of Bosonic String Theory. </p>
<p>After which, the expression surprisingly simplifies to $-1$. </p>
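A quick arithmetic check of that simplification (a Python sketch of my own, with the regularized value $\zeta(-1)=-\frac{1}{12}$ from the Ramanujan summation hard-coded):

```python
from fractions import Fraction

# zeta(-1) = -1/12, the regularized value of 1 + 2 + 3 + ...
ZETA_MINUS_1 = Fraction(-1, 12)

def casimir_energy(d):
    """Vacuum energy -(d - 2)/24: (d - 2) transverse modes, each contributing zeta(-1)/2."""
    return Fraction(d - 2, 2) * ZETA_MINUS_1

print(casimir_energy(26))  # prints: -1 at the critical dimension
```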
<p>For a more detailed derivation of the above stuff, see <a href="http://arxiv.org/pdf/hep-th/0207142v1.pdf" rel="nofollow noreferrer">these</a> lecture notes (Section 4, Equations 4.5-4.10). </p>Fri, 08 Nov 2013 04:33:41 GMThttps://mathoverflow.net/questions/47770/-/147307#147307Abhimanyu Pallavi Sudhir2013-11-08T04:33:41ZComment by Abhimanyu Pallavi Sudhir on Book on mathematical "rigorous" String Theory?
https://mathoverflow.net/questions/71909/book-on-mathematical-rigorous-string-theory/71998#71998
I don't think that BBS falls into the category of "mathematically rigorous". It's a very good, intuitive book.Fri, 08 Nov 2013 04:17:49 GMThttps://mathoverflow.net/questions/71909/book-on-mathematical-rigorous-string-theory/71998?cid=379753#71998Abhimanyu Pallavi Sudhir2013-11-08T04:17:49ZComment by Abhimanyu Pallavi Sudhir on About the massless supermultiplets in $2+1$ dimensional supersymmetry
https://mathoverflow.net/questions/103392/about-the-massless-supermultiplets-in-21-dimensional-supersymmetry
@S.Carnahan: The OP has voluntarily deleted it, which is weird... I have flagged this as unclear what you're asking.Wed, 06 Nov 2013 16:49:00 GMThttps://mathoverflow.net/questions/103392/about-the-massless-supermultiplets-in-21-dimensional-supersymmetry?cid=379331Abhimanyu Pallavi Sudhir2013-11-06T16:49:00ZAnswer by Abhimanyu Pallavi Sudhir for Does $SO(32) \sim_T E_8 \times E_8$ relate to some group theoretical fact?
https://mathoverflow.net/questions/57529/does-so32-sim-t-e-8-times-e-8-relate-to-some-group-theoretical-fact/147129#147129
<p>The answer to this question can be found in Lubos Motl's answer to <a href="https://physics.stackexchange.com/q/65092/23119">this question of mine on Physics.SE</a>. </p>
<p>The key here is the weight lattice $\Gamma$ of the bosonic representations of these gauge groups.</p>
<p>As I understand it, the weight lattice of $E(8)$ is $\Gamma^8$, whereas the weight lattice of $\frac{\operatorname{Spin}\left(32\right)}{\mathbb{Z}_2}$ is $\Gamma^{16}$. The first fact means that the weight lattice of $E(8)\times E(8)$ is $\Gamma^{8}\oplus\Gamma^8$. </p>
<p>Now, there is an identity $\Gamma^{8}\oplus\Gamma^8\oplus\Gamma^{1,1}=\Gamma^{16}\oplus\Gamma^{1,1}$, which is what allows this T-Duality. This means that it is <em>this very identity</em> which allows the relation mentioned in the original post. </p>
<p>So, the answer to your question is "<strong>Yes</strong>", there <em>is</em> a group-theoretical fact, and that is that $ \Gamma^{8}\oplus\Gamma^8\oplus\Gamma^{1,1}= \Gamma^{16}\oplus\Gamma^{1,1} $. </p>Wed, 06 Nov 2013 16:46:03 GMThttps://mathoverflow.net/questions/57529/does-so32-sim-t-e-8-times-e-8-relate-to-some-group-theoretical-fact/147129#147129Abhimanyu Pallavi Sudhir2013-11-06T16:46:03ZComment by Abhimanyu Pallavi Sudhir on Count of binary matrices that avoids a certain sub-matrix
https://mathoverflow.net/questions/30362/count-of-binary-matrices-that-avoids-a-certain-sub-matrix/30371#30371
@quid: Ok, I forgot about that.Tue, 29 Oct 2013 12:03:27 GMThttps://mathoverflow.net/questions/30362/count-of-binary-matrices-that-avoids-a-certain-sub-matrix/30371?cid=376986#30371Abhimanyu Pallavi Sudhir2013-10-29T12:03:27ZComment by Abhimanyu Pallavi Sudhir on Can the equation of motion with friction be written as Euler-Lagrange equation, and does it have a quantum version?
https://mathoverflow.net/questions/146042/can-the-equation-of-motion-with-friction-be-written-as-euler-lagrange-equation
Uh, how is this <i>Research-level</i>?Mon, 28 Oct 2013 11:14:41 GMThttps://mathoverflow.net/questions/146042/can-the-equation-of-motion-with-friction-be-written-as-euler-lagrange-equation?cid=376669Abhimanyu Pallavi Sudhir2013-10-28T11:14:41ZComment by Abhimanyu Pallavi Sudhir on Book on mathematical "rigorous" String Theory?
https://mathoverflow.net/questions/71909/book-on-mathematical-rigorous-string-theory/71914#71914
@MichaelKissner: Well, popular + semi-popular, to be precise (it has a semi-popular option).Tue, 17 Sep 2013 13:47:27 GMThttps://mathoverflow.net/questions/71909/book-on-mathematical-rigorous-string-theory/71914?cid=367280#71914Abhimanyu Pallavi Sudhir2013-09-17T13:47:27ZAnswer by Abhimanyu Pallavi Sudhir for Why does bosonic string theory require 26 spacetime dimensions?
https://mathoverflow.net/questions/99643/why-does-bosonic-string-theory-require-26-spacetime-dimensions/140354#140354
<p><em>Note that here, the $\hat L_n$ are operators on the state given by the sums of the dots of the mode operators, i.e. $\hat L_0=\sum_{n=-\infty}^\infty\hat\alpha_{-n}\cdot\hat\alpha_n$.</em> </p>
<p>Also note that the Virasoro Algebra is the central extension of the Witt/Conformal Algebra, which explains why we have a $D$: it plays the role of the central charge. </p>
<p>I'll expand on Chris Gerig's answer. </p>
<p>Not only do we need $D=26$, we also need the normal ordering constant $a=1$. The normal ordering constant is the eigenvalue of $\hat L_0$, with the state as its eigenvector. </p>
<p>We want to promote the time-like states to spurious, zero-norm states, right? So, we impose the (level 1) spurious state conditions on the state as follows ($|\chi\rangle$ are the basis vectors on which the spurious state $|\Phi\rangle$ is built): </p>
<p>$$ \begin{gathered}
0 = {{\hat L}_1}\left| \Phi \right\rangle \\
{\text{ }} = {{\hat L}_1}{{\hat L}_{ - 1}}\left| {{\chi _1}} \right\rangle \\
{\text{ }} = \left[ {{{\hat L}_1},{{\hat L}_{ - 1}}} \right]\left| {{\chi _1}} \right\rangle + {{\hat L}_{ - 1}}{{\hat L}_1}\left| {{\chi _1}} \right\rangle \\
{\text{ }} = \left[ {{{\hat L}_1},{{\hat L}_{ - 1}}} \right]\left| {{\chi _1}} \right\rangle \\
{\text{ }} = 2{{\hat L}_0}\left| {{\chi _1}} \right\rangle \\
{\text{ }} = 2\left( {a - 1} \right)\left| {{\chi _1}} \right\rangle \\
\end{gathered} $$</p>
<p>That means that $a=1$. </p>
<p>Now, for a level 2 spurious state, </p>
<p>$$\begin{gathered}
\left[ {{{\hat L}_1},{{\hat L}_{ - 2}} + k{{\hat L}_{ - 1}}{{\hat L}_{ - 1}}} \right]\left| \psi \right\rangle = \left( {3{{\hat L}_{ - 1}} + 2k{{\hat L}_0}{{\hat L}_{ - 1}} + 2k{{\hat L}_{ - 1}}{{\hat L}_0}} \right)\left| \psi \right\rangle {\text{ }} \\
{\text{ }} = \left( {\left( {3 - 2k} \right){{\hat L}_{ - 1}} + 4k{{\hat L}_0}{{\hat L}_{ - 1}}} \right)\left| \psi \right\rangle \\
0 = {{\hat L}_1}\left| \psi \right\rangle = {{\hat L}_1}\left( {{{\hat L}_{ - 2}} + k{{\hat L}_{ - 1}}{{\hat L}_{ - 1}}} \right)\left| {{\chi _1}} \right\rangle = \left( {\left( {3 - 2k} \right){{\hat L}_{ - 1}} + 4k{{\hat L}_0}{{\hat L}_{ - 1}}} \right)\left| {{\chi _1}} \right\rangle \\
{\text{ }} = \left( {\left( {3 - 2k} \right){{\hat L}_{ - 1}} + 4k{{\hat L}_{ - 1}}\left( {{{\hat L}_0} + 1} \right)} \right)\left| {{\chi _1}} \right\rangle \\
{\text{ }} = \left( {3 - 2k} \right){{\hat L}_{ - 1}}\left| {{\chi _1}} \right\rangle \\
2k = 3 \\
k = \frac{3}{2} \\
\end{gathered} $$ </p>
<p>Since this level 2 spurious state can be written as: </p>
<p>$$ \left| \Phi \right\rangle = \left( {{{\hat L}_{ - 2}} + k{{\hat L}_{ - 1}}{{\hat L}_{ - 1}}} \right)\left| {{\chi _2}} \right\rangle $$ </p>
<p>So, then, </p>
<p>$$ \begin{gathered}
{{\hat L}_2}\left| \Phi \right\rangle = 0 \\
{{\hat L}_2}\left( {{{\hat L}_{ - 2}} + \frac{3}{2}{{\hat L}_{ - 1}}{{\hat L}_{ - 1}}} \right)\left| {{\chi _2}} \right\rangle = 0 \\
\left[ {{{\hat L}_2},{{\hat L}_{ - 2}} + \frac{3}{2}{{\hat L}_{ - 1}}{{\hat L}_{ - 1}}} \right]\left| {{\chi _2}} \right\rangle + \left( {{{\hat L}_{ - 2}} + \frac{3}{2}{{\hat L}_{ - 1}}{{\hat L}_{ - 1}}} \right){{\hat L}_2}\left| {{\chi _2}} \right\rangle = 0 \\
\left[ {{{\hat L}_2},{{\hat L}_{ - 2}} + \frac{3}{2}{{\hat L}_{ - 1}}{{\hat L}_{ - 1}}} \right]\left| {{\chi _2}} \right\rangle = 0 \\
\left( {13{{\hat L}_0} + 9{{\hat L}_{ - 1}}{{\hat L}_{ + 1}} + \frac{D}{2}} \right)\left| {{\chi _2}} \right\rangle = 0 \\
\frac{D}{2} = 13 \\
\text{Since $L_0|\chi_2\rangle = -|\chi_2\rangle$ and $L_{+1}|\chi_2\rangle=0$, we have }
D = 26 \\
\end{gathered} $$ </p>
<p>And then, finally,</p>
<p>Q.E.D. </p>
<p>So, this was done essentially to remove the imaginary-norm ghost states, using the canonical / Gupta-Bleuler formalism. </p>
<p>It's also possible to use, say, light-cone gauge (LCG) quantisation. However, in other quantisation methods, the conformal anomaly is manifest in other forms. E.g., in LCG quantisation, it is manifest as a failure of Lorentz symmetry. A good overview of this method can be found in <strong>Kaku</strong>, <em>Strings, Conformal Fields, and M-Theory</em> (it's the only part of the book that I liked, actually. The rest of the book is too rigorous, without much physical intuition.). </p>Sun, 25 Aug 2013 09:40:17 GMThttps://mathoverflow.net/questions/99643/why-does-bosonic-string-theory-require-26-spacetime-dimensions/140354#140354Abhimanyu Pallavi Sudhir2013-08-25T09:40:17ZAnswer by Abhimanyu Pallavi Sudhir for Why is there a deep mysterious relation between string theory and number theory, elliptic curves, $E_8$ and the Monster group?
https://physics.stackexchange.com/questions/4748/why-is-there-a-deep-mysterious-relation-between-string-theory-and-number-theory/71301#71301
<p>I'll answer the relation between string theory and $E(8)$ -- a common appearance of $E(8)$ in string theory is in the gauge group of <a href="http://en.wikipedia.org/wiki/Type_HE_theory" rel="nofollow noreferrer">Type HE string theory</a> $E(8)\times E(8)$ (see <a href="https://physics.stackexchange.com/questions/68242/why-do-the-mismatched-16-dimensions-have-to-be-compactified-on-an-even-lattice">here</a> for an explanation why). But it's physically interesting because it embeds the Standard Model gauge group as a subgroup:</p>
<p>$$SU(3)\times SU(2)\times U(1)\subset SU(5)\subset SO(10)\subset E(6)\subset E(7)\subset E(8)$$ </p>
<p>Indeed, the groups in between are GUT groups, and $E(8)$ happens to be the "largest" of the exceptional Lie groups.</p>
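As a hedged sanity check on that chain (my own addition; just dimension counting with the standard formulas, which is necessary but nowhere near sufficient for the embeddings):

```python
def dim_su(n): return n**2 - 1          # dim SU(n) = n^2 - 1
def dim_so(n): return n * (n - 1) // 2  # dim SO(n) = n(n-1)/2

# Dimensions along the chain SU(3)xSU(2)xU(1) < SU(5) < SO(10) < E6 < E7 < E8
chain = [
    ("SU(3)xSU(2)xU(1)", dim_su(3) + dim_su(2) + 1),
    ("SU(5)", dim_su(5)),
    ("SO(10)", dim_so(10)),
    ("E6", 78), ("E7", 133), ("E8", 248),  # standard exceptional dimensions
]
# Each step must strictly grow in dimension for the embeddings to be possible.
dims = [d for _, d in chain]
assert dims == sorted(dims) and len(set(dims)) == len(dims)
print(dims)  # prints: [12, 24, 45, 78, 133, 248]
```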
<p><a href="http://en.wikipedia.org/wiki/Monstrous_moonshine#Borcherds.27_proof" rel="nofollow noreferrer">Wikipedia</a> has some things to say about the connections to monstrous moonshine, though I'm not familiar with it. See <a href="https://physics.stackexchange.com/questions/5207/number-of-dimensions-in-string-theory-and-possible-link-with-number-theory?lq=1#comment13658_5207">[1]</a>, <a href="https://physics.stackexchange.com/questions/5207/number-of-dimensions-in-string-theory-and-possible-link-with-number-theory?lq=1#comment13659_5207">[2]</a> re: the connections to number theory. Another example is how "1+2+3+4=10" demonstrates a 10-dimensional theory's ability to explain the four fundamental forces -- EM is the curvature of the $U(1)$ bundle, the weak force is the curvature of the $SU(2)$ bundle, the strong is the curvature of the $SU(3)$ bundle and gravity is the curvature of spacetime.</p>
<p>[Archiving Ron Maimon's comment here in case it gets deleted --]</p>
<blockquote>
<p>There is another point, that E(8) <s>is</s> has embedded E6xSU(3), and on a Calabi Yau, the SU(3) is the holonomy, so you can easily and naturally break the E8 to E6. This idea appears in Candelas Horowitz Strominger Witten in 1985, right after Heterotic strings and it is still the easiest way to get the MSSM. The biggest obstacle is to get rid of the MS part--- you need a SUSY breaking at high energy that won't wreck the CC or produce a runaway Higgs mass, since it seems right now there is no low-energy SUSY. </p>
</blockquote>Tue, 16 Jul 2013 18:46:52 GMThttps://physics.stackexchange.com/questions/4748/-/71301#71301Abhimanyu Pallavi Sudhir2013-07-16T18:46:52ZAnswer by Abhimanyu Pallavi Sudhir for Coincidence, purposeful definition, or something else in formulas for energy
https://physics.stackexchange.com/questions/71119/coincidence-purposeful-definition-or-something-else-in-formulas-for-energy/71121#71121
<p>Most of them (all of your examples except <span class="math-container">$E=c^2m$</span>, which is really just <span class="math-container">$E=m$</span> anyway) arise from integrating a linear equation like <span class="math-container">$p=mv$</span> as <span class="math-container">$E=\int v\,dp$</span>, and it is often just a convention that we choose the linear relation to have a constant of proportionality of 1, so the integral has a constant of 1/2 (for example, we could've instead chosen, like we do with areas of circles, to have <span class="math-container">$c=2\pi r$</span> and <span class="math-container">$A=\pi r^2$</span>). </p>Mon, 15 Jul 2013 04:01:14 GMThttps://physics.stackexchange.com/questions/71119/-/71121#71121Abhimanyu Pallavi Sudhir2013-07-15T04:01:14ZAnswer by Abhimanyu Pallavi Sudhir for Does the curvature of spacetime theory assume gravity?
https://physics.stackexchange.com/questions/7781/does-the-curvature-of-spacetime-theory-assume-gravity/69092#69092
<p>No. While the curvature of spacetime -- or even Newtonian gravity, for that matter -- indeed can be modeled as a "potential well", the tendency of matter to lower this potential is an axiom of general relativity, and is not gravity. </p>
<p>The mathematics of general relativity can be derived from four important physical axioms -- (1) the Einstein-Hilbert action, or "gravity is the curvature of spacetime", or equivalently the Einstein-Field Equation, "matter curves spacetime" -- see <a href="https://physics.stackexchange.com/questions/3009/how-exactly-does-curved-space-time-describe-the-force-of-gravity/68707#68707">my answer here</a> for a derivation of the EFE from the action, (2) the geodesic equation, or "the geometry of spacetime moves matter", (3) Newtonian gravity is effective at low energies and (4) special relativity. So while it is true that general relativity assumes some law on whose basis matter moves (the geodesic equation), this law is not "gravity".</p>Tue, 25 Jun 2013 05:58:21 GMThttps://physics.stackexchange.com/questions/7781/-/69092#69092Abhimanyu Pallavi Sudhir2013-06-25T05:58:21ZAnswer by Abhimanyu Pallavi Sudhir for Gravity is an intrinsic property of every atoms?
https://physics.stackexchange.com/questions/68998/gravity-is-an-intrinsic-property-of-every-atoms/69017#69017
<p>Atoms are overrated among laymen; gravity is a property of all matter, regardless of what particle structures it is comprised of -- for example, light, dark matter, government bureaucrats and other exotic forms of matter all exhibit gravity.</p>
<p>That answers your first question -- I have no idea what the others are supposed to mean, what their words mean and how they are connected.</p>Mon, 24 Jun 2013 09:40:18 GMThttps://physics.stackexchange.com/questions/68998/-/69017#69017Abhimanyu Pallavi Sudhir2013-06-24T09:40:18ZAnswer by Abhimanyu Pallavi Sudhir for How exactly does curved space-time describe the force of gravity?
https://physics.stackexchange.com/questions/3009/how-exactly-does-curved-space-time-describe-the-force-of-gravity/68707#68707
<p>It is straightforward to see how the <em>geometry</em> of spacetime describes the force of gravity -- you just need to understand the geodesic equation, which in general relativity describes the paths of things subject to gravity and nothing else. This is the "spacetime affects matter" side of the theory.</p>
<p>To understand why curvature in particular, as a property of the geometry, is important, you need to understand the "matter affects spacetime" side of general relativity. The postulate is that the gravitational Lagrangian of the theory is proportional to the scalar curvature -- this is called the "Einstein-Hilbert Action" --</p>
<p>$$S=\int{\left( {\lambda R + {{\mathcal{L}}_M}} \right)\sqrt { - g}\, {d^4}x} $$</p>
<p>You set the variation in the action to zero, as with any classical theory, and solve for the equations of motion. The conventional way to do this goes something like this --</p>
<p>$$\int{\left( {\frac{{\delta \left( {\left( {{{\mathcal{L}}_M} + \lambda R} \right)\sqrt { - g} } \right)}}{{\delta {g_{\mu \nu }}}}} \right)\delta {g_{\mu \nu }}\,d{x^4}} = 0$$
$$ \sqrt { - g} \frac{{\delta {{\mathcal{L}}_M}}}{{\delta {g_{\mu \nu }}}} + \lambda \sqrt { - g} \frac{{\delta R}}{{\delta {g_{\mu \nu }}}} + \left( {{{\mathcal{L}}_M} + \lambda R} \right)\frac{{\delta \sqrt { - g} }}{{\delta {g_{\mu \nu }}}} = 0 $$
$$ \frac{{\delta R}}{{\delta {g_{\mu \nu }}}} + \frac{R}{{\sqrt { - g} }}\frac{{\delta \sqrt { - g} }}{{\delta {g_{\mu \nu }}}} = - \frac{1}{\lambda }\left( {\frac{1}{{\sqrt { - g} }}{{\mathcal{L}}_M}\frac{{\delta \sqrt { - g} }}{{\delta {g_{\mu \nu }}}} + \frac{{\delta {{\mathcal{L}}_M}}}{{\delta {g_{\mu \nu }}}}} \right)$$</p>
<p>$$ {R_{\mu \nu }} - \frac{1}{2}R{g_{\mu \nu }} = \frac{1}{{2\lambda }}{T_{\mu \nu }}$$</p>
<p>To fix the value of $\kappa=1/{2\lambda}$, we impose Newtonian gravity at low energies, for which we only consider the time-time component, which Newtonian gravity describes (I'll use $C$ for the gravitational constant, reserving $G$ for the trace of the Einstein tensor) -- </p>
<p>$$\begin{gathered}
{G_{00}} = \kappa c^4\rho \\
{R_{00}} = {G_{00}} - \frac{1}{2}Gg_{00} \\
\Rightarrow {R_{00}} \approx \kappa \left( {c^4\rho - \frac{1}{2}\frac{1}{{c^2}}c^4\rho c^2} \right) \approx \frac{1}{2}\kappa c^4\rho \\
\end{gathered} $$</p>
<p>Imposing Poisson's law from Newtonian gravity with $\partial^2\Phi$ approximating $\Gamma _{00,\alpha }^\alpha $,</p>
<p>$$ 4\pi C\rho \approx {\nabla ^2}\Phi \approx \Gamma _{00,\alpha }^\alpha \approx {R_{00}} \approx \frac{\kappa }{2}c^4\rho \\
\Rightarrow \kappa = \frac{{8\pi C}}{{c^4}} \\
$$</p>
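To get a feel for the size of this coupling, here is a quick numerical evaluation (a sketch of my own; $C$ denotes the gravitational constant, as in the derivation above, and the values are rounded CODATA figures):

```python
import math

# SI values; C is Newton's gravitational constant, called C in the
# derivation above to reserve G for the trace of the Einstein tensor.
C = 6.674e-11   # m^3 kg^-1 s^-2
c = 2.998e8     # m s^-1

kappa = 8 * math.pi * C / c**4
print(kappa)  # ~2.1e-43: enormous stress-energy is needed for modest curvature
```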
<p>(The fact that this is possible is fantastic -- it means that simply postulating that spacetime is curved in a certain sense produces a force that agrees with our observations regarding gravity at low energies.) Giving us the Einstein-Field Equation,</p>
<p>$${G_{\mu \nu }} = \frac{{8\pi G}}{{c^4}}{T_{\mu \nu }}$$</p>Fri, 21 Jun 2013 13:35:32 GMThttps://physics.stackexchange.com/questions/3009/-/68707#68707Abhimanyu Pallavi Sudhir2013-06-21T13:35:32ZAnswer by Abhimanyu Pallavi Sudhir for String theory: why not use $n$-dimensional blocks/objects/branes?
https://physics.stackexchange.com/questions/66948/string-theory-why-not-use-n-dimensional-blocks-objects-branes/68699#68699
<p>There are, actually. Dilaton already covered the reason through T-duality, so I will discuss the requirement of $p$-branes imposed by Ramond-Ramond potentials.</p>
<p>The worldsheet of a string can couple to a Neveu-Schwarz B-field (the antisymmetric $\varepsilon^{ab}$ is forced by the antisymmetry of $B_{\mu\nu}$):
$$q\int_{}^{} {{\varepsilon ^{ab}}\frac{{\partial {X^\mu }}}{{\partial {\xi ^a}}}\frac{{\partial {X^\nu }}}{{\partial {\xi ^b}}}B_{\mu \nu }\,{{\text{d}}^2}\xi } $$</p>
<p>($q$ is the electric charge.) The worldsheet of a string can couple to the graviton field (the spacetime metric):
$$m\int_{}^{} {{{h^{ab}}}\frac{{\partial {X^\mu }}}{{\partial {\xi ^a}}}\frac{{\partial {X^\nu }}}{{\partial {\xi ^b}}}g_{\mu \nu }\sqrt { - \det {h_{ab}}} {{\text{d}}^2}\xi } $$</p>
<p>You can change the "$m$" to any form you like, in terms of the tension/Regge Slope parameter/string length etc.</p>
<p>For a dilaton field,
$${q }\ell _P^2\int_{}^{} {\Phi R\sqrt { - \det {h_{\alpha \beta }}} {\text{ }}{{\text{d}}^2}\xi } $$
Ignore conformal invariance for the time being.</p>
<p>But what about Ramond-Ramond potentials? All is fine with the Ramond-Ramond field strengths, but it is clear that the Ramond-Ramond potentials $C_{p+1}$ can't couple to the worldsheet in the same way. They can, however, couple to a higher-dimensional worldvolume --
$${q_{{\text{RR}}}}\int_{}^{} {C_{{\mu _0}...{\mu _p}}^{p + 1}\frac{{\partial {x^{{\mu _0}}}}}{{\partial {\xi ^{{a_0}}}}}...\frac{{\partial {x^{{\mu _p}}}}}{{\partial {\xi ^{{a_p}}}}}{\varepsilon ^{{a_0}...{a_p}}}\,{{\text{d}}^{p + 1}}\xi } $$</p>
<p>This requires membranes and other higher-dimensional objects. It's interesting to note that while 10-dimensional string theories permit all sorts of these branes, M-theory only permits 2- and 5-dimensional branes.</p>Fri, 21 Jun 2013 10:46:35 GMThttps://physics.stackexchange.com/questions/66948/-/68699#68699Abhimanyu Pallavi Sudhir2013-06-21T10:46:35ZAnswer by Abhimanyu Pallavi Sudhir for What happens with the force of gravity when the distance between two objects is 0?
https://physics.stackexchange.com/questions/68519/what-happens-with-the-force-of-gravity-when-the-distance-between-two-objects-is/68522#68522
<p>If the distance to a (point-sized) object actually were zero, indeed you'd have a singularity. But in your example -- being at the centre of the Earth -- the force is actually zero, since equal forces are acting on you from all directions. In general, the inverse-square law applies to point particles, and it needs to be integrated over all the points of the Earth exerting a gravitational force on the object to get the resultant force.</p>
<p>$${\vec F_{{\rm{res}}}} = Gm\iiint_{|\vec x|<\,1}{\frac{{\left( {\vec x - \vec r} \right)dm}}{{{{\left| {\vec r - \vec x} \right|}^3}}}}
$$</p>
<p>Evaluating this in the special cases (taking the Earth's radius as the unit of length): where $|\vec r|>1$, you can use the inverse-square law, replacing the sphere by its centre and using its total mass; where $|\vec r|<1$, you can use Newton's shell theorem and discount the part of the Earth above your head.</p>
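<p>These special cases can be checked numerically. Below is a Monte Carlo sketch of my own (unit Earth radius, unit total mass, $G = m = 1$, with a small softening term I added to tame the variance): the net pull at an interior point matches the shell-theorem prediction from the enclosed mass alone.</p>

```python
import numpy as np

rng = np.random.default_rng(0)
N = 400_000

# Uniformly sample the unit ball (the "Earth", total mass 1) by rejection.
cube = rng.uniform(-1.0, 1.0, size=(3 * N, 3))
pts = cube[(cube**2).sum(axis=1) < 1.0][:N]

r = np.array([0.5, 0.0, 0.0])        # test point buried halfway down
d = pts - r                          # from the test point to each mass element
dist2 = (d**2).sum(axis=1) + 1e-4    # softened squared distance
F = (d / dist2[:, None]**1.5).sum(axis=0) / N   # inverse-square sum

# Shell theorem: only the mass below radius 0.5 pulls, as if at the centre.
predicted = -(0.5**3) / 0.5**2       # = -0.5, pointing back toward the centre
print(F[0], predicted)               # F[0] comes out close to -0.5
```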
<p>Regarding your new questions -- if you're at the midpoint between equal masses, you're stationary; if you're on the perpendicular bisector, you'll be drawn towards the midpoint because that's how vectors add up; if you're buried inside one of the two masses, the only forces acting on you will be from the sphere of mud under you and the other mass far away (the other mass isn't part of the shell above you).</p>
<p><strong>Gallery</strong></p>
<p><img src="https://i.stack.imgur.com/tkhpw.png" alt="enter image description here"></p>
<p>The force is zero anywhere inside a spherical shell: the nearer part of the shell is smaller, and the farther part is larger, in exactly compensating proportion.</p>
<p><img src="https://i.stack.imgur.com/h0Zhy.png" alt="enter image description here"></p>
<p>Newton's shell theorem -- the blue guy only feels gravity from the mud-sphere under his feet.</p>
<p><img src="https://i.stack.imgur.com/IbQeN.png" alt="enter image description here"></p>
<p>Three-body problem where one of the guys is a useless test mass who lives on the perpendicular bisector of the two bodies.</p>
<p><img src="https://i.stack.imgur.com/SOTgP.png" alt="enter image description here"></p>
<p>The orange horse inside the black mass feels gravity from both the mud-sphere under it and from the red mass.</p>Wed, 19 Jun 2013 05:38:52 GMThttps://physics.stackexchange.com/questions/68519/-/68522#68522Abhimanyu Pallavi Sudhir2013-06-19T05:38:52ZAnswer by Abhimanyu Pallavi Sudhir for Is velocity of light constant?
https://physics.stackexchange.com/questions/66856/is-velocity-of-light-constant/68513#68513
<p>There are two questions here -- is the velocity of light <em>constant</em>, and is it <em>invariant</em>?</p>
<p>The direction/velocity of light changes whenever it interacts with something. This includes gravitational deflection, since things have to change direction in curved spacetime in one sense or another. The velocity isn't constant.</p>
<p>Is it invariant under Lorentz boosts in perpendicular directions? <em>No.</em> The speed is invariant, but the velocity isn't. This should be fairly clear, but you can prove it by brute force --</p>
<p>We need to apply a boost to light's four-velocity, but light doesn't have a well-defined four-velocity -- formally it's $(\infty, \infty, 0, 0)$, with the infinities related to each other through a limit. So we instead consider an object traveling at speed $w$ in the $x$-direction, boost by $v$ in the $y$-direction and let $w\to c$. The four-velocity transforms under this boost as:</p>
<p>$$\left[ {\begin{array}{*{20}{c}}{\gamma (w)}\\{w\gamma (w)}\\0\\0\end{array}} \right] \to \left[ {\begin{array}{*{20}{c}}{\gamma (v)\gamma (w)}\\{w\gamma (w)}\\{ - v\gamma (v)\gamma (w)}\\0\end{array}} \right]$$</p>
<p>The conventional 3-velocity can be extracted here by considering $dx/dt$, $dy/dt$:</p>
<p>$$\frac{{dx}}{{dt}} = \frac{{dx/d\tau }}{{dt/d\tau }} = \frac{{w\gamma (w)}}{{\gamma (v)\gamma (w)}} = \frac{w}{{\gamma (v)}}$$
$$\frac{{dy}}{{dt}} = \frac{{dy/d\tau }}{{dt/d\tau }} = \frac{{ - v\gamma (v)\gamma (w)}}{{\gamma (v)\gamma (w)}} = - v$$</p>
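<p>One can let a computer algebra system do this bookkeeping; here is a sketch of my own (units $c = 1$) confirming that the resulting speed is still exactly 1:</p>

```python
# Symbolic check (units c = 1) that the boosted velocity still has speed 1.
import sympy as sp

v, w = sp.symbols('v w', positive=True)
gamma = lambda u: 1 / sp.sqrt(1 - u**2)

# Boosted four-velocity components (dt/dtau, dx/dtau, dy/dtau, dz/dtau).
U = [gamma(v) * gamma(w), w * gamma(w), -v * gamma(v) * gamma(w), 0]
three_v = [sp.simplify(comp / U[0]) for comp in U[1:]]   # (dx/dt, dy/dt, dz/dt)

speed2 = sp.simplify(sum(comp**2 for comp in three_v).subs(w, 1))
print(three_v, speed2)   # speed2 is exactly 1
```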
<p>Taking the limit as $w\to c=1$, you get a 3-velocity of $(1/\gamma(v),-v, 0)$ -- one may confirm that this is not the original three-velocity $(1,0,0)$, but it nonetheless has the same magnitude (the speed is invariant).</p>Wed, 19 Jun 2013 04:17:58 GMThttps://physics.stackexchange.com/questions/66856/-/68513#68513Abhimanyu Pallavi Sudhir2013-06-19T04:17:58ZAnswer by Abhimanyu Pallavi Sudhir for Momentum of a particle?
https://physics.stackexchange.com/questions/68403/momentum-of-a-particle/68427#68427
<p>The point of defining momentum is to have a conserved vector quantity relating to motion -- the formal definition of this comes from Noether's theorem, where momentum is the conserved charge resulting from translational invariance.</p>
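<p>As a minimal illustration of that Noether statement (my own sketch, in one dimension): if the Lagrangian has no explicit dependence on $x$, the Euler-Lagrange equation forces $p = \partial L/\partial\dot x$ to be conserved.</p>

```python
# Translation invariance => momentum conservation, for a free particle.
import sympy as sp

t, m = sp.symbols('t m', positive=True)
x = sp.Function('x')

Lag = m * x(t).diff(t)**2 / 2        # no explicit x-dependence
p = sp.diff(Lag, x(t).diff(t))       # canonical momentum, m*xdot

# Euler-Lagrange: dp/dt = dL/dx, and the right-hand side vanishes here.
print(p, sp.diff(Lag, x(t)))
```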
<p>It's often conventional in mechanics to refer to momentum as an "amount of motion" or "how much a mass moves", but this is a rather vague statement, since there's no reason the same description can't be made of kinetic energy, for example.</p>Tue, 18 Jun 2013 06:28:42 GMThttps://physics.stackexchange.com/questions/68403/-/68427#68427Abhimanyu Pallavi Sudhir2013-06-18T06:28:42ZAnswer by Abhimanyu Pallavi Sudhir for Capacitors' working in a circuit
https://physics.stackexchange.com/questions/68387/capacitors-working-in-a-circuit/68426#68426
<p>The answer is just "yes, obviously, the voltage is zero". The answer below is unnecessarily computational, but I'm keeping it in case someone likes that.</p>
<hr>
<p><strong>Archived answer</strong></p>
<p>I'll assume you are talking about a circuit with a capacitor and a resistor. Then, let $Q$ be the charge, $t$ the time, $C$ the capacitance, $R$ the resistance, $T = RC$ the time constant, and $V$ the electromotive force. You must know of the differential equation:
$$\frac{{{\text{d}}Q}}{{{\text{d}}t}} = \frac{{ - Q + CV}}{T}$$</p>
<p>Separating variables and integrating (with $Q(0) = 0$),
$$\frac{{{\text{d}}Q}}{{{\text{d}}t}} = \frac{V}{R}\exp \left( { - \frac{t}{T}} \right)$$</p>
<p>Since the current is the rate of change in the charge with respect to time, we can rewrite this equation in the following form:</p>
<p>$$I = \frac{V}{R}\exp \left( { - \frac{t}{T}} \right)$$</p>
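<p>(A quick symbolic check of mine that this exponential indeed solves the differential equation above, with $T = RC$ and $Q(0) = 0$:)</p>

```python
import sympy as sp

t, R, C, V = sp.symbols('t R C V', positive=True)
T = R * C                                   # time constant

Q = C * V * (1 - sp.exp(-t / T))            # charge with Q(0) = 0
residual = sp.simplify(Q.diff(t) - (-Q + C * V) / T)
I = sp.simplify(Q.diff(t))                  # the current, V/R * exp(-t/T)

print(residual)                             # 0: the ODE is satisfied
```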
<p>So, the potential of the capacitor becoming equal to the potential of the battery means the net driving voltage -- the $V$ in the formula above -- is effectively $0$, so
$$I = \frac{0}{R}\exp \left( { - \frac{t}{T}} \right)=0$$</p>
<p>So yes, although it will take an infinite amount of time to reach this point.</p>Tue, 18 Jun 2013 06:20:12 GMThttps://physics.stackexchange.com/questions/68387/-/68426#68426Abhimanyu Pallavi Sudhir2013-06-18T06:20:12ZAnswer by Abhimanyu Pallavi Sudhir for Measuring extra-dimensions
https://physics.stackexchange.com/questions/22542/measuring-extra-dimensions/68414#68414
<p>The standard way to measure compactified dimensions is to test some inverse-square law (e.g. Newton's, electromagnetic, diffusion) at short distances and see if it breaks down and starts approaching some other (higher-power) inverse-power law.</p>
<p>In fact, the inverse-square law has only been verified down to a scale of 0.1mm -- here's a recent experimental paper doing this: <a href="http://arxiv.org/abs/hep-ph/0011014v1" rel="nofollow noreferrer">[1]</a>.</p>
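<p>To illustrate the crossover such experiments look for, here's a toy model of my own (not from the cited paper): with one extra dimension compactified on a circle of circumference $L$, the potential of a point mass becomes a sum over images along the circle, interpolating between 5-dimensional $1/r^2$ behaviour at $r \ll L$ and the familiar $1/r$ at $r \gg L$.</p>

```python
import numpy as np

L = 1.0                              # circumference of the extra dimension
n = np.arange(-100_000, 100_001)     # tower of image charges along the circle

def V(r):
    # Sum of 5D point potentials (each ~ 1/rho^2) over the image tower.
    return np.sum(1.0 / (r**2 + (n * L)**2))

print(V(0.01) * 0.01**2)         # ~1: behaves like 1/r^2 far below L (5D regime)
print(V(100.0) * 100.0 / np.pi)  # ~1: behaves like pi/(L*r) far above L (4D regime)
```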
<p>(Yes, you can measure time in metres, by multiplying by the speed of light. This is where "lightseconds" and other such measurements of distance come from. An example motivation for treating this as the unit of the time dimension is from the Minkowski metric, $ds^2=c^2dt^2-dx^2-dy^2-dz^2$, where $ct$ is a dimension analogous to the spatial ones.)</p>Tue, 18 Jun 2013 04:16:35 GMThttps://physics.stackexchange.com/questions/22542/-/68414#68414Abhimanyu Pallavi Sudhir2013-06-18T04:16:35ZAnswer by Abhimanyu Pallavi Sudhir for A change in the gravitational law
https://physics.stackexchange.com/questions/41109/a-change-in-the-gravitational-law/68326#68326
<p>Such a change requires a 4+1-dimensional spacetime instead of a 3+1-dimensional one -- this would have several serious implications --</p>
<ol>
<li><p>The Riemann curvature tensor gains new "parts", with interesting physical implications, with each new spacetime dimension: 1-dimensional manifolds have no curvature in this sense; 2-dimensional manifolds have a scalar curvature; 3-dimensional manifolds gain the full Ricci tensor; 4-dimensional manifolds get components corresponding to a new Weyl tensor; and 5-dimensional geometry gets even more components. General relativity in such a spacetime is capable of describing electromagnetism, too (the Kaluza-Klein mechanism), so electromagnetism (along with the radion field) starts behaving as a part of gravity.</p></li>
<li><p>Apparently a 5-dimensional spacetime is unstable, according to wikipedia's "privileged character of 3+1-dimensional spacetime"<a href="http://en.wikipedia.org/wiki/Spacetime#Privileged_character_of_3.2B1_spacetime" rel="nofollow noreferrer">[1]</a> (now a transclusion of <a href="https://en.wikipedia.org/wiki/Anthropic_principle#Dimensions_of_spacetime" rel="nofollow noreferrer">[2]</a>).</p></li>
<li><p>The string theory landscape would be a bit smaller, since there are fewer dimensions to compactify.</p></li>
<li><p>The Ricci curvature in a vacuum on an Einstein Manifold would no longer be exactly $\Lambda g_{ab}$. There will be a coefficient of 2/3.</p></li>
<li><p>The magnetic field, among other things "cross product-ish", could not be written as a vector, unlike the electric field. This is because it would have 6 components whereas space would have only 4 dimensions. So perhaps humans would become familiar with exterior algebras earlier than we did in 3+1 dimensions. Either that, or we would still be trying to find out how magnetism works. Or we would just die out, for all the other reasons.</p></li>
<li><p>In string theory (see e.g. <a href="http://arxiv.org/abs/hep-th/0207249v1" rel="nofollow noreferrer">[3]</a>), gravitational constants in successively higher dimensions are calculated as $G_{n+1}=l_sG_n$, where $l_s$ is the string length (the units must be different in order to accommodate the extra factor of $r$ in Newton's gravitational law). For distance scales greater than the string length, this causes gravity to be much weaker than in our number of dimensions, but stronger for length scales shorter than the string length. It's interesting how gravity's long-range ability peaks at 4 dimensions (it is a contact force below 4 dimensions).</p></li>
</ol>
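<p>Point 5 above can be made quantitative in one line: the magnetic part of the field strength is an antisymmetric 2-form over space, with $\binom{d}{2}$ independent components, and $\binom{d}{2} = d$ only when $d = 3$. A trivial check:</p>

```python
from math import comb

# Number of magnetic components vs. number of spatial dimensions d.
for d in range(2, 7):
    print(d, comb(d, 2), comb(d, 2) == d)   # True only for d = 3
```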
<p>See also some recent tests of the inverse-square law at short length scales (checking for compactification): <a href="http://arxiv.org/abs/hep-ph/0011014" rel="nofollow noreferrer">[4]</a>.</p>Mon, 17 Jun 2013 10:12:52 GMThttps://physics.stackexchange.com/questions/41109/-/68326#68326Abhimanyu Pallavi Sudhir2013-06-17T10:12:52ZAnswer by Abhimanyu Pallavi Sudhir for Mass of a superstring between two branes?
https://physics.stackexchange.com/questions/46118/mass-of-a-superstring-between-two-branes/68240#68240
<p>It's similar -- </p>
<p>$${m^2} = \left( {N - a} \right) + {\left( {\frac{y}{{2\pi }}} \right)^2}$$</p>
<p>The important difference is that the number operator and normal ordering constant change for a superstring, and vary by sector.</p>Sun, 16 Jun 2013 11:12:27 GMThttps://physics.stackexchange.com/questions/46118/-/68240#68240Abhimanyu Pallavi Sudhir2013-06-16T11:12:27ZAnswer by Abhimanyu Pallavi Sudhir for Multiplying vectors (answered own question)
https://math.stackexchange.com/questions/414475/multiplying-vectors-answered-own-question/414476#414476
<p>The dot product and cross product both appear as components of the tensor product of two vectors, $v^\mu w^\nu$, which gives a rank-2 tensor. The dot product is the contraction/trace $v^\mu w_\mu$, which is useful due to its invariance properties, and the cross product appears as the antisymmetric part $v^\mu w^\nu - v^\nu w^\mu$ (which in three dimensions can be repackaged, somewhat misleadingly, as a vector along the axis of rotation, even though it doesn't transform as one).</p>
<p>An alternative formulation of this is geometric algebra, where the cross product is replaced by the wedge product, the dot product is still the inner product, and their sum is the "geometric product", which plays the role of the tensor product.</p>
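<p>A small numerical illustration of my own of the decomposition described above, in three Euclidean dimensions:</p>

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])
w = np.array([-4.0, 0.5, 2.0])

T = np.outer(v, w)        # the rank-2 tensor v^mu w^nu
dot = np.trace(T)         # the contraction v^mu w_mu (Euclidean metric)
A = T - T.T               # the antisymmetric part v^mu w^nu - v^nu w^mu

# The usual "axis" repackaging of the antisymmetric part:
cross = np.array([A[1, 2], A[2, 0], A[0, 1]])
print(np.allclose(dot, np.dot(v, w)), np.allclose(cross, np.cross(v, w)))
# True True
```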
<hr>
<p>ARCHIVED ANSWER (Jun 2013) follows, I no longer endorse the below contents --</p>
<ol>
<li><p>Dot product (Scalar Product)</p>
<p>The dot product, you could say, very hand-wavily measures both the overall size of 2 vectors and how parallel they are.</p>
<p>The dot product is related to the magnitudes and angles of the two vectors by:
$$\vec a\cdot\vec b=||\vec a||\mbox{ }||\vec b||\cos\theta$$</p>
<p>So, if the two vectors are orthogonal, their dot product is 0. If they are parallel, their dot product is the product of the two magnitudes. The latter is always the case for scalars, so in this sense the dot product is a generalisation of the ordinary product of scalars in $\mathbb C$.</p>
<p>Of course, the dot product is best used only for measuring orthogonality. In fact, orthogonality (as opposed to perpendicularity) is often defined by the dot product being equal to 0.</p>
<p>Also, note that for complex vectors,
$$\Re(\vec a\cdot\vec b)=||\vec a||\mbox{ }||\vec b||\cos\theta$$</p>
<p>Generally, the dot product is calculated by:
$${\mathbf{a}}\cdot{\mathbf{b}} = \sum {{a_i}\overline {{b_i}} }$$ </p></li>
<li><p>Cross Product (Vector Product)</p>
<p>The cross product of two vectors in $\mathbb R^3$ is a vector orthogonal to these two vectors and has a magnitude of
$$||\vec a\times\vec b||=||\vec a||\mbox{ }||\vec b||\sin\theta$$</p>
<p>It can be calculated as:
$${\mathbf{a}} \times {\mathbf{b}} = \left| {\begin{array}{ccc}
{{{\hat e}_1}}&{{{\hat e}_2}}&{{{\hat e}_3}} \\
{{a_1}}&{{a_2}}&{{a_3}} \\
{{b_1}}&{{b_2}}&{{b_3}}
\end{array}} \right|$$</p>
<p>(not really a determinant -- just a mnemonic, etc. etc.)</p>
<p>Thus, the magnitude of the cross product describes the two vectors' "orthogonal-ness" as well as their overall "size"; it is 0 whenever they are parallel. These definitions become considerably more complicated in more than 3 dimensions, where one has to use:
$$\vec a\times\vec b=/(\vec a\wedge\vec b)$$</p>
<p>Here, $\wedge$ is the exterior product and $/$ denotes the duality (the Hodge dual) between cross products and exterior (wedge) products. I once showed that the following generalisation is possible:
$$/\left( {{{{\mathbf{\hat e}}}_m} \wedge {{{\mathbf{\hat e}}}_n}} \right) = {\left( { - 1} \right)^{m + n + 1}}\mathop \bigwedge \limits_{k \ne m,n}^{} {{{\mathbf{\hat e}}}_k}$$</p>
<p>...in any dimension...</p></li>
<li><p>Exterior Product (Wedge Product)</p>
<p>The Exterior Product of 2 vectors is the bivector spanned by them.</p>
<p>Of course, there are many more products, such as the tensor product (the outer product is its special case for vectors, and the Kronecker product for matrices), the natural product, the Clifford product, etc. Actually, the natural product was defined by me in <a href="http://ccsenet.org/journal/index.php/jmr/article/view/18102" rel="nofollow noreferrer">http://ccsenet.org/journal/index.php/jmr/article/view/18102</a> in the hope of obtaining a geometric interpretation of matrices, though it works only for singular matrices.</p></li>
</ol>Sat, 08 Jun 2013 05:30:30 GMThttps://math.stackexchange.com/questions/414475/-/414476#414476Abhimanyu Pallavi Sudhir2013-06-08T05:30:30ZMultiplying vectors (answered own question)
https://math.stackexchange.com/questions/414475/multiplying-vectors-answered-own-question
<p>I recently realised that answering your own question is allowed here, so here is a question I've commonly seen on many sites:</p>
<p>"How does one multiply two vectors?"</p>
<p>This is a very open-ended (but basic) question, but here's my answer to it (below in the answers section).</p>linear-algebramatricestensor-productsexterior-algebracross-productSat, 08 Jun 2013 05:30:30 GMThttps://math.stackexchange.com/q/414475Abhimanyu Pallavi Sudhir2013-06-08T05:30:30ZAnswer by Abhimanyu Pallavi Sudhir for How is it that angular velocities are vectors, while rotations aren't?
https://physics.stackexchange.com/questions/286/how-is-it-that-angular-velocities-are-vectors-while-rotations-arent/65738#65738
<p>You are mixing up two different things. A rotation transformation is a transformation of vectors in a linear space -- such a transformation doesn't need to involve any angular velocity, and it doesn't even need to have anything to do with a mechanical rotation.</p>
<p>The angular velocity is the rate of a physical rotation, measured as $\vec\omega=d\vec\theta/dt$, where $\vec\theta$ is <em>also</em> a vector, the rotational analog of displacement.</p>
<p>In any case, the $\vec\theta$ is not the same as the matrix of rotation. The latter is a <em>function</em> of $\vec\theta$, but a matrix can be used to represent a lot more things than just a rotation. Note that a rotation can still be modelled as a time-dependent matrix itself, like $\vec{x}(t)=A(t)\vec{x}(0)$, but the matrix is still not the same as the angle of rotation.</p>
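<p>A numerical aside of my own that makes the distinction vivid: finite rotation matrices don't commute, so finite rotation "angles" can't simply add like vector components, while infinitesimal rotations commute to first order, which is why angular <em>velocity</em> can consistently be treated as a vector.</p>

```python
import numpy as np

def Rx(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def Rz(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

a = np.pi / 2
print(np.allclose(Rx(a) @ Rz(a), Rz(a) @ Rx(a)))   # False: order matters

e = 1e-6
comm = Rx(e) @ Rz(e) - Rz(e) @ Rx(e)
print(np.abs(comm).max())   # O(e^2): infinitesimal rotations commute
```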
<hr>
<p>Note: I've been a bit sneaky in claiming that $\vec\theta$ is a "vector" -- it's really not, although it happens to have 3 components in 3 dimensions, so it's conventional to write the "xy" component as the "z" component, "zx" as "y", and "yz" as "x"; in general, though, it's best to think of angles as antisymmetric (2, 0) tensors $\theta^{\mu\nu}$. Interestingly, the rotation transformation is a (1, 1) tensor $A^{\mu}{}_{\nu}$.</p>Fri, 24 May 2013 12:20:55 GMThttps://physics.stackexchange.com/questions/286/-/65738#65738Abhimanyu Pallavi Sudhir2013-05-24T12:20:55ZAnswer by Abhimanyu Pallavi Sudhir for Can someone please explain magnetic vs electric fields?
https://physics.stackexchange.com/questions/53916/can-someone-please-explain-magnetic-vs-electric-fields/65091#65091
<p>The electric and magnetic fields are two faces of a single object, mixing into each other under Lorentz boosts. The full picture of the field comes from the electromagnetic field tensor</p>
<p>$$F_{\mu\nu} = \begin{bmatrix}
0 & E_x/c & E_y/c & E_z/c \\
-E_x/c & 0 & -B_z & B_y \\
-E_y/c & B_z & 0 & -B_x \\
-E_z/c & -B_y & B_x & 0
\end{bmatrix}$$</p>
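<p>To see the mixing concretely, here is a numerical aside of my own (units $c = 1$; the covariant components above transform with $M = \eta\Lambda\eta$): boosting a pure electric field generates a magnetic field.</p>

```python
import numpy as np

Ey = 1.0
# Covariant F_{mu nu} for E = (0, Ey, 0), B = 0, matching the matrix above.
F = np.array([
    [0.0, 0.0,  Ey, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [-Ey, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
])

v = 0.6                               # boost along x
g = 1.0 / np.sqrt(1.0 - v**2)         # gamma = 1.25
eta = np.diag([1.0, -1.0, -1.0, -1.0])
Lam = np.array([                      # boost matrix for contravariant vectors
    [g, -g * v, 0, 0],
    [-g * v, g, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
])
M = eta @ Lam @ eta                   # the same boost acting on covariant indices

Fp = M @ F @ M.T                      # transformed F'_{mu nu}
print(Fp[0, 2])   # E'_y = gamma * Ey = 1.25
print(Fp[1, 2])   # -B'_z = gamma * v * Ey = 0.75: a magnetic field appears
```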
<p>This tensor satisfies simple identities (see <a href="https://en.wikipedia.org/wiki/Electromagnetic_tensor#Significance" rel="nofollow noreferrer">[1]</a>) equivalent to Maxwell's equations. The electric and magnetic fields are different components of this tensor, occupying positions analogous to those of, e.g., the momentum density and shear stress in the four-dimensional stress tensor.</p>Sun, 19 May 2013 05:01:31 GMThttps://physics.stackexchange.com/questions/53916/-/65091#65091Abhimanyu Pallavi Sudhir2013-05-19T05:01:31Z