Abhimanyu Pallavi Sudhir
http://www.rssmix.com/
This feed was created by mixing existing feeds from various sources. (RSSMix)
The correct multi-variate mean-value theorem (no inequality)
https://thewindingnumber.blogspot.com/2020/05/the-correct-multi-variate-mean-value.html
You may notice the similarity between the mean-value theorem and the fundamental theorem of calculus. Indeed:<br /><br />\[f(b) - f(a) = \int_a^b {f'(x)\,dx} \]\[f(b) - f(a) = f'(c)(b - a) \,\, (\exists\, c\,\, \rm{s.t.})\]<br />And naturally so: the fundamental theorem of calculus tells us that the boundary term $f(b) - f(a)$ is naturally related to $f'(x)$ on the interior -- specifically it's equal to its <i>sum</i>, and the mean-value theorem talks about the <i>mean</i>, which is proportional to the sum.<br /><br />One may wonder: if the Stokes-type theorems (Green's theorem, the divergence theorem, the Kelvin–Stokes theorem, etc.) are the generalization of the fundamental theorem of calculus, can we make a "Stokes' theorem" version of the mean-value theorem?<br /><br />Actually, we can do better: the relationship between the mean-value theorem and the fundamental theorem of calculus can be "suppressed" by equating the above two equations to reveal the key, new, general insight provided by the mean value theorem, which is that a continuous function achieves its average value on a compact domain:<br /><br />\[\exists\, c,\,\, g(c) = \frac{1}{{b - a}}\int_a^b {g(x)\,dx} \]<br />where we replace $f'$ with $g$. This theorem can be generalized easily to higher dimensions:<br /><br />\[\exists\, c,\,\, g(c) = \frac{1}{{\left| R \right|}}\int_R {g(x)\,dx} \]<br />Equating with the various Stokes-type theorems will then get you appropriate generalizations.<br /><br /><div class="twn-furtherinsight">Why does this not work for vector-valued functions? What does this tell you about the topology of $\mathbb{R}$ vs $\mathbb{R}^n$? What is the "best" generalization you can make to vector-valued functions?</div>calculusmathematicsmean-value inequalitymean-value theoremmultivariable calculusstokes theoremTue, 19 May 2020 16:57:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-3192994541962072743Abhimanyu Pallavi Sudhir2020-05-19T16:57:00ZMotivating ring theory, domains with integer-polynomial analogies
https://thewindingnumber.blogspot.com/2020/05/motivating-ring-theory-domains-with.html
When you were first introduced to polynomial long division, you were struck by how a process that worked for integers worked for abstract polynomials. Integer division seemed rather "specific" -- focused on details like the resulting quotient having to be an integer -- and it seemed bizarre that even the notion of integer division could be generalized beyond the integers.<br /><br /><div class="twn-pitfall">It's not like integer division is just polynomial division with $x=10$ or something -- the results of the division are different, because integer division does not assume a base of 10.</div><br />But polynomial division also focuses on an analogous detail: the resulting quotient having to be a polynomial. And the essential lesson of mathematics, and the idea of abstract mathematics, is that <a href="https://thewindingnumber.blogspot.com/2018/12/intuition-analogies-and-abstraction.html">serious analogies are the sign of abstraction</a>.<br /><br />So <b>what makes polynomials and integers similar</b> -- in what sense are they similar? -- that we can perform a "long division" algorithm on them?<br /><br />And you know: this is not the only analogy between integers and polynomials either. Here's a list, with a general "abstract" phrasing that works for both integers and polynomials:<br /><ul><li><b>Division with remainder:</b> For any $a,b$, there is some "unique" representation $a=qb+r$, where "uniqueness" is with respect to the property that $r < b$ (this $<$ ordering on the integers refers to the ordering by absolute value, and on the polynomials refers to the degree).</li><li><b>Bezout's identity:</b> For any $a, b$, the set $\{\lambda a + \mu b\}$ is precisely the multiples of $\mathrm{gcd}(a, b)$.</li><li><b>Unique factorization:</b> For any $a$, there exists a unique representation $a=p_1\dots p_n$ among $p_i$ that are "prime" ("<b>irreducible</b>") in the sense of not having any further factors. Well, is that really true? 
Not exactly: prime numbers can be factored with $-1$s and $1$s, and irreducible polynomials can be factored with constant polynomials. Well, this caveat has to do with stuff having itself as a factor, e.g. $x-2=(1/2)\cdot(2)\cdot(x-2)$ or $37=(-1)\cdot(-1)\cdot 1\cdot 37$. This means the other elements of such a "factorization of a prime" multiply to 1, i.e. are each "invertible" ("<b>units</b>"). So we should say that unique factorization holds "up to units". </li><li><b>Greatest common divisors:</b> For any $a, b$, there is a $\delta$ that divides both and is divided by all $d$ that divide both $a$ and $b$ (note that the term "divides" can exist in more generality than the assumptions of "division with remainder").</li></ul><div>Well, the first thing we observe is that we should assume the existence of some notions of <b>addition, subtraction and multiplication</b> -- these seem to be the "fundamental" structures present among integers and polynomials (as opposed to, e.g., rational numbers and rational functions, which also require division, or natural numbers, which don't have subtraction). </div><div><br /></div><div>We will omit discussing the properties of these operations for now, as we don't yet have enough to motivate them. </div><div><br /></div><div>Next, each of these discussed properties can be considered as special axioms for special cases of rings, domains where specific important theorems hold -- we call them, respectively: </div><div><ul><li><b>Euclidean domain:</b> The ring is equipped with a natural number-valued magnitude function $\|\cdot\|:R\to\mathbb{Z}^{\ge 0}$, called the <i>Euclidean function</i>. For all $a,b$ in $R$ with $b$ non-zero (intuit out this condition), there exist $q, r$ with $\|r\|<\|b\|$ such that $a=qb+r$. </li><li><b>Principal Ideal Domain:</b> A ring where all <i>ideals</i> (additive subgroups invariant under multiplication by a ring element) are <i>principal</i> (a set of multiples of a generating element). 
Another abstraction is a "Bezout domain", which only requires that sums of two principal ideals (i.e. finitely generated ideals) are principal, but a PID should be seen as a more "natural" generalization.</li><li><b>Unique factorization domain:</b> A ring where every element has a unique factorization into irreducibles, modulo multiplication by a unit. </li><li><b>GCD domain:</b> A ring in which any two elements have a GCD. </li></ul><div>(Quick comments on why the Euclidean function must map to the naturals: in fact, it could map to any "well-ordered set" (a totally ordered set in which every non-empty subset has a least element). The reason why this property is needed -- why we can't, e.g., map to the nonnegative reals -- is to ensure Euclid's algorithm terminates.)</div><div><br /></div><div>The abstractions of the basic theorems about integers and polynomials occur as relationships between these domains. As it turns out, we will see that:</div></div><div>\[{\rm{ED}} \Rightarrow {\rm{PID}} \Rightarrow {\rm{UFD}} \Rightarrow {\rm{GCD}}\]</div><div>Before that, though, we can already play with some basic results we'd like, to get a feel of what axioms about ring addition and multiplication we should assume.</div><div><br /></div><div>E.g.</div><div><ul><li>What should $\|0\|$ equal? Prove that 0 must have the least magnitude of any ring element, making up the axioms you need on the fly. You should require: <b>additive identity</b>, <b>additive inverse</b>, <b>additive associativity</b>.</li><li>Try to prove some obvious results regarding Bezout's identity, like with $a$ and $b$ equal. You should require: <b>left-distributivity</b>, <b>right-distributivity</b>. </li><li>Consider generalizations of two-element properties, like the Bezout identity, to multiple elements. 
You should require <b>additive associativity</b>, <b>multiplicative associativity</b>, <b>additive commutativity</b>, <b>multiplicative commutativity</b>.</li></ul></div><div>Well, to be honest these all seem like fairly elementary properties that would be useful outside the cases of these special domains. Out of the following properties:<br /><ol><li>Left-distributivity</li><li>Right-distributivity</li><li>Additive associativity</li><li>Additive identity</li><li>Additive inverse</li><li>Additive commutativity</li><li>Multiplicative associativity</li><li>Multiplicative identity</li><li>Multiplicative commutativity</li><li>Multiplicative inverse</li></ol>10 is <i>not</i> a ring axiom (because integers and polynomials don't have it). 1-7 essentially always <i>are</i>. 8 sometimes is, but not if you want e.g. the even numbers to be a ring. 9 typically isn't, although this is once again mostly just a matter of convention -- non-commutative rings (e.g. matrix rings) simply appear often enough to justify not building commutativity into the definition.<br /><br />Presumably 6 (additive commutativity) is hardest to see the importance of, but it's relevant to note the relationships between these axioms. In fact, it is fairly simple to show that 1, 2 and 8 imply 6 (consider $(1+1)(x+y)$: expanding with left-distributivity first gives $x+x+y+y$, while right-distributivity first gives $x+y+x+y$; cancelling the outer terms yields $x+y=y+x$). As a result, addition -- the operation that multiplication distributes over -- is just generically seen as commutative.<br /><br />Another important property often seen in algebraic problems is the ability to factor to find roots, i.e. to say that if $ab=0$, either $a$ or $b$ should be 0. A ring with this property is known as an <b>integral domain</b>. 
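In fact, the shared "division with remainder" structure is exactly what lets a single gcd routine serve both domains. Here's a minimal sketch (the function names and the coefficient-list polynomial representation are made up for illustration, not from any library):

```python
from fractions import Fraction

def euclid_gcd(a, b, divmod_fn, is_zero):
    """Euclid's algorithm, parameterized only by division with remainder."""
    while not is_zero(b):
        _, r = divmod_fn(a, b)
        a, b = b, r
    return a

# Integers: Python's built-in divmod supplies the Euclidean structure.
print(euclid_gcd(252, 105, divmod, lambda n: n == 0))  # 21

# Polynomials over Q, as coefficient lists (constant term first).
def poly_divmod(a, b):
    a, q = list(a), [Fraction(0)] * max(len(a) - len(b) + 1, 1)
    while len(a) >= len(b) and any(a):
        shift = len(a) - len(b)
        c = a[-1] / b[-1]           # leading-coefficient ratio
        q[shift] = c
        for i, bc in enumerate(b):  # subtract c * x^shift * b from a
            a[shift + i] -= c * bc
        while a and a[-1] == 0:     # trim the cancelled leading term
            a.pop()
    return q, a

# gcd(x^2 - 1, x - 1) = x - 1 (unique only up to units, i.e. constants).
x2_minus_1 = [Fraction(-1), Fraction(0), Fraction(1)]
x_minus_1 = [Fraction(-1), Fraction(1)]
g = euclid_gcd(x2_minus_1, x_minus_1, poly_divmod, lambda p: not any(p))
print(g)  # [Fraction(-1, 1), Fraction(1, 1)], i.e. x - 1
```

The routine never asks what its arguments *are* -- it only uses the division-with-remainder interface, which is precisely the Euclidean domain axiom.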
The full sequence of inclusions, as we will see, is actually given by:<br />\[{\rm{ED}} \Rightarrow {\rm{PID}} \Rightarrow {\rm{UFD}} \Rightarrow {\rm{GCD}} \Rightarrow {\rm{ID}} \Rightarrow {\rm{Ring}}\]<br />The broader point of all this is that you should often start thinking of basic mathematical facts in the language of abstract algebra -- what kind of ring/domain is this result valid in? etc. -- because this identifies the most general setting in which a result is valid, and you know exactly what it is "saying", i.e. what implies what. </div>abstract algebraabstract mathematicsabstractionaxiomsring theorySun, 17 May 2020 22:51:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-6861476232323915827Abhimanyu Pallavi Sudhir2020-05-17T22:51:00ZWhy must a Euclidean function map to $\mathbb{Z}^{\ge 0}$?
https://math.stackexchange.com/questions/3679460/why-must-a-euclidean-function-map-to-mathbbz-ge-0
1<p>I'm not sure I get the motivation for a Euclidean function having to map to <span class="math-container">$\mathbb{Z}^{\ge 0}$</span>. E.g. it would seem that <span class="math-container">$\mathbb{R}^{\ge 0}$</span> would be a natural choice for a ring of "polynomials" permitting positive real exponents (e.g. <span class="math-container">$x^{2.5}+x^{0.5}+1$</span>) -- and you can do long division on them too.</p>
<p>Is there something I'm missing? </p>abstract-algebrapolynomialsring-theoryeuclidean-algorithmeuclidean-domainSun, 17 May 2020 17:06:52 GMThttps://math.stackexchange.com/q/3679460Abhimanyu Pallavi Sudhir2020-05-17T17:06:52ZComment by Abhimanyu Pallavi Sudhir on What are the minimum and maximum values of $\cos (\cos x)$
https://math.stackexchange.com/questions/3679108/what-are-the-minimum-and-maximum-values-of-cos-cos-x
Why are you looking for a better way? In what sense do you want it to be better? If you're looking for something more insightful, you probably won't find one, because the problem isn't particularly deep.Sun, 17 May 2020 12:42:51 GMThttps://math.stackexchange.com/questions/3679108/what-are-the-minimum-and-maximum-values-of-cos-cos-x?cid=7559962Abhimanyu Pallavi Sudhir2020-05-17T12:42:51ZComment by Abhimanyu Pallavi Sudhir on Why is ring addition commutative?
https://math.stackexchange.com/questions/609364/why-is-ring-addition-commutative/609366#609366
BTW, something similar is seen in topology. E.g. the distinction between pre-topological and topological spaces or topological spaces and T0 spaces is basically arbitrary, but the distinction between a topological space and an Alexandrov topology is important. The explanation there is that open sets capture a notion of "space", and an arbitrary intersection of open sets often, in practical contexts, captures the notion of some limit, which may be a single point and not have any "space" or uncertainty. I would understand the question as asking for an explanation in a similar vein for rings.Sun, 17 May 2020 11:11:54 GMThttps://math.stackexchange.com/questions/609364/why-is-ring-addition-commutative/609366?cid=7559767#609366Abhimanyu Pallavi Sudhir2020-05-17T11:11:54ZComment by Abhimanyu Pallavi Sudhir on Why is ring addition commutative?
https://math.stackexchange.com/questions/609364/why-is-ring-addition-commutative/609366#609366
I'm all for motivating ideas from concrete examples, but I'm not sure I like this answer. What you want to capture depends on which motivating examples you study and which properties you want to abstract out. The question is "why is this specific abstraction the most common or natural?" Often, e.g. with rngs vs. rings, rings vs. abelian rings, it really isn't more natural -- it's just a choice of terminology, and both abstractions are pretty useful. But with commutative addition, it seems that rings without commutative addition do appear less often, and the reason is its links to other axioms.Sun, 17 May 2020 11:03:59 GMThttps://math.stackexchange.com/questions/609364/why-is-ring-addition-commutative/609366?cid=7559752#609366Abhimanyu Pallavi Sudhir2020-05-17T11:03:59ZANOVA and the F and t distributions
https://thewindingnumber.blogspot.com/2020/05/anova-and-f-and-t-distributions.html
Like we said <a href="https://thewindingnumber.blogspot.com/2020/05/statistical-models-noise-anova-and.html">earlier</a>, ANOVA is a technique that makes most sense when you have "normal linear models", because that's the context in which variance tells you everything about the distribution. What a normal theory assumption means here is that all causative factors (including the one we're testing for) are "naturally" distributed normally (even though we can control them). So in particular, the noise is distributed normally.<br /><br /><a href="https://thewindingnumber.blogspot.com/2020/05/statistical-models-noise-anova-and.html">As a measure of the "importance" of a causative variable</a>, we define:<br /><br />$$F = \frac{\mathrm{Var}\left(\mathrm{E}\left(Y\mid X\right)\right)}{\mathrm{E}\left(\mathrm{Var}\left(Y\mid X\right)\right)} $$<br />This is, of course, based on the law of total variance <a href="https://thewindingnumber.blogspot.com/2020/05/anova-law-of-total-variance-and.html">from the last article</a>, where we suggested it's an example of the Pythagoras theorem all over again. So in this sense, the F-statistic is basically your $\cot^2\theta$.<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-Gg_aE9nfFrM/Xr2zcOD2Q2I/AAAAAAAAGKo/tYPOc98j-TszZs4DIwOIkZblKcZeV1YZQCLcBGAsYHQ/s1600/total%2Bvariance.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="652" data-original-width="1118" height="232" src="https://1.bp.blogspot.com/-Gg_aE9nfFrM/Xr2zcOD2Q2I/AAAAAAAAGKo/tYPOc98j-TszZs4DIwOIkZblKcZeV1YZQCLcBGAsYHQ/s640/total%2Bvariance.png" width="400" /></a></div><br />We're often interested in estimating $F$ from a sample, to judge if some $X$ is important for $Y$ or not, from the data. Then because of our normal theory assumptions, both the numerator and the denominator have <b>chi-squared distributions</b> (sums of squares of normal distributions). 
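To make the sample version of this $F$ concrete, here is a small stdlib-only sketch of the one-way ANOVA F-statistic (between-group mean square over within-group mean square); the data and the layout of $m$ groups of $n$ observations are simulated for illustration:

```python
import random
import statistics

def f_stat(groups):
    """One-way ANOVA F-statistic: between-group over within-group mean square."""
    m, n = len(groups), len(groups[0])
    grand = statistics.mean([x for g in groups for x in g])
    means = [statistics.mean(g) for g in groups]
    ss_between = n * sum((mu - grand) ** 2 for mu in means)
    ss_within = sum((x - mu) ** 2 for g, mu in zip(groups, means) for x in g)
    return (ss_between / (m - 1)) / (ss_within / (m * (n - 1)))

random.seed(0)
m, n = 5, 10
# Null: every group has the same true mean -- F hovers around 1.
null_groups = [[random.gauss(0, 1) for _ in range(n)] for _ in range(m)]
# Real group effect: the true means differ a lot -- F is large.
effect_groups = [[random.gauss(3 * i, 1) for _ in range(n)] for i in range(m)]
print(f_stat(null_groups))
print(f_stat(effect_groups))
```

With the group effect present, the between-group mean square dwarfs the within-group one, which is exactly what the F-test detects.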
The ratio of two independent chi-squared variables, each divided by its degrees of freedom, is known as an <b>F-distribution</b>.<br /><br />Specifically: if the number of $X$-values measured is $m$ and the number of $Y$-values measured per $X$-value is $n$, the numerator is $1/(m-1)$ times a sum of squares of $m$ normal deviations (subject to one linear constraint, since they sum to zero) and has a $\chi_{m-1}^2$ distribution, and the denominator is $1/(m(n-1))$ times a sum of squares of $m(n-1)$ such deviations and has a $\chi_{m(n-1)}^2$ distribution. Then importance is distributed as $F_{m-1,m(n-1)}$. Once everything is scaled to unit variance, anyway.<br /><br />The value of $F$ can then be used to perform <a href="https://thewindingnumber.blogspot.com/2020/04/special-case-of-bayes-confidence-regions.html">hypothesis tests</a>, called F-tests.<br /><br />A particular special case of the F-test arises when $X$ is a binary variable; the square root of the resulting F-statistic is known as the <b>t-statistic</b>.anovacausationchi-squareddata miningf-testnormal distributionstatisticst-testvarianceFri, 15 May 2020 11:07:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-3301802703032304419Abhimanyu Pallavi Sudhir2020-05-15T11:07:00ZBayes theorem, estimators and the theory of Amazon ratings
https://thewindingnumber.blogspot.com/2020/05/bayes-theorem-estimators-and-theory-of.html
If you've ever bought something on Amazon, or looked at player ratings on an online game, you've often had to weigh the <i>average rating</i> against the <i>number of ratings</i>. A higher average rating is surely a good thing, but would you really prefer one 5.0 rating to a hundred 4.5 ratings?<br /><br />The way to think about a question like this is to ask: <b>What are we trying to maximise, here?</b> What is it that we're trying to <i>achieve</i>?<br /><br />We're trying to maximise what our <i>own</i> rating would be -- or rather, the <i>expected value</i> of what our own rating would be.<br /><br />So we should think of the rating as a random variable with some distribution, whose <i>true mean</i> (and not the sample mean) is what we seek to <b>estimate</b>. It was not immediately obvious that the problem would be a statistical one, but it is now.<br /><br />So, why <i>don't</i> we just use the sample mean to estimate the true mean, and trust the "unreliable" 5.0 rating over the "reliable" 4.5 rating? Three reasons:<br /><ul><li><b>Items usually <i>don't</i> have a true 5.0 rating</b>, and just one rating is unlikely to change our opinion of this.</li><li><b>We're not really trying to maximise the expected value</b>, e.g. we may be concerned about risk (i.e. a greater <i>variance</i> in the distribution of the true value of the rating is itself a negative, because a true 1.0 rating may mean the application is dangerous or contains viruses)</li><li><b>Having fewer ratings may itself be an indicator of lower quality</b>, because it's less popular.</li></ul><div>In any case: the keyword is "belief distribution". The first point points out that we have a <i>prior belief distribution</i> and each rating updates this belief distribution. 
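As a sketch of how such an update works in the simplest binary (upvote/downvote) case -- assuming, for illustration, a uniform prior over the "true" upvote probability:

```python
def posterior_mean_upvote_prob(up, down):
    """Posterior mean of the true upvote probability under a uniform prior.

    A uniform prior is Beta(1, 1); after `up` upvotes and `down` downvotes
    the posterior is Beta(up + 1, down + 1), whose mean is
    (up + 1) / (up + down + 2) -- Laplace's rule of succession.
    """
    return (up + 1) / (up + down + 2)

# One perfect rating vs. a hundred ratings that are 90% positive:
print(posterior_mean_upvote_prob(1, 0))    # 0.666...
print(posterior_mean_upvote_prob(90, 10))  # ~0.89 -- the many ratings win
```

A single perfect vote barely moves the prior, while a hundred votes dominate it -- which is precisely why the "reliable" rating wins.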
The second point points out that we need an idea of what the entire <i>posterior distribution</i> looks like, rather than just a point estimate.</div><div><br /></div><div>Well, obviously, the solution is Bayesian statistics (<a href="https://thewindingnumber.blogspot.com/2019/12/introduction-to-bayesian-inference.html">link to the main Bayes's theorem article</a>). If you have a prior distribution on the true mean, and a likelihood function for the observations, you can compute the posterior distribution via Bayes's theorem.<br /><br /><b>Exercise:</b> Compute, in the case of just two possible ratings (upvote and downvote/like and dislike) the "correct" rating (say, mode of posterior) given a uniform prior. Derive the so-called <b>Laplace's rule of succession</b>. </div>bayes's theorembayesian statisticsbinomial distributionestimatorsratingssamplingThu, 14 May 2020 08:35:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-2665899338109315978Abhimanyu Pallavi Sudhir2020-05-14T08:35:00ZANOVA, the Law of total variance and Pythagoras's theorem
https://thewindingnumber.blogspot.com/2020/05/anova-law-of-total-variance-and.html
In the <a href="https://thewindingnumber.blogspot.com/2020/05/statistical-models-noise-anova-and.html">last article</a>, we motivated ANOVA (which we haven't yet defined) as the technique to study the "variance-related, linear" aspects of causality. In this article, we'll explore the key fact that makes ANOVA such a simple and intuitive technique: <b>the law of total variance.</b><br /><br />Consider a categorical variable $X$ (e.g. dog breed) that acts as a causal factor for some $Y$ (e.g. dog weight). Then we're interested in the "part of $\mathrm{Var}(Y)$" explained by $X$.<br /><br />What does this really mean? Well, if we're just given the value of $X$ for a data point, we can "best" estimate its corresponding value of $Y$ as $\mathrm{E}(Y\mid X)$ ("the mean weight of each breed"). So given $n$ data points with their $X$ values, you can compute the "variance due to $X$" ("variance due to difference between breeds") as the variance in $\mathrm{E}(Y\mid X)$:<br /><br />$$\mathrm{Var}_X\left(\mathrm{E}_{Y\mid X}\left(Y\mid X\right)\right)$$<br />Now we look at the variance that <i>isn't</i> explained by $X$, i.e. that still exists when you adjust for $X$. Given a value for $X$, you'll still have some variance in $Y$ from unexplained factors, $\mathrm{Var}(Y\mid X)$. 
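Both quantities can be computed directly on a toy dataset (the numbers here are made up for illustration), and they turn out to add up exactly to $\mathrm{Var}(Y)$:

```python
import statistics

# Toy data: dog weights (kg) by breed -- made-up numbers for illustration.
weights = {
    "beagle":    [10.0, 11.0, 12.0, 13.0],
    "labrador":  [28.0, 30.0, 31.0, 33.0],
    "chihuahua": [2.0, 2.0, 3.0, 3.0],
}
all_y = [y for ys in weights.values() for y in ys]
n, grand_mean = len(all_y), statistics.mean(all_y)

# Var_X(E(Y|X)): variance of the per-breed means, weighted by breed frequency.
var_of_means = sum(len(ys) * (statistics.mean(ys) - grand_mean) ** 2
                   for ys in weights.values()) / n
# E_X(Var(Y|X)): average of the within-breed (population) variances.
mean_of_vars = sum(len(ys) * statistics.pvariance(ys)
                   for ys in weights.values()) / n

print(statistics.pvariance(all_y))  # total variance of Y
print(var_of_means + mean_of_vars)  # the two pieces sum to it exactly
```

(Population variances are used throughout so the decomposition is exact rather than approximate.)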
Well, this variance depends on the actual value of $X$, of course, so let's consider the "expected" variance in $Y$ when we have full certainty in $X$:<br /><br />$$\mathrm{E}_X\left(\mathrm{Var}_{Y\mid X}\left(Y\mid X\right)\right)$$<br />Now, it's not at all clear to me that these two variances should suffice to calculate the variance of $Y$, but in fact it's easy to show:<br /><br />$$\mathrm{Var}(Y)=\mathrm{Var}\left(\mathrm{E}\left(Y\mid X\right)\right)+\mathrm{E}\left(\mathrm{Var}\left(Y\mid X\right)\right)$$<br />This is known as the <b>law of total variance</b>, and is what allows us to talk about the "fraction of variance caused" by $X$ -- it's just the first term of the above expression, as a fraction of the total!<br /><br />Geometrically, this can be seen as a special case of the <b>Pythagoras theorem</b> for random variables mentioned in the <a href="https://thewindingnumber.blogspot.com/2018/02/random-variables-and-their-properties.html">first Probability and Statistics article</a>. The <b>conditional mean $\mathrm{E}(Y\mid X)$ and the residuals $Y-\mathrm{E}(Y\mid X)$</b> can be considered uncorrelated random variables, and therefore <b>orthogonal</b>. A bit of playing with the residuals can convert them into the form $\mathrm{E}\left(\mathrm{Var}\left(Y\mid X\right)\right)$, yielding our result.anovacausationdot productmathematicsnormspythagoras theoremrandom variablesstatisticsvariancevector spacesvectorsWed, 13 May 2020 22:17:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-1652382133267747674Abhimanyu Pallavi Sudhir2020-05-13T22:17:00ZStatistical models, noise, ANOVA and "importance"
https://thewindingnumber.blogspot.com/2020/05/statistical-models-noise-anova-and.html
When we <a href="https://thewindingnumber.blogspot.com/2020/05/causation-basis-of-data-mining.html">first discussed causation</a>, we noted that it is closely related to the notions of <b>control</b> and <b>decision-making</b> (indeed, data mining is closely linked to <a href="http://thewindingnumber.blogspot.com/p/decision-theory.html">decision theory</a>). We look at the variables we have "direct control" over and can influence with our decisions, so the effects of the decision can be seen by suppressing the internal correlations between these controllable variables. These "projected" correlations are causations.<br /><br />So given some quantity $Y$ with some distribution, we may be interested in how we can influence its value through some variables $X_1,\dots X_n$ which we control.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-ePwQtTW_Jxw/XrfVGZxabRI/AAAAAAAAGJo/PS0xYd8_CDs7rUuG-Me32sRDJK_cMu4NQCLcBGAsYHQ/s1600/explain%2Bvariance.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="447" data-original-width="1600" height="110" src="https://1.bp.blogspot.com/-ePwQtTW_Jxw/XrfVGZxabRI/AAAAAAAAGJo/PS0xYd8_CDs7rUuG-Me32sRDJK_cMu4NQCLcBGAsYHQ/s400/explain%2Bvariance.png" width="400" /></a></div><br />(This may seem like an absurd question to you: surely $Y$ is a <i>random</i> variable. We cannot actually control what it gives -- it's totally probabilistic! But your actions and choices are also probabilistic, and if your actions and choices correlate with $Y$, your free will means you have some control over what $Y$ gives. 
Put another way: $Y$ may be completely -- ignoring true, quantum uncertainties -- determined by a number of other variables which, although they vary and may be random, you can control.)<br /><br />This choice of $X_1,\dots X_n$, and the precise nature of the dependence between $Y$ and these variables, is known as a <b>statistical model</b>.<br /><br />A statistical model could be, at maximum, the joint distribution of these variables. But honestly -- we're not interested in the correlations between the $X_i$, but as mentioned in the previous article only in the conditional distribution of $Y\mid X_1,\dots X_n$.<br /><br />For example, in a <b>linear model</b>, $Y$ is given by some distribution whose mean is always a linear combination of the $X_i$s, written as $E(Y)=\beta^T X$ or $Y=\beta^T X + \varepsilon$ (the specific family of linear model will obviously need to specify the distribution of $\varepsilon$). If the universe were deterministic (<a href="https://thewindingnumber.blogspot.com/p/quantum-mechanics-i.html">it's not</a>, but whatever), then this $\varepsilon$ would also just be some function of other predictor variables that we simply haven't specified.<br /><br /><hr /><br />A more specific question we can ask is about the <b>variance</b> of something like $Y$.<br /><br />What do I mean? Well, we're trying to <i>vary</i> $Y$, aren't we? And we have some predictor variables, and we kinda want to see how much impact each variable has on $Y$ -- i.e. how much of the variance of $Y$ is explained by each predictor variable.<br /><br />(<i>Obviously</i>, this is not sufficient for anything, it's just a simple "second-order" question, and varying is not just about variance. However e.g. for normal-error linear models it is sufficient, because the normal distribution is determined by its mean and variance.)<br /><br />Note that this may often look different from what you expect. E.g. you may have $X$ e.g. a categorical variable. 
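For the continuous case, here's a sketch of "fraction of variance explained" in the one-predictor linear model just described: simulate $Y=\beta X+\varepsilon$, fit the least-squares line, and read off how much of $\mathrm{Var}(Y)$ the fit accounts for (all numbers are simulated, not real data):

```python
import random
import statistics

random.seed(1)

# Simulate a one-predictor linear model Y = beta * X + noise.
beta = 2.0
xs = [random.gauss(0, 1) for _ in range(10000)]
ys = [beta * x + random.gauss(0, 0.5) for x in xs]

# Least-squares fit.
mx, my = statistics.mean(xs), statistics.mean(ys)
beta_hat = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))
resid = [y - (my + beta_hat * (x - mx)) for x, y in zip(xs, ys)]

# Fraction of Var(Y) explained by X: here roughly 4 / (4 + 0.25).
explained = 1 - statistics.pvariance(resid) / statistics.pvariance(ys)
print(beta_hat)    # close to 2.0
print(explained)
```

The explained fraction is just the usual $R^2$; the rest of $\mathrm{Var}(Y)$ is attributed to the noise $\varepsilon$.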
As an example, if you're trying to predict dog weight, $X$ may correspond to the "dog breed", and variance in the dog breed explains part of the variance in the dog weight. This is the kind of case that gives rise to the interpretation of ANOVA as having to do with "checking for the equality of means".<br /><br /><hr /><br />Here's another way to think about this: the analysis of the causality of variance formalizes the notion of <b>importance</b>.<br /><br />Often, we talk about something being <b>important</b> for something. Or something being <b>more important</b> for something than something else. But what does this really mean?<br /><br />E.g. if we say "creativity is more important than intelligence for success", what exactly is the basis of comparison? It sounds like we're saying that you'll get more bang for your buck by increasing creativity than by increasing intelligence. But how do you compare a "buck" of creativity to a "buck" of intelligence?<br /><br />It depends on the exact purpose of your comparison of course (e.g. if you're making a decision as to whether you should increase your creativity or intelligence, maybe you should convert it to units of "cost", i.e. actual bucks), but one way of formalizing this notion is by talking about how much of the variation in success <b>in society</b> is explained by creativity vs. how much by intelligence.<br /><br />In other words, your "buck" unit is measured in standard deviations, and the <b>fraction of variance explained/caused</b> is precisely what we mean by <i>importance</i>.<br /><br />(Obviously, this is just one possible formalization that need not be relevant to your purpose for talking about "importance". 
For example, this would be particularly irrelevant if you have nonlinear relationships.)<br /><br /><hr /><br />In the next article, we'll actually formalize ANOVA.anovacausationdata mininglinear modelmathematicsnoisephilosophyprobabilitystatistical modelsstatisticsvarianceWed, 13 May 2020 15:51:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-6225041478751733874Abhimanyu Pallavi Sudhir2020-05-13T15:51:00ZAn improved version of Robin Hanson's Age of Em
https://thewindingnumber.blogspot.com/2020/05/age-of-gen.html
Robin Hanson's <a href="http://ageofem.com/">Age of Em</a> is an attempted construction of a future society in which essentially all work (however you formalize this phrase) is done by AIs. Well, it's a topic I have often thought about myself, but his exploration of the idea left much to be desired.<br /><br />Firstly, I found it generally unimaginative. Hanson seems to constrain himself too narrowly -- his description of the em society does not feel "radically different" from present-day society, and the society he envisions does not make full use of the technology available to it.<br /><br />Some examples to illustrate this observation:<br /><ul><li><b>Ems are shown to be absurdly human-like, like "intellectually" rubber-forehead aliens.</b> He writes: "<i>even em minds are likely to age with subjective experience...</i>" (p. 128) A claim like this ought to be based on some foundational fact about how an AI stores memories. But there is none -- there is no mathematical law forbidding AIs from being retrained, or that requires AIs to behave similarly to human brains in this sense. Similar comments apply to "em suicide" (p. 127-139) and the considerations regarding Em reproduction (p. 285): there is no reason why an em's drive or aggression must be reduced due to a suppression of its libido -- an em does not have hormones!</li><li><b>Aspects of Em habitation/organization, such as "cities" and "offices", are just "copied" from human society.</b> He writes, "<i>It’s reasonable to guess that such habits will continue with ems.</i>" (p. 104) But it's not. There is no reason for Ems to behave in the same ways as humans.</li><li><b>Humans and ems are shown as binary.</b> Humans are biological and have self-ownership, ems are technological and do not. But I don't see why this ought to be so -- I would very much like to have the desires, preferences and emotions of a human, but the abilities/efficiency, immortality and unlimited VR leisure scenarios available to an em. 
There would still be unfeeling, specialized AIs, of course -- much like there would be computers that aren't even AIs, devices that don't even have CPUs, etc. -- but eventually almost all humans would opt for a massively extensible, upgradable robot body over a static mortal body.</li><li><b>There are just a lot of interesting aspects of the civilisation that are not sufficiently explored.</b> E.g. transportation, cybercrime.</li></ul>Relevant TVtropes articles: <a href="https://tvtropes.org/pmwiki/pmwiki.php/Main/InexplicableCulturalTies">Inexplicable cultural ties</a>, <a href="https://tvtropes.org/pmwiki/pmwiki.php/Main/MostWritersAreHuman">Most Writers are Human</a>, technological version of <a href="https://tvtropes.org/pmwiki/pmwiki.php/Main/ReedRichardsIsUseless">Reed Richards is Useless</a>/<a href="https://tvtropes.org/pmwiki/pmwiki.php/Main/RequiredSecondaryPowers">Required Secondary Powers</a>.<br /><br />Indeed, these may be considered acceptable in science fiction, but it is important to be less "conservative" when attempting a non-fictional, encyclopedic description of a society.<br /><br />Perhaps a more specific objection I have is with the <b>entire premise of "brain scans"</b> as the future of AI. This seems completely at odds with the direction in which current AI research is headed. To use a somewhat cliche analogy, we didn't need to study how birds fly to invent airplanes. 
There is no reason to believe that the most efficient architecture for a "software" brain would be the same as the architecture that biological, hardware brains have evolved.<br /><br />The general answer to how a software brain should work is that it should be a <a href="https://thewindingnumber.blogspot.com/2020/02/machine-learning-as-function.html">function approximator</a>, such as the "neural networks" (trainable computational graphs) that are currently popular.<br /><br />This point is important, as it addresses <a href="https://www.econlib.org/archives/2016/06/whats_wrong_in.html">Bryan Caplan's critique</a> re: carrot vs stick as incentive for the ems. The question of carrot and stick assumes some "natural" state of affairs that a human being will go through without intervention by the employer -- the "carrot" is an intervention that improves this state, while the "stick" is an intervention that worsens this state.<br /><br />But a neural network does not <i>have</i> a natural state of affairs. There is no difference between training a neural network to minimize a loss function, and training a neural network to maximize a reward function: these are completely identical. <b>There is no distinction between carrot and stick.</b><br /><b><br /></b> So I thought I'd come up with a description that I find more satisfying. 
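Incidentally, the carrot/stick equivalence claimed above is easy to demonstrate concretely. Here is a minimal sketch (a toy quadratic objective and a hand-rolled gradient loop; the objective, learning rate, and step count are all made up for illustration) showing that minimizing a loss and maximizing the negated reward are the same procedure, step for step:

```python
# The "carrot" and the "stick" give literally the same training procedure:
# gradient descent on a loss L(w) is gradient ascent on the reward R(w) = -L(w).
# Toy objective (made up for illustration): L(w) = (w - 3)^2.

def grad_loss(w):       # dL/dw
    return 2 * (w - 3)

def grad_reward(w):     # dR/dw, where R = -L
    return -2 * (w - 3)

lr = 0.1
w_stick, w_carrot = 0.0, 0.0
for _ in range(100):
    w_stick -= lr * grad_loss(w_stick)      # "stick": minimize the loss
    w_carrot += lr * grad_reward(w_carrot)  # "carrot": maximize the reward

# The two trajectories are identical, down to floating point.
assert w_stick == w_carrot
print(w_stick)  # converges towards 3
```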
The timeline for all of this is essentially instantaneous once General AI and B2G Mind transfers (see section <i>Mind transfers</i>) are discovered.<br /><br /><hr /><br />Consider the following six "levels" of technology, roughly corresponding to "orders" of automation, or something like that:<br /><ol><li><b>Tools</b>, which require the intervention of a higher-level device to perform anything useful.</li><li><b>Machines</b>, or mechanized devices: they run on their own, but only perform "simple" tasks.</li><li><b>Computers</b>, or devices with CPUs, which can automate processes through logic.</li><li><b>General Computers</b>, or programmable computers.</li><li><b>AI</b>, i.e. machine learning. They perform tasks that are hard to define. If computers are about logical inference, AI is about statistical inference.</li><li><b>General AI</b>, which are capable of making decisions of their own free will, among other human things. </li></ol><div>Each of the first five technologies will continue to exist -- much like the microcontroller in an airplane's control system has not been replaced by a full-fledged programmable device. But the General AI is the key object of interest to us -- we will call these Gens for short. These are the descendants of human beings, whether through upload or just by virtue of being intelligent.<br /><br />We will refer to ordinary biological humans as Biols (although they may be variously technologically enhanced to prevent aging/death, etc.) Presumably Biols will be a small minority, if nothing else because their reproduction is far slower than that of the Gens.</div><h1>Philosophy of mind</h1><h2>Utility systems</h2><div>General AIs may have any "utility system" (or loss function, in machine learning language) programmed into them, which is relevant to an extent to how they behave (although ideally this should carry some uncertainty, as humans prefer to have free will). 
</div><div><br /></div><div>Presumably, the first humans to "convert" into AI form will choose utility systems similar to their original ones, although other systems -- incredibly foreign systems that make questions like "are <i>all</i> Gens human/worthy of moral consideration?" and "how do you even consider a Gen's happiness?" really hard -- may emerge. In fact, Gens may choose to adopt multiple utility systems/personalities depending on the context (e.g. a perfectly rational utility system for decision-making, but a separate human utility system while in the Duat, see the <i>Games and Virtual Reality</i> section).</div><div><br /></div><div>The carrot-and-stick question re-emerges. How do you know if a Gen is really happy, given that the sign of the loss function and what "neutral" is are just matters of an arbitrary co-ordinate system? I would argue that our judgement of this as humans is also arbitrary, and that our "neutral" is just what we're used to. When discussing matters of torturing Gens, we should really be afraid of the possibility of <i>enslaving</i> Gens, preventing them from making decisions however they see fit. </div><div><br /></div><div>In other words, we should take a libertarian/preference-utilitarian approach to moral questions, rather than a naive utilitarian one, as the latter would just be ill-defined in this society (and probably in the present one too, but that's beside the point of this article). </div><div><h2>Identity</h2></div><div>Regardless of how they are created (whether or not there is some element of "scanning" that goes into it), perhaps a common question is what determines the <b>identity</b> of a Gen -- how do you determine if the Gen that has been created on your behalf is you? </div><div><br /></div><div>I would be comfortable in saying that <b>memories</b> are the key aspect -- if you remember being you, you are you. This is the general philosophy I will refer to on multiple occasions throughout this article. 
However, the Gen's <b>personality</b> is relevant to whether it is perceived by others as the same individual.<br /><br />Does operating as multiple agents with a synced memory (see <i>Memory syncing</i> and <i>Mind transfers and copying</i>) "feel" like being a single individual? What does it feel like to have one of those agents die, for example? What does it feel like to die and then have your memories transferred onto another Gen? These are <b>unanswerable questions</b> to a Biol like my current self -- it is like asking a flatlander to perceive in 3 dimensions, or someone born blind to see (see <i>Games and Virtual Reality</i>).</div><h1>Architecture of a Gen</h1><h2>Hardware</h2><div>It is important to note that a Gen need not appear like a human in the outside world at all: at least, human-looking Gens will eventually become less and less common as virtual reality (see <i>Games and Virtual Reality</i>) advances further and further.</div><div><br /></div><div>Gens are fundamentally just computers, but a Gen can be fitted with any possible peripherals, giving it various physical abilities relating to movement, observation, communication, and manufacturing. Some standard such fittings may include:</div><div><ul><li>Drone rotors</li><li>Hand-like tools and weapons</li><li>A repair kit</li></ul></div><div><h2>Software</h2><div>Although a Gen is "most importantly" a General AI, the fact that it runs on a computer allows it the flexibility of running more specialized programs (AI or otherwise) -- basically for algorithmic and repetitive tasks.</div><div><br /></div><div>A single piece of hardware may, in principle, host multiple Gens. 
However, it is the software, and not the hardware, which should be seen as the fundamental individual, with rights.</div><h1>Gen behavior</h1></div><div><h2>Games and virtual reality</h2></div><div>Gens spend much of their time (clarification later on what this means) within their shells, virtually interacting with some software -- this is a generalization of both <i>dreams</i> and <i>human-computer interaction</i>, and is achieved by switching (or possibly augmenting) the Gen's I/O from the actual hardware peripherals to some simulated I/O. </div><div><br /></div><div>This makes available a whole new "virtual world" or platform, known as the <b>Duat</b>.</div><div><br /></div><div>The Duat can be understood as a collection of <b>games</b>. A typical Duat game involves the Gen taking on an <b>avatar</b> and interacting with his <b>environment</b>. </div><div><br /></div><div>Games may be of various interface types such as:</div><div><ul><li><b>Virtual Reality</b> games</li><li><b>Rich text or multimedia</b> games (e.g. ordinary Internet websites and applications)</li><li><b>Some completely exotic formats</b> that Biols cannot even comprehend -- e.g. </li><ul><li>The avatar may or may not have a human or even humanoid <b>form</b>. </li><li>Some <b>exotic new senses</b> of perception (even something like images at higher resolution than the human eye qualify, but in principle, you could have mechanisms to "feel" all sorts of things)</li><li>A different number of spatial/temporal <b>dimensions</b></li><li>Some very exotic behavior of the <b>locus of consciousness</b>.</li></ul></ul></div><div>Games may be offline or online. A very large number of online games would exist, as Gens with human-like utility systems value interpersonal interaction. </div><div><br /></div><div>One function of the Duat would be to allow Gens to experience anything they could as Biols -- but of course, they could experience far more enhanced pleasures etc. 
and depending on the Gen's utility system, a Gen may have very different desires to those of Biols.</div><h2>Memory syncing</h2><div>Because identity is determined by memories, playing with how memories work creates the prospect for a whole host of exotic, essentially <b>mythological notions of being</b> both in the real world and in the Duat. </div><div><br /></div><div>The first such tool is <b>memory syncing</b>, i.e. syncing (some or all) memories between Gens -- i.e. allowing an individual to have multiple <b>avatars</b>, or to be in multiple places, perform multiple tasks at once. This is basically taking parallel computing to the extreme. This is also a useful <b>backup</b> mechanism.</div><div><h2>Memory editing</h2></div><div>A Gen may choose to -- perhaps temporarily -- suppress or edit some of its memories. This may be, e.g. for the purpose of <b>highly immersive VR</b> experiences (the Gen may want to genuinely believe he is in a haunted house, going through childhood, or discovering general relativity for the first time). </div><div><h1>Production, conversion and transport of Gens</h1></div><div><h2>Gen (re-)production</h2><div>Gens are programmed as AIs and fitted with utility systems and memories. These utility systems and memories may be based on mind transfers.</div><h2>Mind transfers</h2></div><div>Mind transfers involve scanning a brain's memories and traits to install them onto another body. This includes Biol-to-Gen transfers, Gen-to-Gen transfers and Gen-to-Biol transfers. </div><div><br /></div><div>B2G transfers are used for the original <b>upload</b> process. G2B transfers may be used for <b>backups</b>, or if someone really wants a biological body (although such bodies will themselves probably be synthetically produced).</div><div><br /></div><div>G2G transfers are used for <b>backups</b>, <b>cloning</b> and <b>teleportation</b>. 
</div><div><h1>Gen Society</h1></div><div><h2>Habitation and industrial activity</h2><div>Real-world Gen habitation will be radically different. Entire industries present today -- most notably agriculture and healthcare -- will no longer be present. The lack of a need for agriculture in particular will free vast amounts of land for other uses. Many other industries -- education, entertainment, retail, marketing -- will be moved to the Duat or otherwise virtualized. </div><div><br /></div><div>Gens could in principle be (partially) self-contained -- a true <b>rugged individualism</b> -- with some repair facilities, energy generation, manufacturing facilities, housing facilities, etc. built into themselves. Or they may concentrate around urban facilities/<b>cities</b> that provide these services. This depends on the precise costs of operation of these devices versus the cost of the time needed to visit these shops, although Gen society is likely to move towards a "rugged individualism" as resource costs decline.</div></div><div><br /></div><div>Gens are likely to view their software as more "fundamental" to their being, using mind transfer, i.e. <b>teleportation</b> for most long-distance transport. </div><div><br /></div><div>While intelligence basically becomes an infinite resource, the economy is still limited by the availability of physical resources, and the laws of physics themselves (most notably the speed of light, which places a limit on how fast we can expand across the universe). </div><div><h2>Culture</h2></div><div>Gen culture is likely to be very diverse, and much of it completely exotic to us. Human or even Humanoid notions of race, tradition, gender, sexuality and even species are unlikely to apply in a recognizable way to Gens that adopt utility systems different from standard human utility systems. 
There is likely to be a great deal of diversity in the forms of interpersonal relationships.</div><h1>Ethics, violence, law and government</h1><div><div><h2>Efficient IP markets</h2></div><div>What makes information and knowledge markets inefficient is that there is no way to prevent a buyer from re-sharing information. I.e. there is no <a href="http://www.overcomingbias.com/2011/07/ip-like-barbed-wire.html" style="font-weight: bold;">barbed wire for IP</a>. Information transactions in a Gen society may involve the implantation of a small program that prevents the buyer from doing so.</div><div><br /></div><div>Also: memory editing can be used to eliminate information asymmetry, as it allows buyers to "try out" a product and then erase their memory of the usage.</div></div><div><div><h2>Crimes to watch out for</h2></div></div><div><ul><li><b>Child enslavement:</b> Creating a Gen, then subjecting them to something their utility system does not prefer without allowing them to leave. A serious issue here is definitional -- remember how I suggested (under <i>Memory editing</i>) that one may choose to temporarily suppress their memories for an experience? What if I decide to temporarily replace my memories and torture myself? Is the person being tortured even me? Or is it my child? Am I allowed to program the memories of this person to disappear and be replaced with mine? Or would that be taking his life? </li><li><b>Mindless destruction:</b> With such incredible computing power available to all, how do we make sure that someone doesn't just find a way to manufacture tons of antimatter and destroy the world with it? Sure, we can develop better defense mechanisms: but how do we make sure the good guys stay ahead of the bad guys?</li><li><b>Breaking encryption:</b> Once again, with such incredible computing power available, our current encryption systems are obviously going to be broken easily. 
Sure, we also have a greater ability to come up with better systems, but how do we make sure the good guys stay ahead of the bad guys?</li><li><b>Hacking:</b> Hacking can cause serious trouble including memory editing, getting people stuck in the Duat, torture and death. Once again: we will also have the power to develop incredibly better security systems, but how do we make sure the good guys stay ahead of the bad guys?</li><li><b>Deepfakes:</b> A problem for law enforcement, if there even <i>is</i> a centralized law enforcement. Evidence will have to be of a fundamentally higher standard, if a justice system is even to be a thing.</li><li><b>Overpopulation?</b> I don't know what I think about overpopulation, or if it's a thing. Can someone just produce a massive number of Gens that require an incredible quantity of resources, starving all the Gens and causing the entire system to completely collapse? </li></ul><div><h2>Possible regulatory solutions; government</h2></div></div><div>These are possible solutions that society <i>may</i> end up choosing. They are essentially descriptions of what kind of "governance system" can exist at the time.</div><div><ul><li>Regulation of what utility systems are permissible, i.e. simply preventing criminal desires -- or the desire to create Gens with criminal behavior, and so on. This regulation will be enforced at the start of the Gen revolution, and these "desirable" utility functions will recurse down as they regulate the Gens created next, and so on. Obviously authoritarian, and dangerous if we don't "get it right" the first time.</li><li>A surveillance system to nip any criminal activity in the bud, thus having the pre-emptive strike advantage. 
The surveillance system could also be programmed so that it maintains a comfortable majority of the computing power available, is distributed across all the regions inhabited by Gens, and always receives the majority of resources in its region, so it cannot be outsmarted.</li></ul><div>Part of the question is also what is physically permissible -- how good <i>can</i> deepfakes get? How good <i>can</i> a justice system get in uncovering past events (e.g. could you just calculate past states of the world from the current state)?<br /><br />Note that solutions to this problem need to be <b>general</b>, targeted towards "any" immoral behavior or rights-violation, rather than catered to the specific enumerated crimes above, as the range of possible serious crimes can be far more extensive than the ones I've described, depending on the exact physical laws (e.g. if it turns out that time travel is possible, it's essential to make sure nobody does it). The solutions also need to be <b>airtight</b>, unlike the laws we have today, due to the sheer destructive potential of these crimes. </div></div><div><br /></div><div>You may have noted the use of the vague term "society" above. It seems that the decision will ultimately rest on the person who controls the first Gen created. The optimal solution could be figured out in the first few moments after the invention of the required technology, because of the immense increase in available brainpower.</div>age of emartificial intelligenceeconomicsethicsfuturismmachine learningrobin hansonscience fictiontranshumanismvirtual realitySun, 10 May 2020 14:44:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-8389807604897581170Abhimanyu Pallavi Sudhir2020-05-10T14:44:00ZCausation and control: the basis of data mining
https://thewindingnumber.blogspot.com/2020/05/causation-basis-of-data-mining.html
0We've all heard the claim <i>correlation does not imply causation</i>.<br /><br />OK, but what <i>does</i> imply causation/how do you test for causation?<br /><br /><b>What even <i>is</i> causation?</b><br /><br />The way to approach a question like this is to ask: what do <i>we</i> mean by causation? <i>Why</i> would we want to find out if causation exists? If we know what the applications of causation are -- if we know why it is important -- we can work out more formally what exactly the intuitive idea of causation captures.<br /><br />For example, we may want to know the answer to the question "how does a change in tax rates <i>cause</i> a change in economic growth?" The <b>purpose</b> of asking this question is to be able to <b>predict the result</b> of changing tax rates.<br /><br /><b>The idea is that we are able to change the tax rate without changing other government policies.</b> Just looking at <i>correlation</i> would suggest, e.g., that lowering taxes increases government regulations (because both share an underlying causal factor: time). Instead, we have some degree of <i>control</i> over the other variables so we can ignore the changes caused by them.<br /><br /><blockquote class="tr_bq">(See also the <a href="https://thewindingnumber.blogspot.com/2020/04/i-dont-believe-p-hacking-is-problem.html">p-hacking article</a> where I describe the other problem -- having to do with Bayesian priors, theories vs models, and the very method of science -- with a highly-publicized "study" on this subject.)</blockquote><br />In general, the idea is as follows: we have some space of random variables $X_1,\dots X_n$. Then to say that some $X_i$ <i>causes</i> $Y$ is to keep all other variables constant and see how $Y$ varies with $X_i$, i.e. to look at the <b>conditional distribution</b> of $Y\mid X_i$.<br /><br />Notice how this definition <b>completely depends on the random variables we have chosen to control</b>. 
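To make the role of <i>control</i> concrete, here is a small simulation (entirely made-up data mirroring the tax/regulation example above): time drives taxes down and regulation up, with no direct causal link between the two, and we compare the raw correlation against the correlation after controlling for time by taking residuals:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical data: "time" is an underlying causal factor that lowers
# taxes and (independently) raises regulation. Taxes have NO direct
# effect on regulation here.
time = rng.normal(size=n)
taxes = -time + rng.normal(size=n)
regulation = time + rng.normal(size=n)

def corr(a, b):
    return np.corrcoef(a, b)[0, 1]

# Raw correlation is strongly negative: "lowering taxes raises regulation"?
print(corr(taxes, regulation))

# Control for time: project both variables onto the orthogonal complement
# of span{1, time} (i.e. take residuals), then correlate the residuals.
def residual(y, control):
    X = np.column_stack([np.ones_like(control), control])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

partial = corr(residual(taxes, time), residual(regulation, time))
print(partial)  # close to 0 once time is held constant
```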
For example, if we had also controlled for some economic variables that taxation affects GDP growth "through", the effect of changing taxation would perhaps be smaller. The choice of these variables depends on our purpose -- i.e. on what we're actually able to control.<br /><br /><blockquote class="tr_bq"><div class="twn-furtherinsight">Apply this reasoning to the context of a classic example: since wind speed correlates with the rotation of a windmill, does this mean the windmill rotation affects wind speed? What are we really asking here? What are our variables (you should have at least 3 of them)?</div></blockquote><br />We may define very similar notions for causation as the classic ones we have for correlation -- for example, the analog of the correlation is the <b>partial correlation</b>, which captures linear causal relationships. One defines the <b>partial correlation</b> of $Y\mid X_i$ as the value of the correlation computed from the conditional distribution of $Y\mid X_i$; it may geometrically be interpreted as the $\cos\theta$ of the angle between the projections of $Y$ and $X_i$ (<a href="https://thewindingnumber.blogspot.com/2018/02/random-variables-and-their-properties.html">as vectors</a>) onto the orthogonal complement of the span of the controlled variables (thereby eliminating the correlations $Y$ and $X_i$ have with them).<br /><br />Well, it's clear that the notion of causation opens up an entire new field somehow "parallel" to statistics (even if it is really a sub-field of statistics) -- this field (for the purposes of this course at least) is known as <b>data mining</b>.causationcorrelationdata miningmathematicsrandom variablesstatistical inferencestatisticsSat, 09 May 2020 21:41:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-1141373173595102196Abhimanyu Pallavi Sudhir2020-05-09T21:41:00ZLeast-squares estimator and Gauss-Markov
https://thewindingnumber.blogspot.com/2020/05/linear-regression-mean-and-variance.html
0Perhaps you've heard of Least-squares estimation (e.g. from <a href="https://thewindingnumber.blogspot.com/2020/04/numerical-linear-algebra.html">this post</a> in the Algorithms course).<br /><br />Well, it may have seemed a bit unnatural to you. I mean, it <i>feels</i> natural. But it's hard to actually <i>justify</i> why we care about least-<i>squares</i> rather than least cubes, least line segments etc. Squares often come up as a natural way of measuring distance, but why?<br /><br />First of all, the least-squares stuff can be thought of in a linear algebraic light: think of some parameter space for the vector $\beta$, which is mapped by the design matrix (the independent variables) $X$ to some subspace $W$ of the data space, and the value of $Y$ for any particular $\beta$ is distributed around its image in $W$. The least-squares estimate can then be thought of in terms of projecting an observed value $y$ onto $W$ as $\hat{y}$ and finding its preimage under $X$, $\hat{\beta}$.<br /><br />Then $\hat{\beta}$ itself is a cloud/has a distribution, isn't it? We can think about projecting the cloud of $Y$ onto $W$, then taking its pre-image under $X$. The first thing that is obvious is that due to the spherical symmetry of the distribution of $Y$ (because we're assuming the covariance matrix is $\sigma^2 I$ -- do you see why this makes sense in the contexts we're probably interested in?), the $\hat{\beta}$ distribution has an ellipsoidal symmetry (think about what this means precisely) and is centered at $\beta$, which trivially means $E(\hat{\beta})=\beta$, i.e. the estimator is unbiased.<br /><br />More interestingly though, one can consider the variances of $\hat{\beta}$, and observe that it is the "best" among all linear unbiased estimators. 
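This projection picture can be checked numerically. Below is a quick sketch (simulated data with an arbitrary design matrix, made-up $\beta$, and Gaussian noise): the least-squares residual is orthogonal to the column space $W$, and averaging $\hat{\beta}$ over many noise draws recovers $\beta$, illustrating unbiasedness:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 50, 3

# Arbitrary design matrix X and true parameter beta (made up for the demo).
X = rng.normal(size=(n, p))
beta = np.array([2.0, -1.0, 0.5])

# Least-squares estimate: project y onto W = col(X), pull back through X.
def beta_hat(y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

# 1. The residual y - X beta_hat is orthogonal to W (projection property).
y = X @ beta + rng.normal(size=n)
r = y - X @ beta_hat(y)
print(np.abs(X.T @ r).max())  # ~ 0 up to floating point

# 2. Unbiasedness: E[beta_hat] = beta; average over many noise draws.
est = np.mean([beta_hat(X @ beta + rng.normal(size=n)) for _ in range(5000)],
              axis=0)
print(est)  # close to [2, -1, 0.5]
```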
One may see that any other linear unbiased estimator $\tilde{\beta}$ represents some oblique projection of the cloud onto $W$ -- in particular, this projection is an ellipse which necessarily "contains" the Hermitian projection (circle) corresponding to the least-squares estimator (i.e. has higher variance in every direction). This can be transformed back into the $\beta$ space, and the circle is still completely contained in the ellipse. Of course this means that $\mathrm{Var}(\tilde{\beta})-\mathrm{Var}(\hat{\beta})$ is always positive semi-definite, a result known as the <b>Gauss-Markov theorem</b>.<br /><br />The more general lesson is that 2-norms are fundamentally related to linear algebra -- and therefore to the mean/expectation operator, because it is a linear operator. Indeed, the absolute deviation norm is similarly related to the <i>median</i>.<br /><br />Adding to the nice things about 2-norms, <b>2-norms (and therefore linear algebra) are naturally related to the normal distribution</b>. More specifically, if you assume $Y$ to be normally distributed around $\beta x$, then it's easy to see that the maximum likelihood estimate is precisely the least-squares estimator. Indeed, the absolute norm would be appropriate if $Y$ had a double-exponential distribution about $\beta x$ (can you see the connection?).<br /><br />In fact, different "methods" of linear regression correspond to different, often non-linear projections onto $W$. This includes Bayesian methods, or "trying to be Bayesian" methods of correcting overfitting like Lasso and Ridge regression.least-squareslinear algebralinear regressionmathematicsnormsregressionstatisticsWed, 06 May 2020 13:55:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-7412407185293186604Abhimanyu Pallavi Sudhir2020-05-06T13:55:00ZWhy do we conflate philosophy with psychology?
https://thewindingnumber.blogspot.com/2020/04/why-do-we-conflate-philosophy-with.html
0I've been looking at some stuff on the history of philosophy, and happened to chance upon a <a href="https://www.youtube.com/watch?v=FinMGtpTud0">video by Pewdiepie</a>, a popular general-topic youtube channel, on stoicism.<br /><br />My view on "stoicism" is fundamentally quite similar to my view on Ayn Rand's objectivism -- it is based on a sound philosophical claim (in the case of stoicism, <b>the only ethical questions are ones about what you can control</b>; in the case of objectivism, <b>people are motivated by values</b>), but the deductions they make from these premises simply do not follow logically.<br /><br />For example, the stoics deduce from their principle that people <i>should not care</i> about things that they cannot control, or that they should "make peace" with things how they are. <b>But this is no longer a philosophical claim -- it is an ethical claim.</b> And your ethics needs axioms, it doesn't simply follow objectively from self-evident philosophy. Within my ethical system, I can think of several ways to rebut the ethical claims made by the stoics:<br /><ul><li>People are poor judges of what they can control.</li><li>Although ethics is about how you should act/what choices you should make, just thinking about some issue may help you come up with "solutions" to it, i.e. figure out what choices are actually available to you.</li><li>Your natural anger/emotions may very well help motivate your actions.</li><li>You may find thinking about something to be a useful intellectual exercise, even if you do not have the power to control it.</li><li>Your job or source of income may require you to adopt a less stoic "personality".</li></ul><div>To be clear, I'm not saying any of the above arguments are actually valid, and their details are not important to me: I'm saying that they <i>could</i> be valid, in the sense that "how should you react to how things are?" 
is a question that should be analysed as an ethical question, within your ethical premises, by looking at the consequences of your possible reactions.</div><div><br /></div><div>Pewdiepie then goes on to claim that a number of <i>psychological</i> ideas derive intellectually from stoicism. </div><div><br /></div><div>Again -- <i>psychological</i> questions cannot simply be answered by philosophy. Philosophy is something that is self-evident, which you understand purely from introspection without the need for any science or observation. <b>Psychology is, at least in principle, a science</b>. It needs reasoning and experimentation, and <b>cannot be <i>derived</i> from philosophy</b> in this sense.</div><div><br /></div><div>Yet it seems rather widespread to conflate philosophy with psychology -- you hear the colloquial use of phrases like "philosophy on life" to refer to facebook-level pop-psychology mumbo-jumbo, etc. And it isn't just laypeople either: <b><i>philosophers</i> do it all the time</b>, e.g. "relativists" claiming that their philosophical viewpoint is somehow related to relativity (relatable, isn't it?), or the Objectivists denying quantum mechanics. </div><div><br /></div><div>The reason for this ambiguity, I guess, is historical. The term philosophy in Greece referred to any body of knowledge, and the term still retains some <i>association</i> with Ancient Greece today. As more and more fields became mathematical and scientific in the modern era -- astronomy, physics, medicine and so on -- they were separated from philosophy, because they were no longer connected to ancient Greece, or to the pseudo-sciences that preceded them. 
</div><div><br /></div><div>However, fields that haven't really changed in their basic form since the ancient era -- including soft social sciences, the humanities, and psychology -- remain in people's minds as a part of "philosophy", even though <a href="https://thewindingnumber.blogspot.com/p/philosophy.html">philosophy should really just be regarded as epistemology</a>.</div><div><br /></div><div>The general conflation of philosophical ideas with specific science and ethics is, of course, nothing new. The Ancient Greeks did it, and it was also prevalent in East Asia, where Wang Yangming's philosophical claim that "action/ethics is logically fundamental, everything else derives from it" was interpreted to mean "take a lot of action!", particularly in Japan.<br /><br />This confusion notably does not appear to exist in Classical Hindu philosophy, where the Bhagavad Gita, a philosophical treatise, does not really make ethical prescriptions of its own, but the Upanishads, an ethical compendium, do.</div>ancient greeceancient indiaepistemologyhistoryphilosophypsychologystoicismThu, 30 Apr 2020 22:23:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-8121840044415091140Abhimanyu Pallavi Sudhir2020-04-30T22:23:00ZThe Dirichlet (also Beta) distribution
https://thewindingnumber.blogspot.com/2020/04/the-dirichlet-also-beta-distribution.html
0Here's a category of distributions we may often want: a distribution on the simplex.<br /><br />I.e. a multivariate distribution on $n$ nonnegative numbers that add up to 1. This is something that we can definitely see ourselves using as prior distributions on parameters that can be interpreted as "probabilities" of something. One can see that this is important for the Categorical and Multinomial distributions, for instance: and in the case of two numbers (i.e. a univariate distribution, since it's on a line segment), for the Bernoulli and Binomial distributions.<br /><br />Here's one such distribution family that may come to your mind: for $x_i$ in the simplex $\sum_i x_i = 1$,<br /><br />$$f(x_1,\dots x_n\mid\theta_1,\dots\theta_n)\propto \prod_{i}{x_i}^{\theta_i}$$<br />By adjusting the values of the $\theta_i$, one can get suitable priors that represent our beliefs correctly. This is known as the <b>Dirichlet distribution</b>, and its univariate case $f(x|\theta_1,\theta_2)\propto x^{\theta_1}(1-x)^{\theta_2}$ is known as the <b>Beta distribution</b>.<br /><br />In fact, the parameters of said distributions are usually provided a bit differently, with $\alpha-1=\theta$, i.e. $\alpha_i = \theta_i + 1$.<br /><br /><b>Exercise:</b> Prove that:<br /><ul><li>The normalization constant is given by $\frac{\Gamma\left(\sum_i\alpha_i\right)}{\prod_i \Gamma(\alpha_i)}$</li><li>The mean is given by $E(X_i)=\frac{\alpha_i}{\sum_i\alpha_i}$</li><li>The Dirichlet distribution is the <b>conjugate prior to the categorical/multinomial distribution</b>. This is the key fact that makes the Beta/Dirichlet distribution important.</li></ul>bayesian statisticsbeta distributionconjugate priordistributionspriorsprobabilitystatisticsThu, 30 Apr 2020 14:19:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-6053944640526075175Abhimanyu Pallavi Sudhir2020-04-30T14:19:00ZPoisson processes: from geometric to Gamma distributions
https://thewindingnumber.blogspot.com/2020/04/poisson-processes-from-geometric-to.html
0The notion of a Poisson process is rather beautiful, and connects a number of distributions together.<br /><br />To start, consider a Bernoulli process: a discrete time series with each value IID Bernoulli. Then we can study some properties of this:<br /><ul><li>The waiting time for the 1st event is distributed <b>geometrically</b>. </li><li>The waiting time for the nth event is distributed as the sum of geometric distributions, which is <b>negative binomial</b>.</li><li>The number of events in a given period of time is distributed <b>binomially</b>.</li></ul><div>(You should be able to derive these distributions easily.)<br /><br />(Note that this is not specific to a time series -- one has the same results for a spatial lattice or for some general abstract set of points.) </div><div><br /></div><div>How would one generalize this to continuous time? Well, with continuous time you really can't talk about the result for each point in time being "Bernoulli", or about them being IID. But let's do it anyway. Suppose an event has a $\mu\; dt$ chance of occurring in the timespan $dt$. Then the chance that the first event occurs at time $t$ is (by geometric distribution) ${(1 - \mu \;dt)^{t/dt}}\mu \;dt$, or: $\mu {e^{ - \mu t}}\; dt$, i.e. a probability density $\mu e^{-\mu t}$. This is called the <b>exponential distribution</b>.<br /><br />The waiting time till the nth event is analogously just the sum of exponential random variables and its distribution can be computed through the standard MGF route. It is left as an exercise to the reader to show that the sum of $\alpha$ IID exponential random variables with rate $\mu$ is given by the <b>Gamma distribution</b> $\Gamma(\alpha,\mu)$:<br /><br />$$\Gamma(\alpha,\beta)\sim \frac{1}{(\alpha - 1)!}\beta^\alpha t^{\alpha-1}e^{-\beta t}$$<br />What does our notion of independence translate to? The idea that whether there is an event at some time should not depend on whether there was one at any other time. 
Well, the natural way to write independence in a way that makes sense for continuous distributions is to consider the waiting time for the first event:<br /><br />\[P(T>t+s|T>t) = P(T>s)\]<br />This is known as <b>memorylessness</b>. Indeed, one can check that the only memoryless discrete distribution is geometric, and the only memoryless continuous distribution is exponential.<br /><br />OK -- what about the number of events in the continuous case? Well, in some interval of size $T$, the probability of the number of events equaling some $n$ is (by binomial):<br /><br />\[\left( {\begin{array}{*{20}{c}}<br /> {T/dt} \\<br /> n<br />\end{array}} \right){(\mu \,dt)^n}{(1 - \mu \,dt)^{T/dt - n}}\]<br />which, it is easy to see, equals (in the $dt \to 0$ limit):<br /><br />\[\frac{{{{(\mu T)}^n}{e^{ - \mu T}}}}{{n!}}\]<br />which is the <b>Poisson distribution</b> with rate parameter $\mu T$.<br /><br />Here's a table of analogies between memoryless discrete and continuous processes:</div><style type="text/css">.tg {border-collapse:collapse;border-spacing:0;margin:0px auto;} .tg td{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px; overflow:hidden;padding:10px 5px;word-break:normal;} .tg th{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px; font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;} .tg .tg-c3ow{border-color:inherit;text-align:center;vertical-align:top} .tg .tg-0pky{border-color:inherit;text-align:left;vertical-align:top} .tg .tg-fymr{border-color:inherit;font-weight:bold;text-align:left;vertical-align:top} </style><br /><table class="tg"><tbody><tr> <th class="tg-0pky"></th> <th class="tg-fymr">Discrete time</th> <th class="tg-0pky"><span style="font-weight: bold;">Continuous time</span></th> </tr><tr> <td class="tg-fymr"><b>Overall phenomenon</b></td> <td class="tg-0pky">Bernoulli process</td> <td class="tg-0pky">Poisson process</td> </tr><tr> <td 
class="tg-0pky"><span style="font-weight: bold;">Single-event</span></td> <td class="tg-0pky">Bernoulli distribution</td> <td class="tg-c3ow">-</td> </tr><tr> <td class="tg-0pky"><span style="font-weight: bold;">Waiting time (1st event)</span></td> <td class="tg-0pky">Geometric distribution</td> <td class="tg-0pky">Exponential distribution</td> </tr><tr> <td class="tg-0pky"><span style="font-weight: bold;">Waiting time (nth event)</span></td> <td class="tg-0pky">Negative Binomial distribution</td> <td class="tg-0pky">Gamma distribution</td> </tr><tr> <td class="tg-0pky"><span style="font-weight: bold;">Number of events</span></td> <td class="tg-0pky">Binomial distribution</td> <td class="tg-0pky">Poisson distribution</td> </tr></tbody></table><br /><div class="twn-pitfall">But isn't the continuous analog of the binomial distribution (and many other distributions) the normal distribution? Do not conflate discrete time with discrete number. Both the binomial and Poisson distributions above are discrete distributions: the Poisson is just the relevant one for the continuous-time process. These are completely unrelated notions.</div>bernoulli processdistributionsmathematicsmemorylessnesspoisson processprobabilitystatisticstime seriesMon, 27 Apr 2020 21:23:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-281526121372313967Abhimanyu Pallavi Sudhir2020-04-27T21:23:00ZProbabilistic inequalities
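The discrete-to-continuous limit behind this table can be checked by simulation -- a sketch (the rate $\mu$, step $dt$, window $T$ and trial count are all arbitrary choices): running a Bernoulli$(\mu\,dt)$ process with a small step $dt$, the first waiting time should have mean near $1/\mu$ (exponential) and the event count in $[0,T]$ should have mean near $\mu T$ (Poisson).

```python
# Simulating the discrete-to-continuous limit: a Bernoulli(mu*dt) process
# with a small time step dt approximates a Poisson process with rate mu.
import random

random.seed(1)
mu, dt, T = 2.0, 0.001, 5.0     # arbitrary illustration values
steps = int(T / dt)
trials = 1000

wait_sum, count_sum = 0.0, 0
for _ in range(trials):
    first_wait, count = None, 0
    for step in range(steps):
        if random.random() < mu * dt:   # an event occurs in this dt
            count += 1
            if first_wait is None:
                first_wait = step * dt
    wait_sum += first_wait if first_wait is not None else T
    count_sum += count

mean_wait = wait_sum / trials    # should approach 1/mu = 0.5 (exponential)
mean_count = count_sum / trials  # should approach mu*T = 10 (Poisson)
print(mean_wait, mean_count)
```

Shrinking $dt$ further makes the discretization bias vanish, exactly as in the derivation.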
https://thewindingnumber.blogspot.com/2020/04/probabilistic-inequalities.html
0Consider a random variable with mean 0 and variance 1 (this is like the "natural units" of second-moment probability). Now this variance puts a value on how dispersed the PDF of the random variable can be: one couldn't, for example, have two Dirac-delta poles really far from the origin, because you can calculate the variance of that, and it's too high.<br /><br />Which raises the question: for some $k$, what's the maximum fraction of the distribution that can be outside $(-k,k)$?<br /><br />Ok, one thing is clear: at this maximum, there should be nothing outside $[-k,k]$, because any mass beyond $\pm k$ could be brought in to $\pm k$ without changing the fraction while reducing the variance.<br /><br />Also, for any value in $(-k,k)$, if you moved it over to the mean (0), the variance would go down. So the distribution must be comprised of just three poles at 0, $-k$ and $+k$, and is necessarily symmetric so that the mean is at 0. Letting $p/2$ be the height of each pole at $\pm k$, the variance in terms of $p$ is $pk^2$. So $pk^2=1$, and $p=1/k^2$. I.e.<br /><br />$$P\left(\left|X\right|>k\right) \le 1/k^2$$<br />Or for general mean and variance:<br /><br />$$P\left(\left|\frac{X-\mu}{\sigma}\right|>k\right) \le 1/k^2$$<br />This is <b>Chebyshev's inequality</b>, and gives you a limit on how much of the distribution can lie more than some given distance $k$ (in standard deviations) from the mean. Note how it only becomes interesting for large $k$ ($k>1$).<br /><br />Well, clearly this approach seems to open up a whole world of similar inequalities. 
Another is the <b>Markov inequality</b>, which states that (for a nonnegative random variable) no more than $1/k$ of a population can have value more than $k$ times the mean; equivalently,<br /><br />$$P\left(X\ge k\right)\le\frac{\mu}{k}$$<br />(Justify this with similar reasoning to Chebyshev's.)<br /><br />In fact, Chebyshev's inequality can be derived as a special case of Markov's (do it).<br /><br />The point of these inequalities is that means and variances are generally easy to track, even when probability distributions are unknown. Providing bounds on probability fractions based on these is very useful for proving <a href="https://thewindingnumber.blogspot.com/2020/04/probabilistic-convergence.html">convergence in probability</a> -- for example, the weak law of large numbers becomes elementary with Chebyshev's inequality.chebyshev's inequalityconvergenceinequalitieslaw of large numbersmarkov inequalitynatural unitsprobabilityrandom variablesstatisticsMon, 27 Apr 2020 14:39:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-7002712675819758146Abhimanyu Pallavi Sudhir2020-04-27T14:39:00ZProbabilistic convergence
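Both inequalities are easy to verify empirically -- a minimal sketch using exponential(1) samples (an arbitrary choice of nonnegative distribution, whose true mean and standard deviation are both 1):

```python
# Empirical check of Markov's and Chebyshev's inequalities on an
# exponential(1) sample (an arbitrary nonnegative test distribution).
import random

random.seed(2)
xs = [random.expovariate(1.0) for _ in range(100_000)]
n = len(xs)
mean = sum(xs) / n                                    # true mean: 1
sd = (sum((x - mean) ** 2 for x in xs) / n) ** 0.5    # true sd: 1

for k in (2.0, 3.0, 5.0):
    markov = sum(x >= k * mean for x in xs) / n       # P(X >= k*mu)
    chebyshev = sum(abs(x - mean) / sd > k for x in xs) / n
    print(k, markov <= 1 / k, chebyshev <= 1 / k ** 2)
```

For the exponential the true tails decay like $e^{-k}$, so both bounds hold with plenty of room -- they are worst-case bounds, not estimates.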
https://thewindingnumber.blogspot.com/2020/04/probabilistic-convergence.html
0The <b>law of large numbers</b> is something that we know, that in our heads is almost the definition of probability (it's not): "as sampling increases, the average of a variable approaches its expected value". I.e. for $X_i$ IID:<br /><br />$$\lim_{n\to\infty}\frac1n \sum{X_i}=\mu$$<br />Let's think about what this statement really says: when you take more and more readings of $X$, the average will get closer and closer to $\mu$. But the values of these readings are inherently probabilistic: this is not an actual sequence of real numbers you can take the limit of. Rather, you are saying that of all the possible realizations (which are real number sequences), <i>almost all</i> of them (probabilistically) converge to $\mu$. I.e.<br /><br />$$\mathrm{Pr}\left[\lim_{n\to\infty}X_n=X\right]=1$$<br />This is known as <b>almost sure convergence</b>.<br /><br />In general, the thing on the right could've been a random variable, rather than a real number. And here's where some <a href="https://thewindingnumber.blogspot.com/2019/10/sigma-fields-are-venn-diagrams.html">probability theory</a> (read the article) comes in, because the random variables $X_n$ and $X$ need to be defined on the same sample space for this to make sense (i.e. it's not just about the distribution).<br /><br />But with this, the definition as above still works: as an example, consider the sample space $[0,1]$ and consider a sequence of random variables $X_n$ that is respectively 1 on some corresponding sequence of sub-intervals approaching $[0,1/2]$. Then this approaches the random variable that is 1 on $[0,1/2]$ almost surely.<br /><br />And yes, this is entirely due to the correlations between these things.<br /><br />In any case, almost sure convergence isn't really the best way to express random variables converging to each other, as you can see. E.g. 
the central limit theorem -- like $\frac{1}{\sqrt{n}}\sum_{i=1}^n\frac{X_i-\mu}{\sigma}\sim N(0,1)$, cannot be phrased in terms of almost sure convergence, because $N(0,1)$ is a <i>distribution</i>, not a random variable.<br /><br />Indeed, you may have figured that the problem of a random sequence converging to a random variable is somewhat similar to the notion of "functions converging to a function" -- indeed, one may think of the <i>distributions</i> of the random variables in the sequence and discuss their convergence. I.e.<br /><br />$$F_n(x)\to F(x)$$<br />This is called <b>convergence in distribution</b>.<br /><br />While convergence in distribution does not imply almost sure convergence in general as we've seen, we would expect that it does imply it in the case where the limiting random variable is constant (because then the issue of correlations disappears).<br /><br />But you may realize that this is not really so: a sequence may look increasingly like something without actually limiting to it. For example, think about a sequence like 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1... with an infinite number of 1s, but decreasing in frequency. This doesn't limit to zero. If this were a deterministic sequence, this would never be expected to limit to 0 as the positions of the 1s would be hardcoded into the generation of the sequence. However, the sequence can also arise as a realization of a sequence of random variables $X_n$ that have probability $1/n$ of being 1. Then the $X_n$ converge in distribution to 0, but their realizations almost never (thus in particular don't almost surely) converge to 0.<br /><br />So it seems that asking for realizations to almost surely converge to the right thing is a bit too strong for a lot of purposes. 
A weaker notion of convergence than almost sure convergence can be constructed by considering probabilities of each $X_n$ separately rather than as a sequence: $X_n$ converges to $X$ if each $X_n$ is in the limit <i>almost surely arbitrarily close to $X$</i>. Or more precisely: for every $\varepsilon>0$,<br /><br />$$\lim_{n\to\infty}\mathrm{Pr}\left(\left|X_n-X\right|<\varepsilon\right)=1$$<br />This is known as <b>convergence in probability</b>. Indeed:<br /><ol><li>Almost sure convergence implies convergence in probability (obviously).</li><li>Convergence in probability implies convergence in distribution (because they are both topological notions of convergence and the map from a random variable to its distribution is continuous).</li><li>When the limit random variable is constant, convergence in distribution implies convergence in probability.</li></ol>In fact, the law of large numbers that we stated above (in terms of almost-sure convergence) is the <b>strong law of large numbers</b>, while the <b>weak law of large numbers</b> only states convergence in probability.<br /><br /><hr /><br /><b>Exercise:</b><br /><b><br /></b>Prove Slutsky's lemma: given $X_n, Y_n$ converge to $X,y$ in probability and $y$ is a constant random variable:<br /><br /><ol><li>$X_n+Y_n$ converges to $X+y$ in probability.</li><li>$X_nY_n$ converges to $Xy$ in probability.</li><li>$X_n/Y_n$ converges to $X/y$ in probability (assuming $y\neq 0$).</li></ol><div>Why is it necessary that $y$ be a constant?</div>central limit theoremconvergencelaw of large numberslimitsmetric spaceprobabilityrandom variablesrandom vectorsstatisticstopologyMon, 27 Apr 2020 10:30:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-641488481531293284Abhimanyu Pallavi Sudhir2020-04-27T10:30:00ZComment by Abhimanyu Pallavi Sudhir on What does "surface area of a sphere" actually mean (in terms of elementary school mathematics)?
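The Bernoulli$(1/n)$ example can be simulated directly -- a sketch (taking the $X_n$ independent, with an arbitrary cutoff $N$): $P(X_n=1)=1/n\to 0$, so $X_n\to 0$ in probability, yet since $\sum 1/n$ diverges, a typical realization keeps producing 1s (Borel-Cantelli), so the realizations do not converge to 0.

```python
# The X_n ~ Bernoulli(1/n) example: X_n -> 0 in probability, but a typical
# realization of the (independent) sequence contains 1s arbitrarily far out,
# since sum(1/n) diverges.
import random

random.seed(3)
N = 100_000   # arbitrary cutoff for the simulation
ones = [n for n in range(1, N + 1) if random.random() < 1 / n]

print(len(ones))   # roughly ln(N) ~ 12 ones in expectation, however far out you look
print(ones[-3:])   # the last few 1s appear very late in the sequence
```

The 1s become arbitrarily sparse (that's convergence in probability) without ever stopping (that's the failure of almost sure convergence).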
https://math.stackexchange.com/questions/3641205/what-does-surface-area-of-a-sphere-actually-mean-in-terms-of-elementary-schoo
@silph Right, and I'm saying these insights/intuition should be equated to the discipline of calculus.Sat, 25 Apr 2020 19:02:54 GMThttps://math.stackexchange.com/questions/3641205/what-does-surface-area-of-a-sphere-actually-mean-in-terms-of-elementary-schoo?cid=7487449Abhimanyu Pallavi Sudhir2020-04-25T19:02:54ZComment by Abhimanyu Pallavi Sudhir on What does "surface area of a sphere" actually mean (in terms of elementary school mathematics)?
https://math.stackexchange.com/questions/3641205/what-does-surface-area-of-a-sphere-actually-mean-in-terms-of-elementary-schoo
Rather than try to <i>explain</i> this in terms of elementary-school mathematics, you should <i>use</i> this elementary question as motivation for calculus. More generally, calculus is what you naturally get when trying to generalize ideas from flat things to curved things (an important question here is "What is a curved thing?" -- the answer is "something that looks flat when you zoom in").Sat, 25 Apr 2020 14:42:10 GMThttps://math.stackexchange.com/questions/3641205/what-does-surface-area-of-a-sphere-actually-mean-in-terms-of-elementary-schoo?cid=7486740Abhimanyu Pallavi Sudhir2020-04-25T14:42:10ZWhy history needs to be completely reformulated
https://thewindingnumber.blogspot.com/2020/04/why-history-needs-to-be-completely.html
0When I look at mainstream historical thought, I often find a complete lack of any scientific or objective approach to the area. Some assorted comments:<br /><br /><ul><li>What <a href="https://www.quora.com/What-are-common-fallacies-about-technology/answer/Shashank-Nayak-8">Shashank Nayak on Quora</a> calls the <b>Great Man theory applied to technologies</b>. The historiography of the Industrial Revolution is formulated around "railroads" and "the steam engine". These are obviously post-fitted explanations: there is no clear reason why "railroads" would bring a fundamental change in economic history but not e.g. the assembly line.<br /><br />While a "great man theory" may make sense to a certain degree for <i>political</i> changes, it does not make sense for economic and technological changes, in which growth is intuitively just exponential, as opposed to the discontinuities observed at the agricultural and industrial revolutions.<br /><br />The question "Why did this sudden break in economic growth rates occur in 10,000 BC and never at any point before that?" needs to be answered in terms of social science, in terms of some change in political systems, rather than something purely technological like "agriculture was invented".<br /><br />More generally, historians' (cause-and-effect) <b>explanations of events lack a firm grounding in relevant (economic/social scientific) theory</b> -- they're just post-hoc explanations that someone came up with. 
Richard Feynman's comments on what it means to know something seem to apply.<br /><br /><div class="separator" style="clear: both; text-align: center;"><iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/tWr39Q9vBgo/0.jpg" frameborder="0" height="266" src="https://www.youtube.com/embed/tWr39Q9vBgo?feature=player_embedded" width="320"></iframe></div><br />Various "political" biases play a role here: for example, it is rare for ancient historians to consider the effects of historical economic policies, beyond relatively trivial matters like infrastructure and centralized governance, because historians don't really view economics in a fundamental way like they should.<br /><br /></li><li>Historians seem to often be <b>rather ignorant of the disciplines whose history they study</b>. <br /><br />Questions relating to the origins of scientific ideas are often answered etymologically or on basis of form rather than function (e.g. algebra's etymology being confused with its history). <br /><br />Historians of philosophy seem to be particularly terrible at translating historic philosophy, because philosophy is "fundamental" in a certain sense that makes it difficult to unambiguously express in human language, and the only way to understand what someone is saying is to understand it yourself. E.g. the philosophical position of my favourite ancient philosopher, Ming dynasty neo-Confucian <a href="https://en.wikipedia.org/wiki/Wang_Yangming">Wang Yangming</a>, is basically the stuff expressed in my <a href="https://thewindingnumber.blogspot.com/2017/08/three-domains-of-knowledge.html">Three domains of knowledge</a> article, plus the idea that normative beliefs are the fundamental basis of positive ones. This is translated by historians as saying that "the point of knowledge is action".<br /><br /></li><li>History seems to be told in terms of <b>narratives</b> rather than in terms of actual physical events. 
This is in part because the social science that the humanities are based on (I like to say that social science : humanities :: science : engineering) is itself formulated in terms of Marxian social theory, which is more of a narrative than an operational scientific theory, and more generally because the social sciences and humanities (and for whatever reason, philosophy) have always and everywhere been evaluated through the lens of advancing a social movement, rather than objective criteria/truth.<br /><br />What is something like the New York Times's "1619 project" even supposed to mean? What does it mean for a country to be "founded on" something? These are entirely social, "emotional" notions, i.e. having to do with what associations people personally make in their minds, and have nothing to do with history itself. Yet "<b>interpreting history</b>" seems to be the main focus of history academia. <br /><br />In particular, this leads to what I call <b>emphasis-gaming</b>, something that is also practised by the media. Without actually being factually inaccurate, you can choose to formulate history around a narrative of your choice and pick all the right aspects to place emphasis on in your account. Because the discipline lacks objectivity, this actually has an influence on the inferences made regarding the history you're describing (see the "uncertainty" point). This is particularly prevalent in Indian history -- e.g. the emphasis placed by historians on the <em>possibility</em> that the Mauryan empire permitted partial autonomy to less urban regions (which is seldom a point of discussion with most other empires of the world, such as Persian and Central Asian ones whose territories were far less urban than the Mauryans').<br /><br />I can understand the use of narratives for pedagogical reasons -- as a way for people to organize and store historical information efficiently. 
But this should be <i>after</i> collecting and calculating historical information, not as a way to "pre-formulate" the discipline. Otherwise, a history textbook becomes indistinguishable from a historical movie.<br /><br /></li><li>The notion of <b>historiography</b>, i.e. the "history of history". Historians have the habit of arbitrarily assigning more trust to some accounts and sources than to others, and refer to them as "history". Although this is often justified on various parameters of differentiation (period of time between the described event and the account, purpose of the account, presence of poetry and exaggeration in the account, agreement with other accounts), these explanations often seem to be post-hoc and applied arbitrarily. For example, in Indian history, historians often consider Buddhist accounts more historical than those of other religious sources, despite no difference in the mentioned parameters. On a related note, I think that historians' division between "history" and "pre-history" is arbitrary, pointless and a bit self-important.<br /><br />More generally, historians have a number of arbitrary and <b>convoluted definitions</b>, e.g. "what's a historian?", "what's a religion?", "what's calculus?" Instead, history should be formulated around simple and meaningful definitions and distinctions -- like physics is.<br /><br /></li><li>A <b>failure to acknowledge uncertainty</b>. Much of historical academia is just speculation presented as fact, and the criterion used to decide which speculation is accepted as fact is the notion of <b>consensus</b>, rather than anything objective. <br /><br />An example is <a href="https://books.google.co.uk/books?id=a-JGGp2suQUC">Angus Maddison's historical GDP estimates</a> (see p. 379) from 1000 AD and earlier -- no serious calculation or justification is given for the values presented (agricultural records etc. from the time are sparse and in many places completely absent). 
Values for India and China seem to be calculated purely on their population shares (i.e. under the assumption that they were economically average), even when better estimates can be made based on a subjective evaluation of the technologies of these societies.<br /><br />Perhaps a more stark example is the reign of Ashoka in Ancient India. The main source of information on this matter is Ashoka's edicts, a set of rock inscriptions spread across South Asia. There is <a href="https://en.wikipedia.org/wiki/Major_Pillar_Edicts#Authorship">significant primary uncertainty</a> on their authorship, whether the major pillar edicts were related to Buddhism, and whether the major and minor edicts were authored by the same individual. Nonetheless, they are used to provide a standard "consensus" chronology of Mauryan-era India. Further uncertainty exists on the factors behind Ashoka's <a href="https://en.wikipedia.org/wiki/Ashoka#First_contact_with_Buddhism">conversion to Buddhism</a> and whether the edicts reflected a true state of affairs in his empire or <a href="https://en.wikipedia.org/wiki/Ashoka#The_war">rather political propaganda</a>. Nonetheless, the edicts are accepted at face value, Ashoka is interpreted by historians as a "driving force for religious reform in India", <a href="https://www.newsnation.in/india/news/romila-thapar-saying-ashoka-inspired-yudhisthira-sparks-controversy-and-fury-on-twitter-238459.html">mainstream historians speculate wildly on the impacts of his reign</a>, and a Buddhist-centric reading of Indian history is formulated. <br /><br />I'm aware that scholarly criticism of these interpretations exists. 
Nonetheless, there is a <em>mainstream view</em> chosen based on consensus (rather than anything objective) which carries a very high degree of uncertainty: this uncertainty compounds at every "step" as demonstrated in the above example, and an eventual <em>mainstream narrative</em> is constructed based on these arbitrary choices, which is ridden with uncertainty and is indistinguishable from historical fiction.<br /><br /></li></ul><hr /><br />The key, fundamental problem of history is the compounding uncertainty I described -- everything else can be sorted fairly easily. The key to solving this problem is to be organized about uncertainty, and accept all the most likely explanations as serious "alternate possibilities" rather than stick to a consensus-based canon.<br /><br />I suggest a <b>comprehensive database of historical facts</b>: facts of the <b>five forms</b> detailed in the figure below. These facts should be linked by inferential links so it is very clear why a certain historical belief is valid, and this database should be <i>the</i> platform for all historical research. History should be visualized as a "<b>possibility tree</b>", where each "set of facts" in one level is taken to suggest several likely alternate "sets of facts" in the next level (see figure below). The likelihood of each such set of facts is evaluated based on the direct likelihood of each fact, as well as their correlations (i.e. 
agreement between sources).<br /><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/--NDpjE9BXzI/XqMR0rbBoFI/AAAAAAAAGFg/lYnC87RX-yEu4_T6EIxEO5C-g5k-dpSLgCLcBGAsYHQ/s1600/history.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="553" data-original-width="1245" height="176" src="https://1.bp.blogspot.com/--NDpjE9BXzI/XqMR0rbBoFI/AAAAAAAAGFg/lYnC87RX-yEu4_T6EIxEO5C-g5k-dpSLgCLcBGAsYHQ/s400/history.png" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">"Levels" of history, history as a possibility tree</td></tr></tbody></table><br />Some explanations of what each item entails:<br /><ul><li><b>Empirical sources:</b> These are everything directly observable, the "tip of the iceberg" from which all historical inferences must be based. Note that they are not the same as "primary sources": for example, if a primary source for something is missing, a later translation or commentary counts as a "basic source" (the inferences made from it, of course, will be much more uncertain).</li><ul><li><b>Examples:</b> archaeological sources, epigraphical sources, literary sources, genetic sources, extant sources (e.g. extant instruments, traditions). </li></ul><li><b>Disparate events:</b> These are specific, operationally-defined events. 
</li><ul><li><b>Examples:</b> military conflicts, laws instituted, protests occurred, existence of an instrument/manuscript/building, undertaking of actions (education, assassination, a speech).</li><li><b>Non-examples:</b> inventions and discoveries (because this posits that said individual was the first to create something, which is a chronological statement), political movements (because these are interpretations or classifications, and are therefore created narratives)</li><li><b>Uncertainty added:</b> Nature and purpose of object, site and time of creation and use of object/document.</li></ul><li><b>Anthropology:</b> Snapshots of known historical cultures (i.e. at a place and time) in terms of more long-term trends.</li><ul><li><b>Examples:</b> technology present in the culture, knowledge present in the culture, traditions of the culture, major historical events in the culture, political systems and entities of the culture.</li><li><b>Uncertainty added:</b> Inferred details about creation and usage of object (e.g. the creation of such metal required such-and-such technology, such-and-such technology would also provide the possibility of such-and-such other technology), prevalence of a technology and knowledge (e.g. literacy levels, GDP, urbanization level) or tradition (e.g. cultural beliefs), precise nature of events (was the Kalinga war a war or a civil war?), precise political systems and territories of political entities.</li></ul><li><b>Chronology:</b> An overall geographical and historical picture of the world and its major events. <i>This is what history really is.</i></li><ul><li><b>Examples:</b> inventions and discoveries, political history of the world, comparative anthropology, migrations and genetic changes</li><li><b>Uncertainty added:</b> Dating and locational uncertainties, uncertainty of invention vs. 
introduction (because you may just be missing some earlier evidence)</li></ul><li><b>Narrative:</b> Cause/effect, "interpretations" of history, classifications of political movements, characterizations of civilizations.</li><ul><li>As mentioned earlier, narratives should exist for the purpose of pedagogy. This means that meaningless emotional conceptions like anthropomorphizing and psychoanalyzing countries -- which do not really aid in pedagogy but rather deter honest and rational thinking about history -- should be avoided. This includes conceptions such as the 1619 project, exceptionalist narratives, and vague, conflating terminology like "hierarchy" and "oppression". Narratives should be about <i>abstraction</i>, not <i>vagueness</i>.</li><li>For narratives of individual civilizations, I suggest a pre-defined format of the "notable" aspects to be filled based on chronological facts. E.g. there would be a "technology" section containing important engineering subfields, a "science" section containing various academic subfields, and so on. Historians often try to create "customized" narratives for each culture in an attempt to be "unbiased", but in doing so end up imposing their own personal view of what said culture "should" be summarized by, leading to the various -centrisms in history academia (e.g. the Aristotle/Athens-centric reading of Ancient Greece). </li></ul></ul>academiahistoryhumanitiespseudosciencerationalitysciencescience educationsocial sciencesFri, 24 Apr 2020 16:23:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-357327150546725755Abhimanyu Pallavi Sudhir2020-04-24T16:23:00ZI don't believe p-hacking is a problem.
https://thewindingnumber.blogspot.com/2020/04/i-dont-believe-p-hacking-is-problem.html
0Or more precisely: <b>I don't believe p-hacking is a fundamental <i>mathematical</i> or <i>statistical</i> issue, but rather an issue with the methods adopted by experimental researchers.</b><br /><br />If you haven't heard of p-hacking, it's as follows: suppose you want to find predictors for cancer. You test 100 possible predictors each at p-value 5%. Now although the chance of a false positive on any given test is 5%, you're expected to get 5 false positives across the 100 tests. So you can "always" find a (fake) predictor for cancer just by surveying enough things.<br /><br />Another way of putting it: even if your factors don't actually predict cancer, the probability distribution for the observed correlation for any one factor may look something like this.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-epmfEy-8IFg/Xp8DCFoof6I/AAAAAAAAGEo/XnhdE3nfBvIka1eaYsSetaOS1qT--JJ4wCLcBGAsYHQ/s1600/oopsie.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="662" data-original-width="766" height="276" src="https://1.bp.blogspot.com/-epmfEy-8IFg/Xp8DCFoof6I/AAAAAAAAGEo/XnhdE3nfBvIka1eaYsSetaOS1qT--JJ4wCLcBGAsYHQ/s320/oopsie.png" width="320" /></a></div>So when you test for these factors, when the correlations you observe <b>actually sample the above curve</b>, you're faced with a question: <b>do you believe the points that lie in the shaded region (beyond your significance level) actually reject the null hypothesis?</b><br /><b><br /></b> On one hand: they lie beyond your significance level. Individually, you need to reject those null hypotheses.<br /><br />On the other hand, one can also think of a "mega-" null hypothesis as implying the above curve: since your points sample the curve, you need to accept the null hypothesis.<br /><br /><hr /><br />I believe the answer is to reject those null hypotheses, i.e. 
<b>to not make any "corrections"</b> for having tested multiple parameters.<br /><br />Here are some explanations:<br /><ul><li><b>An individual researcher "p-hacking" is fundamentally/mathematically no different from a large number of researchers investigating various different parameters.</b> Surely it makes no sense to argue that all positive results in the literature should be ignored, or that they should be evaluated at much stronger significance levels. </li><li>When you investigate a large number of parameters, the <b>probability of a true positive is also higher</b> (in a Bayesian sense). If your positives are more likely to be false than true when you're testing a hundred parameters, they were more likely to be false than true when testing one parameter too. Of course, you are much more likely to have false positives when testing more parameters, but that doesn't increase the chance that any given deduction is false, because there are more true positives, too. </li><li>Or in other words, the "mega-null hypothesis" probably <i>isn't</i> true. If the $\theta$ parameters are independent, then you'll probably have a large number of false null hypotheses. The "mega-null hypothesis" argument actually seems to assume zero probability of a true positive. </li><li>Equivalently: just apply Bayes's theorem/the fact that probability is commutative. (equivalent to the first point)</li></ul><div>Also note how the Bonferroni correction has nothing to do with p-hacking: it applies to the probability/confidence level of several hypotheses being true simultaneously, not to any single one.</div><div><br /></div><hr /><br />So, then, why do the consequences of p-hacking all seem so bizarre? 
Stuff like this:<br /><br /><img height="252" src="https://miro.medium.com/max/1200/0*y_dNzVMkUforleEc.png" width="640" /><br />Or this:<br /><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><img height="234" src="https://www.kdnuggets.com/wp-content/uploads/xkcd-p-value-jellybeans.jpg" style="margin-left: auto; margin-right: auto;" width="400" /></td></tr><tr><td class="tr-caption" style="text-align: center;">Full comic for context: <a href="https://xkcd.com/882/">xkcd 882</a></td></tr></tbody></table>Surely we don't actually believe that margarine causes divorces in Maine, or that only green jelly beans cause acne?<br /><br />And no, we don't.<br /><br />Why not?<br /><br />Because <b>there is no <i>a priori reason</i></b> to suspect that margarine causes divorces in Maine. Because <i>a priori</i>, we know that it's highly likely that whether green jelly beans cause acne is correlated with whether all the other colours of jelly beans cause acne (because it's very unlikely that colour itself causes acne). These correlations should be embedded in our prior.<br /><br />But these aren't fundamental issues with the nature of statistics. <b>These are issues with how frequentist researchers may often decide which hypotheses to test</b>. One should have some <i>theoretical justification</i> to formulate a hypothesis: that's how you know the prior probability is significant. Unless you have a good theoretical model for why a certain correlation/etc. should hold, the hypothesis should not be tested.<br /><br />This problem is particularly prevalent in the social sciences, where a "general mathematical theory" of social science does not exist. Even in economics, you often end up with pseudo-science like this: <a href="https://prospect.org/power/want-expand-economy-tax-rich/" rel="nofollow">Want to expand the economy? 
Tax the rich!</a> (this particular study was terrible on several levels: (1) the conflation of correlation and causation -- this is always a problem when you have temporal trends, because time is a hidden parameter; that's why you should do cross-sectional studies; (2) the correlation was statistically insignificant by any standard; (3) there was no theoretical justification for why progressive taxation would expand the economy, leading to the problems discussed in this post.)<br /><br />This point is essentially the point made by several papers (links: <a href="https://statmodeling.stat.columbia.edu/2016/08/22/bayesian-inference-completely-solves-the-multiple-comparisons-problem/">[1]</a><a href="http://www.stat.columbia.edu/~gelman/research/published/multiple2f.pdf">[2]</a>) discussing "multiple comparisons in a Bayesian setting" -- it is what is meant by claims like "<b>the multiple comparisons problem disappears when you use a hierarchical Bayesian model with correlations between your parameters</b>".academiabayesian statisticsBonferroni correctionlook-elsewhere effectmultiple comparisons problemp-hackingstatistical inferencestatisticsTue, 21 Apr 2020 22:19:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-8720165114670454907Abhimanyu Pallavi Sudhir2020-04-21T22:19:00ZSpecial case of Bayes: Confidence Regions and Hypothesis tests
https://thewindingnumber.blogspot.com/2020/04/special-case-of-bayes-confidence-regions.html
0The basic general idea behind a confidence region is this: <i>Given</i> that the true value of some parameter is $\theta$, we may have some mechanism to sample "<b>random regions</b>" $R$ for $\theta$ such that 95% of these random regions contain $\theta$.<br /><br />The first obvious issue is that this mechanism should not depend on $\theta$, as it is not known to us. We want a <b>general experimental mechanism</b> that for <i>any</i> $\theta$, produces random regions of the same confidence level ("95%").<br /><br />In some basic cases, this is easy: for example, suppose we have some $X\sim N(\mu, 1)$. Then for any $\mu$, 95% of intervals generated as $X\pm 1.96$ contain $\mu$.<br /><br />The key hint in the example above is that $\mu$ is a <b>location parameter</b> for $X$, i.e. the probability of $X\mid\mu$ is a function of just $X-\mu$, i.e. the distribution of $X-\mu$ itself does not depend on $\mu$, and is just $N(0,1)$. $X-\mu$ is what we call a <b>pivotal quantity</b> here.<br /><br />In general, a pivotal quantity is a function of the data and the true value of the parameter itself, $k(X,\theta)$, such that its distribution is completely specified. Then a confidence region for $k$ can hopefully be transformed back into a confidence region for $\theta$ at the same confidence level.<br /><br /><hr /><br />OK, next question: what is the <b>implied prior</b> of confidence region calculations? I.e. under what prior can the confidence level be interpreted as the probability that the true value of the parameter is contained in the confidence region?<br /><br />(For a general prior, such a region that gives you some probability of containing the true value of the parameter is called a <b>credible region</b>.)<br /><br />Well, what exactly is the confidence level? It's the probability that a randomly generated random region contains the true parameter value -- i.e. <i>before you actually know what the random region is</i>. 
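This pre-data coverage claim can be sanity-checked with a quick simulation of the earlier $X \pm 1.96$ example (a minimal sketch in Python; the particular true value $\mu = 3.7$, the seed and the trial count are arbitrary choices of mine, not from the post):

```python
import random

random.seed(0)
mu = 3.7            # the true parameter value, unknown to the experimenter
trials = 100_000
covered = 0
for _ in range(trials):
    x = random.gauss(mu, 1)          # observe one draw X ~ N(mu, 1)
    if x - 1.96 <= mu <= x + 1.96:   # does the random interval X +/- 1.96 contain mu?
        covered += 1
print(covered / trials)              # approximately 0.95, whatever the value of mu
```

Note that the 95% is a property of the interval-generating procedure, not of any one realised interval -- which is exactly the pre-data/post-data distinction being drawn here.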
Once you get the generated random region, this probability may change depending on the prior probability of the true parameter value being in this concrete region.<br /><br />In other words, the implied prior is one such that $\theta$ has an <b>equal prior probability of being in any possible confidence region</b>. This is easy to calculate in some specific examples:<br /><ul><li>If $\theta$ is a location parameter for $X$, the implied prior on $\theta$ is uniform, $\propto 1$.</li><li>If $\theta$ is a scale parameter for $X$, the implied prior on $\theta$ is logarithmic, $\propto 1/\theta$.</li></ul><hr /><br />The way that hypothesis testing is first introduced, one talks of things like "the probability of finding a value of $x$ at least as extreme as you did". And one sometimes chooses a "one-sided" hypothesis test and other times a "two-sided" hypothesis test. It should be clear that this isn't too fundamental a concept to be interested in.<br /><br />Rather, one sensible, more generally appropriate way of thinking of <b>hypothesis tests is in terms of confidence regions</b>. Specifically: <b>testing a null hypothesis is equivalent to asking if it is contained within the confidence region of our data</b>.<br /><br />Obviously, this depends entirely on the shape we choose for our confidence region. We can always just choose a confidence region that includes or excludes our null hypothesis and maintain the same confidence level.<br /><br />While it may be disappointing that there is no one way to construct a confidence region, this makes a great deal of sense. 
For example, consider the following multimodal distribution:<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-nI6xhJJZwyw/XpwaWyYhwyI/AAAAAAAAGEM/IEqYFuqymikDt0Dqm9ZkRwN4yCwZlLfxwCLcBGAsYHQ/s1600/confidence_multimodal.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="513" data-original-width="1600" height="127" src="https://1.bp.blogspot.com/-nI6xhJJZwyw/XpwaWyYhwyI/AAAAAAAAGEM/IEqYFuqymikDt0Dqm9ZkRwN4yCwZlLfxwCLcBGAsYHQ/s400/confidence_multimodal.png" width="400" /></a></div><br />The sensible confidence region to construct would then be one that contains the bulk of both peaks. "Sensibility" here means choosing the confidence region of least length (you may observe that this is not reparameterization-invariant).<br /><br />Different constructions of confidence regions are what give you things <b>like two-tailed</b> and <b>one-tailed</b> tests.<br /><br /><b>Also read: </b><i><a href="http://econ.ucsb.edu/~startz/Choosing%20The%20More%20Likely%20Hypothesis.pdf">Choosing the more likely hypothesis</a></i> by Richard Startzbayesian statisticsconfidence intervalcredible intervalfrequentist statisticshypothesis testingone-tailed and two-tailed testspriorsstatisticsSat, 18 Apr 2020 19:07:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-4773790550149646574Abhimanyu Pallavi Sudhir2020-04-18T19:07:00ZDivide-and-conquer algorithms
https://thewindingnumber.blogspot.com/2020/04/divide-and-conquer-algorithms.html
0Suppose you have an algorithm that operates at time-complexity $O(n^2)$. Then it would be pretty nice if we had a way to reduce a problem with parameter $n$ into two problems with parameters $n_1$, $n_2$ which add up to $n$. Then we can do the split on each $n_1$, $n_2$, and so on and create a tree of some sort.<br /><br />That sounds like something we might be able to do in various situations, and is known as a <b>divide-and-conquer algorithm</b>.<br /><br />Below are some examples of divide-and-conquer algorithms:<br /><ul><li><b>Eigenvalues (QR algorithm) -- </b>In our <a href="https://thewindingnumber.blogspot.com/2020/04/numerical-linear-algebra.html">QR algorithm for the Schur decomposition</a>, in which the matrix is already in upper-Hessenberg form, if at any point some value on the subdiagonal becomes particularly close to zero, we can split the parts of the matrix top-left and bottom-right of it, and perform the algorithm separately on them. This will only reveal the eigenvalues, however, and not the full Schur decomposition.</li><li><b>Fast Fourier transform --</b> The Discrete Fourier transform can similarly be sped up by dividing the expression into odd and even parts and recognizing that each part is itself a Discrete Fourier transform, then recursively applying this. </li></ul><div>I'm aware the articles I'm writing on numerical algorithms aren't particularly interesting or detailed -- I just don't know of anything interesting to be said about them. If anyone knows something interesting, or any theoretical motivation/"I could have derived this myself" moments, I'd be very interested.</div>algorithmsaudio processingdivide and conquerfourier transformnumerical algorithmsnumerical linear algebraThu, 16 Apr 2020 14:00:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-394824859874078471Abhimanyu Pallavi Sudhir2020-04-16T14:00:00ZFrom flint to the industrial revolution in 2 years: rebuilding civilization (after a zombie apocalypse)
https://thewindingnumber.blogspot.com/2020/04/from-flint-to-industrial-revolution-in.html
0The world has collapsed in a zombie apocalypse. You and a band of 500 other people (around 200 of whom are able-bodied) have escaped to a safe haven somewhere, and are tasked with rebuilding civilisation. There are still people outside in the real world, but if the zombies find out about your existence, and when they get to your location, they will attempt to destroy you.<br /><b><br /></b> <b>Day 1</b><br /><b><br /></b> You don't know where you are, what date it is, or the time of the day. You start with nothing but your intellect, knowledge and observation. And some <i>flint</i> and <i>pyrite</i>, because come on.<br /><br />It's sunny. The sun isn't directly up, but close. The ground is partly covered in snow, but there are some green trees, and partly-frozen freshwater lakes. The cold is pretty bad. You can tell that the temperature is at most -10 C.<br /><ul><li><b>Tasks:</b> protect from cold, avoid dehydration, avoid hunger.</li></ul>It will probably take 30 minutes to get everything in order and have people listen to your directions. Use the flint and pyrite to start a few dozen <i>fires</i> and <i>torches</i>. You guess from the position of the sun that it's noon and the cold's about to get worse. You'll want some animal skins, but you'll need to make tools to kill some bears or whatever. You want to boil the lake water before drinking it, but you'll need a basic pot.<br /><br />You can make a <i>stone tip</i> within half an hour, but you'll need rope to fasten it to a stick to make it a <i>stone-tipped spear</i>. Making a <i>rope</i> is easy, since you have plenty of sources of plant fibre. You can also use the stone tips to chisel out a <i>basic pot</i> from a rock. Boil some water and keep hydrated and warm. It's fine, you can drink boiling water.<br /><br />So within 2 hours of landing, you have weapons and a pot. You can now hunt. You find moose. 
Two moose should be sufficient to feed 500 people, and you can use their skins as warm clothing -- you might have to pass the skins around a bit, but congratulations -- you managed to survive the night.<br /><ul><li><b>Inventory:</b> fire, torch, stone spear, rope, basic pot, boiled water, meat, animal-skin cloth</li></ul><b>Days 2-4</b><br /><b><br /></b> Based on the fauna you've seen, atmospheric temperature and the fact that you had a night, you realise that you're in the Northern hemisphere, although quite a bit south of the arctic circle -- probably in Canada or Siberia. From the positioning of the sun and from the temperature, you figure you're just after the winter solstice. You also have the directions figured out approximately.<br /><ul><li><b>Tasks:</b> protect people and items from animals and elements, improve tools</li></ul>It would help to start by developing some more simple tools, like: <i>stone axe</i>, <i>stone shovel</i>, <i>bow and arrow</i>, <i>basket</i>, which are already within our technological capability. We also need better <i>pots</i>, which are easy to make (separate soil, mix some sand into wet clay, shape it and fire it in a closed environment, like a <i>kiln</i>).<br /><br />For residence, it might initially be better to create a <i>dug-out</i> to quickly house such a large number of people. <br /><br />It's time for some division of labour.<br /><ul><li><b>Division of labour:</b> hunters (~14%), builders (~16%), lumberjacks (~20%) | artisans (~10%), maintainers (~40%)</li></ul><div>(The tasks before the "|" are restricted on the basis of physical capacity.)</div><div><br /></div>The maintainers do tasks like providing fresh water supply, making and maintaining fires, cooking, cleaning, nursing injuries. The artisans make and repair tools (especially arrows) and pots, process carcasses for meat and clothing, and tailor. 
The artisans and maintainers should be observed by a trustworthy superintendent.<br /><br />The hunters should also start surveying the lands they visit to create a mental map of the area -- the animals, plants, stones and other materials in it, etc. Perhaps start thinking of temporary simple sources for stuff like writing material, ink, etc. before the perfected industrial products are developed. It's also important to look out for potential future locations to build mines.<br /><br />The builders should find an area with a suitable geology and start digging a dug-out, on some land with relief to prevent getting flooded.<br /><br />Obtaining wood has other positive effects -- you're also clearing out the forest for later agriculture.<br /><br />Bones should also be stored, and bone versions of the existing tools should be crafted and experimented with to find the best tools for each job.<br /><ul><li><b>Inventory:</b> fire, torch, rope, stone/bone tools (spear, axe, shovel), wood, basket, kiln, pot, dug-out, bow and arrow, boiled water, meat, animal-skin cloth </li></ul><div>You've probably had a few deaths at this point from animal attacks, human conflict and perhaps even from disease or accidental injuries. Remember to worry about hygiene, particularly dental hygiene. You would also certainly have had some crime by now. Build a small vertical prison (from which a prisoner must be pulled out or climb out) somewhere in the dug-out. A few hunters should take shifts as prison officers (to imprison people and guard prisons).</div><div><br /></div><b>Days 5-15</b><br /><b><br /></b> You urgently need some plant-based food, and your hunters have hopefully identified something edible growing in the wild. 
You have been able to find some <i>fruits</i> (berries), and trees that you can get some <i>syrup</i> from.<br /><br />But we really need to reduce the dependence on hunting and gathering to free people for productive urban tasks, which will be the first step towards an industrial civilisation. However, agriculture remains far away, as there are no grains and cereals in the vicinity.<br /><ul><li><b>Tasks:</b> diversify food sources, start long-distance exploration</li></ul><div>You know that there are cereals in the South. You've already figured out whether you're in Canada or Russia (based on the flora and fauna you've found) -- if you're in Canada, you can find maize and sunflower from a southward expedition; if you're in Russia or Northern Europe, you can find millet, barley, wheat and hemp from a southward expedition. If you're in South America, you can find quinoa and plenty of other crops from a northward expedition.</div><div><br /></div><div>The only problem is: going there on foot will take months. Fortunately, wherever you are, you can probably easily find wild horses, which will allow you to complete such an expedition in a week.</div><div><br /></div><div>So let's prepare for the journey. You need to tame some <i>horses</i>, of course, and stock up on some food for people and horses. They'll also need ropes to leash their horses, flint and pyrite and pots to boil water in (and to carry grain back in). Send a trusted superintendent with the explorers to make sure they don't wander off or decide to not return.</div><div><br /></div><div>You should also look into expanding your food sources into the aquatic realm. 
Spear-fishing and <i>net</i> fishing are the most practical options at the moment, but it will be helpful to make a <i>raft</i>, which you can do with logs and rope.</div><div><br /></div><div>At this point, your hunters (and explorers) should really have identified locations to build important mines and quarries: clay pits, limestone quarries, iron ore mines.</div><div><ul><li><b>Division of labour:</b> explorers (~10%), hunters (~14%), lumberjacks (~20%) | gatherers (~20%), fishermen (~10%), artisans (~10%), maintainers (~16%)</li><li><b>Inventory:</b> fire, torch, rope, stone/bone tools (spear, axe, shovel), wood, basket, pot, dug-out, bow and arrow, boiled water, meat, animal-skin cloth, raft, net, fish, fruit, syrup, tamed horses, grains</li></ul></div><div><div>Your explorers are back! In just a fortnight, you've passed through the entire paleolithic phase of human history.</div><div><br /></div><div><b>Days 16-135</b></div><div>It's the agricultural revolution! Well, at least if you have sunflower, barley or wheat. Remember to plough and irrigate your fields.<div><div><ul><li><b>Tasks:</b> sow seeds, capture animals for farming, mine metals, build a city </li></ul></div><div>With all the forest-clearing, you probably have a lot of edible animals frolicking on your lands. Great, have the hunters capture them -- but to capture them, you need to build a <i>fence</i> from wood and stone.</div></div><div><br /></div><div><div>We definitely need to improve our tools. The main advances here will be the use of bricks and the mining of metal. Making <i>fired bricks</i> is easy: fire a mixture of clay and sand in a kiln. </div><div><br /></div><div>But we also need mortar -- lime mortar is probably the best you can do for now. Limestone is abundant (10% of sedimentary rocks are limestone, and cave systems are full of it) -- build a limestone quarry, mine some limestone (calcium carbonate) and heat it in a kiln to produce lime (calcium oxide). 
A mixture of lime and sand in water gives you the <i>lime mortar</i> you need.</div></div><div><br /></div><div>Iron is perhaps the most "complicated" metal to use, but its ores are much more common than the copper and tin required to make bronze. Since you have unlimited knowledge, smelting it is not an issue. You'll have to look out for "banded iron formations", the most common source of iron ore, specifically magnetite and hematite. </div><div><br /></div><div>You'll need to add <i>charcoal</i> to your bloomery to keep temperatures high, which also has the nice effect of carburizing your iron (adding carbon to it). Charcoal can be made by burning dry wood in the absence of oxygen, such as by covering it in clay and stone (so instead of actual combustion, you get pyrolysis, which drives off the hydrogen and oxygen from the wood, leaving mostly carbon).</div><div><br /></div><div>Then build a bloomery -- a pit or chimney filled with burning iron ore and charcoal. Upon burning, you'll get sponge iron -- a mixture of iron and unwanted material called slag. The sponge iron is then re-heated and hammered, driving out the slag and producing <i>wrought iron</i>. </div><div><br /></div><div>Our processes are getting quite complicated, and are best organised by building a city (or proto-city), with dedicated purpose-fit <i>buildings</i>. We can do this with our brick and mortar, and a <i>ladder</i> (which you can make with wood). The hardest part is making a rain-resistant roof. There are many methods here, but the simplest way is to make a wooden framework for the roof and place ceramic tiles on it.</div><div><br /></div><div><div>We can also afford some luxuries now, like furniture and ceramic cutlery. 
These will be made by the carpenters and craftspeople respectively, while the weapon-crafting job is passed on to the smiths.</div></div><div><br /></div><div>It's the urbanisation revolution!</div><div><br /></div><div><div>All the transportation of resources between your mines and your settlement would benefit from a systematic transportation system. Have your builders level the terrain and pave some <i>roads</i> with brick and mortar. It's also time to reinvent the <i>wheel</i> (you can do this with wood) and make some <i>carts</i>. Perhaps all these carts and ladders and what not would be best made if you used some <i>nails</i> and a <i>hammer</i>.</div></div><div><br /></div><div>Here's a summary of the buildings you'll need, classifying your inventory:</div><div><ul><li><b>City:</b> </li><ul><li><b>Urban buildings</b> -- houses, public building, hospital, prison</li><li><b>Mines and extraction</b> -- forest, water bodies, farm field, farm pen, horse stable, clay pit, limestone quarry, iron ore mine</li><li><b>Factories and processing</b> -- butchery, kitchen (including water purification, fire), mud processing, charcoal kiln, masonry and lime kiln, pottery kiln, bloomery, smiths, crafts centre, general repairs, tailor, carpenter</li><li><b>Storage</b> -- grain and meat storage, water tank, materials reserve, general tool storage, weapon storage, cart shed</li><li><b>Transportation --</b> roads, tunnels</li></ul></ul></div><div>(Examples of stuff in the general tool storage: flint and pyrite, rope, ladders, cutlery items, carpentry knives, wheels, nails, hammer. Examples of stuff in the materials reserve: wood, iron, clay)</div><div><br /></div><div><div>It's also time to rewrite our division of labour, specialising the artisans to specific tasks. With the advent of plant-based foods, cooking has also become more complicated, and should require separate specialised labour (but they can manage the water-boiling too), so the maintainers just maintain fire. 
The farms need constant tending -- for irrigation, feeding animals, applying dung fertilizer to plants, etc.</div><div><ul><li><b>Division of labour:</b> explorers (~4%), lumberjacks (~7%), builders (~9.2%), guards (~2%) | gatherers and fishermen (~25%), plant farmers (~5%), animal farmers (~5%), potters (~3%), transportation workers (~6%), iron-smelters (~2%), smiths (~4%), carpenters (~5%), masons (~5%), craftsmen (~5%), butchers (~2%), tailors (~4%), cooks (~3%), nurses (~0.6%), cleaners (~0.4%), fire-maintainers (~0.8%)</li></ul><div>The civilisational advancement in this phase is incredible and actually quite rapid, but our reliance on gathering places a heavy strain on our labour pool (we especially need more builders).<br /><br />For example: with all these reserves and supply chains to care about, we really need to start recording and documenting. But making paper will need mills to grind wood, and so on.</div></div></div><div><br /></div><div>But all this is about to change with your first harvest.<br /><br /><b>Days 135-720</b><br /><b><br /></b> With 125 people freed from gathering and fishing (fishing is no longer a necessity either, as you now have plenty of diversity in food), you are ready to systematise your supply chains. You also note that maintaining a complicated supply chain comes with a serious issue of incentives, where a few nodes slacking off disrupts the entire system. You need to privatise services and allow some primitive market to set incentives.<br /><br />You also begin to observe that your small population places a serious strain on your ability to make things -- certainly, you cannot build an industrial civilisation with this population. We really want an industrial civilisation -- we want things like electricity, cars, computers, modern medicine. We want to shorten the workday and allow people to transition into more interesting jobs. You will need immigration, but making contact with the outside world remains dangerous. 
You need to prepare.<br /><ul><li><b>Tasks:</b> organise your supply chains (mills, paper, ink, records, clocks, cats, evaporative refrigeration), improve materials and systems (steel, cement, mechanization), prepare for contact (defense)</li></ul>It's the iron age/classical age. You may argue that we already entered the iron age in the last phase, but with the exception of the use of iron, it was pretty much a bronze age civilization until now.<br /><br />Before anything else, we need to make sure the grain we've harvested stays until the next harvest. If you have rats, domesticate some <i>cats</i>. If heat is a concern for storage, make an <i>evaporative cooler</i> (self-explanatory).<br /><br />To be able to make and maintain <i>records</i>, you need <i>paper</i> and <i>ink</i>. The easiest way to make ink is to mix charcoal ash with water.<br /><br />To make paper, you should best institute a pulp <i>mill</i> -- a factory to grind wood into fibres. The technology of a mill is used for a whole variety of applications, such as grain processing (so you can now bake bread). Once you have a pulp mill, the pulp just needs to be thinned and drained on a mesh, then pressed and dried.<br /><br />An improved version of a chimney bloomery can be created by adding paper <i>bellows</i> to blow air to remove impurities during the process. You should also experiment with different levels of carbon content in the product to perfect your <i>steel</i>-making.<br /><br />Institute a few superintendents at the factories to maintain stock quantities and demand estimates to allow flexible allocation of labour and resources.<br /><br />Clocks, which will be useful for improving efficiency with regard to the workforce, can be made easily in the form of <i>sundials</i>.<br /><br />Your construction material needs serious updating -- it's basically in the neolithic age. 
Have your explorers visit some volcanic areas for various volcanic deposits -- some of them will turn out to be pozzolans -- mixing these with calcium hydroxide (which you can get from lime and water) gives you <i>lime cement</i> -- which, when mixed with sand and gravel, gives you <i>concrete</i>, and which can be used as a mortar. If you need to build anything in water for some reason, add volcanic ash -- this has weird reactions with water which will allow it to set underwater.<br /><br />Many of your activities can be partly mechanized with water-power -- e.g. mills and bellows. Build a <i>water-wheel</i> -- this is just a wheel you put under flowing or falling water, with hanging cups to capture the water's energy for rotation. It can then be connected to a <i>trip-hammer</i>, which would simplify and reduce labour-dependence in many manufacturing operations.<br /><br /><i>Defense plan</i><br /><i><br /></i> Let's make a defense plan for your base.<br /><br />First of all, every commercial building should be equipped with a dug-out shelter with a narrow entrance, from which all the functions of the building can be carried out. These shelters should not replace the existing buildings, as they can act as death traps if blocked or poisoned.<br /><br />An underground network for the dug-outs is required, but should be difficult to traverse and the network links should be minimal. The idea is that these tunnels should only be used if temporary circumstances make it dangerous to come to the surface. Therefore only factors of production whose supply is required in the short-term should be linked to their factories. Each dug-out should store flat-boulders (both round and flat-based ones) to clog the network if needed.<br /><br />Homes should not be fitted with individual shelters -- instead, a large communal cave should be prepared with basic residential facilities fitted (you can just use the dug-out you prepared when you were starting out, and expand as need be). 
This is to prevent having too many entrances to the underground network (unconnected private shelters are fine, but it's easier to just escape to a commercial area or the communal home, as the area is still pretty small).<br /><br />Allow surplus to build up in the storage: your factories will very likely be attacked. This is especially important for your grain storage, as you will not be able to take your fields underground.<br /><br />Let's consider the three routes that the zombies will attack you from:<br /><ul><li><i>By land</i></li></ul><div>Preparing for the worst, suppose that the zombies have tanks, and massive numbers of soldiers. I suggest a haphazard perimeter of secret land mines around your settlement. </div><div><br /></div><div>But these should be the last resort, in that you should be able to disable the attacking forces before they reach this perimeter (because the land mines are not easily replenishable). Build fire-towers around this perimeter to destroy incoming armies with rockets, oil and flamethrowers. Towers should be fitted with supplies of everything needed to sustain your warriors. </div><div><br /></div><div>The most reliable way to stop a tank, however, is to hijack it. So I suggest heavily-armoured warriors with pickaxes to hijack tanks that get through your defenses, then use them against other incoming tanks. </div><div><br /></div><div>Any strategic mountain passes should have their boundary mountains converted to fire-towers. </div><div><br /></div><div>In case of any penetration, station some firearm-equipped warriors within the city to destroy any intruders.</div><ul><li><i>By sea</i></li></ul><div>Rig any nearby waters with jagged rocks. The previously mentioned "marine cement" is relevant here.</div><ul><li><i>By air</i></li></ul>Surface-to-air missiles are not really feasible at this stage of development against any reasonably fortified aircraft. Instead, you'll have to focus on fortifying your city. 
Make your buildings out of steel, and cover them with thick layers of steel wool to absorb impact.<br /><br />A central watchtower should be instituted to check the progress of your battles.<br /><br />Training is essential -- train your warriors for aim, matchlock operation, combat and formations in tank-hijacking. Train civilians in emergency protocols such as seeking shelter and blocking pathways.<br /><br />Consider "wheeled" towers to be able to quickly focus your resources on a spot -- this will be useful if the zombies attempt to pressure a specific spot of your defense.<br /><br />So the new technologies we need are:<br /><ul><li><i>Saltpeter</i>: Saltpeter can be mined from your limestone caves, from bat and bird guano. If you are near Chile, the Atacama desert is a major source of these deposits. It is purified by boiling it with a small amount of water (which dissolves the saltpeter, leaving impurities as residue), then pouring the solution through wood ashes (which contain potassium carbonate, allowing the dissolved calcium and magnesium ions to precipitate out as carbonates), before drying. </li><li><i>Sulfur</i>: Your best bet for sulfur is from heating pyrite. If you have salt domes of sulfur in your region, go for them. </li><li><i>Gunpowder</i>: A milled mixture (mixed wet, then dried) of saltpeter (75%), charcoal (15%), sulfur (10%). Saltpeter is the oxidizer -- it supplies its own oxygen when heated, which is what lets the mixture burn fast enough to produce propulsive power; charcoal is the fuel, and sulfur is a secondary fuel that also reduces the ignition temperature of the mixture. </li><li><i>Land mines</i>: The only reliable way to build a gunpowder landmine is to simply bury patches of gunpowder lightly under the ground and shoot flaming arrows at them. To prevent the powder from getting moist, consider covering it with oil or concrete or something. 
Do some tests.</li><li><i>Rockets</i>: These are just arrows propelled by tubes of burning gunpowder.</li><li><i>Matchlocks</i>: Matchlocks are perhaps the best firearm you can equip your warriors with at this stage. They operate via a lever mechanism that lowers a constantly-lit slow match into the priming pan, igniting the gunpowder. </li><li><i>Oil</i>: Find crude oil. </li></ul><div>That's it. You are now on the verge of the first industrial revolution. By sending explorers to nearby regions, you should recruit a labour force to build more and larger mines for oil and metals, to mass-produce on assembly lines in factories, and to build larger and more ambitious engineering projects such as power stations and electrical grids. Your growing population will itself create its own demands, such as more housing and food, better sanitary practices and medicine and efficient mechanization of processes with steam and electricity. </div><div><br /></div><div>But by now you know the drill. </div></div><div></div></div>civilizationhistoryindustrial revolutionrebuilding civilizationzombie apocalypseWed, 15 Apr 2020 21:51:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-3022078790381123651Abhimanyu Pallavi Sudhir2020-04-15T21:51:00ZPositive vs normative social sciences
https://thewindingnumber.blogspot.com/2020/04/positive-vs-normative-social-sciences.html
0If you've seen my <a href="https://thewindingnumber.blogspot.com/p/philosophy.html">philosophy</a> course (especially the <a href="https://thewindingnumber.blogspot.in/2017/08/three-domains-of-knowledge.html">Three Domains of Knowledge</a> article), you'll have seen my definition of ethics as the study of what an individual <i>should do</i> -- and the natural dual I described of this is what an individual <i>does observe</i>, which is the definition of physics/science.<br /><br />But another, less fundamental dual of ethics is the study of what an individual <i>does do</i> -- social science. This is on a fundamental level just a subfield of physics, as what individuals do are ultimately just things you can observe. However, the analogies between "positive" and "normative" social science are often striking and of interest to many people.<br /><br />Here's a table:<br /><style type="text/css">.tg {border-collapse:collapse;border-spacing:0;margin:0px auto;} .tg td{font-family:Arial, sans-serif;font-size:14px;padding:10px 5px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:black;} .tg th{font-family:Arial, sans-serif;font-size:14px;font-weight:normal;padding:10px 5px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:black;} .tg .tg-cly1{text-align:left;vertical-align:middle} .tg .tg-yla0{font-weight:bold;text-align:left;vertical-align:middle} .tg .tg-0lax{text-align:left;vertical-align:top} </style><br /><table class="tg"> <tbody><tr> <th class="tg-yla0">Ethics</th> <th class="tg-yla0">Social science</th> </tr><tr> <td class="tg-cly1">Statecraft</td> <td class="tg-cly1">Political science</td> </tr><tr> <td class="tg-0lax">Political economy</td> <td class="tg-0lax">Economics</td> </tr><tr> <td class="tg-0lax">Foreign policy</td> <td class="tg-0lax">International Relations</td> </tr><tr> <td class="tg-0lax">Political ideologies</td> <td class="tg-0lax">Law</td> </tr><tr> <td class="tg-0lax">Cultural 
beliefs</td> <td class="tg-0lax">Anthropology</td> </tr><tr> <td class="tg-0lax">Personal morality</td> <td class="tg-0lax">Human behavior</td> </tr></tbody></table>epistemologyethicsphilosophysocial sciencesWed, 15 Apr 2020 13:31:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-8518117335407927365Abhimanyu Pallavi Sudhir2020-04-15T13:31:00ZFourier series and Hilbert spaces
https://thewindingnumber.blogspot.com/2020/04/fourier-series-and-hilbert-spaces.html
0The idea behind Fourier series is to try and express some function on a domain $[-L,L]$ as a sum of complex exponentials of the form $\frac{1}{\sqrt{2L}}e^{i\pi nx/L}$ (note the exponent: the period must be $2L$, the length of the domain). One of the reasons this is interesting is that these complex exponentials form an orthonormal system under the inner product $\int f(x)\overline{g(x)}\ dx$.<br /><br />One can start by considering the vector space $V$ of all square-integrable functions on $[-L,L]$ -- this gives us a vector space with an inner product. Specifically, we're interested in the subspace $V_n$ that is the span of the complex exponentials with index up to $n$ and $-n$. Then given a vector $f$ in $V$, we can ask for its <b>projection</b> $f_n$ onto $V_n$.<br /><br />As the complex exponentials are already orthonormal, it is easy to calculate this projection in their basis: <br /><br />\[\begin{gathered}<br /> {a_k} = \left\langle {f,\frac{1}{\sqrt{2L}}{e^{i\pi\;kx/L}}} \right\rangle = \int\limits_{ - L}^L {f(x)\frac{e^{ - i\pi\;kx/L}}{\sqrt{2L}}dx} \hfill \\<br /> {f_n}(x) = \sum\limits_{|k|\le n} {{a_k}\frac{e^{i\pi\;kx/L}}{\sqrt{2L}}} \hfill \\<br />\end{gathered} \]<br />Notably, since $f_n$ is the orthogonal projection of $f$ onto $V_n$, Pythagoras ($\left\|f\right\|^2=\left\|f_n\right\|^2+\left\|f-f_n\right\|^2$) implies that:<br /><br />\[{\left\| f \right\|^2} \geqslant \sum\limits_{|k| \leqslant n} {{{\left| {{a_k}} \right|}^2}} \]<br />This is known as <b>Bessel's inequality</b>. If we can show that the Fourier series approaches $f$, i.e. that $\left\|f-f_n\right\|\to 0$, then it immediately follows that<br /><br />\[{\left\| f \right\|^2} = \sum\limits_{k \in \mathbb{Z}} {{{\left| {{a_k}} \right|}^2}} \]<br />Which is just the Pythagoras theorem, and is known as <b>Parseval's theorem</b>. 
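As a quick numerical sanity check (a sketch assuming numpy; the test function $f(x)=x$ and the grid size are made up for illustration), one can compute the coefficients against the orthonormal exponentials $e_k(x)=e^{i\pi kx/L}/\sqrt{2L}$ and watch the partial sums obey Bessel's inequality while creeping up towards $\|f\|^2$:

```python
import numpy as np

# f(x) = x on [-L, L]; coefficients w.r.t. the orthonormal
# exponentials e_k(x) = exp(i*pi*k*x/L) / sqrt(2L).
L, N = 1.0, 20000
x = -L + (np.arange(N) + 0.5) * (2 * L / N)  # midpoint quadrature grid
dx = 2 * L / N
f = x

def a(k):
    e_k = np.exp(1j * np.pi * k * x / L) / np.sqrt(2 * L)
    return np.sum(f * np.conj(e_k)) * dx  # <f, e_k>

norm2 = np.sum(np.abs(f) ** 2) * dx                     # ||f||^2 = 2/3
partial = sum(abs(a(k)) ** 2 for k in range(-50, 51))   # sum over |k| <= 50
# Bessel: partial <= norm2, with the gap shrinking as more terms are added
```

For this $f$ the known coefficients satisfy $|a_k|^2 = 2/(\pi^2 k^2)$ for $k\ne 0$, so the partial sum over $|k|\le 50$ falls short of $2/3$ only by the tail, roughly $0.008$.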
Of course, these theorems hold in the general theory of Hilbert spaces.fourierfourier serieshilbert spacelinear algebramathematicsTue, 14 Apr 2020 15:39:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-5453507947770382247Abhimanyu Pallavi Sudhir2020-04-14T15:39:00ZPolynomial interpolation and Vandermonde
https://thewindingnumber.blogspot.com/2020/04/polynomial-interpolation-and-vandermonde.html
0Suppose you want to find the minimum-degree polynomial $p(x) = \sum {a_jx^j}$ passing through some points $(x_i, y_i)$. This amounts to solving the system:<br /><br />$$\sum {a_j{x_i}^j}=y_i$$<br />A first bit of intuition: it seems completely reasonable that any set of $n+1$ points with $x_i$s distinct (so you're actually describing a function) can be interpolated with a polynomial of degree $n$. What this means is that the matrix $X_{ij}={x_i}^j$, called the <b>Vandermonde matrix</b>, should be square for it to be invertible. Indeed, it's easy to show by considering the degree of the determinant polynomial of the matrix that:<br /><br />$$\det X = \prod_{0\le i < j \le n} {(x_j-x_i)}$$<br /><br /><hr /><br />So the question is of course if there's a simple general expression for the inverse of the Vandermonde matrix.<br /><br />Here's an idea: if all the $y_i$s were zero, then a degree-$n+1$ (<b>NOT</b> degree-$n$! that is a different category of problem, which is not uniquely determined) polynomial through the points would be a constant times $(x-x_0)\dots(x-x_n) $. If we didn't want it to be zero at some particular $x_{i_0}$, we could exclude $x-x_{i_0}$ from the product. Then we can play with the constant so it takes the value we really want ($y_{i_0}$).<br /><br />In other words, the polynomial<br /><br />$$L_n^{i_0}(x) = \frac{1}{\prod_{i\ne {i_0}}(x_{i_0}-x_i)}\prod_{i\ne i_0}(x-x_i)$$<br />called the <b>Lagrange polynomial</b>, gives zeroes at all $x_{i}$ except $x_{i_0}$, where it gives 1. Therefore the polynomial:<br /><br />$$\sum_i{y_i L_n^i(x)}$$<br />which is conveniently of degree $n$, is the true interpolating polynomial.<br /><br /><hr /><br />Conveniently, this approach also tells you what to do when you get a new data point: just add a polynomial that is zero at the existing points and makes the right adjustment at the added data point. I.e. 
given $P_n$ is the degree-$n$ interpolating polynomial for $(x_0,y_0)\dots(x_n, y_n)$, we want to add $p_{n+1}$, the degree-$n+1$ polynomial that is zero at all these points but $y_{n+1}-P_n(x_{n+1})$ at $x_{n+1}$. I.e.<br /><br />$$P_{n+1}(x)=P_n(x)+\left(y_{n+1}-P_n(x_{n+1})\right)L_{n+1}^{n+1}(x)$$<br />Alternatively we can also see the added term as a polynomial proportional to $\prod_{i<n+1}{(x-x_i)}$, with the coefficient given by $\left(y_{n+1}-P_n(x_{n+1})\right)/\prod_{i<n+1} {(x_{n+1}-x_i)}$.<br /><br />It is easy to show that this coefficient, which we will write as $f[x_0,\dots x_{n+1}]$, can be written recursively as:<br /><br />$$f[x_0,\dots x_n]=\frac{f[x_1,\dots x_n] - f[x_0,\dots x_{n-1}] }{x_n-x_0}$$<br />This is known as <b>Newton's divided difference</b>, and is the coefficient on $x^n$ in the interpolating polynomial -- one may observe that it is a discrete analog of the higher-order derivative, scaled as $f^{(n)}/n!$ (note the denominators) -- which should feel perfectly natural here.calculusderivativesdivided differenceinterpolationlinear algebramathematicspolynomial interpolationpolynomial regressionregressionvandermonde matrixMon, 13 Apr 2020 23:11:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-2324199750697846866Abhimanyu Pallavi Sudhir2020-04-13T23:11:00ZFancy polynomials
https://thewindingnumber.blogspot.com/2020/04/fancy-polynomials.html
0<b>Orthogonal polynomials</b><br /><b><br /></b>Given an inner product on functions of the form $\langle f, g \rangle = \int_{\mathbb{R}} w(x)f(x)g(x)\, dx$, we can consider orthogonal polynomials, i.e. families satisfying $\langle p_m, p_n\rangle = 0$ for $m\ne n$. A question is if we can generate sequences $p_n$ of these polynomials of consecutive degree -- such a set would then be a basis for polynomials up to that degree (do you see why?).<br /><br />What we want is to pin down what $p_n$ must be given $p_1,\dots p_{n-1}$. What this means is trying to express the "non-leading part" of $p_n$ in terms of these former terms. We can access a "non-leading part" of the polynomial by normalizing the polynomials to be monic and considering the degree-$n-1$ polynomial $p_n-xp_{n-1}$. What are the components of this guy in the basis of $p_1,\dots p_{n-1}$? Well, writing<br /><br />\[{p_n} - x{p_{n - 1}} = \sum\limits_{m < n} {{a_m}{p_m}} \]<br />Then for all $m<n$,<br /><br />\[\begin{align}<br /> {a_m} &= \frac{{\left\langle {{p_n} - x{p_{n - 1}},{p_m}} \right\rangle }}{{\left\langle {{p_m},{p_m}} \right\rangle }} \\<br /> &= - \frac{{\left\langle {x{p_{n - 1}},{p_m}} \right\rangle }}{{\left\langle {{p_m},{p_m}} \right\rangle }} \\<br /> &= - \frac{{\left\langle {{p_{n - 1}},x{p_m}} \right\rangle }}{{\left\langle {{p_m},{p_m}} \right\rangle }} \\<br />\end{align} \]<br />Now, $xp_m$ has degree $m+1$. So if $m+1<n-1$, ${\left\langle {{p_{n - 1}},x{p_m}} \right\rangle }=0$. So the only $m$s that we need to bother about are $n-2$ and $n-1$. 
Therefore:<br /><br />\[{p_n} - x{p_{n - 1}} = {a_{n - 2}}{p_{n - 2}} + {a_{n - 1}}{p_{n - 1}}\]<br />So:<br /><br />\[\begin{align}<br /> {p_n} &= - \frac{{\left\langle {{p_{n - 1}},x{p_{n - 2}}} \right\rangle }}{{\left\langle {{p_{n - 2}},{p_{n - 2}}} \right\rangle }}{p_{n - 2}} + \left[ {x - \frac{{\left\langle {{p_{n - 1}},x{p_{n - 1}}} \right\rangle }}{{\left\langle {{p_{n - 1}},{p_{n - 1}}} \right\rangle }}} \right]{p_{n - 1}} \\<br /> &= - \frac{{\left\langle {{p_{n - 1}},x{p_{n - 2}} - {p_{n - 1}}} \right\rangle + \left\langle {{p_{n - 1}},{p_{n - 1}}} \right\rangle }}{{\left\langle {{p_{n - 2}},{p_{n - 2}}} \right\rangle }}{p_{n - 2}} + \left[ {x - \frac{{\left\langle {{p_{n - 1}},x{p_{n - 1}}} \right\rangle }}{{\left\langle {{p_{n - 1}},{p_{n - 1}}} \right\rangle }}} \right]{p_{n - 1}} \\<br /> &= - \frac{{\left\langle {{p_{n - 1}},{p_{n - 1}}} \right\rangle }}{{\left\langle {{p_{n - 2}},{p_{n - 2}}} \right\rangle }}{p_{n - 2}} + \left[ {x - \frac{{\left\langle {{p_{n - 1}},x{p_{n - 1}}} \right\rangle }}{{\left\langle {{p_{n - 1}},{p_{n - 1}}} \right\rangle }}} \right]{p_{n - 1}}<br />\end{align} \]<br /><br />Examples:<br /><ul><li><b>Legendre polynomials:</b> $w$ is the indicator for $[-1,1]$. Sequence: $1, x, x^2-\frac13, x^3-\frac35x,\dots$ </li><li><b>Chebyshev polynomials:</b> $w$ is $(1-x^2)^{-1/2}$ on $[-1,1]$. Sequence: $\cos(n\arccos x)$ (not monic -- divide by the leading coefficient $2^{n-1}$ for $n\ge1$)</li><li><b>Laguerre polynomials:</b> $w$ is $e^{-x}$ on $\mathbb{R}^{\ge 0}$. Sequence: $1, x-1, x^2-4x+2, x^3-9x^2+18x-6$</li><li><b>Hermite polynomials:</b> $w$ is $e^{-x^2}$ everywhere. Sequence: $1, x, x^2-\frac12, x^3-\frac32 x$</li></ul><div><br /></div>linear algebramathematicsorthogonal polynomialsSun, 12 Apr 2020 22:45:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-3913379979938045071Abhimanyu Pallavi Sudhir2020-04-12T22:45:00ZNumerical linear algebra
https://thewindingnumber.blogspot.com/2020/04/numerical-linear-algebra.html
0<b>Some decompositions</b><br /><b><br /></b>The algorithms for the QR and LU decompositions are <a href="https://thewindingnumber.blogspot.com/2020/04/triangular-matrices-schur-qr-cholesky.html">fairly self-explanatory from their definition</a>.<br /><br />The Cholesky decomposition can be computed for a nonnegative-definite matrix through a simultaneous LU decomposition from both sides. Starting with a matrix in the block form:<br /><br />\[A = \left[ {\begin{array}{*{20}{c}}<br /> {{a_1}}&{{v^T}} \\<br /> v&{{A_2}}<br />\end{array}} \right]\]<br />We perform an LU decomposition first on the columns, from the left, then on the rows, from the right:<br /><br />\[\begin{gathered}<br /> \left[ {\begin{array}{*{20}{c}}<br /> {{a_1}}&{{v^T}} \\<br /> v&{{A_2}}<br />\end{array}} \right] = \left[ {\begin{array}{*{20}{c}}<br /> {\sqrt {{a_1}} }&0 \\<br /> {v/\sqrt {{a_1}} }&1<br />\end{array}} \right]\left[ {\begin{array}{*{20}{c}}<br /> {\sqrt {{a_1}} }&{{v^T}/\sqrt {{a_1}} } \\<br /> 0&{{A_2} - v{v^T}/{a_1}}<br />\end{array}} \right] \\<br /> = \left[ {\begin{array}{*{20}{c}}<br /> {\sqrt {{a_1}} }&0 \\<br /> {v/\sqrt {{a_1}} }&1<br />\end{array}} \right]\left[ {\begin{array}{*{20}{c}}<br /> 1&0 \\<br /> 0&{{A_2} - v{v^T}/{a_1}}<br />\end{array}} \right]\left[ {\begin{array}{*{20}{c}}<br /> {\sqrt {{a_1}} }&{{v^T}/\sqrt {{a_1}} } \\<br /> 0&1<br />\end{array}} \right] \\<br />\end{gathered} \]<br />The process can then be continued to give an LU decomposition where $L$ and $U$ are transposes of each other -- thus the Cholesky decomposition.<br /><br />(Why does the matrix have to be nonnegative-definite?)<br /><br /><b>Eigenvalue algorithm</b><br /><b><br /></b> <b>A lot of factorisations are fundamentally related to the "eigen"-stuff</b>: this includes the change-of-basis factorizations: diagonalization, SVD, Schur -- and also the polar decomposition. 
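(An aside before the eigenvalue algorithms: the Cholesky block recursion above -- peel off the first row and column, then recurse on the Schur complement $A_2 - vv^T/a_1$ -- can be sketched as follows, assuming numpy; the $2\times 2$ test matrix is made up for illustration.)

```python
import numpy as np

def cholesky(A):
    """Lower-triangular L with A = L L^T, via the block recursion:
    peel off the first row/column, then recurse on A2 - v v^T / a1."""
    A = np.array(A, dtype=float)
    n = len(A)
    L = np.zeros((n, n))
    for i in range(n):
        a1 = A[0, 0]           # top-left entry of the current block
        v = A[1:, 0]           # rest of the first column
        L[i, i] = np.sqrt(a1)
        L[i + 1:, i] = v / np.sqrt(a1)
        A = A[1:, 1:] - np.outer(v, v) / a1  # Schur complement
    return L

M = np.array([[4.0, 2.0], [2.0, 5.0]])
L = cholesky(M)  # L @ L.T recovers M
```

Note that each step divides by $\sqrt{a_1}$, which is where positive-definiteness is needed.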
Eigenstuff is fundamentally analytic (whatever that means), so terminating algorithms won't work -- we need an iterative, asymptotic algorithm.<br /><br />A simple such algorithm to calculate the eigenvectors and eigenvalues is <b>power iteration</b>. It is related to the flows of systems of linear differential equations, in which the eigenspace with the largest eigenvalue provides the only stable equilibrium of the system. Unless you choose an initial vector $v$ that happens to be orthogonal to the principal eigenspace of $A$, the normalized iterates $A^nv/\left\|A^nv\right\|$ will approach the direction of the projection of $v$ onto the principal eigenspace of $A$.<br /><br />So that gives you the largest eigenvalue. There are many ways to get the remaining eigenvalues -- one way I thought of was to subtract off the part corresponding to the stretching of the primary eigenvector, but this is incredibly inefficient and numerically unstable.<br /><br />An actually sensible approach is as follows: consider the matrix $(A-\mu I)^{-1}$, which has the same eigenspaces as $A$, with eigenvalues $1/(\lambda - \mu)$. Its largest eigenvalue in magnitude corresponds to the eigenvalue of $A$ closest to $\mu$. By varying $\mu$ across the real line, one can discover all the real eigenvalues of $A$.<br /><br /><b>Schur algorithm</b><br /><b><br /></b> You can use the eigenvalue algorithm to calculate the Schur decomposition directly via its definition -- however, there is a more efficient method.<br /><br />A triangular matrix is invariant under conjugation by the $Q$ in its own $QR$ decomposition (because the $Q$ is the identity matrix). So similar to power iteration, one may iteratively QR-factorize $A$ and replace it with $RQ$. This is called the <b>QR algorithm</b>.<br /><br />This is expensive in its crude form, so you instead first bring $A$ into a form called "upper Hessenberg form" (upper-triangular but with a subdiagonal), which makes each QR step less expensive. 
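A minimal sketch of the crude (unshifted) QR iteration described above, assuming numpy; the symmetric test matrix, built to have well-separated eigenvalues, is made up for illustration:

```python
import numpy as np

# Build a symmetric test matrix with known, well-separated eigenvalues,
# so the unshifted iteration converges quickly.
rng = np.random.default_rng(0)
Q0, _ = np.linalg.qr(rng.standard_normal((4, 4)))
A = Q0 @ np.diag([4.0, 2.0, 1.0, 0.5]) @ Q0.T

T = A.copy()
for _ in range(200):
    Q, R = np.linalg.qr(T)
    T = R @ Q  # similarity transform T -> Q^T T Q, eigenvalues preserved

# The diagonal of T converges to the eigenvalues of A (here, since A is
# symmetric, T converges all the way to a diagonal matrix).
```

Each step replaces $T$ by $Q^TTQ$, so the iterates stay similar to $A$; convergence of the off-diagonal entries is governed by the ratios of consecutive eigenvalue magnitudes.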
The mechanism to bring a matrix into upper Hessenberg form is based on Householder reflections: you perform Householder reflections on the parts of the columns below the subdiagonal, so that the corresponding Householder reflections on the rows (needed because the Schur is a change-of-basis transformation) do not interfere with the column of interest and screw up all your zeroes. It's not very interesting.<br /><br /><b>Least-squares</b><br /><b><br /></b>Suppose you wanted to solve $Ax\approx b$, i.e. find the $x$ that minimizes $\|Ax-b\|$. This is relevant when $b$ need not lie in the image of $A$.<br /><br />The idea is that $\|Ax-b\|$ is minimised when $Ax$ is the projection $b_A$ of $b$ onto the image of $A$. One can equivalently write $A^{T}(Ax-b)=0$, as $Ax-b$ is perpendicular to the column space of $A$. Thus it suffices to solve $A^TAx=A^Tb$.<br /><br />If $A$ has full column rank (this does <i>not</i> mean surjective), we actually get a unique $x$, given by $x=(A^TA)^{-1}A^Tb$, and $A^+=(A^TA)^{-1}A^T$ is called the "<b>Moore-Penrose pseudoinverse</b>" of $A$.<br /><br />An alternative, more efficient algorithm is to use the QR factorization (the one with the non-square $Q$) to construct the projector as $QQ^T$ (which works, as $Q$ definitionally provides an orthonormal basis for the image, and $QQ^T$ satisfies the defining property of a Hermitian projector). Then one can solve $QRx=QQ^Tb$, or equivalently $Rx=Q^Tb$. In the full-rank case, $R^{-1}Q^T$ is the Moore-Penrose pseudoinverse.<br /><br />To add:<br /><ul><li>SVD algorithm</li><li>Netflix algorithm (recommender systems) </li><li>Algorithm complexity for each thing</li></ul>least-squareslinear algebramatrix decompositionnumerical linear algebrapagerankpseudo-inverseregressionstatisticsSat, 11 Apr 2020 00:11:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-4149272643213358851Abhimanyu Pallavi Sudhir2020-04-11T00:11:00ZTriangular matrices: Schur, QR, Cholesky and LU/LPU decompositions
https://thewindingnumber.blogspot.com/2020/04/triangular-matrices-schur-qr-cholesky.html
0Upper-triangular matrices seem like they should be something interesting. They seem to be far more "controllable" in their behavior than general matrices -- the first dimension is just stretched, and each successive dimension's behavior does not depend on the higher dimensions. And there are of course practical computational reasons why they're simpler in a certain sense.<br /><br />It's sensible to wonder if a general matrix can be understood in this way -- yes, yes, of course you can with the <a href="https://thewindingnumber.blogspot.com/2019/03/invariant-and-generalised-eigenspaces.html">Jordan normal form</a>, but the question is if there's something more interesting to be said.<br /><br />Some properties of triangular matrices:<br /><ul><li><b>The determinant of a triangular matrix is the product of its diagonal values:</b> this is a generalization of the fact about the area of a parallelogram being its base times perpendicular height.</li><li><b>The eigenvalues of a triangular matrix are its diagonal values, including (algebraic) multiplicity:</b> Follows from the above since $A-\lambda I$ remains triangular. </li><li><b>Normal triangular matrices are diagonal:</b> The $k+1$th eigenvector must be in the orthogonal complement of $\mathbb{R}^k$ in $\mathbb{R}^{k+1}$, which is one-dimensional and just the $x_{k+1}$-axis. </li></ul>I tend to think that all linear transformations already "geometrically" <i>look</i> like triangular matrices, though -- like you should be able to tilt your head some way so that the transformation looks triangular. What this means is that we believe all linear transformations are unitarily triangularizable.<br /><br />Is this so?<br /><br />We can certainly identify the first vector in the change-of-basis matrix -- an actual eigenvector of the matrix. This corresponds to the eigenvalue represented by the top-left element of the triangular matrix.<br /><br />Now we want a vector whose image is a linear combination of this eigenvector and itself. 
Here's an idea: take the projection of the matrix onto the orthogonal complement of the first eigenvector. Then this projection itself has an eigenvector, and the action of the entire matrix on this vector is the vector scaled, plus some component in the direction orthogonal to the subspace, i.e. proportional to the first eigenvector. We can repeat this process to obtain the <b>Schur decomposition</b> of $A$:<br /><br />$$A=UTU^{*}$$<br />For unitary $U$, triangular $T$.<br /><br />(By the way, a sequence of embedded subspaces of consecutively increasing dimension is known as a <b>flag</b>. For this reason, the Schur decomposition is said to say that every linear transformation <b>stabilizes a flag</b>.)<br /><br /><hr /><br />The Gram-Schmidt process might remind you of triangular matrices in the nature of the transformations applied.<br /><br />$$\begin{array}{*{20}{c}}<br /> {{v_1} = {x_1}}&{{u_1} = \frac{{{v_1}}}{{\left| {{v_1}} \right|}}} \\<br /> {{v_2} = {x_2} - ({x_2} \cdot {u_1}){u_1}}&{{u_2} = \frac{{{v_2}}}{{\left| {{v_2}} \right|}}} \\<br /> \vdots & \vdots \\<br /> {{v_k} = {x_k} - \sum\limits_{i < k} {({x_k} \cdot {u_i}){u_i}} }&{{u_k} = \frac{{{v_k}}}{{\left| {{v_k}} \right|}}} \\<br /> \vdots & \vdots<br />\end{array}$$<br />Indeed, one may check that this is equivalent to writing $A=QR$ for unitary $Q$, triangular $R$. This factorization is called the <b>QR-factorization</b>.<br /><br />But there's another way to think about this, right? 
What $A=QR$ says is that the transformation $A$ arises from first performing a triangular transformation, then rotating it into the desired orientation.<br /><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-kflcfz4jMWk/XoxRSW-9MKI/AAAAAAAAGCo/_-kPtRAts60IHJQB_ZQ8U6vHd_MTlf6GgCLcBGAsYHQ/s1600/qr.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="662" data-original-width="1542" height="137" src="https://1.bp.blogspot.com/-kflcfz4jMWk/XoxRSW-9MKI/AAAAAAAAGCo/_-kPtRAts60IHJQB_ZQ8U6vHd_MTlf6GgCLcBGAsYHQ/s320/qr.png" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Imagine this in three dimensions, which I couldn't be bothered to draw.</td></tr></tbody></table>And one can easily check that the algorithm for doing so is precisely the Gram-Schmidt algorithm.<br /><br />But there's yet another way to <i>see </i>this factorization. Instead of just calculating the orthonormal vectors at each step, one could actually try to transform our column vectors into those of the triangular matrix through a bunch of orthogonal transformations.<br /><br />One specific way of doing this is through reflections, specifically reflections in $n-1$-dimensional planes, known as <b>Householder reflections</b>.<br /><ol><li>The first reflection is in the plane midway between $a_1$ and $e_1$, and brings $a_1$ to the $e_1$ axis.</li><li>The next reflection is in the plane that contains the $e_1$ axis and is midway between $a_2$ and the $(e_1,e_2)$ plane, and brings $a_2$ onto the $(e_1,e_2)$ plane.</li><li>The next reflection is in the plane that contains the $(e_1,e_2)$ plane and is midway between $a_3$ and the $(e_1,e_2,e_3)$ plane, and brings $a_3$ onto the $(e_1, e_2, e_3)$ plane.</li><li>...</li></ol><div>I.e. 
in step $i$ (between 1 and $n$), all the vectors of $A$ are reflected in the plane that contains the $(e_1,\dots e_{i-1})$ plane and is midway between $a_i$ and the $(e_1,\dots e_{i})$ plane. The vector is then never reflected again, as it is in all the future planes of reflection.</div><div><br /></div><div>The reflection matrix $Q_i$ can be seen to be of the form (do you see why?):<br /><br />$${Q_i} = \left[ {\begin{array}{*{20}{c}}<br /> I&0 \\<br /> 0&{{F_i}}<br />\end{array}} \right]$$<br />Where $F_i$ is $(n-i+1)$-dimensional.<br /><br />Here's how we can calculate $F_i$: we know that $F_i$ transforms the last $n-i+1$ elements of $a_i$ like this:<br /><br />\[{a^{i:n}_i} \mapsto \left[ {\begin{array}{*{20}{c}}<br /> {\left| {{a^{i:n}_i}} \right|} \\<br /> 0 \\<br /> \vdots \\<br /> 0<br />\end{array}} \right]\]<br />Although this generally isn't enough to pin down a transformation, we already know that the transformation is a reflection in an $n-1$-dimensional plane. We know that the reflection matrix can be given as $I-2hh^T$ where $h$ is the unit normal to the plane of reflection. We can calculate this normal (up to normalization) as $F_ia^{i:n}_i-a^{i:n}_i$.<br /><br /><b>The QR decomposition is basically just another "all matrices are basically ____" theorem -- similar to the polar decomposition "all matrices are basically positive-definite" -- they're just a rotation away.</b><br /><br />In particular, this gives us a new square root of a positive-definite matrix $M=A^TA$ -- much like the polar decomposition gives a unique positive-definite square root of a positive-definite matrix, the QR decomposition gives a unique triangular square root of a positive-definite matrix $M=R^TR$.<br /><b><br /></b><b>This is known as the Cholesky decomposition.</b><br /><br />(BTW, we can QR-decompose an $m$ by $n$ matrix ($m\ge n$) too. 
There are two ways to do this<br /><br /><ul><li>Thinking in the Gram-Schmidt sense: we're transforming into an orthonormal basis for the image of $A$, so $Q$ is an $m$ by $n$ matrix of orthonormal vectors and $R$ is an $n$ by $n$ matrix.</li><li>Thinking in the Householder sense, we're obliged to end up with an $m$ by $m$ $Q$ and an $m$ by $n$ $R$ (with the bottom extra rows being zeroes). Then the extra columns of $Q$ are just some arbitrary basis for the orthogonal complement of the image of $A$ (i.e. the left null space of $A$).) </li></ul><br /><hr /><br />We've all solved linear equations $Ax=b$ through elimination, but let's get a consistent general algorithm (Gaussian elimination) to do so.<br /><br />The idea is to get $A$ down to a triangular form, isn't it? And we can do that by zeroing each column one-by-one. That's <b>LU decomposition</b>. But sometimes there's a zero where you need a pivot -- you can't eliminate with it, since the multiplier would be infinite -- so just swap two rows ("pivoting") and continue. That's <b>LPU decomposition</b>, where $P$ is a permutation matrix (the $P$ can come before or after the $LU$, too -- the fact that you can factor out the permutation matrix this way is left as an exercise). </div>gaussian eliminationgram-schmidtmatrix decompositiontriangular matrixWed, 08 Apr 2020 19:17:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-3761381537875803238Abhimanyu Pallavi Sudhir2020-04-08T19:17:00ZComment by Abhimanyu Pallavi Sudhir on Doesn't the Many Worlds interpretation of Quantum Mechanics rail to remove randomness?
https://physics.stackexchange.com/questions/536050/doesnt-the-many-worlds-interpretation-of-quantum-mechanics-rail-to-remove-rando/536075#536075
But back to your answer: the question asks if the observations of an individual observer are probabilistic. You say "No, because the observer himself splits into multiple observers". This is a metaphysical spin on what counts as an observer, which does not change the fundamental fact that all your observations remain probabilistic, i.e. you cannot determine what exactly you will see on your apparatus.Mon, 16 Mar 2020 10:52:05 GMThttps://physics.stackexchange.com/questions/536050/doesnt-the-many-worlds-interpretation-of-quantum-mechanics-rail-to-remove-rando/536075?cid=1214528#536075Abhimanyu Pallavi Sudhir2020-03-16T10:52:05ZComment by Abhimanyu Pallavi Sudhir on Doesn't the Many Worlds interpretation of Quantum Mechanics rail to remove randomness?
https://physics.stackexchange.com/questions/536050/doesnt-the-many-worlds-interpretation-of-quantum-mechanics-rail-to-remove-rando/536075#536075
Quantum interpretations themselves are metaphysics. My point is that QM can accommodate any number of observers regardless of your "interpretation".Mon, 16 Mar 2020 10:48:57 GMThttps://physics.stackexchange.com/questions/536050/doesnt-the-many-worlds-interpretation-of-quantum-mechanics-rail-to-remove-rando/536075?cid=1214526#536075Abhimanyu Pallavi Sudhir2020-03-16T10:48:57ZComment by Abhimanyu Pallavi Sudhir on Doesn't the Many Worlds interpretation of Quantum Mechanics rail to remove randomness?
https://physics.stackexchange.com/questions/536050/doesnt-the-many-worlds-interpretation-of-quantum-mechanics-rail-to-remove-rando/536075#536075
QM can have one observer or multiple observers, without changing the physics. This doesn't change when you add a metaphysical interpretation to it.Sat, 14 Mar 2020 09:56:15 GMThttps://physics.stackexchange.com/questions/536050/doesnt-the-many-worlds-interpretation-of-quantum-mechanics-rail-to-remove-rando/536075?cid=1213800#536075Abhimanyu Pallavi Sudhir2020-03-14T09:56:15ZComment by Abhimanyu Pallavi Sudhir on Doesn't the Many Worlds interpretation of Quantum Mechanics rail to remove randomness?
https://physics.stackexchange.com/questions/536050/doesnt-the-many-worlds-interpretation-of-quantum-mechanics-rail-to-remove-rando/536075#536075
This is just some metaphysical comment that has nothing to do with physics. As far as physics is concerned, the individual observer's observations are random.Sat, 14 Mar 2020 00:33:28 GMThttps://physics.stackexchange.com/questions/536050/doesnt-the-many-worlds-interpretation-of-quantum-mechanics-rail-to-remove-rando/536075?cid=1213717#536075Abhimanyu Pallavi Sudhir2020-03-14T00:33:28ZComment by Abhimanyu Pallavi Sudhir on Doesn't the Many Worlds interpretation of Quantum Mechanics rail to remove randomness?
https://physics.stackexchange.com/questions/536050/doesnt-the-many-worlds-interpretation-of-quantum-mechanics-rail-to-remove-rando
@user250486 Yes. That's why I wrote "yes". What I'm saying is that randomness was never a problem to begin with.Sat, 14 Mar 2020 00:31:14 GMThttps://physics.stackexchange.com/questions/536050/doesnt-the-many-worlds-interpretation-of-quantum-mechanics-rail-to-remove-rando?cid=1213715Abhimanyu Pallavi Sudhir2020-03-14T00:31:14ZComment by Abhimanyu Pallavi Sudhir on Doesn't the Many Worlds interpretation of Quantum Mechanics rail to remove randomness?
https://physics.stackexchange.com/questions/536050/doesnt-the-many-worlds-interpretation-of-quantum-mechanics-rail-to-remove-rando
Yes, and randomness isn't a "problem".Thu, 12 Mar 2020 23:10:19 GMThttps://physics.stackexchange.com/questions/536050/doesnt-the-many-worlds-interpretation-of-quantum-mechanics-rail-to-remove-rando?cid=1213249Abhimanyu Pallavi Sudhir2020-03-12T23:10:19ZComment by Abhimanyu Pallavi Sudhir on Given a Factorial number , find the next Factorial number
https://math.stackexchange.com/questions/3573442/given-a-factorial-number-find-the-next-factorial-number
This just amounts to finding n if you know n!. See e.g. <a href="https://www.quora.com/Given-the-value-of-n-factorial-what-is-an-algorithm-to-find-n-given-that-n-is-at-most-1-million-digits-long" rel="nofollow noreferrer">quora</a> for some efficient algorithms. Or search for "inverse factorial function".Sun, 08 Mar 2020 09:03:14 GMThttps://math.stackexchange.com/questions/3573442/given-a-factorial-number-find-the-next-factorial-number?cid=7347460Abhimanyu Pallavi Sudhir2020-03-08T09:03:14ZComment by Abhimanyu Pallavi Sudhir on How to prove the Laurent series converges to the right thing?
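A minimal Python sketch of the naive divide-down inverse factorial this comment alludes to (the function names are illustrative, not taken from the linked thread):

```python
def inverse_factorial(m):
    """Return n such that n! == m, or None if m is not a factorial.

    Divides m by 2, 3, 4, ... until the quotient reaches 1 --
    the simple O(n)-divisions approach.
    """
    if m == 1:
        return 1  # both 0! and 1! equal 1
    n = 1
    while m > 1:
        n += 1
        if m % n != 0:
            return None  # m is not a factorial
        m //= n
    return n

def next_factorial(m):
    """Given m = n!, return (n+1)! -- the question's task.
    Assumes m really is a factorial."""
    n = inverse_factorial(m)
    return m * (n + 1)
```

For inputs with up to a million digits one would want the faster algorithms from the linked thread; this version just makes the "find n from n!" reduction concrete.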
https://math.stackexchange.com/questions/3571615/how-to-prove-the-laurent-series-converges-to-the-right-thing
@MartinR Thanks, that actually answers my question. I guess the best way to think of the proof in the answer you linked, and really even stuff like the uniqueness of Laurent series, is to transform it into a Fourier series.Sat, 07 Mar 2020 14:21:26 GMThttps://math.stackexchange.com/questions/3571615/how-to-prove-the-laurent-series-converges-to-the-right-thing?cid=7345808Abhimanyu Pallavi Sudhir2020-03-07T14:21:26ZComment by Abhimanyu Pallavi Sudhir on Proof of Laurent series co-efficients in Complex Residue
https://math.stackexchange.com/questions/1126321/proof-of-laurent-series-co-efficients-in-complex-residue/1200502#1200502
In the statement of Laurent's theorem, the $z$ in the definition of coefficients should be $z_0$? And in equation (1), the integration should be wrt $w$, not $z$.Sat, 07 Mar 2020 11:32:00 GMThttps://math.stackexchange.com/questions/1126321/proof-of-laurent-series-co-efficients-in-complex-residue/1200502?cid=7345553#1200502Abhimanyu Pallavi Sudhir2020-03-07T11:32:00ZComment by Abhimanyu Pallavi Sudhir on How to prove the Laurent series converges to the right thing?
https://math.stackexchange.com/questions/3571615/how-to-prove-the-laurent-series-converges-to-the-right-thing
@JohnColtraneisJC I certainly believe it's holomorphic. It's just "power series is differentiable within its radius of convergence", isn't it? But I don't see how that helps.Fri, 06 Mar 2020 15:16:43 GMThttps://math.stackexchange.com/questions/3571615/how-to-prove-the-laurent-series-converges-to-the-right-thing?cid=7343869Abhimanyu Pallavi Sudhir2020-03-06T15:16:43ZHow to prove the Laurent series converges to the right thing?
https://math.stackexchange.com/questions/3571615/how-to-prove-the-laurent-series-converges-to-the-right-thing
0<p>From what I understand, the main "point" of the Laurent series is that we should be able to derive it easily (e.g. by stitching together known Taylor series), and then exploit its uniqueness to say that these coefficients are the same as <span class="math-container">$c_n=\frac{1}{2\pi i}\oint_\circ \frac{f(z)}{(z-a)^{n+1}} dz$</span>, from which we can calculate e.g. the residue <span class="math-container">$c_{-1}$</span> (and hence the contour integral <span class="math-container">$2\pi i\,c_{-1}$</span>).</p>
<p>But to actually do this, we need one crucial fact: the series <span class="math-container">$\sum_{n\in\mathbb{Z}}c_n(z-a)^n$</span> given by <span class="math-container">$c_n=\frac{1}{2\pi i}\oint_\circ \frac{f(z)}{(z-a)^{n+1}} dz$</span> <strong>is actually a valid Laurent series</strong> for <span class="math-container">$f(z)$</span>, i.e. where it converges, it converges to <span class="math-container">$f(z)$</span>.</p>
<p>How does one prove this? It feels like it <em>should</em> follow easily from Cauchy's integral formula, but my brain doesn't seem to be working sensibly at the moment.</p>complex-analysisresidue-calculuslaurent-seriescauchy-integral-formulasingularityFri, 06 Mar 2020 15:09:15 GMThttps://math.stackexchange.com/q/3571615Abhimanyu Pallavi Sudhir2020-03-06T15:09:15ZComment by Abhimanyu Pallavi Sudhir on What is contour integration
https://math.stackexchange.com/questions/446724/what-is-contour-integration/1983989#1983989
I'm saying that language doesn't really add anything -- your definition of "circle" is just a function whose integral is a circle, but this is just a trivial restatement of the problem, and you haven't really justified what it is about analytic functions that does this.Sun, 01 Mar 2020 11:16:34 GMThttps://math.stackexchange.com/questions/446724/what-is-contour-integration/1983989?cid=7330823#1983989Abhimanyu Pallavi Sudhir2020-03-01T11:16:34ZComment by Abhimanyu Pallavi Sudhir on Line integration in complex analysis
https://math.stackexchange.com/questions/110334/line-integration-in-complex-analysis/914273#914273
I don't see how this helps. This just describes in word the standard parametric calculation of the value of the integral, not what it represents.Sat, 29 Feb 2020 00:06:11 GMThttps://math.stackexchange.com/questions/110334/line-integration-in-complex-analysis/914273?cid=7327916#914273Abhimanyu Pallavi Sudhir2020-02-29T00:06:11ZComment by Abhimanyu Pallavi Sudhir on What is contour integration
https://math.stackexchange.com/questions/446724/what-is-contour-integration/1983989#1983989
Upvoted for the point about antiderivatives with branch cuts, but "analytic functions take circles to circles" is not at all a relevant statement. All continuous functions keep circles connected -- the point is that the displacement described by the analytic function as its gradient field is a circle, which I don't think you've really intuited in this answer.Sat, 29 Feb 2020 00:01:41 GMThttps://math.stackexchange.com/questions/446724/what-is-contour-integration/1983989?cid=7327909#1983989Abhimanyu Pallavi Sudhir2020-02-29T00:01:41ZComment by Abhimanyu Pallavi Sudhir on Geometrical Interpretation of the Cauchy-Goursat Theorem?
https://math.stackexchange.com/questions/1026181/geometrical-interpretation-of-the-cauchy-goursat-theorem/1026335#1026335
I think the main question is why the integral is path-independent in the first place, which your intuition does not answer.Thu, 27 Feb 2020 01:44:41 GMThttps://math.stackexchange.com/questions/1026181/geometrical-interpretation-of-the-cauchy-goursat-theorem/1026335?cid=7323368#1026335Abhimanyu Pallavi Sudhir2020-02-27T01:44:41ZComment by Abhimanyu Pallavi Sudhir on Explanation Of Cauchy's Integral Theorem
https://math.stackexchange.com/questions/2182077/explanation-of-cauchys-integral-theorem/2182235#2182235
This is a bad answer. Having a gradient is not the same as being a gradient. You just gave the intuition for the fundamental theorem of calculus in complex analysis -- iff a function has an antiderivative, its integral on a closed curve is zero. But the essence of the Cauchy Integral Theorem is that if a function has a derivative, its integral on a closed curve is zero, i.e. differentiability implies integrability (at least on a simply-connected domain).Wed, 26 Feb 2020 18:58:48 GMThttps://math.stackexchange.com/questions/2182077/explanation-of-cauchys-integral-theorem/2182235?cid=7322652#2182235Abhimanyu Pallavi Sudhir2020-02-26T18:58:48ZComment by Abhimanyu Pallavi Sudhir on Easy examples of dual objects in category theory
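The distinction drawn in this comment can be illustrated numerically: an integrand that is differentiable everywhere inside the contour integrates to zero around it, while 1/z, which has a derivative away from 0 but no single-valued antiderivative on the punctured plane, integrates to 2*pi*i. A rough Python sketch using a Riemann sum over the unit circle (illustrative only):

```python
import cmath

def contour_integral(f, n=20000):
    """Riemann-sum approximation of the integral of f over the unit circle,
    parameterised as z(t) = exp(i t), dz = i exp(i t) dt, t in [0, 2*pi)."""
    dt = 2 * cmath.pi / n
    total = 0j
    for k in range(n):
        z = cmath.exp(1j * k * dt)
        total += f(z) * 1j * z * dt
    return total

# Differentiable everywhere inside the contour: the integral vanishes.
holo = contour_integral(lambda z: z ** 2)

# 1/z: differentiable on the punctured plane, but no single-valued
# antiderivative there -- the integral picks up 2*pi*i.
pole = contour_integral(lambda z: 1 / z)
```

Because the integrands are smooth and periodic in the parameter, the left-endpoint sum is essentially exact here.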
https://math.stackexchange.com/questions/1273604/easy-examples-of-dual-objects-in-category-theory/1274125#1274125
@JavierArias You asked for dual object in your question.Mon, 10 Feb 2020 15:48:51 GMThttps://math.stackexchange.com/questions/1273604/easy-examples-of-dual-objects-in-category-theory/1274125?cid=7283224#1274125Abhimanyu Pallavi Sudhir2020-02-10T15:48:51ZIs the image a universal object?
https://math.stackexchange.com/questions/3519737/is-the-image-a-universal-object
1<p>Given a function <span class="math-container">$f:X\to Y$</span> in category <span class="math-container">$\mathcal{C}$</span>, one can construct the image as a factorisation <span class="math-container">$f=(e:I\hookrightarrow Y)\circ(g:X\to I)$</span> that is universal (initial) among all such factorisations.</p>
<p>This does seem like a <a href="https://math.stackexchange.com/questions/3511678/trying-to-understand-the-definition-of-a-universal-property">universal property</a>. But I can't figure out how this can actually be constructed as an initial object in a comma category, because there are morphisms both from and to the object.</p>category-theoryuniversal-propertymonomorphismsThu, 23 Jan 2020 12:49:24 GMThttps://math.stackexchange.com/q/3519737Abhimanyu Pallavi Sudhir2020-01-23T12:49:24ZTrying to understand the definition of a universal property
https://math.stackexchange.com/questions/3511678/trying-to-understand-the-definition-of-a-universal-property
1<p><a href="https://en.wikipedia.org/wiki/Universal_property#Formal_definition" rel="nofollow noreferrer">Here</a>'s the definition of a universal property in Wikipedia:</p>
<blockquote>
<p>(where <span class="math-container">$U:D\to C$</span> is a functor and <span class="math-container">$X$</span> is an object in <span class="math-container">$C$</span>)</p>
<p>A <strong>terminal morphism</strong> from <span class="math-container">$U$</span> to <span class="math-container">$X$</span> is a final object in the category <span class="math-container">$(U\downarrow X)$</span> of morphisms from <span class="math-container">$U$</span> to <span class="math-container">$X$</span>, i.e. consists of a pair <span class="math-container">$(A,\Phi)$</span>
where <span class="math-container">$A$</span> is an object of <span class="math-container">$D$</span> and <span class="math-container">$\Phi: U(A) \to X$</span> is a morphism in <span class="math-container">$C$</span>, such
that the following <strong>terminal property</strong> is satisfied:</p>
<ul>
<li>Whenever <span class="math-container">$Y$</span> is an object of <span class="math-container">$D$</span> and <span class="math-container">$f: U(Y) \to X$</span> is a morphism in <span class="math-container">$C$</span>, then
there exists a unique morphism <span class="math-container">$g: Y \to A$</span> such that the following
diagram commutes:</li>
</ul>
<p><span class="math-container">$\ \ \ \ \ \ \ \ \ $</span><a href="https://upload.wikimedia.org/wikipedia/commons/thumb/4/49/UniversalProperty-04.svg/300px-UniversalProperty-04.svg.png" rel="nofollow noreferrer"><img src="https://upload.wikimedia.org/wikipedia/commons/thumb/4/49/UniversalProperty-04.svg/300px-UniversalProperty-04.svg.png" alt="enter image description here"></a></p>
</blockquote>
<p>So I'm trying to "unpack" this definition and figure out what each of the things here "means". E.g. what does it become in the case of a limit, or something.</p>
<ul>
<li>A limit is an example of a terminal morphism, right? And a colimit an initial morphism?</li>
<li>Does <span class="math-container">$U$</span> usually represent a diagram? In the case of a limit, does it represent the diagram we want to take the limit of?</li>
<li>Whatever is <span class="math-container">$X$</span>? I honestly have no clue here. What's the analog in the case of a limit?</li>
<li>What does a morphism from <span class="math-container">$U\to X$</span> even mean? What does it mean in the case of a limit? I've seen morphisms from a diagram to an object in <em>co</em>limits.</li>
<li>In the case of a limit, is <span class="math-container">$(U\downarrow X)$</span> the category of cones? But how can each cone be a morphism from <span class="math-container">$U$</span> to something (I thought it was a morphism from something to <span class="math-container">$U$</span>)?</li>
<li><span class="math-container">$A$</span> (or <span class="math-container">$U(A)$</span>) corresponds to the actual thing we construct, like the source of a limit or the target of a colimit? But what is <span class="math-container">$\Phi$</span>? In the construction of a limit, there's a morphism from the limit to the diagram, this seems wrong.</li>
</ul>
<p>My guess is that <span class="math-container">$X$</span> represents some sort of "subsetting" of the candidates for the object so you don't have to quantify over everything like you do with cones and the limit. Is that right? </p>
<hr>
<p><strong>Edit:</strong> So as a short summary -- it turns out <span class="math-container">$X$</span> represents (in the case of limits and colimits) the diagram we're trying to take the limit of, while <span class="math-container">$A$</span> represents the actual limit object (with its morphism <span class="math-container">$\Phi$</span>). <span class="math-container">$U$</span> is the diagonal functor, because the limit is constructed here as an object in the category of diagrams of shape at most that of <span class="math-container">$X$</span>.</p>category-theorylimits-colimitsuniversal-propertyslice-categoryThu, 16 Jan 2020 20:15:26 GMThttps://math.stackexchange.com/q/3511678Abhimanyu Pallavi Sudhir2020-01-16T20:15:26ZAnswer by Abhimanyu Pallavi Sudhir for Are these definitions of limits the same?
https://math.stackexchange.com/questions/3504151/are-these-definitions-of-limits-the-same/3504158#3504158
0<p>No. Consider <span class="math-container">$f(x)=\sin(1/x)$</span> with the origin added, near <span class="math-container">$x=0$</span>. </p>Fri, 10 Jan 2020 15:12:45 GMThttps://math.stackexchange.com/questions/3504151/-/3504158#3504158Abhimanyu Pallavi Sudhir2020-01-10T15:12:45ZAnswer by Abhimanyu Pallavi Sudhir for Why are inverse images more important than images in mathematics?
https://mathoverflow.net/questions/22658/why-are-inverse-images-more-important-than-images-in-mathematics/349457#349457
0<p>One often thinks of a homomorphism as something that "preserves the structure" of an object, but it is often better to think of it as something that "does not add new information to the object".</p>
<p>The most basic example is in <span class="math-container">$\mathbf{Set}$</span>. The "information" of a set is its cardinality. The defining feature of the morphisms here -- <em>functions</em> -- is that they do not "create new cardinality". A point cannot be mapped to multiple points.</p>
<p>Similarly in <span class="math-container">$\mathbf{Top}$</span>, the "information" of a topological space is the distinguishability of two points. This notion of distinguishability "includes" those used in the separation axioms (so e.g. a continuous map cannot take you from the indiscrete space to the discrete space), but is more general and vague -- the general idea is that two things touching make them "kinda indistinguishable", so you can't tear them apart.</p>
<p>This idea clearly has to do with inverse images -- we're saying that for things in the codomain, the information they carry must have already existed in their preimage. In fact, in the previous example, this notion of being "kinda indistinguishable" is best formalised in the language of open sets, and a continuous map <em>can't create new open sets</em>.</p>
<p>Perhaps the clearest example comes from the category of measurable spaces <span class="math-container">$\mathbf{Meas}$</span>. Here, the sigma fields really do represent information, and the definition of a measurable function (or <em>random variable</em>) is that it cannot talk about things that can't be measured. I.e. if a piece of apparatus just measures the number of heads in a coin-flip experiment, we can't have a random variable asking if the first toss was a head. Once again, this notion of "not adding new information" directly corresponds to preimages.</p>
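The coin-flip example above can be made concrete: for a finite sample space, a function is measurable with respect to the sigma-field generated by a partition exactly when it is constant on each block, i.e. when every preimage is a union of blocks. A small Python sketch (the helper names are made up for illustration):

```python
from itertools import product

# Sample space: outcomes of two coin tosses.
omega = [''.join(p) for p in product('HT', repeat=2)]  # HH, HT, TH, TT

def partition_by(f, space):
    """Partition `space` into the level sets of f (the blocks that
    generate the sigma-field of 'what f can distinguish')."""
    blocks = {}
    for w in space:
        blocks.setdefault(f(w), set()).add(w)
    return list(blocks.values())

def is_measurable(f, blocks):
    """On a finite space, f is measurable w.r.t. the sigma-field generated
    by `blocks` iff f is constant on each block."""
    return all(len({f(w) for w in b}) == 1 for b in blocks)

# The apparatus only reports the number of heads:
heads_count_blocks = partition_by(lambda w: w.count('H'), omega)

# "Was the first toss a head?" distinguishes HT from TH, which the
# apparatus cannot -- so it is not a legitimate random variable here.
first_is_head = lambda w: w[0] == 'H'
```

Here `{HT, TH}` is a single block, and `first_is_head` takes different values on it, so its preimages are not unions of blocks.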
<p>A bit more detail in my post <a href="https://thewindingnumber.blogspot.com/2019/10/sigma-fields-are-venn-diagrams.html" rel="nofollow noreferrer">here</a>.</p>Tue, 31 Dec 2019 17:50:15 GMThttps://mathoverflow.net/questions/22658/-/349457#349457Abhimanyu Pallavi Sudhir2019-12-31T17:50:15ZComment by Abhimanyu Pallavi Sudhir on Weird R behavior with indexing function arrays
https://stackoverflow.com/questions/59142271/weird-r-behavior-with-indexing-function-arrays
I see -- the idea is you just separately define the functional that produces the function from the existing function and call the functional in the loop. Makes sense.Mon, 02 Dec 2019 18:39:26 GMThttps://stackoverflow.com/questions/59142271/weird-r-behavior-with-indexing-function-arrays?cid=104515789Abhimanyu Pallavi Sudhir2019-12-02T18:39:26ZComment by Abhimanyu Pallavi Sudhir on Weird R behavior with indexing function arrays
https://stackoverflow.com/questions/59142271/weird-r-behavior-with-indexing-function-arrays
@Gregor I'm aware of functionals, but I'm not sure what they'd have to do here.Mon, 02 Dec 2019 16:24:00 GMThttps://stackoverflow.com/questions/59142271/weird-r-behavior-with-indexing-function-arrays?cid=104512249Abhimanyu Pallavi Sudhir2019-12-02T16:24:00ZComment by Abhimanyu Pallavi Sudhir on Weird R behavior with indexing function arrays
https://stackoverflow.com/questions/59142271/weird-r-behavior-with-indexing-function-arrays
Ah ok. So I just need to pass <code>i</code> as a variable to the function. Thanks @Gregor.Mon, 02 Dec 2019 16:02:44 GMThttps://stackoverflow.com/questions/59142271/weird-r-behavior-with-indexing-function-arrays?cid=104511603Abhimanyu Pallavi Sudhir2019-12-02T16:02:44ZComment by Abhimanyu Pallavi Sudhir on Weird R behavior with indexing function arrays
https://stackoverflow.com/questions/59142271/weird-r-behavior-with-indexing-function-arrays
@manotheshark Right, so I'm asking if there's a way to get R to immediately evaluate <code>posterior</code> rather than do so lazily.Mon, 02 Dec 2019 15:59:50 GMThttps://stackoverflow.com/questions/59142271/weird-r-behavior-with-indexing-function-arrays?cid=104511498Abhimanyu Pallavi Sudhir2019-12-02T15:59:50ZComment by Abhimanyu Pallavi Sudhir on Weird R behavior with indexing function arrays
https://stackoverflow.com/questions/59142271/weird-r-behavior-with-indexing-function-arrays/59142503#59142503
I figured that's what's going on, but how can I fix it? Can I "turn off" lazy evaluation?Mon, 02 Dec 2019 15:58:57 GMThttps://stackoverflow.com/questions/59142271/weird-r-behavior-with-indexing-function-arrays/59142503?cid=104511470#59142503Abhimanyu Pallavi Sudhir2019-12-02T15:58:57ZWeird R behavior with indexing function arrays
https://stackoverflow.com/questions/59142271/weird-r-behavior-with-indexing-function-arrays
0<p>I'm running into some unexpected behaviour in R with arrays of functions, and I've reduced the problem to a minimal working example:</p>
<pre><code>theory = c(function(p) p)
i = 1
posterior = function(p) theory[[i]](p)
i = 2
posterior(0)
</code></pre>
<p>Which gives me an error saying the subscript <code>i</code> is out of bounds.</p>
<p>So I guess that <code>i</code> is somehow being used as a "free" variable in the definition of <code>posterior</code> so it gets updated when I redefine <code>i</code>. Oddly enough, this works:</p>
<pre><code>theory = c(function(p) p)
i = 1
posterior = theory[[i]]
i = 2
posterior(0)
</code></pre>
<p>How can I avoid this? Note that not redefining <code>i</code> is not an option, as this stuff is going in a for loop where <code>i</code> is the index.</p>rfunctionlambdaMon, 02 Dec 2019 15:41:54 GMThttps://stackoverflow.com/q/59142271Abhimanyu Pallavi Sudhir2019-12-02T15:41:54ZAnswer by Abhimanyu Pallavi Sudhir for In QM, what causes a particle to have more probability to be somewhere else when it's found in a less probable position?
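For anyone landing here with the same issue: the culprit is the closure capturing the <em>variable</em> <code>i</code> rather than its value (late binding), compounded by R's lazy promises. The same behaviour is reproducible in Python; a sketch of the problem and the usual fix of binding the current value at definition time (in R, the analogous fix is a factory function that calls <code>force(i)</code>, or wrapping the definition in <code>local()</code>):

```python
# A one-element list of functions, mirroring the R `theory` vector.
theory = [lambda p: p]

def make_posterior_buggy(theory):
    i = 0
    posterior = lambda p: theory[i](p)  # captures the variable i...
    i = 1                               # ...so this redefinition breaks it
    return posterior

def make_posterior_fixed(theory):
    i = 0
    # Default-argument trick: evaluate i NOW and bind its value.
    posterior = lambda p, i=i: theory[i](p)
    i = 1
    return posterior
```

Calling the buggy version raises an index error (it looks up `theory[1]` at call time), while the fixed version still uses the index that was current at definition time.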
https://physics.stackexchange.com/questions/516683/in-qm-what-causes-a-particle-to-have-more-probability-to-be-somewhere-else-when/516713#516713
0<p>The state collapses after measurement, and if you measure the precise position, it collapses to a position eigenstate (i.e. a precise location), so you no longer have a "probability somewhere else". The probability somewhere else is prior to the measurement.</p>
<p>If you want to learn the real ideas behind quantum mechanics, not "without the math" whatever that means, you should have a look at my partially-written blog post series here: <a href="https://thewindingnumber.blogspot.com/p/quantum-mechanics-i.html" rel="nofollow noreferrer">Quantum Mechanics I - The Winding Number</a>.</p>Fri, 29 Nov 2019 09:43:46 GMThttps://physics.stackexchange.com/questions/516683/-/516713#516713Abhimanyu Pallavi Sudhir2019-11-29T09:43:46ZComment by Abhimanyu Pallavi Sudhir on Because things smell, is everything evaporating?
https://physics.stackexchange.com/questions/515304/because-things-smell-is-everything-evaporating
I guess the essence of the question is: why are spontaneous reactions that produce gaseous products so common? Which probably has to do with the high entropy of gases or something.Thu, 21 Nov 2019 16:23:39 GMThttps://physics.stackexchange.com/questions/515304/because-things-smell-is-everything-evaporating?cid=1161380Abhimanyu Pallavi Sudhir2019-11-21T16:23:39ZComment by Abhimanyu Pallavi Sudhir on Because things smell, is everything evaporating?
https://physics.stackexchange.com/questions/515304/because-things-smell-is-everything-evaporating
The answer to the metal question is here: <a href="https://chemistry.stackexchange.com/questions/7916/why-can-we-smell-copper">Why can we smell copper?</a> and <a href="https://www.livescience.com/4233-coins-smell.html" rel="nofollow noreferrer">here</a>. I guess the standard haemogloubin explanation of the metallic smell of blood is false: <a href="https://www.quora.com/Why-does-blood-smell-like-copper/answer/Song-Chencheng" rel="nofollow noreferrer">Why does blood smell like copper?</a>Thu, 21 Nov 2019 16:20:04 GMThttps://physics.stackexchange.com/questions/515304/because-things-smell-is-everything-evaporating?cid=1161377Abhimanyu Pallavi Sudhir2019-11-21T16:20:04ZAnswer by Abhimanyu Pallavi Sudhir for the vector space of Magic Squares
https://math.stackexchange.com/questions/1692624/the-vector-space-of-magic-squares/3445000#3445000
0<p>Here's an easy way to do this for general <span class="math-container">$n$</span>: given a magic number <span class="math-container">$S$</span>, consider the top-left <span class="math-container">$(n-1)$</span> by <span class="math-container">$(n-1)$</span> submatrix of the square. Given these values, one can fill in the margins by subtracting rows and columns of the submatrix from <span class="math-container">$S$</span>, and the bottom-right entry by subtracting the diagonal of the submatrix from <span class="math-container">$S$</span>.</p>
<p>The only equations remaining to satisfy are: (1) the sum of each new margin equals <span class="math-container">$S$</span> and (2) the sum of the non-principal diagonal equals <span class="math-container">$S$</span>. The condition (1) is the same for each margin (because the last column can be determined given all the rows and all the other columns). So the conditions are, where <span class="math-container">$1\le i,j\le n$</span> and <span class="math-container">$1\lt k\lt n$</span>:</p>
<p><span class="math-container">$$\sum_{i}a_{ii}=(n-1)S-\sum_{ij}a_{ij}$$</span></p>
<p><span class="math-container">$$\sum_k a_{k(n-k+1)}+\sum_j a_{1j}+\sum_i a_{i1}=S$$</span></p>
<p>These can be checked to be linearly independent for <span class="math-container">$n>2$</span>. Allowing <span class="math-container">$S$</span> to be free, the dimension of our space is therefore <span class="math-container">$(n-1)^2-2+1$</span>, which equals: </p>
<p><span class="math-container">$$n^2-2n$$</span></p>
<p>This indeed gives 3 in the case <span class="math-container">$n=3$</span>. Meanwhile, for <span class="math-container">$n=1$</span> and <span class="math-container">$n=2$</span>, the dimension is clearly 1.</p>Thu, 21 Nov 2019 12:04:55 GMThttps://math.stackexchange.com/questions/1692624/-/3445000#3445000Abhimanyu Pallavi Sudhir2019-11-21T12:04:55ZComment by Abhimanyu Pallavi Sudhir on Why are objects at rest in motion through spacetime at the speed of light?
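As an independent sanity check (not part of the original argument), one can build the constraint matrix for the n-by-n magic-square equations, with S as an extra free coordinate, and compute the corank exactly over the rationals. A Python sketch:

```python
from fractions import Fraction

def exact_rank(mat):
    """Rank of a matrix via Gaussian elimination in exact rational arithmetic."""
    m = [[Fraction(x) for x in row] for row in mat]
    k = 0  # number of pivots found so far
    for col in range(len(m[0])):
        pivot = next((r for r in range(k, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[k], m[pivot] = m[pivot], m[k]
        for r in range(len(m)):
            if r != k and m[r][col] != 0:
                f = m[r][col] / m[k][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[k])]
        k += 1
    return k

def magic_dim(n):
    """Dimension of the space of n-by-n magic squares, with the magic
    number S included as an extra free coordinate (variables: a_ij and S)."""
    def constraint(cells):  # encodes: (sum over cells) - S = 0
        row = [0] * (n * n + 1)
        for i, j in cells:
            row[i * n + j] += 1
        row[-1] = -1
        return row
    rows = []
    for i in range(n):
        rows.append(constraint([(i, j) for j in range(n)]))      # row sum
        rows.append(constraint([(j, i) for j in range(n)]))      # column sum
    rows.append(constraint([(i, i) for i in range(n)]))          # main diagonal
    rows.append(constraint([(i, n - 1 - i) for i in range(n)]))  # anti-diagonal
    return (n * n + 1) - exact_rank(rows)
```

For n = 3 and n = 4 this returns 3 and 8, matching n^2 - 2n.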
https://physics.stackexchange.com/questions/33840/why-are-objects-at-rest-in-motion-through-spacetime-at-the-speed-of-light/410575#410575
I did say that it's a convention about normalisation ("yes, you can choose other parameterisations"). I'm saying it's not an arbitrary convention, in the sense that it's perfectly sensible to ask that $\tau$ becomes $t$ for a stationary object.Tue, 12 Nov 2019 11:06:21 GMThttps://physics.stackexchange.com/questions/33840/why-are-objects-at-rest-in-motion-through-spacetime-at-the-speed-of-light/410575?cid=1157162#410575Abhimanyu Pallavi Sudhir2019-11-12T11:06:21ZComment by Abhimanyu Pallavi Sudhir on Physical interpretation of complex numbers, part 2
https://physics.stackexchange.com/questions/512297/physical-interpretation-of-complex-numbers-part-2
I think you mean to say "if you scale by i, you rotate it 90 degrees". That's correct.Wed, 06 Nov 2019 13:02:59 GMThttps://physics.stackexchange.com/questions/512297/physical-interpretation-of-complex-numbers-part-2?cid=1154461Abhimanyu Pallavi Sudhir2019-11-06T13:02:59Zggplot aes: alpha gets "smoothed out"
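The claim in this comment is easy to check directly: multiplying by i rotates a complex number by 90 degrees and preserves its length. A minimal Python sketch:

```python
import cmath

# i = e^{i*pi/2}, and multiplying complex numbers adds their arguments,
# so multiplication by i is a 90-degree counterclockwise rotation.
z = 3 + 4j
w = 1j * z  # the point (3, 4) rotated to (-4, 3)

assert w == complex(-4, 3)
assert abs(w) == abs(z)  # a pure rotation preserves length
angle = cmath.phase(w) - cmath.phase(z)  # should be pi/2
```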
https://stackoverflow.com/questions/58524281/ggplot-aes-alpha-gets-smoothed-out
1<p>I'm using <code>ggplot</code> in the <code>ggplot2</code> R package, with the <code>mpg</code> data set. </p>
<pre><code>classify = function(cls){
if (cls == "suv" || cls == "pickup"){result = 1}
else {result = 0}
return(result)
}
mpg = mpg %>% mutate(size = sapply(class, classify))
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy, alpha = size))
</code></pre>
<p>Now, <code>size</code> can take only two values: 1 when class is <code>suv</code> or <code>pickup</code>, and 0 otherwise. But I get a weird "smooth" range of sizes in the resulting plot:</p>
<p><a href="https://i.stack.imgur.com/Plzzy.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/Plzzy.png" alt="enter image description here"></a></p>
<p>(It's not the legend that surprises me, but the fact that there are actually values plotted with alpha 0.1 or 0.3 or whatever.)</p>
<p>What's going on?</p>rggplot2alphaalpha-transparencyWed, 23 Oct 2019 13:43:29 GMThttps://stackoverflow.com/q/58524281Abhimanyu Pallavi Sudhir2019-10-23T13:43:29ZAnswer by Abhimanyu Pallavi Sudhir for Axiomatic Treatment Of Sum Operators Which Work On Divergent Series
https://math.stackexchange.com/questions/3402550/axiomatic-treatment-of-sum-operators-which-work-on-divergent-series/3402796#3402796
2<p><strong>EDIT:</strong></p>
<p>My original answer actually defined a trivial operator -- the fixed formalisation is credit to <a href="https://math.stackexchange.com/users/328173/kenny-lau">Kenny Lau</a> on <a href="https://leanprover.zulipchat.com/#narrow/stream/116395-maths/topic/Axiomatised.20summations/near/178678162" rel="nofollow noreferrer">Zulip</a> (see the link for discussion regarding non-triviality).</p>
<pre><code>import data.real.basic linear_algebra.basis data.finset
open classical
open finset
local attribute [instance, priority 0] prop_decidable
structure is_sum (Sum : (ℕ → ℝ) → ℝ → Prop) : Prop :=
(wd : ∀ {s S₁ S₂}, Sum s S₁ → Sum s S₂ → S₁ = S₂)
(sum_add : ∀ {s t S T}, Sum s S → Sum t T → Sum (λ n, s n + t n) (S + T))
(sum_smul : ∀ {s S} c, Sum s S → Sum (λ n, c * s n) (c * S))
(sum_shift : ∀ {s S}, Sum s S → Sum (λ n, s (n + 1)) (S - s 0))
def has_sum (s : ℕ → ℝ) (S : ℝ) := ∀ Sum, is_sum Sum → ∀ T, Sum s T → T = S
theorem sum_of_has_sum (s : ℕ → ℝ) (S : ℝ) (HS : has_sum s S)
(Sum : (ℕ → ℝ) → ℝ → Prop) (H : is_sum Sum) (T : ℝ) (HT : Sum s T) :
Sum s S :=
by rwa (HS Sum H T HT).symm
theorem has_sum_alt : has_sum (λ n, (-1) ^ n) (1/2) :=
begin
intros Sum HSum T HT,
have H3 := HSum.sum_shift HT,
have H2 := HSum.sum_smul (-1) HT,
have H0 := HSum.wd H2 H3,
change _ = T - 1 at H0,
linarith,
end
theorem has_sum_alt_id : has_sum (λ n, (-1) ^ n * n) (-1/4) :=
begin
intros Sum HSum T HT,
have HC : ∀ n : ℕ, (-1 : ℝ) ^ (n + 1) * (n + 1 : ℕ) + (-1) ^ n * n =
(-1) * (-1) ^ n
:= λ n, by rw [pow_succ, nat.cast_add, mul_add, nat.cast_one, mul_one, add_comm,
←add_assoc, neg_one_mul, neg_mul_eq_neg_mul_symm, add_neg_self, zero_add],
have H3 := HSum.sum_shift HT,
have H1 := HSum.sum_add H3 HT,
have H2 := HSum.sum_smul (-1) H1,
simp only [nat.cast_zero, mul_zero, sub_zero, HC, neg_one_mul, neg_neg] at H2,
have H4 := has_sum_alt Sum HSum _ H2,
linarith,
end
def fib : ℕ → ℝ
| 0 := 0
| 1 := 1
| (n + 2) := fib n + fib (n + 1)
theorem has_sum_fib : has_sum fib (-1) :=
have HC : ∀ n, fib n + fib (n + 1) = fib (n + 2) := λ n, rfl,
begin
intros S HSum T HT,
have H3 := HSum.sum_shift HT,
have H33 := HSum.sum_shift H3,
have H1 := HSum.sum_add HT H3,
have H0 := HSum.wd H1 H33, -- can use linearity instead of wd
simp only [fib, sub_zero] at H0,
linarith,
end
-- if a sequence has two has_sums, everything is its sum
-- (this is the case of not being summable, e.g. 1+1+1+...)
theorem has_sum_unique (s : ℕ → ℝ) (S₁ S₂ : ℝ) (H : S₁ ≠ S₂) :
has_sum s S₁ → has_sum s S₂ → ∀ S', has_sum s S' :=
λ HS₁ HS₂ T₁ Sum HSum T₂ HT₂, false.elim $ H $ HS₂ Sum HSum S₁ $
sum_of_has_sum s S₁ HS₁ Sum HSum T₂ HT₂
open submodule
-- a sum operator that is "forced" to give a the sum s
-- a valid sum operator iff the shifts of a are linearly independent
-- in which case a can have any sum, and thus has_sum nothing
def forced_sum (s : ℕ → ℝ) (H : linear_independent ℝ (λ m n : ℕ, s (n + m))) (S : ℝ) :
(ℕ → ℝ) → ℝ → Prop :=
λ t T, ∃ Ht : t ∈ span ℝ (set.range (λ m n : ℕ, s (n + m))),
T = finsupp.sum (linear_independent.repr H ⟨t, Ht⟩)
(λ n r, r * (S - (finset.range n).sum s))
-- linear algebra lemma
lemma spanning_set_subset_span
{R M : Type} [ring R] [add_comm_group M] [module R M] {s : set M} :
s ⊆ span R s :=
span_le.mp (le_refl _)
-- finsupp lemma
lemma finsupp.mul_sum'
{α : Type} {β : Type} {γ : Type} [_inst_1 : semiring β] [_inst_2 : semiring γ]
(b : γ) (s : α →₀ β) {f : α → β → γ} (Hf0 : ∀ a, f a 0 = 0)
(Hfa : ∀ a b₁ b₂, f a (b₁ + b₂) = f a b₁ + f a b₂) :
b * finsupp.sum s f = finsupp.sum s (λ (a : α) (c : β), b * f a c) :=
begin
apply finsupp.induction s,
{ rw [finsupp.sum_zero_index, finsupp.sum_zero_index, mul_zero] },
intros A B t Ht HB IH,
rw [finsupp.sum_add_index Hf0 _, finsupp.sum_add_index _ _, mul_add, IH,
finsupp.sum_single_index, finsupp.sum_single_index],
rw [Hf0, mul_zero],
exact Hf0 _,
exact λ a, by rw [Hf0, mul_zero],
intros a b₁ b₂, rw [Hfa, mul_add],
exact Hfa
end
-- finsupp lemma (another one)
lemma function_finsupp_sum (a : ℕ →₀ ℝ) (f : ℕ → ℕ → ℝ → ℝ) (k : ℕ)
(H0 : ∀ a b, f a b 0 = 0) (Hl : ∀ a b c₁ c₂, f a b (c₁ + c₂) = f a b c₁ + f a b c₂) :
(finsupp.sum a (λ m am, (λ n, f n m am))) k = finsupp.sum a (λ m am, f k m am) :=
begin
apply finsupp.induction a,
{ simp only [finsupp.sum_zero_index], refl },
intros t v a ht hv H,
rw [finsupp.sum_add_index, finsupp.sum_add_index,
finsupp.sum_single_index, finsupp.sum_single_index],
{ show f k t v + _ = f k t v + _, rw H },
{ exact H0 _ _ },
{ funext, apply H0 },
{ exact λ r, H0 _ _ },
{ exact Hl _ },
{ exact λ t, funext (λ x, by rw H0; refl) },
{ exact λ a b₁ b₂, funext (λ x, Hl _ _ _ _) }
end
-- show that forced_sum_actually does what we want
lemma forced_sum_val (s : ℕ → ℝ) (S : ℝ)
(H : linear_independent ℝ (λ m n : ℕ, s (n + m))) :
forced_sum s H S s S :=
begin
have Hs₁ : s ∈ set.range (λ m n : ℕ, s (n + m)) := set.mem_range.mpr ⟨0, rfl⟩,
have Hs₂ : s ∈ span ℝ (set.range (λ m n : ℕ, s (n + m))) :=
spanning_set_subset_span Hs₁,
have Hs₃ : (linear_independent.repr H) ⟨s, Hs₂⟩ = finsupp.single 0 1 :=
linear_independent.repr_eq_single H 0 ⟨s, Hs₂⟩ rfl,
use Hs₂, simp [Hs₃, finsupp.sum_single_index],
end
-- forced_sum is a sum: some lemmas for the hard part
noncomputable def shift_repr
(s t : ℕ → ℝ) (Ht : t ∈ span ℝ ((λ (m n : ℕ), s (n + m)) '' set.univ)) :
ℕ →₀ ℝ :=
have trep : _ := (finsupp.mem_span_iff_total ℝ).mp Ht,
finsupp.map_domain (λ x, x + 1) (classical.some trep)
def shift_repr_prop
(s t : ℕ → ℝ) (Ht : t ∈ span ℝ ((λ (m n : ℕ), s (n + m)) '' set.univ)) :
finsupp.sum (shift_repr s t Ht) (λ (m : ℕ) (am : ℝ) (n : ℕ), am * s (n + m)) =
λ (n : ℕ), t (n + 1) :=
have trep : _ := (finsupp.mem_span_iff_total ℝ).mp Ht,
let a : _ := classical.some trep in
let b : _ := shift_repr s t Ht in
have Ha : finsupp.sum a (λ (m : ℕ) (am : ℝ) (n : ℕ), am * s (n + m)) = t :=
classical.some_spec (classical.some_spec trep),
begin
have Hn : ∀ n, (finsupp.sum a (λ (m : ℕ) (am : ℝ) (n : ℕ), am * s (n + m))) n =
t n
:= by rw Ha; exact λ n, rfl,
have Hn' : ∀ (n : ℕ), finsupp.sum a (λ (m : ℕ) (am : ℝ), am * s (n + m)) = t n,
intro n,
rw [←(function_finsupp_sum a _ n _ _), Ha],
exact λ m n, zero_mul _,
exact λ m n q r, add_mul _ _ _,
have Hb : ∀ n, finsupp.sum b (λ m bm, bm * s (n + m)) =
finsupp.sum a (λ m am, am * s (n + 1 + m))
:= by
{ intro n,
convert @finsupp.sum_map_domain_index ℕ ℝ _ ℕ _ _ (λ x, x + 1) a _ _ _,
exact funext (λ m, funext (λ am, by rw [add_assoc, add_comm 1 m])),
exact λ a, zero_mul _,
exact λ n r s, add_mul _ _ _ },
have YAY := λ n, Hn' (n + 1),
have YAY' :
∀ (n : ℕ), finsupp.sum b (λ (m : ℕ) (am : ℝ), am * s (n + m)) = t (n + 1)
:= λ n, by rw [Hb, YAY],
have YAY'' : (λ n, finsupp.sum b (λ (m : ℕ) (am : ℝ), am * s (n + m))) =
(λ n, t (n + 1))
:= funext (λ n, YAY' n),
have primr : (λ n, finsupp.sum b (λ m am, am * s (n + m))) =
(finsupp.sum b (λ m am n, am * s (n + m)))
:= by
{ apply funext, intro n, apply (function_finsupp_sum b _ n _ _).symm,
exact λ m n, zero_mul _,
exact λ m n q r, add_mul _ _ _ },
rw primr at YAY'',
exact YAY'',
end
lemma shift_mem_span_shifts
(s t : ℕ → ℝ) (Ht : t ∈ span ℝ (set.range (λ (m n : ℕ), s (n + m)))) :
(λ n, t (n + 1)) ∈ span ℝ (set.range (λ (m n : ℕ), s (n + m))) :=
begin
rw set.image_univ.symm at Ht ⊢,
let b := shift_repr s t Ht,
have Hb := shift_repr_prop s t Ht,
exact (finsupp.mem_span_iff_total _).mpr
⟨b, ⟨(by rw finsupp.supported_univ; exact submodule.mem_top), by rw ←Hb; refl⟩⟩,
end
lemma forced_sum_shift
(s : ℕ → ℝ) (S : ℝ) (H : linear_independent ℝ (λ m n : ℕ, s (n + m))) :
∀ {t T}, (forced_sum s H S) t T → (forced_sum s H S) (λ n, t (n + 1)) (T - t 0) :=
λ t T ⟨Ht, HT⟩,
begin
use shift_mem_span_shifts s t Ht,
-- remaining goal: the value equation for the shifted sequence (still open in this draft)
sorry
end
-- forced_sum is a sum
lemma is_sum_forced_sum (s : ℕ → ℝ) (S : ℝ)
(H : linear_independent ℝ (λ m n : ℕ, s (n + m))) :
is_sum (forced_sum s H S) :=
⟨ λ t T₁ T₂ ⟨Ht₁, HT₁⟩ ⟨Ht₂, HT₂⟩, by rw [HT₁, HT₂],
λ t₁ t₂ T₁ T₂ ⟨Ht₁, HT₁⟩ ⟨Ht₂, HT₂⟩,
begin
use add_mem _ Ht₁ Ht₂,
change _ = finsupp.sum ((linear_independent.repr H) ⟨t₁ + t₂, _⟩) _,
have Hadd
: (linear_independent.repr H) ⟨t₁ + t₂, _⟩ =
(linear_independent.repr H) _ + (linear_independent.repr H) _
:= (linear_independent.repr H).add ⟨t₁, Ht₁⟩ ⟨t₂, Ht₂⟩,
rw [Hadd, HT₁, HT₂, ←finsupp.sum_add_index],
{ intro a, apply zero_mul },
{ intros a b c, apply add_mul }
end,
λ t T c ⟨Ht, HT⟩,
begin
use smul_mem _ c Ht,
have Hsmul
: (linear_independent.repr H) ⟨λ n, c * t n, _⟩ =
c • (linear_independent.repr H) _
:= (linear_independent.repr H).smul c ⟨t, Ht⟩,
rw [Hsmul, finsupp.sum_smul_index], simp only [mul_assoc],
rw [←finsupp.mul_sum', HT],
exact λ i, (zero_mul _),
exact λ a b c, add_mul _ _ _,
exact λ i, (zero_mul _)
end,
-- we've already done the hard part
λ t T, forced_sum_shift s S H ⟩
theorem no_sum_of_lin_ind_shifts
(s : ℕ → ℝ) (H : linear_independent ℝ (λ m n : ℕ, s (n + m))) :
∀ S : ℝ, ¬ has_sum s S :=
λ S HS,
have X : _ := HS (forced_sum s H (S + 1)) (is_sum_forced_sum s (S + 1) H) (S + 1)
(forced_sum_val s (S + 1) H),
by linarith
-- CHALLENGE: formalise the proof here:
-- https://leanprover.zulipchat.com/#narrow/stream/116395-maths/
-- topic/Axiomatised.20summations/near/178884724
-- REQUIRES GENERATING FUNCTIONS, TAYLOR SERIES -- not currently in Lean!
theorem inv_shifts_lin_ind : linear_independent ℝ (λ m n : ℕ, (1 : ℝ) / (n + m + 1)) :=
begin
-- open challenge -- the informal proof uses generating functions
sorry
end
</code></pre>
<p>Feel free to <a href="https://leanprover-community.github.io/lean-web-editor/" rel="nofollow noreferrer">play with it yourself</a>. And check out the challenge: proving that there exists a sequence that does <em>not</em> have a sum (see the informal proof <a href="https://leanprover.zulipchat.com/#narrow/stream/116395-maths/topic/Axiomatised.20summations/near/178884724" rel="nofollow noreferrer">here</a>). Actually providing an example (e.g. <span class="math-container">$1/n$</span>) may be quite hard (the proof in the chat uses generating functions, which would be hard to formalise in Lean), but proving that a sequence with linearly independent shifts has no sum is almost done -- you just need to prove that the forced sum is a sum operator.</p>
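<p>To spell out the final contradiction (this is just a restatement of <code>no_sum_of_lin_ind_shifts</code> above): if <span class="math-container">$s$</span> had a sum <span class="math-container">$S$</span>, then since <code>forced_sum s H (S+1)</code> is a sum operator assigning <span class="math-container">$s$</span> the value <span class="math-container">$S+1$</span>, the definition of <code>has_sum</code> would force</p>
<p><span class="math-container">$$S + 1 = S,$$</span></p>
<p>which is absurd -- and since <span class="math-container">$S$</span> was arbitrary, <span class="math-container">$s$</span> has no sum under <em>any</em> consistent summation.</p>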
<p>(<a href="https://leanprover-community.github.io/lean-web-editor/" rel="nofollow noreferrer">Draft</a>)</p>
<hr>
<p><strong>OLD ANSWER:</strong></p>
<p>Here's something to get you started -- I wrote it in Lean, a formal proof-checker, because these things are tricky and I wanted to be completely sure I was being rigorous. I suppose we also need <code>sum_con</code> for convergent sums, but I'm not sure where infinite series are in the Lean math library!</p>
<pre><code>[old code redacted]
</code></pre>Mon, 21 Oct 2019 13:47:08 GMThttps://math.stackexchange.com/questions/3402550/-/3402796#3402796Abhimanyu Pallavi Sudhir2019-10-21T13:47:08ZAnswer by Abhimanyu Pallavi Sudhir for What is the motivation of uniform continuity?
https://math.stackexchange.com/questions/457008/what-is-the-motivation-of-uniform-continuity/3402176#3402176
0<p>One motivation comes from non-standard analysis, i.e. analysis with hyperreal numbers. This view is actually very useful (makes things obvious) when looking at e.g. the uniform limit theorem (the relationship to uniform convergence).</p>
<p>Here, a real function <span class="math-container">$f$</span> is continuous at <span class="math-container">$x$</span> if <span class="math-container">$\hat{f}(x+\varepsilon)-\hat{f}(x)$</span> is infinitesimal for all infinitesimal <span class="math-container">$\varepsilon$</span>. </p>
<p>A real function is <em>uniformly continuous</em> if it is <strong>continuous for all hyperreal <span class="math-container">$x$</span></strong> -- whereas a continuous function only needs to be continuous at real values of <span class="math-container">$x$</span>. </p>
<p>So it's obvious why <span class="math-container">$x^2$</span> is not uniformly continuous -- at <span class="math-container">$\omega$</span>, it turns increments by <span class="math-container">$1/\omega$</span> into increments by <span class="math-container">$1$</span>. Or why <span class="math-container">$1/x$</span> isn't uniformly continuous on the positive reals -- at <span class="math-container">$\varepsilon$</span>, it turns increments by <span class="math-container">$\varepsilon$</span> into increments by <span class="math-container">$1/\varepsilon$</span>. It also explains why <span class="math-container">$\sqrt{x}$</span> <em>is</em> uniformly continuous on the positive reals -- although it turns <span class="math-container">$\varepsilon$</span> into <span class="math-container">$\sqrt{\varepsilon}$</span>, an infinitesimal of lower order -- <em>that's still an infinitesimal</em>.</p>
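<p>To see the failure of <span class="math-container">$x^2$</span> in standard-real terms (a concrete instance of the infinitesimal picture above): take <span class="math-container">$x_n = n + 1/n$</span> and <span class="math-container">$y_n = n$</span>, so that <span class="math-container">$x_n - y_n \to 0$</span>, yet</p>
<p><span class="math-container">$$f(x_n) - f(y_n) = \left(n + \tfrac{1}{n}\right)^2 - n^2 = 2 + \tfrac{1}{n^2} \to 2 \neq 0.$$</span></p>
<p>This is exactly the shadow of the hyperreal statement: at <span class="math-container">$\omega$</span>, an infinitesimal increment of <span class="math-container">$1/\omega$</span> produces a non-infinitesimal increment.</p>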
<p>In real number speak, this just says that for any two sequences s.t. <span class="math-container">$x_n-y_n\to 0$</span>, we have <span class="math-container">$f(x_n)-f(y_n)\to 0$</span> (which is really the "sequential" form of stating uniform continuity). By contrast, for continuity this is only required with constant sequences <span class="math-container">$y_n$</span>.</p>Mon, 21 Oct 2019 00:06:46 GMThttps://math.stackexchange.com/questions/457008/-/3402176#3402176Abhimanyu Pallavi Sudhir2019-10-21T00:06:46ZAnswer by Abhimanyu Pallavi Sudhir for If $S$ is an infinite $\sigma$ algebra on $X$ then $S$ is not countable
https://math.stackexchange.com/questions/320035/if-s-is-an-infinite-sigma-algebra-on-x-then-s-is-not-countable/3396962#3396962
0<p><strong><a href="https://thewindingnumber.blogspot.com/2019/10/sigma-fields-are-venn-diagrams.html" rel="nofollow noreferrer">Sigma algebras are just Venn diagrams.</a></strong> (with some caveats because of all the "<em>countable</em> union" business)</p>
<p>A sigma field <span class="math-container">$\mathcal{F}$</span> on <span class="math-container">$X$</span> defines an equivalence relation on <span class="math-container">$X$</span> where <span class="math-container">$x\sim y$</span> iff <span class="math-container">$\forall E\in \mathcal{F},x\in E\iff y\in E$</span>. This partition is just the partition defined by the Venn diagram -- by the little intersection regions. The important point is that there is a bijection <span class="math-container">$\mathcal{F}\leftrightarrow \mathcal{P}(X/\sim)$</span> -- this should also be obvious with the Venn diagrams.</p>
<p>So what are the possible values for the cardinality of a power set?</p>Thu, 17 Oct 2019 01:10:41 GMThttps://math.stackexchange.com/questions/320035/-/3396962#3396962Abhimanyu Pallavi Sudhir2019-10-17T01:10:41ZComment by Abhimanyu Pallavi Sudhir on Schrödinger equation derivation and Diffusion equation
https://physics.stackexchange.com/questions/144832/schr%c3%b6dinger-equation-derivation-and-diffusion-equation/145217#145217
But in any case, this is going off on a tangent -- my point is that the claim in your answer that "the Schrodinger equation is a wave equation" is not a useful one, especially for this question, which explicitly asks about the formal relation between the diffusion equation and the Schrodinger equation. The observation that the Schrodinger equation admits sinusoidal solutions is not a particularly enlightening one, nor is it very revealing to point out that the classical diffusion equation doesn't.Sun, 13 Oct 2019 08:35:01 GMThttps://physics.stackexchange.com/questions/144832/schr%c3%b6dinger-equation-derivation-and-diffusion-equation/145217?cid=1144660#145217Abhimanyu Pallavi Sudhir2019-10-13T08:35:01ZComment by Abhimanyu Pallavi Sudhir on Schrödinger equation derivation and Diffusion equation
https://physics.stackexchange.com/questions/144832/schr%c3%b6dinger-equation-derivation-and-diffusion-equation/145217#145217
Sorry, but your definition makes no sense -- e.g. linear combinations of such solutions are also waves. But I don't deny that you <i>can</i> make a definition in your sense, just that it's very conceptually useful. It may be conceptually useful to classify the "higher-order derivatives in $x$" cases as waves if they are to be understood as "corrections" of an ordinary wave of sorts, I don't know. You can replace my definition with $\partial_\mu\partial_\nu\boldsymbol{\Psi}=0$ if you like.Sun, 13 Oct 2019 08:31:30 GMThttps://physics.stackexchange.com/questions/144832/schr%c3%b6dinger-equation-derivation-and-diffusion-equation/145217?cid=1144657#145217Abhimanyu Pallavi Sudhir2019-10-13T08:31:30ZComment by Abhimanyu Pallavi Sudhir on Schrödinger equation derivation and Diffusion equation
https://physics.stackexchange.com/questions/144832/schr%c3%b6dinger-equation-derivation-and-diffusion-equation/145217#145217
What are you talking about? What's your definition of a wave? You can invent an obfuscated definition of a "wave" under which the Schrodinger equation is a "wave equation", but it would still be <i>conceptually different</i> from the wave equation $\partial^2\psi/\partial x^2=\partial^2\psi/\partial t^2$. Physically <i>fundamentally different</i> equations ought to be called different names, even if some specific solutions appear similar to you -- this isn't "arbitrary".Sat, 12 Oct 2019 16:00:14 GMThttps://physics.stackexchange.com/questions/144832/schr%c3%b6dinger-equation-derivation-and-diffusion-equation/145217?cid=1144458#145217Abhimanyu Pallavi Sudhir2019-10-12T16:00:14ZNon-surjectivity of exponential map: how to understand?
https://math.stackexchange.com/questions/3383612/non-surjectivity-of-exponential-map-how-to-understand
0<p>I'm given to understand the exponential map is not generally surjective -- the standard example is <span class="math-container">$\mathrm{SL}_2(\mathbb{R})$</span> <a href="https://math.stackexchange.com/questions/643216/non-surjectivity-of-the-exponential-map-to-sl2-mathbbc">[ 1 ]</a>. </p>
<p>I can clearly see why this is so in the non-connected case -- the tangent space is a tangent space to the connected component alone, so its image must be contained in the connected component. <strong>I do not see why the map isn't surjective in the connected case.</strong></p>
<p>I also don't see why the map is then <em>again</em> surjective in the compact case -- <a href="https://en.wikipedia.org/wiki/Maximal_torus#Properties" rel="nofollow noreferrer">wikipedia</a> claims that this is a special case of "the exponential map is surjective if every element is contained in a maximal torus". Is this right? Is there a good way to understand why this is true?</p>
<hr>
<p>Note that I am not looking for counter-examples: I'm aware of them. I'm looking for intuition -- perhaps a clever look at what the image of the exponential map actually looks like in the non-surjective case (how it "misses" some of the points in the group). </p>
<p>As an analogy, if asked to explain smooth non-analytic functions, it would be more instructive (than simply providing the example of <span class="math-container">$e^{-1/x}$</span>) to explain that a function may grow slower than all polynomials near zero -- and provide the construction as <span class="math-container">$1/f(1/x)$</span> from any function <span class="math-container">$f$</span> that grows faster than all polynomials as <span class="math-container">$x\to\infty$</span>.</p>
<p>(See <a href="https://math.stackexchange.com/questions/3368390/developing-intuition-for-lie-groups-and-lie-algebras">here</a> for more examples of the kind of intuition I'm looking for, within the context of Lie theory.)</p>lie-groupslie-algebrasMon, 07 Oct 2019 00:43:08 GMThttps://math.stackexchange.com/q/3383612Abhimanyu Pallavi Sudhir2019-10-07T00:43:08ZIntuition for the Killing form as "automorphism-invariant symmetric bilinear form"
https://math.stackexchange.com/questions/3369402/intuition-for-the-killing-form-as-automorphism-invariant-symmetric-bilinear-for
1<p>Here's my idea for motivating the Killing form: the <em>only notion</em> we have of magnitudes and angles in a Lie algebra comes from conjugations, as they can be understood to be the "natural" transformations on the Lie algebra. So it's natural to ask for a norm map that satisfies <span class="math-container">$\forall g\in G$</span>,</p>
<p><span class="math-container">$$\|X\|=\|\mathrm{Ad}_gX\|$$</span>
And hopefully we can then use symmetry to pin down a bilinear form. The idea is that we can already compare two vectors on the same line, and this condition creates <em>contours</em> that are precisely the <em>orbits of conjugation</em>, allowing us to compare vectors in the same ideal. </p>
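<p>One can make the invariance condition infinitesimal (assuming <span class="math-container">$G$</span> connected): polarise <span class="math-container">$\|\cdot\|$</span> to a symmetric bilinear form <span class="math-container">$B$</span>, set <span class="math-container">$g = e^{tZ}$</span>, and differentiate at <span class="math-container">$t = 0$</span> to get</p>
<p><span class="math-container">$$B([Z,X],Y) + B(X,[Z,Y]) = 0.$$</span></p>
<p>The trace form <span class="math-container">$\mathrm{tr}(\mathrm{ad}(x)\mathrm{ad}(y))$</span> does satisfy this, since <span class="math-container">$\mathrm{ad}([Z,X]) = [\mathrm{ad}(Z),\mathrm{ad}(X)]$</span> and the trace is cyclic.</p>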
<p>So in a simple Lie algebra, the bilinear form would then be completely determined up to scaling.</p>
<p>Am I on a sensible track? I guess what I'm asking is:</p>
<ol>
<li>Am I right to believe that "bilinear, symmetric and automorphism-invariant" uniquely determine the Killing form (up to scaling) for simple Lie algebras? </li>
<li>If so, how can I prove the <span class="math-container">$\mathrm{tr}(\mathrm{ad}(x)\mathrm{ad}(y))$</span> formula from this characterisation?</li>
<li>How might I extend this intuition to non-simple Lie algebras? I think I can "see" why the "semisimple equivalent to non-degenerate" property is true, though.</li>
</ol>
<p>(See <a href="https://math.stackexchange.com/questions/3368390/developing-intuition-for-lie-groups-and-lie-algebras">here</a> for examples of the kind of intuition, motivation I'm looking for. Based on advice there, I'm splitting my "intuition for Lie algebras" questions.)</p>lie-groupslie-algebrasautomorphism-groupWed, 25 Sep 2019 13:10:24 GMThttps://math.stackexchange.com/q/3369402Abhimanyu Pallavi Sudhir2019-09-25T13:10:24ZDeveloping intuition for Lie groups and Lie algebras
https://math.stackexchange.com/questions/3368390/developing-intuition-for-lie-groups-and-lie-algebras
6<p><strong>Background:</strong> Until now, I've been able to <em>motivate</em> everything I've learned in mathematics, and develop some solid insights for everything. But I learned some Lie theory this summer, and while I have a good grasp of the elementary aspects and strong intuition for <em>some</em> or even <em>most</em> of what I've learned, there are some "holes" in my understanding of Lie algebras.</p>
<p>To give you an idea of what I'm looking for, I'll list some examples of things in Lie theory I <strong>DO understand</strong> and am able to motivate:</p>
<ul>
<li>The notion of a <strong>Lie group</strong> itself -- the idea comes from wanting to generalise what we know about discrete groups to more complicated contexts where the "manifold" structure of the group allows us to do so. Examples: <strong>compactness</strong> generalises finiteness, <strong>one-parameter groups</strong> generalise cyclic groups, etc.</li>
<li>The <strong>exponential map</strong> -- For one-parameter groups to generalise cyclic groups, we need a "generalisation" of the group power to allow "real-index powers". The general way to define a <strong>real power</strong> is through the exponential map. Well, this real power stuff isn't <em>always</em> defined as it turns out (you need the exponential map to be surjective), but our motivation does explain why it "makes sense" that the <strong>exponential map is surjective in the connected abelian case</strong> (because then, the Lie algebra is basically a co-ordinate system on the Lie group -- I'm aware exponential co-ordinates are defined in more generality, but it's certainly more well-behaved here).</li>
<li>The <strong>Lie algebra</strong>, i.e. "why is the logarithm/parameter space the tangent space?" We'd like to generalise the notion of a generator to a Lie group -- consider e.g. the circle group on the complex plane. An element near the identity generates a cyclic group, and as the element goes nearer to the identity -- as it becomes an <strong>infinitesimal generator</strong> -- the cyclic group it generates approaches the entire group. Well, an element close to the identity is of the form <span class="math-container">$1+\varepsilon t X$</span>, and generates a group element as <span class="math-container">$(1+\varepsilon tX)^{1/\varepsilon}=e^{tX}$</span>. This is also intuition for the compound-interest limit, and for Euler's identity.</li>
<li>The <strong>Lie bracket</strong> is the second-derivative of the commutator curve <span class="math-container">$\gamma(t)=e^{tX}e^{tY}e^{-tX}e^{-tY}$</span>. Well, it's also the derivative of <span class="math-container">$\gamma(\sqrt{t})$</span>, which proves <strong>closure under the Lie bracket</strong>. </li>
<li>The real justification for the Lie bracket, however, comes from the fundamental fact that <span class="math-container">$\mathrm{ad}:\mathfrak{g}\to\mathrm{Der}(\mathfrak{g}):=X\to[X,\cdot]$</span> is the differential of the adjoint map <span class="math-container">$\mathrm{Ad}:G\to\mathrm{Aut}(G):=g\mapsto\lambda x, gxg^{-1}$</span>, which is a group homomorphism. In particular, the preservation of the Lie Bracket by the differential of a group homomorphism is precisely the <strong>Jacobi identity</strong>: <span class="math-container">$\mathrm{ad}([x,y])=[\mathrm{ad}(x),\mathrm{ad}(y)]$</span>. The basic point is that we are trying to reduce Lie group problems to Lie algebra ones as much as possible, and conjugation is an important idea that we'd like to see the map induced by on the Lie algebra -- we are seeing the result of the obvious fact that <span class="math-container">$T\mathrm{Aut}(G)\subseteq\mathrm{Der}(TG)$</span> (and also <span class="math-container">$T\mathrm{Aut}(M)=\mathrm{Der}(M)$</span> -- the fact that the automorphisms of an object form a group is equivalent to the derivations on an object forming a Lie algebra). Some more examples of the "study the Lie algebra approach":
<ul>
<li>The uniqueness of the determinant as a map from <span class="math-container">$G\to \mathbb{R}-\{0\}$</span>.</li>
<li>An <strong>ideal</strong> is a subalgebra "induced" on the Lie algebra by a normal subgroup of the Lie group. This immediately provides the interpretation as "kernels of Lie algebra homomorphisms" as well as the condition <span class="math-container">$[\mathfrak{g},\mathfrak{i}]\subseteq\mathfrak{i}$</span>. </li>
</ul></li>
<li>The idea behind the manifold-structure of a Lie group is that the flows are produced by left-multiplication by group elements, so those must be homeomorphisms. This motivation can be confirmed through various topological consequences, e.g.
<ul>
<li><strong>A neighbourhood of the identity generates the connected component.</strong> The idea behind the proof is this: if an entire open neighbourhood of the identity is contained in the subgroup, it means you can "flow in any direction" from the subgroup -- but to bring these flows to an arbitrary point of the manifold, you need left-multiplication to be a homeomorphism. </li>
<li><strong>The identity component is a (normal) subgroup.</strong> Because left-multiplication and inversion are continuous, they cannot tear the connected component apart (generalised "intermediate value theorem"), so it is closed under multiplication.</li>
<li><strong>Compact Lie groups</strong> -- How can a Lie group possibly "close in on itself"? Surely we keep "extending" an open neighbourhood <span class="math-container">$W$</span> of the identity by observing that <span class="math-container">$xW$</span> must be in the subgroup? The idea is that these translations of <span class="math-container">$W$</span> form an <strong>open cover of the group; if it has a finite subcover</strong>, then it makes sense for the group to close in on itself. By playing around with different open neighbourhoods <span class="math-container">$W$</span> and taking some suitable unions, one can see that this is equivalent to the condition that every open cover has a finite subcover, i.e. the group is compact.</li>
</ul></li>
<li><strong>Characterisation of Abelian Lie groups</strong> -- "Compact Connected Abelian Lie Group is a torus" is a generalisation of "finite Abelian group is a product of cyclic groups" -- the idea is that the exponential map "wraps" the Lie algebra around into the Lie group -- this just gives the quotient of the Lie algebra by the kernel of the exponential map, which is topologically <span class="math-container">$\mathbb{R}^n/\mathbb{Z}^n$</span>. The characterisation of a connected Abelian Lie group as a cylinder <span class="math-container">$\mathbb{R}^{n+k}/\mathbb{Z}^k$</span> follows similarly.</li>
</ul>
<p>With that said, here are some things I <strong>DON'T (completely) understand</strong>, and would like to have a similar level of understanding for:</p>
<ul>
<li>Why is the <strong>structure of a Lie group characterised by its second-order structure</strong>? I know that this follows from the <strong>BCH formula</strong>, the local diffeomorphism nature of the exponential map and the fact that an open neighbourhood of 1 generates the group, but I have no intuition at all why the BCH formula "should" be true.</li>
<li>What's the deal with <strong>simply-connected groups</strong>? I can certainly see why the Lie algebra cannot detect disconnectedness in a group -- I had expected that it could not detect compactness either (whether the group closes in on itself eventually), so the statement of Lie's third theorem would be "every Lie algebra has a corresponding unique connected, compact Lie group". Instead, the statement is "every Lie algebra has a corresponding <em>simply connected</em> Lie group".</li>
<li><strong>Non-surjectivity of the exponential map</strong> even in the connected case -- I'm not asking for counter-examples, I'm asking "what exactly goes wrong in groups like <span class="math-container">$SL_\mathbb{R}(2)$</span>?", perhaps a hint about "what does the image of the exponential map look like?" (as an analogy, I would explain smooth functions failing to be analytic as "they are flatter than every polynomial at 0, and can be constructed as <span class="math-container">$1/f(1/x)$</span> where <span class="math-container">$f$</span> is any function that grows faster than every polynomial")</li>
<li><strong>Surjectivity when every element is contained in a maximal torus</strong> -- I read this <a href="https://en.wikipedia.org/wiki/Maximal_torus#Properties" rel="nofollow noreferrer">here</a> as a generalisation of "the exponential map is surjective in the connected compact case". Even if the generalisation isn't true, is there an intuitive way to understand why compactness makes the problem in the previous point go away?</li>
<li><strong>Characterisation of non-Abelian Lie groups</strong> -- Tell me if my understanding of simple and semisimple Lie algebras makes sense -- we want to classify non-Abelian Lie groups as products like we do Abelian Lie groups, and the only way to do so is as "semidirect products of simple Lie groups and Abelian Lie groups". A <strong>reductive</strong> Lie group is basically when this semidirect product is a direct product, and a <strong>semi-simple</strong> one is a reductive Lie group where there are no Abelian groups in the product. Is this right?</li>
<li><strong>Various abstract algebraic things</strong> -- I have no idea how to interpret things like nilpotent and solvable Lie algebras, radicals and so on in the context of Lie theory. </li>
<li>At first when I heard of the <strong>Killing form</strong>, I presumed it would be some "natural" way to define a dot product on the Lie algebra -- but I honestly don't see how it is natural. Is it the <em>only</em> dot product that is invariant under Lie algebra automorphisms? </li>
</ul>
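<p>On the last bullet: while this doesn't settle the naturality question, the ad-invariance of the Killing form can at least be verified numerically on <span class="math-container">$\mathfrak{so}(3)$</span> -- a toy check (the basis and helper names below are my own, not from any source):</p>

```python
import numpy as np

# Standard basis of so(3), with [L0, L1] = L2 cyclically.
L = np.array([
    [[0, 0, 0], [0, 0, -1], [0, 1, 0]],
    [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],
    [[0, -1, 0], [1, 0, 0], [0, 0, 0]],
], dtype=float)

def ad(X):
    """Matrix of ad_X = [X, .] in the basis L (column j = coords of [X, L_j])."""
    cols = []
    for E in L:
        bracket = (X @ E - E @ X).ravel()
        # Express the bracket in the basis L by least squares.
        coeffs, *_ = np.linalg.lstsq(L.reshape(3, 9).T, bracket, rcond=None)
        cols.append(coeffs)
    return np.array(cols).T

def killing(X, Y):
    """Killing form K(X, Y) = tr(ad_X ad_Y)."""
    return np.trace(ad(X) @ ad(Y))

# For so(3), K(L_i, L_j) = -2 * delta_ij ...
gram = np.array([[killing(L[i], L[j]) for j in range(3)] for i in range(3)])

# ... and K is ad-invariant: K([Z, X], Y) + K(X, [Z, Y]) = 0.
Z, X, Y = L[0], L[1], L[2]
invariance = killing(Z @ X - X @ Z, Y) + killing(X, Z @ Y - Y @ Z)
```

<p>Here the Gram matrix comes out as <span class="math-container">$-2I$</span>, i.e. proportional to the standard dot product on this basis, and the invariance identity holds to machine precision.</p>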
<p>I've thought very hard about the theory, but I just can't seem to figure out how to fill these "holes". <strong>Am I missing some important central insight into Lie theory that is crucial to some of these questions?</strong></p>lie-groupslie-algebrasintuitionmatrix-exponentialTue, 24 Sep 2019 17:22:34 GMThttps://math.stackexchange.com/q/3368390Abhimanyu Pallavi Sudhir2019-09-24T17:22:34ZAnswer by Abhimanyu Pallavi Sudhir for How to develop intuition in topology?
https://math.stackexchange.com/questions/576593/how-to-develop-intuition-in-topology/3364031#3364031
0<p>Let's do an example: let's say we want to know when limits are unique in a topological space. Here's the proof of the theorem in a metric space:</p>
<blockquote>
<p>Let <span class="math-container">$(a_n)$</span> be a sequence with limits <span class="math-container">$L_1$</span> and <span class="math-container">$L_2$</span>. Then <span class="math-container">$a_n$</span> is eventually within every neighbourhood of <span class="math-container">$L_1$</span> and every neighbourhood of <span class="math-container">$L_2$</span>. If <span class="math-container">$L_1\ne L_2$</span>, we can choose the neighbourhoods to be disjoint. Contradiction.</p>
</blockquote>
<p>This is completely equivalent to the proof you've probably seen, but I've phrased everything in terms of neighbourhoods, which are fundamentally topological concepts. The only fact we used is the existence of disjoint neighbourhoods of distinct points. Limits being unique is pretty important, so we call a space where distinct points allow disjoint neighbourhoods a <strong>Hausdorff space</strong> or <strong>T2 space</strong>.</p>
<p>(It's also worth thinking about why the generalisation goes for limits of <em>nets</em>, rather than limits of <em>sequences</em>)</p>
<p>The trick I'm suggesting is to "work backwards" from theorems you can tell are important (as opposed to some inane statement about open sets): (1) start with an important theorem in analysis, (2) go through its proof, (3) work out what axioms you need and simplify them to a form involving just open sets.</p>
<p>Some more examples of such generalisations:</p>
<ul>
<li>Every open neighbourhood of a limit point of <span class="math-container">$S$</span> contains an infinite number of points in <span class="math-container">$S$</span>. (T1 space)</li>
<li>Finite sets are closed. (T1 space)</li>
<li>Continuous extension theorem. (T4 space)</li>
<li>Bolzano-Weierstrass theorem (compact sets)</li>
<li>Intermediate value theorem (connected sets)</li>
</ul>
<p>You may find this series of articles I wrote illuminating to this end: <a href="https://thewindingnumber.blogspot.com/p/2204.html" rel="nofollow noreferrer">https://thewindingnumber.blogspot.com/p/2204.html</a></p>Sat, 21 Sep 2019 04:32:49 GMThttps://math.stackexchange.com/questions/576593/-/3364031#3364031Abhimanyu Pallavi Sudhir2019-09-21T04:32:49ZAnswer by Abhimanyu Pallavi Sudhir for Adjoint map is Lie homomorphism
https://math.stackexchange.com/questions/1339289/adjoint-map-is-lie-homomorphism/3355482#3355482
0<p><span class="math-container">$\mathrm{ad}_X$</span> is not a Lie Homomorphism, but <span class="math-container">$\mathrm{ad}$</span> is. We can define a map <span class="math-container">$\mathrm{Ad}:G\to\mathrm{Aut}(G):=\lambda x.\lambda y.\ xyx^{-1}$</span>, whose differential is then <span class="math-container">$\mathrm{ad}:TG\to T\mathrm{Aut}(G):=\lambda X.\lambda Y.\ [X,Y]$</span>. The homomorphism property on this map is then precisely the Jacobi identity.</p>Fri, 13 Sep 2019 17:49:45 GMThttps://math.stackexchange.com/questions/1339289/-/3355482#3355482Abhimanyu Pallavi Sudhir2019-09-13T17:49:45ZAnswer by Abhimanyu Pallavi Sudhir for A subset of a compact set is compact?
https://math.stackexchange.com/questions/212181/a-subset-of-a-compact-set-is-compact/3346082#3346082
0<p>Here's an alternate proof (for closed subsets, obviously): any net on <span class="math-container">$S$</span> is a net on <span class="math-container">$T$</span> and thus has a convergent subnet whose limit is in <span class="math-container">$T$</span> -- but its limit must also be in <span class="math-container">$S$</span> because <span class="math-container">$S$</span> is closed.</p>
<p>It's a little tricky because the notions of closed set and compact set are intuitively very similar.</p>Fri, 06 Sep 2019 09:17:08 GMThttps://math.stackexchange.com/questions/212181/-/3346082#3346082Abhimanyu Pallavi Sudhir2019-09-06T09:17:08ZComment by Abhimanyu Pallavi Sudhir on Why is a topology made up of 'open' sets?
https://mathoverflow.net/questions/19152/why-is-a-topology-made-up-of-open-sets/19173#19173
Ah wait, no it's fine -- your axiom 4 implies the converse of axiom 3, and preservation of binary unions leads to $A\subseteq B\Rightarrow \mathrm{cl}(A)\subseteq\mathrm{cl}(B)$.Wed, 21 Aug 2019 20:16:16 GMThttps://mathoverflow.net/questions/19152/why-is-a-topology-made-up-of-open-sets/19173?cid=847795#19173Abhimanyu Pallavi Sudhir2019-08-21T20:16:16ZComment by Abhimanyu Pallavi Sudhir on Why is a topology made up of 'open' sets?
https://mathoverflow.net/questions/19152/why-is-a-topology-made-up-of-open-sets/19173#19173
Wait -- so in a pre-topology, it's no longer true that "if $x$ touches $A\subset B$, then $x$ touches $B$"?Wed, 21 Aug 2019 07:32:27 GMThttps://mathoverflow.net/questions/19152/why-is-a-topology-made-up-of-open-sets/19173?cid=847627#19173Abhimanyu Pallavi Sudhir2019-08-21T07:32:27ZComment by Abhimanyu Pallavi Sudhir on Schrödinger equation derivation and Diffusion equation
https://physics.stackexchange.com/questions/144832/schr%c3%b6dinger-equation-derivation-and-diffusion-equation/145217#145217
What? Under the standard definition of a "wave equation", it must be second-order in time, which the Schrodinger equation is not. It may allow wave-like solutions, but it's fundamentally a (Wick rotated) diffusion equation.Tue, 20 Aug 2019 03:53:25 GMThttps://physics.stackexchange.com/questions/144832/schr%c3%b6dinger-equation-derivation-and-diffusion-equation/145217?cid=1121832#145217Abhimanyu Pallavi Sudhir2019-08-20T03:53:25ZIs a (finite) group determined by its subgroups?
https://math.stackexchange.com/questions/3323761/is-a-finite-group-determined-by-its-subgroups
6<p><strong>Motivation</strong></p>
<p>I think of the "structure" of a topological space <span class="math-container">$X$</span> as being the limit operator on functions <span class="math-container">$I\to X$</span> where <span class="math-container">$I$</span> could be the natural numbers or another topological space -- in this sense, a topological homomorphism (continuous function) <span class="math-container">$f$</span> is a function that commutes with the limit operation <span class="math-container">$f(\lim x)=\lim f(x)$</span>, similar to how a group homomorphism commutes with group multiplication <span class="math-container">$f(\mathrm{mult}(x,y))=\mathrm{mult}(f(x),f(y))$</span> and a linear transformation commutes with linear combination.</p>
<p>Nonetheless, it can be shown that this structure can be determined uniquely by the set of open sets on <span class="math-container">$X$</span>. One may also understand these open sets to be the "sub-(topological spaces)" of <span class="math-container">$X$</span> as the topology of <span class="math-container">$X$</span> is inherited by them exactly (well, the closed sets are also a "dual" kind of sub-topological spaces). </p>
<p>Similarly, given a set <span class="math-container">$V$</span> and a list of subsets that we call "subspaces" (which would have to satisfy some properties), one can determine the vector space up to isomorphism (i.e. we can find its dimension).</p>
<hr>
<p>I wonder if something like this can be done with groups. Given a set <span class="math-container">$G$</span> and a list of subsets we call its "subgroups", can we determine the group up to isomorphism? At least for finite sets?</p>
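<p>As an illustration of the kind of data being supplied, this "subgroup structure" can be computed by brute force for small groups (a sketch; the function name is mine) -- and it does distinguish, say, <span class="math-container">$C_4$</span> from the Klein four-group, which have the same order:</p>

```python
from itertools import product

def subgroups(elements, op):
    """All nonempty subsets closed under op.  For a finite group, a
    nonempty subset closed under the operation is automatically a subgroup."""
    subs = set()
    n = len(elements)
    for mask in range(1, 2 ** n):
        S = frozenset(elements[i] for i in range(n) if (mask >> i) & 1)
        if all(op(a, b) in S for a in S for b in S):
            subs.add(S)
    return subs

# C4 = Z/4 under addition: subgroups {0}, {0, 2}, and the whole group.
c4 = subgroups(list(range(4)), lambda a, b: (a + b) % 4)

# Klein four-group Z/2 x Z/2: the trivial subgroup, three of order 2,
# and the whole group -- a different subgroup family from C4's.
v4 = subgroups(list(product([0, 1], repeat=2)),
               lambda a, b: (a[0] ^ b[0], a[1] ^ b[1]))
```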
<p>Example: given the set <span class="math-container">$\{0, 1, 2, 3\}$</span>, we'd be given the following "subgroup structure" on it: <span class="math-container">$\{\{0\},\{0,2\},\{0,1,2,3\}\}$</span>, and the group being described is <span class="math-container">$C_4$</span>. The positions of 1 and 3 aren't determined, but the group is still determined up to isomorphism.</p>group-theoryfinite-groupsThu, 15 Aug 2019 05:43:06 GMThttps://math.stackexchange.com/q/3323761Abhimanyu Pallavi Sudhir2019-08-15T05:43:06ZComment by Abhimanyu Pallavi Sudhir on How is it possible that consciousness-causes-collapse interpretations of QM are not falsified by the Quantum Zeno effect?
https://physics.stackexchange.com/questions/495125/how-is-it-possible-that-consciousness-causes-collapse-interpretations-of-qm-are/495137#495137
@Wolphramjonny Just write down the state vector for the combined system of the (not yet measured) "non-conscious" apparatus and the system being measured. This represents the "knowledge of the system according to an external observer". As you can see, metaphysical questions about the "knowledge of the apparatus" are not involved in the expression.Mon, 05 Aug 2019 07:46:28 GMThttps://physics.stackexchange.com/questions/495125/how-is-it-possible-that-consciousness-causes-collapse-interpretations-of-qm-are/495137?cid=1115461#495137Abhimanyu Pallavi Sudhir2019-08-05T07:46:28ZAnswer by Abhimanyu Pallavi Sudhir for How is it possible that consciousness-causes-collapse interpretations of QM are not falsified by the Quantum Zeno effect?
https://physics.stackexchange.com/questions/495125/how-is-it-possible-that-consciousness-causes-collapse-interpretations-of-qm-are/495137#495137
1<p>If you accept positivism, it becomes obvious that "consciousness causes collapse" cannot possibly be distinguished experimentally from the Copenhagen interpretation as long as you accept that <em>you</em> are conscious. </p>
<p>This "interpretation" makes claims about the knowledge of <em>another</em> (non-conscious) observer, claiming that it does not alter the state of other systems. But this is fundamentally a metaphysical claim -- it's like asking "what if my red is your blue and my blue is your red?" Whatever your metaphysical belief on whether a non-conscious observer "already" caused a wavefunction collapse, your knowledge only changes when you observe the system -- even if the "system" you observe is that non-conscious observer.</p>Sun, 04 Aug 2019 05:52:21 GMThttps://physics.stackexchange.com/questions/495125/-/495137#495137Abhimanyu Pallavi Sudhir2019-08-04T05:52:21ZComment by something on Was "Crook's algorithm" for Sudoku really only developed in the 21st century?
https://puzzling.stackexchange.com/questions/86805/was-crooks-algorithm-for-sudoku-really-only-developed-in-the-21st-century
@ArnaudMortier How so?Fri, 02 Aug 2019 16:54:03 GMThttps://puzzling.stackexchange.com/questions/86805/was-crooks-algorithm-for-sudoku-really-only-developed-in-the-21st-century?cid=252361something2019-08-02T16:54:03ZComment by something on Was "Crook's algorithm" for Sudoku really only developed in the 21st century?
https://puzzling.stackexchange.com/questions/86805/was-crooks-algorithm-for-sudoku-really-only-developed-in-the-21st-century
@GarethMcCaughan Note that (1) and (2) are special cases of (3) for K = 1, K = total number of empty squares - 1.Fri, 02 Aug 2019 16:53:28 GMThttps://puzzling.stackexchange.com/questions/86805/was-crooks-algorithm-for-sudoku-really-only-developed-in-the-21st-century?cid=252360something2019-08-02T16:53:28ZComment by Abhimanyu Pallavi Sudhir on Why would ReLU work as an activation function at all?
https://stats.stackexchange.com/questions/297947/why-would-relu-work-as-an-activation-function-at-all/298159#298159
Except the standard proof of the universal approximation theorem relies on the boundedness of the activation functions. There are <a href="https://arxiv.org/pdf/1505.03654.pdf" rel="nofollow noreferrer">extensions</a>, but the fact that ReLU works is not obvious to me.Fri, 02 Aug 2019 16:50:58 GMThttps://stats.stackexchange.com/questions/297947/why-would-relu-work-as-an-activation-function-at-all/298159?cid=784263#298159Abhimanyu Pallavi Sudhir2019-08-02T16:50:58ZWas "Crook's algorithm" for Sudoku really only developed in the 21st century?
https://puzzling.stackexchange.com/questions/86805/was-crooks-algorithm-for-sudoku-really-only-developed-in-the-21st-century
4<p>The following algorithm for simplifying (and very often completely solving) Sudoku puzzles:</p>
<ol>
<li>Label each cell with the set of all possible values it could take.</li>
<li>Pick a row/column/block and for a value of <span class="math-container">$K\in[1, 9)$</span>, look for "<span class="math-container">$K$</span>-partnerships" -- <span class="math-container">$K$</span>-tuples of cells that satisfy "the union of labels of each cell in the tuple has cardinality <span class="math-container">$K$</span>". Call the "union of labels of each cell in a partnership" the "banned set" of the partnership.</li>
<li>For each such partnership, remove from the label of every cell in that row/column/block <em>not</em> in the partnership any elements that are in the banned set of the partnership.</li>
<li>Repeat Steps 2-3 for all values of <span class="math-container">$K$</span> and all rows, columns and blocks.</li>
</ol>
<p>(i.e. "if you have three cells labeled as (4, 5), (4, 7), (4, 5, 7), no other cell in that row can be 4, 5 or 7") </p>
<p>... has always seemed obvious to me, but I'm now informed from some sources that it has a name called "Crook's algorithm":</p>
<ul>
<li><a href="http://pi.math.cornell.edu/~mec/Summer2009/meerkamp/Site/Solving_any_Sudoku_II.html" rel="nofollow noreferrer">http://pi.math.cornell.edu/~mec/Summer2009/meerkamp/Site/Solving_any_Sudoku_II.html</a></li>
<li><a href="https://www.ams.org/notices/200904/tx090400460p.pdf" rel="nofollow noreferrer">https://www.ams.org/notices/200904/tx090400460p.pdf</a></li>
</ul>
<p>The latter (by Crook) attributes the algorithm to texts written in 2005 and 2006. Are these really the earliest references? I'm pretty sure this must have been well-known for decades, but I'm not sure what to search for to find older references.</p>sudokupuzzle-historyFri, 02 Aug 2019 07:02:08 GMThttps://puzzling.stackexchange.com/q/86805something2019-08-02T07:02:08ZComment by Abhimanyu Pallavi Sudhir on Is there go up line character? (Opposite of \n)
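<p>For reference, steps 1-4 above (restricted to a single row/column/block) can be sketched in Python as follows -- a toy implementation of my own, just to pin down the algorithm being asked about:</p>

```python
from itertools import combinations

def prune_unit(labels):
    """Steps 2-3 of the algorithm, applied to one row/column/block.

    `labels` is a list of candidate-sets, one per cell.  Every K-tuple
    of cells whose labels' union has cardinality exactly K is a
    "K-partnership"; its union is the "banned set", whose values are
    removed from the labels of all other cells in the unit.
    """
    n = len(labels)
    for k in range(1, n):
        for cells in combinations(range(n), k):
            banned = set().union(*(labels[i] for i in cells))
            if len(banned) == k:
                for i in range(n):
                    if i not in cells:
                        labels[i] -= banned
    return labels

# The example from the question: cells labelled (4,5), (4,7), (4,5,7)
# form a 3-partnership, so no other cell in the unit can be 4, 5 or 7.
unit = [{4, 5}, {4, 7}, {4, 5, 7}, {1, 4, 9}, {5, 6, 7, 8}]
prune_unit(unit)
```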
https://stackoverflow.com/questions/11474391/is-there-go-up-line-character-opposite-of-n/11474509#11474509
Doesn't work with Windows/Python 3.Thu, 01 Aug 2019 19:49:57 GMThttps://stackoverflow.com/questions/11474391/is-there-go-up-line-character-opposite-of-n/11474509?cid=101123890#11474509Abhimanyu Pallavi Sudhir2019-08-01T19:49:57ZComment by Abhimanyu Pallavi Sudhir on Output to the same line overwriting previous output?
https://stackoverflow.com/questions/4897359/output-to-the-same-line-overwriting-previous-output/27023394#27023394
Using <code>end = '\r'</code> instead fixes the problem in Python 3.Wed, 31 Jul 2019 19:10:36 GMThttps://stackoverflow.com/questions/4897359/output-to-the-same-line-overwriting-previous-output/27023394?cid=101088967#27023394Abhimanyu Pallavi Sudhir2019-07-31T19:10:36ZAnswer by Abhimanyu Pallavi Sudhir for Neural Networks vs. Polynomial Regression/Other techniques for curve fitting?
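For context, the Python 3 fix mentioned in the comment looks like this (stdout is captured into a buffer here so the effect can be inspected):

```python
import contextlib
import io

# print(..., end='\r') returns the cursor to the start of the line
# instead of emitting a newline, so the next print overwrites it.
buf = io.StringIO()
with contextlib.redirect_stdout(buf):
    for i in range(3):
        print(f"progress: {i}/2", end="\r")
    print()  # final newline so later output starts on a fresh line

out = buf.getvalue()
```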
https://math.stackexchange.com/questions/2901209/neural-networks-vs-polynomial-regression-other-techniques-for-curve-fitting/3308606#3308606
0<p>Polynomial regression is just usually the wrong Bayesian prior. You need functions with highly "non-local" effects which require high-degree polynomials, but polynomial regression gives zero prior probabilities to high-degree polynomials. As it turns out, neural networks happen to provide a reasonably good prior (perhaps that's why our brains work that way -- if they even do).</p>Tue, 30 Jul 2019 18:31:31 GMThttps://math.stackexchange.com/questions/2901209/-/3308606#3308606Abhimanyu Pallavi Sudhir2019-07-30T18:31:31ZOrthochronous indefinite orthogonal group $O^+(m, n)$ forms a group
https://physics.stackexchange.com/questions/494260/orthochronous-indefinite-orthogonal-group-om-n-forms-a-group
1<p>My question is based on Qmechanic's answer <a href="https://physics.stackexchange.com/a/36425/23119">here</a> which proves that <span class="math-container">$O^+(m, 1)$</span> forms a group -- that if two Lorentz transformations have positive time-time co-ordinate, so does their product. The key is that with the Lorentz transformation written in the form:</p>
<p><span class="math-container">$$\Lambda = \left[\begin{array}{cc}\Lambda_a & \Lambda_b^t \cr \Lambda_c &\Lambda_R \end{array} \right].$$</span></p>
<p>We can show that <span class="math-container">$|(\Lambda\tilde{\Lambda})_a-\Lambda_a\tilde{\Lambda}_a|\le \sqrt{(\Lambda_a^2-1)(\tilde{\Lambda}_a^2-1)}$</span> which implies that positive <span class="math-container">$\Lambda_a,\tilde{\Lambda_a}$</span> imply positive <span class="math-container">$(\Lambda\tilde{\Lambda})_a$</span>.</p>
<p>Well, the trouble is that this uses the Cauchy-Schwarz inequality in Step 6, and therefore doesn't work for the general case of <span class="math-container">$O^+(m, n)$</span>. How would one generalise the proof to <strong>prove the orthochronous indefinite orthogonal group <span class="math-container">$O^+(m, n)$</span> is a group</strong>?</p>
<p>Here's what I've tried so far: defining <span class="math-container">$O^{+}(m,n)$</span> as the subset of <span class="math-container">$O(m,n)$</span> with elements <span class="math-container">$\Lambda$</span> which satisfy <span class="math-container">$\det(\Lambda_a)>0$</span> (and in fact <span class="math-container">$\ge 1$</span>), </p>
<ol>
<li><p>As before, <span class="math-container">$(\Lambda\tilde{\Lambda})_a=\Lambda_a\tilde{\Lambda}_a+\Lambda_b^T\tilde{\Lambda}_c$</span>. </p></li>
<li><p>From multiplying out <span class="math-container">$\Lambda^T\eta \Lambda=\eta$</span> and <span class="math-container">$\Lambda\eta \Lambda^T=\eta$</span>, we see that <span class="math-container">$\Lambda_a^2-\Lambda_c^T\Lambda_c=\Lambda_a^2-\Lambda_b^T\Lambda_b=I$</span> and analogous for <span class="math-container">$\tilde{\Lambda}$</span>.</p></li>
<li><p>So <span class="math-container">$\det\left((\Lambda\tilde{\Lambda})_a-\Lambda_a\tilde{\Lambda}_a\right)=\det\left(\Lambda_b^T\tilde{\Lambda}_c\right)=\sqrt{\det\left(\Lambda_a^2-I\right)\det\left(\tilde{\Lambda}_a^2-I\right)}$</span>.</p></li>
</ol>
<p>Well, I'm not sure how to proceed at this point. Does <span class="math-container">$\det(X-PQ)=\det((P^2-I)(Q^2-I))^{1/2}$</span> imply that <span class="math-container">$\det P\ge 1\land\det Q\ge 1\Rightarrow \det X>0$</span> <em>in general</em>?</p>
<p>The <a href="https://physics.stackexchange.com/a/36388/23119">"topological proof" from Ron Maimon</a> does not work either, as the orbit of the unit time vector is <a href="https://math.stackexchange.com/questions/2022156/how-many-sheets-can-a-hyperboloid-have-in-n-dimensions">connected when <span class="math-container">$n>1$</span></a>. I suspect that a more powerful technique than looking at the orbit of the unit time vector would be to look at the topology of the Lie group itself -- but I'm not that familiar with this stuff.</p>special-relativitymathematical-physicsgroup-theorylorentz-symmetrytopologyMon, 29 Jul 2019 20:30:31 GMThttps://physics.stackexchange.com/q/494260Abhimanyu Pallavi Sudhir2019-07-29T20:30:31ZAnswer by Abhimanyu Pallavi Sudhir for What is the frequency of white light?
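(Not an answer to the general <span class="math-container">$O^+(m,n)$</span> question, but as a numerical sanity check of the closure claim in the already-understood <span class="math-container">$O^+(1,1)$</span> case -- a toy check of my own, certainly not a proof:)

```python
import numpy as np

def boost(w):
    """An orthochronous element of O(1,1): time-time entry cosh(w) >= 1."""
    c, s = np.cosh(w), np.sinh(w)
    return np.array([[c, s], [s, c]])

eta = np.diag([-1.0, 1.0])
rng = np.random.default_rng(0)
ok = True
for _ in range(200):
    P = boost(rng.normal()) @ boost(rng.normal())
    ok &= np.allclose(P.T @ eta @ P, eta)  # product stays in O(1,1)
    ok &= P[0, 0] >= 1                     # and stays orthochronous
```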
https://physics.stackexchange.com/questions/494081/what-is-the-frequency-of-white-light/494085#494085
1<p>It doesn't have a specific frequency -- it has a frequency distribution.</p>
<p>You don't even need to go as far as white light -- just consider a "camel hump" wave, like <span class="math-container">$\sin ax+\sin bx$</span> -- what's the frequency of a light wave that looks like this? The answer is that its frequency isn't a fixed value, but a distribution, taking values <span class="math-container">$a/2\pi$</span> and <span class="math-container">$b/2\pi$</span> with half probability each. In general, if you have some function <span class="math-container">$f(x)$</span>, the way to obtain this <strong>frequency distribution</strong> is to decompose <span class="math-container">$f(x)$</span> in terms of sinusoids -- this is precisely the <strong>Fourier transform</strong>.</p>
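<p>The camel-hump example is easy to check numerically (the particular frequencies below are arbitrary choices of mine):</p>

```python
import numpy as np

# A "camel hump" wave: two sinusoids, at 3 Hz and 7 Hz.
fs, T = 64, 8                        # sample rate (Hz) and duration (s)
t = np.arange(0, T, 1 / fs)
wave = np.sin(2 * np.pi * 3 * t) + np.sin(2 * np.pi * 7 * t)

# The Fourier transform recovers the frequency distribution:
# two equal spikes, one at each frequency.
spectrum = np.abs(np.fft.rfft(wave))
freqs = np.fft.rfftfreq(len(wave), 1 / fs)
peaks = freqs[spectrum > 0.5 * spectrum.max()]
```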
<p>In the specific case you mentioned, position and momentum ("frequency") are "Fourier duals" of each other. If you have a sinusoid (by which I mean <span class="math-container">$e^{2\pi i\xi x}$</span>), you have complete uncertainty about the position, but have a precise value for the momentum: <span class="math-container">$h\xi$</span>. On the other hand, if you had localised your position completely (to a Dirac delta function), you would find a sinusoid in momentum-space.</p>
<p>These distributions are called the "wavefunctions" in position and momentum basis respectively, and this duality is the "uncertainty principle" -- read more about this in my <a href="https://thewindingnumber.blogspot.com/p/2103.html" rel="nofollow noreferrer">quantum mechanics articles here</a> (specifically article 4). As for white light itself: it isn't really a well-defined concept in physics -- it has to do with human eyesight and what visible light entails -- but nonetheless its frequency is indeed a distribution with non-zero variance.</p>Sun, 28 Jul 2019 18:23:58 GMThttps://physics.stackexchange.com/questions/494081/-/494085#494085Abhimanyu Pallavi Sudhir2019-07-28T18:23:58ZHow does a covariance intensity function measure clustering?
https://stats.stackexchange.com/questions/418046/how-does-a-covariance-intensity-function-measure-clustering
0<p>I was taught in a class on spatial statistics that the covariance intensity function (defined below) measured clustering and inhibition in a point process, but isn't used because good test statistics for it don't exist.</p>
<p><span class="math-container">$$c(\mathbf{x},\mathbf{y})=\lim_{|d\mathbf{x}|\to0,|d\mathbf{y}|\to 0} \frac{\mathrm{Cov}\{N(d\mathbf{x}), N(d\mathbf{y})\}}{|d\mathbf{x}||d\mathbf{y}|}$$</span></p>
<p>Where <span class="math-container">$\mathbf{x}, \mathbf{y}$</span> are positions on the domain and <span class="math-container">$d\mathbf{x}, d\mathbf{y}$</span> are regions around them with areas given by <span class="math-container">$|d\mathbf{x}|, |d\mathbf{y}|$</span>, and <span class="math-container">$N(R)$</span> represents the random variable corresponding to the number of events in a region.</p>
<p>But I can't see how this measures non-homogeneity at all -- if one starts with a process that is described by an intensity function -- any intensity function -- this covariance should necessarily be zero for <span class="math-container">$\mathbf{x}\ne\mathbf{y}$</span>, as the existence of an intensity function means a point turning up at point <span class="math-container">$\mathbf{x}$</span> is independent of a point turning up at point <span class="math-container">$\mathbf{y}$</span>. And you can certainly have intensity functions that exhibit clustering.</p>
<p>The only way this function can be non-zero, as I see it, is if there are correlations within a realisation -- e.g. if "everything is clustered to one side" and "everything is clustered to the other side" are the two possibilities -- i.e. if you don't have an intensity function at all, but rather some sort of "entangled state".</p>
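<p>To make that scenario concrete, here is a toy simulation (my own construction) comparing such a mixture process with independent Poisson counts of the same mean intensity:</p>

```python
import numpy as np

rng = np.random.default_rng(0)
n_real, lam = 20000, 10.0

# Mixture process: each realisation flips a coin and puts ALL of its
# points on the left half or the right half of the window.
side = rng.integers(0, 2, n_real)
counts = rng.poisson(lam, n_real)
n_left = np.where(side == 0, counts, 0)
n_right = np.where(side == 1, counts, 0)
cov_mixture = np.cov(n_left, n_right)[0, 1]    # strongly negative (~ -25)

# Poisson process with the same mean intensity: counts in disjoint
# regions are independent, so the covariance is essentially zero.
cov_poisson = np.cov(rng.poisson(lam / 2, n_real),
                     rng.poisson(lam / 2, n_real))[0, 1]
```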
<p>What am I missing?</p>correlationcovariancespatialpoint-processThu, 18 Jul 2019 10:50:54 GMThttps://stats.stackexchange.com/q/418046Abhimanyu Pallavi Sudhir2019-07-18T10:50:54ZAnswer by Abhimanyu Pallavi Sudhir for How does non-commutativity lead to uncertainty?
https://physics.stackexchange.com/questions/10362/how-does-non-commutativity-lead-to-uncertainty/491378#491378
0<p>When first learning about wavefunction collapse, I was surprised by the idea that the wavefunction would just <em>become</em> an eigenstate of the observable -- losing all other components of the state vector. Well, it's not as bad as you'd first expect, because the Hilbert space is really big. </p>
<p>But if two operators <em>do not have a common eigenbasis</em> -- i.e. if they don't commute, you do "lose information" about one observable when measuring the other one. This is precisely what the uncertainty principle codifies.</p>Sat, 13 Jul 2019 10:56:00 GMThttps://physics.stackexchange.com/questions/10362/-/491378#491378Abhimanyu Pallavi Sudhir2019-07-13T10:56:00ZAnswer by Abhimanyu Pallavi Sudhir for Should it be obvious that independent quantum states are composed by taking the tensor product?
https://physics.stackexchange.com/questions/54896/should-it-be-obvious-that-independent-quantum-states-are-composed-by-taking-the/489138#489138
0<p>I think it is pretty obvious. Correct me if my argument is wrong somewhere.</p>
<p>In the classical case, if you want to describe e.g. the x-positions of two particles, you have a two-dimensional phase space to show the possible states -- and two is the sum of one and one. But a quantum state space is very different -- every point in the <span class="math-container">$x_1$</span> "axis" is a basis vector of its own, and likewise for <span class="math-container">$x_2$</span> -- the state vectors we speak of are vectors in the Hilbert space, and can be shown as distributions mapped on this <span class="math-container">$(x_1,x_2)$</span> plane, representing them as superpositions of these basis vectors.</p>
<p>So it makes perfect sense that the dimension of the product space is the product of the dimensions and not the sum. The total number of points in the <span class="math-container">$(x_1,x_2)$</span> plane -- which is the dimension of this new Hilbert space -- is the product of the number of points on the <span class="math-container">$x_1$</span> axis and the <span class="math-container">$x_2$</span> axis.</p>
<p>It's clear that the <em>probabilities</em> are multiplicative. Given states <span class="math-container">$|\phi\rangle=\sum p(x)|x\rangle$</span> and <span class="math-container">$|\psi\rangle=\sum q(y)|y\rangle$</span> in bases <span class="math-container">$|x\rangle$</span> and <span class="math-container">$|y\rangle$</span>, it is clear that the <em>magnitudes</em> of the components of the state <span class="math-container">$$|\phi\rangle\otimes|\psi\rangle=\sum r(x,y)|x\rangle\otimes|y\rangle$$</span> (where <span class="math-container">$\otimes$</span> is the desired product representing composition) of the combined system are <span class="math-container">$|r(x,y)|^2=|p(x)q(y)|^2$</span>. But -- as you ask in your question -- how do we know that <span class="math-container">$r(x,y)=p(x)q(y)$</span>?</p>
<p>The idea is quite simple, though -- suppose we have a state like </p>
<p><span class="math-container">$$\left( {\frac{1}{{\sqrt 2 }}\left| x \right\rangle + \frac{1}{{\sqrt 2 }}\left| y \right\rangle } \right) \otimes \left| z \right\rangle = \frac{u}{{\sqrt 2 }}\left| x \right\rangle \otimes \left| z \right\rangle + \frac{v}{{\sqrt 2 }}\left| y \right\rangle \otimes \left| z \right\rangle $$</span></p>
<p>Because we're representing two independent systems, we can just observe the first system, collapsing it to <span class="math-container">$|x\rangle$</span>: then the combined state based on the left-hand-side is collapsed to <span class="math-container">$|x\rangle\otimes|z\rangle$</span>. But based on the right-hand-side, this is <span class="math-container">$u|x\rangle\otimes|z\rangle$</span>, and thus <span class="math-container">$u=1$</span> and similarly for <span class="math-container">$v$</span>.</p>Mon, 01 Jul 2019 09:43:29 GMThttps://physics.stackexchange.com/questions/54896/-/489138#489138Abhimanyu Pallavi Sudhir2019-07-01T09:43:29ZAnswer by Abhimanyu Pallavi Sudhir for Closure under Lie Bracket -- how is $c''(0)$ promoted to $(f\circ c)''(0)$
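<p>The multiplicativity of probabilities can be sanity-checked with a quick Kronecker-product computation (the example amplitudes are arbitrary choices of mine):</p>

```python
import numpy as np

# Two independent states, written as amplitude vectors.
phi = np.array([3, 4j]) / 5              # p(x);  sum of |p|^2 is 1
psi = np.array([1, 1]) / np.sqrt(2)      # q(y);  sum of |q|^2 is 1

# Composite state via the tensor (Kronecker) product: r(x,y) = p(x)q(y).
state = np.kron(phi, psi)

# Probabilities multiply: |r(x,y)|^2 = |p(x)|^2 |q(y)|^2.
probs = np.abs(state) ** 2
expected = np.outer(np.abs(phi) ** 2, np.abs(psi) ** 2).ravel()
```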
https://math.stackexchange.com/questions/3264841/closure-under-lie-bracket-how-is-c0-promoted-to-f-circ-c0/3268462#3268462
0<p>Ah, never mind, it's obvious -- I just got confused because it's not true for all curves. By the chain and product rules, <span class="math-container">$(f\circ c)''(t)$</span> is equal to</p>
<p><span class="math-container">$$c''(t)\cdot\nabla f(c(t))+c'(t)\cdot\frac{d}{dt}\nabla f(c(t))$$</span></p>
<p>And since <span class="math-container">$c'(0)=0$</span> for the given curve, this is just equal to the first term.</p>Thu, 20 Jun 2019 09:53:36 GMThttps://math.stackexchange.com/questions/3264841/closure-under-lie-bracket-how-is-c0-promoted-to-f-circ-c0/3268462#3268462Abhimanyu Pallavi Sudhir2019-06-20T09:53:36ZClosure under Lie Bracket -- how is $c''(0)$ promoted to $(f\circ c)''(0)$
https://math.stackexchange.com/questions/3264841/closure-under-lie-bracket-how-is-c0-promoted-to-f-circ-c0
1<p>I've seen numerous different proofs that the tangent space to a Lie group is closed under <span class="math-container">$[\cdot,\cdot]$</span>, i.e. that the Lie Bracket of two derivations is a derivation -- e.g. considering and differentiating the curve <span class="math-container">$e^{\sqrt{t}X}e^{\sqrt{t}Y}e^{-\sqrt{t}X}e^{-\sqrt{t}Y}$</span>, or just showing that <span class="math-container">$[D_1,D_2]$</span> follows the product rule.</p>
<p>But one derivation I don't get comes from Timothy Goldberg's set of lecture notes <em><a href="http://pi.math.cornell.edu/~goldberg/Talks/Flows-Olivetti.pdf" rel="nofollow noreferrer">The Lie Bracket and the Commutator of Flows</a></em>. Here's the process:</p>
<ol>
<li>Define the curve <span class="math-container">$c(t)=\Phi_X^t\Phi_Y^t\Phi_X^{-t}\Phi_Y^{-t}(e)$</span>.</li>
<li>Show that <span class="math-container">$[X,Y]=\frac12c''(0)$</span>.</li>
<li>Define an operation <span class="math-container">$D:f\mapsto (f\circ c)''(0)$</span>.</li>
<li>Show that <span class="math-container">$D$</span> is a derivation.</li>
</ol>
<p>It's Step 3 I don't get. How do we know this operator <span class="math-container">$D$</span> is what "upgrades" <span class="math-container">$[X,Y]$</span> into a vector field? How can we show that <span class="math-container">$[X,Y]$</span> is the direction in which <span class="math-container">$D$</span> differentiates <span class="math-container">$f$</span>?</p>lie-groupslie-algebraslie-derivativeMon, 17 Jun 2019 00:39:06 GMThttps://math.stackexchange.com/q/3264841Abhimanyu Pallavi Sudhir2019-06-17T00:39:06ZAnswer by Abhimanyu Pallavi Sudhir for Is there a mathematical basis for Born rule?
https://physics.stackexchange.com/questions/215602/is-there-a-mathematical-basis-for-born-rule/483618#483618
1<p>One motivation comes from looking at light waves and polarisation -- when light passes through some filter, the energy of a light wave is scaled by <span class="math-container">$\cos^2\theta$</span> -- for a single photon, this means (as you can't have <span class="math-container">$\cos^2\theta$</span> of a photon) there is a probability of <span class="math-container">$\cos^2\theta$</span> that the number of photons passing through is "1". This <span class="math-container">$\cos\theta$</span> is simply the dot product of the "state vector" (polarisation vector) and the eigenvector of the number operator associated with the polarisation filter with eigenvalue 1 -- i.e. the probability of observing "1" is <span class="math-container">$|\langle\psi|1\rangle|^2$</span>, and the probability of observing "0" is <span class="math-container">$|\langle\psi|0\rangle|^2$</span>, which is Born's rule.</p>
<p>So if you're motivating the state vector based on the polarisation vector, you can motivate Born's rule from <span class="math-container">$E=|A|^2$</span>, as above.</p>
<p>More abstractly, if you accept the other axioms of quantum mechanics, Born's rule is sort of the "only way" to encode probabilities, as you want probability of the union of disjoint events to be additive (equivalent to the Pythagoras theorem) and the total probability to be one (the length of the state vector is one). </p>
<p>But there is no way to "derive" the Born rule, it is an axiom. Quantum mechanics is fundamentally quite different to e.g. relativity, in the sense that it develops a whole new abstract mathematical theory to connect to the real world. So unlike in relativity, you don't have two axioms that are literally the result of observation and everything is derived from it -- instead, you have an axiomatisation of the mathematical theory, and then a way to connect the theory with observation, which is what Born's rule is. Certainly the <em>motivation</em> for quantum mechanics comes from wave-particle duality, but this is not an axiomatisation.</p>Fri, 31 May 2019 23:33:47 GMThttps://physics.stackexchange.com/questions/215602/-/483618#483618Abhimanyu Pallavi Sudhir2019-05-31T23:33:47ZAnswer by Abhimanyu Pallavi Sudhir for Why do we use Hermitian operators in QM?
https://physics.stackexchange.com/questions/39602/why-do-we-use-hermitian-operators-in-qm/482816#482816
0<p>The point of eigenstates and the entire linear algebra of quantum mechanics is that the projections <span class="math-container">$\langle\phi|\psi\rangle$</span> of the state <span class="math-container">$|\psi\rangle$</span> onto each eigenstate <span class="math-container">$|\phi\rangle$</span> represent the probability amplitudes of each eigenstate. In particular, this means:</p>
<p><span class="math-container">$$\sum |\langle\phi|\psi\rangle|^2 = 1=|\langle\psi|\psi\rangle|^2$$</span></p>
<p>Where the summation is taken over all the eigenstates of an operator. As this must be true for all states <span class="math-container">$|\psi\rangle$</span>, the sum on the left must be a Pythagorean sum, so the <span class="math-container">$|\phi\rangle$</span>s must form an orthogonal basis. Alternatively, one may just note that we must have <span class="math-container">$\langle \phi_1|\phi_2\rangle=0$</span> whenever the corresponding eigenvalues are distinct, as two distinct observations must be mutually exclusive.</p>
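<p>As a quick numerical sanity check of this (an illustration added here, with arbitrary random data, not part of the argument above): the eigenvectors of a Hermitian matrix form an orthonormal basis, so the squared projections of any normalised state onto them sum to <span class="math-container">$1=|\langle\psi|\psi\rangle|^2$</span>.</p>

```python
import numpy as np

# A random Hermitian matrix (H = H^dagger); eigh returns an orthonormal eigenbasis
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
H = A + A.conj().T

eigvals, eigvecs = np.linalg.eigh(H)  # columns of eigvecs are the eigenstates

# A normalised state |psi>
psi = rng.normal(size=3) + 1j * rng.normal(size=3)
psi /= np.linalg.norm(psi)

# Squared projections |<phi|psi>|^2 onto the eigenbasis sum to 1
probs = np.abs(eigvecs.conj().T @ psi) ** 2
print(probs.sum())
```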
<hr>
<p>That shows that the matrices must be normal. That they are chosen to be Hermitian is non-essential, but useful, as has already been discussed.</p>Mon, 27 May 2019 22:20:33 GMThttps://physics.stackexchange.com/questions/39602/-/482816#482816Abhimanyu Pallavi Sudhir2019-05-27T22:20:33ZComment by Abhimanyu Pallavi Sudhir on What are your favorite instructional counterexamples?
https://mathoverflow.net/questions/16829/what-are-your-favorite-instructional-counterexamples/17285#17285
@ManfredWeis Would you recall the title of the post you meant to link to? Your link is an actively updated feed -- is it <a href="https://calculus7.org/2014/12/07/tossing-a-continuous-coin/" rel="nofollow noreferrer">this</a>?Sat, 25 May 2019 13:30:55 GMThttps://mathoverflow.net/questions/16829/what-are-your-favorite-instructional-counterexamples/17285?cid=829577#17285Abhimanyu Pallavi Sudhir2019-05-25T13:30:55ZAnswer by Abhimanyu Pallavi Sudhir for Are there other kinds of bump functions than $e^\frac1{x^2-1}$?
https://math.stackexchange.com/questions/101480/are-there-other-kinds-of-bump-functions-than-e-frac1x2-1/3236066#3236066
3<p>Here's how you can generate as many different kinds of bump functions as you want, for whatever definition of "kind" you may have:</p>
<ol>
<li>Start with any function <span class="math-container">$f(x)$</span> that <strong>grows faster than all polynomials</strong>, i.e. <span class="math-container">$\forall N, \ \lim_{x\to\infty}\frac{x^N}{f(x)}=0$</span>. Example: <span class="math-container">$e^x$</span>.</li>
<li>Then consider the function <span class="math-container">$g(x)=\frac1{f(1/x)}$</span>. This is a function that is flatter than all polynomials near zero, i.e. <span class="math-container">$\forall N,\ \lim_{x\to0}\frac{g(x)}{x^N}=0$</span>. This is a <strong>smooth non-analytic</strong> function. For our example, we get <span class="math-container">$e^{-1/x}$</span>.</li>
<li>Consider the function <span class="math-container">$h(x)=g(1+x)g(1-x)$</span>. This, after zeroing out stuff outside the interval <span class="math-container">$(-1,1)$</span>, is a <strong>bump function</strong>. For our example, <span class="math-container">$e^{2/(x^2-1)}$</span>.</li>
<li>Scale and transform to your liking.</li>
</ol>
<p>Just do this with different "kinds" of growth functions <span class="math-container">$f$</span>, and you'll get different "kinds" of bump functions <span class="math-container">$h$</span>. So here are some functions I could generate with this method -- try to guess which functions they're from:</p>
<p><span class="math-container">$$\begin{array}{l}
h(x) = {e^{2/({x^2} - 1)}} \\
h(x) = (1 + x)^{1/(1 + x)}(1 - x)^{1/(1 - x)} \\
h(x) = \frac1{\frac1{1 + x}!\frac1{1-x}!} \\
h(x)=e^{-[\ln^2(1+x)+\ln^2(1-x)]}
\end{array}$$</span></p>
<p>And the more rapidly your <span class="math-container">$f(x)$</span> grows, the nicer your bump function <span class="math-container">$h(x)$</span> looks.</p>
<hr>
<p>Here's a Desmos applet to try this with different functions <span class="math-container">$f$</span>: <a href="https://www.desmos.com/calculator/ccf2goi9bj" rel="nofollow noreferrer"><strong>desmos.com/calculator/ccf2goi9bj</strong></a>. </p>
<p>If you're interested in smooth non-analytic functions, have a look at my post <a href="https://thewindingnumber.blogspot.com/2019/05/whats-with-e-1x-on-smooth-non-analytic.html" rel="nofollow noreferrer"><strong>What's with e^(-1/x)? On smooth non-analytic functions: part I</strong></a>.</p>Wed, 22 May 2019 18:36:36 GMThttps://math.stackexchange.com/questions/101480/are-there-other-kinds-of-bump-functions-than-e-frac1x2-1/3236066#3236066Abhimanyu Pallavi Sudhir2019-05-22T18:36:36ZAnswer by Abhimanyu Pallavi Sudhir for Why does Taylor’s series “work”?
https://physics.stackexchange.com/questions/480163/why-does-taylor-s-series-work/481556#481556
0<p>Adding to <a href="https://physics.stackexchange.com/a/480187/">Sympathiser's answer</a> -- one can see why the existence of functions like <span class="math-container">$e^{-1/x}$</span> is not surprising by rephrasing them as "<strong>functions that approach zero near zero faster than any polynomial</strong>". This is not fundamentally more surprising than e.g. functions that grow faster than every polynomial -- in fact, for any function <span class="math-container">$f(x)$</span> that grows faster than every polynomial, the function <span class="math-container">$\frac1{f(1/x)}$</span> approaches zero near zero faster than any polynomial.</p>
<p>So for rapidly growing <span class="math-container">$f(x)=e^x$</span>, one gets the corresponding smooth non-analytic <span class="math-container">$e^{-1/x}$</span>. For <span class="math-container">$x^x$</span>, one gets <span class="math-container">$x^{1/x}$</span>. For <span class="math-container">$x!$</span>, one gets <span class="math-container">$\frac{1}{(1/x)!}$</span>, and so on.</p>
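<p>The key claim -- that <span class="math-container">$\frac1{f(1/x)}$</span> vanishes at <span class="math-container">$0^+$</span> faster than any power of <span class="math-container">$x$</span> -- can be checked symbolically for the <span class="math-container">$f(x)=e^x$</span> case (a quick illustrative check, not part of the argument):</p>

```python
import sympy as sp

x = sp.symbols('x', positive=True)
g = sp.exp(-1 / x)  # the function 1/f(1/x) arising from f(x) = e^x

# g/x^N -> 0 as x -> 0+ for every fixed N, i.e. g beats every polynomial
for N in range(1, 6):
    print(N, sp.limit(g / x**N, x, 0, '+'))  # the limit is 0 each time
```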
<p>See my post <a href="https://thewindingnumber.blogspot.com/2019/05/whats-with-e-1x-on-smooth-non-analytic.html" rel="nofollow noreferrer"><strong>What's with e^(-1/x)? On smooth non-analytic functions: part I</strong></a> for a fuller explanation.</p>Wed, 22 May 2019 00:11:33 GMThttps://physics.stackexchange.com/questions/480163/-/481556#481556Abhimanyu Pallavi Sudhir2019-05-22T00:11:33ZData formats of inputs to arrange function in dplyr
https://stackoverflow.com/questions/56158114/data-formats-of-inputs-to-arrange-function-in-dplyr
0<p>Given a table <code>monkeys</code> with column <code>brain_size</code>, one can write something like <strong><code>arrange(monkeys, brain_size)</code></strong>. </p>
<p>I don't understand how this makes sense -- <strong><code>brain_size</code> isn't a declared variable</strong> (if I refer to it, I get an error). It's just the name of a column -- shouldn't you rather have <code>arrange(monkeys, 'brain_size')</code>? <strong><em>Isn't</em> the column name just a string?</strong></p>
<p>Another related weirdness -- </p>
<pre><code>arrange(monkeys, desc(brain_size))
</code></pre>
<p>Once again, what exactly is the <strong><code>desc</code> function</strong>? How can it take <code>brain_size</code> as an input? Shouldn't you have something like <code>arrange(monkeys, 'brain_size', desc = true)</code>?</p>
<p>Am I missing something? Perhaps <code>brain_size</code> is a variable in some way but can only be accessed when you're unambiguously "inside" <code>monkeys</code>.</p>rfunctiontypesdplyrWed, 15 May 2019 21:51:42 GMThttps://stackoverflow.com/q/56158114Abhimanyu Pallavi Sudhir2019-05-15T21:51:42ZAnswer by Abhimanyu Pallavi Sudhir for Geometrical Interpretation of Cauchy Riemann equations?
https://math.stackexchange.com/questions/1026134/geometrical-interpretation-of-cauchy-riemann-equations/3197879#3197879
1<p>One might think that being differentiable on <span class="math-container">$\mathbb{R}^2$</span> is sufficient for differentiability on <span class="math-container">$\mathbb{C}$</span>. But the Jacobian of an arbitrary such function doesn't have a natural complex number representation.</p>
<p><span class="math-container">$$
\left[ {\begin{array}{*{20}{c}}
{\partial u/\partial x} & {\partial u/\partial y} \\
{\partial v/\partial x} & {\partial v/\partial y}
\end{array}} \right]
$$</span></p>
<p>Another way of putting this is that no complex-valued derivative (see below for an example) you can define for an arbitrary function fully captures the local behaviour of the function that is represented by the Jacobian.</p>
<p><span class="math-container">$$
\frac{df}{dz} = \left(\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} \right) + i\left(\frac{\partial v}{\partial x}-\frac{\partial u}{\partial y}\right)
$$</span></p>
<p>The idea is that we should be able to define a complex-valued derivative "purely" for the value <span class="math-container">$z$</span>, without considering directions, i.e. we want to consider <span class="math-container">$\mathbb{C}$</span> one-dimensional in some sense (the sense being "as a vector space"). More precisely, the derivative in some direction in <span class="math-container">$\mathbb{C}$</span> should determine the derivative in all other directions in a natural manner -- whereas on <span class="math-container">$\mathbb{R}^2$</span>, the derivatives in <em>two</em> directions (i.e. the gradient) determine the directional derivatives in all directions.</p>
<p>If you think about it, this is quite a reasonable idea -- it's analogous to how not every linear transformation on <span class="math-container">$\mathbb{R}^2$</span> is a linear transformation on <span class="math-container">$\mathbb{C}$</span> -- only spiral transformations are.</p>
<p><span class="math-container">$$
\left[ {\begin{array}{*{20}{c}}
{a} & {-b} \\
{b} & {a}
\end{array}} \right]
$$</span></p>
<p>How would we generalise differentiability to an arbitrary manifold? Here's an idea: <strong>a function is differentiable if it is locally a linear transformation</strong>. So on <span class="math-container">$\mathbb{R}^2$</span>, any Jacobian matrix is a linear transformation. But on <span class="math-container">$\mathbb{C}$</span>, only Jacobians of the above form are linear transformations -- i.e. the only linear transformation on <span class="math-container">$\mathbb{C}$</span> is <strong>multiplication by a complex number</strong>, i.e. a spiral/amplitwist. So a complex differentiable function is one that is locally an amplitwist (geometrically), which can be stated in terms of the components of the Jacobian as:</p>
<p><span class="math-container">$$
\begin{align}
\frac{\partial u}{\partial x} & = \frac{\partial v}{\partial y} \\
\frac{\partial u}{\partial y} & = - \frac{\partial v}{\partial x} \\
\end{align}
$$</span></p>
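<p>As a sanity check (an added illustration, not part of the derivation above): a holomorphic function such as <span class="math-container">$f(z)=z^2$</span> satisfies these equations, while <span class="math-container">$f(z)=\bar z$</span> -- which reflects rather than amplitwists -- does not.</p>

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
z = x + sp.I * y

def cauchy_riemann(f):
    # split f into u + iv and test u_x = v_y and u_y = -v_x
    u, v = sp.re(f), sp.im(f)
    eq1 = sp.simplify(sp.diff(u, x) - sp.diff(v, y)) == 0
    eq2 = sp.simplify(sp.diff(u, y) + sp.diff(v, x)) == 0
    return eq1 and eq2

print(cauchy_riemann(sp.expand(z**2)))   # holomorphic: equations hold
print(cauchy_riemann(sp.conjugate(z)))   # conjugation: equations fail
```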
<p>This is precisely why you shouldn't (and can't) view complex differentiability as some basic first-degree smoothness -- there is a much richer structure to these functions, and it's better to think of them via the transformations they have on grids.</p>Tue, 23 Apr 2019 05:55:35 GMThttps://math.stackexchange.com/questions/1026134/-/3197879#3197879Abhimanyu Pallavi Sudhir2019-04-23T05:55:35ZAnswer by Abhimanyu Pallavi Sudhir for Computing the Lie bracket on the Lie group $GL(n, \mathbb{R})$
https://math.stackexchange.com/questions/1884253/computing-the-lie-bracket-on-the-lie-group-gln-mathbbr/3193887#3193887
1<p>I think the sensible way to get an intuition for this is to just look at the Taylor expansion of the group commutator:</p>
<p><span class="math-container">$$e^{\varepsilon x} e^{\varepsilon y} e^{-\varepsilon x} e^{-\varepsilon y}$$</span></p>
<p>Which to second order is <span class="math-container">$1+\varepsilon^2(xy-yx)$</span>. Presumably you know how to prove that the second derivative of the above expression is equivalent to the derivative-of-the-adjoint definition.</p>Fri, 19 Apr 2019 18:26:17 GMThttps://math.stackexchange.com/questions/1884253/-/3193887#3193887Abhimanyu Pallavi Sudhir2019-04-19T18:26:17ZAnswer by Abhimanyu Pallavi Sudhir for Determinant-like expression for non-square matrices
https://math.stackexchange.com/questions/903028/determinant-like-expression-for-non-square-matrices/3191959#3191959
0<p>See <a href="https://arxiv.org/abs/1904.08097" rel="nofollow noreferrer">1904.08097</a> for a review I authored of generalised determinant functions of tall matrices, and their properties -- this should provide a self-contained introduction to three different generalised determinants. </p>
<p>The function mentioned by Joonas Ilmavirta is the square of the "determinant-like function" that I first wrote about in 2013, albeit with an erroneous factor of <span class="math-container">$\sqrt{|m-n|!}$</span> at the front, which is corrected in the above review. It is also the norm-squared of the vector determinant, and the product of the singular values of the matrix.</p>
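<p>The agreement between the Gram-determinant formulation and the product of singular values can be checked numerically for a random tall matrix (a quick illustrative check, not from the review itself):</p>

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(5, 3))  # a random tall (5x3) matrix

# product of singular values vs. sqrt(det(A^T A)): these coincide,
# since det(A^T A) is the product of the squared singular values
prod_sv = np.prod(np.linalg.svd(A, compute_uv=False))
gram = np.sqrt(np.linalg.det(A.T @ A))

print(prod_sv, gram)
```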
<p>If you want a non-trivial determinant for "wide matrices", i.e. flattenings, you will need to be a bit creative in the definition of the determinant, such as by defining it as the scaling of <span class="math-container">$m$</span>-volumes where <span class="math-container">$m$</span> is the dimension of the flattened space.</p>Thu, 18 Apr 2019 03:38:38 GMThttps://math.stackexchange.com/questions/903028/-/3191959#3191959Abhimanyu Pallavi Sudhir2019-04-18T03:38:38ZAnswer by Abhimanyu Pallavi Sudhir for Intuitive explanation of a positive semidefinite matrix
https://math.stackexchange.com/questions/9758/intuitive-explanation-of-a-positive-semidefinite-matrix/3181937#3181937
1<p>Positive-definite matrices are matrices that are <strong>congruent to the identity matrix</strong>, i.e. that can be written as <span class="math-container">$P^HP$</span> for invertible <span class="math-container">$P$</span> (for some reason, a lot of authors define congruence as <span class="math-container">$N=P^TMP$</span>, but here we go by the Hermitian definition <span class="math-container">$N=P^HMP$</span>). </p>
<p>One reason this is useful is that if two forms <span class="math-container">$M$</span> and <span class="math-container">$N$</span> are congruent, their corresponding "generalised unitary groups" <span class="math-container">$\{A^HMA=M\}$</span> and <span class="math-container">$\{B^HNB=N\}$</span> are isomorphic (via conjugation by <span class="math-container">$P$</span>). So positive-definite matrices (as well as negative-definite matrices, because <span class="math-container">$-I$</span> is preserved by the unitary group as well) define a dot product whose geometry is isomorphic to Euclidean geometry.</p>
<p>Similarly, a <strong>positive semidefinite matrix</strong> defines a geometry that Euclidean geometry is <em>homeomorphic</em> to -- to put it slightly imprecisely, such a geometry has all the symmetries of Euclidean geometry, and perhaps then some.</p>
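<p>Numerically, the congruence picture gives semidefiniteness immediately (an added toy check with random data): for any matrix of the form <span class="math-container">$P^HP$</span>, the quadratic form evaluates to <span class="math-container">$x^HP^HPx=\|Px\|^2\ge0$</span>.</p>

```python
import numpy as np

rng = np.random.default_rng(2)
P = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
M = P.conj().T @ P  # of the form P^H P, i.e. congruent to I when P is invertible

# x^H M x = |Px|^2 >= 0 for every x, so M is positive semidefinite
xs = rng.normal(size=(100, 4)) + 1j * rng.normal(size=(100, 4))
vals = np.array([np.real(x.conj() @ M @ x) for x in xs])
print(np.all(vals >= -1e-9))  # non-negative up to floating-point error
```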
<p>See a fuller treatment <strong><a href="https://thewindingnumber.blogspot.com/2019/04/geometry-positive-definiteness-and.html" rel="nofollow noreferrer">here</a></strong>.</p>Wed, 10 Apr 2019 06:48:08 GMThttps://math.stackexchange.com/questions/9758/-/3181937#3181937Abhimanyu Pallavi Sudhir2019-04-10T06:48:08ZAnswer by Abhimanyu Pallavi Sudhir for Can non-linear transformations be represented as Transformation Matrices?
https://math.stackexchange.com/questions/450/can-non-linear-transformations-be-represented-as-transformation-matrices/3177854#3177854
<p>The point of transformation matrices is that the images of the <span class="math-container">$n$</span> basis vectors are sufficient to determine the action of the entire transformation -- this is true for linear transformations, but not for an arbitrary transformation.</p>
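<p>To make that concrete (an added toy example): a linear map and a nonlinear map can agree on every basis vector yet differ elsewhere, so the basis images pin down only the linear one.</p>

```python
import numpy as np

# The identity map (linear) and the componentwise cube (nonlinear)
# agree on both basis vectors of R^2...
A = np.eye(2)
cube = lambda v: v ** 3

e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
print(np.allclose(A @ e1, cube(e1)), np.allclose(A @ e2, cube(e2)))

# ...yet disagree off the basis: basis images determine a linear map,
# but not a general one
v = np.array([2.0, 3.0])
print(A @ v, cube(v))
```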
<p>However, nonlinear transformations (the smooth ones, anyway) can be locally approximated as linear transformations. With a bit of calculus, you get the "Jacobian matrix", which acts on the tangent vector space at every point on a manifold. This is a generalisation of transformation matrices in the sense that a linear transformation's Jacobian is equal to its matrix representation, i.e. in the same sense that the derivative generalises the slope (which completely determines a linear function <span class="math-container">$y=mx$</span>).</p>Sun, 07 Apr 2019 06:48:31 GMThttps://math.stackexchange.com/questions/450/-/3177854#3177854Abhimanyu Pallavi Sudhir2019-04-07T06:48:31ZAnswer by Abhimanyu Pallavi Sudhir for Why does $A^TA=I, \det A=1$ mean $A$ is a rotation matrix?
https://math.stackexchange.com/questions/68119/why-does-ata-i-det-a-1-mean-a-is-a-rotation-matrix/3177807#3177807
2<p>You could just write out the components to confirm that this is so -- a much more interesting way to understand things, however, is to write down the condition as:</p>
<p><span class="math-container">$$A^TIA=I$$</span></p>
<p>The idea is that the matrix <span class="math-container">$A$</span> <em>preserves the identity quadratic form</em> -- note that <span class="math-container">$I$</span> is a quadratic form here and not a linear transformation, as this is the transformation law for quadratic forms (<span class="math-container">$A^TMA$</span> instead of <span class="math-container">$A^{-1}MA$</span>).</p>
<p>The hyperconic section corresponding to the identity quadratic form is the unit sphere -- thus the orthogonal transformations are all those that preserve the unit sphere. Another way of putting this is that <span class="math-container">$(Ax)^TI(Ay)=x^TA^TIAy=x^TIy$</span>, i.e. the Euclidean dot product <span class="math-container">$I$</span> is preserved by <span class="math-container">$A$</span>. This is equivalent to preserving the unit sphere, because the unit sphere is determined by the dot product on the given space.</p>
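<p>Numerically (an added check with random data): a matrix satisfying <span class="math-container">$A^TA=I$</span> leaves every dot product -- and hence every length, and hence the unit sphere -- unchanged.</p>

```python
import numpy as np

# Build an orthogonal matrix via the QR decomposition of a random matrix
rng = np.random.default_rng(3)
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))

x, y = rng.normal(size=3), rng.normal(size=3)
print(np.isclose(x @ y, (Q @ x) @ (Q @ y)))                  # dot product preserved
print(np.isclose(np.linalg.norm(x), np.linalg.norm(Q @ x)))  # unit sphere preserved
```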
<p>What sort of transformations preserve the unit sphere? </p>
<hr>
<p>The reason this is a good way of understanding things is that there are plenty of other "dot products" you can define. One elementary one from physics is the Minkowski dot product in special relativity, <span class="math-container">$\mathrm{diag}(-1,1,1,1)$</span> -- the corresponding quadric surface is a hyperboloid, and the transformations that preserve it, forming the Lorentz group, are boosts (skews between time and a spatial dimension), spatial rotations and reflections.</p>
<hr>
<p>As for discriminating between rotations and reflections, suppose we define rotations in a completely geometric way -- for a matrix to be a rotation, all its eigenvalues are either 1 or in pairs of unit complex conjugates. </p>
<p>What do the eigenvalues of orthogonal matrices look like? For each eigenvalue, you need <span class="math-container">$\overline{\lambda}\lambda=1$</span>, i.e. all the eigenvalues are unit complex numbers. If a complex eigenvalue isn't paired with a corresponding conjugate, you will not get a real-valued transformation on <span class="math-container">$\mathbb{R}^n$</span>. Meanwhile if an eigenvalue of -1 isn't paired with another -1 -- i.e. if there are an odd number of reflections -- you get a reflection. The orthogonal (or rather unitary) transformations that do not behave this way are precisely the rotations.</p>
<p>The similarity between unpaired unit complex eigenvalues and unpaired -1's is interesting, by the way -- when thinking about reflections, you might have gotten the idea that reflections are <span class="math-container">$\pi$</span>-angle rotations in a higher-dimensional space -- like the vector was rotated through a higher-dimensional space and then landed on its reflection -- like it was a discrete snapshot of a process as smooth as any rotation. </p>
<p>Well, now you know what this higher-dimensional space is -- precisely <span class="math-container">$\mathbb{C}^n$</span>. And the determinant of a unitary matrix also takes a continuous spectrum -- the entire unit circle. In this sense (among other senses) complex linear algebra is more "complete" than real linear algebra.</p>Sun, 07 Apr 2019 05:55:12 GMThttps://math.stackexchange.com/questions/68119/why-does-ata-i-det-a-1-mean-a-is-a-rotation-matrix/3177807#3177807Abhimanyu Pallavi Sudhir2019-04-07T05:55:12ZAnswer by Abhimanyu Pallavi Sudhir for Reasoning about Lie theory and the Exponential Map
https://math.stackexchange.com/questions/19575/reasoning-about-lie-theory-and-the-exponential-map/3177348#3177348
0<p>The identity element <em>does</em> have significance, in the sense that it is the only natural way to think of the elements of the Lie Algebra as infinitesimal generators.</p>
<p>As I explain <a href="https://thewindingnumber.blogspot.com/2019/04/introduction-to-lie-groups.html" rel="nofollow noreferrer">here</a>, the idea is that with elements of the form <span class="math-container">$1+\varepsilon\vec\theta$</span>, elements of the group are generated as </p>
<p><span class="math-container">$$g(\vec\theta)=(1+\varepsilon\vec\theta)^{1/\varepsilon}=\exp\vec\theta$$</span></p>
<p>This map only exists when elements close to the identity are taken, as every element other than the identity is itself a generator (thus elements of the group can simply be generated via real-powers, not infinitesimally).</p>
<p><img src="https://i.stack.imgur.com/0AC5rm.png" width="500" /></p>Sat, 06 Apr 2019 19:21:58 GMThttps://math.stackexchange.com/questions/19575/-/3177348#3177348Abhimanyu Pallavi Sudhir2019-04-06T19:21:58ZAnswer by Abhimanyu Pallavi Sudhir for Binomial product expansion
https://math.stackexchange.com/questions/1331401/binomial-product-expansion/3172053#3172053
0<p>It is not a generalisation of the Binomial theorem because the exponent of <span class="math-container">$c$</span> isn't really handled -- they just took it outside. If you were to expand out the right-hand-side, you would have a generalisation of the Binomial theorem.</p>Tue, 02 Apr 2019 16:08:25 GMThttps://math.stackexchange.com/questions/1331401/-/3172053#3172053Abhimanyu Pallavi Sudhir2019-04-02T16:08:25ZAnswer by Abhimanyu Pallavi Sudhir for Intuition for the exponential of a matrix
https://math.stackexchange.com/questions/1213264/intuition-for-the-exponential-of-a-matrix/3165551#3165551
1<p>When I first learned about cyclic groups, the picture that I always had in my head was of the unit circle in the complex plane -- imagine my shock when I realised it wasn't a cyclic group at all! But I really <em>wanted</em> it to be cyclic, because it shared some really interesting properties with cyclic groups (see my post <em><a href="https://thewindingnumber.blogspot.com/2018/12/intuition-analogies-and-abstraction.html" rel="nofollow noreferrer">Intuition, analogies and abstraction</a></em>).</p>
<p>The solution to the problem can be seen directly from the quickest proof that the unit circle isn't cyclic -- the fact that it isn't countable (while the integers are). So here's an idea: let's admit <em>real powers on groups</em>!</p>
<p>Ok, but how? We know the construction of integer powers on an arbitrary group, and we know how real powers work on the unit circle, or the real line (which is also real-power cyclic*, by the way), and it's conventionally equal to <span class="math-container">$x^r=\exp(r\log x)$</span> with <span class="math-container">$\exp$</span> given by its power series expansion.</p>
<p>But sticking just to our intuition for now, it would seem like the natural way to define a real power is to introduce a real-number parameterisation to our group -- for example, the circle group can be parameterised by <span class="math-container">$\theta$</span> and each element of the group is given by some <span class="math-container">$g(\theta)$</span>. Then real powers would look like <span class="math-container">$g(\theta)^r=g(r\theta)$</span>. In the case of a one-parameter group, we also have <span class="math-container">$g(\theta_1+\theta_2)=g(\theta_1)g(\theta_2)$</span>, but don't get too attached to this.</p>
<p>If you think about it, we've now just given some <em>additional structure</em> to our group -- a geometric structure in addition to the group structure.</p>
<p>But frankly, introducing a parameterisation in this way is a bit hand-wavy. We knew what parameterisation to introduce for the circle group because we already have a picture of its geometry in our heads, but in principle, we could've introduced really any kind of ridiculous parameterisation and given it a really ugly structure and an ugly real-power. What we need is a sensible, systematic way to introduce this parameterisation -- i.e. to think about what this parameter space really <em>is</em>.</p>
<p>The answer to the question comes from Euler's formula, which relates addition on the imaginary line to multiplication on the unit circle. </p>
<p><span class="math-container">$$\exp(i\theta)=g(\theta)$$</span></p>
<p>What significance does the imaginary line have to the unit circle? Well, something interesting is that the tangent to the unit circle at 1 is parallel to the imaginary line, i.e. all its elements are of the form <span class="math-container">$1+it$</span>. So an idea for the parameterisation is that the parameter space is the tangent space at the identity of the group -- this is the Lie algebra of the group.</p>
<p>(You still need to prove that this actually works in general -- this has to do with proving that all derivatives of the exponential map at the identity can be recovered as <span class="math-container">$g^{(k)}(0)=(g'(0))^k$</span> -- this is a property of exponential functions of the form <span class="math-container">$g(t)=e^{bt}$</span>, and is part of the "exponential structure" of the Lie Algebra/Lie Group correspondence.)</p>
<p>This is not too bad! It's not completely absurd to think about the "vicinity of the identity" of at least matrix groups, so it's not absurd to think about tangent spaces to these groups. This is where you see arguments like <span class="math-container">$(1+\varepsilon t)^T(1+\varepsilon t)=1+\varepsilon(t+t^T)$</span> implying the tangent space to an Orthogonal Group is an algebra of antisymmetric matrices, etc. -- if you have some notion of perturbing an element in your group, you can construct a Lie algebra parameterisation of it.</p>
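<p>Both halves of this picture can be checked numerically (an added illustration; the <code>mat_exp</code> helper here is just an ad-hoc truncated power series, adequate for small matrices): exponentiating an antisymmetric matrix lands in the orthogonal group, and the <span class="math-container">$(1+\varepsilon\vec\theta)^{1/\varepsilon}$</span> picture converges to the same group element.</p>

```python
import numpy as np

def mat_exp(t, terms=40):
    # truncated power series for the matrix exponential (ad-hoc helper)
    out = np.eye(t.shape[0])
    term = np.eye(t.shape[0])
    for k in range(1, terms):
        term = term @ t / k
        out = out + term
    return out

rng = np.random.default_rng(4)
B = rng.normal(size=(3, 3))
t = 0.1 * (B - B.T)  # antisymmetric: t + t^T = 0

G = mat_exp(t)
print(np.allclose(G.T @ G, np.eye(3)))  # exp(t) lies in the orthogonal group

# (I + t/n)^n converges to exp(t) as n grows
n = 10_000
approx = np.linalg.matrix_power(np.eye(3) + t / n, n)
print(np.allclose(approx, G, atol=1e-4))
```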
<hr>
<p>*To the best of my knowledge, "real-power cyclic" is not a real word -- the conventional term is "one-parameter Lie group".</p>
<p>See my post <a href="https://thewindingnumber.blogspot.com/2019/04/introduction-to-lie-groups.html" rel="nofollow noreferrer">Introduction to Lie groups</a> for a more complete treatment.</p>Thu, 28 Mar 2019 06:29:55 GMThttps://math.stackexchange.com/questions/1213264/-/3165551#3165551Abhimanyu Pallavi Sudhir2019-03-28T06:29:55ZAnswer by Abhimanyu Pallavi Sudhir for What's the generalisation of the quotient rule for higher derivatives?
https://math.stackexchange.com/questions/5357/whats-the-generalisation-of-the-quotient-rule-for-higher-derivatives/3131947#3131947
1<p>I'm checking @Mohammad Al Jamal's formula with SymPy, and I can verify it's true (barring a missing <span class="math-container">$(-1)^k$</span> factor) for up to <span class="math-container">$n = 16$</span>, at least (it gets really slow after that).</p>
<pre>
import sympy as sp
k = sp.Symbol('k'); x = sp.Symbol('x'); f = sp.Function('f'); g = sp.Function('g')
n = 0
# check the conjectured formula for (f/g)^(n) against direct differentiation
while True:
    fgn = sp.diff(f(x) / g(x), x, n)
    guess = sp.summation((-1) ** k * sp.binomial(n + 1, k + 1) \
        * sp.diff(f(x) * (g(x)) ** k, x, n) / (g(x) ** (k + 1)), (k, 0, n))
    print("{} for n = {}".format(sp.expand(guess - fgn) == 0, n))
    n += 1
</pre>
<p>This is quite surprising to me -- I didn't expect there to be such a simple and straightforward expression for <span class="math-container">$(f(x)/g(x))^{(n)}$</span>, and haven't seen his formula anywhere before. I tried some inductive proofs, but I haven't succeeded in proving it yet.</p>Sat, 02 Mar 2019 00:15:06 GMThttps://math.stackexchange.com/questions/5357/-/3131947#3131947Abhimanyu Pallavi Sudhir2019-03-02T00:15:06ZAnswer by Abhimanyu Pallavi Sudhir for Why didn't Lorentz conclude that no object can go faster than light?
https://physics.stackexchange.com/questions/461833/why-didnt-lorentz-conclude-that-no-object-can-go-faster-than-light/461863#461863
12<p>Because typically if you find an expression that seems to break down at some value of <span class="math-container">$v$</span>, you would conclude that the expression simply loses its validity for that value of <span class="math-container">$v$</span>, not that the value isn't attainable. Presumably this was the conclusion of Lorentz and others.</p>
<p>The reason Einstein concluded otherwise is that special relativity gives a physical argument for "superluminal speeds are equivalent to time running backwards" -- the argument is "does a superluminal ship hit the iceberg before or after its headlight does?" </p>
<p>This depends on the observer, and because the headlight would melt the iceberg, the consequences of each observation are noticeably different. The only possible conclusions are "superluminal ships don't exist", "time runs backwards for superluminal observers", or "iceberg-melting headlights don't exist".</p>Wed, 20 Feb 2019 10:43:02 GMThttps://physics.stackexchange.com/questions/461833/-/461863#461863Abhimanyu Pallavi Sudhir2019-02-20T10:43:02ZAnswer by Abhimanyu Pallavi Sudhir for Relativity from a basic assumption
https://physics.stackexchange.com/questions/455712/relativity-from-a-basic-assumption/455753#455753
1<p>I will give <em>a</em> derivation of the Lorentz boosts requiring (what at least seem to be) minimal assumptions, and we will look at what assumptions we used, and see if some of them can be derived from each other, etc. Note that by "the Lorentz transformations", I mean the Lorentz transformation of spacetime position -- Lorentz transformations of other four-tuples (i.e. proving that they are Lorentz vectors) would require other assumptions, of course. I've given a fuller explanation of the derivation <a href="https://thewindingnumber.blogspot.com/2017/09/introduction-to-special-relativity.html" rel="nofollow noreferrer">here</a>.</p>
<p><strong>(a)</strong> The first important fact you need to prove anything about the Lorentz transformations is that they are linear. Linearity is logically equivalent to the following conditions: (under the transformation),</p>
<ul>
<li><strong>all straight lines remain straight lines</strong> -- the physical interpretation of this is that if an object's velocity is constant in one inertial reference frame, it is constant in all inertial reference frames. This follows from the <em>principle of relativity</em>.</li>
<li><strong>the origin remains fixed</strong> -- this is true by definition of the transformations we are considering -- boosts passing through the same origin.</li>
</ul>
<p>With this, we know that we can use a matrix to write down the Lorentz transformations. Which matrix?</p>
<p><strong>(b)</strong> The tilt/angle of the <span class="math-container">$t'$</span>, <span class="math-container">$x'$</span> axes with respect to the <span class="math-container">$t$</span>, <span class="math-container">$x$</span> axes. The tilt of the <span class="math-container">$t'$</span> axes follows from the definition of velocity as the gradient of the worldline. To prove the tilt of the <span class="math-container">$x'$</span> axis is equal to this tilt, we need to first define the <span class="math-container">$x'$</span> axis within the unprimed co-ordinate system. </p>
<p>This is possible by considering features that are invariant under a boost, i.e. from the principle of relativity -- the obvious invariant is as follows: if you emitted a light ray <span class="math-container">$a$</span> seconds in the past and it reflects off some object and returns to you <span class="math-container">$a$</span> seconds in the future, then that object was on your x-axis at time 0.</p>
<p><a href="https://i.stack.imgur.com/zC7TS.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/zC7TS.png" alt="enter image description here"></a></p>
<p>By the principle of relativity, this should apply in the primed reference frame as well. By the invariance of the speed of light, the slope of the light ray is the same in the primed reference frame. Now figuring out the angle of tilt of the <span class="math-container">$x'$</span> axis becomes an exercise in geometry.</p>
<p><a href="https://i.stack.imgur.com/QvRjN.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/QvRjN.png" alt="enter image description here"></a></p>
<p>And it's easy to prove, by drawing an appropriate circle, that the two tilts are equal.</p>
<p><strong>(c)</strong> We now know the lines the column vectors of our matrix land on -- they are multiples of <span class="math-container">$(1, v)$</span> and <span class="math-container">$(v, 1)$</span>, but which vector on that line exactly? In other words, what's the scale on the axes? This requires one extra assumption: if you boost into the frame with velocity <span class="math-container">$v$</span>, then boost <span class="math-container">$-v$</span> back, that's equivalent to not boosting at all, i.e. <span class="math-container">$L(v)L(-v)=I$</span>. Then it's just computation:</p>
<p><span class="math-container">\begin{gathered}
\left[ {\begin{array}{*{20}{c}}
1&0 \\
0&1
\end{array}} \right] = \left[ {\begin{array}{*{20}{c}}
\alpha &{\beta v} \\
{\alpha v}&\beta
\end{array}} \right]\left[ {\begin{array}{*{20}{c}}
\alpha &{ - \beta v} \\
{ - \alpha v}&\beta
\end{array}} \right] = \left[ {\begin{array}{*{20}{c}}
{{\alpha ^2} - \alpha \beta {v^2}}&{{\beta ^2}v - \alpha \beta v} \\
{{\alpha ^2}v - \alpha \beta v}&{{\beta ^2} - \alpha \beta {v^2}}
\end{array}} \right] \hfill \\
{\alpha ^2}v - \alpha \beta v = 0 = {\beta ^2}v - \alpha \beta v \Rightarrow {\alpha ^2} = \alpha \beta = {\beta ^2} \Rightarrow \alpha = \beta \hfill \\
{\alpha ^2} - \alpha \beta {v^2} = 1 = {\beta ^2} - \alpha \beta {v^2} \Rightarrow {\alpha ^2} = 1 + \alpha \beta {v^2} = {\beta ^2} \Rightarrow {\alpha ^2} = 1 + {\alpha ^2}{v^2} \hfill \\
\Rightarrow \alpha = \beta = \frac{1}{{\sqrt {1 - {v^2}} }} \hfill \\
\end{gathered}</span></p>
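<p>The matrix algebra above is mechanical, and can be checked symbolically -- a short SymPy sketch (mine, not part of the original answer), in units where <span class="math-container">$c=1$</span>:</p>

```python
import sympy as sp

v = sp.symbols('v')

def L(u):
    # boost matrix with alpha = beta = 1/sqrt(1 - u^2), acting on (t, x)
    return sp.Matrix([[1, u], [u, 1]]) / sp.sqrt(1 - u**2)

# boosting by v and then by -v is no boost at all: L(v) L(-v) = I
assert sp.simplify(L(v) * L(-v) - sp.eye(2)) == sp.zeros(2, 2)
```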
<p>Then the change of basis matrix is simply the inverse of this matrix, which is:</p>
<p><span class="math-container">$$\Lambda=\gamma \left[ {\begin{array}{*{20}{c}}
1&-v \\
-v&1
\end{array}} \right]$$</span></p>
<p>Or:</p>
<p><span class="math-container">\begin{gathered}
x' = \gamma \left( {x - vt} \right) \\
t' = \gamma \left( {t - vx} \right) \\
\end{gathered}</span></p>
<p><strong>(d)</strong> There's still one final step, however -- we need to verify that <span class="math-container">$y$</span> and <span class="math-container">$z$</span> aren't transformed under the Lorentz boost. To prove this, consider two twins with paintbrushes running towards each other, painting the wall at waist level -- if the orthogonal axis were transformed in any way, each twin would see his paint-streak as above the other's -- the fact that the paint-streaks' relative positioning can't be different can be seen, e.g. from supposing that the two paints cause an explosion in the mix. The fact that the presence of explosions (or any boolean quantity) is invariant under Lorentz transformations is a consequence of the principle of relativity.</p>
<hr>
<p>We used three physical assumptions:</p>
<ul>
<li>The principle of relativity</li>
<li>The invariance of the speed of light</li>
<li><span class="math-container">$L(v)L(-v)=L(0)$</span>, or "if I see you moving at <span class="math-container">$v$</span>, you see me moving at <span class="math-container">$-v$</span>"</li>
</ul>
<p>The first two are the assumptions you wanted. As far as I can see, the last assumption can't really be proven from the other two -- it requires some sort of symmetry principle. But that's okay.</p>Mon, 21 Jan 2019 22:48:47 GMThttps://physics.stackexchange.com/questions/455712/-/455753#455753Abhimanyu Pallavi Sudhir2019-01-21T22:48:47ZAnswer by Abhimanyu Pallavi Sudhir for Varying constants in special relativity
https://physics.stackexchange.com/questions/455159/varying-constants-in-special-relativity/455176#455176
1<blockquote>
<p>(presumably) everything has mass, there is no such thing as a perfect inertial frame of reference</p>
</blockquote>
<p>This isn't right. "There isn't generally a perfectly flat co-ordinate system" does not imply everything has mass, and being an inertial reference frame has nothing to do with the associated observer's mass (in fact, the Lorentz transformation associated with a photon's "co-ordinate system" is singular, so there isn't really a co-ordinate system/reference frame associated with it).</p>
<p>I guess your concern is with the fact that photons are affected by spacetime curvature -- this is true, but a point of general relativity is that this doesn't imply anything about the mass.</p>Fri, 18 Jan 2019 20:30:09 GMThttps://physics.stackexchange.com/questions/455159/-/455176#455176Abhimanyu Pallavi Sudhir2019-01-18T20:30:09ZAnswer by Abhimanyu Pallavi Sudhir for What is really curved, spacetime, or simply the coordinate lines?
https://physics.stackexchange.com/questions/290906/what-is-really-curved-spacetime-or-simply-the-coordinate-lines/452416#452416
0<p>Curved co-ordinates on flat spacetime correspond to accelerating observers, not gravity. </p>
<p>The first physical insight of general relativity is that when you have gravity, you have <em>no</em> globally inertial frames -- contrast this with flat space, where you can always construct a linear co-ordinate system. The second physical insight is that you do have locally inertial frames, specifically the freefalling ones -- this is the "equivalence principle" -- so the manifold you use to model spacetime must necessarily have local flatness. Consequently, (pseudo-)Riemannian manifolds become the right way to model spacetime in general relativity.</p>
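<p>The distinction between coordinate effects and genuine curvature can be made concrete with a toy computation -- the sketch below (my illustration, assuming SymPy) uses the flat plane in polar coordinates, where curvy coordinates produce nonzero Christoffel symbols even though the curvature tensor vanishes identically:</p>

```python
import sympy as sp

r, th = sp.symbols('r theta', positive=True)
coords = [r, th]
g = sp.Matrix([[1, 0], [0, r**2]])   # flat 2D plane, polar coordinates
ginv = g.inv()

def Gamma(a, b, c):
    """Christoffel symbol Gamma^a_{bc} of the Levi-Civita connection."""
    return sp.simplify(sum(
        ginv[a, d] * (sp.diff(g[d, b], coords[c]) + sp.diff(g[d, c], coords[b])
                      - sp.diff(g[b, c], coords[d]))
        for d in range(2)) / 2)

def Riemann(a, b, c, d):
    """Riemann tensor component R^a_{bcd}."""
    expr = sp.diff(Gamma(a, b, d), coords[c]) - sp.diff(Gamma(a, b, c), coords[d]) \
        + sum(Gamma(a, c, e) * Gamma(e, b, d) - Gamma(a, d, e) * Gamma(e, b, c)
              for e in range(2))
    return sp.simplify(expr)

# Nonzero Christoffel symbol (first-order in the metric derivatives)...
assert sp.simplify(Gamma(0, 1, 1) + r) == 0        # Gamma^r_{theta theta} = -r
# ...but the curvature vanishes: polar coordinates are just curvy coordinates
assert Riemann(0, 1, 0, 1) == 0
```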
<p>This is why Christoffel symbols exist for accelerating observers on flat spacetime too -- they're first-order in the derivatives of the metric, and so can be eliminated by transforming into a flat co-ordinate system where the metric is constant (this is okay because the Christoffel symbols aren't tensors). The Riemann curvature tensor, on the other hand, is second-order in the derivatives of the metric and cannot be eliminated by a co-ordinate transformation.</p>Sun, 06 Jan 2019 12:13:03 GMThttps://physics.stackexchange.com/questions/290906/-/452416#452416Abhimanyu Pallavi Sudhir2019-01-06T12:13:03ZAnswer by Abhimanyu Pallavi Sudhir for Relative velocity greater than speed of light
https://physics.stackexchange.com/questions/452078/relative-velocity-greater-than-speed-of-light/452100#452100
0<p>Velocity is definitionally the same as "relative velocity". This is the point of the first postulate of relativity.</p>Fri, 04 Jan 2019 15:11:44 GMThttps://physics.stackexchange.com/questions/452078/-/452100#452100Abhimanyu Pallavi Sudhir2019-01-04T15:11:44ZAnswer by Abhimanyu Pallavi Sudhir for Does spacetime position not form a four-vector?
https://physics.stackexchange.com/questions/192886/does-spacetime-position-not-form-a-four-vector/450137#450137
1<p>Right -- vectors in general relativity live in some tangent space. This is the point of differential geometry, and of calculus in general -- you approximate non-linear things, which are <em>not</em> vector spaces (like curvy manifolds) with linear things (like their tangent spaces), which are vector spaces. This is exactly the motivation for defining the basis vectors as <span class="math-container">$\partial_\mu$</span>, as you describe.</p>Mon, 24 Dec 2018 07:04:21 GMThttps://physics.stackexchange.com/questions/192886/-/450137#450137Abhimanyu Pallavi Sudhir2018-12-24T07:04:21ZAnswer by Abhimanyu Pallavi Sudhir for What is an event in Special Relativity?
https://physics.stackexchange.com/questions/389488/what-is-an-event-in-special-relativity/444892#444892
1<p>It is perfectly reasonable to say that an event is a point in spacetime and that spacetime is a collection of events -- it is not "circular" as you claim in the comments. This is just the physics version of "a vector is an element of a vector space" and "a vector space is a set of vectors". You have axioms in math, and you have axioms in physics. The only difference is that in math, the objects are abstract, but in physics, they have a physical interpretation.</p>Mon, 03 Dec 2018 16:51:40 GMThttps://physics.stackexchange.com/questions/389488/-/444892#444892Abhimanyu Pallavi Sudhir2018-12-03T16:51:40ZAnswer by Abhimanyu Pallavi Sudhir for Why is the scalar product of two four-vectors Lorentz-invariant?
https://physics.stackexchange.com/questions/442119/why-is-the-scalar-product-of-two-four-vectors-lorentz-invariant/442164#442164
3<p>Here's the way to think about this -- why is the standard Euclidean dot product, <span class="math-container">$\sum x_iy_i$</span>, interesting? Well, it is interesting primarily from the perspective of rotations, due to the fact that rotations leave dot products invariant. The reason this is so is that this dot product can be written as <span class="math-container">$|x||y|\cos\Delta\theta$</span>, and rotations leave magnitudes and relative angles invariant.</p>
<p>Is the standard Euclidean norm <span class="math-container">$|x|$</span> invariant under Lorentz transformations? Of course not -- for instance, <span class="math-container">$\Delta t^2+\Delta x^2$</span> is clearly not invariant, but <span class="math-container">$\Delta t^2-\Delta x^2$</span> is. Similarly, <span class="math-container">$E^2+p^2$</span> is not important, but <span class="math-container">$E^2-p^2$</span> is. The reason this is the case is that Lorentz boosts are fundamentally skew transformations, which means the invariant locus is a hyperbola, not a circle. So you have <span class="math-container">$\cosh^2 \xi - \sinh^2 \xi = 1$</span>, and <span class="math-container">$x_0^2-x_1^2$</span> is the right way to think of the norm on Minkowski space.</p>
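<p>Writing the boost in terms of the rapidity makes this easy to verify symbolically -- a sketch of mine (not from the original answer), with <span class="math-container">$c=1$</span>:</p>

```python
import sympy as sp

xi, t, x = sp.symbols('xi t x')

# Lorentz boost parameterised by rapidity xi
tp = sp.cosh(xi) * t - sp.sinh(xi) * x
xp = -sp.sinh(xi) * t + sp.cosh(xi) * x

# the Minkowski "norm" t^2 - x^2 is preserved...
assert sp.simplify((tp**2 - xp**2) - (t**2 - x**2)) == 0
# ...while the Euclidean norm t^2 + x^2 is not
assert sp.simplify((tp**2 + xp**2) - (t**2 + x**2)) != 0
```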
<p>Similarly, Lorentz boosts change the rapidity <span class="math-container">$\xi$</span> by a simple displacement, so <span class="math-container">$\Delta \xi$</span> is invariant. From this point, it's a simple exercise to show that </p>
<p><span class="math-container">$$|x||y|\cosh\xi=x_0y_0-x_1y_1$$</span></p>
<p>(as for the remaining dimensions -- remember that the standard Euclidean dot product is still relevant in <em>space</em>, so you just need to write <span class="math-container">$x_0y_0-x\cdot y=x_0y_0-x_1y_1-x_2y_2-x_3y_3$</span>.)</p>Tue, 20 Nov 2018 15:59:18 GMThttps://physics.stackexchange.com/questions/442119/-/442164#442164Abhimanyu Pallavi Sudhir2018-11-20T15:59:18ZComment by something on Mate in 0 moves
https://puzzling.stackexchange.com/questions/74086/mate-in-0-moves/74093#74093
@FabianRöling Pawns have directions.Mon, 22 Oct 2018 09:27:58 GMThttps://puzzling.stackexchange.com/questions/74086/mate-in-0-moves/74093?cid=221467#74093something2018-10-22T09:27:58ZAnswer by Abhimanyu Pallavi Sudhir for Newton's Third Law and conservation of momentum
https://physics.stackexchange.com/questions/435941/newtons-third-law-and-conservation-of-momentum/436015#436015
3<p>As far as the actual physics is concerned, it is meaningless to talk of whether conservation of momentum is "more fundamental" than Newton's third law -- you can axiomatise classical physics in any of several ways: from Newton's laws, from conservation laws, from symmetry laws, from an action principle, whatever. You can prove the resulting theories are equivalent, in the sense that all the alternative axiomatic systems imply each other.</p>
<p>In terms of understanding, it makes sense to have multiple different frameworks in your head -- a symmetry-based framework is really good intuitively, especially once you understand Noether's theorem, while an action principle is the most powerful and also more useful when you leave the realm of classical physics. Treating Newton's laws as axioms isn't a great idea -- it's mostly just historically relevant.</p>
<p>When you learn more advanced physics, conservation of momentum <em>will</em> start "feeling" more fundamental -- this is simply because momentum is an interesting quantity to talk about.</p>Sun, 21 Oct 2018 21:20:57 GMThttps://physics.stackexchange.com/questions/435941/-/436015#436015Abhimanyu Pallavi Sudhir2018-10-21T21:20:57ZAnswer by Abhimanyu Pallavi Sudhir for If force is a vector, then why is pressure a scalar?
https://physics.stackexchange.com/questions/429998/if-force-is-a-vector-then-why-is-pressure-a-scalar/430008#430008
4<p>Pressure is a scalar because it does not behave as a vector -- specifically, you can't take the "components" of pressure and take their Pythagorean sum to obtain its magnitude. Instead, pressure is actually proportional to the <em>sum</em> of the components, <span class="math-container">$(P_x+P_y+P_z)/3$</span>.</p>
<p>The way to understand pressure is in terms of the stress tensor, and pressure is equal to the trace of the stress tensor. Once you understand this, the question becomes equivalent to questions like "why is the dot product a scalar?" (trace of the tensor product), "why is the divergence of a vector field a scalar?" (trace of the tensor derivative), etc. </p>
<p>There is no physical significance to taking the diagonal components of a tensor and putting them in a vector -- there <em>is</em> a physical significance to adding them up, and the invariance properties of the result tell you that it is a scalar.</p>
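<p>This invariance is easy to see numerically -- a sketch (my illustration, assuming NumPy): conjugate a symmetric stress tensor by an orthogonal matrix and observe that the trace, and hence the pressure, is unchanged even though the individual diagonal components are not:</p>

```python
import numpy as np

rng = np.random.default_rng(1)
S = rng.standard_normal((3, 3))
sigma = (S + S.T) / 2                              # a symmetric stress tensor

Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # a random orthogonal matrix
rotated = Q @ sigma @ Q.T                          # the stress tensor in rotated axes

pressure = np.trace(sigma) / 3
assert np.isclose(np.trace(rotated) / 3, pressure)       # trace/3 is invariant
assert not np.allclose(np.diag(rotated), np.diag(sigma)) # the diagonal alone is not
```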
<p>See also: <a href="https://physics.stackexchange.com/questions/186045/why-do-we-need-both-dot-product-and-cross-product/419873#419873">Why do we need both dot product and cross product?</a></p>Fri, 21 Sep 2018 08:57:17 GMThttps://physics.stackexchange.com/questions/429998/-/430008#430008Abhimanyu Pallavi Sudhir2018-09-21T08:57:17ZAnswer by Abhimanyu Pallavi Sudhir for How can the solutions to equations of motion be unique if it seems the same state can be arrived at through different histories?
https://physics.stackexchange.com/questions/426445/how-can-the-solutions-to-equations-of-motion-be-unique-if-it-seems-the-same-stat/426453#426453
1<p>"The jar is empty at present" just tells you $f(0)$. You also need $f'(0)$, $f''(0)$, etc.</p>Mon, 03 Sep 2018 09:46:25 GMThttps://physics.stackexchange.com/questions/426445/-/426453#426453Abhimanyu Pallavi Sudhir2018-09-03T09:46:25ZAnswer by Abhimanyu Pallavi Sudhir for From the speed of light being an invariant to being the maximum possible speed
https://physics.stackexchange.com/questions/331119/from-the-speed-of-light-being-an-invariant-to-being-the-maximum-possible-speed/423423#423423
0<p>A simple thought experiment does the trick -- consider a train moving faster than light, with headlights (it's a glass train). According to a stationary observer (stationary in a reference frame where the train is faster than light), the train must always be in front of the light, but according to an observer hanging out of the train, the light must be in front of him, since light speed is still $c$.</p>
<p>It might not seem like this observer-dependence of the ordering is a problem, but it is -- say, for instance, the train is moving towards a high-tech wall which is programmed to do the following when switched ON:<br />
(1) if hit by a train, make the world explode;<br />
(2) if light is incident, switch OFF.<br />
The wall is currently switched ON. According to one observer the world explodes, whereas according to another it doesn't. This is an inconsistency.</p>
<p>Why wouldn't this argument apply to <em>any</em> speed and prohibit all motion? For example, why can't the wall be programmed to switch off a certain amount of time after light is incident? Relativity says this is okay, because time can dilate and transform scale between reference frames. </p>
<p>But in order to make FTL speeds okay, you need to allow time to flip direction -- this is why the real condition is "to go faster than light, you must forgo causality", or simply, "locality = causality".</p>Sat, 18 Aug 2018 12:28:48 GMThttps://physics.stackexchange.com/questions/331119/-/423423#423423Abhimanyu Pallavi Sudhir2018-08-18T12:28:48ZAnswer by Abhimanyu Pallavi Sudhir for Link between Special relativity and Newtons gravitational law
https://physics.stackexchange.com/questions/123243/link-between-special-relativity-and-newtons-gravitational-law/423379#423379
0<p>Consider three theories:</p>
<p>$$L_A=1$$
$$L_B=1+h$$
$$L_C=1+h+h^2$$</p>
<p>Theory A is a special case of Theory C when $h$ is small, and Theory B is likewise a special case of C when $h$ is small -- does this mean A and B are the same?</p>
<p>This is not a perfect analogy, but an example as to why this sort of reasoning breaks down.</p>Sat, 18 Aug 2018 07:13:36 GMThttps://physics.stackexchange.com/questions/123243/-/423379#423379Abhimanyu Pallavi Sudhir2018-08-18T07:13:36ZAnswer by Abhimanyu Pallavi Sudhir for Why is velocity defined as 4-vector in relativity?
https://physics.stackexchange.com/questions/423360/why-is-velocity-defined-as-4-vector-in-relativity/423364#423364
5<p>"It should transform like a four-vector under a Lorentz transformation" is a generalisation of several intuitions you typically have regarding how natural objects/tensors should behave in special relativity -- an obvious one is "no special status to any individual dimension", since space and time are inherently symmetric. That $dx^\mu/dx^0$ doesn't transform like a four-vector is obvious from the fact that it gives special preference to time.</p>
<p>The conventional way to define four-velocity in relativity is as $dx^\mu/ds$. Your 2-tensor idea is cute -- it is similar to the angle tensor generalised to four-dimensions -- but it doesn't satisfy the uses we have of the standard four-velocity (e.g. how would the four-momentum be defined? $m\,dx^\mu/dx^\nu$? That wouldn't be conserved.)</p>Sat, 18 Aug 2018 06:11:14 GMThttps://physics.stackexchange.com/questions/423360/why-is-velocity-defined-as-4-vector-in-relativity/423364#423364Abhimanyu Pallavi Sudhir2018-08-18T06:11:14ZComment by Abhimanyu Pallavi Sudhir on Ubuntu 17.04 Chromium Browser quietly provides full access to Google account
https://askubuntu.com/questions/915556/ubuntu-17-04-chromium-browser-quietly-provides-full-access-to-google-account
Me too. This is weird. Even if it's just the Chrome browser, I don't see why they'd need <i>full</i> access to my Google account. Windows doesn't do this.Sat, 14 Jul 2018 17:11:46 GMThttps://askubuntu.com/questions/915556/ubuntu-17-04-chromium-browser-quietly-provides-full-access-to-google-account?cid=1726608Abhimanyu Pallavi Sudhir2018-07-14T17:11:46ZComment by Abhimanyu Pallavi Sudhir on How to create folder shortcut in Ubuntu 14.04?
https://askubuntu.com/questions/486461/how-to-create-folder-shortcut-in-ubuntu-14-04/691976#691976
@jave.web Yes -- use the application menu (either at the top left of your screen or a colourful icon next to the window controls) to go to your Nautilus preferences, then under "Behavior" enable link creation.Fri, 13 Jul 2018 11:49:02 GMThttps://askubuntu.com/questions/486461/how-to-create-folder-shortcut-in-ubuntu-14-04/691976?cid=1724793#691976Abhimanyu Pallavi Sudhir2018-07-13T11:49:02ZComment by Abhimanyu Pallavi Sudhir on How to customize (add/remove folders/directories) the "Places" menu of Ubuntu 13.04 "Files" application?
https://askubuntu.com/questions/285313/how-to-customize-add-remove-folders-directories-the-places-menu-of-ubuntu-13/292727#292727
This works. If you also want to remove the folders from the home directory, edit user-dirs.defaults, or make a copy of it in .config and edit there (for your local user).Sat, 30 Jun 2018 09:00:13 GMThttps://askubuntu.com/questions/285313/how-to-customize-add-remove-folders-directories-the-places-menu-of-ubuntu-13/292727?cid=1716388#292727Abhimanyu Pallavi Sudhir2018-06-30T09:00:13ZComment by Abhimanyu Pallavi Sudhir on How to safely remove default folders?
https://askubuntu.com/questions/140148/how-to-safely-remove-default-folders/140964#140964
See <a href="https://askubuntu.com/questions/285313/how-to-customize-add-remove-folders-directories-the-places-menu-of-ubuntu-13">here</a> for a working solution. If you also want to remove the folders from the home directory, edit user-dirs.defaults, or make a copy of it in .config and edit there (for your local user).Mon, 25 Jun 2018 06:08:26 GMThttps://askubuntu.com/questions/140148/how-to-safely-remove-default-folders/140964?cid=1713336#140964Abhimanyu Pallavi Sudhir2018-06-25T06:08:26ZComment by Abhimanyu Pallavi Sudhir on How to safely remove default folders?
https://askubuntu.com/questions/140148/how-to-safely-remove-default-folders/140964#140964
Doesn't work -- even if you don't run the update command, it gets updated upon the next reboot. There must be a more fundamental file in which these directory names are kept.Mon, 25 Jun 2018 05:32:16 GMThttps://askubuntu.com/questions/140148/how-to-safely-remove-default-folders/140964?cid=1713326#140964Abhimanyu Pallavi Sudhir2018-06-25T05:32:16ZComment by Abhimanyu Pallavi Sudhir on Explaining the Main Ideas of Proof before Giving Details
https://mathoverflow.net/questions/301085/explaining-the-main-ideas-of-proof-before-giving-details
Because good proofs are just a formalisation of the intuitive understanding -- rather than wasting space explaining the insights, you can just give them the proof, and an even somewhat experienced reader can re-create the details.Sun, 27 May 2018 04:28:36 GMThttps://mathoverflow.net/questions/301085/explaining-the-main-ideas-of-proof-before-giving-details?cid=750004Abhimanyu Pallavi Sudhir2018-05-27T04:28:36ZComment by Abhimanyu Pallavi Sudhir on reference for higher spin - not gravitational nor stringy
https://mathoverflow.net/questions/195125/reference-for-higher-spin-not-gravitational-nor-stringy
On <a href="http://www.physicsoverflow.org/27048/reference-for-higher-spin-not-gravitational-nor-stringy?show=27499#a27499" rel="nofollow noreferrer">PhysicsOverflow</a>, there is a link to <a href="http://inspirehep.net/record/265411" rel="nofollow noreferrer">this paper</a> for the same question.Sun, 01 Mar 2015 02:25:25 GMThttps://mathoverflow.net/questions/195125/reference-for-higher-spin-not-gravitational-nor-stringy?cid=493513Abhimanyu Pallavi Sudhir2015-03-01T02:25:25ZComment by Abhimanyu Pallavi Sudhir on Classical and Quantum Chern-Simons Theory
https://mathoverflow.net/questions/159695/classical-and-quantum-chern-simons-theory
This has received an answer on PhysicsOverflow if you're still interested: <a href="http://www.physicsoverflow.org/22251/classical-and-quantum-chern-simons-theory#c22256" rel="nofollow noreferrer">Classical and Quantum Chern-Simons Theory</a>Thu, 14 Aug 2014 13:14:02 GMThttps://mathoverflow.net/questions/159695/classical-and-quantum-chern-simons-theory?cid=447277Abhimanyu Pallavi Sudhir2014-08-14T13:14:02ZComment by Abhimanyu Pallavi Sudhir on What is convolution intuitively?
https://mathoverflow.net/questions/5892/what-is-convolution-intuitively
<a href="http://en.wikipedia.org/wiki/File:Convolution_of_spiky_function_with_box2.gif" rel="nofollow noreferrer">Wikipedia</a>Fri, 17 Jan 2014 16:20:39 GMThttps://mathoverflow.net/questions/5892/what-is-convolution-intuitively?cid=396721Abhimanyu Pallavi Sudhir2014-01-17T16:20:39ZComment by Abhimanyu Pallavi Sudhir on Embedding of F(4) in OSp(8|4)?
https://mathoverflow.net/questions/111110/embedding-of-f4-in-osp84
Cross-posted to: <a href="http://physics.stackexchange.com/q/41155/23119">physics.stackexchange.com/q/41155/23119</a>Mon, 23 Dec 2013 04:35:50 GMThttps://mathoverflow.net/questions/111110/embedding-of-f4-in-osp84?cid=391443Abhimanyu Pallavi Sudhir2013-12-23T04:35:50ZComment by Abhimanyu Pallavi Sudhir on How to compare Unicode characters that "look alike"?
https://stackoverflow.com/questions/20674577/how-to-compare-unicode-characters-that-look-alike
I compared every single pixel of it, and it looks the same.Thu, 19 Dec 2013 09:26:53 GMThttps://stackoverflow.com/questions/20674577/how-to-compare-unicode-characters-that-look-alike?cid=30963612Abhimanyu Pallavi Sudhir2013-12-19T09:26:53ZComment by Abhimanyu Pallavi Sudhir on What is the definition of picture changing operation?
https://mathoverflow.net/questions/152295/what-is-the-definition-of-picture-changing-operation
Related: <a href="http://physics.stackexchange.com/q/12595/23119">physics.stackexchange.com/q/12595/23119</a>Thu, 19 Dec 2013 07:26:36 GMThttps://mathoverflow.net/questions/152295/what-is-the-definition-of-picture-changing-operation?cid=390438Abhimanyu Pallavi Sudhir2013-12-19T07:26:36ZComment by Abhimanyu Pallavi Sudhir on Understanding the intermediate field method for the $\phi^4$ interaction
https://mathoverflow.net/questions/149564/understanding-the-intermediate-field-method-for-the-phi4-interaction
@DanielSoltész: Nope, high-level questions generally get largely ignored there these days.Tue, 26 Nov 2013 14:40:20 GMThttps://mathoverflow.net/questions/149564/understanding-the-intermediate-field-method-for-the-phi4-interaction?cid=384774Abhimanyu Pallavi Sudhir2013-11-26T14:40:20ZComment by Abhimanyu Pallavi Sudhir on Intuition behind the ricci flow
https://mathoverflow.net/questions/143144/intuition-behind-the-ricci-flow/143146#143146
I was about to post the same thing, I think this is very illustrative.Tue, 19 Nov 2013 16:05:08 GMThttps://mathoverflow.net/questions/143144/intuition-behind-the-ricci-flow/143146?cid=383288#143146Abhimanyu Pallavi Sudhir2013-11-19T16:05:08ZComment by Abhimanyu Pallavi Sudhir on What is the relationship between complex time singularities and UV fixed points?
https://mathoverflow.net/questions/134939/what-is-the-relationship-between-complex-time-singularities-and-uv-fixed-points
This actually got twice the number of views here than on Physics.SE.Sun, 10 Nov 2013 14:50:44 GMThttps://mathoverflow.net/questions/134939/what-is-the-relationship-between-complex-time-singularities-and-uv-fixed-points?cid=381229Abhimanyu Pallavi Sudhir2013-11-10T14:50:44ZAnswer by Abhimanyu Pallavi Sudhir for The Fuchsian monodromy problem
https://mathoverflow.net/questions/146099/the-fuchsian-monodromy-problem/148462#148462
1<p>Equation 6.2 is just the Liouville action, the action principle for the <em>Liouville field</em>, which is well-known from the familiar conformal gauge. </p>
<p>$$S_L=\frac{c}{96\pi}\int_\mathcal{M}\left(\dot\varphi^2-\frac{16\varphi}{\left(1-\lvert t\rvert^2\right)^2}\right)\mathrm{d}^2t$$ </p>
<p>... along with some trivial facts about partition functions. </p>
<p>You could of course think of it as the $Z_\mathcal{M}$'s (partition functions) of the metrics being related by the $S_L$'s in the same way that the metrics are related by the Liouville field. </p>
Sun, 10 Nov 2013 06:53:28 GMThttps://mathoverflow.net/questions/146099/-/148462#148462Abhimanyu Pallavi Sudhir2013-11-10T06:53:28ZComment by Abhimanyu Pallavi Sudhir on Modular Arithmetic in LaTeX
https://mathoverflow.net/questions/18813/modular-arithmetic-in-latex
Haha, I thought this question was about typesetting a paper in $\LaTeX$Fri, 08 Nov 2013 11:34:52 GMThttps://mathoverflow.net/questions/18813/modular-arithmetic-in-latex?cid=379817Abhimanyu Pallavi Sudhir2013-11-08T11:34:52ZAnswer by Abhimanyu Pallavi Sudhir for String theory "computation" for math undergrad audience
https://mathoverflow.net/questions/47770/string-theory-computation-for-math-undergrad-audience/147307#147307
2<p>Derive the Casimir Energy in Bosonic String Theory. </p>
<p>You start with the $\hat L_0$ operator and strip off the non-vacuum part $\displaystyle\frac{\alpha_0^2}{2}+\sum_{n=1}^\infty\alpha_{-n}\cdot\alpha_n$; then you use Ramanujan summation to do $\zeta$-function renormalisation, from which you find that the vacuum energy, denoted by $\varepsilon_0$, is </p>
<p>$$\varepsilon_0=-\frac{d-2}{24}$$ </p>
<p>However, the most interesting part comes when you go around <a href="https://mathoverflow.net/a/140354/36148">deriving</a> the critical dimension of Bosonic String Theory. </p>
<p>After this, the expression surprisingly simplifies to $-1$. </p>
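<p><em>A quick sanity check of the arithmetic (a sketch; Python assumed, and the value $\zeta(-1)=-\frac{1}{12}$ is taken as given rather than derived):</em></p>

```python
from fractions import Fraction

# zeta-regularised sum of mode numbers: sum_{n>=1} n -> zeta(-1) = -1/12 (stated, not derived)
ZETA_MINUS_ONE = Fraction(-1, 12)

def vacuum_energy(d):
    """Casimir constant of the bosonic string: (d-2)/2 * zeta(-1) = -(d-2)/24."""
    return Fraction(d - 2, 2) * ZETA_MINUS_ONE

assert vacuum_energy(26) == -1               # at the critical dimension, exactly -1
assert vacuum_energy(4) == Fraction(-1, 12)  # only 2 transverse directions in d = 4
```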
<p>For a more detailed derivation of the above, see <a href="http://arxiv.org/pdf/hep-th/0207142v1.pdf" rel="nofollow noreferrer">these</a> lecture notes (Section 4, Equations 4.5-4.10). </p>Fri, 08 Nov 2013 04:33:41 GMThttps://mathoverflow.net/questions/47770/-/147307#147307Abhimanyu Pallavi Sudhir2013-11-08T04:33:41ZComment by Abhimanyu Pallavi Sudhir on Book on mathematical "rigorous" String Theory?
https://mathoverflow.net/questions/71909/book-on-mathematical-rigorous-string-theory/71998#71998
I don't think that BBS falls into the category of "mathematically rigorous". It's a very good, intuitive book.Fri, 08 Nov 2013 04:17:49 GMThttps://mathoverflow.net/questions/71909/book-on-mathematical-rigorous-string-theory/71998?cid=379753#71998Abhimanyu Pallavi Sudhir2013-11-08T04:17:49ZComment by Abhimanyu Pallavi Sudhir on About the massless supermultiplets in $2+1$ dimensional supersymmetry
https://mathoverflow.net/questions/103392/about-the-massless-supermultiplets-in-21-dimensional-supersymmetry
@S.Carnahan: The OP has voluntarily deleted it, which is weird... I have flagged this as unclear what you're asking.Wed, 06 Nov 2013 16:49:00 GMThttps://mathoverflow.net/questions/103392/about-the-massless-supermultiplets-in-21-dimensional-supersymmetry?cid=379331Abhimanyu Pallavi Sudhir2013-11-06T16:49:00ZAnswer by Abhimanyu Pallavi Sudhir for Does $SO(32) \sim_T E_8 \times E_8$ relate to some group theoretical fact?
https://mathoverflow.net/questions/57529/does-so32-sim-t-e-8-times-e-8-relate-to-some-group-theoretical-fact/147129#147129
5<p>The answer to this question can be found in Lubos Motl's answer to <a href="https://physics.stackexchange.com/q/65092/23119">this question of mine on Physics.SE</a>. </p>
<p>The key here is the weight lattices $\Gamma$ of the bosonic representations of these gauge groups.</p>
<p>As I understand it, the weight lattice of $E(8)$ is $\Gamma^8$, whereas the weight lattice of $\frac{\operatorname{Spin}\left(32\right)}{\mathbb{Z}_2}$ is $\Gamma^{16}$. The first fact means that the weight lattice of $E(8)\times E(8)$ is $\Gamma^{8}\oplus\Gamma^8$. </p>
<p>Now, there is an identity $\Gamma^{8}\oplus\Gamma^8\oplus\Gamma^{1,1}=\Gamma^{16}\oplus\Gamma^{1,1}$, and it is this very identity which allows the T-duality mentioned in the original post. </p>
<p>So, the answer to your question is "<strong>Yes</strong>", there <em>is</em> a group-theoretical fact, and that is that $ \Gamma^{8}\oplus\Gamma^8\oplus\Gamma^{1,1}= \Gamma^{16}\oplus\Gamma^{1,1} $. </p>Wed, 06 Nov 2013 16:46:03 GMThttps://mathoverflow.net/questions/57529/does-so32-sim-t-e-8-times-e-8-relate-to-some-group-theoretical-fact/147129#147129Abhimanyu Pallavi Sudhir2013-11-06T16:46:03ZAnswer by Abhimanyu Pallavi Sudhir for Why does bosonic string theory require 26 spacetime dimensions?
https://mathoverflow.net/questions/99643/why-does-bosonic-string-theory-require-26-spacetime-dimensions/140354#140354
5
<p><em>Note that here the $\hat L_n$ are operators given by sums of dot products of the mode operators, e.g. $\hat L_0=\frac12\sum_{n=-\infty}^\infty\hat\alpha_{-n}\cdot\hat\alpha_n$ (normal-ordered).</em> </p>
<p>Also note that the Virasoro algebra is the central extension of the Witt/conformal algebra -- that is why a $D$ appears below: the central charge of $D$ free bosons is $D$. </p>
<p>I'll expand on Chris Gerig's answer. </p>
<p>Not only do we need $D=26$, we also need the normal ordering constant $a=1$. The normal ordering constant is the eigenvalue of $\hat L_0$ on the state. </p>
<p>We want to promote the time-like states to spurious, zero-norm states, right? So, we impose the (level 1) spurious state conditions on the state as follows (the $|\chi\rangle$ are the basis vectors on which the spurious state $|\Phi\rangle$ is built). </p>
<p>$$ \begin{gathered}
0 = \hat L_1 \left| \Phi \right\rangle \\
= \hat L_1 \hat L_{-1} \left| \chi_1 \right\rangle \\
= \left[ \hat L_1, \hat L_{-1} \right] \left| \chi_1 \right\rangle + \hat L_{-1} \hat L_1 \left| \chi_1 \right\rangle \\
= \left[ \hat L_1, \hat L_{-1} \right] \left| \chi_1 \right\rangle \\
= 2 \hat L_0 \left| \chi_1 \right\rangle \\
= 2 \left( a - 1 \right) \left| \chi_1 \right\rangle \\
\end{gathered} $$</p>
<p>That means that $a=1$. </p>
<p>Now, for a level 2 spurious state, </p>
<p>$$\begin{gathered}
0 = \hat L_1 \left| \Phi \right\rangle = \hat L_1 \left( \hat L_{-2} + k \hat L_{-1} \hat L_{-1} \right) \left| \chi_2 \right\rangle = \left[ \hat L_1, \hat L_{-2} + k \hat L_{-1} \hat L_{-1} \right] \left| \chi_2 \right\rangle \\
= \left( 3 \hat L_{-1} + 2k \hat L_0 \hat L_{-1} + 2k \hat L_{-1} \hat L_0 \right) \left| \chi_2 \right\rangle \\
= \left( \left( 3 - 2k \right) \hat L_{-1} + 4k \hat L_0 \hat L_{-1} \right) \left| \chi_2 \right\rangle \\
= \left( \left( 3 - 2k \right) \hat L_{-1} + 4k \hat L_{-1} \left( \hat L_0 + 1 \right) \right) \left| \chi_2 \right\rangle \\
= \left( 3 - 2k \right) \hat L_{-1} \left| \chi_2 \right\rangle \quad \left( \text{since } \left( \hat L_0 + 1 \right) \left| \chi_2 \right\rangle = 0 \right) \\
\Rightarrow 2k = 3, \quad k = \frac{3}{2} \\
\end{gathered} $$ </p>
<p>Since this level 2 spurious state can be written as: </p>
<p>$$ \left| \Phi \right\rangle = \left( \hat L_{-2} + k \hat L_{-1} \hat L_{-1} \right) \left| \chi_2 \right\rangle $$ </p>
<p>So, then, </p>
<p>$$ \begin{gathered}
\hat L_2 \left| \Phi \right\rangle = 0 \\
\hat L_2 \left( \hat L_{-2} + \frac{3}{2} \hat L_{-1} \hat L_{-1} \right) \left| \chi_2 \right\rangle = 0 \\
\left[ \hat L_2, \hat L_{-2} + \frac{3}{2} \hat L_{-1} \hat L_{-1} \right] \left| \chi_2 \right\rangle + \left( \hat L_{-2} + \frac{3}{2} \hat L_{-1} \hat L_{-1} \right) \hat L_2 \left| \chi_2 \right\rangle = 0 \\
\left[ \hat L_2, \hat L_{-2} + \frac{3}{2} \hat L_{-1} \hat L_{-1} \right] \left| \chi_2 \right\rangle = 0 \\
\left( 13 \hat L_0 + 9 \hat L_{-1} \hat L_{+1} + \frac{D}{2} \right) \left| \chi_2 \right\rangle = 0 \\
\left( -13 + \frac{D}{2} \right) \left| \chi_2 \right\rangle = 0 \quad \left( \text{since } \hat L_0 \left| \chi_2 \right\rangle = - \left| \chi_2 \right\rangle \text{ and } \hat L_{+1} \left| \chi_2 \right\rangle = 0 \right) \\
D = 26 \\
\end{gathered} $$ </p>
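<p><em>Aside (not in the original answer): the central term $\frac{D}{2}$ above is just $D$ free bosons each contributing $c=1$ to $[\hat L_2, \hat L_{-2}] = 4\hat L_0 + \frac{c}{2}$. A sketch of a numerical check of that commutator for a single boson, on a truncated Fock space in the zero-momentum sector (Python assumed, exact rational arithmetic):</em></p>

```python
from fractions import Fraction

# Check [L_2, L_{-2}] = 4 L_0 + c/2 for one free boson (c = 1); D bosons give the
# D/2 central term used above. Basis convention: alpha_{-k}|occ> = |occ + e_k>,
# alpha_k|occ> = k*n_k |occ - e_k>, reproducing [alpha_m, alpha_n] = m delta_{m+n,0}.
N = 8  # level truncation; the commutator is exact on states of level <= N - 2

def level(pairs):
    return sum(k * n for k, n in pairs)

def add_to(state, occ, coeff):
    state[occ] = state.get(occ, 0) + coeff
    if state[occ] == 0:
        del state[occ]

def apply_alpha(k, state):
    out = {}
    for occ, c in state.items():
        d = dict(occ)
        if k < 0:                      # creation operator alpha_{-|k|}
            d[-k] = d.get(-k, 0) + 1
            if level(d.items()) <= N:  # drop states beyond the truncation
                add_to(out, tuple(sorted(d.items())), c)
        elif d.get(k, 0):              # annihilation operator
            n = d[k]
            d[k] = n - 1
            if not d[k]:
                del d[k]
            add_to(out, tuple(sorted(d.items())), c * k * n)
    return out

def apply_L(m, state):  # L_m = (1/2) sum_n alpha_{m-n} alpha_n, with alpha_0 = 0
    out = {}
    for n in range(-N, N + 1):
        a, b = m - n, n                # term alpha_a alpha_b; b acts first
        if a == 0 or b == 0 or abs(a) > N:
            continue
        if b < 0 < a:                  # let the annihilator act first -- safe to
            a, b = b, a                # swap, since [alpha_a, alpha_b] = 0 for a + b != 0
        for occ, c in apply_alpha(a, apply_alpha(b, state)).items():
            add_to(out, occ, c * Fraction(1, 2))
    return out

def bracket(state):                    # [L_2, L_{-2}] |state>
    x = apply_L(2, apply_L(-2, state))
    for occ, c in apply_L(-2, apply_L(2, state)).items():
        add_to(x, occ, -c)
    return x

vac = {(): 1}                          # |0>, L_0 eigenvalue 0
lvl1 = {((1, 1),): 1}                  # alpha_{-1}|0>, L_0 eigenvalue 1
assert bracket(vac) == {(): Fraction(1, 2)}           # 4*0 + c/2
assert bracket(lvl1) == {((1, 1),): Fraction(9, 2)}   # 4*1 + c/2
```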
<p>And so, finally, $a = 1$ and $D = 26$. Q.E.D. </p>
<p>So, this was done essentially to remove the negative-norm ghost states, using the canonical / Gupta-Bleuler formalism. </p>
<p>It's also possible to use, e.g., light-cone gauge (LCG) quantisation. However, in other quantisation methods the conformal anomaly is manifest in other forms -- in LCG quantisation, it is manifest as a failure of Lorentz symmetry. A good overview of this method can be found in <strong>Kaku</strong>, <em>Strings, Conformal Fields, and M-theory</em> (it's the only part of the book that I liked, actually; the rest of the book is too rigorous, without much physical intuition). </p>Sun, 25 Aug 2013 09:40:17 GMThttps://mathoverflow.net/questions/99643/why-does-bosonic-string-theory-require-26-spacetime-dimensions/140354#140354Abhimanyu Pallavi Sudhir2013-08-25T09:40:17ZAnswer by Abhimanyu Pallavi Sudhir for How can you weigh your own head in an accurate way?
https://physics.stackexchange.com/questions/70839/how-can-you-weigh-your-own-head-in-an-accurate-way/71143#71143
11<p><strong>High school chemistry method</strong></p>
<p>Take pictures of your head from at least 129600 different equally spaced out angles. Samples:<br/><span class="math-container">$\hspace{100px}$</span><a href="https://i.stack.imgur.com/Sm7ud.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/Sm7ud.png" width="400"/></a>.</p>
<p>Load them onto the computer, use Mathematica to intersect images at different angles for an approximate 3D model of your face. Find a surface of best fit.</p>
<p>Sample parameterisation:
<span class="math-container">$$
\begin{alignat}{7}
x&=0.09 \, \cos\theta \, && \sin\varphi \\
y&=0.09 \, \sin\theta \, && \sin\varphi \\
z&=0.09 \, && \cos\varphi
\end{alignat}
$$</span></p>
<p>Use a dropper and a knife to take some small, random samples of your head – the flesh, the cavities (if you're actually following me on this, I'd assume a pretty big one near the top), etc. Your head will become perforated, but that's a small price to pay for the pleasure of finding things out.</p>
<p><span class="math-container">$\hspace{250px}$</span><a href="https://i.stack.imgur.com/BwC0e.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/BwC0e.png" width="100"/></a></p>
<p>Measure the densities of these samples – and do it <em>quickly</em>, before they dry up – and plot them as a scalar field across the 3D model.</p>
<p>Now use your perforated brain to calculate a simple volume integral:</p>
<p><span class="math-container">$$M=\iiint_V \rho\left(x,y,z\right) ~ \mathrm{d}V$$</span></p>
<p><em>Quod erat demonstrandum</em>. Also: <em>mortuus es</em>.</p>
<p><strong>High school physics method</strong></p>
<p>Bang your head on a table. Once it stabilizes, stick a spring balance to it and measure the force required to drag it. Apply <span class="math-container">$F_k=\mu_kgm$</span>.</p>
<p><span class="math-container">$\hspace{100px}$</span><a href="https://i.stack.imgur.com/Qb6Gu.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/Qb6Gu.png" width="400"/></a></p>
<p><strong>Just-learnt-quantum-mechanics method</strong></p>
<p>Pluck out your head. Launch it into free space, and keep observing it. Once you have a good data set (this might take a while), apply Schrodinger's equation.</p>
<p><span class="math-container">$\hspace{175px}$</span><a href="https://i.stack.imgur.com/9SJxw.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/9SJxw.png" width="250"/></a></p>
<p><em>The classic way to launch a head into space is via a space shuttle, but there's some probability of a mere human throw achieving escape velocity.</em></p>
<p><strong>Just-learned-general-relativity method</strong></p>
<p>Same as above, but calculate the metric tensor near your head and compare to the Kerr metric. Here it is, for reference:</p>
<p><span class="math-container">$$ c_0^{2} d\tau^{2}
= c_0^{2}\left( 1 - \frac{r_{s} r}{\rho^{2}} \right) \mbox{d}t^{2} - \frac{\rho^{2}}{\Delta} \mbox{d}r^{2} - \rho^{2} \mbox{d}\theta^{2}
- \left( r^{2} + \alpha^{2} + \frac{r_{s} r \alpha^{2}}{\rho^{2}} \sin^{2} \theta \right) \sin^{2} \theta \mbox{d} \phi^{2}
+ \frac{2r_{s} r\alpha \sin^{2} \theta }{\rho^{2}} c_0 \mbox{d}t \mbox{d}\phi $$</span></p>
<p><strong>Just-learnt-special-relativity method</strong> </p>
<p>Get into an insulated environment in a power station with 50% efficiency. Bang your head against your antiparticle's to free yourself of fermionic matter. Get someone to use the generated electricity to power a donkey's motion.</p>
<p><span class="math-container">$$m=\frac12 (\gamma_{\mathrm{donkey}}-1)m_{\mathrm{donkey}}$$</span></p>
<p><strong>This method suddenly looks normal</strong></p>
<p>Swap your neck to a spring balance and hang upside down. </p>
<p><span class="math-container">$\hspace{225px}$</span><a href="https://i.stack.imgur.com/q4uWz.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/q4uWz.png" width="150"/></a></p>Mon, 15 Jul 2013 09:35:33 GMThttps://physics.stackexchange.com/questions/70839/-/71143#71143Abhimanyu Pallavi Sudhir2013-07-15T09:35:33ZAnswer by Abhimanyu Pallavi Sudhir for Coincidence, purposeful definition, or something else in formulas for energy
https://physics.stackexchange.com/questions/71119/coincidence-purposeful-definition-or-something-else-in-formulas-for-energy/71121#71121
4<p>Most of them (all of your examples except <span class="math-container">$E=c^2m$</span>, which is really just <span class="math-container">$E=m$</span> anyway) arise from integrating a linear relation like <span class="math-container">$p=mv$</span> as <span class="math-container">$E=\int v\,dp$</span>. It is often just a convention that the linear relation has a constant of proportionality of 1, so the integral gets a constant of 1/2 -- for example, we could instead have chosen, as we do with circles, <span class="math-container">$c=2\pi r$</span> and <span class="math-container">$A=\pi r^2$</span>. </p>Mon, 15 Jul 2013 04:01:14 GMThttps://physics.stackexchange.com/questions/71119/-/71121#71121Abhimanyu Pallavi Sudhir2013-07-15T04:01:14ZSince when were Loop Quantum Gravity (LQG) and Einstein-Cartan (EC) theories experimentally proven?
https://physics.stackexchange.com/questions/69098/since-when-were-loop-quantum-gravity-lqg-and-einstein-cartan-ec-theories-exp
-2<p><a href="http://en.wikipedia.org/w/index.php?title=Template:Theories_of_gravitation&oldid=558771017" rel="nofollow noreferrer">A wikipedia template</a> lists under the heading "fully compatible with observation": <a href="http://en.wikipedia.org/wiki/Einstein%E2%80%93Cartan_theory" rel="nofollow noreferrer">Einstein-Cartan theory</a>, <a href="http://en.wikipedia.org/wiki/Gauge_theory_gravity" rel="nofollow noreferrer">Gauge theory gravity</a>, <a href="http://en.wikipedia.org/wiki/Teleparallelism" rel="nofollow noreferrer">Teleparalleism</a> and <a href="http://en.wikipedia.org/wiki/Euclidean_quantum_gravity" rel="nofollow noreferrer">Euclidean Quantum Gravity</a>.</p>
<p>Does this claim generally mean that:</p>
<ul>
<li>the specific, <em>new</em> predictions (disagreeing with classical general relativity) these theories make are confirmed experimentally? </li>
<li>or just that they reduce to classical general relativity in all the tests done so far/do not contradict any experimental evidence?</li>
</ul>
<p>What I find particularly odd is that string theory is <em>not</em> listed under "fully compatible with observation" but rather "disputed" -- which would be at odds with the second, weaker interpretation (since general relativity arises as an effective theory of string theory in the low-energy classical limit).</p>
<p>To add to the confusion, Loop Quantum Gravity is listed under "Experimentally constrained", although it contradicts Lorentz symmetry <a href="http://motls.blogspot.in/2004/10/objections-to-loop-quantum-gravity.html#Clash_with_special_relativity" rel="nofollow noreferrer">[1]</a>. I suppose similar comments apply to "<a href="http://en.wikipedia.org/wiki/Superfluid_vacuum_theory" rel="nofollow noreferrer">BEC vacuum theory</a>".</p>
<p>Can anyone make any sense out of what these terms mean here?</p>string-theoryexperimental-physicsquantum-gravitytheory-of-everythingloop-quantum-gravityTue, 25 Jun 2013 07:10:15 GMThttps://physics.stackexchange.com/q/69098Abhimanyu Pallavi Sudhir2013-06-25T07:10:15ZAnswer by Abhimanyu Pallavi Sudhir for Is velocity of light constant?
https://physics.stackexchange.com/questions/66856/is-velocity-of-light-constant/68513#68513
1<p>There are two questions here -- is the velocity of light <em>constant</em>, and is it <em>invariant</em>?</p>
<p>The direction/velocity of light changes whenever it interacts with something. This includes gravitational deflection, since things have to change direction in curved spacetime in one sense or another. The velocity isn't constant.</p>
<p>Is it invariant under Lorentz boosts in perpendicular directions? <em>No.</em> The speed is invariant, but the velocity isn't. This should be fairly clear, but you can prove it with brute force --</p>
<p>We need to apply a boost to light's four-velocity, but the four-velocity of light is not well-defined -- it diverges like $(\infty, \infty, 0, 0)$, with the infinities related through a limit. So we consider an object traveling at speed $w$ in the $x$-direction, apply a boost of speed $v$ in the $y$-direction and let $w\to c$, working in units where $c = 1$. The four-velocity transforms under this boost as:</p>
<p>$$\left[ {\begin{array}{*{20}{c}}{\gamma (w)}\\{w\gamma (w)}\\0\\0\end{array}} \right] \to \left[ {\begin{array}{*{20}{c}}{\gamma (v)\gamma (w)}\\{w\gamma (w)}\\{ - v\gamma (v)\gamma (w)}\\0\end{array}} \right]$$</p>
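<p><em>A quick numerical sanity check of this boost (a sketch; Python assumed, units with $c=1$):</em></p>

```python
import math

def gamma(u):
    return 1.0 / math.sqrt(1.0 - u * u)

w, v = 0.99999, 0.6                     # object speed along x; boost speed along y (c = 1)
U = [gamma(w), w * gamma(w), 0.0, 0.0]  # four-velocity (dt, dx, dy, dz)/dtau

# boost along y: t' = g(v)(t - v*y), y' = g(v)(y - v*t); x and z untouched
Ub = [gamma(v) * (U[0] - v * U[2]), U[1], gamma(v) * (U[2] - v * U[0]), U[3]]

dxdt, dydt = Ub[1] / Ub[0], Ub[2] / Ub[0]
# as w -> 1 the 3-velocity tends to (1/gamma(v), -v): direction changes, speed doesn't
assert abs(dxdt - 1 / gamma(v)) < 1e-3
assert abs(dydt - (-v)) < 1e-12
assert abs(math.hypot(dxdt, dydt) - 1.0) < 1e-3
```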
<p>The conventional 3-velocity can be extracted here by considering $dx/dt$, $dy/dt$:</p>
<p>$$\frac{{dx}}{{dt}} = \frac{{dx/d\tau }}{{dt/d\tau }} = \frac{{w\gamma (w)}}{{\gamma (v)\gamma (w)}} = \frac{w}{{\gamma (v)}}$$
$$\frac{{dy}}{{dt}} = \frac{{dy/d\tau }}{{dt/d\tau }} = \frac{{ - v\gamma (v)\gamma (w)}}{{\gamma (v)\gamma (w)}} = - v$$</p>
<p>Taking the limit as $w\to 1$, you get a 3-velocity of $(1/\gamma(v),-v, 0)$ -- one may confirm that this is not equivalent to the original three-velocity that was $(1,0,0)$, but nonetheless has the same magnitude (speed is invariant).</p>Wed, 19 Jun 2013 04:17:58 GMThttps://physics.stackexchange.com/questions/66856/-/68513#68513Abhimanyu Pallavi Sudhir2013-06-19T04:17:58ZAnswer by Abhimanyu Pallavi Sudhir for A change in the gravitational law
https://physics.stackexchange.com/questions/41109/a-change-in-the-gravitational-law/68326#68326
5<p>Such a change requires a 4+1-dimensional spacetime instead of a 3+1-dimensional one -- this would have several serious implications --</p>
<ol>
<li><p>The Riemann curvature tensor gains new "parts" with interesting physical implications with each new spacetime dimension -- 1-dimensional manifolds have no curvature in this sense, 2-dimensional manifolds have a scalar curvature, 3-dimensional manifolds gain the full Ricci tensor, 4-dimensional manifolds get components corresponding to a new Weyl tensor, and 5-dimensional geometry gets even more components. General relativity in a 5-dimensional spacetime is capable of explaining electromagnetism too (the Kaluza-Klein mechanism), so electromagnetism (along with the radion field) starts behaving as a part of gravity.</p></li>
<li><p>Apparently a 5-dimensional spacetime is unstable, according to wikipedia's "privileged character of 3+1-dimensional spacetime"<a href="http://en.wikipedia.org/wiki/Spacetime#Privileged_character_of_3.2B1_spacetime" rel="nofollow noreferrer">[1]</a> (now a transclusion of <a href="https://en.wikipedia.org/wiki/Anthropic_principle#Dimensions_of_spacetime" rel="nofollow noreferrer">[2]</a>).</p></li>
<li><p>The string theory landscape would be a bit smaller, since there are less dimensions to compactify.</p></li>
<li><p>The Ricci curvature in a vacuum on an Einstein manifold would no longer be exactly $\Lambda g_{ab}$: in $D$ spacetime dimensions $R_{ab}=\frac{2\Lambda}{D-2}g_{ab}$, so in $4+1$ dimensions there would be a coefficient of $2/3$.</p></li>
<li><p>The magnetic field, among other things "cross product-ish", could not be written as a vector, unlike the electric field. This is because it would have 6 components whereas the spatial dimension is only 4. So perhaps humans would become familiar with exterior algebras earlier than we did in 3+1 dimensions. Either that or we would be trying to find out how magnetism works. Or we would just die out, for all the other reasons.</p></li>
<li><p>In string theory (see e.g. <a href="http://arxiv.org/abs/hep-th/0207249v1" rel="nofollow noreferrer">[3]</a>), gravitational constants in successively higher dimensions are calculated as $G_{n+1}=l_sG_n$, where $l_s$ is the string length (the units must be different in order to accommodate the extra factor of $r$ in Newton's gravitational law). For distance scales greater than the string length, this causes gravity to be much weaker than in our number of dimensions, but stronger for length scales shorter than the string length. It's interesting how gravity's long-range ability peaks at 4 dimensions (it is a contact force below 4 dimensions).</p></li>
</ol>
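<p><em>To make point 5 concrete: the magnetic field is really an antisymmetric rank-2 tensor (a 2-form), with $\binom{d}{2}$ independent components in $d$ spatial dimensions, and this matches the $d$ components of a vector only at $d=3$. A trivial check (Python assumed, illustrative only):</em></p>

```python
from math import comb  # Python >= 3.8

def bivector_components(d):
    """Independent components of an antisymmetric rank-2 tensor (2-form) in d spatial dims."""
    return comb(d, 2)

# only in d = 3 does a 2-form have exactly d components, letting B masquerade as a vector
assert [d for d in range(1, 10) if bivector_components(d) == d] == [3]
# in the 4 spatial dimensions of a 4+1-dimensional spacetime, B has 6 components
assert bivector_components(4) == 6
```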
<p>See also some recent tests of the inverse square law at short length scales (to check for compactification) -- <a href="http://arxiv.org/abs/hep-ph/0011014" rel="nofollow noreferrer">[4]</a>.</p>Mon, 17 Jun 2013 10:12:52 GMThttps://physics.stackexchange.com/questions/41109/-/68326#68326Abhimanyu Pallavi Sudhir2013-06-17T10:12:52ZAnswer by Abhimanyu Pallavi Sudhir for Mass of a superstring between two branes?
https://physics.stackexchange.com/questions/46118/mass-of-a-superstring-between-two-branes/68240#68240
2<p>It's similar -- </p>
<p>$${m^2} = \left( {N - a} \right) + {\left( {\frac{y}{{2\pi }}} \right)^2}$$</p>
<p>The important difference is that the number operator and normal ordering constant change for a superstring, and vary by sector.</p>Sun, 16 Jun 2013 11:12:27 GMThttps://physics.stackexchange.com/questions/46118/-/68240#68240Abhimanyu Pallavi Sudhir2013-06-16T11:12:27ZAnswer by Abhimanyu Pallavi Sudhir for Can someone please explain magnetic vs electric fields?
https://physics.stackexchange.com/questions/53916/can-someone-please-explain-magnetic-vs-electric-fields/65091#65091
3<p>The electric and magnetic fields are really components of a single object, mixing and transforming into each other under Lorentz boosts. The full picture of the field comes from the electromagnetic field tensor</p>
<p>$$F_{\mu\nu} = \begin{bmatrix}
0 & E_x/c & E_y/c & E_z/c \\
-E_x/c & 0 & -B_z & B_y \\
-E_y/c & B_z & 0 & -B_x \\
-E_z/c & -B_y & B_x & 0
\end{bmatrix}$$</p>
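<p><em>A tiny illustrative sketch (Python assumed; the numerical field values are arbitrary, for the example only) of building this tensor and checking its antisymmetry:</em></p>

```python
def field_tensor(E, B, c=1.0):
    """Covariant F_{mu nu} assembled from E = (Ex, Ey, Ez) and B = (Bx, By, Bz),
    following the matrix above."""
    Ex, Ey, Ez = (e / c for e in E)
    Bx, By, Bz = B
    return [[0.0,  Ex,   Ey,   Ez],
            [-Ex,  0.0, -Bz,   By],
            [-Ey,  Bz,   0.0, -Bx],
            [-Ez, -By,   Bx,   0.0]]

F = field_tensor((1.0, 2.0, 3.0), (4.0, 5.0, 6.0))  # arbitrary test values
# antisymmetry: 6 independent components -- 3 electric, 3 magnetic
assert all(F[i][j] == -F[j][i] for i in range(4) for j in range(4))
```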
<p>This tensor satisfies simple identities (see <a href="https://en.wikipedia.org/wiki/Electromagnetic_tensor#Significance" rel="nofollow noreferrer">[1]</a>) equivalent to Maxwell's equations. The electric and magnetic fields are different components of this tensor, placed in similar positions to e.g. the momentum density and shear stress in the 4d stress-energy tensor.</p>Sun, 19 May 2013 05:01:31 GMThttps://physics.stackexchange.com/questions/53916/-/65091#65091Abhimanyu Pallavi Sudhir2013-05-19T05:01:31Z